Pandas Number Of Business Days Between A Datetimeindex And A Timestamp
Solution 1:
TimedeltaIndexes represent fixed spans of time. They can be added to Pandas Timestamps to increment them by fixed amounts. Their behavior is never dependent on whether or not the Timestamp is a business day.
The TimedeltaIndex itself is never business-day aware.
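To illustrate, here is a minimal sketch (the variable names and the example date are my own, not from the question) showing that adding a TimedeltaIndex to a Friday simply steps through the calendar, weekend included:

```python
import pandas as pd

friday = pd.Timestamp("2015-07-31")            # a Friday
deltas = pd.to_timedelta([1, 2, 3], unit="D")  # a TimedeltaIndex of 1, 2 and 3 days

# Adding fixed spans gives Saturday, Sunday, Monday; no day is skipped.
shifted = friday + deltas
weekdays = list(shifted.dayofweek)             # Monday=0 ... Sunday=6
print(weekdays)
```

The weekend days show up in the result, which is why a TimedeltaIndex on its own cannot answer a business-day question.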
Since the ultimate goal is to count the number of business days between a DatetimeIndex and a Timestamp, I would look in another direction than conversion to TimedeltaIndex.
Unfortunately, date calculations are rather complicated, and a number of data structures have sprung up to deal with them: Python datetime.dates, datetime.datetimes, Pandas Timestamps, and NumPy datetime64s.
They each have their strengths, but no one of them is good for all purposes. To take advantage of their strengths, it is sometimes necessary to convert between these types.
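As a rough sketch of what those conversions look like for a single value (the variable names are mine, chosen only for illustration):

```python
import datetime

import numpy as np
import pandas as pd

ts = pd.Timestamp("2015-08-05")

as_date = ts.date()                        # Python datetime.date
as_datetime = ts.to_pydatetime()           # Python datetime.datetime
as_dt64 = ts.to_datetime64()               # NumPy datetime64
as_day = as_dt64.astype("datetime64[D]")   # truncated to day resolution

# Going back to a Pandas Timestamp from the NumPy value.
round_trip = pd.Timestamp(as_day)
```

Each of these is a cheap call for one value; the cost only becomes noticeable when the conversion is repeated for every element of a long index.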
To use np.busday_count you need to convert the DatetimeIndex and Timestamp to some type np.busday_count understands. What you call kludginess is the code required to convert types. There is no way around that, assuming we want to use np.busday_count, and I know of no better tool for this job than np.busday_count.
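For reference, a small sketch of how np.busday_count behaves on day-resolution datetime64 input. The dates are my own example; the count covers the half-open interval from each start date up to, but not including, the end date, with the default Monday-to-Friday week mask:

```python
import numpy as np

# Friday, Monday and Wednesday of the same stretch of 2015.
begin = np.array(["2015-07-31", "2015-08-03", "2015-08-05"], dtype="datetime64[D]")
end = np.datetime64("2015-08-05")  # a Wednesday

# The scalar end date is broadcast against the array of start dates.
counts = np.busday_count(begin, end)
print(counts)  # [3 2 0]
```

From the Friday, the days counted are Friday, Monday and Tuesday; the weekend is skipped and the end date itself is excluded.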
So, although I don't think there is a more succinct way to count business days than the method you propose, there is a far more performant way: convert to datetime64[D]s instead of Python datetime.date objects:
import pandas as pd
import numpy as np

drg = pd.date_range('2000-07-31', '2015-08-05', freq='B')
timestamp = pd.Timestamp('2015-08-05')

def using_astype(drg, timestamp):
    # Cast both inputs to day-resolution datetime64 in one vectorized step
    A = drg.values.astype('<M8[D]')
    B = timestamp.asm8.astype('<M8[D]')
    return np.busday_count(A, B)

def using_datetimes(drg, timestamp):
    # Slower route: build one Python datetime.date object per element
    A = [d.date() for d in drg]
    B = timestamp.date()
    return np.busday_count(A, B)
This is over 100x faster for the example above (where len(drg) is close to 4000):
In [88]: %timeit using_astype(drg, timestamp)
10000 loops, best of 3: 95.4 µs per loop
In [89]: %timeit using_datetimes(drg, timestamp)
100 loops, best of 3: 10.3 ms per loop
np.busday_count converts its input to datetime64[D]s anyway, so avoiding this extra conversion to and from datetime.dates is far more efficient.
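If you want to convince yourself that the two routes give the same answer before switching, a quick check along these lines should do. This is only a sketch on a shorter date range of my own choosing, with the conversions written inline so it runs on its own:

```python
import numpy as np
import pandas as pd

drg = pd.date_range("2015-07-01", "2015-08-05", freq="B")
timestamp = pd.Timestamp("2015-08-05")

# Vectorized route: cast straight to day-resolution datetime64.
via_datetime64 = np.busday_count(
    drg.values.astype("datetime64[D]"),
    timestamp.to_datetime64().astype("datetime64[D]"),
)

# Element-by-element route through Python datetime.date objects.
via_dates = np.busday_count([d.date() for d in drg], timestamp.date())

same = np.array_equal(via_datetime64, via_dates)
print(same)
```

Because drg here contains only business days and ends on the timestamp, the counts simply run from len(drg) - 1 down to 0.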