Collectives™ on Stack Overflow
Find centralized, trusted content and collaborate around the technologies you use most.
Learn more about Collectives
Teams
Q&A for work
Connect and share knowledge within a single location that is structured and easy to search.
Learn more about Teams
The main column is
volume
.
In the timestamp sequence, weekends and some other weekdays are not present. I want to resample my time index to have the aggregate sum of volume per minute. So I do the following:
df = df.resample('60S', how=sum)
There are some missing minutes. In other words, there are minutes where there are no trades. I want to include these missing minutes and add a 0 to the column
volume
.
To solve this, I would usually do something like:
new_range = pd.date_range('20110104 09:30:00','20111231 16:00:00',
freq='60s')+df.index
df = df.reindex(new_range)
df = df.between_time(start_time='10:00', end_time='16:00') # time interval per day that I want
df = df.fillna(0)
But now I am stuck with unwanted dates like the weekends and some other days. How can I get rid of the dates that were not originally in my timestamp index?
Out[10]:
<class 'pandas.tseries.index.DatetimeIndex'>
[2013-01-01 09:00:00, ..., 2013-01-10 16:00:00]
Length: 447, Freq: 30T, Timezone: None
Eliminate times out of range
In [11]: rng = rng.take(rng.indexer_between_time('09:30','16:00'))
In [12]: rng
Out[12]:
<class 'pandas.tseries.index.DatetimeIndex'>
[2013-01-01 09:30:00, ..., 2013-01-10 16:00:00]
Length: 140, Freq: None, Timezone: None
Eliminate non-weekdays
In [13]: rng = rng[rng.weekday<5]
In [14]: rng
Out[14]:
<class 'pandas.tseries.index.DatetimeIndex'>
[2013-01-01 09:30:00, ..., 2013-01-10 16:00:00]
Length: 112, Freq: None, Timezone: None
Just looking at the values, you prob want df.reindex(index=rng)
In [15]: rng.to_series()
Out[15]:
2013-01-01 09:30:00 2013-01-01 09:30:00
2013-01-01 10:00:00 2013-01-01 10:00:00
2013-01-01 10:30:00 2013-01-01 10:30:00
2013-01-01 11:00:00 2013-01-01 11:00:00
2013-01-01 11:30:00 2013-01-01 11:30:00
2013-01-01 12:00:00 2013-01-01 12:00:00
2013-01-01 12:30:00 2013-01-01 12:30:00
2013-01-01 13:00:00 2013-01-01 13:00:00
2013-01-01 13:30:00 2013-01-01 13:30:00
2013-01-01 14:00:00 2013-01-01 14:00:00
2013-01-01 14:30:00 2013-01-01 14:30:00
2013-01-01 15:00:00 2013-01-01 15:00:00
2013-01-01 15:30:00 2013-01-01 15:30:00
2013-01-01 16:00:00 2013-01-01 16:00:00
2013-01-02 09:30:00 2013-01-02 09:30:00
2013-01-09 16:00:00 2013-01-09 16:00:00
2013-01-10 09:30:00 2013-01-10 09:30:00
2013-01-10 10:00:00 2013-01-10 10:00:00
2013-01-10 10:30:00 2013-01-10 10:30:00
2013-01-10 11:00:00 2013-01-10 11:00:00
2013-01-10 11:30:00 2013-01-10 11:30:00
2013-01-10 12:00:00 2013-01-10 12:00:00
2013-01-10 12:30:00 2013-01-10 12:30:00
2013-01-10 13:00:00 2013-01-10 13:00:00
2013-01-10 13:30:00 2013-01-10 13:30:00
2013-01-10 14:00:00 2013-01-10 14:00:00
2013-01-10 14:30:00 2013-01-10 14:30:00
2013-01-10 15:00:00 2013-01-10 15:00:00
2013-01-10 15:30:00 2013-01-10 15:30:00
2013-01-10 16:00:00 2013-01-10 16:00:00
Length: 112
You could also start with a constructed business day freq series (and/or add custom business day if you want holidays, new in 0.14.0, see here
Thanks for contributing an answer to Stack Overflow!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.