Collectives™ on Stack Overflow
Find centralized, trusted content and collaborate around the technologies you use most.
Learn more about Collectives
Teams
Q&A for work
Connect and share knowledge within a single location that is structured and easy to search.
Learn more about Teams
I am import into python where it will automatically create a date time object.
However I want the first column to be a datetime object in Python. Data looks like
Date,cost
41330.66667,100
41331.66667,101
41332.66667,102
41333.66667,103
Current code looks like:
from datetime import datetime
import pandas as pd
data = pd.read_csv(r"F:\Sam\PJ\CSV2.csv")
data['Date'].apply(lambda x: datetime.strptime(x, '%d/%m/%Y'))
print(data)
–
This looks like an excel datetime format. This is called a serial date. To convert from that serial date you can do this:
data['Date'].apply(lambda x: datetime.fromtimestamp( (x - 25569) *86400.0))
Which outputs:
>>> data['Date'].apply(lambda x: datetime.fromtimestamp( (x - 25569) *86400.0))
0 2013-02-25 10:00:00.288
1 2013-02-26 10:00:00.288
2 2013-02-27 10:00:00.288
3 2013-02-28 10:00:00.288
To assign it to data['Date']
you just do:
data['Date'] = data['Date'].apply(lambda x: datetime.fromtimestamp( (x - 25569) *86400.0))
Date cost
0 2013-02-25 16:00:00.288 100
1 2013-02-26 16:00:00.288 101
2 2013-02-27 16:00:00.288 102
3 2013-02-28 16:00:00.288 103
Unfortunately, read_csv does not cope with date columns given as numbers.
But the good news is that Pandas does have a suitable function to do it.
After read_csv call:
df.Date = pd.to_datetime(df.Date - 25569, unit='D').dt.round('ms')
As I undestand, your Date is actually the number of days since 30.12.1899
(plus fractional part of the day).
The above "correction factor" (25569) works OK. For Date == 0 it gives
just the above start of Excel epoch date.
Rounding to miliseconds (or maybe even seconds) is advisable.
Otherwise you will get weird effects resulting from inaccurate rounding
of fractional parts of day.
E.g. 0.33333333 corresponding to 8 hours can be computed as
07:59:59.999712.
We don't know what data and columns the CSV has, but in order for pandas to pick up the date as a column, it must be a column on that csv file.
Apply doesn't work in place. You would have to assign the result of apply back to date, as
data['Date'] = data['Date'].apply(lambda x: datetime.strptime(x, '%d/%m/%Y'))
Thanks for contributing an answer to Stack Overflow!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.