Collectives™ on Stack Overflow

Find centralized, trusted content and collaborate around the technologies you use most.

Learn more about Collectives

Teams

Q&A for work

Connect and share knowledge within a single location that is structured and easy to search.

Learn more about Teams

I am import into python where it will automatically create a date time object.

However I want the first column to be a datetime object in Python. Data looks like

Date,cost
41330.66667,100
41331.66667,101
41332.66667,102
41333.66667,103

Current code looks like:

from datetime import datetime
import pandas as pd
data = pd.read_csv(r"F:\Sam\PJ\CSV2.csv")
data['Date'].apply(lambda x: datetime.strptime(x, '%d/%m/%Y'))
print(data)
                The date column in the sample dataframe is not in any date format or the one specified in the code @Sam Harper
– Vishnudev Krishnadas
                Jan 6, 2020 at 18:28

This looks like an excel datetime format. This is called a serial date. To convert from that serial date you can do this:

data['Date'].apply(lambda x: datetime.fromtimestamp( (x - 25569) *86400.0))

Which outputs:

>>> data['Date'].apply(lambda x: datetime.fromtimestamp( (x - 25569) *86400.0))
0   2013-02-25 10:00:00.288
1   2013-02-26 10:00:00.288
2   2013-02-27 10:00:00.288
3   2013-02-28 10:00:00.288

To assign it to data['Date'] you just do:

data['Date'] = data['Date'].apply(lambda x: datetime.fromtimestamp( (x - 25569) *86400.0))
                     Date  cost
0 2013-02-25 16:00:00.288   100
1 2013-02-26 16:00:00.288   101
2 2013-02-27 16:00:00.288   102
3 2013-02-28 16:00:00.288   103

Unfortunately, read_csv does not cope with date columns given as numbers. But the good news is that Pandas does have a suitable function to do it. After read_csv call:

df.Date = pd.to_datetime(df.Date - 25569, unit='D').dt.round('ms')

As I undestand, your Date is actually the number of days since 30.12.1899 (plus fractional part of the day). The above "correction factor" (25569) works OK. For Date == 0 it gives just the above start of Excel epoch date.

Rounding to miliseconds (or maybe even seconds) is advisable. Otherwise you will get weird effects resulting from inaccurate rounding of fractional parts of day. E.g. 0.33333333 corresponding to 8 hours can be computed as 07:59:59.999712.

  • We don't know what data and columns the CSV has, but in order for pandas to pick up the date as a column, it must be a column on that csv file.

  • Apply doesn't work in place. You would have to assign the result of apply back to date, as data['Date'] = data['Date'].apply(lambda x: datetime.strptime(x, '%d/%m/%Y'))

  • Thanks for contributing an answer to Stack Overflow!

    • Please be sure to answer the question. Provide details and share your research!

    But avoid

    • Asking for help, clarification, or responding to other answers.
    • Making statements based on opinion; back them up with references or personal experience.

    To learn more, see our tips on writing great answers.