Collectives™ on Stack Overflow

Find centralized, trusted content and collaborate around the technologies you use most.

Learn more about Collectives

Teams

Q&A for work

Connect and share knowledge within a single location that is structured and easy to search.

Learn more about Teams

I want to convert the Date column from yyyy-mm-dd (e.g. 2015-06-01 ) to yyyy-ww (e.g. 2015-23 ), so I run the following piece of code:

dataset["Date"] = pd.to_datetime(dataset["Date"]).dt.strftime('%Y-%V')

Then I group by my Sales based on weeks, i.e.

data = dataset.groupby(['Date'])["Sales"].sum().reset_index()
data.head(10)
    Date    Sales
0   2015-01 67797.0
1   2015-02 102714.0
2   2015-03 107011.0
3   2015-04 121480.0
4   2015-05 148098.0
5   2015-06 132152.0
6   2015-07 133914.0
7   2015-08 136160.0
8   2015-09 185471.0
9   2015-10 190793.0

Now I want to create a date range based on the Date column, since I'm predicting sales based on weeks:

ds = data.Date.values
ds_pred = pd.date_range(start=ds.min(), periods=len(ds) + num_pred_weeks, 
freq="W")

However I'm getting the following error: could not convert string to Timestamp which I'm not really sure how to fix. So, if I use 2015-01-01 as the starting date of my date-import I get no error, which makes me realize that I'm using the functions wrong. However, I'm not sure how?

I would like to basically have a date range that spans weekly from the current week and then 52 weeks into the future.

Seems like you are refering to the wrong object, in your question above you insert the new dataframe in data, but in your pd.data_range line you call ds. Try this: ds_pred = pd.date_range(start=data.min(), periods=len(data) + num_pred_weeks, freq="W") – Erfan May 27, 2019 at 13:08 Could you check your Date column type after this your step: data = dataset.groupby(['Date'])["Sales"].sum().reset_index() is it object type? – AT_asks May 27, 2019 at 13:10

I think problem is want create minimum of dataset["Date"] column filled by strings in format YYYY-VV. But for pass to date_range need format YYYY-MM-DD or datetime object.

I found this:

Several additional directives not required by the C89 standard are included for convenience. These parameters all correspond to ISO 8601 date values. These may not be available on all platforms when used with the strftime() method. The ISO 8601 year and ISO 8601 week directives are not interchangeable with the year and week number directives above. Calling strptime() with incomplete or ambiguous ISO 8601 directives will raise a ValueError.

%V ISO 8601 week as a decimal number with Monday as the first day of the week. Week 01 is the week containing Jan 4.

Pandas 0.24.2 bug with YYYY-VV format:

dataset = pd.DataFrame({'Date':['2015-06-01','2015-06-02']})
dataset["Date"] = pd.to_datetime(dataset["Date"]).dt.strftime('%Y-%V')
print (dataset)
0  2015-23
1  2015-23
ds = pd.to_datetime(dataset['Date'], format='%Y-%V')
print (ds)

ValueError: 'V' is a bad directive in format '%Y-%V'

Possible solution is use %U or %W, check this:

%U Week number of the year (Sunday as the first day of the week) as a zero padded decimal number. All days in a new year preceding the first Sunday are considered to be in week 0.
%W Week number of the year (Monday as the first day of the week) as a decimal number. All days in a new year preceding the first Monday are considered to be in week 0.

dataset = pd.DataFrame({'Date':['2015-06-01','2015-06-02']})
dataset["Date"] = pd.to_datetime(dataset["Date"]).dt.strftime('%Y-%U')
print (dataset)
0  2015-22
1  2015-22
ds = pd.to_datetime(dataset['Date'] + '-1', format='%Y-%U-%w')
print (ds)
0   2015-06-01
1   2015-06-01
Name: Date, dtype: datetime64[ns]

Or using data from original DataFrame in datetimes:

dataset = pd.DataFrame({'Date':['2015-06-01','2015-06-02'],
                        'Sales':[10,20]})
dataset["Date"] = pd.to_datetime(dataset["Date"])
print (dataset)
        Date  Sales
0 2015-06-01     10
1 2015-06-02     20
data = dataset.groupby(dataset['Date'].dt.strftime('%Y-%V'))["Sales"].sum().reset_index()
print (data)
      Date  Sales
0  2015-23     30
num_pred_weeks = 5
ds = data.Date.values
ds_pred = pd.date_range(start=dataset["Date"].min(), periods=len(ds) + num_pred_weeks, freq="W")
print (ds_pred)
DatetimeIndex(['2015-06-07', '2015-06-14', '2015-06-21', 
               '2015-06-28',
               '2015-07-05', '2015-07-12'],
              dtype='datetime64[ns]', freq='W-SUN')
                @Erfan is right - I merely want to increment on weekly data (using pd.date_range or some other method). It seems like it doesn't recognize the formatting as a yyyy-ww format .
– Artem
                May 27, 2019 at 13:23

If ds contains dates as string formatted as '2015-01' which should be '%Y-%W' (or '%G-%V' in datetime library) you have to add a day number to obtain a day. Here, assuming that you want the monday you should to:

ds_pred = pd.date_range(start=pd.to_datetime(ds.min() + '-1', format='%Y-%W-%w',
                                             periods=len(ds) + num_pred_weeks, freq="W")
        

Thanks for contributing an answer to Stack Overflow!

  • Please be sure to answer the question. Provide details and share your research!

But avoid

  • Asking for help, clarification, or responding to other answers.
  • Making statements based on opinion; back them up with references or personal experience.

To learn more, see our tips on writing great answers.