Collectives™ on Stack Overflow
Find centralized, trusted content and collaborate around the technologies you use most.
Learn more about Collectives
Teams
Q&A for work
Connect and share knowledge within a single location that is structured and easy to search.
Learn more about Teams
I am using the code below but get an error after pivoting the DataFrame:
dataframe:
name day value time
0 MAC000002 2012-12-16 0.147 09:30:00
1 MAC000002 2012-12-16 0.110 10:00:00
2 MAC000002 2012-12-16 0.736 10:30:00
3 MAC000002 2012-12-16 0.404 11:00:00
4 MAC000003 2012-12-16 0.845 00:30:00
Read in data, and pivot
ddf = dd.read_csv('data.csv')
#I added this but didnt fix the error below
ddf.index.name = 'index'
#dask requires string as category type
ddf['name'] = ddf['name'].astype('category')
ddf['name'] =ddf['name'].cat.as_known()
#pivot the table
df = ddf.pivot_table(columns='name', values='value', index='index')
df.head()
#KeyError: 'index'
Expected result (with or without index) - pivot rows to columns without any value modification:
MAC000002 MAC000003 ...
0.147 0.845
0.110 ...
0.736 ...
0.404 ...
Any idea why I am getting a KeyError 'index'
and how I can overcome this?
–
–
According to the docs for pivot_table
, value of index
kwarg should refer to an existing column, so instead of setting name to the index, a column should be created with the desired index value:
# ddf.index.name = 'index'
ddf['index'] = ddf.index
Note that this assumes that the index is what you are really pivoting by.
Below is a reproducible snippet:
data = """
| name | day | value | time
0 | MAC000002 | 2012-12-16| 0.147| 09:30:00
1 | MAC000002 | 2012-12-16| 0.110| 10:00:00
2 | MAC000002 | 2012-12-16| 0.736| 10:30:00
3 | MAC000002 | 2012-12-16| 0.404| 11:00:00
4 | MAC000003 | 2012-12-16| 0.845| 00:30:00
import pandas as pd
from io import StringIO
df = pd.read_csv(StringIO(data), sep='|')
df.columns = [c.strip() for c in df.columns]
import dask.dataframe as dd
ddf = dd.from_pandas(df, npartitions=3)
ddf['index'] = ddf.index
#dask requires string as category type
ddf['name'] = ddf['name'].astype('category')
ddf['name'] =ddf['name'].cat.as_known()
ddf.pivot_table(columns='name', values='value', index='index').compute()
# name MAC000002 MAC000003
# index
# 0 0.147 NaN
# 1 0.110 NaN
# 2 0.736 NaN
# 3 0.404 NaN
# 4 NaN 0.845
Thanks for contributing an answer to Stack Overflow!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.