I was going through the documentation about the hierarchical indexing in Pandas. I tried testing the examples from it to create an empty dataframe with hierarchical indexing:
In [5]: df = pd.DataFrame()
In [6]: df.columns = pd.MultiIndex(levels = [['first', 'second'], ['a', 'b']], labels = [[0, 0, 1, 1], [0, 1, 0, 1]])
However, it throws an error:
ValueError Traceback (most recent call last)
<ipython-input-6-dd823f9b8d22> in <module>()
----> 1 df.columns = pd.MultiIndex(levels = [['first', 'second'], ['a', 'b']], labels = [[0, 0, 1, 1], [0, 1, 0, 1]])
/usr/local/lib/python3.4/dist-packages/pandas/core/generic.py in __setattr__(self, name, value)
2755 try:
2756 object.__getattribute__(self, name)
-> 2757 return object.__setattr__(self, name, value)
2758 except AttributeError:
2759 pass
pandas/src/properties.pyx in pandas.lib.AxisProperty.__set__ (pandas/lib.c:44873)()
/usr/local/lib/python3.4/dist-packages/pandas/core/generic.py in _set_axis(self, axis, labels)
447 def _set_axis(self, axis, labels):
--> 448 self._data.set_axis(axis, labels)
449 self._clear_item_cache()
/usr/local/lib/python3.4/dist-packages/pandas/core/internals.py in set_axis(self, axis, new_labels)
2800 raise ValueError('Length mismatch: Expected axis has %d elements, '
2801 'new values have %d elements' %
-> 2802 (old_len, new_len))
2804 self.axes[axis] = new_labels
ValueError: Length mismatch: Expected axis has 0 elements, new values have 4 elements
I don't see any problem with my code. Any ideas what is happening?
The problem is that you have an empty DataFrame with zero columns, and you are trying to assign a four-column MultiIndex to it. If you create an empty DataFrame with four columns initially, the error goes away:
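import numpy as np  # pd.np is no longer available in recent pandas
df = pd.DataFrame(np.empty((0, 4)))
# the `labels` keyword is called `codes` in current pandas
df.columns = pd.MultiIndex(levels=[['first', 'second'], ['a', 'b']], codes=[[0, 0, 1, 1], [0, 1, 0, 1]])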
Or you can create the empty DataFrame with the MultiIndex directly:
# the `labels` keyword is called `codes` in current pandas
multi_index = pd.MultiIndex(levels=[['first', 'second'], ['a', 'b']], codes=[[0, 0, 1, 1], [0, 1, 0, 1]])
df = pd.DataFrame(columns=multi_index)
#   first     second
#   a    b    a    b
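As a minimal sketch of the explanation above (assuming a reasonably recent pandas), the following reproduces the length mismatch on a zero-column frame and then builds the same columns with `pd.MultiIndex.from_product`, which avoids writing the integer codes by hand. The variable names are illustrative only.

```python
import pandas as pd

# An empty DataFrame has zero columns, so a 4-entry index cannot be assigned to it.
empty = pd.DataFrame()
try:
    empty.columns = pd.MultiIndex.from_product([['first', 'second'], ['a', 'b']])
    mismatch_raised = False
except ValueError:
    mismatch_raised = True

# Passing the index at construction time sidesteps the mismatch.
columns = pd.MultiIndex.from_product([['first', 'second'], ['a', 'b']])
df = pd.DataFrame(columns=columns)
```

`from_product` generates every combination of the two level lists in order, so it only fits when the columns form a full cross product, as they do here.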
This solution does not require NumPy:
# create an empty DataFrame with 4 columns
df = pd.DataFrame(columns=range(4))
df.columns = pd.MultiIndex(
    levels=[['first', 'second'], ['a', 'b']],
    codes=[[0, 0, 1, 1], [0, 1, 0, 1]]
)
(Note: I changed labels to codes because the keyword was renamed in pandas 0.24 and labels was removed in v1.0.0.)
This error can also occur if you use df.loc[<condition>, <col_name>] = value without wrapping each condition in parentheses (). Make sure to always wrap every condition in a loc statement in parentheses.
It should look similar to this:
df.loc[<(condition1) & (condition2)>, <col_name>] = value
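A minimal sketch of the parenthesized form, using a made-up frame with hypothetical columns `a`, `b`, and `flag`. The parentheses matter because `&` binds more tightly than comparison operators in Python.

```python
import pandas as pd

# Hypothetical data for illustration only.
df = pd.DataFrame({'a': [1, 2, 3, 4], 'b': [10, 20, 30, 40]})
df['flag'] = 0

# Each comparison is parenthesized before the two masks are combined with &.
df.loc[(df['a'] > 1) & (df['b'] < 40), 'flag'] = 1
```

Only the rows where both conditions hold are updated; the other rows keep their original `flag` value.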