BirthYear    Sex    Area      Count
2015         W      Dhaka     6
2015         M      Dhaka     3
2015         W      Khulna    1
2015         M      Khulna    8
2014         M      Dhaka     13
2014         W      Dhaka     20
2014         M      Khulna    9
2014         W      Khulna    6
2013         W      Dhaka     11
2013         M      Dhaka     2
2013         W      Khulna    8
2013         M      Khulna    5
2012         M      Dhaka     12
2012         W      Dhaka     4
2012         W      Khulna    7
2012         M      Khulna    1

Now I want to create a bar chart in pandas that shows only the males and females born in 2015. The code:

df = pd.read_csv('out.csv')
df=df.reset_index()
df=df.loc[df["BirthYear"]==2015]
agg_df = df.groupby(['Sex']).sum()
agg_df.reset_index(inplace=True)
piv_df = agg_df.pivot(columns='Sex', values='Count')
piv_df.plot.bar(stacked=True)
plt.show()

After execution, IDLE shows this error:

Traceback (most recent call last):
  File "C:\Users\sabid\AppData\Local\Programs\Python\Python35\lib\site-packages\pandas\indexes\base.py", line 1945, in get_loc
    return self._engine.get_loc(key)
  File "pandas\index.pyx", line 137, in pandas.index.IndexEngine.get_loc (pandas\index.c:4066)
  File "pandas\index.pyx", line 159, in pandas.index.IndexEngine.get_loc (pandas\index.c:3930)
  File "pandas\hashtable.pyx", line 675, in pandas.hashtable.PyObjectHashTable.get_item (pandas\hashtable.c:12408)
  File "pandas\hashtable.pyx", line 683, in pandas.hashtable.PyObjectHashTable.get_item (pandas\hashtable.c:12359)
KeyError: 'BirthYear'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "C:/Users/sabid/Dropbox/Freelancing/data visualization python/pie.py", line 8, in <module>
    df=df.loc[df["StichtagDatJahr"]==2015]
  File "C:\Users\sabid\AppData\Local\Programs\Python\Python35\lib\site-packages\pandas\core\frame.py", line 1997, in __getitem__
    return self._getitem_column(key)
  File "C:\Users\sabid\AppData\Local\Programs\Python\Python35\lib\site-packages\pandas\core\frame.py", line 2004, in _getitem_column
    return self._get_item_cache(key)
  File "C:\Users\sabid\AppData\Local\Programs\Python\Python35\lib\site-packages\pandas\core\generic.py", line 1350, in _get_item_cache
    values = self._data.get(item)
  File "C:\Users\sabid\AppData\Local\Programs\Python\Python35\lib\site-packages\pandas\core\internals.py", line 3290, in get
    loc = self.items.get_loc(item)
  File "C:\Users\sabid\AppData\Local\Programs\Python\Python35\lib\site-packages\pandas\indexes\base.py", line 1947, in get_loc
    return self._engine.get_loc(self._maybe_cast_indexer(key))
  File "pandas\index.pyx", line 137, in pandas.index.IndexEngine.get_loc (pandas\index.c:4066)
  File "pandas\index.pyx", line 159, in pandas.index.IndexEngine.get_loc (pandas\index.c:3930)
  File "pandas\hashtable.pyx", line 675, in pandas.hashtable.PyObjectHashTable.get_item (pandas\hashtable.c:12408)
  File "pandas\hashtable.pyx", line 683, in pandas.hashtable.PyObjectHashTable.get_item (pandas\hashtable.c:12359)
KeyError: 'BirthYear'

I learned from this link that this happens because the 'BirthYear' column name has some header before it, but I don't know how to remove the header and make the code work. Is there a working solution for this?

What do you mean by "some header before it"? Do you mean that there's a space at the start of the string? – Batman Nov 5, 2016 at 22:42

@SabidBinHabib, can you post an output of print(df.columns.tolist()) just after calling pd.read_csv(...)? – MaxU - stand with Ukraine Nov 5, 2016 at 22:50

@SabidBinHabib, what is your pandas version? Pandas 0.19.0 should be able to fix this problem automatically – MaxU - stand with Ukraine Nov 5, 2016 at 23:18

I would also say replace df=df.reset_index() with df.reset_index(drop=True, inplace=True). Setting drop to false will make the previous index appear as a new column named index. – jkr Nov 5, 2016 at 22:54

@Batman, df.rename(columns=list) will produce TypeError: 'list' object is not callable. Test this: df = pd.DataFrame(np.random.rand(3,2),columns=list('ab')); df.rename(columns=['X','Y']) – MaxU - stand with Ukraine Nov 5, 2016 at 23:00

@Jakub it shows error: df.loc[df["BirthYear"]==2015] AttributeError: 'NoneType' object has no attribute 'loc' – Sabid Habib Nov 5, 2016 at 23:04

You should be able to use a list. pandas.pydata.org/pandas-docs/stable/generated/… If you're using an older version of Pandas you can use a dict, or just df.columns=["BirthYear", "Sex", "Area", "Count"] – Batman Nov 5, 2016 at 23:08

@Batman, here is a way how to use a list: df = pd.DataFrame(np.random.rand(3,2),columns=list('ab')); df = df.rename(columns=lambda x: ['X','Y'][df.columns.get_loc(x)]). But the proper solution would be to specify correct encoding with BOM – MaxU - stand with Ukraine Nov 5, 2016 at 23:12
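The comments above point at two separate pitfalls, and both can be reproduced in a small sketch. It assumes the hidden "header" is a UTF-8 byte order mark (BOM), as the last comment suggests; that has not been confirmed for the real file. The in-memory bytes in raw are a made-up stand-in for out.csv, and the names damaged, returned and born_2015 are illustrative only. The sketch also shows why the reset_index attempt ended in the 'NoneType' error: with inplace=True the method returns None, so its result should not be assigned back to df.

```python
import io

import pandas as pd

# Hypothetical stand-in for out.csv: UTF-8 bytes that begin with a BOM.
raw = "BirthYear,Sex,Area,Count\n2015,W,Dhaka,6\n2015,M,Dhaka,3\n".encode("utf-8-sig")

# Option 1: tell pandas about the BOM when reading the file.
df = pd.read_csv(io.BytesIO(raw), encoding="utf-8-sig")
print(df.columns.tolist())

# Option 2: clean the column names after reading. The damaged name is
# simulated here, because recent pandas versions strip the BOM on their own.
damaged = df.rename(columns={"BirthYear": "\ufeffBirthYear"})
damaged.columns = [name.lstrip("\ufeff").strip() for name in damaged.columns]

# With inplace=True, reset_index modifies the frame and returns None,
# so writing df = df.reset_index(..., inplace=True) would lose the frame.
returned = damaged.reset_index(drop=True, inplace=True)

# The lookup that raised KeyError now works.
born_2015 = damaged.loc[damaged["BirthYear"] == 2015]
```

Printing df.columns.tolist() right after read_csv, as asked in the comments, is the quickest way to see whether a column name carries an invisible prefix or stray whitespace.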

I am not sure about this, but I think the pivot call is what tripped you up. You don't need pivot, because agg_df is already essentially a pivot table. Here is the code I used to create the graph:

import matplotlib.pyplot as plt
import pandas as pd

# I made this to approximate your CSV file.
table = {
    'BirthYear': [2015, 2015, 2015, 2015, 2014, 2014],
    'Sex': ['W', 'M', 'W', 'M', 'M', 'W'],
    'Area': ['Dhaka', 'Dhaka', 'Khulna', 'Khulna', 'Dhaka', 'Dhaka'],
    'Count': [6, 3, 1, 8, 13, 20],
}
df = pd.DataFrame(table)
df = df.reset_index(drop=True)
# Select people born in 2015.
df = df.loc[df["BirthYear"] == 2015]
# This is basically a pivot table. Sum only Count so the text columns are left out.
agg_df = df.groupby(['Sex'])[['Count']].sum()
# Make the plot.
agg_df['Count'].plot.bar(stacked=True)
plt.show()
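
If stacked=True in the question was meant to split each bar by Area, a single total per sex leaves nothing to stack. The question does not say this was the intent, so treat the following as a sketch under that assumption: pivot_table puts one column per area, and the stacked bar chart then has one segment per area. The names born_2015, by_area and ax are illustrative.

```python
import pandas as pd

# Same approximation of the CSV file as in the answer.
df = pd.DataFrame({
    "BirthYear": [2015, 2015, 2015, 2015, 2014, 2014],
    "Sex": ["W", "M", "W", "M", "M", "W"],
    "Area": ["Dhaka", "Dhaka", "Khulna", "Khulna", "Dhaka", "Dhaka"],
    "Count": [6, 3, 1, 8, 13, 20],
})

# Keep only the rows for people born in 2015.
born_2015 = df.loc[df["BirthYear"] == 2015]

# One row per sex, one column per area, cells hold the summed counts.
by_area = born_2015.pivot_table(
    index="Sex", columns="Area", values="Count", aggfunc="sum"
)

# Each bar is one sex; the segments inside a bar are the areas.
ax = by_area.plot.bar(stacked=True)
ax.set_ylabel("Count")
```

Call plt.show() from matplotlib.pyplot afterwards to display the figure, as in the question's own code.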
        
