Reformat value_counts() analysis in Pandas for large number of columns

0 votes

I have a dataset that consists of hundreds of columns and thousands of rows.

In [119]:
df.columns
Out[119]:
Index(['column 1', 'column2',
       ...
       'column 100'],
      dtype='object', name='var_name')

Usually I run value_counts() on every single column to see the distribution.

In [121]:
a = df['column1'].value_counts()
In [122]:
a
Out[122]:
1     77494
2      5389
0      2016
3       878
Name: column1, dtype: int64

But for this dataframe, doing this for every column would make my notebook very messy. How can I automate this? Is there a function that helps?

For additional context, all my data is int64, but I hope the best answer gives a solution that works in every case. I would like the result as a pandas dataframe.

Based on @MaxU's suggestion, this is a simplified version of my dataframe:

df

id  column1  column2 column3
1         3        1       7
2         3        2       8
3         2        3       7
4         2        1       8
5         1        2       7

and my expected output is:

column1    count
1          1
2          2
3          2
column2    count
1          2
2          2
3          1
column3    count
7          3
8          2

Sep 25, 2018 in Python by bug_seeker
• 14,970 points
197 views

1 answer to this question.

0 votes

I'd do it this way:

In [83]: df.drop(columns='id').apply(lambda c: c.value_counts().to_dict())
Out[83]:
column1    {3: 2, 2: 2, 1: 1}
column2    {2: 2, 1: 2, 3: 1}
column3          {7: 3, 8: 2}
dtype: object

or:

In [84]: for c in df.drop(columns='id'):
    ...:     print(df[c].value_counts())
    ...:
3    2
2    2
1    1
Name: column1, dtype: int64   # <----- column name
2    2
1    2
3    1
Name: column2, dtype: int64
7    3
8    2
Name: column3, dtype: int64
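
Since the question asks for the result as a pandas dataframe, here is one possible sketch rather than a definitive solution. It rebuilds the sample frame from the question, stacks all non-id columns with melt, and counts each (column, value) pair. The variable names counts and wide and the labels "column", "value" and "count" are my own choices, not anything from the question.

```python
import pandas as pd

# Sample frame copied from the question.
df = pd.DataFrame({
    "id": [1, 2, 3, 4, 5],
    "column1": [3, 3, 2, 2, 1],
    "column2": [1, 2, 3, 1, 2],
    "column3": [7, 8, 7, 8, 7],
})

# Long format: stack every non-id column into (column, value) pairs,
# then count how often each pair occurs.
counts = (
    df.drop(columns="id")
      .melt(var_name="column", value_name="value")
      .groupby(["column", "value"])
      .size()
      .reset_index(name="count")
)
print(counts)

# Wide format: one row per distinct value, one column per original column.
# Values that never occur in a given column show up as NaN.
wide = df.drop(columns="id").apply(pd.Series.value_counts)
print(wide)
```

The long format stays readable with hundreds of columns because it only grows downwards. Note that groupby drops NaN keys by default, so missing values are not counted in this sketch.
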
answered Sep 25, 2018 by Priyaj
• 56,120 points
