Pandas is one of the most popular libraries in Python. It provides data structures, a large collection of built-in methods, and operations for data analysis, and it is designed mainly for working with relational or labeled data easily and intuitively. Many of its built-in methods let you perform operations on a large dataset quickly. In this article, we will study how you can efficiently count the number of rows in each group of a pandas groupby using built-in pandas functions, with examples and their output. So, let's get started!

**What is Groupby in Pandas?**

When dealing with **data science projects**, you’ll often work with a large amount of data and repeat the same operations on a dataset over and over. This is where the concept of groupby comes into the picture. You can think of groupby as a way to partition the data by the values of one or more columns and then summarize each partition with a single, concise expression. The groupby concept mainly refers to:

- **Splitting** the dataset into groups based on some criterion, using the **groupby()** method
- **Applying** the given function to each group independently
- **Combining** the results from each group into a single data structure
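The three steps above can be sketched by hand and compared with a single groupby expression. This is only an illustration: the "Marks" column and its values are hypothetical and do not come from the examples used later in this article.

```python
import pandas as pd

# Hypothetical toy data, used only for illustration.
df = pd.DataFrame({
    "Subjects": ["Maths", "Economics", "Maths", "Statistics"],
    "Marks": [80, 65, 90, 70],
})

# Split: iterating over a GroupBy object yields (key, sub-DataFrame) pairs.
# Apply: compute the mean of "Marks" for each sub-DataFrame.
manual = {
    subject: group["Marks"].mean()
    for subject, group in df.groupby("Subjects")
}

# Combine: collect the per-group results into a single Series.
manual_result = pd.Series(manual, name="Marks")

# groupby() followed by an aggregation performs the same three steps at once.
grouped_result = df.groupby("Subjects")["Marks"].mean()

print(grouped_result)
```

In practice you would normally write only the last expression; the manual loop is there to show what it does behind the scenes.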

Since a pandas groupby splits a dataset into individual groups, what if you wish to count the number of rows in each of these groups? Counting them manually is impractical for anything but the smallest dataset, so let us study some efficient methods that can help you with this task.

**How to Count Rows in Each Group of Pandas Groupby?**

Below are two methods by which you can count the number of rows in each group of a pandas groupby:

**1) Using pandas groupby size() method**

The simplest way to count rows in a pandas groupby is the built-in method **size()**. It returns a pandas Series holding the total number of rows in each group. The size() method works like calling Python's built-in **len()** function on each group, and hence it is not affected by NaN values in the dataset.

For better understanding, let us go through an example below:

Consider a dataframe that lists students' names along with the subjects they study.

    import pandas as pd

    data = {
        "Students": ["Ray", "John", "Mole", "Smith", "Jay", "Milli", "Tom", "Rick"],
        "Subjects": ["Maths", "Economics", "Science", "Maths",
                     "Statistics", "Statistics", "Statistics", "Computers"]
    }

    # load data into a DataFrame object:
    df = pd.DataFrame(data)
    print(df)

**Output:**

       Students    Subjects
    0       Ray       Maths
    1      John   Economics
    2      Mole     Science
    3     Smith       Maths
    4       Jay  Statistics
    5     Milli  Statistics
    6       Tom  Statistics
    7      Rick   Computers

Now, let us group the above dataframe by the column “Subjects” and find the number of rows in each group using the groupby size() method.

**For example:**

    import pandas as pd

    data = {
        "Students": ["Ray", "John", "Mole", "Smith", "Jay", "Milli", "Tom", "Rick"],
        "Subjects": ["Maths", "Economics", "Science", "Maths",
                     "Statistics", "Statistics", "Statistics", "Computers"]
    }

    # load data into a DataFrame object:
    df = pd.DataFrame(data)
    print(df.groupby('Subjects').size())

**Output:**

    Subjects
    Computers     1
    Economics     1
    Maths         2
    Science       1
    Statistics    3
    dtype: int64

As a result, the output of the above example displays the number of rows in each group of the dataframe, one group per subject.
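As a small sketch of the point that size() behaves like len() applied to each group, the snippet below compares the two on the same student data. It also shows one common way to turn the resulting Series into a regular two-column dataframe; the column name "row_count" is an arbitrary choice made for this illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "Students": ["Ray", "John", "Mole", "Smith", "Jay", "Milli", "Tom", "Rick"],
    "Subjects": ["Maths", "Economics", "Science", "Maths",
                 "Statistics", "Statistics", "Statistics", "Computers"],
})

# Number of rows per group, returned as a Series indexed by subject.
sizes = df.groupby("Subjects").size()

# len() of each sub-DataFrame gives the same numbers as size().
lengths = {subject: len(group) for subject, group in df.groupby("Subjects")}

# reset_index(name=...) converts the Series into a DataFrame with the
# group key and the counts as ordinary columns ("row_count" is arbitrary).
size_table = sizes.reset_index(name="row_count")
print(size_table)
```

The dataframe form is often more convenient when the counts need to be merged back into other data or written to a file.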

**2) Using pandas groupby count() method**

Instead of the size() method, you can also use the pandas groupby **count()** method, which counts the non-null values of each remaining column in each group. Note that these counts equal the group sizes only if there are **no NaN values** in the dataframe. Check out the example below for a better understanding of the pandas groupby count() method:

**For example:**

    import pandas as pd

    data = {
        "Students": ["Ray", "John", "Mole", "Smith", "Jay", "Milli", "Tom", "Rick"],
        "Subjects": ["Maths", "Economics", "Science", "Maths",
                     "Statistics", "Statistics", "Statistics", "Computers"]
    }

    # load data into a DataFrame object:
    df = pd.DataFrame(data)
    print(df.groupby('Subjects').count())

**Output:**

                Students
    Subjects
    Computers          1
    Economics          1
    Maths              2
    Science            1
    Statistics         3
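Because count() works column by column, calling it on the whole groupby gives a dataframe with one column of counts for every non-grouping column. A minimal sketch, using the same student data, of how selecting a single column first yields a Series instead:

```python
import pandas as pd

df = pd.DataFrame({
    "Students": ["Ray", "John", "Mole", "Smith", "Jay", "Milli", "Tom", "Rick"],
    "Subjects": ["Maths", "Economics", "Science", "Maths",
                 "Statistics", "Statistics", "Statistics", "Computers"],
})

# count() on the whole groupby: a DataFrame, one column per non-key column.
per_column = df.groupby("Subjects").count()

# Selecting one column before count(): a Series of non-null counts.
single = df.groupby("Subjects")["Students"].count()

print(type(per_column).__name__, list(per_column.columns))
print(type(single).__name__, single.name)
```

Selecting the column explicitly is a reasonable habit when the dataframe has many columns and only one count is of interest.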

Apart from this, you can also use the **value_counts() method** on the column itself if you are grouping the dataframe by a single column.

**For example:**

    import pandas as pd

    data = {
        "Students": ["Ray", "John", "Mole", "Smith", "Jay", "Milli", "Tom", "Rick"],
        "Subjects": ["Maths", "Economics", "Science", "Maths",
                     "Statistics", "Statistics", "Statistics", "Computers"]
    }

    # load data into a DataFrame object:
    df = pd.DataFrame(data)
    print(df['Subjects'].value_counts())

**Output:**

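    Statistics    3
    Maths         2
    Economics     1
    Science       1
    Computers     1
    Name: Subjects, dtype: int64

Note that this output comes from an older pandas release. In pandas 2.0 and later, the result is labeled `Name: count` and `Subjects` appears as the name of the index.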
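As a quick check that value_counts() and groupby size() agree for a single grouping column, the sketch below compares the two. The only difference in this example is ordering: value_counts() sorts by frequency by default, while groupby sorts by the group key, so the index is sorted before comparing.

```python
import pandas as pd

df = pd.DataFrame({
    "Students": ["Ray", "John", "Mole", "Smith", "Jay", "Milli", "Tom", "Rick"],
    "Subjects": ["Maths", "Economics", "Science", "Maths",
                 "Statistics", "Statistics", "Statistics", "Computers"],
})

# Frequency of each subject, re-sorted by subject name.
counts_vc = df["Subjects"].value_counts().sort_index()

# Rows per subject from groupby, already sorted by subject name.
counts_gb = df.groupby("Subjects").size()

print(counts_vc.to_dict() == counts_gb.to_dict())
```

Keep in mind that value_counts() skips NaN values in the column by default (its `dropna` parameter defaults to `True`).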

**Difference between size() and count() Methods**

Looking at the above examples, you might be tempted to use the size() and count() methods interchangeably while working with pandas groupby. However, the two methods are distinct. The count() method returns the number of non-null values in each group, which **may or may not be equal to the number of rows**, because count() ignores any NaN values it encounters. The size() method, on the other hand, returns the **actual number of rows** in each group of the dataframe, irrespective of NaN values. Let’s understand this using an example:

**For example:**

    import numpy as np
    import pandas as pd

    # create a dataframe with one missing subject
    data = {
        "Students": ["Ray", "John", "Mole", "John", "John", "John", "Ray", "Rick"],
        "Subjects": ["Maths", "Economics", "Science", "Maths",
                     np.nan, "Statistics", "Statistics", "Computers"]
    }
    df = pd.DataFrame(data)

    # display the number of rows for each student
    print(df.groupby('Students').size())

**Output:**

    Students
    John    4
    Mole    1
    Ray     2
    Rick    1
    dtype: int64

Now, let us use the count() method on the same dataframe, again grouped by the “Students” column:

**For example:**

    import numpy as np
    import pandas as pd

    # create a dataframe with one missing subject
    data = {
        "Students": ["Ray", "John", "Mole", "John", "John", "John", "Ray", "Rick"],
        "Subjects": ["Maths", "Economics", "Science", "Maths",
                     np.nan, "Statistics", "Statistics", "Computers"]
    }
    df = pd.DataFrame(data)

    # display the number of non-null subjects for each student
    print(df.groupby('Students').count())

**Output:**

              Subjects
    Students
    John             3
    Mole             1
    Ray              2
    Rick             1

In the above example, John has four rows, but one of them has a missing subject, so size() reports 4 while count() reports 3. If you wish to count the total number of rows in each group, use the **size()** method on groupby, and if you wish to count only the **non-null values**, use the pandas groupby **count()** method.
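To see both numbers side by side, one option is named aggregation with agg(), sketched below on the same data. The output column names "total_rows" and "non_null_subjects" are arbitrary labels chosen for this illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "Students": ["Ray", "John", "Mole", "John", "John", "John", "Ray", "Rick"],
    "Subjects": ["Maths", "Economics", "Science", "Maths",
                 np.nan, "Statistics", "Statistics", "Computers"],
})

# Named aggregation: each keyword becomes an output column, defined as
# (source column, aggregation). "size" counts every row in the group,
# while "count" counts only the non-null values.
summary = df.groupby("Students").agg(
    total_rows=("Subjects", "size"),
    non_null_subjects=("Subjects", "count"),
)
print(summary)

# The gap between the two columns is the number of missing subjects.
missing = summary["total_rows"] - summary["non_null_subjects"]
```

A table like this makes it easy to spot which groups contain missing values, since those are the rows where the two columns differ.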

**Conclusion**

Pandas is an open-source Python library that provides powerful tools for data manipulation and data analysis. To use it efficiently, however, you must be familiar with its large collection of built-in functions, which let you perform operations on large datasets. In this article, we studied how you can count the number of rows in each group of a pandas groupby using built-in functions, which makes your code simpler and more efficient when working with massive data. If you want more practice with **pandas, try these exercises for beginners**.