If you regularly work with data in Python using the Pandas library, you will often need to manipulate and filter DataFrames. After these operations, you can end up with a smaller DataFrame that still carries the row labels of the original one, which leaves the index non-continuous or otherwise not what you want. Pandas provides a handy method called reset_index() for this situation, and this article walks through how to use it.
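As a minimal sketch of the situation described above, the snippet below filters a small DataFrame and shows the gap left in the index, then renumbers it. The variable names `filtered` and `renumbered` are chosen here for illustration only.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Sally", "Mary", "John"],
    "Age": [50, 40, 30],
    "Qualified": [True, False, False],
})

# Filtering keeps the original row labels, so label 1 is now missing.
filtered = df[df["Age"] != 40]
print(list(filtered.index))    # [0, 2]

# reset_index(drop=True) renumbers the rows from 0 and discards the old labels.
renumbered = filtered.reset_index(drop=True)
print(list(renumbered.index))  # [0, 1]
```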
What is the reset_index() Function in Pandas?
The reset_index() method is a powerful tool in Pandas that allows you to reset the index of a DataFrame back to the default integer index (0, 1, 2, …). By default, this method keeps the original indexes in a column named “index,” but you can choose to remove them using the drop parameter.
If you have already reset the index without drop=True and later decide you do not need that column, you can still remove it afterwards using the drop() method.
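The two routes can be compared side by side. This is a small sketch using made-up variable names (`kept`, `removed_later`, `removed_directly`); it assumes the old index had no name, so the new column is called "index".

```python
import pandas as pd

df = pd.DataFrame({"Name": ["Sally", "Mary"]}, index=["X", "Y"])

# Default behaviour: the old labels are kept in a column named "index".
kept = df.reset_index()
print(list(kept.columns))  # ['index', 'Name']

# Route 1: remove that column afterwards with drop().
removed_later = kept.drop(columns="index")

# Route 2: never create the column in the first place.
removed_directly = df.reset_index(drop=True)

# Both routes give the same DataFrame.
print(removed_later.equals(removed_directly))  # True
```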
Here’s a basic syntax for the reset_index() function:
dataframe.reset_index(level=None, drop=False, inplace=False, col_level=0, col_fill='')
Here’s a breakdown of the syntax:
- level: Specifies the levels to reset. By default, it resets all levels.
- drop: Specifies whether to drop the old index column. The default value is False, which means the old index is kept as a column in the DataFrame.
- inplace: Specifies whether to modify the DataFrame in place or return a new DataFrame. The default value is False, which returns a new DataFrame with the reset index.
- col_level: For DataFrames with multi-level columns, determines which level the labels are inserted into. The default value is 0, which inserts the labels into the first level.
- col_fill: For DataFrames with multi-level columns, determines how the other levels are named. The default value is an empty string (''), but you can specify a custom name.
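The level parameter is easiest to see on a DataFrame with a multi-level index. The following sketch uses invented sample data (the `sales` frame and its "year" and "quarter" levels are hypothetical) to contrast resetting one level with resetting all of them.

```python
import pandas as pd

# Hypothetical data with a two-level index, for illustration only.
index = pd.MultiIndex.from_tuples(
    [("2023", "Q1"), ("2023", "Q2"), ("2024", "Q1")],
    names=["year", "quarter"],
)
sales = pd.DataFrame({"revenue": [10, 12, 15]}, index=index)

# Reset only the "quarter" level: it becomes a column, "year" stays as the index.
only_quarter = sales.reset_index(level="quarter")
print(only_quarter.index.name)     # year
print(list(only_quarter.columns))  # ['quarter', 'revenue']

# Reset every level (the default): both become columns.
all_levels = sales.reset_index()
print(list(all_levels.columns))    # ['year', 'quarter', 'revenue']
```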
Now that we have an understanding of the reset_index() method, let’s explore different scenarios and examples to reset the index in a Pandas DataFrame.
Creating a DataFrame with a Custom Index
In some cases, you may want your DataFrame to use custom row labels instead of the default integer index. This can be achieved by passing a list or array of values to the index parameter when creating the DataFrame. The custom labels then take the place of the default 0, 1, 2, … index.
Let’s consider the following example:
import pandas as pd

data = {
    "Name": ["Sally", "Mary", "John"],
    "Age": [50, 40, 30],
    "Qualified": [True, False, False]
}

# Create the custom index
index = ["X", "Y", "Z"]
df = pd.DataFrame(data, index=index)

# Display DataFrame
print('DataFrame:\n', df)
Output:
DataFrame:
     Name  Age  Qualified
X  Sally   50       True
Y   Mary   40      False
Z   John   30      False
Removing a Custom Index and Restoring the Default Index
If your DataFrame has a custom index and you want to discard it in favor of the default integer index, you can use the reset_index() method with the drop parameter set to True.
The reset_index() method replaces the old index with a new default integer index. Setting drop=True ensures that the old index labels are discarded instead of being added as a column in the DataFrame.
Continuing from the previous example, let’s reset the index and discard the custom labels:
import pandas as pd

data = {
    "Name": ["Sally", "Mary", "John"],
    "Age": [50, 40, 30],
    "Qualified": [True, False, False]
}

# Create custom index
index = ["X", "Y", "Z"]
df = pd.DataFrame(data, index=index)

# Display the original DataFrame
print('Original DataFrame:\n', df)

# To reset the index to the default integer index
df.reset_index(drop=True, inplace=True)

# Display the modified DataFrame
print('Modified DataFrame:\n', df)
Output:
Original DataFrame:
     Name  Age  Qualified
X  Sally   50       True
Y   Mary   40      False
Z   John   30      False
Modified DataFrame:
     Name  Age  Qualified
0  Sally   50       True
1   Mary   40      False
2   John   30      False
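The example above modifies the DataFrame in place. As a brief sketch of the alternative, the snippet below shows that without inplace=True the original is left untouched, and that a call with inplace=True returns None. The names `renumbered`, `scratch`, and `returned` are illustrative only.

```python
import pandas as pd

df = pd.DataFrame(
    {"Name": ["Sally", "Mary", "John"], "Age": [50, 40, 30]},
    index=["X", "Y", "Z"],
)

# Without inplace=True, reset_index() returns a new DataFrame
# and leaves the original untouched.
renumbered = df.reset_index(drop=True)
print(list(df.index))          # ['X', 'Y', 'Z']
print(list(renumbered.index))  # [0, 1, 2]

# With inplace=True the call returns None, so do not assign its result.
scratch = df.copy()
returned = scratch.reset_index(drop=True, inplace=True)
print(returned)                # None
print(list(scratch.index))     # [0, 1, 2]
```

Assigning the result to a new variable keeps the original labels available in case they are needed later.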
Resetting the Index to the Default Integer Index
When you want to reset the index of a DataFrame to the default integer index, you can simply call the reset_index() method without any arguments. This replaces the existing index with a new sequential index starting from 0. Because drop defaults to False, the old index labels are not lost: they are kept in a new column named “index”.
Let’s consider the following example:
import pandas as pd

data = {
    "Name": ["Sally", "Mary", "John"],
    "Age": [50, 40, 30],
    "Qualified": [True, False, False]
}

index = ["X", "Y", "Z"]
df = pd.DataFrame(data, index=index)

# Display the original DataFrame
print('Original DataFrame:\n', df)

# To reset the index to the default integer index
newdf = df.reset_index()

# Display the modified DataFrame
print('Modified DataFrame:\n', newdf)
Output:
Original DataFrame:
     Name  Age  Qualified
X  Sally   50       True
Y   Mary   40      False
Z   John   30      False
Modified DataFrame:
    index   Name  Age  Qualified
0      X  Sally   50       True
1      Y   Mary   40      False
2      Z   John   30      False
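The column is called “index” only because the old index had no name. One way to get a more descriptive column, sketched below, is to name the index with rename_axis() before resetting it. The name "Label" is a placeholder chosen for this example.

```python
import pandas as pd

df = pd.DataFrame(
    {"Name": ["Sally", "Mary", "John"], "Age": [50, 40, 30]},
    index=["X", "Y", "Z"],
)

# Give the index a name first; reset_index() then uses it as the column name.
labeled = df.rename_axis("Label").reset_index()
print(list(labeled.columns))  # ['Label', 'Name', 'Age']
print(list(labeled["Label"])) # ['X', 'Y', 'Z']
```

Recent pandas releases (1.5 and later, as far as this sketch assumes) also accept a names argument on reset_index() for the same purpose.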
Making a Column of the DataFrame the Index
In some cases, you may want one of the DataFrame’s existing columns to serve as the index. This is done with the set_index() method rather than reset_index(): the chosen column becomes the new index, and the previous index (here the custom labels X, Y, Z) is discarded.
Let’s consider the following example:
import pandas as pd

data = {
    "Name": ["Sally", "Mary", "John"],
    "Age": [50, 40, 30],
    "Qualified": [True, False, False]
}

# Create custom index
index = ["X", "Y", "Z"]
df = pd.DataFrame(data, index=index)

# Display the original DataFrame
print('Original DataFrame:\n', df)

# Make the 'Age' column the index (the old X, Y, Z labels are discarded)
df.set_index('Age', inplace=True)

# Display the modified DataFrame
print('Modified DataFrame:\n', df)
Output:
Original DataFrame:
     Name  Age  Qualified
X  Sally   50       True
Y   Mary   40      False
Z   John   30      False
Modified DataFrame:
       Name  Qualified
Age
50   Sally       True
40    Mary      False
30    John      False
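set_index() and reset_index() work in opposite directions, which the following sketch illustrates with a round trip. The variable names `by_age` and `restored` are made up for this example.

```python
import pandas as pd

df = pd.DataFrame({"Name": ["Sally", "Mary", "John"], "Age": [50, 40, 30]})

# Move the 'Age' column into the index.
by_age = df.set_index("Age")
print(by_age.index.name)        # Age
print(by_age.loc[40, "Name"])   # Mary

# reset_index() moves it back out as a regular column, placed first,
# and restores the default integer index.
restored = by_age.reset_index()
print(list(restored.columns))   # ['Age', 'Name']
print(list(restored.index))     # [0, 1, 2]
```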
Conclusion
In this article, we explored how to reset the index of a Pandas DataFrame. We learned how to create a custom index, reset it to the default integer index while either keeping or discarding the old labels, and make a column the new index with set_index(). With these methods, you can keep your DataFrames organized and structured according to your requirements as you filter and transform your data.