Renaming columns in a Pandas DataFrame is a common task, whether you need to standardize column names, make them more descriptive, or follow a different naming convention. In this article, we will explore how to rename a single column or multiple columns in Pandas.
Pandas rename() Function
Clear, consistent column names make a DataFrame easier to read and keep it aligned with the naming conventions used in the rest of a project.
The rename() function in Pandas allows you to rename one or more columns by providing a dictionary-like object that maps the old column names to the new ones.
If you want to remove columns, you can use the drop() method.
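Here’s the basic syntax for the rename() method, as documented for recent Pandas 2.x releases (the exact defaults, especially for copy, differ between Pandas versions):
DataFrame.rename(mapper=None, *, index=None, columns=None, axis=None, copy=None, inplace=False, level=None, errors='ignore')
Here’s a breakdown of the syntax:
- mapper: Dictionary-like or function. It’s used to specify the mapping of old names to new names. If it’s a dictionary, keys are the current column/index names, and values are the new names. If it’s a function, it should take a column/index name and return a new name.
- index and columns: These parameters are used to specifically rename the index or columns, respectively. You can provide a dictionary or a function similar to the mapper parameter.
- axis: Specifies which axis the mapper applies to: the index (axis=0 or 'index') or the columns (axis=1 or 'columns'). It is only used together with mapper. If axis is not given, the mapper is applied to the index.
- copy: Controls whether the underlying data is copied when the renamed DataFrame is built. It does not make rename() modify the original DataFrame in place; that is the job of inplace. Its default and effect depend on the Pandas version, and it may be ignored when copy-on-write is enabled.
- inplace: If True, it modifies the DataFrame in place and returns None. If False (default), it returns a new DataFrame with the updated names.
- level: For DataFrames with hierarchical indexing, this parameter specifies the level to rename.
- errors: Either 'ignore' (default) or 'raise'. With 'ignore', keys in the mapping that do not match an existing label are silently skipped. With 'raise', a KeyError is raised for them.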
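To make the parameters more concrete, here is a minimal sketch of three equivalent or related ways to call rename(). The sample data and the variable names (by_keyword, by_axis, upper) are made up for illustration only.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration
df = pd.DataFrame({'name': ['John', 'Jane'], 'age': [25, 30]})

# 1. columns= keyword: the dictionary is applied to the column labels
by_keyword = df.rename(columns={'name': 'full_name'})

# 2. mapper + axis: same result, with the target axis named explicitly
by_axis = df.rename({'name': 'full_name'}, axis='columns')

# 3. A function as the mapper: it is called once per column label
upper = df.rename(columns=str.upper)

print(list(by_keyword.columns))  # ['full_name', 'age']
print(list(by_axis.columns))     # ['full_name', 'age']
print(list(upper.columns))       # ['NAME', 'AGE']

# Without inplace=True, the original DataFrame keeps its labels
print(list(df.columns))          # ['name', 'age']
```

Returning a new DataFrame, as in this sketch, makes it easy to chain rename() with other methods; inplace=True is mostly a matter of preference.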
Now let us look at the different ways to use rename(), followed by other techniques for renaming columns in Pandas.
Renaming a Single Column
To rename a single column, pass a dictionary to the columns parameter of rename(), with the old column name as the key and the new column name as the value.
Let’s consider the following example in Python:
import pandas as pd

data = {'name': ['John', 'Jane', 'Jade'], 'age': [25, 30, 35]}
df = pd.DataFrame(data)

# Display the original DataFrame
print('Original DataFrame:\n', df)

# Rename the column 'name'
df.rename(columns={'name': 'full_name'}, inplace=True)

# Display the modified DataFrame
print('Modified DataFrame:\n', df)
Output:
Original DataFrame:
    name  age
0  John   25
1  Jane   30
2  Jade   35
Modified DataFrame:
    full_name  age
0       John   25
1       Jane   30
2       Jade   35
Renaming Multiple Columns
To rename multiple columns, you can provide a dictionary-like object with the old column names as keys and the new column names as values.
Let’s consider the following example in Python:
import pandas as pd

data = {'name': ['John', 'Jane', 'Jade'], 'age': [25, 30, 35], 'city': ['New York', 'London', 'Paris']}
df = pd.DataFrame(data)

# Display the original DataFrame
print('Original DataFrame:\n', df)

# Rename the columns 'name', 'city'
df.rename(columns={'name': 'full_name', 'city': 'location'}, inplace=True)

# Display the modified DataFrame
print('Modified DataFrame:\n', df)
Output:
Original DataFrame:
    name  age      city
0  John   25  New York
1  Jane   30    London
2  Jade   35     Paris
Modified DataFrame:
    full_name  age  location
0       John   25  New York
1       Jane   30    London
2       Jade   35     Paris
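One thing to watch for when renaming several columns at once: by default, rename() silently skips dictionary keys that do not match any column, so a typo goes unnoticed. Current Pandas versions accept an errors parameter that can turn this into an exception. The sketch below uses made-up data and a deliberate typo ('citty') to show the difference.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration
df = pd.DataFrame({'name': ['John', 'Jane'], 'city': ['New York', 'London']})

# 'citty' is a deliberate typo: by default rename() skips keys it cannot find
silent = df.rename(columns={'name': 'full_name', 'citty': 'location'})
print(list(silent.columns))  # ['full_name', 'city']

# errors='raise' turns the unmatched key into a KeyError instead
try:
    df.rename(columns={'name': 'full_name', 'citty': 'location'}, errors='raise')
    typo_detected = False
except KeyError as exc:
    typo_detected = True
    print('Rename failed:', exc)
```

Using errors='raise' is a reasonable default in scripts where a missed rename would cause confusing failures later on.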
Assigning a List of New Column Names
Besides rename(), you can rename columns by assigning a list of new names directly to the columns attribute of the DataFrame. The list must contain exactly one name for every column, in the existing column order, so this approach is best suited to renaming all the columns at once. To change only some of them, you still have to repeat the unchanged names in their positions.
Consider the following example:
import pandas as pd

data = {'name': ['John', 'Jane', 'Jade'], 'age': [25, 30, 35]}
df = pd.DataFrame(data)

# Display the original DataFrame
print('Original DataFrame:\n', df)

# Rename both the "name" and "age" columns to "full_name" and "years_old"
df.columns = ['full_name', 'years_old']

# Display the modified DataFrame
print('Modified DataFrame:\n', df)
Output:
Original DataFrame:
    name  age
0  John   25
1  Jane   30
2  Jade   35
Modified DataFrame:
    full_name  years_old
0       John         25
1       Jane         30
2       Jade         35
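Because the assigned list replaces every label, its length has to match the number of columns. The following sketch, built on made-up data, shows what happens when it does not, and one possible way to change only some names with this approach. The new_names dictionary is a hypothetical helper for this example, not part of Pandas.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration
df = pd.DataFrame({'name': ['John', 'Jane'], 'age': [25, 30], 'city': ['Paris', 'London']})

# Assigning too few names fails: the list length must equal the number of columns
try:
    df.columns = ['full_name', 'years_old']
    mismatch_rejected = False
except ValueError as exc:
    mismatch_rejected = True
    print('Assignment failed:', exc)

# To change only some names this way, rebuild the full list and
# keep the labels that should stay the same
new_names = {'name': 'full_name'}  # hypothetical mapping for this example
df.columns = [new_names.get(col, col) for col in df.columns]
print(list(df.columns))  # ['full_name', 'age', 'city']
```

For partial renames, rename() with a dictionary is usually the simpler choice, since it does not depend on column order.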
Using add_prefix() and add_suffix()
The add_prefix() and add_suffix() methods in Pandas add a prefix or suffix to every existing column name and return a new DataFrame. They are useful when you want to keep the original column names but attach extra information to differentiate them.
Consider the following example:
import pandas as pd

data = {'name': ['John', 'Jane', 'Jade'], 'age': [25, 30, 35]}
df = pd.DataFrame(data)

# Display the original DataFrame
print('Original DataFrame:\n', df)

# To add a prefix "user_" and a suffix "_info" to the column names
df = df.add_prefix('user_')
df = df.add_suffix('_info')

# Display the modified DataFrame
print('Modified DataFrame:\n', df)
Output:
Original DataFrame:
    name  age
0  John   25
1  Jane   30
2  Jade   35
Modified DataFrame:
    user_name_info  user_age_info
0            John             25
1            Jane             30
2            Jade             35
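Since both methods return a new DataFrame rather than changing the existing one, the calls can also be chained. The sketch below illustrates this on made-up data, along with one hypothetical use case: tagging columns with a year before placing two tables side by side.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration
df = pd.DataFrame({'name': ['John', 'Jane'], 'age': [25, 30]})

# Both methods return a new DataFrame, so the calls can be chained
labelled = df.add_prefix('user_').add_suffix('_info')

print(list(labelled.columns))  # ['user_name_info', 'user_age_info']
print(list(df.columns))        # ['name', 'age'] (the original is unchanged)

# Hypothetical use case: tag columns by source before combining two tables
left = df.add_suffix('_2023')
right = df.add_suffix('_2024')
combined = pd.concat([left, right], axis=1)
print(list(combined.columns))
# ['name_2023', 'age_2023', 'name_2024', 'age_2024']
```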
Using DataFrame.columns.str.replace
Pandas provides the DataFrame.columns.str.replace method, which replaces specific text within column names. It is useful when you want to swap certain substrings or characters for new values across the column names. Because it matches substrings, every column name that contains the text is affected.
Consider the following example:
import pandas as pd

data = {'name': ['John', 'Jane', 'Jade'], 'age': [25, 30, 35]}
df = pd.DataFrame(data)

# Display the original DataFrame
print('Original DataFrame:\n', df)

# To replace the substring "name" with "full_name" and "age" with "years_old" in the column names
df.columns = df.columns.str.replace('name', 'full_name')
df.columns = df.columns.str.replace('age', 'years_old')

# Display the modified DataFrame
print('Modified DataFrame:\n', df)
Output:
Original DataFrame:
    name  age
0  John   25
1  Jane   30
2  Jade   35
Modified DataFrame:
    full_name  years_old
0       John         25
1       Jane         30
2       Jade         35
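A typical use of this technique is cleaning up labels, for example replacing spaces with underscores. The sketch below uses made-up column names to show that, and also a pitfall: the replacement applies to substrings, so it can touch columns you did not intend to change. Passing regex=False is an explicit choice made here so the pattern is treated as literal text regardless of the Pandas version's default.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration
df = pd.DataFrame({'first name': ['John'], 'last name': ['Doe'], 'username': ['jdoe']})

# Replace spaces with underscores in every column label.
# regex=False treats the pattern as literal text.
df.columns = df.columns.str.replace(' ', '_', regex=False)
print(list(df.columns))  # ['first_name', 'last_name', 'username']

# Caution: the match is on substrings, so 'name' is also replaced inside 'username'
renamed = df.columns.str.replace('name', 'label', regex=False)
print(list(renamed))  # ['first_label', 'last_label', 'userlabel']
```

When only whole column names should change, a dictionary passed to rename() avoids this kind of accidental match.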
Conclusion
In this article, we explored several methods to rename columns: using the rename() function, assigning a list of new column names, adding prefixes and suffixes with add_prefix() and add_suffix(), and replacing text in column names with DataFrame.columns.str.replace. By mastering these techniques, you can easily manipulate column names in your DataFrame to align with your data analysis needs.