Articles by FavTutor

Pandas DataFrame astype() Method (with Examples)

by Piyush Kaushal
December 4, 2023

Python is a versatile language, particularly for working with data through the Pandas library. In this article, we look at the astype() method in Pandas, which lets us convert data to a type of our choosing. It can also convert existing columns to a categorical type.

What is the astype() Method in Pandas?

The astype() method in Pandas casts a pandas object, such as a DataFrame or Series, to a specified data type. It provides a flexible way to convert the data types of one or more columns in a DataFrame, whether we need to change a single column or several columns at once.

Besides changing the data type of columns, the astype() method also allows us to convert columns to categorical types. This is useful when dealing with variables that only have a limited number of unique values, such as categorical variables or factors.

Syntax and Parameters of astype() Method

Now let us explore the syntax of the astype() method.

DataFrame.astype(dtype, copy=True, errors='raise', **kwargs)

Here is the breakdown of the syntax:

  • dtype: Specifies the data type to which the DataFrame should be cast. It can be a numpy.dtype or a Python type. Alternatively, we can provide a dictionary with column names as keys and their corresponding data types as values.
  • copy: Controls whether a copy of the data is returned. By default, it is set to True. With copy=False, changes made to the values may propagate to other pandas objects.
  • errors: Controls what happens when the data is invalid for the requested data type. It can take two values: 'raise' (the default) lets the exception propagate, while 'ignore' suppresses it and returns the original object.
  • **kwargs: Additional keyword arguments passed on to the constructor. This argument belongs to older pandas releases; the signature in current releases no longer includes it.
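
To make the parameters above more concrete, here is a minimal sketch. The column names qty and code and their values are made up for illustration. It shows that astype() returns a new object instead of modifying the original, and that with the default errors='raise' an invalid value stops the cast.

```python
import pandas as pd

# Hypothetical sample data: numbers stored as strings, plus one bad value ("x").
df = pd.DataFrame({"qty": ["1", "2", "3"], "code": ["7", "8", "x"]})

# astype() returns a new object; the original DataFrame keeps its dtypes.
converted = df.astype({"qty": "int64"})
print(df["qty"].dtype)         # still a string/object dtype
print(converted["qty"].dtype)  # int64

# With the default errors='raise', a value that cannot be parsed stops the cast.
try:
    df["code"].astype("int64")
    cast_failed = False
except ValueError as exc:
    cast_failed = True
    print("Cast failed:", exc)
```

Because the result is a new object, it has to be assigned back (as the examples in this article do) for the change to take effect.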

Now let us see various use cases of the astype() function.

Casting the Data Type of a Single Column 

The astype() method is commonly used to change the data type of a specific column in a DataFrame.  Let’s consider an example: We have a DataFrame with columns representing different attributes of a person, such as Name, Age, and Weight. We want to convert the Weight column to an integer data type.

We can use the astype() method in this way:

import pandas as pd

data = {
    "Name": ["John", "Emma", "Michael"],
    "Age": [25, 30, 35],
    "Weight": [65.2, 68.5, 73.1]
}
df = pd.DataFrame(data)
# Display the original DataFrame
print('Original DataFrame:\n', df)

# Change the data type of 'Weight' column
df["Weight"] = df["Weight"].astype('int64')

# Display the new DataFrame. 
print('Updated DataFrame:\n',df)

Output:

Original DataFrame:
       Name  Age  Weight
0     John   25    65.2
1     Emma   30    68.5
2  Michael   35    73.1

Updated DataFrame:
       Name  Age  Weight
0     John   25      65
1     Emma   30      68
2  Michael   35      73
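
Note that the output above shows the fractional part being dropped, not rounded. The following sketch, using made-up weights, contrasts a plain cast with rounding first. Whether rounding is appropriate depends on the data, so treat it as one option and not a rule.

```python
import pandas as pd

# Hypothetical values chosen to make the difference visible.
weights = pd.Series([65.2, 68.6, 73.9, -1.7])

# A plain cast discards the fractional part (truncation toward zero).
truncated = weights.astype("int64")

# Rounding first gives the nearest whole number before the cast.
rounded = weights.round().astype("int64")

print(truncated.tolist())  # [65, 68, 73, -1]
print(rounded.tolist())    # [65, 69, 74, -2]
```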

Casting the Data Type of Multiple Columns 

In addition to changing the data type of a single column, the astype() method also allows us to change the data types of multiple columns simultaneously. This can be achieved by providing a dictionary containing column names as keys and their corresponding data types as values.

Using the same DataFrame as above, let us now change the data types of both the Weight and Age columns. Check the Python code below:

import pandas as pd

data = {
    "Name": ["John", "Emma", "Michael"],
    "Age": [25, 30, 35],
    "Weight": [65.2, 68.5, 73.1]
}
df = pd.DataFrame(data)

# Display the original DataFrame
print('Original DataFrame:\n', df)

# Change the data type of 'Weight' and 'Age' columns
df = df.astype({"Age": 'float', "Weight": 'int64'})

# Display the new DataFrame. 
print('Updated DataFrame:\n',df)

Output:

Original DataFrame:
       Name  Age  Weight
0     John   25    65.2
1     Emma   30    68.5
2  Michael   35    73.1

Updated DataFrame:
       Name   Age  Weight
0     John  25.0      65
1     Emma  30.0      68
2  Michael  35.0      73
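
The printed table only hints at the new types, so it can help to inspect the dtypes attribute directly. The sketch below repeats the dictionary cast and also shows what happens when a key does not match any column. The Height key is a deliberately wrong, hypothetical name.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["John", "Emma", "Michael"],
    "Age": [25, 30, 35],
    "Weight": [65.2, 68.5, 73.1],
})

# Cast two columns at once; columns not named in the dictionary are untouched.
converted = df.astype({"Age": "float64", "Weight": "int64"})
print(converted.dtypes)

# A dictionary key that is not a column name is rejected with a KeyError.
try:
    df.astype({"Height": "int64"})  # "Height" does not exist in df
    missing_key_rejected = False
except KeyError:
    missing_key_rejected = True

print("Unknown column rejected:", missing_key_rejected)
```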

Converting Columns to Categorical Type 

The astype() method can also convert a column to the categorical type. Categorical types are useful when dealing with variables that have a limited number of unique values or that represent categories or factors.

Consider a scenario where we have a DataFrame with a column representing the gender of individuals. We want to convert the Gender column to a categorical type. Converting the column to categorical data type will allow for more efficient data storage. Here is the code:

import pandas as pd

data = {
    "Name": ["John", "Emma", "Michael"],
    "Gender": ["Male", "Female", "Male"],
    "Age": [25, 30, 35]
}
df = pd.DataFrame(data)

# Display the original DataFrame
print('Original DataFrame:\n', df)

# Convert the 'Gender' column to categorical data type
df["Gender"] = df["Gender"].astype('category')

# Display the new DataFrame. 
print('Updated DataFrame:\n',df)

Output:

Original DataFrame:
       Name  Gender  Age
0     John    Male   25
1     Emma  Female   30
2  Michael    Male   35

Updated DataFrame:
       Name  Gender  Age
0     John    Male   25
1     Emma  Female   30
2  Michael    Male   35
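
The two printed tables look identical because the conversion changes how the column is stored, not how it is displayed. As a rough sketch, the snippet below checks the dtype and compares memory usage on a longer, artificially repeated column. The repetition factor of 1000 is an arbitrary assumption, chosen only to make the difference noticeable.

```python
import pandas as pd

# Hypothetical column: the same three labels repeated many times.
gender = pd.Series(["Male", "Female", "Male"] * 1000)
as_category = gender.astype("category")

print(as_category.dtype)                 # category
print(list(as_category.cat.categories))  # ['Female', 'Male']

# Compare the memory footprint of the two representations (in bytes).
original_bytes = gender.memory_usage(deep=True)
category_bytes = as_category.memory_usage(deep=True)
print(original_bytes, category_bytes)
```

The saving comes from storing each distinct label once and keeping only small integer codes per row, so it grows with the number of repeated values.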

Handling Missing Values 

When working with real-world datasets, it is common to encounter missing or NaN (Not a Number) values. To learn more about NaN values and how to handle them, refer to the Pandas fillna() method.

The astype() method does not handle missing values by itself, so they need attention before we change the data type of a column. Consider a scenario where we have a DataFrame with a column representing the weight of individuals, and this column contains some missing (NaN) values.

Casting a column that contains NaN to an integer type raises an error, so we must handle the missing values first. We can accomplish this by dropping the rows that contain any NaN values using the dropna() method.

After handling the missing values, we can proceed with changing the data type of the Weight column using the astype() method. Check the example below:

import pandas as pd

data = {
    "Name": ["John", "Emma", "Michael"],
    "Weight": [65.2, 68.5, None],
    "Age": [25, 30, 35]
}
df = pd.DataFrame(data)

# Display the original DataFrame
print('Original DataFrame:\n', df)

# Use dropna to handle missing values
df.dropna(inplace=True)

# Change the data type of 'Weight' column
df["Weight"] = df["Weight"].astype('int64')

# Display the new DataFrame. 
print('Updated DataFrame:\n',df)

Output:

Original DataFrame:
       Name  Weight  Age
0     John    65.2   25
1     Emma    68.5   30
2  Michael     NaN   35

Updated DataFrame:
    Name  Weight  Age
0  John      65   25
1  Emma      68   30
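
Dropping rows is not the only option. The sketch below first confirms that a plain integer cast fails on NaN, then shows two alternatives that keep every row: filling with a placeholder and casting to the nullable Int64 dtype. The placeholder 0 and the decision to round are assumptions made for illustration, not recommendations.

```python
import pandas as pd

# Hypothetical weights with one missing value.
weights = pd.Series([65.2, 68.6, None], name="Weight")

# A plain int64 cast cannot represent NaN, so it raises an error.
try:
    weights.astype("int64")
    nan_cast_failed = False
except ValueError:
    nan_cast_failed = True

# Option 1: replace NaN with a placeholder (0 is an arbitrary choice), then cast.
filled = weights.fillna(0).astype("int64")

# Option 2: round to whole numbers, then use the nullable "Int64" dtype,
# which keeps the missing entry as <NA>.
nullable = weights.round().astype("Int64")

print(nan_cast_failed)
print(filled.tolist())
print(nullable)
```

Note the capital "I" in "Int64": it selects the pandas nullable integer type and not the NumPy int64 type.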

Conclusion

In this article, we have discussed the astype() method in Pandas. It is useful during data analysis for changing the data types of one or more columns in a DataFrame. By understanding its syntax and use cases, we can manipulate data types effectively for a variety of data analysis tasks.
