{"id":710,"date":"2023-12-01T15:39:50","date_gmt":"2023-12-01T15:39:50","guid":{"rendered":"https:\/\/favtutor.com\/articles\/?p=710"},"modified":"2023-12-04T09:06:18","modified_gmt":"2023-12-04T09:06:18","slug":"pandas-dataframe-astype","status":"publish","type":"post","link":"https:\/\/favtutor.com\/articles\/pandas-dataframe-astype\/","title":{"rendered":"Pandas DataFrame astype() Method (with Examples)"},"content":{"rendered":"\n<p>Python is a versatile language, and the Pandas library makes it especially handy for working with data. In this article, we&#8217;re going to dive into the astype() method in Pandas. It lets us change the data type of a Pandas object to another supported type. It can also convert existing columns to a categorical type.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>What is the astype() Method in Pandas?<\/strong><\/h2>\n\n\n\n<p><strong>The astype() method in Pandas is used to cast a pandas object, such as a DataFrame or Series, to a specified data type.<\/strong> It provides a flexible way to convert the data types of one or more columns in a DataFrame. It is especially useful when we need to change the data type of a specific column or of several columns at once.<\/p>\n\n\n\n<p>Besides changing the data type of columns, the astype() method also allows us to convert columns to categorical types. 
This is useful when dealing with variables that only have a limited number of unique values, such as categorical variables or factors.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\"><strong>Syntax and Parameters of astype() Method<\/strong><\/h3>\n\n\n\n<p>Now let us explore the syntax of the astype() method.<\/p>\n\n\n\n<pre class=\"wp-block-code has-black-color has-text-color has-background has-medium-font-size\" style=\"background-color:#fedcba\"><code>DataFrame.astype(dtype, copy=True, errors='raise')\n<\/code><\/pre>\n\n\n\n<p>Here is the breakdown of the syntax:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>dtype<\/strong>: Specifies the data type to which the DataFrame should be cast. It can be a string alias such as 'int64', a numpy.dtype, a pandas extension dtype, or a Python type. Alternatively, we can provide a dictionary with column names as keys and their corresponding data types as values.<\/li>\n\n\n\n<li><strong>copy<\/strong>: Specifies whether to return a copy of the data. By default, it is set to True. With copy=False, changes made to the values may propagate to other pandas objects.<\/li>\n\n\n\n<li><strong>errors<\/strong>: Controls what happens when the data is invalid for the requested data type. It can take two values: &#8216;raise&#8217; (default) allows exceptions to be raised, while &#8216;ignore&#8217; suppresses exceptions and returns the original object on error.<\/li>\n\n\n\n<li>**<strong>kwargs<\/strong>: Older pandas releases accepted additional keyword arguments that were passed on to the constructor. The signature in pandas 1.0 and later no longer includes them.<\/li>\n<\/ul>\n\n\n\n<p>Now let us see various use cases of the astype() method.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\"><strong>Casting the Data Type of a Single Column<\/strong><\/h3>\n\n\n\n<p><strong>The astype() method is commonly used to change the data type of a specific column in a DataFrame.<\/strong> Let&#8217;s consider an example: We have a DataFrame with columns representing different attributes of a person, such as Name, Age, and Weight. 
We want to convert the Weight column to an integer data type. Note that casting floats to integers truncates the decimal part rather than rounding it.<\/p>\n\n\n\n<p>We can use the astype() method in this way:<\/p>\n\n\n\n<div class=\"wp-block-codemirror-blocks-code-block code-block\"><pre class=\"CodeMirror\" data-setting=\"{&quot;mode&quot;:&quot;python&quot;,&quot;mime&quot;:&quot;text\/x-python&quot;,&quot;theme&quot;:&quot;material&quot;,&quot;lineNumbers&quot;:true,&quot;styleActiveLine&quot;:false,&quot;lineWrapping&quot;:false,&quot;readOnly&quot;:true,&quot;language&quot;:&quot;Python&quot;,&quot;modeName&quot;:&quot;python&quot;}\">import pandas as pd\n\ndata = {\n    &quot;Name&quot;: [&quot;John&quot;, &quot;Emma&quot;, &quot;Michael&quot;],\n    &quot;Age&quot;: [25, 30, 35],\n    &quot;Weight&quot;: [65.2, 68.5, 73.1]\n}\ndf = pd.DataFrame(data)\n\n# Display the original DataFrame\nprint('Original DataFrame:\\n', df)\n\n# Change the data type of the 'Weight' column (floats are truncated)\ndf[&quot;Weight&quot;] = df[&quot;Weight&quot;].astype('int64')\n\n# Display the new DataFrame\nprint('Updated DataFrame:\\n', df)<\/pre><\/div>\n\n\n\n<p><strong>Output:<\/strong><\/p>\n\n\n\n<pre class=\"wp-block-code has-black-color has-text-color has-background\" style=\"background-color:#fedcba\"><code>Original DataFrame:\n       Name  Age  Weight\n0     John   25    65.2\n1     Emma   30    68.5\n2  Michael   35    73.1\n\nUpdated DataFrame:\n       Name  Age  Weight\n0     John   25      65\n1     Emma   30      68\n2  Michael   35      73\n<\/code><\/pre>\n\n\n\n<h3 class=\"wp-block-heading\"><strong>Casting the Data Type of Multiple Columns<\/strong><\/h3>\n\n\n\n<p>In addition to changing the data type of a single column, the astype() method also allows us to change the data types of multiple columns simultaneously. This can be achieved by providing a dictionary containing column names as keys and their corresponding data types as values.<\/p>\n\n\n\n<p>Using the same DataFrame as above, let us now change the data types of both the Weight and Age columns. 
Check the Python code below:<\/p>\n\n\n\n<div class=\"wp-block-codemirror-blocks-code-block code-block\"><pre class=\"CodeMirror\" data-setting=\"{&quot;mode&quot;:&quot;python&quot;,&quot;mime&quot;:&quot;text\/x-python&quot;,&quot;theme&quot;:&quot;material&quot;,&quot;lineNumbers&quot;:true,&quot;styleActiveLine&quot;:false,&quot;lineWrapping&quot;:false,&quot;readOnly&quot;:true,&quot;language&quot;:&quot;Python&quot;,&quot;modeName&quot;:&quot;python&quot;}\">import pandas as pd\n\ndata = {\n    &quot;Name&quot;: [&quot;John&quot;, &quot;Emma&quot;, &quot;Michael&quot;],\n    &quot;Age&quot;: [25, 30, 35],\n    &quot;Weight&quot;: [65.2, 68.5, 73.1]\n}\ndf = pd.DataFrame(data)\n\n# Display the original DataFrame\nprint('Original DataFrame:\\n', df)\n\n# Change the data type of the 'Weight' and 'Age' columns\ndf = df.astype({&quot;Age&quot;: 'float', &quot;Weight&quot;: 'int64'})\n\n# Display the new DataFrame\nprint('Updated DataFrame:\\n', df)<\/pre><\/div>\n\n\n\n<p><strong>Output:<\/strong><\/p>\n\n\n\n<pre class=\"wp-block-code has-black-color has-text-color has-background\" style=\"background-color:#fedcba\"><code>Original DataFrame:\n       Name  Age  Weight\n0     John   25    65.2\n1     Emma   30    68.5\n2  Michael   35    73.1\n\nUpdated DataFrame:\n       Name   Age  Weight\n0     John  25.0      65\n1     Emma  30.0      68\n2  Michael  35.0      73\n<\/code><\/pre>\n\n\n\n<h3 class=\"wp-block-heading\"><strong>Converting Columns to Categorical Type<\/strong><\/h3>\n\n\n\n<p>The astype() method can also convert a column to the categorical type. Categorical types are useful when dealing with variables that have a limited number of unique values or that represent categories or factors.<\/p>\n\n\n\n<p>Consider a scenario where we have a DataFrame with a column representing the gender of individuals. We want to convert the Gender column to a categorical type. Converting the column to the categorical data type allows more efficient storage, because each distinct value is stored once and the rows hold compact integer codes. 
Here is the code:<\/p>\n\n\n\n<div class=\"wp-block-codemirror-blocks-code-block code-block\"><pre class=\"CodeMirror\" data-setting=\"{&quot;mode&quot;:&quot;python&quot;,&quot;mime&quot;:&quot;text\/x-python&quot;,&quot;theme&quot;:&quot;material&quot;,&quot;lineNumbers&quot;:true,&quot;styleActiveLine&quot;:false,&quot;lineWrapping&quot;:false,&quot;readOnly&quot;:true,&quot;language&quot;:&quot;Python&quot;,&quot;modeName&quot;:&quot;python&quot;}\">import pandas as pd\n\ndata = {\n    &quot;Name&quot;: [&quot;John&quot;, &quot;Emma&quot;, &quot;Michael&quot;],\n    &quot;Gender&quot;: [&quot;Male&quot;, &quot;Female&quot;, &quot;Male&quot;],\n    &quot;Age&quot;: [25, 30, 35]\n}\ndf = pd.DataFrame(data)\n\n# Display the original DataFrame\nprint('Original DataFrame:\\n', df)\n\n# Convert the 'Gender' column to categorical data type\ndf[&quot;Gender&quot;] = df[&quot;Gender&quot;].astype('category')\n\n# Display the new DataFrame\nprint('Updated DataFrame:\\n', df)<\/pre><\/div>\n\n\n\n<p><strong>Output:<\/strong><\/p>\n\n\n\n<pre class=\"wp-block-code has-black-color has-text-color has-background\" style=\"background-color:#fedcba\"><code>Original DataFrame:\n       Name  Gender  Age\n0     John    Male   25\n1     Emma  Female   30\n2  Michael    Male   35\n\nUpdated DataFrame:\n       Name  Gender  Age\n0     John    Male   25\n1     Emma  Female   30\n2  Michael    Male   35<\/code><\/pre>\n\n\n\n<p>The printed DataFrame looks the same because only the data type of the Gender column has changed. Checking df.dtypes would show it as category.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\"><strong>Handling Missing Values<\/strong><\/h3>\n\n\n\n<p>When working with real-world datasets, it is common to encounter missing or NaN (Not a Number) values. To learn more about NaN values and how to handle them, refer to <a href=\"https:\/\/favtutor.com\/articles\/pandas-fillna-method\/\">the Pandas fillna() method<\/a>.<\/p>\n\n\n\n<p><strong>The astype() method does not handle missing values by itself, so they must be dealt with before changing the data type of a column. 
<\/strong>Consider a scenario where we have a DataFrame with a column representing the weight of individuals. However, this column contains some missing values (NaN).<\/p>\n\n\n\n<p>Casting a column that contains NaN to int64 raises an error, so we need to handle the missing values before changing the data type of the Weight column. We can accomplish this by dropping the rows containing any NaN values using the dropna() method.<\/p>\n\n\n\n<p>After handling the missing values, we can proceed with changing the data type of the Weight column using the astype() method. Check the example below:<\/p>\n\n\n\n<div class=\"wp-block-codemirror-blocks-code-block code-block\"><pre class=\"CodeMirror\" data-setting=\"{&quot;mode&quot;:&quot;python&quot;,&quot;mime&quot;:&quot;text\/x-python&quot;,&quot;theme&quot;:&quot;material&quot;,&quot;lineNumbers&quot;:true,&quot;styleActiveLine&quot;:false,&quot;lineWrapping&quot;:false,&quot;readOnly&quot;:true,&quot;language&quot;:&quot;Python&quot;,&quot;modeName&quot;:&quot;python&quot;}\">import pandas as pd\n\ndata = {\n    &quot;Name&quot;: [&quot;John&quot;, &quot;Emma&quot;, &quot;Michael&quot;],\n    &quot;Weight&quot;: [65.2, 68.5, None],\n    &quot;Age&quot;: [25, 30, 35]\n}\ndf = pd.DataFrame(data)\n\n# Display the original DataFrame\nprint('Original DataFrame:\\n', df)\n\n# Use dropna to drop rows that contain missing values\ndf.dropna(inplace=True)\n\n# Change the data type of the 'Weight' column\ndf[&quot;Weight&quot;] = df[&quot;Weight&quot;].astype('int64')\n\n# Display the new DataFrame. 
\nprint('Updated DataFrame:\\n',df)<\/pre><\/div>\n\n\n\n<p><strong>Output:<\/strong><\/p>\n\n\n\n<pre class=\"wp-block-code has-black-color has-text-color has-background\" style=\"background-color:#fedcba\"><code>Original DataFrame:\n       Name  Weight  Age\n0     John    65.2   25\n1     Emma    68.5   30\n2  Michael     NaN   35\n\nUpdated DataFrame:\n    Name  Weight  Age\n0  John      65   25\n1  Emma      68   30\n<\/code><\/pre>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>Conclusion<\/strong><\/h2>\n\n\n\n<p>In this article, we discussed the astype() method in Pandas. It is useful in data analysis for changing the data type of a single column or of several columns in a DataFrame, including conversion to the categorical type. By understanding its syntax and use cases, we can manipulate data types effectively for a range of data analysis tasks.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Learn everything about Pandas DataFrame astype() method and how to use it for different cases along with 
examples.<\/p>\n","protected":false},"author":10,"featured_media":713,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"jnews-multi-image_gallery":[],"jnews_single_post":null,"jnews_primary_category":{"id":"","hide":""},"footnotes":""},"categories":[35],"tags":[37,54],"class_list":["post-710","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-data-science","tag-pandas","tag-pandas-dataframe"],"_links":{"self":[{"href":"https:\/\/favtutor.com\/articles\/wp-json\/wp\/v2\/posts\/710","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/favtutor.com\/articles\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/favtutor.com\/articles\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/favtutor.com\/articles\/wp-json\/wp\/v2\/users\/10"}],"replies":[{"embeddable":true,"href":"https:\/\/favtutor.com\/articles\/wp-json\/wp\/v2\/comments?post=710"}],"version-history":[{"count":3,"href":"https:\/\/favtutor.com\/articles\/wp-json\/wp\/v2\/posts\/710\/revisions"}],"predecessor-version":[{"id":818,"href":"https:\/\/favtutor.com\/articles\/wp-json\/wp\/v2\/posts\/710\/revisions\/818"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/favtutor.com\/articles\/wp-json\/wp\/v2\/media\/713"}],"wp:attachment":[{"href":"https:\/\/favtutor.com\/articles\/wp-json\/wp\/v2\/media?parent=710"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/favtutor.com\/articles\/wp-json\/wp\/v2\/categories?post=710"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/favtutor.com\/articles\/wp-json\/wp\/v2\/tags?post=710"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}