{"id":716,"date":"2023-12-03T17:03:37","date_gmt":"2023-12-03T17:03:37","guid":{"rendered":"https:\/\/favtutor.com\/articles\/?p=716"},"modified":"2023-12-05T05:28:14","modified_gmt":"2023-12-05T05:28:14","slug":"concatenate-dataframes-pandas","status":"publish","type":"post","link":"https:\/\/favtutor.com\/articles\/concatenate-dataframes-pandas\/","title":{"rendered":"How to Concatenate Pandas DataFrames? (with code)"},"content":{"rendered":"\n<p>Pandas is a Python library for data analysis built around the DataFrame. We often need to combine two DataFrames, either vertically (along rows) or horizontally (along columns), depending on our data analysis needs. This article explains how to concatenate two or more DataFrames using the concat() function in pandas.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>Concatenation of Two or More DataFrames in Pandas<\/strong><\/h2>\n\n\n\n<p>Concatenation simply means joining things end to end. <strong>In the context of pandas, concatenation refers to combining two or more DataFrames along either the row axis or the column axis. <\/strong>It lets us combine datasets with similar or different structures into a single DataFrame that can be analyzed and manipulated as one.<\/p>\n\n\n\n<p>Let us now explore the techniques we can use to concatenate two or more DataFrames.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\"><strong>Concatenate Along Rows<\/strong><\/h3>\n\n\n\n<p><strong>One way to concatenate DataFrames is to stack them vertically along the row axis. <\/strong>We can do this by using the <strong>pd.concat()<\/strong> function in pandas. 
We set the axis to 0.<\/p>\n\n\n\n<p>Let&#8217;s see how to concatenate along rows with an example:<\/p>\n\n\n\n<div class=\"wp-block-codemirror-blocks-code-block code-block\"><pre class=\"CodeMirror\" data-setting=\"{&quot;mode&quot;:&quot;python&quot;,&quot;mime&quot;:&quot;text\/x-python&quot;,&quot;theme&quot;:&quot;material&quot;,&quot;lineNumbers&quot;:true,&quot;styleActiveLine&quot;:false,&quot;lineWrapping&quot;:false,&quot;readOnly&quot;:true,&quot;language&quot;:&quot;Python&quot;,&quot;modeName&quot;:&quot;python&quot;}\">import pandas as pd\nimport numpy as np\n\n# Create the DataFrames\ndf1 = pd.DataFrame(np.random.randint(25, size=(4, 4)), index=[&quot;1&quot;, &quot;2&quot;, &quot;3&quot;, &quot;4&quot;], columns=[&quot;A&quot;, &quot;B&quot;, &quot;C&quot;, &quot;D&quot;])\ndf2 = pd.DataFrame(np.random.randint(25, size=(6, 4)), index=[&quot;5&quot;, &quot;6&quot;, &quot;7&quot;, &quot;8&quot;, &quot;9&quot;, &quot;10&quot;], columns=[&quot;A&quot;, &quot;B&quot;, &quot;C&quot;, &quot;D&quot;])\n\n# Display the original DataFrames\nprint('DataFrame 1:\\n', df1)\nprint('DataFrame 2:\\n', df2)\n\n# Concat along rows\ndf = pd.concat([df1, df2], axis=0)\n\n# Display the concated DataFrame\nprint('Concated DataFrame:\\n', df)<\/pre><\/div>\n\n\n\n<p><strong>Output:<\/strong><\/p>\n\n\n\n<pre class=\"wp-block-code has-black-color has-text-color has-background\" style=\"background-color:#fedcba\"><code>DataFrame 1:\n     A   B   C   D\n1  13  21  22  24\n2   0   8  16   8\n3  21  16   1  21\n4  10  14  19  17\n\nDataFrame 2:\n      A   B   C   D\n5   24  17  10   0\n6    0   8   5   4\n7   16  17  22   6\n8   18  21   5  10\n9    2  23   4  16\n10  15   7   0   2\n\nConcated DataFrame:\n      A   B   C   D\n1   13  21  22  24\n2    0   8  16   8\n3   21  16   1  21\n4   10  14  19  17\n5   24  17  10   0\n6    0   8   5   4\n7   16  17  22   6\n8   18  21   5  10\n9    2  23   4  16\n10  15   7   0   2\n<\/code><\/pre>\n\n\n\n<h3 
class=\"wp-block-heading\"><strong>Concatenate Along Columns<\/strong><\/h3>\n\n\n\n<p>We can also concatenate DataFrames horizontally, along the column axis, by setting the axis to 1. This is most useful when the DataFrames have different columns but the same index values. pandas aligns the rows on the index, so any index label that is missing from one DataFrame is filled with NaN in that DataFrame's columns. This is what happens in the example below, where the two DataFrames share no index labels, and it is also why the integer values are shown as floats in the result. To learn how to handle the missing values, refer to <a href=\"https:\/\/favtutor.com\/articles\/pandas-fillna-method\/\"><strong>pandas-fillna<\/strong><\/a>.<\/p>\n\n\n\n<p>Here is the Python code to do it:<\/p>\n\n\n\n<div class=\"wp-block-codemirror-blocks-code-block code-block\"><pre class=\"CodeMirror\" data-setting=\"{&quot;mode&quot;:&quot;python&quot;,&quot;mime&quot;:&quot;text\/x-python&quot;,&quot;theme&quot;:&quot;material&quot;,&quot;lineNumbers&quot;:true,&quot;styleActiveLine&quot;:false,&quot;lineWrapping&quot;:false,&quot;readOnly&quot;:true,&quot;language&quot;:&quot;Python&quot;,&quot;modeName&quot;:&quot;python&quot;}\">import pandas as pd\nimport numpy as np\n\n# Create the DataFrames\ndf1 = pd.DataFrame(np.random.randint(25, size=(4, 4)), index=[&quot;1&quot;, &quot;2&quot;, &quot;3&quot;, &quot;4&quot;], columns=[&quot;A&quot;, &quot;B&quot;, &quot;C&quot;, &quot;D&quot;])\ndf2 = pd.DataFrame(np.random.randint(25, size=(6, 4)), index=[&quot;5&quot;, &quot;6&quot;, &quot;7&quot;, &quot;8&quot;, &quot;9&quot;, &quot;10&quot;], columns=[&quot;A&quot;, &quot;B&quot;, &quot;C&quot;, &quot;D&quot;])\n\n# Display the original DataFrames\nprint('DataFrame 1:\\n', df1)\nprint('DataFrame 2:\\n', df2)\n\n# Concat along columns\ndf = pd.concat([df1, df2], axis=1)\n\n# Display the concated DataFrame\nprint('Concated DataFrame:\\n', df)<\/pre><\/div>\n\n\n\n<p><strong>Output:<\/strong><\/p>\n\n\n\n<pre class=\"wp-block-code has-black-color has-text-color has-background\" style=\"background-color:#fedcba\"><code>DataFrame 1:\n     A   B   C 
  D\n1   5  15   1  22\n2   7  10   0   2\n3  11   4   8  13\n4  15  18  18   4\n\nDataFrame 2:\n      A   B   C   D\n5   15   1  10  19\n6    2  12  16  10\n7   15  20   8   2\n8   17  13   1  10\n9    7   5   9  16\n10   2  20  13   9\n\nConcated DataFrame:\n        A     B     C     D     A     B     C     D\n1    5.0  15.0   1.0  22.0   NaN   NaN   NaN   NaN\n2    7.0  10.0   0.0   2.0   NaN   NaN   NaN   NaN\n3   11.0   4.0   8.0  13.0   NaN   NaN   NaN   NaN\n4   15.0  18.0  18.0   4.0   NaN   NaN   NaN   NaN\n5    NaN   NaN   NaN   NaN  15.0   1.0  10.0  19.0\n6    NaN   NaN   NaN   NaN   2.0  12.0  16.0  10.0\n7    NaN   NaN   NaN   NaN  15.0  20.0   8.0   2.0\n8    NaN   NaN   NaN   NaN  17.0  13.0   1.0  10.0\n9    NaN   NaN   NaN   NaN   7.0   5.0   9.0  16.0\n10   NaN   NaN   NaN   NaN   2.0  20.0  13.0   9.0\n<\/code><\/pre>\n\n\n\n<h3 class=\"wp-block-heading\"><strong>Concatenating with Append<\/strong><\/h3>\n\n\n\n<p>In addition to the pd.concat() function, pandas provide an easy shortcut for concatenating DataFrames using the append() method. 
The append() method can be used to append one or more DataFrames to another DataFrame. Note, however, that DataFrame.append() was deprecated in pandas 1.4 and removed in pandas 2.0, so on current versions it raises an AttributeError and pd.concat() should be used instead.<\/p>\n\n\n\n<p>Let us see an example:<\/p>\n\n\n\n<div class=\"wp-block-codemirror-blocks-code-block code-block\"><pre class=\"CodeMirror\" data-setting=\"{&quot;mode&quot;:&quot;python&quot;,&quot;mime&quot;:&quot;text\/x-python&quot;,&quot;theme&quot;:&quot;material&quot;,&quot;lineNumbers&quot;:true,&quot;styleActiveLine&quot;:false,&quot;lineWrapping&quot;:false,&quot;readOnly&quot;:true,&quot;language&quot;:&quot;Python&quot;,&quot;modeName&quot;:&quot;python&quot;}\">import pandas as pd\nimport numpy as np\n\n# Create the DataFrames\ndf1 = pd.DataFrame(np.random.randint(25, size=(4, 4)), index=[&quot;1&quot;, &quot;2&quot;, &quot;3&quot;, &quot;4&quot;], columns=[&quot;A&quot;, &quot;B&quot;, &quot;C&quot;, &quot;D&quot;])\ndf2 = pd.DataFrame(np.random.randint(25, size=(6, 4)), index=[&quot;5&quot;, &quot;6&quot;, &quot;7&quot;, &quot;8&quot;, &quot;9&quot;, &quot;10&quot;], columns=[&quot;A&quot;, &quot;B&quot;, &quot;C&quot;, &quot;D&quot;])\n\n# Display the original DataFrames\nprint('DataFrame 1:\\n', df1)\nprint('DataFrame 2:\\n', df2)\n\n# Concat using append() (works only on pandas versions before 2.0)\n# df = df1.append(df2)\n\n# Equivalent call on current pandas versions\ndf = pd.concat([df1, df2])\n\n# Display the concated DataFrame\nprint('Concated DataFrame:\\n', df)<\/pre><\/div>\n\n\n\n<p><strong>Output:<\/strong><\/p>\n\n\n\n<pre class=\"wp-block-code has-black-color has-text-color has-background\" style=\"background-color:#fedcba\"><code>DataFrame 1:\n     A   B   C   D\n1  12  10  12   3\n2   3  12   9  19\n3  14  13  17  20\n4  19  20   2  14\n\nDataFrame 2:\n      A   B   C   D\n5   16   3  21  22\n6   19  12  21  23\n7    7  14  24  23\n8    0  11  16  23\n9    9  23   2   8\n10  21  10  21  18\n\nConcated DataFrame:\n      A   B   C   D\n1   12  10  12   3\n2    3  12   9  19\n3   14  13  17  20\n4   19  20   2  14\n5   16   3  21  22\n6   19  12  21  23\n7    7  14  24  23\n8    0  11  16  23\n9    9  23   2   8\n10  21  10  21  18\n<\/code><\/pre>\n\n\n\n<h3 
class=\"wp-block-heading\"><strong>Using Various Types of Joins<\/strong><\/h3>\n\n\n\n<p>Another way to combine DataFrames is with a join, which pandas provides through the pd.merge() function. Several types of joins are available.<strong> The type of join determines how the rows from the original DataFrames will be combined in the resulting DataFrame.<\/strong><\/p>\n\n\n\n<p>The common types of joins are:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Inner Join<\/strong>: The resulting DataFrame will only contain rows where the key exists in both DataFrames being joined. It acts like an intersection of the two DataFrames.<\/li>\n<\/ul>\n\n\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter size-large\"><img decoding=\"async\" width=\"1024\" height=\"320\" src=\"https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2023\/12\/4-1024x320.png\" alt=\"Inner join\" class=\"wp-image-719\" srcset=\"https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2023\/12\/4-1024x320.png 1024w, https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2023\/12\/4-300x94.png 300w, https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2023\/12\/4-768x240.png 768w, https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2023\/12\/4-750x234.png 750w, https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2023\/12\/4-1140x356.png 1140w, https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2023\/12\/4.png 1280w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n<\/div>\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Outer Join<\/strong>: The resulting DataFrame will contain all rows from both DataFrames. It acts like a union of the two DataFrames. 
The missing values will be replaced by NaN.<\/li>\n<\/ul>\n\n\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter size-large\"><img decoding=\"async\" width=\"1024\" height=\"320\" src=\"https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2023\/12\/3-1024x320.png\" alt=\"Outer Join\n\" class=\"wp-image-720\" srcset=\"https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2023\/12\/3-1024x320.png 1024w, https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2023\/12\/3-300x94.png 300w, https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2023\/12\/3-768x240.png 768w, https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2023\/12\/3-750x234.png 750w, https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2023\/12\/3-1140x356.png 1140w, https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2023\/12\/3.png 1280w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n<\/div>\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Left Join<\/strong>: The resulting DataFrame will contain all rows from the left DataFrame and the matched rows from the right DataFrame. 
Again, the missing values will be replaced by NaN.<\/li>\n<\/ul>\n\n\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter size-large\"><img decoding=\"async\" width=\"1024\" height=\"320\" src=\"https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2023\/12\/1-1024x320.png\" alt=\"Left Join\" class=\"wp-image-722\" srcset=\"https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2023\/12\/1-1024x320.png 1024w, https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2023\/12\/1-300x94.png 300w, https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2023\/12\/1-768x240.png 768w, https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2023\/12\/1-750x234.png 750w, https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2023\/12\/1-1140x356.png 1140w, https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2023\/12\/1.png 1280w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n<\/div>\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Right Join<\/strong>: The resulting DataFrame will contain all rows from the right DataFrame and the matched rows from the left DataFrame. 
The missing values will be replaced by NaN.<\/li>\n<\/ul>\n\n\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter size-large\"><img decoding=\"async\" width=\"1024\" height=\"320\" src=\"https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2023\/12\/2-1024x320.png\" alt=\"Right Join\" class=\"wp-image-723\" srcset=\"https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2023\/12\/2-1024x320.png 1024w, https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2023\/12\/2-300x94.png 300w, https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2023\/12\/2-768x240.png 768w, https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2023\/12\/2-750x234.png 750w, https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2023\/12\/2-1140x356.png 1140w, https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2023\/12\/2.png 1280w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n<\/div>\n\n\n<p>Let us see how to implement it in Python:<\/p>\n\n\n\n<div class=\"wp-block-codemirror-blocks-code-block code-block\"><pre class=\"CodeMirror\" data-setting=\"{&quot;mode&quot;:&quot;python&quot;,&quot;mime&quot;:&quot;text\/x-python&quot;,&quot;theme&quot;:&quot;material&quot;,&quot;lineNumbers&quot;:true,&quot;styleActiveLine&quot;:false,&quot;lineWrapping&quot;:false,&quot;readOnly&quot;:true,&quot;language&quot;:&quot;Python&quot;,&quot;modeName&quot;:&quot;python&quot;}\">import pandas as pd\nimport numpy as np\n\n# Create the DataFrames\ndf1 = pd.DataFrame(np.random.randint(25, size=(4, 4)), index=[&quot;1&quot;, &quot;2&quot;, &quot;3&quot;, &quot;4&quot;], columns=[&quot;A&quot;, &quot;B&quot;, &quot;C&quot;, &quot;D&quot;])\ndf2 = pd.DataFrame(np.random.randint(25, size=(6, 4)), index=[&quot;5&quot;, &quot;6&quot;, &quot;7&quot;, &quot;8&quot;, &quot;9&quot;, &quot;10&quot;], columns=[&quot;A&quot;, &quot;B&quot;, &quot;C&quot;, &quot;D&quot;])\n\n# Display the original DataFrames\nprint('DataFrame 1:\\n', df1)\nprint('DataFrame 2:\\n', df2)\n\n# Concat 
using outer join on 'B'\ndf = pd.merge(df1, df2, on='B', how='outer')\n\n# Display the concated DataFrame\nprint('Concated DataFrame:\\n', df)<\/pre><\/div>\n\n\n\n<p><strong>Output:<\/strong><\/p>\n\n\n\n<pre class=\"wp-block-code has-black-color has-text-color has-background\" style=\"background-color:#fedcba\"><code>DataFrame 1:\n    A   B   C   D\n1  7  13  21   0\n2  0   6  19   5\n3  4   1   2  14\n4  9  20  16   1\n\nDataFrame 2:\n      A   B   C   D\n5   10  23   0   3\n6    4  11  10  24\n7    2  13  13  23\n8    7  12   9  10\n9   19  23   4  15\n10   0  21   9   2\n\nConcated DataFrame:\n    A_x   B   C_x   D_x   A_y   C_y   D_y\n0  7.0  13  21.0   0.0   2.0  13.0  23.0\n1  0.0   6  19.0   5.0   NaN   NaN   NaN\n2  4.0   1   2.0  14.0   NaN   NaN   NaN\n3  9.0  20  16.0   1.0   NaN   NaN   NaN\n4  NaN  23   NaN   NaN  10.0   0.0   3.0\n5  NaN  23   NaN   NaN  19.0   4.0  15.0\n6  NaN  11   NaN   NaN   4.0  10.0  24.0\n7  NaN  12   NaN   NaN   7.0   9.0  10.0\n8  NaN  21   NaN   NaN   0.0   9.0   2.0\n<\/code><\/pre>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>Conclusion<\/strong><\/h2>\n\n\n\n<p>In this article, we have discovered various methods to concat or join two or more Pandas DataFrames. Remember to experiment with different concatenation techniques, so that you will have the flexibility and power to merge and analyze datasets with ease. 
You can now move on to learn how to <a href=\"https:\/\/favtutor.com\/blogs\/pandas-add-column\" data-type=\"link\" data-id=\"https:\/\/favtutor.com\/blogs\/pandas-add-column\">iterate over rows in Pandas<\/a>, which is also important for beginners.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Learn how to concatenate two or more dataframes in pandas along rows and columns with the code to implement it.<\/p>\n","protected":false},"author":10,"featured_media":717,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"jnews-multi-image_gallery":[],"jnews_single_post":null,"jnews_primary_category":{"id":"","hide":""},"footnotes":""},"categories":[35],"tags":[37,54],"class_list":["post-716","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-data-science","tag-pandas","tag-pandas-dataframe"],"_links":{"self":[{"href":"https:\/\/favtutor.com\/articles\/wp-json\/wp\/v2\/posts\/716","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/favtutor.com\/articles\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/favtutor.com\/articles\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/favtutor.com\/articles\/wp-json\/wp\/v2\/users\/10"}],"replies":[{"embeddable":true,"href":"https:\/\/favtutor.com\/articles\/wp-json\/wp\/v2\/comments?post=716"}],"version-history":[{"count":2,"href":"https:\/\/favtutor.com\/articles\/wp-json\/wp\/v2\/posts\/716\/revisions"}],"predecessor-version":[{"id":822,"href":"https:\/\/favtutor.com\/articles\/wp-json\/wp\/v2\/posts\/716\/revisions\/822"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/favtutor.com\/articles\/wp-json\/wp\/v2\/media\/717"}],"wp:attachment":[{"href":"https:\/\/favtutor.com\/articles\/wp-json\/wp\/v2\/media?parent=716"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/favtutor.com\/articles\/wp-json\/wp\/v2\/categories?post=716"},{"taxonomy
":"post_tag","embeddable":true,"href":"https:\/\/favtutor.com\/articles\/wp-json\/wp\/v2\/tags?post=716"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}