{"id":873,"date":"2023-12-08T13:00:00","date_gmt":"2023-12-08T13:00:00","guid":{"rendered":"https:\/\/favtutor.com\/articles\/?p=873"},"modified":"2023-12-11T05:04:57","modified_gmt":"2023-12-11T05:04:57","slug":"pandas-dataframe-quantile","status":"publish","type":"post","link":"https:\/\/favtutor.com\/articles\/pandas-dataframe-quantile\/","title":{"rendered":"Pandas DataFrame quantile Function (with Examples)"},"content":{"rendered":"\n<p>Pandas is a powerful data manipulation library for Python that makes it easy to import and analyze data efficiently. One of its statistical tools is the quantile method, which summarizes how the values in a dataset are distributed. In this article, we will learn the quantile method for Pandas DataFrame and explore how to use it with different examples.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>What is the DataFrame.quantile() Function?<\/strong><\/h2>\n\n\n\n<p>In statistics, quantiles are cut points that divide a dataset into equal-sized parts. The quantile function in Pandas helps you find the value in your data set below which a given fraction of the data falls.<\/p>\n\n\n\n<p><strong>The DataFrame.quantile() function in Pandas returns the values at the specified quantile for each column or row in a DataFrame.<\/strong> Internally, it uses the numpy.percentile function to perform the calculations. Because quantiles divide a frequency distribution into equal groups, each containing the same fraction of the total population, they can provide valuable insights into the data distribution.<\/p>\n\n\n\n<p>In simple words, it splits our dataset at chosen points of its frequency distribution. Imagine you have a list of exam scores for a class. This function can help you find the score that separates, for example, the top 25% of students from the rest. 
This separating score is called a quantile.<\/p>\n\n\n\n<p>Here is the basic syntax for how to use it:<\/p>\n\n\n\n<pre class=\"wp-block-code has-black-color has-text-color has-background\" style=\"background-color:#fedcba\"><code>DataFrame.quantile(q=0.5, axis=0, numeric_only=False, interpolation='linear')<\/code><\/pre>\n\n\n\n<p>Let us now look at the breakdown of the syntax.<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>q<\/strong>: This parameter specifies the quantile(s) to compute. It can be a float or an array-like object with values between 0 and 1. The default value is 0.5, which corresponds to the 50% quantile (the median).<br><\/li>\n\n\n\n<li><strong>axis<\/strong>: Determines the axis along which the quantiles are computed. The value 0 or &#8216;index&#8217; computes one quantile per column (reducing along the rows), while 1 or &#8216;columns&#8217; computes one quantile per row. The default value is 0 (one result per column).<br><\/li>\n\n\n\n<li><strong>numeric_only<\/strong>: A boolean parameter that specifies whether only numeric data should be included in the computation. Pandas versions before 2.0 set it to True by default, while pandas 2.0 and later default to False. Set it to False to include datetime and <a href=\"https:\/\/favtutor.com\/blogs\/timedelta-python\" data-type=\"link\" data-id=\"https:\/\/favtutor.com\/blogs\/timedelta-python\">timedelta<\/a> data as well.<br><\/li>\n\n\n\n<li><strong>interpolation<\/strong>: This optional parameter determines the interpolation method to use when the desired quantile lies between two data points. Available options are &#8216;linear&#8217;, &#8216;lower&#8217;, &#8216;higher&#8217;, &#8216;midpoint&#8217;, and &#8216;nearest&#8217;. 
The default method is &#8216;linear&#8217;.<\/li>\n<\/ul>\n\n\n\n<p>Now, let us look at various ways we can use the quantile function.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\"><strong>Calculating a Single Quantile<\/strong><\/h3>\n\n\n\n<p>We can find a single quantile easily with this function. As we have learned, a quantile is the value that separates a given proportion of the data from the rest.<\/p>\n\n\n\n<p>Let us consider an example in Python. Suppose we want to find the 0.2 quantile of every column of a DataFrame:<\/p>\n\n\n\n<div class=\"wp-block-codemirror-blocks-code-block code-block\"><pre class=\"CodeMirror\" data-setting=\"{&quot;mode&quot;:&quot;python&quot;,&quot;mime&quot;:&quot;text\/x-python&quot;,&quot;theme&quot;:&quot;material&quot;,&quot;lineNumbers&quot;:true,&quot;styleActiveLine&quot;:false,&quot;lineWrapping&quot;:false,&quot;readOnly&quot;:true,&quot;language&quot;:&quot;Python&quot;,&quot;modeName&quot;:&quot;python&quot;}\">import pandas as pd\n\ndf = pd.DataFrame({'A': [1, 5, 3, 4, 2],\n                   'B': [3, 2, 4, 3, 4],\n                   'C': [2, 2, 7, 3, 4],\n                   'D': [4, 3, 6, 12, 7]})\n# Display the DataFrame\nprint('Original DataFrame:\\n', df)\n\n# Display the 0.2 quantile of each column\nprint('The 0.2 quantile of the data:\\n', df.quantile(0.2))<\/pre><\/div>\n\n\n\n<p><strong>Output:<\/strong><\/p>\n\n\n\n<pre class=\"wp-block-code has-black-color has-text-color has-background\" style=\"background-color:#fedcba\"><code>Original DataFrame:\n    A  B  C   D\n0  1  3  2   4\n1  5  2  2   3\n2  3  4  7   6\n3  4  3  3  12\n4  2  4  4   7\n\nThe 0.2 quantile of the data:\n A    1.8\nB    2.8\nC    2.0\nD    3.8\nName: 0.2, dtype: float64\n<\/code><\/pre>\n\n\n\n<h3 class=\"wp-block-heading\"><strong>Calculating Multiple Quantiles<\/strong><\/h3>\n\n\n\n<p>To calculate multiple quantiles, we can pass an array-like object as the q parameter. 
Let&#8217;s find the 0.1, 0.25, 0.5, and 0.75 quantiles along the index axis for the DataFrame with the following Python code:<\/p>\n\n\n\n<div class=\"wp-block-codemirror-blocks-code-block code-block\"><pre class=\"CodeMirror\" data-setting=\"{&quot;mode&quot;:&quot;python&quot;,&quot;mime&quot;:&quot;text\/x-python&quot;,&quot;theme&quot;:&quot;material&quot;,&quot;lineNumbers&quot;:true,&quot;styleActiveLine&quot;:false,&quot;lineWrapping&quot;:false,&quot;readOnly&quot;:true,&quot;language&quot;:&quot;Python&quot;,&quot;modeName&quot;:&quot;python&quot;}\">import pandas as pd\n\ndf = pd.DataFrame({'A': [1, 5, 3, 4, 2],\n                   'B': [3, 2, 4, 3, 4],\n                   'C': [2, 2, 7, 3, 4],\n                   'D': [4, 3, 6, 12, 7]})\n# Display the DataFrame\nprint('Original DataFrame:\\n', df)\n\n# Pass the array-like object to find multiple quantiles\nres = df.quantile([0.1, 0.25, 0.5, 0.75], axis=0)\n\n# Display the resulting quantiles\nprint('The resulting quantiles of the data:\\n', res)<\/pre><\/div>\n\n\n\n<p><strong>Output:<\/strong><\/p>\n\n\n\n<pre class=\"wp-block-code has-black-color has-text-color has-background\" style=\"background-color:#fedcba\"><code>Original DataFrame:\n    A  B  C   D\n0  1  3  2   4\n1  5  2  2   3\n2  3  4  7   6\n3  4  3  3  12\n4  2  4  4   7\n\nThe resulting quantiles of the data:\n         A    B    C    D\n0.10  1.4  2.4  2.0  3.4\n0.25  2.0  3.0  2.0  4.0\n0.50  3.0  3.0  3.0  6.0\n0.75  4.0  4.0  4.0  7.0<\/code><\/pre>\n\n\n\n<h3 class=\"wp-block-heading\"><strong>Including Non-Numeric Data<\/strong><\/h3>\n\n\n\n<p>In pandas versions before 2.0, the quantile() function considers only numeric data by default. 
However, you can include datetime and timedelta data by setting the numeric_only parameter to False.<\/p>\n\n\n\n<p>Let us consider an example:<\/p>\n\n\n\n<div class=\"wp-block-codemirror-blocks-code-block code-block\"><pre class=\"CodeMirror\" data-setting=\"{&quot;mode&quot;:&quot;python&quot;,&quot;mime&quot;:&quot;text\/x-python&quot;,&quot;theme&quot;:&quot;material&quot;,&quot;lineNumbers&quot;:true,&quot;styleActiveLine&quot;:false,&quot;lineWrapping&quot;:false,&quot;readOnly&quot;:true,&quot;language&quot;:&quot;Python&quot;,&quot;modeName&quot;:&quot;python&quot;}\">import pandas as pd\n\ndf = pd.DataFrame({'A': [1, 2],\n                   'B': [pd.Timestamp('2010'), pd.Timestamp('2011')],\n                   'C': [pd.Timedelta('1 days'), pd.Timedelta('2 days')]})\n# Display the DataFrame\nprint('Original DataFrame:\\n', df)\n\n# Using numeric_only=False to include datetime and timedelta objects\nres = df.quantile(0.5, numeric_only=False)\n\n# Display the 0.5 quantile\nprint('The resulting quantile of the data:\\n', res)<\/pre><\/div>\n\n\n\n<p><strong>Output:<\/strong><\/p>\n\n\n\n<pre class=\"wp-block-code has-black-color has-text-color has-background\" style=\"background-color:#fedcba\"><code>Original DataFrame:\n    A          B      C\n0  1 2010-01-01 1 days\n1  2 2011-01-01 2 days\n\nThe resulting quantile of the data:\n A                    1.5\nB    2010-07-02 12:00:00\nC        1 days 12:00:00\nName: 0.5, dtype: object<\/code><\/pre>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>Conclusion<\/strong><\/h2>\n\n\n\n<p>In this article, we learned how to use the .quantile() function of a Pandas DataFrame in Python. We explored how to find single or multiple quantiles and how to include datetime and timedelta columns, as this function provides a flexible and efficient solution. 
For more assistance, we can <a href=\"https:\/\/favtutor.com\/python-assignment-help\" data-type=\"link\" data-id=\"https:\/\/favtutor.com\/python-assignment-help\">help with your Python homework<\/a> as well.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Learn how to use the quantile function in Python to find single or multiple quantiles in Pandas DataFrame.<\/p>\n","protected":false},"author":10,"featured_media":875,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"jnews-multi-image_gallery":[],"jnews_single_post":null,"jnews_primary_category":{"id":"","hide":""},"footnotes":""},"categories":[35],"tags":[37,54],"class_list":["post-873","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-data-science","tag-pandas","tag-pandas-dataframe"],"_links":{"self":[{"href":"https:\/\/favtutor.com\/articles\/wp-json\/wp\/v2\/posts\/873","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/favtutor.com\/articles\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/favtutor.com\/articles\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/favtutor.com\/articles\/wp-json\/wp\/v2\/users\/10"}],"replies":[{"embeddable":true,"href":"https:\/\/favtutor.com\/articles\/wp-json\/wp\/v2\/comments?post=873"}],"version-history":[{"count":3,"href":"https:\/\/favtutor.com\/articles\/wp-json\/wp\/v2\/posts\/873\/revisions"}],"predecessor-version":[{"id":916,"href":"https:\/\/favtutor.com\/articles\/wp-json\/wp\/v2\/posts\/873\/revisions\/916"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/favtutor.com\/articles\/wp-json\/wp\/v2\/media\/875"}],"wp:attachment":[{"href":"https:\/\/favtutor.com\/articles\/wp-json\/wp\/v2\/media?parent=873"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/favtutor.com\/articles\/wp-json\/wp\/v2\/categories?post=873"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/favtu
tor.com\/articles\/wp-json\/wp\/v2\/tags?post=873"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}