Pandas is one of the most powerful and widely used Python libraries for importing, manipulating, and analyzing data. It provides many tools and utilities for efficient data manipulation and analysis. In this article, we will explore the various ways of using iloc in Pandas.
What is the Pandas iloc() method?
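iloc in Pandas is used for selecting data by integer position. Although it is often loosely called the iloc() method, it is an indexer that is used with square brackets, not parentheses. It is especially handy when our DataFrame doesn't have simple numeric labels or when we don't know what those labels are.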
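To make the difference between positions and labels concrete, here is a minimal sketch. The small DataFrame and its string row labels are made up for illustration and are not part of the examples used later in this article.

```python
import pandas as pd

# Hypothetical frame whose row labels are strings rather than 0, 1, 2, ...
df = pd.DataFrame(
    {"Age": [25, 30, 35]},
    index=["john", "jane", "alice"],
)

# Position-based: the second row, whatever its label happens to be.
by_position = df.iloc[1]

# Label-based equivalent, which requires knowing the label.
by_label = df.loc["jane"]

print(by_position.equals(by_label))  # True
```

Both lines return the same row, but only the iloc version works without knowing that the label is "jane".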
So, here's how it works: when we write dataframe.iloc[row_index, column_index], we ask Pandas for the data at a specific position. Positions are zero-based, so 0 refers to the first row or column. The row_index says which row or rows we want, and the optional column_index says which column or columns we want.
Depending on what we ask for, Pandas returns a DataFrame (several rows and columns), a Series (a single row or a single column), or a single scalar value (one row and one column).
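The following sketch shows how the shape of the request determines the type of the result. It uses the same sample data as the rest of this article; the variable names are chosen only for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["John", "Jane", "Alice", "Bob"],
    "Age": [25, 30, 35, 40],
    "City": ["New York", "London", "Paris", "Tokyo"],
})

one_row = df.iloc[1]        # single integer -> Series
one_column = df.iloc[:, 1]  # all rows, one column -> Series
block = df.iloc[0:2, 0:2]   # slices on both axes -> DataFrame
one_cell = df.iloc[1, 1]    # two integers -> scalar value

print(type(one_row).__name__)     # Series
print(type(one_column).__name__)  # Series
print(type(block).__name__)       # DataFrame
print(one_cell)                   # 30
```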
Let us now dive into the various ways we can use iloc.
1) Extracting a Single Row
To extract a single row from a DataFrame using iloc(), we can simply provide the index position of the desired row.
For example, consider the following DataFrame:
import pandas as pd

data = {'Name': ['John', 'Jane', 'Alice', 'Bob'],
        'Age': [25, 30, 35, 40],
        'City': ['New York', 'London', 'Paris', 'Tokyo']}
df = pd.DataFrame(data)

# Display the original DataFrame
print('Original DataFrame:\n', df)
Output:
Original DataFrame:
     Name  Age      City
0   John   25  New York
1   Jane   30    London
2  Alice   35     Paris
3    Bob   40     Tokyo
Here, suppose we want to get the second row of this DataFrame. Because positions start at 0, the second row is at position 1, so we can get it as follows:
rows = df.iloc[1]
print(rows)
Output:
Name      Jane
Age         30
City    London
Name: 1, dtype: object
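Two variations on single-row selection are often useful. This is a small sketch using the same sample data; the variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["John", "Jane", "Alice", "Bob"],
    "Age": [25, 30, 35, 40],
    "City": ["New York", "London", "Paris", "Tokyo"],
})

# Negative positions count from the end, as with Python lists.
last_row = df.iloc[-1]

# A one-element list returns a one-row DataFrame instead of a Series.
row_as_frame = df.iloc[[1]]

print(last_row["Name"])    # Bob
print(row_as_frame.shape)  # (1, 3)
```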
2) Extracting Multiple Rows
We can also extract multiple rows using the iloc() method. We can use a list of integers or a slice object.
Let us first try using a list of integers to extract multiple rows. For example, consider the following DataFrame:
import pandas as pd

data = {'Name': ['John', 'Jane', 'Alice', 'Bob'],
        'Age': [25, 30, 35, 40],
        'City': ['New York', 'London', 'Paris', 'Tokyo']}
df = pd.DataFrame(data)

# Display the original DataFrame
print('Original DataFrame:\n', df)
Output:
Original DataFrame:
     Name  Age      City
0   John   25  New York
1   Jane   30    London
2  Alice   35     Paris
3    Bob   40     Tokyo
To extract multiple rows using a list of integers, we can pass the list as the index position parameter to iloc(). For example, if we want to extract the second and fourth rows of the DataFrame, we can do the following:
rows = df.iloc[[1, 3]]
# Display the extracted rows.
print(rows)
Output:
   Name  Age    City
1  Jane   30  London
3   Bob   40   Tokyo
Now, let us try using a slice. Again, consider the same example as above:
import pandas as pd

data = {'Name': ['John', 'Jane', 'Alice', 'Bob'],
        'Age': [25, 30, 35, 40],
        'City': ['New York', 'London', 'Paris', 'Tokyo']}
df = pd.DataFrame(data)

# Display the original DataFrame
print('Original DataFrame:\n', df)
Output:
Original DataFrame:
     Name  Age      City
0   John   25  New York
1   Jane   30    London
2  Alice   35     Paris
3    Bob   40     Tokyo
We can use a slice to extract a range of rows from the DataFrame. The slice is written in start:end notation, where start is included and end is excluded, just as with Python lists. For example, if we want to extract the second and third rows of the DataFrame (positions 1 and 2), we can do the following:
rows = df.iloc[1:3]
# Display the extracted rows.
print(rows)
Output:
    Name  Age    City
1   Jane   30  London
2  Alice   35   Paris
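Slices in iloc follow the usual Python slicing rules, so open-ended slices, steps, and negative positions also work. The sketch below uses the same sample data; the variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["John", "Jane", "Alice", "Bob"],
    "Age": [25, 30, 35, 40],
    "City": ["New York", "London", "Paris", "Tokyo"],
})

first_two = df.iloc[:2]     # positions 0 and 1
every_other = df.iloc[::2]  # positions 0 and 2
last_two = df.iloc[-2:]     # positions 2 and 3
clipped = df.iloc[2:10]     # an out-of-range slice end is clipped

print(list(every_other["Name"]))  # ['John', 'Alice']
print(len(clipped))               # 2
```

Note that only slices are forgiving about out-of-range positions. A single out-of-range integer such as df.iloc[10] raises an IndexError.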
3) Boolean Array
There is another method to extract multiple rows from a DataFrame. We can also use a boolean array.
If we have a boolean array of the same length as the index, we can use it to extract rows that correspond to True values in the array. For example, suppose we have the following boolean array:
mask = [True, False, True, False]
We can use this boolean array to extract the rows from the DataFrame as follows:
import pandas as pd

data = {'Name': ['John', 'Jane', 'Alice', 'Bob'],
        'Age': [25, 30, 35, 40],
        'City': ['New York', 'London', 'Paris', 'Tokyo']}
df = pd.DataFrame(data)

# Display the original DataFrame
print('Original DataFrame:\n', df)

# Make the boolean array
mask = [True, False, True, False]
rows = df.iloc[mask]

# Display the extracted rows.
print('Extracted Rows:\n', rows)
Output:
Original DataFrame:
     Name  Age      City
0   John   25  New York
1   Jane   30    London
2  Alice   35     Paris
3    Bob   40     Tokyo
Extracted Rows:
     Name  Age      City
0   John   25  New York
2  Alice   35     Paris
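In practice, the boolean array is usually computed from the data rather than typed by hand. The sketch below builds the mask from a condition on the Age column; the threshold of 28 is an arbitrary value chosen for illustration. A comparison on a column yields a boolean Series, which iloc generally does not accept as a mask, so the sketch converts it to a plain NumPy array first.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["John", "Jane", "Alice", "Bob"],
    "Age": [25, 30, 35, 40],
    "City": ["New York", "London", "Paris", "Tokyo"],
})

# Convert the boolean Series to a NumPy array before handing it to iloc.
mask = (df["Age"] > 28).to_numpy()
older = df.iloc[mask]

print(list(older["Name"]))  # ['Jane', 'Alice', 'Bob']
```

When the mask is a boolean Series, df.loc[df["Age"] > 28] is the more common choice, because loc accepts the Series directly.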
4) Lambda Function
We can also use a callable function in combination with iloc() to extract rows based on a custom condition. The callable function should take a Series or DataFrame as input and return a valid output for indexing.
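For example, suppose we want to extract rows with even index labels. We can define a lambda function that checks if the index label is even and use it with iloc() as follows. The lambda returns a boolean array, which iloc() then applies by position. With the default index 0, 1, 2, ..., labels and positions coincide: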
import pandas as pd

data = {'Name': ['John', 'Jane', 'Alice', 'Bob'],
        'Age': [25, 30, 35, 40],
        'City': ['New York', 'London', 'Paris', 'Tokyo']}
df = pd.DataFrame(data)

# Display the original DataFrame
print('Original DataFrame:\n', df)

# Using the lambda function
rows = df.iloc[lambda x: x.index % 2 == 0]

# Display the extracted rows.
print('Extracted Rows:\n', rows)
Output:
Original DataFrame:
     Name  Age      City
0   John   25  New York
1   Jane   30    London
2  Alice   35     Paris
3    Bob   40     Tokyo
Extracted Rows:
     Name  Age      City
0   John   25  New York
2  Alice   35     Paris
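When the index is not the default 0, 1, 2, ..., a test on the labels and a test on the positions give different results. The sketch below uses a made-up index (10, 11, 13, 16) to show the difference, and uses np.arange to build a mask from positions instead of labels.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with non-default integer labels.
df = pd.DataFrame(
    {"Name": ["John", "Jane", "Alice", "Bob"],
     "Age": [25, 30, 35, 40]},
    index=[10, 11, 13, 16],
)

# Mask built from the labels: keeps rows whose label is even.
by_label = df.iloc[lambda x: x.index % 2 == 0]

# Mask built from the positions: keeps every other row.
by_position = df.iloc[lambda x: np.arange(len(x)) % 2 == 0]

print(list(by_label.index))     # [10, 16]
print(list(by_position.index))  # [10, 13]
```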
5) Extracting both Rows and Columns
So far, we have focused only on extracting rows using iloc(). However, iloc() can also select rows and columns at the same time, allowing us to get smaller subsets of the DataFrame.
If we want to extract a specific row and a specific column from a DataFrame, we can provide the row and column index positions as single integers.
For example, if we want to get the value in the second row and first column of the DataFrame, we can do the following:
import pandas as pd

data = {'Name': ['John', 'Jane', 'Alice', 'Bob'],
        'Age': [25, 30, 35, 40],
        'City': ['New York', 'London', 'Paris', 'Tokyo']}
df = pd.DataFrame(data)

# Display the original DataFrame
print('Original DataFrame:\n', df)

# Extracting the value
value = df.iloc[1, 0]

# Display the extracted value.
print('Extracted value:\n', value)
Output:
Original DataFrame:
     Name  Age      City
0   John   25  New York
1   Jane   30    London
2  Alice   35     Paris
3    Bob   40     Tokyo
Extracted value:
 Jane
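Single-cell access also works for assignment, and Pandas offers the iat accessor as a position-based shortcut for reading or writing exactly one value. The sketch below uses the same sample data; the new age of 31 is an arbitrary value chosen for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["John", "Jane", "Alice", "Bob"],
    "Age": [25, 30, 35, 40],
    "City": ["New York", "London", "Paris", "Tokyo"],
})

# iat reads a single value by row and column position.
name = df.iat[1, 0]

# iloc can also be used on the left-hand side to update a cell.
df.iloc[1, 1] = 31

print(name)            # Jane
print(df.iloc[1, 1])   # 31
```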
6) Using Lists of Integers
To extract specific rows and columns, we can pass two lists of integers to iloc(): the first with the row positions and the second with the column positions.
For example, if we want to extract the values in the third and fourth rows, and the second and third columns of the DataFrame, we can do the following:
import pandas as pd

data = {'Name': ['John', 'Jane', 'Alice', 'Bob'],
        'Age': [25, 30, 35, 40],
        'City': ['New York', 'London', 'Paris', 'Tokyo']}
df = pd.DataFrame(data)

# Display the original DataFrame
print('Original DataFrame:\n', df)

# Extracting the values as the subset of the df
subset = df.iloc[[2, 3], [1, 2]]

# Display the extracted values.
print('Extracted subset:\n', subset)
Output:
Original DataFrame:
     Name  Age      City
0   John   25  New York
1   Jane   30    London
2  Alice   35     Paris
3    Bob   40     Tokyo
Extracted subset:
    Age   City
2   35  Paris
3   40  Tokyo
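If we know the column names but want to keep selecting rows by position, one option is to look up the column positions first. The sketch below uses Index.get_indexer for that lookup; it is one possible approach, not the only one.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["John", "Jane", "Alice", "Bob"],
    "Age": [25, 30, 35, 40],
    "City": ["New York", "London", "Paris", "Tokyo"],
})

# Translate column names into integer positions.
col_positions = df.columns.get_indexer(["Age", "City"])

# Rows by position, columns by the positions found above.
subset = df.iloc[[2, 3], col_positions]

print(list(col_positions))   # [1, 2]
print(list(subset.columns))  # ['Age', 'City']
```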
7) Using Slice Objects
Similar to extracting rows, we can also use slices to extract a range of rows and a range of columns from a DataFrame.
For example, if we want to extract the values in the second and third rows, and the first three columns of the DataFrame, we can do the following:
import pandas as pd

data = {'Name': ['John', 'Jane', 'Alice', 'Bob'],
        'Age': [25, 30, 35, 40],
        'City': ['New York', 'London', 'Paris', 'Tokyo']}
df = pd.DataFrame(data)

# Display the original DataFrame
print('Original DataFrame:\n', df)

# Extracting the values as the subset of the df using slice object
subset = df.iloc[1:3, 0:3]

# Display the extracted values.
print('Extracted subset:\n', subset)
Output:
Original DataFrame:
     Name  Age      City
0   John   25  New York
1   Jane   30    London
2  Alice   35     Paris
3    Bob   40     Tokyo
Extracted subset:
     Name  Age    City
1   Jane   30  London
2  Alice   35   Paris
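A bare colon selects everything along an axis, and slices can be mixed with lists or single integers. The sketch below shows a few such combinations on the same sample data; the variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["John", "Jane", "Alice", "Bob"],
    "Age": [25, 30, 35, 40],
    "City": ["New York", "London", "Paris", "Tokyo"],
})

all_rows_last_two_cols = df.iloc[:, 1:]  # every row, columns from position 1 on
mixed = df.iloc[1:3, [0, 2]]             # row slice combined with a column list
last_column = df.iloc[:, -1]             # every row, last column -> Series

print(list(all_rows_last_two_cols.columns))  # ['Age', 'City']
print(list(mixed.columns))                   # ['Name', 'City']
print(list(last_column))                     # ['New York', 'London', 'Paris', 'Tokyo']
```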
Conclusion
In this article, we looked at iloc, the position-based indexer in the Pandas library. It lets us select specific rows and columns from a DataFrame by their integer position, using single integers, lists of integers, slices, boolean arrays, or callables.