Python is known for its English-like commands, readable code, and simple syntax. It also provides a wide array of data structures, and the list is one of them. A list lets us store a large amount of sequential data in a single variable, so a good programmer must know how to handle and manipulate lists. In this article, you will learn how to partition a list into chunks of a given size. But before we do that, let us give you a quick recap of lists.
What is a List?
If you are familiar with arrays in other programming languages, then the concept of lists will come naturally to you. The list is one of the built-in data types in Python and is used to store a collection of data. We can think of lists as dynamically sized arrays that can also store heterogeneous data. By dynamically sized, we mean that the size of a list can change at runtime. Lastly, lists are mutable, meaning we can alter them by adding, removing, or replacing elements even after creating them. Just like an array, the elements of a list are indexed, with the index of the first element being 0.
For example:
# create a list
List = ["favTutor", 60, 32.6]

# accessing elements of the list
print("Element at the first position: " + List[0])
print("Element at the third position: " + str(List[2]))
Output:
Element at the first position: favTutor
Element at the third position: 32.6
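To make the points about mutability and dynamic sizing concrete, here is a minimal sketch. The variable name `items` is only illustrative and does not appear elsewhere in this article.

```python
# A list can hold values of different types
items = ["favTutor", 60, 32.6]

# Lists grow and shrink at runtime
items.append("new")   # add an element to the end
items.remove(60)      # remove the first element equal to 60

# Elements can be replaced in place because lists are mutable
items[0] = "FavTutor"

print(items)       # ['FavTutor', 32.6, 'new']
print(len(items))  # 3
```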
Split a List into Chunks of Given Size in Python
Some tasks in software development and machine learning require a list to be split in a certain way, i.e., into equal chunks of a given size. Not sure what this means? Let us lend a hand. Splitting a list into equal chunks returns a list of lists, each holding the same number of elements. For example, if the given list contains m elements and each chunk should have size n, then the new list contains m/n lists of n elements each, provided n divides m evenly. A list of 10 elements split into chunks of size 2 gives 5 lists of 2 elements each.
If the length of the list is not evenly divisible by the chunk size, then the last chunk holds the left-over elements. For example, splitting 10 elements into chunks of size 3 gives three chunks of 3 elements and a final chunk with just 1 element.
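The arithmetic behind these two cases can be sketched as follows. `chunk_count` and `last_chunk_size` are hypothetical helper names used only for illustration, and the sketch assumes m and n are positive integers.

```python
def chunk_count(m, n):
    """Number of chunks when m elements are split into chunks of size n."""
    # Integer ceiling division: rounds up when n does not divide m evenly
    return (m + n - 1) // n


def last_chunk_size(m, n):
    """Size of the final chunk (assumes m > 0)."""
    remainder = m % n
    # A remainder of 0 means the last chunk is a full chunk
    return remainder if remainder else n


print(chunk_count(10, 2), last_chunk_size(10, 2))  # 5 2
print(chunk_count(10, 3), last_chunk_size(10, 3))  # 4 1
```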
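We will look at 5 ways to split a list into chunks of a given size. Three of them use only Python and its standard library, and two rely on the third-party toolz and NumPy packages. Let us now dive deep into each of those methods one by one.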
1) Using yield
Before discussing this approach, let us first understand in brief what yield is. To make this concept easy to understand, let us draw an analogy between the return keyword and the yield keyword. The former is used inside a function to exit it and give back a value to the caller. The latter also hands a value back to the caller, but with one major difference: a function that contains yield is a generator function, and calling it returns a generator object instead of running the body right away. Each time the caller asks the generator for its next value, the body runs until it reaches a yield, hands over that value, and pauses at that line. On the next request it resumes from the same point instead of starting over.
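The following minimal sketch shows how a generator behaves when values are requested one at a time. The function name `count_up_to` is made up for this illustration.

```python
def count_up_to(limit):
    """Yield 1, 2, ..., limit, one value per request."""
    n = 1
    while n <= limit:
        yield n   # hand back n and pause here until the next request
        n += 1


gen = count_up_to(3)  # nothing in the body has run yet
first = next(gen)     # runs the body up to the first yield
second = next(gen)    # resumes after the yield and runs to the next one
rest = list(gen)      # drains whatever values are left

print(first, second, rest)  # 1 2 [3]
```

Because values are produced only on request, a generator never needs to hold all of its results in memory at once.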
In this approach, we create a generator to yield successive chunks of the given size.
For example:
# create a list
o_list = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]

# generator
def partition(lst, size):
    for i in range(0, len(lst), size):
        yield lst[i : i+size]

# size of each chunk
n = 2

# partition the list
p_list = list(partition(o_list, n))

# display original list
print("Original List: ")
print(o_list)

# display the list results
print("Partitioned List:")
print(p_list)
Output:
Original List:
[10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
Partitioned List:
[[10, 20], [30, 40], [50, 60], [70, 80], [90, 100]]
2) Using list comprehension
List comprehension provides us with a more elegant way of achieving our aim. It is a shorter, more concise way of creating a new list from an existing list, tuple, string, etc. A list comprehension consists of an expression and a for loop. The for loop iterates over each element of the original iterable, and the expression is evaluated for each of those elements. The value produced by the expression for each element is included in the new list. Now you may be wondering how we can split a list using list comprehension? Don’t worry, we will tell you.
This approach combines the concepts of slicing a list and list comprehension. We loop over the indices of the original list with a step equal to the given chunk size, so the loop variable i takes the values 0, n, 2n, and so on. For each value of i, we slice the original list from i to i + n to form one chunk.
For example:
# create a list
o_list = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]

# size of each chunk
n = 3

# partition the list using list comprehension
p_list = [o_list[i:i + n] for i in range(0, len(o_list), n)]

# display original list
print("Original List: ")
print(o_list)

# display the list results
print("Partitioned List:")
print(p_list)
Output:
Original List:
[10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
Partitioned List:
[[10, 20, 30], [40, 50, 60], [70, 80, 90], [100]]
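If you need this in more than one place, the comprehension can be wrapped in a small function. The sketch below is one possible way to do it. The name `chunk` and the check on `size` are additions for illustration, not part of the example above.

```python
def chunk(seq, size):
    """Return a list of slices of seq, each of length size (the last may be shorter)."""
    if size <= 0:
        raise ValueError("size must be a positive integer")
    return [seq[i:i + size] for i in range(0, len(seq), size)]


print(chunk([10, 20, 30, 40, 50, 60, 70], 3))  # [[10, 20, 30], [40, 50, 60], [70]]
print(chunk("abcdefg", 3))                     # ['abc', 'def', 'g']
print(chunk([], 3))                            # []
```

Since the function relies only on len() and slicing, it also works on other sliceable sequences such as strings and tuples, and each chunk has the same type as the input.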
3) Using itertools module
Python comes with a large number of standard library modules, and itertools is one of them. This module has a function called islice(), which returns selected elements from an iterable, much like slicing a list. Let us first look at the syntax of this function to understand it better:
itertools.islice(iterable, start, stop[, step])
This function takes four arguments: iterable, start, stop, and step. The iterable argument, in this case, is the list that we want to break into chunks. The start index is the position in the original list from which elements are included in the slice. The stop index is the position at which the slice ends, and the element at that position is not included. The step argument allows us to skip elements. Now let us understand how we can divide a list into smaller chunks of a given size using this function.
In this approach, we create a generator function that yields one slice, or chunk, of the original list at a time.
For example:
import itertools

# create a list
o_list = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]

# generator
def partition(lst, size):
    for i in range(0, len(lst), size):
        yield list(itertools.islice(lst, i, i + size))

# size of each chunk
n = 3

# partition the list
p_list = list(partition(o_list, n))

# display original list
print("Original List: ")
print(o_list)

# display the list results
print("Partitioned List:")
print(p_list)
Output:
Original List:
[10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
Partitioned List:
[[10, 20, 30], [40, 50, 60], [70, 80, 90], [100]]
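The example above calls len() on its input, so it only works for sequences. Because islice() accepts any iterable, a variation can also chunk generators and other inputs whose length is not known in advance. The following is a sketch of that idea rather than a replacement for the example above, and the name `chunks_from_iterable` is hypothetical.

```python
from itertools import islice


def chunks_from_iterable(iterable, size):
    """Yield lists of up to `size` items taken from any iterable."""
    if size <= 0:
        raise ValueError("size must be a positive integer")
    it = iter(iterable)  # one shared iterator, consumed step by step
    while True:
        chunk = list(islice(it, size))  # take the next `size` items
        if not chunk:                   # iterator exhausted
            return
        yield chunk


squares = (x * x for x in range(7))  # a generator has no len()
result = list(chunks_from_iterable(squares, 3))
print(result)  # [[0, 1, 4], [9, 16, 25], [36]]
```

Each call to islice() continues from where the previous one stopped, so every element is visited only once.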
4) Using toolz.itertoolz module
If the concept of using yield or list comprehension is confusing to you, then do not worry. You can use this straightforward method to achieve your aim. The third-party toolz library (installed with pip install toolz) provides the toolz.itertoolz module, which contains the partition() and partition_all() functions. Let us first look at their syntax:
toolz.itertoolz.partition(n, seq, pad='__no__pad__')
toolz.itertoolz.partition_all(n, seq)
Both functions take the size of the chunks as their first parameter and the list to be divided as their second parameter, and both produce the chunks as tuples. The difference lies in the left-over elements: partition() drops an incomplete final chunk unless a pad value is supplied, whereas partition_all() keeps it. Since we want the last chunk to hold the left-over elements, we use partition_all() in the example below.
For example:
from toolz import partition_all

# create a list
o_list = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]

# size of each chunk
n = 3

# partition the list, keeping the shorter final chunk
p_list = list(partition_all(n, o_list))

# display original list
print("Original List: ")
print(o_list)

# display the list results
print("Partitioned List:")
print(p_list)
Output:
Original List:
[10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
Partitioned List:
[(10, 20, 30), (40, 50, 60), (70, 80, 90), (100,)]
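The pad parameter in the partition() signature above suggests a third option for the left-over elements: filling the last chunk up to full size. The sketch below shows that idea using only the standard library function itertools.zip_longest. It is not toolz's implementation, and `partition_padded` is a hypothetical name chosen for this illustration.

```python
from itertools import zip_longest


def partition_padded(seq, size, pad=None):
    """Split seq into tuples of exactly `size` items, padding the last one."""
    it = iter(seq)
    # The same iterator is passed `size` times, so each output tuple
    # pulls `size` consecutive items; missing items become `pad`.
    return list(zip_longest(*[it] * size, fillvalue=pad))


o_list = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
print(partition_padded(o_list, 3, pad=0))
# [(10, 20, 30), (40, 50, 60), (70, 80, 90), (100, 0, 0)]
```

Padding is useful when later code expects every chunk to have the same length, but the pad value should be one that cannot be mistaken for real data.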
5) Using NumPy
NumPy is a popular Python library for working with arrays. It contains functions for linear algebra, Fourier transforms, and matrices. It also provides a function called array_split(), which can perform the very task we are trying to achieve. The syntax of this function is as follows:
numpy.array_split(ary, indices_or_sections, axis=0)
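This function takes the list to be partitioned as its first argument. The second argument, indices_or_sections, is either an integer giving the number of sections to produce or a sequence of indices at which to split. Passing the chunk size n as an integer would therefore produce n sections rather than chunks of n elements, so to get chunks of size n we pass the split points n, 2n, 3n, and so on. The axis argument comes into the picture when we are working with multi-dimensional arrays. But since today we are only working with simple lists, you can ignore this argument for now. Note that the result is a list of NumPy arrays rather than a list of lists. Let us now look at this function in action.
For example:
import numpy

# create a list
o_list = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]

# size of each chunk
n = 2

# partition the list at indices n, 2n, 3n, ...
p_list = numpy.array_split(o_list, list(range(n, len(o_list), n)))

# display original list
print("Original List: ")
print(o_list)

# display the list results
print("Partitioned List:")
print(p_list)
Output:
Original List:
[10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
Partitioned List:
[array([10, 20]), array([30, 40]), array([50, 60]), array([70, 80]), array([ 90, 100])]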
Conclusion
The list is one of the most fundamental concepts in Python, and knowing your way around lists is a skill every programmer should have. In this article, we saw five different ways to partition a list into chunks of a given size: a generator with yield, a list comprehension, itertools.islice(), the toolz library, and NumPy's array_split(). We encourage you to go through all of the methods above and pick the one that best fits your data and the libraries you already use.