A Python list can hold any number of elements, including duplicates, so you will sometimes need to reduce a list to its unique values. Here, we are going to study several ways to remove duplicates from a list in Python. So, let's get started!
What is a List?
The list is one of the most commonly used data types in Python. A list is written as comma-separated values inside square brackets. Its elements do not have to share a data type, and it supports negative indexing, which counts positions from the end of the list.
Many operations that work on strings, such as slicing and concatenation, work on lists in the same way. Lists can also be nested, i.e. a list can contain another list.
Example:
# creating a list of items with different data types
sample_list = [6, "mark", ["A", "I"]]
print(sample_list)
Output:
[6, 'mark', ['A', 'I']]
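The other list features mentioned above (negative indexing, slicing, concatenation and nesting) can be sketched on the same sample list. The variable names below are only illustrative.

```python
sample_list = [6, "mark", ["A", "I"]]

last_item = sample_list[-1]      # negative indexing: last element
first_two = sample_list[:2]      # slicing: first two elements
extended = sample_list + [6]     # concatenation builds a new list
inner_first = sample_list[2][0]  # indexing into the nested list

print(last_item)    # ['A', 'I']
print(first_two)    # [6, 'mark']
print(extended)     # [6, 'mark', ['A', 'I'], 6]
print(inner_first)  # A
```

Note that the concatenated list now contains the value 6 twice, which is exactly the kind of duplicate the rest of this article deals with.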
Why Remove Duplicates from a List?
There are several reasons to do it. Duplicates make a list harder to read and understand, and they take up unnecessary space. They can also lead to confusion and errors when the list is used in later operations.
For example, if you count the entries of a list to find out how many distinct values it holds, duplicates inflate the result. In general, removing duplicates you do not need makes a list more organized and easier to work with.
5 Ways to Remove Duplicates from a List in Python
There are many ways to remove duplicates from a list in Python. Let’s study them below:
Method 1) Naïve Method
In this method, we iterate over the whole list with a for loop. We create a new list to hold the unique values and use Python's "not in" operator to check whether the current element already exists in the new list. If it does not exist, we append it to the new list; if it does, we skip it. Because elements are appended in the order they are first seen, the original order is preserved.
Code:
# removing duplicates from the list using the naive method
# initializing list
sam_list = [11, 13, 15, 16, 13, 15, 16, 11]
print("The list is: " + str(sam_list))
# remove duplicates from the list
result = []
for i in sam_list:
    if i not in result:
        result.append(i)
# printing list after removal
print("The list after removing duplicates: " + str(result))
Output:
The list is: [11, 13, 15, 16, 13, 15, 16, 11]
The list after removing duplicates: [11, 13, 15, 16]
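If the same logic is needed in more than one place, it can be wrapped in a small function. The sketch below is one way to do that; the function name remove_duplicates_naive is hypothetical and not part of any library.

```python
def remove_duplicates_naive(items):
    """Return a new list with duplicates removed, keeping first occurrences."""
    result = []
    for item in items:
        # "not in" scans result from the start on every check
        if item not in result:
            result.append(item)
    return result


sam_list = [11, 13, 15, 16, 13, 15, 16, 11]
unique_values = remove_duplicates_naive(sam_list)
print(unique_values)  # [11, 13, 15, 16]
```

Since every "not in" check scans the result list, the running time grows roughly with the square of the list length, which is usually fine for short lists but can be slow for very long ones.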
Method 2) Using a list comprehension
A list comprehension builds a list from a for loop written inside square brackets. The method is similar to the naive approach discussed above, but instead of a separate for statement, the loop sits inside the square brackets of a list.
We put the for loop inside the brackets and add an if condition that filters out values that are already in the result. Note that the comprehension in the code below is used only for its side effect of calling result.append(); the list it builds itself (one None per appended value) is discarded.
Code:
# removing duplicates from the list using a list comprehension
# initializing list
sam_list = [11, 13, 15, 16, 13, 15, 16, 11]
print("The list is: " + str(sam_list))
# remove duplicates from the list
result = []
# the comprehension is run only for the append() side effect
[result.append(x) for x in sam_list if x not in result]
# printing list after removal
print("The list after removing duplicates: " + str(result))
Output:
The list is: [11, 13, 15, 16, 13, 15, 16, 11]
The list after removing duplicates: [11, 13, 15, 16]
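A variant of this idea, shown here as a sketch rather than as the method above, lets the comprehension itself produce the result and tracks already-seen values in a helper set. It assumes every element is hashable; the name seen is only illustrative.

```python
sam_list = [11, 13, 15, 16, 13, 15, 16, 11]

seen = set()  # values encountered so far
# set.add() returns None, so "seen.add(x)" is falsy and is evaluated
# only when "x in seen" is False, i.e. on the first occurrence of x
result = [x for x in sam_list if not (x in seen or seen.add(x))]

print(result)  # [11, 13, 15, 16]
```

Membership tests on a set do not scan the whole collection, so this version tends to scale better on long lists while still keeping the original order.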
Method 3) Using set()
This is one of the most common ways to remove duplicates from a Python list. A set, like a list, is a collection of items that can be accessed through a single variable name. But the most important property of a set is that it cannot have duplicate values. How can we use this?
If a list is passed as an argument to the set() constructor, the result is a set containing every element of the list, with duplicate values dropped. The resulting set can be converted back to a list using list(). The drawbacks of this method are that the order of the original list is not guaranteed to be kept, and that every element must be hashable (a list that contains other lists, for example, raises a TypeError).
Code:
# removing duplicates from the list using set()
# initializing list
sam_list = [11, 15, 13, 16, 13, 15, 16, 11]
print("The list is: " + str(sam_list))
# remove duplicates from the list
sam_list = list(set(sam_list))
# printing list after removal
# the original ordering may be lost
print("The list after removing duplicates: " + str(sam_list))
Output:
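The list is: [11, 15, 13, 16, 13, 15, 16, 11]
The list after removing duplicates: [16, 11, 13, 15]
The order of the second list may differ from run to run or between Python versions, because sets are unordered.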
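If the original order matters, one possible workaround is to sort the unique values by the position at which each first appears in the original list. This is a sketch under the assumption that the list is short enough for repeated index() lookups to be acceptable.

```python
sam_list = [11, 15, 13, 16, 13, 15, 16, 11]

# list.index(x) returns the position of the first occurrence of x,
# so sorting the unique values by it restores the original order
result = sorted(set(sam_list), key=sam_list.index)

print(result)  # [11, 15, 13, 16]
```

Each index() call scans the list from the start, so on long lists the order-preserving methods described elsewhere in this article are likely a better fit.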
Method 4) Using list comprehension + enumerate()
Combining a list comprehension with the enumerate() function also removes duplicates from a Python list. In this method, elements that have already occurred are skipped and the original order is maintained. The enumerate() function supplies the index of each element, which makes this check possible.
In the code below, the variable n holds the index of the element being checked, and the slice sam_list[:n] holds every element that comes before it. If the element already appears in that slice, we ignore it; otherwise it becomes part of the new list built by the list comprehension.
Code:
# removing duplicates from the list using list comprehension + enumerate()
# initializing list
sam_list = [11, 15, 13, 16, 13, 15, 16, 11]
print("The list is: " + str(sam_list))
# keep an element only if it does not appear earlier in the list
result = [i for n, i in enumerate(sam_list) if i not in sam_list[:n]]
# printing list after removal
print("The list after removing duplicates: " + str(result))
Output:
The list is: [11, 15, 13, 16, 13, 15, 16, 11]
The list after removing duplicates: [11, 15, 13, 16]
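Because this method relies only on the "in" comparison and never hashes the elements, it should also work on lists whose elements are themselves lists. The nested data below is made up for illustration.

```python
nested = [[1, 2], [3], [1, 2], [3], [4]]

# set(nested) would raise "TypeError: unhashable type: 'list'",
# but the slice-and-compare check only needs equality
result = [item for n, item in enumerate(nested) if item not in nested[:n]]

print(result)  # [[1, 2], [3], [4]]
```

The same holds for the naive loop in Method 1, which uses the same kind of membership check.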
Method 5) Using collections.OrderedDict.fromkeys()
This is typically one of the fastest ways to remove duplicates from a Python list while keeping the original order. The fromkeys() method builds a dictionary whose keys are the elements of the list, which drops the duplicates, and that dictionary is then converted to a list. This method also works well on strings.
In the code below, fromkeys() creates a key for each element in the list. Because the keys of a dictionary cannot be duplicated, repeated values collapse into a single key, and OrderedDict keeps the keys in the order in which they were first inserted.
Code:
# removing duplicates from the list using collections.OrderedDict.fromkeys()
from collections import OrderedDict
# initializing list
sam_list = [11, 15, 13, 16, 13, 15, 16, 11]
print("The list is: " + str(sam_list))
# remove duplicates from the list
result = list(OrderedDict.fromkeys(sam_list))
# printing list after removal
print("The list after removing duplicates: " + str(result))
Output:
The list is: [11, 15, 13, 16, 13, 15, 16, 11]
The list after removing duplicates: [11, 15, 13, 16]
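Two related variations are sketched below. On Python 3.7 and later, a plain dict also preserves insertion order, so dict.fromkeys() can stand in for OrderedDict.fromkeys(). For a string, the keys are its characters, which can be joined back together. The sample word is only an illustration.

```python
from collections import OrderedDict

sam_list = [11, 15, 13, 16, 13, 15, 16, 11]
word = "banana"

# a plain dict keeps insertion order on Python 3.7+
result = list(dict.fromkeys(sam_list))

# for a string, each character becomes a key; join them back together
unique_chars = "".join(OrderedDict.fromkeys(word))

print(result)        # [11, 15, 13, 16]
print(unique_chars)  # ban
```

As with set(), dictionary keys must be hashable, so this approach does not apply to lists that contain other lists.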
These are a few of the methods we can use to remove duplicates from a Python list.
Conclusion
In this article, we looked at the Python list and at different methods for removing duplicate elements from it, each with an example and its output.