How to Remove Duplicates from List in Python? (with code)

  • Nov 13, 2023
  • 7 Minutes Read
  • By Shivali Bhadaniya

In Python, a list is a versatile collection of objects that allows duplicates. But sometimes it is necessary to make the list unique, either to streamline our data or to perform certain operations. Here, we are going to study multiple ways to remove duplicates from a list in Python. So, let's get started!

Why Remove Duplicates from the List?

A Python list is a built-in data structure for storing a collection of items. It is written as comma-separated values inside square brackets. One of its most important advantages is that the elements inside a list do not have to be of the same data type. Learn more about printing lists in Python to understand the concept better.

Now there are several reasons to remove duplicates from a list. Duplicates in a list can take up unnecessary space and decrease performance. Additionally, it can lead to confusion and errors if you're using the list for certain operations. So, removing duplicates will make the data more accurate for better analysis.

For example, if you're trying to find the unique elements in a list, duplicates can give you incorrect results. In general, it's a good idea to remove duplicates from a list to make it more organized and easier to work with.

If you are still confused between an array and a list, read Array vs List in Python.

6 Ways to Remove Duplicates from a List in Python

There are many ways to remove duplicates from a list in Python. Let’s check them out one by one:

1) Using set()

A set is a built-in data structure similar to a list: it is a collection of items that can be accessed through a single variable name.

The simplest way to remove duplicates from a list in Python is to convert the list into a set. This automatically removes repeated entries because a set cannot contain duplicate values.

When a list is converted to a set, that is, passed as an argument to set(), a set is created containing every distinct element of the list, with the duplicates dropped. The resulting set can then be converted back to a list using list().

Example:

# removing duplicates from the list using set()

# initializing list
sam_list = [11, 13, 15, 16, 13, 15, 16, 11]
print("The list is: " + str(sam_list))

# removing duplicates from the list
sam_list = list(set(sam_list))

# printing the list after removal
# note: the original ordering may not be preserved
print("The list after removing duplicates: " + str(sam_list))

 

Output:

 The list is: [11, 13, 15, 16, 13, 15, 16, 11]

 The list after removing duplicates: [11, 13, 15, 16]

 

This approach makes use of Python sets, which are implemented as hash tables and therefore allow very fast membership checks. It is quick, especially for larger lists, but the drawback is that the order of the original list is lost.
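If the original order does not matter but you still want a predictable result, you can sort the deduplicated values. A minimal sketch, assuming the list holds comparable values such as numbers:

# set() drops the duplicates, sorted() then returns them in ascending order
sam_list = [11, 13, 15, 16, 13, 15, 16, 11]
unique_sorted = sorted(set(sam_list))
print(unique_sorted)  # [11, 13, 15, 16]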

2) Using a For Loop

In this method, we iterate over the whole list using a for loop. We create a new list to hold the unique values and use the "not in" operator in Python to check whether the current element already exists in the new list. If it does not, we append it to the new list; if it does, we ignore it.

Example:

# removing duplicates from the list using a naive for loop

# initializing list
sam_list = [11, 13, 15, 16, 13, 15, 16, 11]
print("The list is: " + str(sam_list))

# removing duplicates from the list
result = []
for i in sam_list:
    if i not in result:
        result.append(i)

# printing the list after removal
print("The list after removing duplicates: " + str(result))

 

Output:

 The list is: [11, 13, 15, 16, 13, 15, 16, 11]

 The list after removing duplicates: [11, 13, 15, 16]

 

3) Using collections.OrderedDict.fromkeys()

This is one of the fastest approaches that also preserves order. In the code below, the fromkeys() method creates dictionary keys from all the elements in the list. Since keys in a dictionary cannot be duplicated, fromkeys() drops the duplicate values on its own, and converting the keys back with list() gives the deduplicated list.

Example:

# removing duplicates from the list using collections.OrderedDict.fromkeys()
from collections import OrderedDict

# initializing list
sam_list = [11, 13, 15, 16, 13, 15, 16, 11]
print("The list is: " + str(sam_list))

# removing duplicates from the list
result = list(OrderedDict.fromkeys(sam_list))

# printing the list after removal
print("The list after removing duplicates: " + str(result))

 

Output:

 The list is: [11, 13, 15, 16, 13, 15, 16, 11]

 The list after removing duplicates: [11, 13, 15, 16] 

 

We used OrderedDict from the collections module to preserve the order.
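Note that since Python 3.7, regular dictionaries also preserve insertion order, so plain dict.fromkeys() gives the same result without any import. A minimal sketch:

# dict keys are unique and, from Python 3.7 onward, keep insertion order
sam_list = [11, 13, 15, 16, 13, 15, 16, 11]
result = list(dict.fromkeys(sam_list))
print(result)  # [11, 13, 15, 16]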

4) Using a list comprehension

List comprehension refers to building a list with a for loop written inside square brackets and storing the result under a variable name. The method is similar to the naive approach discussed above, but instead of an external for loop, the loop is written inside the square brackets of the new list.

We use the for loop inside the brackets and add an if condition that filters out duplicate values.

Example:

# removing duplicates from the list using list comprehension

# initializing list
sam_list = [11, 13, 15, 16, 13, 15, 16, 11]
print("The list is: " + str(sam_list))

# removing duplicates from the list
# note: the comprehension is used only for its side effect of appending
result = []
[result.append(x) for x in sam_list if x not in result]

# printing the list after removal
print("The list after removing duplicates: " + str(result))

 

Output:

 The list is: [11, 13, 15, 16, 13, 15, 16, 11]

 The list after removing duplicates: [11, 13, 15, 16]

 

5) Using list comprehension & enumerate()

When list comprehension is combined with the enumerate() function, we can remove duplicates from a Python list while keeping the order intact. In this method, elements that have already occurred are skipped, and enumerate() supplies the index that makes this check possible.

In the code below, the variable n holds the index of the element being checked, and the slice sam_list[:n] is used to see whether that element has already appeared earlier in the list. If it has, we ignore it; otherwise it becomes part of the new list built by the comprehension, just as in the previous approach.

Example:

# removing duplicates from the list using list comprehension + enumerate()

# initializing list
sam_list = [11, 13, 15, 16, 13, 15, 16, 11]
print("The list is: " + str(sam_list))

# removing duplicates from the list
result = [i for n, i in enumerate(sam_list) if i not in sam_list[:n]]

# printing the list after removal
print("The list after removing duplicates: " + str(result))

 

Output:

 The list is: [11, 13, 15, 16, 13, 15, 16, 11]

 The list after removing duplicates: [11, 13, 15, 16]

 

6) Using the ‘pandas’ module

Using the pd.Series() method, a pandas Series object is constructed from the original list. The drop_duplicates() method is then called on the Series to eliminate any duplicate values. Lastly, using the tolist() method, the resulting Series is converted back into a list.

Example:

import pandas as pd
original_list = [1, 1, 2, 3, 4, 4]
new_list = pd.Series(original_list).drop_duplicates().tolist()
print(f"the original list is {original_list} and the new list without duplicates is {new_list}”)

 

Output:

the original list is [1, 1, 2, 3, 4, 4] and the new list without duplicates is [1, 2, 3, 4]
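A related option, if you are already working with pandas, is the Series.unique() method, which also keeps the order of first appearance. A minimal sketch, assuming pandas is installed:

import pandas as pd

original_list = [1, 1, 2, 3, 4, 4]
# unique() returns the values in order of first appearance as a NumPy array
new_list = pd.Series(original_list).unique().tolist()
print(new_list)  # [1, 2, 3, 4]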

 


How to remove duplicate words from a list? 

To remove duplicate words from a list in Python, you can use the set() function. Consider the example below:

my_list = ["mobile","laptop","earphones","mobile", 'laptop']
new_list = list(set(my_list))
print(f"the old list was {my_list}, the new list without duplicates is {new_list}”)

 

Output:

the old list was ['mobile', 'laptop', 'earphones', 'mobile', 'laptop'], the new list without duplicates is ['mobile', 'laptop', 'earphones']

 

Keep in mind that this method does not maintain the order of the original list. If you need to keep the order, you can filter out duplicates with a loop and an empty list:

my_list = ['mobile', 'laptop', 'earphones', 'mobile', 'laptop']
new_list = []
for word in my_list:
    if word not in new_list:
        new_list.append(word)
print(f"the old list was {my_list}, the new list without duplicates is {new_list}")

 

Output:

the old list was ['mobile', 'laptop', 'earphones', 'mobile', 'laptop'], the new list without duplicates is ['mobile', 'laptop', 'earphones']

 

Key Considerations

While the methods described above provide effective ways to remove duplicates from a list, there are a few additional considerations to keep in mind:

  • Data Type Compatibility: Some methods may have limitations when working with specific data types. Ensure that the chosen method is compatible with the elements in your list.

  • Order Preservation: If maintaining the original order of elements is crucial, consider using methods such as list comprehension with enumeration or the collections module's OrderedDict class.

  • Performance: For large lists or time-sensitive operations, it's important to choose a method that offers optimal performance. Consider conducting performance tests, as in the sketch below, to determine the most efficient approach.
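For example, the standard timeit module can be used to compare a few of the approaches above on data that resembles yours. A minimal sketch (the list size, repetition count, and function names are illustrative):

import timeit

# a sample list with many duplicates; adjust to match your real data
data = list(range(200)) * 5

def dedupe_set():
    return list(set(data))

def dedupe_dict():
    return list(dict.fromkeys(data))

def dedupe_loop():
    result = []
    for i in data:
        if i not in result:
            result.append(i)
    return result

# time each approach over a fixed number of runs
for func in (dedupe_set, dedupe_dict, dedupe_loop):
    print(func.__name__, timeit.timeit(func, number=200))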

Conclusion

Removing duplicates from a list in Python is a common task, and this article has provided you with various methods to accomplish it. We covered approaches ranging from the naive loop to sets, list comprehensions, and specialized modules such as collections and pandas.


About The Author
Shivali Bhadaniya
I'm Shivali Bhadaniya, a computer engineering student and technical content writer, very enthusiastic about learning and exploring new technologies and looking forward to great opportunities. It is amazing for me to share my knowledge through my content to help curious minds.