The String Compression problem is one of the famous leetcode problems in Data structure and algorithms that involves finding and returning the length of a new string formed after compressing the original one. In this article, we will understand the String Compression problem and explore the approach to solve it by implementing code in C++, Java, and Python.
What is a String Compression Problem?
In the string compression problem, we are given with vector of characters and we have to compress those according to some rules and return the length of a new string thus formed.
Those rules are: Begin with an empty string s. For each group of consecutive repeating characters in chars: If the group’s length is 1, append the character to s. Otherwise, append the character followed by the group’s length.
Let us discuss this problem with an example:
chars = [“a”,“b”,“b”,“c”,“c”,“c”]
Output = 5
Naive Approach
The simplest way to solve the String Compression problem is by using two pointers. One pointer for iterating through the original character array and one for keeping track of the current position in the compressed array. We will also use a variable to keep track of the count of consecutive characters.
Below are the steps to be followed:
- First, we will set the current letter to the first character in the array and initialize the count to 0.
- Then we will iterate through the array until you find a different character or reach the end of the array.
- For each iteration, increment the count and the index i.
- When we find a different character or reach the end of the array, write the current letter to the compressed array and, if the count is greater than 1, write the count as a string to the compressed array.
- Then reset the count to 0 and set the current letter to the new letter.
- Finally, we return the length of the compressed array, which is equal to the position of the last character in the compressed array plus one.
The time complexity for the naive approach is O(n) because we are iterating through the whole chars vector only once. The space complexity for this approach is O(n).
Optimized Approach
In this approach, we will use the same approach used above but instead of storing the compressed string separately we will store its length and update the input vector given to us. Thus to store the length of the compressed string we will use another ans variable.
- Different character and count = 1 so we will do ans += 1 and get ans =1;
- Different character and count = 2 so we will do ans += 2 and get ans = 3;
- j goes out of bound and the count is equal to 3 so ans += 2 and we will get ans = 5
C++ Code
Here is the C++ program to solve String Compression problem:
#include <iostream> #include <vector> #include <string> using namespace std; // Adding the std namespace // Function to compress a vector of characters in-place int compress(vector<char>& chars) { int ans = 0; // Variable to keep track of the compressed vector length // Iterate through input vector using i pointer for (int i = 0; i < chars.size();) { const char letter = chars[i]; // Current character being compressed int count = 0; // Count of consecutive occurrences of the letter // Count consecutive occurrences of the letter in the input vector while (i < chars.size() && chars[i] == letter) { ++count; ++i; } // Write the letter to the compressed vector chars[ans++] = letter; // If count is greater than 1, write count as a string to the compressed vector if (count > 1) { // Convert count to a string and iterate over each character in the string for (const char c : to_string(count)) { chars[ans++] = c; } } } // Return the length of the compressed vector return ans; } int main() { // Example vector vector<char> inputVector = {'a', 'b', 'b', 'c', 'c', 'c'}; // Call the compress function int compressedLength = compress(inputVector); // Display the compressed vector and its length cout << "Compressed Vector: "; for (int i = 0; i < compressedLength; ++i) { cout << inputVector[i] << " "; } cout << "\nCompressed Vector Length: " << compressedLength << endl; return 0; }
Java Code
Below is the Java code to solve this problem:
import java.util.ArrayList; import java.util.List; public class Solution { public static int compress(List<Character> chars) { int ans = 0; // Iterate through input list using i pointer for (int i = 0; i < chars.size();) { char letter = chars.get(i); // Current character being compressed int count = 0; // Count of consecutive occurrences of letter // Count consecutive occurrences of letter in input list while (i < chars.size() && chars.get(i) == letter) { ++count; ++i; } // Write letter to compressed list chars.set(ans++, letter); // If count is greater than 1, write count as string to compressed list if (count > 1) { // Convert count to string and iterate over each character in string for (char c : Integer.toString(count).toCharArray()) { chars.set(ans++, c); } } } // Return length of compressed list return ans; } public static void main(String[] args) { // Example list List<Character> inputList = new ArrayList<>(); inputList.add('a'); inputList.add('b'); inputList.add('b'); inputList.add('c'); inputList.add('c'); inputList.add('c'); // Call the compress function int compressedLength = compress(inputList); // Display the compressed list and its length System.out.print("Compressed List: "); for (int i = 0; i < compressedLength; ++i) { System.out.print(inputList.get(i) + " "); } System.out.println("\nCompressed List Length: " + compressedLength); } }
Python Code
You can implement it in Python as well:
def compress(chars): ans = 0 # Iterate through input list using i pointer i = 0 while i < len(chars): letter = chars[i] # Current character being compressed count = 0 # Count of consecutive occurrences of letter # Count consecutive occurrences of letter in input list while i < len(chars) and chars[i] == letter: count += 1 i += 1 # Write letter to compressed list chars[ans] = letter ans += 1 # If count is greater than 1, write count as string to compressed list if count > 1: # Convert count to string and iterate over each character in string for c in str(count): chars[ans] = c ans += 1 # Return length of compressed list return ans # Example list input_list = ['a', 'b', 'b', 'c', 'c', 'c'] # Call the compress function compressed_length = compress(input_list) # Display the compressed list and its length print("Compressed List:", end=" ") for i in range(compressed_length): print(input_list[i], end=" ") print("\nCompressed List Length:", compressed_length)
Output:
Compressed Vector: a b 2 c 3
Compressed Vector Length: 5
The Time complexity for the approach is O(n) because we are iterating through the vector only once. The Space complexity for this approach is O(1) because no extra space is used to store compressed strings.
The use of carefully managed pointers minimizes time complexity to O(n) while maintaining constant space complexity. This approach is recommended for larger datasets where performance is crucial.
Finally, check the comparison of both of them:
Approaches | Time Complexity | Space Complexity | Details |
---|---|---|---|
Naive Approach:Iterating over array | O(n) | O(n) | We traverse over the whole string and consider groups of repeated characters and based on their length we form a compressed string and at last return it’s length. |
Optimized Approach | O(n) | O(1) | Here instead of storing the whole compressed string we store only its length and update the input vector given to us. |
Conclusion
In conclusion, the optimized approach given above to solve the String Compression leetcode problem stands out as the most efficient solution for the string compression problem, providing a balance between simplicity and performance. There are so many such interesting problems that you need to master, which you can do by joining our Data Structures Bootcamp to learn it thoroughly.