
Rabin-Karp Algorithm for Pattern Search (C++ & Java Code)

  • Oct 01, 2022
  • By Manvi Saxena

If you come from a computer science background, you already know how important strings and pattern matching are in the Data Structures & Algorithms world. We have many data structures for storing information, but strings are the ones at the heart of text and word processing.

In this tech blog, we are going to learn about an efficient string algorithm that is a common choice for pattern-matching problems. Yes, we'll be learning about the Rabin-Karp Algorithm today! We'll cover what the algorithm is, walk through a solved example, and finish with its implementation in C++ and Java.

So, it's going to be an exciting ride, let's go!

What is Rabin-Karp Algorithm?

Patterns in a string are easy enough to spot with your eyes, but what if a computer has to search for them? Well, that's what the Rabin-Karp Algorithm is for!

The Rabin-Karp Algorithm finds occurrences of a pattern in a text using a hash function. Instead of comparing the pattern with every window of the text character by character, it compares the hash value of the pattern with the hash value of each window, and checks the actual characters only when the two hash values are equal.

The hash value is what makes this work: most windows that cannot match are rejected with a single integer comparison, so far fewer character comparisons are needed than in a naive character-by-character search.

The Aho-Corasick algorithm is another popular algorithm for pattern searching.

Learning about the Hash Value in Rabin-Karp

What is Hash Value?

A hash value is the output of a hash function, which maps a larger input to a relatively small output, usually an integer. In this world run by technology, there is information everywhere. But how do we quickly tell whether two pieces of information could be the same?

For this, hash values are used. They act as a compact fingerprint of the information: equal inputs always produce equal hash values, and with a well-chosen hash function, different inputs only rarely share one. When they do, it is called a collision, and Rabin-Karp guards against collisions by comparing the actual characters whenever two hash values match.

How is Hash Value calculated in Rabin-Karp?

In the Rabin-Karp algorithm, a long string is represented by a small integer, its hash value. If you're a curious computer geek like me and use Java as your main programming language, you must have used the hashCode() method. Calling it on a string returns an integer, which is the hash value of that string.

But, in the Rabin-Karp algorithm, we use a special formula to find the hash value of the pattern to be searched and compare it with the hash values of the windows of the given text. Take a look at the formula below:

Hash Value of text = Σ(v_i * d^(m-i)) mod 13, for i = 1 to m

where, v_i = numerical value of the i-th character in lexicographical order, eg. a=1, b=2, c=3...

          d = base of the hash, ideally the number of characters in the input set (10 is used in this article to keep the arithmetic simple),

          m = length of the pattern (and of each text window)

Example: Finding the Hash Value of abc:

string abc

= Σ(v_i * d^(m-i)) mod 13

= ((1 * 10^2) + (2 * 10^1) + (3 * 10^0)) mod 13

= 123 mod 13

= 6
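
The calculation above can be sketched in C++ as follows. This is only an illustration of the formula, not the implementation given later in the article: the names letterValue and polynomialHash are made up for this sketch, and it assumes the input consists of lowercase letters only, valued a=1, b=2, and so on. It uses Horner's rule, so the powers of d are never computed explicitly.

```cpp
#include <string>

// Numerical value of a letter as used in the worked example: a=1, b=2, ...
// Assumption for this sketch: the input contains only 'a'..'z'.
int letterValue(char c) {
    return c - 'a' + 1;
}

// Computes (v_1*d^(m-1) + v_2*d^(m-2) + ... + v_m*d^0) mod q.
// Horner's rule: multiply the running hash by d, add the next value,
// and reduce mod q at every step so the number stays small.
int polynomialHash(const std::string& s, int d, int q) {
    int hash = 0;
    for (char c : s) {
        hash = (hash * d + letterValue(c)) % q;
    }
    return hash;
}
```

With d = 10 and q = 13, polynomialHash("abc", 10, 13) reproduces the value 6 worked out by hand above.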

 

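NOTE: In the above calculations, 13 is only an example. You can take the modulus with any prime number: a small prime eases calculation by hand, while a larger prime makes it less likely that two different strings get the same hash value.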
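
To see why the choice of prime matters, here is a small hedged sketch that counts how many windows of a text have the same hash value as the pattern without actually being equal to it. The helper names windowHash and countSpuriousHits are invented for this illustration, and the sketch again assumes lowercase letters valued a=1, b=2, and so on. For the lowercase version of the text and pattern used later in this article, abccddaefg and cdd, it reports one such false match with the prime 13 (the window abc) and none with the prime 101.

```cpp
#include <cstddef>
#include <string>

// Hash of the m letters of s starting at index start (a=1, b=2, ...).
// Assumption for this sketch: lowercase letters only.
int windowHash(const std::string& s, std::size_t start, std::size_t m, int d, int q) {
    int hash = 0;
    for (std::size_t k = 0; k < m; ++k) {
        hash = (hash * d + (s[start + k] - 'a' + 1)) % q;
    }
    return hash;
}

// Counts windows whose hash equals the pattern's hash although the
// characters differ. Each window is rehashed from scratch for clarity.
int countSpuriousHits(const std::string& text, const std::string& pattern, int d, int q) {
    const std::size_t m = pattern.size();
    if (m == 0 || m > text.size()) return 0;

    const int patternHash = windowHash(pattern, 0, m, d, q);
    int spurious = 0;
    for (std::size_t i = 0; i + m <= text.size(); ++i) {
        const bool sameHash = windowHash(text, i, m, d, q) == patternHash;
        const bool sameText = text.compare(i, m, pattern) == 0;
        if (sameHash && !sameText) ++spurious;
    }
    return spurious;
}
```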

Understanding the Working with an Example

After learning about the hash value and its significance in the Rabin-Karp algorithm, let's finally move over to how this algorithm works internally. For your ease, we have provided an example and explained the work. Let's dive in!

Given Text: ABCCDDAEFG


Pattern to be Searched: CDD


Steps to follow:

1. Assigning the Numerical Weight/Value to the characters: In this step, we assign values to the letters as per their lexicographical order, e.g. a=1, b=2, c=3...

2. Calculate the Hash Value of the pattern to be searched: Now, we calculate the hash value of the pattern, i.e. CDD, with the formula given above.


= Σ(v_i * d^(m-i)) mod 13

= ((3 * 10^2) + (4 * 10^1) + (4 * 10^0)) mod 13

= 344 mod 13

= 6

 

3. Calculate the Hash Value of each window of size 'm' in the text: For the first window ABC, hash value:

= Σ(v_i * d^(m-i)) mod 13

= ((1 * 10^2) + (2 * 10^1) + (3 * 10^0)) mod 13

= 123 mod 13

= 6

 

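In this step, after calculating the hash value of each window one by one, we check whether it matches the hash value of the pattern. Here the first window ABC also hashes to 6, the same as CDD, even though the characters differ. Such a false alarm is called a spurious hit, and it is the reason equal hash values must always be confirmed by comparing the characters.

[Figure: Rabin-Karp algorithm working explained]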

4. Comparison: Keep sliding the window and comparing each window's hash value with that of the pattern. If they are equal, compare the window with the pattern character by character to confirm the match; otherwise, move on to the next window.

Output: 

True, the string contains the given text at position 4 (counting from 1).
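
The implementations in this article do not rehash every window from scratch. They update the previous window's hash by removing the leading character and appending the next one, which is usually called a rolling hash. The sketch below isolates that step. The function name rollingWindowHashes is hypothetical, and unlike the implementations in this article, which use raw character codes, it uses the a=1, b=2 values of the worked example and assumes lowercase input.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Returns the hash of every length-m window of text (a=1, b=2, ...).
// Assumption for this sketch: lowercase letters only.
std::vector<int> rollingWindowHashes(const std::string& text, std::size_t m, int d, int q) {
    std::vector<int> hashes;
    if (m == 0 || m > text.size()) return hashes;

    auto value = [](char c) { return c - 'a' + 1; };

    // h = d^(m-1) mod q, the weight of the window's leading character.
    int h = 1;
    for (std::size_t k = 0; k + 1 < m; ++k) h = (h * d) % q;

    // Hash of the first window, computed directly.
    int t = 0;
    for (std::size_t k = 0; k < m; ++k) t = (t * d + value(text[k])) % q;
    hashes.push_back(t);

    // Every later window is derived from the previous hash in O(1).
    for (std::size_t i = 0; i + m < text.size(); ++i) {
        // Remove text[i], shift the rest one place left, append text[i + m].
        t = (d * (t - value(text[i]) * h) + value(text[i + m])) % q;
        if (t < 0) t += q;  // in C++, % can give a negative result
        hashes.push_back(t);
    }
    return hashes;
}
```

For abccddaefg with m = 3, d = 10 and q = 13, this yields 6, 12, 9, 6, 12, 12, 0, 8. The two windows with hash 6 are abc (a spurious hit) and cdd (the real match, at position 4).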

 

Implementation of Rabin-Karp with C++

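#include <cstring>
#include <iostream>
using namespace std;

#define d 10

// Prints the 1-based position of every occurrence of pattern in text.
// q is the prime modulus. Characters contribute their character codes,
// not the a=1, b=2 values of the worked example.
void RabinKarp(const char text[], const char pattern[], int q) {
  int n = strlen(text);
  int m = strlen(pattern);
  int i, j;
  int p = 0;  // hash value of the pattern
  int t = 0;  // hash value of the current text window
  int h = 1;  // d^(m-1) mod q

  // Nothing to search for, or the pattern is longer than the text
  if (m == 0 || m > n)
    return;

  for (i = 0; i < m - 1; i++)
    h = (h * d) % q;

  // Hash values of the pattern and of the first text window
  for (i = 0; i < m; i++) {
    p = (d * p + pattern[i]) % q;
    t = (d * t + text[i]) % q;
  }

  // Slide the window over the text
  for (i = 0; i <= n - m; i++) {
    if (p == t) {
      // Hashes match: confirm character by character
      for (j = 0; j < m; j++) {
        if (text[i + j] != pattern[j])
          break;
      }

      if (j == m)
        cout << "Pattern is found at position: " << i + 1 << endl;
    }

    // Rolling hash: remove text[i], append text[i + m]
    if (i < n - m) {
      t = (d * (t - text[i] * h) + text[i + m]) % q;

      if (t < 0)
        t = (t + q);
    }
  }
}

int main() {
  char text[] = "ABCCDDAEFG";
  char pattern[] = "CDD";
  int q = 13;
  RabinKarp(text, pattern, q);
}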

 

Implementation of Rabin-Karp with Java

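public class RabinKarp {
  public final static int d = 10; // base used by the hash function

  // Prints the 1-based position of every occurrence of pattern in str.
  // q is the prime modulus. Characters contribute their character codes.
  static void search(String pattern, String str, int q) {
    int m = pattern.length();
    int n = str.length();
    int i, j;
    int p = 0; // hash value of the pattern
    int t = 0; // hash value of the current window of str
    int h = 1; // d^(m-1) mod q

    // Nothing to search for, or the pattern is longer than the text
    if (m == 0 || m > n)
      return;

    for (i = 0; i < m - 1; i++)
      h = (h * d) % q;
    // Calculate hash value for pattern and first window of text
    for (i = 0; i < m; i++) {
      p = (d * p + pattern.charAt(i)) % q;
      t = (d * t + str.charAt(i)) % q;
    }

    // Find the match
    for (i = 0; i <= n - m; i++) {
      if (p == t) {
        // Hashes match: confirm character by character
        for (j = 0; j < m; j++) {
          if (str.charAt(i + j) != pattern.charAt(j))
            break;
        }
        if (j == m)
          System.out.println("Pattern is found at position: " + (i + 1));
      }

      // Rolling hash: remove str.charAt(i), append str.charAt(i + m)
      if (i < n - m) {
        t = (d * (t - str.charAt(i) * h) + str.charAt(i + m)) % q;
        if (t < 0)
          t = (t + q);
      }
    }
  }
  public static void main(String[] args) {
    String str = "ABCCDDAEFG";
    String pattern = "CDD";
    int q = 13;
    search(pattern, str, q);
  }
}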

 

The complexity of the algorithm

The complexity of the Rabin-Karp algorithm in the average and best cases is quite good: O(m+n). However, in the worst case, when every window's hash value matches the pattern's and has to be verified character by character, it increases to O((n-m+1)m).

*m is the length of the pattern and n is the length of the string.
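
The worst case is easy to provoke with a text and pattern made of one repeated character, because then every window really does match. The following sketch is a hypothetical instrumented variant of the search, not the article's code: the function name countCharComparisons is made up, and it returns the number of character comparisons performed during verification instead of printing positions. For a text of ten a's and the pattern aaa it returns (n-m+1)·m = 8·3 = 24.

```cpp
#include <cstddef>
#include <string>

// Runs a Rabin-Karp search with base 10 and modulus q, and returns how
// many character comparisons the verification step performs.
// Characters contribute their character codes (plain ASCII assumed).
long countCharComparisons(const std::string& text, const std::string& pattern, int q) {
    const int d = 10;
    const std::size_t n = text.size();
    const std::size_t m = pattern.size();
    if (m == 0 || m > n) return 0;

    int h = 1;  // d^(m-1) mod q
    for (std::size_t k = 0; k + 1 < m; ++k) h = (h * d) % q;

    int p = 0;  // hash of the pattern
    int t = 0;  // hash of the current window
    for (std::size_t k = 0; k < m; ++k) {
        p = (d * p + pattern[k]) % q;
        t = (d * t + text[k]) % q;
    }

    long comparisons = 0;
    for (std::size_t i = 0; i + m <= n; ++i) {
        if (p == t) {
            // Verification: costs up to m comparisons per matching hash.
            for (std::size_t j = 0; j < m; ++j) {
                ++comparisons;
                if (text[i + j] != pattern[j]) break;
            }
        }
        if (i + m < n) {
            t = (d * (t - text[i] * h) + text[i + m]) % q;
            if (t < 0) t += q;
        }
    }
    return comparisons;
}
```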

Comparison of Rabin-Karp with other alternatives

Since you have understood both the implementation and the theory behind this algorithm, you can see why it is usually faster than a naive search that compares the pattern with every window character by character. Rabin-Karp reduces the work because it does only one hash comparison per text window and compares characters only when the hash values are found to be equal.

Applications of the Rabin-Karp Algorithm

The Rabin-Karp Algorithm has a wide range of applications in the world of word processing and problems revolving around strings. Some other applications are listed below:

  • For pattern matching
  • For finding a particular text in a bigger string

Conclusion

Data Structures and Algorithms is a field where continuous practice is necessary to gain command over the subject. Text processing keeps gaining importance as the amount of data grows rapidly, so understanding these types of algorithms will give you an edge over others. At FavTutor, we not only provide these valuable tech blogs but offer 24*7 guidance over programming as well.


About The Author
Manvi Saxena
I'm Manvi Saxena, a technophile, and technical content writer. Having a curious mind helps me gain exposure and explore the latest technologies. Being a coder myself, it feels fantastic to share my knowledge via tech blogs to help others.