Given a text string and a pattern string, implement the Rabin-Karp algorithm to find all occurrences of the pattern in the text. The Rabin-Karp algorithm uses hashing to find any substring in a text that matches the pattern.
Input/Output Examples
plaintext
Example 1:
Input:
text = "abracadabra"
pattern = "abra"
Output: [0, 7]
Explanation: The pattern abra is found at indices 0 and 7 in the text.
Example 2:
Input:
text = "hello world"
pattern = "world"
Output: [6]
Explanation: The pattern world is found at index 6 in the text.
Example 3:
Input:
text = "abcde"
pattern = "xyz"
Output: []
Explanation: The pattern xyz is not found in the text.
Approach to Implement Rabin-Karp Algorithm
- Compute Hash for Pattern and Initial Text Window:
- Calculate the hash value of the pattern and the first window of the text (of the same length as the pattern) using a rolling hash function.
- Slide Over the Text:
- For each window in the text, compare the hash value of the window with the hash value of the pattern.
- If the hash values match, verify by comparing the actual substring to the pattern to avoid false positives.
- Use Rolling Hash for Efficient Rehashing:
- Use a rolling hash to compute the hash of the next window by removing the first character of the current window and adding the next character in the text.
- Store and Return All Occurrences:
- Store the starting indices where the pattern matches the text and return the list of indices.
C++ Program to Implement Rabin-Karp Algorithm
cpp
#include <iostream>
#include <vector>
#include <string>
using namespace std;
const int d = 256; // Number of characters in the input alphabet
const int q = 101; // A prime number
// Function to find pattern occurrences using Rabin-Karp algorithm
vector<int> rabinKarp(const string& text, const string& pattern) {
int m = pattern.length();
int n = text.length();
int p = 0; // Hash value for pattern
int t = 0; // Hash value for text
int h = 1;
vector<int> result;
// Calculate h = pow(d, m-1) % q
for (int i = 0; i < m - 1; i++) {
h = (h * d) % q;
}
// Calculate the hash value of the pattern and first window of text
for (int i = 0; i < m; i++) {
p = (d * p + pattern[i]) % q;
t = (d * t + text[i]) % q;
}
// Slide the pattern over text one by one
for (int i = 0; i <= n - m; i++) {
if (p == t) {
if (text.substr(i, m) == pattern) {
result.push_back(i);
}
}
if (i < n - m) {
t = (d * (t - text[i] * h) + text[i + m]) % q;
if (t < 0) t = (t + q);
}
}
return result;
}
int main() {
string text = "abracadabra";
string pattern = "abra";
vector<int> result = rabinKarp(text, pattern);
cout << "Pattern found at indices: ";
for (int index : result) {
cout << index << " ";
}
cout << endl;
return 0;
}
Java Program to Implement Rabin-Karp Algorithm
java
import java.util.ArrayList;
import java.util.List;
public class RabinKarpAlgorithm {
private static final int d = 256; // Number of characters in the input alphabet
private static final int q = 101; // A prime number
// Function to find pattern occurrences using Rabin-Karp algorithm
public static List<Integer> rabinKarp(String text, String pattern) {
int m = pattern.length();
int n = text.length();
int p = 0; // Hash value for pattern
int t = 0; // Hash value for text
int h = 1;
List<Integer> result = new ArrayList<>();
for (int i = 0; i < m - 1; i++) {
h = (h * d) % q;
}
for (int i = 0; i < m; i++) {
p = (d * p + pattern.charAt(i)) % q;
t = (d * t + text.charAt(i)) % q;
}
for (int i = 0; i <= n - m; i++) {
if (p == t) {
if (text.substring(i, i + m).equals(pattern)) {
result.add(i);
}
}
if (i < n - m) {
t = (d * (t - text.charAt(i) * h) + text.charAt(i + m)) % q;
if (t < 0) t = (t + q);
}
}
return result;
}
public static void main(String[] args) {
String text = "hello world";
String pattern = "world";
List<Integer> result = rabinKarp(text, pattern);
System.out.println("Pattern found at indices: " + result);
}
}
Python Program to Implement Rabin-Karp Algorithm
python
# Function to find pattern occurrences using Rabin-Karp algorithm
def rabin_karp(text, pattern):
d = 256 # Number of characters in the input alphabet
q = 101 # A prime number
m = len(pattern)
n = len(text)
p = 0 # Hash value for pattern
t = 0 # Hash value for text
h = 1
result = []
for i in range(m - 1):
h = (h * d) % q
for i in range(m):
p = (d * p + ord(pattern[i])) % q
t = (d * t + ord(text[i])) % q
for i in range(n - m + 1):
if p == t:
if text[i:i + m] == pattern:
result.append(i)
if i < n - m:
t = (d * (t - ord(text[i]) * h) + ord(text[i + m])) % q
if t < 0:
t += q
return result
# Example usage
if __name__ == "__main__":
text = "abcdeabcde"
pattern = "cde"
result = rabin_karp(text, pattern)
print("Pattern found at indices:", result)
- Time Complexity:
O((n - m + 1) * m)
in the worst case due to hash collisions, wheren
is the length of the text andm
is the length of the pattern. However, average-case complexity isO(n + m)
. - Space Complexity:
O(1)
, as the algorithm only requires a few variables for hashing and does not use additional space proportional to the input size.