Rabin-Karp 算法

Rabin-Karp 算法

字符串匹配算法之一,利用哈希进行匹配比较,利用滚动哈希优化哈希函数

#滚动哈希

将滑动窗口中的字符串哈希成一个数字

hi=p=0L1si+pbL1ph_i = \sum_{p = 0}^{L - 1}{s_{i + p} b^{L - 1 - p}}

其中LL为滑动窗口大小,ss为探针字符串,bb为基数

index-10123456
sabcdefg
hi1h_{i-1}b5b^5b4b^4b3b^3b2b^2bb11
hih_{i}b5b^5b4b^4b3b^3b2b^2bb11

则有递推公式

hi=p=0L1si+pbL1p=bp=0L1si+p1bL1psi1bL+si+L1=(hi1si1bL1)b+si+L1\begin{aligned} h_i &= \sum_{p = 0}^{L - 1}{s_{i + p} b^{L - 1 - p}}\\ &= b \cdot \sum_{p = 0}^{L - 1}{s_{i + p - 1} b^{L - 1 - p}} - s_{i-1} b^{L} + s_{i+L-1}\\ &= (h_{i-1} - s_{i-1} b^{L-1}) b + s_{i+L-1} \end{aligned}

代码:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
int rk_hash(const std::string s, int start, int base, int L, int mod)
{
int hash = 0;
for (int i = 0; i < L; i++)
{
hash = (hash * base) % mod;
hash = (hash + s[start + i]) % mod;
}
return hash;
}

int powmod(int a, int n, int m) {
if (n == 1) return a;
if (n % 2 == 0) return powmod((long long)a * a % m, n / 2, m) % m;
return ((powmod((long long)a * a % m, n / 2, m) % m) * ((long long)a % m)) % m;
}

int rk_cont_hash(const std::string s, int start, int last_h, int base, int L, int mod)
{
int hash = last_h - s[start - 1] * powmod(base, L - 1, mod);
hash = (hash * base) % mod;
hash = (hash + s[start + L - 1]) % mod;
return hash;
}