Thursday, July 21, 2016

214. Shortest Palindrome

I have to follow the top rated solution. The top one uses the KMP algorithm.
Here are two good links explaining this algorithm.
https://www.youtube.com/watch?v=GTJr8OvyEVQ
https://www.youtube.com/watch?v=KG44VoDtsAA
After preprocessing the pattern, the kmp[i] is the longest length of the suffix that matches prefix at ith character in pattern string. So if we concatenate input string s and its reversed string r, and process the concatenated string by KMP algorithm, we know that the longest length that the suffix of string r matches prefix of string s is kmp[kmp.size()-1]. Let’s take “aacecaaa” as an example here.
The concatenated string will be “aacecaaa#aaacecaa” (Why put # here? I’ll explain later)
After KMP processing, the kmp vector will be [0 1 0 0 0 1 2 2 0 1 2 2 3 4 5 6 7]. So for the last character, the longest length that the suffix matches prefix is 7. Note the suffix of the string is actually the reversed input string and the prefix is the input string itself. Also, it means that there is already 7 characters from the beginning matches the 7 characters that from the end. So all we need to do is to copy the rest 1 unmatched character to the beginning of the input string in a reverse order, which is exactly copying the first 1 character in the reversed string to the beginning of the input string.

When I revisited this problem, I made mistake in
line 16: I was checking "j != 0". However, we should check "tmp[i] == tmp[j]" because even j reaches back to 0, it's still possible that "tmp[i] == tmp[j]" and thus kmp[i] should be 1 not 0.

1:  class Solution {  
2:  public:  
3:    string shortestPalindrome(string s) {  
4:      string r = s;  
5:      reverse(r.begin(), r.end());  
6:      string tmp = s + "#" + r;  
7:      vector<int> kmp(tmp.size(), 0);  
8:      int i = 1, j = 0;  
9:      for (; i < tmp.size(); i++) {  
10:        if (tmp[i] == tmp[j]) kmp[i] = ++j;  
11:        else {  
12:          j = kmp[j-1];  
13:          while (j > 0 && tmp[i] != tmp[j]) {  
14:            j = kmp[j-1];  
15:          }  
16:          if (tmp[i] == tmp[j]) {  
17:            kmp[i] = j+1;  
18:            j++;  
19:          }  
20:        }  
21:      }  
22:      return r.substr(0, s.size()-kmp[kmp.size()-1])+s;  
23:    }  
24:  };  


Now let’s look at why “#” is needed. Without “#”, what happens if the input string is “aa”? The kmp vector will be [0 1 2 3]. The kmp[kmp.size()-1] is larger than the input string size which makes return line invalid.
And here is a short version of KMP algorithm.

1:  class Solution {  
2:  public:  
3:    string shortestPalindrome(string s) {  
4:      string r = s;  
5:      reverse(r.begin(), r.end());  
6:      string tmp = s + "#" + r;  
7:      vector<int> kmp(tmp.size(), 0);  
8:      int i = 1, j = 0;  
9:      for (; i < tmp.size(); i++) {  
10:        j = kmp[i-1];  
11:        while (j > 0 && tmp[i] != tmp[j]) {  
12:          j = kmp[j-1];  
13:        }  
14:        kmp[i] = (j += tmp[i] == tmp[j]);  
15:      }  
16:      return r.substr(0, s.size()-kmp[kmp.size()-1])+s;  
17:    }  
18:  };  

No comments:

Post a Comment