Sunday, June 19, 2016

10. Regular Expression Matching

There are two ways to solve the problem.
1. Recursive way.
First of all, for regular expression, it is invalid to start with '*'. So '*' must at least start from p[1].

Case 1: p[1] == '*'
For example of input string "abc" and pattern "abca*". There is no 'a' matching, then we just need to move p's pointer to position p+2.
For example of input string "abcaa" and pattern "abca*". There is at least one 'a' matching, then we just need to move s' pointer to next position.

Case 2: p[1] != '*'
For example of input string "aaa" and pattern "ab". All we need to care is if p[0] matches s[0] and if so, move to next position in input string s and pattern p.

1:  class Solution {  
2:  public:  
3:    bool isMatch(string s, string p) {  
4:      if (p.empty()) return s.empty();  
5:      if (p[1] == '*') {  
6:        return isMatch(s, p.substr(2)) || !s.empty() && (s[0] == p[0] || '.' == p[0]) && isMatch(s.substr(1), p);  
7:      } else {  
8:        return !s.empty() && (s[0] == p[0] || '.' == p[0]) && isMatch(s.substr(1), p.substr(1));  
9:      }  
10:    }  
11:  };  

2. Dynamic Programming.
Let dp[i][j] be true if s[0..i-1] matches p[0..j-1]. Similar to recursive solution, there are two cases:

Case 1: p[j-1] != '*'
dp[i][j] = dp[i-1][j-1] && s[0...i-1] matches p[0...j-1]

Case 2: p[j-1] == '*'
dp[i][j] = dp[i][j-2] (i.e. 'a*' matches empty) || s[0...i-1] matches p[0...j-2] && dp[i-1][j] (i.e. 'a*' match 'a')

When I revisited this problem, I made mistake in,
line 18: I didn’t check if (s[i-1] == p[j-2] || p[j-2] == ‘.’). You have to make sure that otherwise, for input “abcd” and pattern “d*”, you’ll get all true for the last column in the dp matrix because dp[0][2] is true.

1:  class Solution {  
2:  public:  
3:    bool isMatch(string s, string p) {  
4:      int sn = s.size();  
5:      int pn = p.size();  
6:      vector<vector<bool>> dp(sn+1, vector<bool>(pn+1, false));  
7:      dp[0][0] = true; // s="" matches p=""  
8:      // dp[0][1] must be false because "" can't match any character  
9:      for (int j = 2; j <= pn; j++) {  
10:        // the only case "" matches p is p="a*a*" something like that.  
11:        dp[0][j] = '*' == p[j-1] && dp[0][j-2];  
12:      }  
13:      for (int i = 1; i <= sn; i++) {  
14:        for (int j = 1; j <= pn; j++) {  
15:          if (p[j-1] == '*') {  
16:            // "a*" matches empty, which requires s[0...i-1] matches p[0...j-3]  
17:            // or "a*" matches "a", which requires s[0...i-2] matches p[0...j-1]  
18:            dp[i][j] = dp[i][j-2] || ((s[i-1] == p[j-2] || p[j-2] == '.') && dp[i-1][j]);  
19:          } else {  
20:            dp[i][j] = (s[i-1] == p[j-1] || p[j-1] == '.') && dp[i-1][j-1];  
21:          }  
22:        }  
23:      }  
24:      return dp[sn][pn];  
25:    }  
26:  };  

No comments:

Post a Comment