# 10 Regular Expression Matching
Input是一個String和一個pattern,回傳string是否matching pattern
pattern規則:
'.' Matches any single character.
'*' Matches zero or more of the preceding element.
- Idea: 由於string和pattern是否符合,可以由兩者的substring推得,所以可以用DP,初始化結果如下
0 | x | a | * | b | . | c | * | |
---|---|---|---|---|---|---|---|---|
0 | T | F | F | F | F | F | F | F |
x | F | |||||||
a | F | |||||||
a | F | |||||||
b | F | |||||||
y | F | |||||||
c | F | |||||||
c | F |
填完值的結果
0 | x | a | * | b | . | c | * | |
---|---|---|---|---|---|---|---|---|
0 | T | F | F | F | F | F | F | F |
x | F | T | F | T | F | F | F | F |
a | F | F | T | T | F | F | F | F |
a | F | F | F | T | F | F | F | F |
b | F | F | F | F | T | F | F | F |
y | F | F | F | F | F | T | F | T |
c | F | F | F | F | F | F | T | T |
c | F | F | F | F | F | F | F | T |
如何解讀這個表格?
舉(2,2)那格來說,就是string x和pattern x是否match,也就是將該格row和col的substring比對的結果。
根據填表過程整理出下列會有match的情況
- s和p目前的字母一模一樣,目前是match,只要看還沒加上目前字母的時候是否match
if s.charAt(i) == p.charAt(j) dp[i+1][j+1] = dp[i][j]
p目前的字母是'.',可以match任意字母,所以跟情況1一樣,只要看還沒加上目前字母的時候是否match
p目前是'*'分成幾種情況:*前的字母出現0次
dp[i+1][j+1] = dp[i+1][j-1]
*前的字母和s的目前字母有match
dp[i+1][j+1] = dp[i][j+1]
*前是'.',和上一個情況相同
Pseudocode
denote i to s.length, j to p.length initialize: 0 row & 0 col: TFFF... conditions with a match: if p[j] == '.': a match, dp[i+1][j+1] = dp[i][j] if p[j] == s[i] : a match, dp[i+1][j+1] = dp[i][j] if p[j] == '*': 1. zero occurence in s: dp[i+1][j+1] = dp[i+1][j-1] 2. more than 1 occurence in s: if s[i]==p[j-1] || p[j-1] == '.', dp[i+1][j+1] = dp[i+1][j] || dp[i][j+1] else no match dp[i][j] = false; return dp[s.length][p.length]
Code
public class Solution { public boolean isMatch(String s, String p) { boolean[][] dp = new boolean[s.length()+1][p.length()+1]; //initialize, a*, a*b* dp[0][0] = true; for (int i = 0; i < p.length(); i++) { if (p.charAt(i) == '*' ) { dp[0][i+1] = dp[0][i-1]; } } //go through dp matrix for(int i=0; i< s.length(); i++){ for(int j=0; j< p.length(); j++){ if(p.charAt(j) == '.' || p.charAt(j) == s.charAt(i)) dp[i+1][j+1] = dp[i][j]; else if(p.charAt(j) == '*'){ //prev match dp[i+1][j+1] = dp[i+1][j-1]; if(s.charAt(i) == p.charAt(j-1) || p.charAt(j-1) == '.'){ dp[i+1][j+1] = dp[i+1][j+1] || dp[i][j+1]; } }else dp[i+1][j+1] = false; } } return dp[s.length()][p.length()]; } }