刷題筆記 - LeetCode 139. Word Break

# 刷題筆記 - LeetCode 139. Word Break https://leetcode.com/problems/word-break ## Thoughts 給一個 string `s`, 看他這個 s 能不能被 `wordDict` 給的字組合而成 Example 1: >Input: s = "leetcode", wordDict = ["leet","code"] Output: true Explanation: Return true because "leetcode" can be segmented as "leet code". Example 2: >Input: s = "applepenapple", wordDict = ["apple","pen"] Output: true Explanation: Return true because "applepenapple" can be segmented as "apple pen apple". Note that you are allowed to reuse a dictionary word. Example 3: >Input: s = "catsandog", wordDict = ["cats","dog","sand","and","cat"] Output: false \ 從 Example 3 可看出，`s` 要可以被 wordDict 中的字拆分如果更改一下 Test case >Input: s = "catsanddog", wordDict = ["cats","dog","sand","and","cat"] Output: true `s` 可以分成 `cats`, `and`, `dog`，這個組合全都有出現在 `wordDict` 中，因此回傳是 `true` ## Solution 1 - DP, bottom-up ### Logic 拆分問題 - > `s[0:i]` 是不是 wordDict 中出現的字？ > 是的話，`s[i:]` 能不能繼續拆？假設使用 DP，先建立 DP array ``` dp = [False] * (len(s) + 1) ``` 利用每個 `dp[i]` 來記錄 `s[0:i]` 是不是存在 wordDict 中 \ 為什麼 dp array 長度是設定成 `len(s) + 1`? 因為如果 `len(s) == 0` 那一定可以被拆解 (`dp[0]=0`) \ 假設 `s` 長度為 8，整個遍歷的起始會是 ``` 0 1 2 3 4 5 6 7 8 ↑ i ``` 從 1 開始，檢查`s[0:1]` 是否在字典中遍歷 `s`，假設走到的位置 `i`，可以被正確拆分的話 => 代表 `i` 之前的 substring，也是可以被正確拆分那再去看 substring，就是一個重複的問題 ``` 0 1 2 3 4 5 6 7 8 ↑ ↑ j i ``` 如果 `i` 走到 7 那就必須檢查 `s[0:7]` 是否能被正確拆分 **代表必須再一個迴圈來檢查 substring** (上圖 `j`) 而我們使用 `dp[i]` 來儲存 `s[0:i]` 是否存在字典中那 substring 就可以被拆成：`s[0:j]` & `s[j:i]` 所以假設 `s[0:j]` 在字典中 (也就是 `dp[j]==True`) ，那需要確認的就是 `s[j:i]` 是不是在字典中，是的話那 `dp[i]` 就也應該設成 True <br> ### Code ```python!= class Solution: def wordBreak(self, s: str, wordDict: List[str]) -> bool: wordSet = set(wordDict) dp = [False] * (len(s) + 1) dp[0] = True for i in range(1,len(s) + 1): for j in range(0, i): if dp[j] and s[j:i] in wordSet: dp[i] = True break return dp[len(s)] ``` line 7: 從 1 開始遍歷，到 len(s) (dp array 長度是 len(s)+1) line 8: 第二層迴圈去看 `s[:i]` 的 substring，確認是不是能被拆分。因為先前已經使用 `dp[i]` 來儲存該位置是否確定可被字典中的字所組合，所以可以用 `dp[j]` 來判斷，而剩下的字串就是 `s[j:i]` line 11: 如果確定至少有一個組合是可被字典拆分，就不用再繼續第二層迴圈，直接 break 最後回傳 `dp[len(s)]` 就是答案 ### Time & Space Complexity - Time: O(n^2) - n: length of the string`s` - Space: O(n + k) - n: 建了一個 `len(s)+1` 的 DP array - k: 建立 `wordSet`，k 表示 wordDict 字的數量 ### Optimization 第二層迴圈 `for j in range(0:i):` 可以不用跑完所有的 `0 ~ i` 假設字典裡面最長的單字長度為 `m`，那就不用檢查比 `m` 還長的 `s[j:i]`，因為一定不存在於字典中 ## Solution 2 - Recursion (working)