# [AIdrifter CS 浮生筆錄](https://hackmd.io/s/rypeUnYSb) <br> 花花醬 Alog/DS 特輯 # SP1. Disjoint-set/Union find Forest - Disjoint-set/Union-find Forest Find(x): find the root/cluster-id of x Union(x, y): merge two clusters Check whether two elements are in the same set or not in O(1)*. Find: O(ɑ(n))* ≈ O(1) Union: O(ɑ(n))* ≈ O(1) Space: O(n) *: amortized ɑ(.): inverse Ackermann function Without optimization: Find: O(n) - Two key optimizations: 1) Path compression: make tree flat - 把所有partents node 都接到root上面 ![](https://i.imgur.com/1S99my7.png) 2) Union by rank: merge low rank tree to high rank one ![](https://i.imgur.com/3MPf1Sp.png) ```C++ // Author: Huahua class UnionFindSet { public: UnionFindSet(int n) { ranks_ = vector<int>(n + 1, 0); parents_ = vector<int>(n + 1, 0); for (int i = 0; i < parents_.size(); ++i) parents_[i] = i; } // Merge sets that contains u and v. // Return true if merged, false if u and v are already in one set. bool Union(int u, int v) { int pu = Find(u); int pv = Find(v); if (pu == pv) return false; // Meger low rank tree into high rank tree if (ranks_[pv] < ranks_[pu]) parents_[pv] = pu; else if (ranks_[pu] < ranks_[pv]) parents_[pu] = pv; else { parents_[pv] = pu; ranks_[pu] += 1; } return true; } // Get the root of u. int Find(int u) { // Compress the path during traversal if (u != parents_[u]) parents_[u] = Find(parents_[u]); return parents_[u]; } private: vector<int> parents_; vector<int> ranks_; }; ``` Reference: 花花酱 LeetCode 399. Evaluate Division 花花酱 LeetCode 547. Friend Circles ```C++ ``` 花花酱 LeetCode 737. Sentence Similarity II ```C++ ``` 花花酱 LeetCode 684. Redundant Connection ```C++ class Solution { public: vector<int> findRedundantConnection(vector<vector<int>>& edges) { UnionFindSet S(edges,size()) for(const auto &edge : edges) if(!S.Unioin(edge[0], edge[1])) return edge; // Don't forget it return {}; } }; ``` 花花酱 LeetCode 685. Redundant Connection II 花花酱 LeetCode 839. Similar String Groups # SP2. Input Size and Time Complexity ![](https://i.imgur.com/9VHO8hT.png) < 10: O(n!) permutation < 15: O(2^n) combination < 50: O(n^4) DP < 200: O(n^3) DP, all pairs shortest path < 1,000: O(n^2) DP, all pairs, dense graph < 1,000,000: O(nlogn), sorting-based (greedy), heap, divide & conquer < 1,000,000: O(n), DP, graph traversal / topological sorting (V+E), tree traversal < INT_MAX: O(sqrt(n)), prime, square sum < INT_MAX: O(logn), binary search < INT_MAX: O(1) Math # SP3. Fenwick Tree / Binary Indexed Tree - Array [2, 5, -1 ,3 ,6 ,7 ] , 求array中連續元素的總和 - naive implement - query: O(n) 很爛 就是直接算 不聰明 - DP - 第一次O(n) 建sum table - 之後query都是 O(1) : Sum(2,5) = Sum(5,0) - Sum(1,0) - 不能簡單update - Fenwick tree was proposed to solve the prefix sum problem - The idea is to store `partial sum in each node` and get total sum by travering the tree form leaf to root. The tree has height log(n) - Query: O(log(n)) - Update: O(log(n)) - ![](https://i.imgur.com/Qn4J5R8.png) ```C++ class FenwickTree { public: FenwickTree(int n): sums_(n + 1, 0) {} // update A[i] = delta void update(int i, int delta) { while (i < sums_.size()) { sums_[i] += delta; i += lowbit(i); } } // sum of elements from 0 to n-1 int query(int i) const { int sum = 0; while (i > 0) { sum += sums_[i]; i -= lowbit(i); } return sum; } private: // get low bits static inline int lowbit(int x) { return x & (-x); } vector<int> sums_; }; ``` 307. Range Sum Query – Mutable 315. Count of Smaller Numbers After Self # SP4. Time/Space Complexity of Recursive Algorithms - Master Theorem 為CS基本 ## Analysis algorithm - quick sort - merge sort - Time Complexity - Best case o(nlgn) - Space Complexity - O(logn + n) : stack depth 看遞迴深度 + n(merge) - 要copy element 不一樣比較快 - Inorder Traversal - 請和quick sort比較 為何是o(n) 而不是 o(log(n)) - combination - binary string 不是0就是1 = 2^n -1 種組合 - permutation - Summary - Space Complexity : 遞迴深度 ## DP with memorization - 計劃遞歸(把結果存在memory) - Time: - (number of subproblems) * (exclusive time to slove each subproblem) - Space: - (max depth of call stack) * (space used by each subproblem) - Fibonaci - Cherry Pickup - Burst Balloons Reference: http://zxi.mytechroad.com/blog/sp/time-space-complexity-of-recursion-functions-sp4/ # SP5. Binary Search Template: [l, r) f():答案: O() g():解在左邊還是右邊 :O(1) or O(n) or O(nlogn) f(m) 不一定存在,如果沒有optimal 的話。 ```python """ Returns the smallest number m such that g(m) is true. """ def binary_search(l, r): while l < r: m = l + (r - l) // 2 if f(m): return m # if m is the answer if g(m): r = m # new range [l, m) else l = m + 1 # new range [m+1, r) return l # or not found ``` ![](https://i.imgur.com/50gzQE1.png) - lower_bound(x) A[i] >= x - upper_bound(x) A[i] > x ```C++ #include <iostream> #include <vector> using namespace std; int upper_bound(const vector<int>& A, int val, int l, int r) { while (l < r) { int m = l + (r - l) / 2; if (A[m] > val) r = m; else l = m + 1; } return l; } int lower_bound(const vector<int>& A, int val, int l, int r) { while (l < r) { int m = l + (r - l) / 2; if (A[m] >= val) r = m; else l = m + 1; } return l; } int main() { vector<int> A{1, 2, 2, 2, 4, 4, 5}; cout << lower_bound(A, 0, 0, A.size()) << endl; // 0 cout << lower_bound(A, 2, 0, A.size()) << endl; // 1 cout << lower_bound(A, 3, 0, A.size()) << endl; // 4 cout << upper_bound(A, 2, 0, A.size()) << endl; // 4 cout << upper_bound(A, 4, 0, A.size()) << endl; // 6 cout << upper_bound(A, 5, 0, A.size()) << endl; // 7 } ``` ![](https://i.imgur.com/0vfaPrP.png) ![](https://i.imgur.com/wIDKShA.png) ![](https://i.imgur.com/9ewK93q.png) ## Practices 69. Sqrt(x) ```C++ ``` 704. Binary Search ```C++ ``` 378. Kth Smallest Element in a Sorted Matrix ```C++ int kthSmallest(vector<vector<int>>& A, int k) { priority_queue<int> q; for(auto vec:A){ for(auto c:vec){ q.push(c); if(q.size()>k) q.pop(); } } return q.top(); } ``` :::info #5楼 2018-02-06 21:33 DennisHCR 我不太明白后面两种解为什么返回left?left不一定是矩阵里的数字吧?还是我漏掉了什么? 支持(0) 反对(0) #6楼 [楼主] 2018-03-29 07:52 Grandyang @ DennisHCR 这是个好问题,left和right最终会相等,并且会变成数组中第k小的数字。举个例子来说吧,比如数组为: [1 2 12 100] k = 3 那么刚开始left = 1, right = 100, mid = 50, 遍历完 cnt = 3,此时right更新为50 此时left = 1, right = 50, mid = 25, 遍历完之后 cnt = 3, 此时right更新为25 此时left = 1, right = 25, mid = 13, 遍历完之后 cnt = 3, 此时right更新为13 此时left = 1, right = 13, mid = 7, 遍历完之后 cnt = 2, 此时left更新为8 此时left = 8, right = 13, mid = 10, 遍历完之后 cnt = 2, 此时left更新为11 此时left = 11, right = 12, mid = 11, 遍历完之后 cnt = 2, 此时left更新为12 循环结束,left和right均为12,任意返回一个即可。 ::: ```C++ int l = A[0][0], r = A[A.size()-1][A.size()-1]; while(l < r){ unsigned int m = l + (r-l)/2; unsigned int total = 0; for(size_t i= 0; i<A.size(); i++){ int c = upper_bound(A[i].begin(), A[i].end(), m) - A[i].begin(); total += c; //printf("total: %u c:%u \n",total,c); } //printf("l:%u r:%u m:%u total:%u \n",l ,r ,m ,total); if(total >= k) r = m; else l = m + 1; } return l; ``` 875. Koko Eating Bananas ```C++ int minEatingSpeed(vector<int>& piles, int H) { int l = 1; int r = 0; for(auto c: piles) r = max(c,r); // [l,r) r++; while(l<r){ int m = l + (r-l)/2; int h = 0; // h: hours // m: eat banana each hours for(auto p:piles) h += (p + m -1)/m; //printf("m:%u h:%u \n",m,h); if(h <= H) r = m; else l = m + 1; } return l; } ``` # SP6. 200期總結與展望 # SP7. LeetCode Amortized Analysis ![](https://i.imgur.com/lc24ed5.png) ![](https://i.imgur.com/xvELqio.png) - 聚合分析(aggregate method) : 硬功夫累計法,決定 n 個操作序列的耗費上界 T(n),然後計算平均耗費為 T(n) / n。 - 記帳方法(accounting method) : 銀行會計(記帳)法,確定每個操作的耗費,結合它的直接執行時間及它在對運行時中未來操作的影響。通常來說,許多短操作增量累加成"債",而通過減少長操作的次數來「償還」。 - 勢能方法(potential method) : 類似記帳方法,但通過預先儲蓄「勢能」而在需要的時候釋放。 Reference: [Algorithm - 平攤分析 Amortized Analysis](https://mropengate.blogspot.com/2015/06/algorithm-amortized-analysis.html) # SP8. Dynamic Programming I 使用條件 - Optimal substructure - overloapping subproblem - 2^n(exponetial) > n^k(polynomial) - 如果不考慮overlap就是divde and conquer - No after affect - the optimal solution of a subproblem will not change when it was used to solve a bigger problem optimally. ```shell 大->小 (Top Down) 大->小(Bottom up) Recursive -> recursion & Memoziation -> Dynamic Programming 不需要考慮(找mem) 要確定會不會重複算 ``` - DP精神:要考慮如何處理overlaping 沒有的話就是單純的divide and conquer(有overlap但不鳥他) - DP = recursion + re-use - Reference: [Difference between Divide and Conquer Algo and Dynamic Programming](https://stackoverflow.com/questions/13538459/difference-between-divide-and-conquer-algo-and-dynamic-programming) ## Fibonacci sequence ## LCS : Longest common subsequence ## Knapsack problem ## Floyd warshall ## Bellman Ford ## Leetcode - LC62 : unique path - LC926 : Flip String to Monotone - LC845 : Longest Mountain in Array # SP9. Dynamic Programming II ![](https://i.imgur.com/xy9agUd.png) - 花花整理的題庫分類 - 一維共四種 - (1) 最多也最複雜 ![](https://i.imgur.com/dyPrJo4.png) - (2) 1 + 2 + 3 + ... n (j) ![](https://i.imgur.com/qWKZo9g.png) - (3) 輸入 O(m) + O(n), two array/strings ![](https://i.imgur.com/h8To56J.png) - (4) 其實是(3)的變形 利用K作為break points ![](https://i.imgur.com/dNCdmpq.png) - =維共兩種 - Input O(mn) ![](https://i.imgur.com/TR3k1rv.png) - Input O(mn) 且需要K個steps ![](https://i.imgur.com/7pGLd8q.png) # SP10. 0/1 Knaspack Problem - NP-Complete - 無法在Polynomial time找到答案 - In computational complexity theory, an NP-complete decision problem is one belonging to both the NP and the NP-hard complexity classes. In this context, NP stands for "nondeterministic polynomial time". The set of NP-complete problems is often denoted by NP-C or NPC. Reference: [NP-Complete 輕鬆談演算法的複雜度分界](https://www.ycc.idv.tw/algorithm-complexity-theory.html) - Why Can Use DP - O(nw) , 總共n個item 且Weight 最重為W - Weight 範圍要非常小 - n > 20, w<10^6 ## DFS解01背包問題 ## 0/1 knaspack prolem C[i,k] : 取道item i, 且負重還有k的時候 , 可以拿的最大Value ```C++ static int knapSackIterative(int W, int wt[], int val[], int n) { int[][] dp = new int[n+1][W+1]; for(int i = 0; i<=n i++) // iteim for(int j = 0; j<= W j++) // weight { if(i==0 || j ==0) dp[i][j] = 0; if(wt[i] > j) dp[i][j] = dp[i-1][j] else // [FIXME] Check the condition dp[i][j] = Math.max(dp[i-1][j], dp[i-1][j-wt[i]] + val[i] ); } } public static void main(String args[]) { int val[] = new int[] { 60, 100, 120 }; int wt[] = new int[] { 10, 20, 30 }; int W = 51; int n = val.length; System.out.println(knapSackIterative(W, wt, val, n)); } ``` ## Reduce Space Complexity - Use a tmp array - iterator j in reverse order # SP11. 0/1 Knaspack Problem # SP12. Binary Tree - Inorder traversal :排序好的BST - Balanced Tree - For all sub tree, |TL - TR| <=1 - 很多Tree設計都是為了BST的平衡 - AVL Tree, Red-Black Tree ... - Preorder traversal vs heap stack relationship - tree node 在heap上面是不連續的 - [FIXME] 看懂preorder traversal 過程中 stack 與 heap的關係圖 - 避免有忘記free的問題 自己寫tree node的destructor - Hot to create BST(unbalanced) ## Template - 如何用左右子樹回傳的答案 去建構最後的答案 ![](https://i.imgur.com/YDA9Ai2.png) - one tree or two tree ![](https://i.imgur.com/Khe0UxE.png) ![](https://i.imgur.com/lvJ9hKH.png) ## find the max val of a tree ```C++ // 左子樹最大 右子樹最大 在和自己比 int maxVal(root) { if(!root) return INT_MIN; int max_left = maxVal(root->left); int max_right = maxVal(root->right) return Math.max(max_left,max_right,root->data); } ``` # SP13. Recursion unrolling and performance measurement ## Combination and Permutation ```shell Input: nums = [1,2,3] Output:[ [3], [1], [2], [1,2,3], [1,3], [2,3], [1,2], []] ``` ![](https://i.imgur.com/qY4TPyy.png) ```C++ // Author: Huahua, running time: 4 ms class Solution { public: vector<vector<int>> subsets(vector<int>& nums) { vector<vector<int>> ans; vector<int> cur; for (int i = 0; i <= nums.size(); ++i) dfs(nums, i, 0, cur, ans); return ans; } private: void dfs(const vector<int>& nums, int n, int s, vector<int>& cur, vector<vector<int>>& ans) { if (n == cur.size()) { ans.push_back(cur); return; } for (int i = s; i < nums.size(); ++i) { cur.push_back(nums[i]); dfs(nums, n, i + 1, cur, ans); cur.pop_back(); } } }; ``` ## Recursion unrolling ### Combination ![](https://i.imgur.com/RCNuorN.png) ### Permutation ![](https://i.imgur.com/7DU8oRc.png) ## Performance measurement - Backtracking - same set of inputs - absoulte running time - Diffetent version of programs running on the same hardware - Same version of program running on different hardwares - Profiling - Relative running time - identify the hotspots(usually a few functions) - one of my program spent ~ 50% on sqrt - Optimize those hostpots an do benchmarking again # SP14. Segment Tree - 也是拿來求array中連續的元素總和 和fenwick tree運用一樣 但是偏重recursive方式去建tree - buildTree O(n) - update element's value O(logn) - we don't need to add new node in tree, kust update the original node's value. - rangeQuery O(logn + k) - K is number of reported segment ## Build Tree ![](https://i.imgur.com/6VTA5Hm.png) ## Update Node value - 不要加新的點 ![](https://i.imgur.com/WvCTvN5.png) ## Range Query ![](https://i.imgur.com/IN6MJqS.png) ## 307. Range Sum Query – Mutable ```C++ // Author: Huahua, running time: 24 ms class SegmentTreeNode { public: SegmentTreeNode(int start, int end, int sum, SegmentTreeNode* left = nullptr, SegmentTreeNode* right = nullptr): start(start), end(end), sum(sum), left(left), right(right){} SegmentTreeNode(const SegmentTreeNode&) = delete; SegmentTreeNode& operator=(const SegmentTreeNode&) = delete; ~SegmentTreeNode() { delete left; delete right; left = right = nullptr; } int start; int end; int sum; SegmentTreeNode* left; SegmentTreeNode* right; }; class NumArray { public: NumArray(vector<int> nums) { nums_.swap(nums); if (!nums_.empty()) root_.reset(buildTree(0, nums_.size() - 1)); } void update(int i, int val) { updateTree(root_.get(), i, val); } int sumRange(int i, int j) { return sumRange(root_.get(), i, j); } private: vector<int> nums_; std::unique_ptr<SegmentTreeNode> root_; SegmentTreeNode* buildTree(int start, int end) { if (start == end) { return new SegmentTreeNode(start, end, nums_[start]); } int mid = start + (end - start) / 2; auto left = buildTree(start, mid); auto right = buildTree(mid + 1, end); auto node = new SegmentTreeNode(start, end, left->sum + right->sum, left, right); return node; } void updateTree(SegmentTreeNode* root, int i, int val) { if (root->start == i && root->end == i) { root->sum = val; return; } int mid = root->start + (root->end - root->start) / 2; if (i <= mid) { updateTree(root->left, i, val); } else { updateTree(root->right, i, val); } root->sum = root->left->sum + root->right->sum; } int sumRange(SegmentTreeNode* root, int i, int j) { if (i == root->start && j == root->end) { return root->sum; } int mid = root->start + (root->end - root->start) / 2; if (j <= mid) { return sumRange(root->left, i, j); } else if (i > mid) { return sumRange(root->right, i, j); } else { return sumRange(root->left, i, mid) + sumRange(root->right, mid + 1, j); } } }; ``` # SP15 SP16. 如何有效率的刷題 - 總共要刷多少? - 分類每題10~20 (DP越多越好) - 共200~300 - total 100 hr - 如何刷題? - 每週分類刷: - week1: Tree/Linked List - week2: Search - week3: Dynamic Programming - run 1 : 5 min沒方向就看答案 - run 2 : 寫一題不可過60min - run 3 : 15~20寫不出來 就看答案 - RTFS, RTFS, RTFS(Read The Fucking Source Code) - 前75%的 Top的有可能會優化太多 會看不懂 - 學習看不同語言 - Algo - DS - Template - Optimal - 培養的能力 - 這題要用哪個演算法? - 給你region 可以推算出Time Complexity嘛? - Coding Style - naming - Tab - 換行 - 括號 - 完整的手寫實現,不要copy代碼,增強肌肉記憶 # SP17. Binary Search II g(x) is a function taht exists m such that g(x) > 0 (True) if x >=m g(x) <=0 else ![](https://i.imgur.com/BCnbdTG.png) ```python def binary_search(l, r) while l < r m = l + (r-l) if(g(m)) r = m # new range :[l,m) else l = m + 1; # new range :[m+1, r) return l; # or not found ``` r應該設n+1, 但是會有oveflow問題,(n = 2^32 - 1) 所以設r = n,如果找不到最後在return n 即可。 ![](https://i.imgur.com/Cxb7eUj.png) ![](https://i.imgur.com/gG47hC5.png) # SP18. Minimuum spanning tree