LIS 最長遞增子序列解法探究

# Longest Increasing Subsequence 最長遞增子序列在此文章中，了方便說明，會將 Longest Increasing Subsequence 簡稱為 LIS ## 說明最長遞增子序列就是在一個數列中，找出最長的子序列，且子序列的數字依序遞增，且不一定要在原先的數列中連續。 ## 舉例假設現在有一數列： **1 , 2 , 5 , 3 , 4 , 10 , 9** 那這個數列的 LIS 就會是： **1 , 2 , 3 , 4 , 10** **1 , 2 , 3 , 4 , 9** 沒錯，LIS 可以有很多個因此大部分的問題都主要是求 LIS 的長度 ## 解法目前 LIS 長度有兩種方法可以求得，首先來講比較好理解的，動態規劃解法 ### 動態規劃動態規劃最重要的部分就是定義和寫出轉移式，所以先來定義轉移式首先先定義一個一維陣列 $dp[i]$ 代表以第 $i$ 個位置結尾的 LIS 長度，$num[i]$ 表示原數列的第 $i$ 個位置而 $dp[i]$ 會從前面的 $dp[j]$ $(j<i)$ 中取最大的 LIS 長度來轉移，而如果此時 $num[j]<num[i]$，代表可以將 $num[i]$ 這個位置加入 LIS 當中(LIS+1)，否則 $dp[i]$ 就保持原本的 LIS 長度，因此可以推出轉移式： **$dp[i]=max(dp[i],dp[j]+1)$** 別忘了要把 $dp$ 每一個值都初始化為 $1$，因為自己也算是 LIS 的一個數，否則轉移式無法正確運行 ### 範例題目給一個長度為 $n$ $(n<=10^3)$ 的數列 $num$，求 LIS 的長度？程式碼： ```cpp= #include<bits/stdc++.h> using namespace std; const int N=1000; int num[N],dp[N]; int main(){ int n,ans=1; cin>>n; for(int i=0;i<n;i++) cin>>num[i]; fill(dp,dp+n,1);//初始化為1 for(int i=1;i<n;i++){ for(int j=0;j<i;j++){ if(num[j]<=num[i]){ dp[i]=max(dp[i],dp[j]+1);//轉移式 ans=max(ans,dp[i]);//用一個變數存取最大的LIS } } } cout<<ans<<'\n'; } ``` ### 探討使用動態規劃的解法會需要開到兩層迴圈，這樣複雜度是 $O(n^2)$，難道這就是極限了嗎？不，還有一種更快的方法可以把複雜度壓縮至 $O(nlogn)$，就讓我們繼續看下去。 ### 使用二分搜的解法這個解法的思路在於，要維護一個 LIS 的數組 $LIS$，每當遇到一個數字 $i$，就去與 $LIS$ 的最後數字相比，如果比較大就將 $i$ 插入到 $LIS$ 最後；如果 $i$ 沒有比最後的數還大，那就使用二分搜找尋 $LIS$ 中大於 $i$ 的最小數字，並將那個數替代為 $i$。這邊拿維基百科的 GIF 來把剛剛的解法演示一下： ![](https://hackmd.io/_uploads/rJVse4Yyp.gif) 這樣一來，只會將每個數遍歷一次，並且每次最差只會二分搜一次，因此時間複雜度僅為 $O(nlogn)$ ### 實作程式碼程式碼： ```cpp= int main(){//LIS ios::sync_with_stdio(0),cin.tie(0); int n; cin>>n; vector<int> num(n),LIS; for(auto &i:num) cin>>i; for(const auto &i:num){ if(LIS.empty()||LIS.back()<=i) LIS.push_back(i);//將i插入最後 else *lower_bound(LIS.begin(),LIS.end(),i)=i;//用二分搜尋找並替代 } cout<<LIS.size(); } ``` **注意！這邊維護的數組 $LIS$ 裡面的數字並不是實際的 LIS，因此無法直接使用 $LIS$ 來當作其中一組 LIS 來使用！** ### 相關的練習題其實 LIS 的解法都不複雜，因此每個題目都是難在你看不出那是 LIS 的題型 [CSES Towers](https://cses.fi/problemset/task/1073) ||LIS裸題，不知道為什麼是LIS的可以參考這個 [LIS GitHub](https://github.com/labuladong/fucking-algorithm/blob/master/%E5%8A%A8%E6%80%81%E8%A7%84%E5%88%92%E7%B3%BB%E5%88%97/%E5%8A%A8%E6%80%81%E8%A7%84%E5%88%92%E8%AE%BE%E8%AE%A1%EF%BC%9A%E6%9C%80%E9%95%BF%E9%80%92%E5%A2%9E%E5%AD%90%E5%BA%8F%E5%88%97.md)|| [ZJ d052. 11456 - Trainsorting](https://zerojudge.tw/ShowProblem?problemid=d052) ||LIS 變形題|| [ZJ f608. 4. 飛黃騰達](https://zerojudge.tw/ShowProblem?problemid=f608) ||LIS 變形題，因為題目不是嚴格遞增，相等也可以，所以要換成使用 upper_bound()|| ### 延伸如果題目要求的不只有 LIS 的長度，還要求一組的 LIS 怎麼辦，剛剛的兩個解法都只能用來求 LIS 的長度，無法用來求實際的一組 LIS，比如像這題： [ZJ d242. 00481 - What Goes Up](https://zerojudge.tw/ShowProblem?problemid=d242) 其實不難，可以使用一個陣列 $dp$ 用來儲存每次數字最後到達 $LIS$ 的位置，並在變歷完整個數列後，倒著將 $dp$ 逆推回去，即可得出一組最後出現的 LIS 程式碼： ```cpp= //d242. 00481 - What Goes Up #include<bits/stdc++.h> using namespace std; #define all(x) x.begin(),x.end() const int N=5e5+5; int dp[N]={}; int main(){ ios::sync_with_stdio(0),cin.tie(0); int n; vector<int> num,lis,ans; while(cin>>n){ num.push_back(n); } for(int i=0;i<num.size();i++){ if(lis.empty()||lis.back()<num[i]){ lis.push_back(num[i]); dp[i]=lis.size(); } else{ auto iter=lower_bound(all(lis),num[i]); dp[i]=iter-lis.begin()+1; *iter=num[i]; } } cout<<lis.size()<<"\n-\n"; int length=lis.size(); for(int i=num.size()-1;i>=0;i--){ if(dp[i]==length){ ans.push_back(num[i]); length--; } } for(int i=ans.size()-1;i>=0;i--) cout<<ans[i]<<'\n'; } ``` ## 應用在 Bazaar(一個版本控制系統)上，使用著 Patience Diff 的比較演算法，在其中用來求得 LCS(最長公共子序列)的 Patience sorting 就是 LIS Patience sorting 的過程：建立一個堆陣列，依序取出數字，並加入比該數字小的堆中，如果沒有符合的堆，則新增一個堆並加入該數。等數字全部取完後，將各個堆串接就可以得出排序完的數列。 ### 舉例(使用 wiki 的例子) 有一個欲排序的數列 3, 5, 7, 1, 4 1. 取出數字 3, 由於目前沒有任何堆所以產生一號堆並把 3 放入堆一內容: 3 2. 取出數字 5, 5 比一號堆的第一個數字 3 大, 故產生二號堆並把 5 放入堆一內容: 3 堆二內容: 5 3. 取出數字 7, 7 比一號堆和二號堆的第一個數字大, 故產生三號堆並把 7 放入堆一內容: 3 堆二內容: 5 堆三內容: 7 4. 取出數字 1, 所有堆的第一個數字都比 1 大, 故放入一號堆堆一內容: 3, 1 堆二內容: 5 堆三內容: 7 5. 取出數字 4, 只有一號堆的第一個數字比 4 小, 所以將 4 放入二號堆堆一內容: 3, 1 堆二內容: 5, 4 堆三內容: 7 這個過程與前面提的題目[CSES Towers](https://cses.fi/problemset/task/1073) 解題過程幾乎一模一樣，LIS 就是堆的數量，像剛剛的例子： 3, 5, 7, 1, 4 這個數列的 LIS 就是 3，因為他最後使用了三個堆程式碼範例： ```python= from bisect import bisect def LCS(a, b): #這個演算法會避免受到重覆文本的干擾，所以會去除重複的文本 #找字串 a 的單詞在文本中出現的位置 index = {} for i in range(len(a)): line = a[i] if line in index: index[line] = None else: index[line]= i #找到字串 a 和 b 都有出現的單詞 btoa = [None] * len(b) index2 = {} for pos, line in enumerate(b): next = index.get(line) if next is not None: if line in index2: btoa[index2[line]] = None del index[line] else: index2[line] = pos btoa[pos] = next #耐心排序的部分(LIS) backpointers = [None] * len(b) stacks = [] lasts = [] k = 0 for bpos, apos in enumerate(btoa): if apos is None: continue if stacks and stacks[-1] < apos: k = len(stacks) elif stacks and stacks[k] < apos and (k == len(stacks) - 1 or stacks[k+1] > apos): k += 1 else: k = bisect(stacks, apos) if k > 0: backpointers[bpos] = lasts[k-1] if k < len(stacks): stacks[k] = apos lasts[k] = bpos else: stacks.append(apos) lasts.append(bpos) if len(lasts) == 0: return [] result = [] k = lasts[-1] while k is not None: result.append((btoa[k], k)) k = backpointers[k] result.reverse() return result text1 = "Hello Hello , world ! This is an example text for testing." text2 = "Hello ! This is a world example text for testing." result = LCS(text1.split(), text2.split()) print(result) for i in result: print(text1.split()[i[0]],end=" ") ```

Read more

競程比賽用模板

課程規劃

資訊網路