Interval Partitioning

上一次提到的 Interval Scheduling，是因為我們只有一個 Resource 可以處裡工作
那如果今天我要問的是

我最少要多少個 Resource ，才可以把工作做完？

就是 Interval Partitioning 問題

像下面這張工作表，最少就是需要三個 Resource 才能把工作做完

Image Not Showing Possible Reasons

The image file may be corrupted
The server hosting the image is unavailable
The image path is incorrect
The image format is not supported

Learn More →

不過跟 Scheduling 一樣，他只能告訴你最少需要幾個 Resource ，種類不是他的目的

當然，他一定會吐出一種情況

Image Not Showing Possible Reasons

The image file may be corrupted
The server hosting the image is unavailable
The image path is incorrect
The image format is not supported

Learn More →

在講方法前，要先理解一個概念，深度 Depth

Depth / d

在一個時間段 (Interval) 內，會有重疊的工作，記下這些重疊的工作數量
而整個工作表中，最大的重疊工作數量，就是 Depth
像上面的例子，深度就是 3 ，下面深度都用代號 d 替代

而 d 代表了我至少所需的 Resource

至少所需的 Resource / Lower Bound

如上面所說，就是一個工作表的深度 d

證明

假如在第 i 區間有達到 d，代表我至少需要 Resource
此時遇到第 k 個區間：
假如第 k 個區間發現不只需要 d 個 Resource，例如 d + 1 ，代表該區間同時有 d + 1 個工作重疊，與 d 的定義矛盾

如何找到 Depth

老師說自己想想看，時間複雜度是 O(nlogn)

將原本的工作堆，依照開始時間和結束時間，排序成兩個陣列

應該將這兩個陣列綁成 List，之後才可以做時間點的判定
有空來寫寫，挖坑

此時制定兩個變數，d_max 用來記錄最大值，d 用來記錄目前的值
遍歷陣列，如果遇到「開始時間」，就讓 d++；遇到「結束時間」，就讓 d–
每次 d 進行改變後，就跟 d_max 判斷，比較大的話就存進 d_max
而且要注意一點，對於同個時間點，「結束時間」要排在「開始時間」前面
否則 d_max 會因為上面的原則而過大

Image Not Showing Possible Reasons

The image file may be corrupted
The server hosting the image is unavailable
The image path is incorrect
The image format is not supported

Learn More →

Image Not Showing Possible Reasons

The image file may be corrupted
The server hosting the image is unavailable
The image path is incorrect
The image format is not supported

Learn More →

Pseudo Code

目的是要幫每個 Interval 上標籤，標籤數量從{1,2…d}

看到後面才明白，原來Interval 指的不是一個時間區段，而是一個工作的時間區段
所以一個 Interval 其實就是一份工作
下面都會括號表示是工作

給重疊的 Interval 上不同的標籤

function：Innterval_Partitioning(R)
1. I = {I1...In} // sort Intervals(Works) in ascending order of their start times
2. for j from 1 to n do
3.    exclude the labels of all assigned intervals(works) that are not "compatible" with Ij
4.    if (there is a nonexcluded label from {1,2...d})
5.       assign a nonexcluded label to Ij
6.    else leave Ij unlabeled

簡單來說就是依序把每個工作丟進一個 label ，而 label 會存放他所儲存的工作中
最後加入的工作，的結束時間
然後每次要丟入前，把要丟入的工作，的開始時間，跟 label 所儲存的「最後結束時間」比較
如果「要丟入的工作的開始時間」大於「最後結束時間」，代表兩份工作間一定沒有重疊，就加進去
反之就換下一個 label 再次檢查能不能丟進去

所以程式碼所說的 "compatible" 就是比較兩份工作會不會衝突的部分

看圖會比較明白

此時三個 Label 的最後結束時間，一開始都先定為 0
因此，工作 1 就會依照 Label 的順序，以及判斷的準則，丟入 Label 1

Image Not Showing Possible Reasons

The image file may be corrupted
The server hosting the image is unavailable
The image path is incorrect
The image format is not supported

Learn More →

由於 Label 1 的最後結束時間已經更新為工作 1 的結束時間，也就是 2
所以，工作 2 會因為開始時間小於 Label 1 的最後結束時間，不能塞進 Label 1
因此會塞進 Label 2

Image Not Showing Possible Reasons

The image file may be corrupted
The server hosting the image is unavailable
The image path is incorrect
The image format is not supported

Learn More →

工作三也依照一樣的方式，最後塞進 Label 3

Image Not Showing Possible Reasons

The image file may be corrupted
The server hosting the image is unavailable
The image path is incorrect
The image format is not supported

Learn More →

剩餘的工作一樣

Image Not Showing Possible Reasons

The image file may be corrupted
The server hosting the image is unavailable
The image path is incorrect
The image format is not supported

Learn More →

最後就得到下面的結果

Image Not Showing Possible Reasons

The image file may be corrupted
The server hosting the image is unavailable
The image path is incorrect
The image format is not supported

Learn More →

最佳解證明：Line 6 不會發生

對於一個工作 I ，假設在他開始的時間點重疊了 t 個工作
所以包含他在內，在s(I)那個時間點，總共重疊了 t + 1 個工作
而我們知道我一開始找的 d 是整個工作表的最大重疊數量，於是可以知道

t + 1 \leq d, t \leq d - 1

而上面右邊的不等式告訴我們，最後一定會剩下一個 label 可以給工作 I 使用

所以每個工作一定都可以被分配到一個 Label ，代表 Line 6 一定不會發生
這也代表我們的程式碼，的確是會使用到 Lower Bound 的

而這項證明就告訴我們，我們的程式碼是最佳解

正確性證明：不會有重疊的工作有同個 Label

假設有兩個工作 A B，就說 s(A)<s(B)
所謂的不重疊，就是f(A)≦s(B)，其他情況都算重疊
根據程式碼，因為 Label 會記錄最後的工作的結束時間，用他來決定要不要塞入下一份工作
假設 A 是他所在的 Label 的最後一個工作。如果f(A)>s(B)，也就是AB重疊
那他絕對不會被塞入 A 所在的 Label ，畢竟這就是我們程式碼確保的事情

Scheduling to Minimize Lateness

這個例子是要展示一次「交換無異」的證明方式

Image Not Showing Possible Reasons

The image file may be corrupted
The server hosting the image is unavailable
The image path is incorrect
The image format is not supported

Learn More →

這次換成是我有一堆工作，每個工作都有各自的期限 Deadline

哪一種排序方法，「最大的延遲的時間」最小

所謂延遲的時間，就是該工作的結束時間，減掉期限
如果全部的結束時間比期限早的話，當然就是 0

跟第一個排程問題一樣，先來個腦力激盪，想出 Greedy Rules

最短的工作時長
- 就像是最簡單的先做
最小的工作後剩餘時間(slack)
- 是將工作結束後的時間，減掉期限。越小的話代表你做完很快就要交了
最早的期限
- 就是看輕重緩急

最短的工作時長

先做工作 1，會導致工作 2 遲交 1 天，
但是先做工作 2 的話，工作 1 並不會遲交，反例成立

Image Not Showing Possible Reasons

The image file may be corrupted
The server hosting the image is unavailable
The image path is incorrect
The image format is not supported

Learn More →

最小的工作後剩餘時間(slack)

先做工作 2 ，工作 1 會遲交 9 天，
但是先做工作 1 ，工作 2 只遲交 1 天，反例成立

Image Not Showing Possible Reasons

The image file may be corrupted
The server hosting the image is unavailable
The image path is incorrect
The image format is not supported

Learn More →

最早的期限

沒錯，他就是最佳解；似乎也蠻符合常理的

Pseudo Code

Min_Lateness(R,s)
// f is the finishing time of the last scheduled request
1. {d1...dn} = sort requests in ascending order of their deadlines
2. f = s // s 應該是 0
3. for i from 1 to n do 
4.    assign request i to the time interval, from s(i)=f to f(i)=f+ti
5.    f = f + ti
6. return the set of scheduled intervals [s(i),f(i)) for all i = 1...n

證明交換無異 Exchange arguments

回顧一次何謂「交換無異」

假設有一個很厲害的人，他不用 Greedy 找到一個最佳解 O
我假設我的解 A ，不是最佳解；然後把他的解，跟我的解不一樣的地方，替換成我的部分

不一樣的形式有兩個：

O 有一個元素，在 A 沒有；A 也有一個元素，在 O 沒有
O 裡面有兩個連續元素的順序，在 A 當中是顛倒的，也就是 Inversion

之後就開始替換：如果是第 1 種，就是把這兩者交換 (swap)
如果是第二種，一樣，交換這兩個的順序

也就是根據我的 Greedy Rule 修正成像 Greedy 的樣子

最後只要證明我每一個交換的步驟，都不會損害 O 的品質，讓他變成不是最佳解
那就代表我的解 A 跟 O 是一樣好的，跟我一開始假設 A 不是最佳解矛盾

因此，我們要先蒐集一些證明所需的推論

Lemma 1

一定存在一個最佳解，該解並沒有空閒時間 Idle Time

證明

假設我有一個最佳解，他有空閒時間。那我把空閒時間壓縮後，依舊是一個最佳解

Inversion

兩個工作A, B，明明 A 的期限比 B 早，但是 A 卻比 B 晚做
這在我們的 greedy rule 中是顛倒的存在

s (B) < s (A), d_{A} < d_{B}

Lemma 2

All schedules without inversions and without idle time have the same maximum lateness

證明

如果兩個排程，既沒有 Inversion 也沒有 Idle Time，那麼唯一順序可能產生不一樣的地方，只有那些期限是一樣的工作
不過既然這兩個工作的期限是一樣的，那他們一定會排在其他期限比他們早的工作後，以及期限比他們晚的工作前
所以你把兩份工作合在一起看，超出期限的「最大部分」一定是一樣長的

Lemma 3

There's an optimal schedule with no inversion and no idle time

證明

跟上面的很像，就是假設有一個最佳解存在 Inversion

如果他有 Idle Time ，我們可以用上面的 Lemma 1 得到一個壓縮過的排程，或者說

An optimal schedule with inversion and no idle time

而對於「任何」發生 Inversion 的兩個工作，我把他前後位置調換
這樣並不會使結果產生影響，畢竟你把一個本應該早點做的工作，從晚做變成早做
最大延遲時間只會是小於或是等於的

「任何」的重要性是代表你所做的每一步

串起來

我用 Greedy 找到的解法叫做 A ，假設他不是一個最佳解

假設有一個人找到了最佳解 O

假設 O 有 Idle Time
- 那我可以用 Lemma 1 壓縮成「沒有 Idle Time」
假設 O 沒有 Idle Time，沒有 Inversion
- 那其實 A 跟 O 是一樣的，代表 A 是最佳解，矛盾
假設 O 沒有 Idle Time，但存在 Inversion
- 那我可以用 Lemma 3 ，找到另一個沒有 Inversion 的解
- 這個沒有 Inversion 的解其實就是根據我的 Greedy rule 找到的解 A
- 也就是說我的 A 跟 O 至少是一樣好的(可能會更好)，矛盾

總結

Greedy Algo 就是你找的一個「最好」的準則，然後就頭也不回的做下去
常用的證明方法就是「保持領先」跟「交換無異」

但是通常，通常，很難遇到一個問題，可以單純用 Greedy 就解完的
而且如果真的被你遇到，不要猶豫，就用吧

Interval-Graph？

一開始有說這類問題叫 Interval-Graph 不過到目前為止，根本沒看過任何 Graph ？
其實只要把每個工作當作一個節點，有重疊的就建一個 Edge
這樣的 Graph 就叫做 Interval-Graph，或叫做 Intersection-Graph

而先前的 Interval partitioning 問題，換到 Graph 上
其實就是相連兩點顏色要不一樣的問題

輕重緩急 ?

如果給一個圖表

又重又急的事情一定要排第一
又輕又緩的事情一定要排最後
那剩餘兩個怎麼排？
通常，或著說現實中，往往會選擇輕急的事情做

但是老師說，一個很有名已過世的學者建議，要挑重緩的事情做
因為一旦拖延重緩，到後面他會變成重急，而人遇到重急的事情，壓力會變大
這種事情越多，壓力越大，對人的影響也越大

References

參考 1

Interval Partitioning

我最少要多少個 Resource ，才可以把工作做完？

Depth / d

至少所需的 Resource / Lower Bound

證明

如何找到 Depth

Pseudo Code

最佳解證明：Line 6 不會發生

正確性證明：不會有重疊的工作有同個 Label

Scheduling to Minimize Lateness

哪一種排序方法，「最大的延遲的時間」最小

最短的工作時長

最小的工作後剩餘時間(slack)

最早的期限

Pseudo Code

證明 交換無異 Exchange arguments

Lemma 1

證明

Inversion

Lemma 2

證明

Lemma 3

證明

串起來

總結

Interval-Graph？

輕重緩急 ?

References

Read more

Multiplexers, Decoders, and Programmable Logic Devices|第八週

辦第二隻 Line 帳號

計算機概論傳送門

第 2 週 - 補數、採樣、與其他

證明交換無異 Exchange arguments