---
tags: INFO2022
---
# 2022-09-20 Class Discussion Notes
- [ ] video: [叫 foodpanda「送~」](https://youtu.be/f0Nr0XywxnA)
    * Although it is an advertising short (this course accepts no sponsored content), it captures workplace reality: swallowing your pride will not make company managers take notice of you; they would rather spend more money finding someone else
    * Flaunting your own cleverness in a résumé may strike others as mere [petty cleverness](https://dict.revised.moe.edu.tw/dictView.jsp?ID=106818)
- [ ] Nic: [Are data structures and algorithms really useful at work?](https://youtu.be/-Y_4rOXeqHQ)
    * [Lenz's law](https://zh.m.wikipedia.org/wiki/%E6%A5%9E%E6%AC%A1%E5%AE%9A%E5%BE%8B)
    * When would you actually use a linked list merge?
        + [The C language you may not know: linked lists and non-contiguous memory](https://hackmd.io/@sysprog/c-linked-list)
        + memory allocator!
    * A professor at NCKU: "Much knowledge and theory feels useless only because you have not yet reached the level where you need it."
- [ ] [For Taiwanese software engineers going abroad directly, Japan and Singapore are probably the countries with the lowest opportunity cost](https://www.facebook.com/permalink.php?story_fbid=pfbid0aU7pvpFzS5GJtaaDwTE3LvW7wiwhmBdMqXt5HsV36JiPjLhdT18FaVCnigPJG5QXl&id=100000201416442)
## Course Q&A
Q1: A real-world case of Merge Two Sorted Lists
`linux/list.h` is the Linux kernel's practical doubly linked list abstraction: embed a `struct list_head` in a structure of your own and you can use the whole family of APIs the kernel provides, such as `list_add`, `list_add_tail`, and `list_empty`.
In the kernel source, `lib/list_sort.c` contains the following **merge** implementation, which combines two already sorted linked lists:
:::spoiler merge()
```c
/*
 * Returns a list organized in an intermediate format suited
 * to chaining of merge() calls: null-terminated, no reserved or
 * sentinel head node, "prev" links not maintained.
 */
__attribute__((nonnull(2,3,4)))
static struct list_head *merge(void *priv, list_cmp_func_t cmp,
                               struct list_head *a, struct list_head *b)
{
    struct list_head *head, **tail = &head;

    for (;;) {
        /* if equal, take 'a' -- important for sort stability */
        if (cmp(priv, a, b) <= 0) {
            *tail = a;
            tail = &a->next;
            a = a->next;
            if (!a) {
                *tail = b;
                break;
            }
        } else {
            *tail = b;
            tail = &b->next;
            b = b->next;
            if (!b) {
                *tail = a;
                break;
            }
        }
    }
    return head;
}
```
:::
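The same tail-pointer technique can be sketched in Python. The `Node` helper below is hypothetical (a bare singly linked node matching the null-terminated intermediate format, with `prev` links not maintained), not part of the kernel API:

```python
class Node:
    """Bare singly linked node (hypothetical, for illustration only)."""
    def __init__(self, val, nxt=None):
        self.val = val
        self.next = nxt

def merge(a, b):
    """Merge two sorted, null-terminated lists.
    On ties, take from `a` first, which is what keeps the sort stable."""
    dummy = Node(0)              # stands in for `struct list_head *head, **tail = &head`
    tail = dummy
    while True:
        if a.val <= b.val:
            tail.next = a
            tail = a
            a = a.next
            if a is None:        # `a` exhausted: splice the rest of `b`
                tail.next = b
                break
        else:
            tail.next = b
            tail = b
            b = b.next
            if b is None:        # `b` exhausted: splice the rest of `a`
                tail.next = a
                break
    return dummy.next

def from_list(xs):
    """Build a null-terminated chain of Nodes from a Python list."""
    head = None
    for x in reversed(xs):
        head = Node(x, head)
    return head

def to_list(head):
    """Collect a null-terminated chain of Nodes back into a Python list."""
    out = []
    while head:
        out.append(head.val)
        head = head.next
    return out
```

Like the kernel version (note its `nonnull` attribute), this sketch assumes both inputs are non-empty.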
\
This function is used by **list_sort** to implement an efficient merge sort over linked lists:
:::spoiler list_sort
```c
__attribute__((nonnull(2,3)))
void list_sort(void *priv, struct list_head *head, list_cmp_func_t cmp)
{
    struct list_head *list = head->next, *pending = NULL;
    size_t count = 0;           /* Count of pending */

    if (list == head->prev)     /* Zero or one elements */
        return;

    /* Convert to a null-terminated singly-linked list. */
    head->prev->next = NULL;

    /*
     * Data structure invariants:
     * - All lists are singly linked and null-terminated; prev
     *   pointers are not maintained.
     * - pending is a prev-linked "list of lists" of sorted
     *   sublists awaiting further merging.
     * - Each of the sorted sublists is power-of-two in size.
     * - Sublists are sorted by size and age, smallest & newest at front.
     * - There are zero to two sublists of each size.
     * - A pair of pending sublists are merged as soon as the number
     *   of following pending elements equals their size (i.e.
     *   each time count reaches an odd multiple of that size).
     *   That ensures each later final merge will be at worst 2:1.
     * - Each round consists of:
     *   - Merging the two sublists selected by the highest bit
     *     which flips when count is incremented, and
     *   - Adding an element from the input as a size-1 sublist.
     */
    do {
        size_t bits;
        struct list_head **tail = &pending;

        /* Find the least-significant clear bit in count */
        for (bits = count; bits & 1; bits >>= 1)
            tail = &(*tail)->prev;
        /* Do the indicated merge */
        if (likely(bits)) {
            struct list_head *a = *tail, *b = a->prev;

            a = merge(priv, cmp, b, a);
            /* Install the merged result in place of the inputs */
            a->prev = b->prev;
            *tail = a;
        }

        /* Move one element from input list to pending */
        list->prev = pending;
        pending = list;
        list = list->next;
        pending->next = NULL;
        count++;
    } while (list);

    /* End of input; merge together all the pending lists. */
    list = pending;
    pending = pending->prev;
    for (;;) {
        struct list_head *next = pending->prev;

        if (!next)
            break;
        list = merge(priv, cmp, pending, list);
        pending = next;
    }
    /* The final merge, rebuilding prev links */
    merge_final(priv, cmp, head, pending, list);
}
```
:::
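The merge schedule above can be sketched in Python, with a list of sorted runs standing in for the prev-linked `pending` "list of lists". This is a simplification of the kernel code (which threads runs through pointers instead of copying), but it follows the same bit logic exactly:

```python
def merge_runs(a, b):
    """Stable merge of two sorted Python lists (take from `a` on ties)."""
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        if a[i] <= b[j]:
            out.append(a[i]); i += 1
        else:
            out.append(b[j]); j += 1
    return out + a[i:] + b[j:]

def list_sort(items):
    """Bottom-up mergesort following lib/list_sort.c's schedule: two runs
    of size 2^k are merged once 2^k further elements have arrived, i.e.
    when `count` reaches an odd multiple of 2^k."""
    pending = []                 # newest/smallest runs at the front
    count = 0
    for x in items:
        # Find the least-significant clear bit in count; each set low
        # bit skips over one pending run (the kernel's tail-walk).
        bits, idx = count, 0
        while bits & 1:
            bits >>= 1
            idx += 1
        if bits:                 # do the indicated merge
            a, b = pending[idx], pending[idx + 1]   # b is the older run
            pending[idx:idx + 2] = [merge_runs(b, a)]
        pending.insert(0, [x])   # add input element as a size-1 run
        count += 1
    # End of input: merge all pending runs, older run first for stability.
    result = pending[0] if pending else []
    for run in pending[1:]:
        result = merge_runs(run, result)
    return result
```

Tracing `[3, 1, 2]`: after two elements `pending` is `[[1], [3]]`; the third arrival (count = 2) first merges them into `[1, 3]` and then pushes `[2]`, exactly the "merge two size-1 runs once 2 more elements follow" rule.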
\
According to the commit message of May 15, 2019, the idea behind the efficient merge sort comes from the paper [Queue-Mergesort](https://www.sciencedirect.com/science/article/pii/002001909390088Q).
:::spoiler commit message
> lib/list_sort: optimize number of calls to comparison function
>
> CONFIG_RETPOLINE has severely degraded indirect function call
> performance, so it's worth putting some effort into reducing the number
> of times cmp() is called.
>
> This patch avoids badly unbalanced merges on unlucky input sizes. It
> slightly increases the code size, but saves an average of 0.2*n calls to
> cmp().
>
> x86-64 code size 739 -> 803 bytes (+64)
>
> Unfortunately, there's not a lot of low-hanging fruit in a merge sort;
> it already performs only n*log2(n) - K*n + O(1) compares. The leading
> coefficient is already at the theoretical limit (log2(n!) corresponds to
> K=1.4427), so we're fighting over the linear term, and the best
> mergesort can do is K=1.2645, achieved when n is a power of 2.
>
> The differences between mergesort variants appear when n is *not* a
> power of 2; K is a function of the fractional part of log2(n). Top-down
> mergesort does best of all, achieving a minimum K=1.2408, and an average
> (over all sizes) K=1.248. However, that requires knowing the number of
> entries to be sorted ahead of time, and making a full pass over the
> input to count it conflicts with a second performance goal, which is
> cache blocking.
>
> Obviously, we have to read the entire list into L1 cache at some point,
> and performance is best if it fits. But if it doesn't fit, each full
> pass over the input causes a cache miss per element, which is
> undesirable.
>
> While textbooks explain bottom-up mergesort as a succession of merging
> passes, practical implementations do merging in depth-first order: as
> soon as two lists of the same size are available, they are merged. This
> allows as many merge passes as possible to fit into L1; only the final
> few merges force cache misses.
>
> This cache-friendly depth-first merge order depends on us merging the
> beginning of the input as much as possible before we've even seen the
> end of the input (and thus know its size).
>
> The simple eager merge pattern causes bad performance when n is just
> over a power of 2. If n=1028, the final merge is between 1024- and
> 4-element lists, which is wasteful of comparisons. (This is actually
> worse on average than n=1025, because a 1204:1 merge will, on average,
> end after 512 compares, while 1024:4 will walk 4/5 of the list.)
>
> Because of this, bottom-up mergesort achieves K < 0.5 for such sizes,
> and has an average (over all sizes) K of around 1. (My experiments show
> K=1.01, while theory predicts K=0.965.)
>
> There are "worst-case optimal" variants of bottom-up mergesort which
> avoid this bad performance, but the algorithms given in the literature,
> such as queue-mergesort and boustrodephonic mergesort, depend on the
> breadth-first multi-pass structure that we are trying to avoid.
>
> This implementation is as eager as possible while ensuring that all
> merge passes are at worst 1:2 unbalanced. This achieves the same
> average K=1.207 as queue-mergesort, which is 0.2*n better then
> bottom-up, and only 0.04*n behind top-down mergesort.
>
> Specifically, defers merging two lists of size 2^k until it is known
> that there are 2^k additional inputs following. This ensures that the
> final uneven merges triggered by reaching the end of the input will be
> at worst 2:1. This will avoid cache misses as long as 3*2^k elements
> fit into the cache.
>
> (I confess to being more than a little bit proud of how clean this code
> turned out. It took a lot of thinking, but the resultant inner loop is
> very simple and efficient.)
>
> Refs:
> Bottom-up Mergesort: A Detailed Analysis
> Wolfgang Panny, Helmut Prodinger
> Algorithmica 14(4):340--354, October 1995
> https://doi.org/10.1007/BF01294131
> https://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.6.5260
>
> The cost distribution of queue-mergesort, optimal mergesorts, and
> power-of-two rules
> Wei-Mei Chen, Hsien-Kuei Hwang, Gen-Huey Chen
> Journal of Algorithms 30(2); Pages 423--448, February 1999
> https://doi.org/10.1006/jagm.1998.0986
> https://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.4.5380
>
> Queue-Mergesort
> Mordecai J. Golin, Robert Sedgewick
> Information Processing Letters, 48(5):253--259, 10 December 1993
> https://doi.org/10.1016/0020-0190(93)90088-q
> https://sci-hub.tw/10.1016/0020-0190(93)90088-Q
>
> Feedback from Rasmus Villemoes <linux@rasmusvillemoes.dk>.
>
> Link: http://lkml.kernel.org/r/fd560853cc4dca0d0f02184ffa888b4c1be89abc.1552704200.git.lkml@sdf.org
> Signed-off-by: George Spelvin <lkml@sdf.org>
> Acked-by: Andrey Abramov <st5pub@yandex.ru>
> Acked-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
> Cc: Daniel Wagner <daniel.wagner@siemens.com>
> Cc: Dave Chinner <dchinner@redhat.com>
> Cc: Don Mullis <don.mullis@gmail.com>
> Cc: Geert Uytterhoeven <geert@linux-m68k.org>
> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
> George Spelvin authored and torvalds committed on 15 May 2019
:::
\
The illustration in Mordecai J. Golin and Robert Sedgewick's **Queue-Mergesort** paper: treat the linked list as a queue. Each merge takes the two items at the front of the queue; the merged result is placed at the tail and treated as a single item, and the step repeats until everything has been merged.
> 
So wherever `list_sort` is used, the merge-two-sorted-lists idea is at work underneath.
Q2: What Example 1 lacked (1)
Ans: When the problem was being explained, the data type of the input was never mentioned, nor were the binary tree constraints, e.g. whether it is a complete binary tree or a full binary tree. Finally, the type of each node was never specified.
Q3 - 1: What Example 1 lacked (2)
Ans:
* 🐶 interviewer:
    * In the opening problem-statement phase, it is better to start right away with the Example 1 figure and explain that the problem gives a binary tree whose every node holds an integer value. A path runs from the root node to a leaf node, e.g. "5-4-11-2" is one root-to-leaf path; summing the node values along a root-to-leaf path gives **targetSum**, and the path is one-way with no backtracking. <br><br>Then state the actual requirement: the interviewee must write a function that decides whether the given targetSum exists in the binary tree,<br>and make it explicit that **the first parameter is the root of the binary tree and the second parameter is the targetSum to be checked**.
    * Class note: at [1:27](https://youtu.be/vcccBHvvoJw?t=87) the example should be explained by the interviewer, not the interviewee.
    * [3:13](https://youtu.be/vcccBHvvoJw?t=193): from what I have observed, Google coding interviews usually do not let the interviewee jump straight into code or pseudocode; the interviewer must first understand the interviewee's approach ([Google Coding Interview: 15:20-15:41](https://youtu.be/rw4s4M3hFfs?t=920)) before the interviewee writes pseudocode ([Google Coding Interview: 16:54-16:57](https://youtu.be/rw4s4M3hFfs?t=1004)), so [3:16-3:23](https://youtu.be/vcccBHvvoJw?t=196) should be the interviewer's lines.
* 🐼 interviewee:
    * [2:23](https://youtu.be/vcccBHvvoJw?t=123): you **can simply reuse the Example 1 figure to pose your clarifying examples**, which saves time and makes the examples more concrete.
    * [3:28-3:42](https://youtu.be/vcccBHvvoJw?t=208): explaining the method before coding is right, and **it helps to name the algorithm you plan to use up front (e.g. method 1 here uses BFS)**. The queue's role was not explained clearly enough: even though the video says "what the queue stores should be the node and the value left after subtracting from targetSum", it would be clearer to say "**I plan to use the BFS algorithm, and while visiting each node I will use a queue to store the values encountered; rather than storing the node value directly, the trick is to store the difference between targetSum and the node values so far.**" With the queue's purpose clearly stated, the interviewer can much more easily see when and why you pop() or push() the queue during the later `Approach` explanation and the coding at [6:37](https://youtu.be/vcccBHvvoJw?t=397).
    * [3:55](https://youtu.be/vcccBHvvoJw?t=235): here [Panda in the video](https://hackmd.io/@sysprog/BkeIYF8-j#112-Path-Sum) presumably meant to use `REACTO`, leaning on an `Example` to help explain the `Approach`, but a purely verbal description of an algorithm is hard for the listener (the interviewer) to picture.<br>While speaking, type a diagram of the queue into Google Docs to support the explanation, along the lines of the textual annotation below; whenever you pop or push, update its contents as well.
```c
// queue
// -------------------------
// (17-8), (17-4), (22-5)
// -------------------------
```
    * When explaining the `Approach`, **first sketch the steps and strategy roughly as code comments, confirm the interviewer is following, and only then write detailed pseudocode or the actual implementation for each step** (e.g. [Google Coding Interview With A High School Student: 17:08-32:33](https://youtu.be/qz9tKlF431k?t=1028)).<br>Writing the algorithm steps first means the later code can refer back to each step, and it also helps the interviewer keep pace with the interviewee.
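The BFS wording suggested above (the queue holds, per node, the remainder of targetSum after subtracting the values on the path so far) can be sketched as follows; `TreeNode` is the usual LeetCode node, redefined here so the snippet is self-contained:

```python
from collections import deque

class TreeNode:
    def __init__(self, val=0, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right

def has_path_sum_bfs(root, target_sum):
    """BFS over (node, remaining) pairs; `remaining` is targetSum minus
    the node values already consumed on the path down to `node`."""
    if not root:
        return False
    queue = deque([(root, target_sum - root.val)])
    while queue:
        node, remaining = queue.popleft()
        if not node.left and not node.right and remaining == 0:
            return True          # a root-to-leaf path hit targetSum
        if node.left:
            queue.append((node.left, remaining - node.left.val))
        if node.right:
            queue.append((node.right, remaining - node.right.val))
    return False
```

On the Example 1 tree, the queue entry for node 11 would be `(node 11, 22 - 5 - 4 - 11)`, matching the `(17-8), (17-4), (22-5)` annotation style shown above.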
Q3 - 2: Alternative solutions to Example 1
Ans:
I tried implementing a third method with the Python [dictionary](https://docs.python.org/3/tutorial/datastructures.html#dictionaries) the instructor hinted at in class. Below I start from the `Approach` step of `REACTO`, working through Example 1 as the `Example` while explaining the `Approach`:
* Interviewee: "I think we can **implement DFS recursively** to visit every node of the binary tree (see the comment box below; write **`DFS`** in the doc as you speak), and from the bottom of the tree, the leaf nodes, pass each path's current summation back up. Pairing this with a **dictionary that records each path's current summation**, I can use the dictionary key to record the current node value summation and the dictionary value to record how many times that summation occurs (write `dict of (current sum, counts)` in the doc). After traversing the whole tree we have the node value summations of all paths: if a path matching targetSum exists, return True, otherwise False."

* Interviewee: "(With the Example 1 figure, path 5-4-11-7.) The recursion's **base case** is when both children are None, meaning we have reached a leaf node. The recursion then returns from ++node 7++, wrapping value 7 into a dictionary handed back to ++node 11++; visiting ++node 2++ likewise wraps its node value into a returned dictionary. ++node 11++ can now merge the two dictionaries returned by its children into one, computing ++node 11++'s current summation at the same time."
* Interviewee: "++node 11++ adds two current summations to the dictionary, keyed on node.value combined with the left and right children's node values (write `recursive step:...` in the doc). Before returning to the root the right subtree is traversed as well; back at the root, the dictionary records the summations of all four paths."
* Then write out each step of the algorithm (`#1 ~ #5` below):
```
# Main Approach: DFS, dict of (current sum, counts)
# recursive step: HT[node.val - sub_leftnode.val], HT[node.val - sub_rightnode.val]
# 1) Use DFS traverse this tree
# 2) When reach leaf node, make a dict. of map_key is node.val, map_value is count
# 3) Get current sum merge dicts from left-subtree and right-subtree
# 4) End of DFS, get all paths sum and counts
# 5) if (HT[targetSum] > 0) return True else False
```
* `Code` reference implementation
```python=
class Solution:  # TreeNode/Optional are provided by the LeetCode harness
    # T_whole(?) = O(1) + T_DFS(?) + O(1)
    #            = O(n * log(n))
    def hasPathSum(self, root: Optional[TreeNode], targetSum: int) -> bool:
        # O(1)
        # if tree is empty
        if not root:
            return False
        # Do DFS and get dict of (value sum, counts) recording all paths
        HT_full = self.DFS(root)
        if HT_full.get(targetSum, 0) > 0:  # if HT_full[targetSum] exists
            return True
        else:
            return False
    # T_DFS(?) = base case + recursive steps
    #          = O(counts of null nodes) + (k * merge time)
    #          = O(n) + O(k * 2^{k-1})  # k = log(n+1)
    #          = O(n) + O(log(n) * n)
    #          = O(n * log(n))
    def DFS(self, node):
        # O(1)
        '''Base case: reach NULL child node'''
        if not node: return {}
        # O(?) = O(k * merge time)
        '''Recursive steps:'''
        # O(1)
        # reach leaf node, return a dict(): map_key is node.val, map_value is count of node.val (init. 1)
        if (not node.left) and (not node.right):
            return {node.val: 1}
        HT1 = self.DFS(node.left)
        HT2 = self.DFS(node.right)
        # T_MergeHT(?) = O(n)
        HT_merged = self.MergeHT(node.val, HT1, HT2)
        return HT_merged
    # T_MergeHT(?) = O(counts of keys in input hashtables)
    #              = O(counts of paths) = O(counts of leaf nodes)
    #              = O(2^{k-1}) = O(n/2)  # k = log(n+1)
    def MergeHT(self, current_val, HT1, HT2):
        # shift HT1's keys into a fresh dict so a shifted key can never
        # collide with a not-yet-visited original key
        if HT1 and (current_val != 0):
            HT_new = {}
            for k in list(HT1.keys()):
                HT_new.update({k + current_val: HT_new.get(k + current_val, 0) + HT1[k]})
            HT1 = HT_new
        if HT2:
            for k in list(HT2.keys()):
                HT1.update({k + current_val: HT1.get(k + current_val, 0) + HT2[k]})
        return HT1
```
* Key points of the code
    * Lines 3~13: `hasPathSum()` is the main function
    * Lines 15~27 implement the DFS
    * Lines 29~38 implement the dictionary merge mentioned in `#recursive step`
    * Lines 10, 32, 36: [dict.get()](https://docs.python.org/3/library/stdtypes.html#dict.get) looks up the map_value by map_key, returning the default 0 when the key is absent (note that it does not insert the key)
    * Lines 32, 36: [dict.update()](https://docs.python.org/3/library/stdtypes.html#dict.update) is the Python dict method for adding key/value pairs
* `Test`
    * Assume a full binary tree, with $n$ nodes in the input tree and tree height $k$
    * The time complexity of `hasPathSum()` is dominated by `DFS()`.
    * `DFS()`'s time complexity is the number of base cases plus the cost of the recursive steps
        * The base case fires once per NULL node visited, i.e. $O(2^{k})$
        * The recursive step both recurses into the left/right subtrees and runs `MergeHT()`, so it costs $k$ times the running time of `MergeHT()`
        * Assuming every dictionary lookup is $O(1)$, `MergeHT()` takes time proportional to the total number of keys in the two hash tables; the final key count equals the number of paths, i.e. the number of leaf nodes, which is $2^{k-1}$
        * Putting it together, $T_{recursive\;steps}(?) = O(k \cdot 2^{k-1})$, where $k = \log_2(n+1)$
    * So the time complexity of `hasPathSum()` is $O(n) + O(n \cdot \log_2(n)) = O(n \cdot \log_2(n))$
* `Optimization`: ==As a follow-up: the method I first brought up in class, a **DFS traversal** with a **global variable tracking the current summation**, happens to serve as the `Optimization`.==
    * Interviewee: "The original method has to traverse every path, but this problem may return True as soon as one path matching targetSum is found, so I can arrange to stop the traversal early once the right path appears. Moreover, the traversal does not really need to record that much information; a single global variable holding the current summation of the nodes visited so far is enough."
    * The `code` is below (both a ++follow-up revision of the original Python code++ and a ++C version that can be compared with Panda's++)
    * Interviewee: "(Narrating while editing a copy of the original code.) **First I drop the dict and `MergeHT()`, since a single global variable will now track the current summation.** The usual approach passes the current summation as a recursion parameter, but the side effect is that every recursive call spills it onto the stack in memory; managed as a global variable it needs only 4 bytes at run time (assuming C with an `int`). In Python, `self.current_sum` serves as the class-wide variable. Next, **DFS's return value becomes a boolean; the recursive step now does `self.current_sum += node.val` at every visited node; and after visiting each child we check once for early termination, i.e. `if result == True: return True`.**"
    * `Test` Interviewee: "For the revised code, with $n$ nodes in total and tree height $k$ (counting the root as height 1), the worst case is $O(2^{k}-1)$ since every node may be visited once, and the best case, the very first path yielding the answer, is $\Omega(k)$. As for $k$: in a fully skewed tree $k = n$, equivalent to walking a linked list; in a full binary tree the node count is $n = 2^{k}-1$, so $k = \log_{2}(n+1)$."
:::spoiler {state="open"} (click to expand) **(Optimization) revised Python code**
```python
# Optimization reference code: DFS + global current_sum
class Solution:
    def __init__(self):
        self.current_sum = 0
        self.TargetSum = 0
    def hasPathSum(self, root: Optional[TreeNode], targetSum: int) -> bool:
        self.current_sum = 0   # reset between calls, as in the C version
        self.TargetSum = targetSum
        return self.DFS(root)  # Do DFS and get boolean result
    def DFS(self, node):
        '''Base case: reach NULL child node'''
        if not node: return False
        '''Recursive step'''
        self.current_sum += node.val
        # reach leaf node
        if (not node.left) and (not node.right):
            if (self.current_sum == self.TargetSum):
                return True
        result = self.DFS(node.left)
        if result: return True  # Early stop
        result = self.DFS(node.right)
        if result: return True  # Early stop
        # result is False: give up this node's value
        self.current_sum -= node.val
        return False
```
:::
:::spoiler {state="close"} (click to expand) **(Optimization) C reference implementation**
```c
/* Optimization reference code: DFS + global current_sum */
int current_sum = 0;

bool DFS(struct TreeNode *node, int targetSum);

bool hasPathSum(struct TreeNode *root, int targetSum)
{
    current_sum = 0;
    return DFS(root, targetSum);
}

bool DFS(struct TreeNode *node, int targetSum)
{
    /* base case */
    if (!node)
        return false;
    /* recursive step */
    current_sum += node->val;   /* add this node to the running sum */
    /* leaf node check */
    if (!(node->left) && !(node->right))
        if (current_sum == targetSum)
            return true;
    bool result;
    result = DFS(node->left, targetSum);
    if (result) return result;
    result = DFS(node->right, targetSum);
    if (result) return result;
    /* result == false: give up this node's value, remove it from the sum */
    current_sum -= node->val;
    return false;
}
```
:::
<br>
Q4: How the DFS and BFS approaches (implementations) differ for "hasPathSum"
Ans: BFS has space complexity $O(B^M)$ (branching factor $B$, depth $M$), far larger than DFS's $O(BM)$. Since "hasPathSum" must, and only needs to, find a single qualifying leaf node, BFS's strength of exploring breadth-first toward an optimal solution cannot be exploited; instead it spends a great deal of space keeping the nodes still waiting to be examined, and copying nodes into the queue costs considerable time as well.
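To make the gap concrete, a small experiment on a full binary tree (the helper names below are mine, for illustration): with branching factor $B = 2$ and depth $M$, the BFS queue peaks at the widest level, $2^{M-1}$ nodes, while DFS never holds more than one root-to-leaf path, $M$ nodes.

```python
from collections import deque

class TreeNode:
    def __init__(self, val=0, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right

def full_tree(height):
    """Build a full binary tree of the given height (0 gives an empty tree)."""
    if height == 0:
        return None
    return TreeNode(0, full_tree(height - 1), full_tree(height - 1))

def bfs_peak_queue(root):
    """Largest number of nodes ever held in the BFS queue at once."""
    queue, peak = deque([root]), 1
    while queue:
        node = queue.popleft()
        for child in (node.left, node.right):
            if child:
                queue.append(child)
        peak = max(peak, len(queue))
    return peak

def dfs_peak_stack(root):
    """Deepest recursion (longest root-to-leaf path) reached by DFS."""
    if not root:
        return 0
    return 1 + max(dfs_peak_stack(root.left), dfs_peak_stack(root.right))
```

For a full tree of height 6, the BFS queue peaks at $2^{5} = 32$ entries while the DFS stack never exceeds 6 frames, which is exactly the $O(B^M)$ vs $O(BM)$ gap described above.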