CS:APP Ch7 Linking

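Why learn linking

Understanding the linker helps you build large programs
- Large programs pull in many libraries; knowing how linking works helps you deal with tricky build errors.
Understanding the linker helps you avoid bugs that are hard to track down
- The decisions the linker makes during symbol resolution can significantly change how a program behaves.
Understanding linking helps you understand scope
- The difference between global and local variables
- What static does
Understanding linking helps you understand other important systems concepts
Understanding linking helps you use shared libraries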

Linking

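A source code file passes through

cpp (the preprocessor)
cc1 (the gcc compiler proper)
as (the assembler)

and ends up as a relocatable object file (relocatable means the file can still be combined with other object files), such as main.o and sum.o in the figure above.

The linker combines the required object files into an executable object file, such as prog in the figure above.

When the program is run, the shell calls the operating system's loader, which copies the code and data of the executable object file into main memory and then transfers control to the program so that it starts executing.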


$ ./prog

gcc acts as the compiler driver: it invokes the preprocessor, compiler, assembler and linker in turn on the user's behalf.

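Why linkers?

Modularity

Commonly used functions can be collected into a library, so a program can be written as a set of smaller modules.

Efficiency

Time:
- After a change, only the modified source files need to be recompiled before relinking.
- Separate compilation: several files can be compiled at the same time.
Space:
- Common functions can be gathered into a single file as a library.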

What Do Linkers Do

The linker takes a set of relocatable object files as input and produces an executable object file that can be loaded and run.

Symbol resolution

A program defines and references symbols all the time. The following example illustrates this.

int rand(void) { ... }    /* defines symbol rand */

int main(void)
{
    int a = 0;            /* defines symbol a */
    int *ptr = &a;        /* defines symbol ptr, references symbol a */
    int b = rand();       /* defines symbol b, references symbol rand */
}

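During symbol resolution, the linker must associate each symbol reference with exactly one symbol definition, which it looks up in the symbol tables of all the relocatable object files. Strictly speaking, a, ptr and b in the example are local variables that the compiler manages on its own; only global names such as rand and main appear as symbols the linker has to resolve.
Later sections discuss this in more detail.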

Relocation

Once symbol resolution is done, the linker has associated every symbol reference with one symbol definition. At that point it knows the exact sizes of the .text and .data sections of each module and can begin relocation.

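Relocating sections and symbol definitions

Every relocatable object file has its own code and data sections. In this step the linker merges sections of the same type from all input files into a single section, as shown in the figure above. The linker then assigns a unique run-time address to each symbol definition (every function and global variable).

This ties in with the distinction between a declaration and a definition: a definition is what gets storage allocated for a symbol. During linking, the linker picks the one definition of each symbol (the unique strong symbol, if there is one) and assigns it a unique run-time address.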
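The section-merging step can be sketched in a few lines of C. This is only an illustration under simplifying assumptions: the struct module record, the function merge_sections and the base addresses are hypothetical, and alignment and all sections other than .text and .data are ignored.

```c
#include <stddef.h>

/* Hypothetical description of the sections of one relocatable object file. */
struct module {
    const char *name;
    size_t text_size;   /* bytes of .text in this module */
    size_t data_size;   /* bytes of .data in this module */
    size_t text_addr;   /* filled in: run-time start of this module's .text */
    size_t data_addr;   /* filled in: run-time start of this module's .data */
};

/* Lay out all .text sections back to back starting at text_base, and all
 * .data sections back to back starting at data_base. A real linker also
 * honors alignment and handles many more section types. */
void merge_sections(struct module *mods, size_t n,
                    size_t text_base, size_t data_base)
{
    size_t text_cursor = text_base;
    size_t data_cursor = data_base;

    for (size_t i = 0; i < n; i++) {
        mods[i].text_addr = text_cursor;
        text_cursor += mods[i].text_size;

        mods[i].data_addr = data_cursor;
        data_cursor += mods[i].data_size;
    }
}
```

With two modules whose .text sizes are 0x18 and 0x40 and an assumed text base of 0x4004d0, the second module's code would start at 0x4004e8. Once every section has a start address, the address of each symbol definition is simply its section's start address plus the symbol's offset inside that section.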

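Relocating symbol references

Symbol resolution has already tied each symbol reference to one symbol definition. In this step the linker patches every symbol reference so that it points to the correct run-time address.

After these two steps the linker emits the executable object file. The symbols in its .text and .data sections have all been relocated, so the loader can copy these sections into memory and start the program without modifying any instruction.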
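To make "patching a reference" concrete, here is a small sketch of the arithmetic used by the two most common x86-64 relocation styles, PC-relative (R_X86_64_PC32) and absolute (R_X86_64_32). The notes above do not go into these formulas; struct reloc and both function names are made up for illustration, and the addresses used in the usage note are assumed values.

```c
#include <stdint.h>

/* Hypothetical, simplified relocation entry (real ELF entries carry more). */
struct reloc {
    uint64_t offset;     /* offset of the reference inside its section */
    uint64_t sym_addr;   /* run-time address assigned to the symbol    */
    int64_t  addend;     /* constant adjustment stored with the entry  */
};

/* PC-relative reference:
 *   value = symbol address + addend - address of the reference itself */
uint32_t relocate_pc32(uint64_t section_addr, const struct reloc *r)
{
    uint64_t ref_addr = section_addr + r->offset;
    return (uint32_t)(r->sym_addr + (uint64_t)r->addend - ref_addr);
}

/* Absolute reference:
 *   value = symbol address + addend */
uint32_t relocate_abs32(const struct reloc *r)
{
    return (uint32_t)(r->sym_addr + (uint64_t)r->addend);
}
```

For example, assume a call instruction whose 4-byte operand sits at offset 0xf of a section placed at 0x4004d0, a target function placed at 0x4004e8, and an addend of -4. relocate_pc32 then yields 0x5, which is the displacement the CPU adds to the address of the next instruction (0x4004e3) to reach the target.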

The word reference shows up frequently in linker error messages:

/tmp/asskutu.o(.text+0x12): undefined reference to 'foo'

This means that during symbol resolution the linker could not find any symbol definition for foo that the reference could be bound to.

Object file

Object files come in the three forms listed below. An object file can be thought of as a compiled module (from a .c file) stored on disk as a sequence of bytes.

Relocatable object file (.o file)
Executable object file (a.out file)
Shared object file (.so file)

ELF

Linux uses the Executable and Linkable Format (ELF) as the common format for all three kinds of object file. The individual fields are not covered in detail here.
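Although the fields are not covered here, the beginning of the ELF header is enough to tell the three kinds of object file apart. The sketch below reads the 16-byte e_ident array and the 16-bit e_type field that follows it. The function names elf_type and elf_type_name are hypothetical; the magic number, the byte-order flag and the e_type values come from the ELF format.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns the e_type field of an ELF header held in buf, or -1 if buf does
 * not start with the ELF magic number. Only the first 18 bytes are read:
 * e_ident (16 bytes) followed by the 16-bit e_type. */
int elf_type(const unsigned char *buf, size_t len)
{
    if (len < 18 || buf[0] != 0x7f || buf[1] != 'E' ||
        buf[2] != 'L' || buf[3] != 'F')
        return -1;

    /* e_ident[5] (EI_DATA): 1 = little-endian file, 2 = big-endian file. */
    if (buf[5] == 2)
        return (buf[16] << 8) | buf[17];
    return buf[16] | (buf[17] << 8);
}

/* Maps an e_type value to the kind of object file it denotes. */
const char *elf_type_name(int type)
{
    switch (type) {
    case 1:  return "relocatable object file";   /* ET_REL  */
    case 2:  return "executable object file";    /* ET_EXEC */
    case 3:  return "shared object file";        /* ET_DYN  */
    default: return "other or not ELF";
    }
}
```

A caller would read the first 18 bytes of a file into a buffer and pass them to elf_type. One caveat: position-independent executables, which many current distributions build by default, are also marked ET_DYN, so an e_type of 3 does not always mean a .so library.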

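.bss holds global and static variables that are uninitialized or initialized to 0. They take up no space in the object file; memory for them is allocated only at run time. A handy mnemonic is "Better Save Space", which helps tell .bss apart from .data.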
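A small C file shows which variables would typically land in which section. The variable and function names are hypothetical, and the exact placement is up to the compiler and its options; what C itself guarantees is only that objects with static storage duration start out as zero.

```c
/* Initialized to a nonzero value: typically stored in .data, so the value
 * itself takes up space in the object file. */
int initialized_counter = 42;

/* Zero-initialized or uninitialized static-storage variables typically go to
 * .bss: the object file records only their size, and zeroed memory is
 * provided when the program is loaded. */
static int zeroed_table[1024];
static int explicit_zero = 0;

/* Returns 1 if every .bss candidate really starts out as zero. */
int bss_is_zeroed(void)
{
    for (int i = 0; i < 1024; i++)
        if (zeroed_table[i] != 0)
            return 0;
    return explicit_zero == 0;
}
```

Compiling this file with gcc -c and running the size tool on the result should show the 4 KB table counted under bss rather than data, which is the space saving the mnemonic refers to.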

The figure below shows the ELF layout of a typical relocatable object file:

Executable object file

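Tools for working with object files

ar : creates static libraries
strings : lists the printable strings contained in a file
strip : removes the symbol table and debug information from an object file. Stripping an object file before shipping it to the target environment reduces its size
nm : lists the symbols of an object file
size : lists the names and sizes of the sections
readelf : displays information about an ELF object file, including the contents of the ELF header. It covers the functionality of size and nm.
objdump : can display all the information in an object file. Its most useful feature is disassembling the binary instructions in the .text section.
ldd : lists the shared libraries an executable needs at run time.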

Symbol and Symbol table

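Every relocatable object file carries a symbol table. The .symtab section shown by $ readelf -s is that symbol table; it records information about the symbols the module defines and references. There are three kinds of symbol:

global symbols
- Global variables and functions defined by the module itself, which other modules can reference.
external symbols
- Symbols that the module references but that are defined by some other module.
local symbols
- Global variables and functions declared with static, which other modules cannot reference.

A local variable is not a local symbol. Local symbols are functions and variables defined with static: static variables are placed in the .bss or .data section of the ELF file, and static functions in .text. Local (non-static) variables are handled by the compiler and live on the stack at run time, so the linker knows nothing about them.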
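The following single-file sketch puts the three kinds of symbol and a plain local variable side by side. All names except atoi are hypothetical, and the comments describe what one would typically expect to see in the symbol table.

```c
#include <stdlib.h>               /* declares atoi, which is defined in libc */

int shared_total = 0;             /* global symbol: other modules may reference it */
static int call_count = 0;        /* local symbol: visible only inside this file   */

static int square(int v)          /* local symbol (static function)                */
{
    return v * v;
}

int add_square(int v)             /* global symbol                                 */
{
    int tmp = square(v);          /* local variable: stack or register, no symbol  */
    call_count++;
    shared_total += tmp;
    return shared_total;
}

int add_square_str(const char *s) /* global symbol                                 */
{
    return add_square(atoi(s));   /* atoi: external symbol, defined elsewhere      */
}

int calls_so_far(void)            /* global symbol                                 */
{
    return call_count;
}
```

Running nm on the compiled object file is a quick way to check this: nm prints uppercase type letters for global symbols and lowercase letters for local ones, and marks undefined (external) symbols with U. The local variable tmp should not appear at all.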

Symbol resolution

Symbol resolution runs into a problem. Consider the following example.

[func1.c]

int p = 1;

void func1(void) {
}

[func2.c]

int p;

void func2(void) {
}

Both func1.c and func2.c define a global symbol named p. When some code references p, which definition should the linker bind it to? More generally, what should the linker do when several modules define a global symbol with the same name?

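How linkers resolve duplicate symbol definitions

At compile time the compiler classifies each global symbol as either strong or weak.
- Strong symbols: procedures and initialized global variables.
- Weak symbols: uninitialized global variables.
The assembler records whether each symbol is strong or weak in the symbol table of the relocatable object file.
With this information and the linker's symbol rules, the linker can settle the question above.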

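Linker's symbol rules

The linker decides how to link according to the following rules:

1. Multiple strong symbols with the same name are not allowed.
2. Given one strong symbol and multiple weak symbols with the same name, the linker chooses the strong symbol.
3. Given only multiple weak symbols with the same name, the linker picks any one of them.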
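The rules can be written down as a small decision function. This is a simulation for illustration, not how any real linker is implemented; struct definition and resolve are hypothetical names, and "pick any weak symbol" is arbitrarily implemented as "pick the first one".

```c
#include <stddef.h>

enum strength { WEAK, STRONG };

/* Hypothetical record: one definition of a given symbol name in one module. */
struct definition {
    const char *module;
    enum strength strength;
};

/* Applies the linker's rules to all definitions of a single symbol name.
 * Returns the index of the chosen definition, or -1 if the link must fail
 * (two or more strong definitions) or if there is no definition at all. */
int resolve(const struct definition *defs, size_t n)
{
    int chosen = -1;

    for (size_t i = 0; i < n; i++) {
        if (defs[i].strength == STRONG) {
            if (chosen >= 0 && defs[chosen].strength == STRONG)
                return -1;          /* multiple strong symbols: error     */
            chosen = (int)i;        /* a strong symbol beats any weak one */
        } else if (chosen < 0) {
            chosen = (int)i;        /* only weak so far: keep the first   */
        }
    }
    return chosen;
}
```

Passing two STRONG entries returns -1, one WEAK followed by one STRONG returns the index of the strong entry, and two WEAK entries return index 0.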

The following examples illustrate these rules:

example 1

[foo1.c]

int x = 5566;

int main()
{
    return 0;
}

[foo2.c]

int x = 666;

void f()
{
}

Because there are two strong symbols named x, the linker reports an error.

example 2

[foo3.c]

#include <stdio.h>

void f(void);

int x = 5566;

int main()
{
    f();
    printf ("%d", x);
    return 0;
}

[foo4.c]

int x;

void f()
{
    x = 64;
}

The x in [foo4.c] is a weak symbol, so by rule 2 the linker binds it to the x in [foo3.c]. The catch is that the value printed in main becomes 64. That is probably not what the author of main() expected, and it is quite annoying.

example 3

[bar1.c]

#include <stdio.h>

void f(void);

int y = 5566;
int x = 1;

int main ()
{
    f();
    printf ("x = %d, y = %d", x, y);
    return 0;
}

[bar2.c]

double x;

void f()
{
    x = 0.1; 
}

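The x in [bar2.c] is a weak symbol, so the linker binds it to the x in [bar1.c]. Besides the problem from example 2, the two symbols now have different types: on my machine a double is 8 bytes and an int is 4 bytes. The assignment in f() therefore writes 8 bytes starting at the address of x, and if y happens to sit right after x in memory, y gets overwritten as well. This is a subtle bug that is hard to notice, especially because it only makes the linker emit a warning, and it often shows up only much later in the program's execution. If you suspect this kind of problem, build with a flag such as GCC's -fno-common, which makes the linker report an error when it meets a multiply-defined global symbol, or use -Werror to turn all warnings into errors. (Since GCC 10, -fno-common is the default, so recent toolchains reject examples 2 and 3 with a multiple-definition error unless -fcommon is given.)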
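The overwrite can be reproduced inside a single file without relying on the linker. The sketch below is a stand-in, not the real two-file bug: it assumes a 4-byte int and an 8-byte double (checked at compile time), uses a hypothetical struct globals to force x and y to be adjacent, and uses memcpy so that the 8-byte store is well defined.

```c
#include <string.h>

/* Stand-in for two adjacent 4-byte globals. The struct only pins down the
 * layout; in the real bug the neighbor of x is whatever the linker happened
 * to place next to it. */
struct globals {
    int x;
    int y;
};

_Static_assert(sizeof(struct globals) == sizeof(double),
               "sketch assumes 4-byte int and 8-byte double");

/* Mimics what f() does when it believes x is a double: it stores 8 bytes
 * starting at the address of x, which spills over into y. */
void store_double_at_x(struct globals *g, double value)
{
    memcpy(g, &value, sizeof value);
}
```

Starting from x = 1 and y = 5566 and storing 0.1, both fields end up holding halves of the bit pattern of the double, so y no longer reads 5566 even though no code ever assigned to y.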

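How can this kind of error be avoided?
Avoid global variables where you can. If you do need them:

Use static whenever possible.
Always initialize global variables.
Use extern to mark references to global variables defined elsewhere. If you forget it in even one file, though, you run into the same problem again…

In a large program, the bug in example 3 is very hard to find, which is exactly why it pays to understand how linking works.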

Loading Executable Object file

Linking with Libraries

Static libraries

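Almost every program uses the standard library, which provides common functions so that nobody has to reinvent the wheel. Such a library can be given to the linker as input just like a relocatable object file, and it takes part in symbol resolution in the same way. From a static library the linker copies only the object modules that the application actually references.

Here is an example.

The build command is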

$ gcc -static -o prog2c main2.o -L. -lvector

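-static tells the compiler driver that the linker should produce a fully linked executable object file, one that can be loaded into memory and run without any further linking.
-L. tells the linker to look for libraries in the current directory.
-lvector is shorthand for libvector.a.
libc.a is the standard library and does not need to be listed explicitly.

The linker processes its inputs in the order they were given to the compiler driver. When a symbol reference has no matching definition yet, the linker records it as unresolved and keeps trying to resolve it against the libraries and object files that come later. Once all inputs have been scanned, every symbol reference should have found its definition and the linker emits the executable object file; otherwise it reports an error.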

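Link order errors

This rule can cause trouble at link time. Suppose we change the order of the inputs to the compiler driver as follows:

$ gcc -static -o prog2c -L. -lvector main2.o

Now main2.o is the last input. When the linker scans libvector.a, nothing is unresolved yet, so it pulls in none of the library's members. Any symbol that main2.o references from the library can then no longer be resolved, because no input follows main2.o, and the linker reports an error…

The general advice is therefore to put libraries at the end of the compiler driver's input list.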
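The left-to-right scan, and why the order of inputs matters, can be simulated with two bitmask sets, one for defined symbols and one for still-undefined symbols. This is a sketch of the commonly described algorithm, not the code of a real linker; the SYM_ constants, struct object, struct input and scan_inputs are all hypothetical names, and each symbol is modeled as a single bit.

```c
#include <stddef.h>

/* Each symbol is one bit, so a set of symbols is a plain bitmask. */
enum { SYM_MAIN = 1 << 0, SYM_ADDVEC = 1 << 1, SYM_MULTVEC = 1 << 2 };

/* One object file, or one member of an archive. */
struct object {
    unsigned defines;       /* symbols this object defines    */
    unsigned references;    /* symbols this object references */
};

/* One command-line input: a lone object file or an archive of members. */
struct input {
    int is_archive;
    const struct object *members;
    size_t n_members;
};

/* Adds one object to the link and updates both sets. */
static void add_object(const struct object *o,
                       unsigned *defined, unsigned *undefined)
{
    *defined |= o->defines;
    *undefined = (*undefined | o->references) & ~*defined;
}

/* Scans the inputs from left to right and returns the set of symbols that
 * are still undefined at the end. A result of 0 means the link succeeds. */
unsigned scan_inputs(const struct input *inputs, size_t n)
{
    unsigned defined = 0, undefined = 0;

    for (size_t i = 0; i < n; i++) {
        const struct input *in = &inputs[i];

        if (!in->is_archive) {          /* object files are always added */
            for (size_t m = 0; m < in->n_members; m++)
                add_object(&in->members[m], &defined, &undefined);
            continue;
        }

        int changed;
        do {    /* archive: pull in only members that define a missing symbol */
            changed = 0;
            for (size_t m = 0; m < in->n_members; m++) {
                if (in->members[m].defines & undefined) {
                    add_object(&in->members[m], &defined, &undefined);
                    changed = 1;
                }
            }
        } while (changed);
    }
    return undefined;
}
```

Model main2.o as an object that defines SYM_MAIN and references SYM_ADDVEC, and libvector.a as an archive with one member defining SYM_ADDVEC and another defining SYM_MULTVEC. Scanning the object first and the archive second returns 0. Scanning the archive first returns SYM_ADDVEC, because the archive is visited while nothing is undefined and is never revisited.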

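Disadvantages of static libraries

Static libraries have the following drawbacks, which shared libraries were later introduced to address:

When a library is updated, programs must be relinked against the new version.
Every executable object file carries its own copy of the static library code it uses, and every running program loads its own copy into memory.
- Almost every C program uses printf. At link time, the linker has to copy the object code for printf into each final executable object file. At run time, these common functions are duplicated again in the text segment of every running process. On a system running hundreds of programs, this wastes a significant amount of memory.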

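Shared libraries

A shared library is an object module that can be loaded at an arbitrary memory address and linked with a program in memory, either at load time (load-time linking) or at run time (run-time linking). The dynamic linker performs this step, and the process is called dynamic linking.

The figure above shows how a shared library is used for load-time linking.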

Building the shared library

$ gcc -shared -fpic -o libvector.so addvec.c multvec.c

-fpic asks the compiler to generate position-independent code.
-shared tells the linker to create a shared object file.

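Partially static linking

$ gcc -o prog21 main2.o ./libvector.so

This performs part of the linking statically and creates the executable object file prog21.

What does performing part of the linking statically mean?

In the example in the figure above, the key point is that none of the code or data of libvector.so is copied into prog21 (such copying is exactly what makes static libraries waste memory).
Instead, the linker copies some relocation and symbol table information about libvector.so, so that references to it can be resolved when prog21 is loaded or run.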

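Fully linked executable in memory

When the loader loads prog21 into memory, it checks whether the executable contains a .interp section to decide whether linking is already complete. Without a .interp section the file is fully linked and the loader passes control to the program. Otherwise the loader reads the path of the dynamic linker stored in .interp, then loads and runs the dynamic linker.
The dynamic linker learns from the .dynamic section which shared libraries are needed.
The dynamic linker performs the remaining relocations.
The dynamic linker passes control to the program. From then on the locations of the shared libraries are fixed and do not change while the program runs.

In the example in the figure above, the loader (execve on Linux) finds the .interp section in prog21, which contains the path of the dynamic linker (ld-linux.so on Linux), and runs it. The dynamic linker then performs the following relocations to finish linking:

Relocate the .text and .data of libc.so to some memory address.

Relocate the .text and .data of libvector.so to some memory address.

Relocate the symbol references in prog21 that refer to symbol definitions in libc.so and libvector.so.

Once all linking is done, control passes to the program and it starts executing.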

Linking Summary

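Linking is a technique for combining multiple object files into a single program. It consists of two steps:
- Symbol resolution
- Relocation
Linking can happen at different stages:
- compile time
- load time
- run time
Understanding linking helps you steer clear of nasty bugs and makes you a better programmer.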

tags: `CS:APP`

Slides: http://www.cs.cmu.edu/afs/cs/academic/class/15213-f19/www/lectures/14-linking.pdf

Lecture video: https://youtu.be/wJRpLEP6rHU

Further reading (to do)

Articles in tag "Linkers and Loaders" - Eli Bendersky's website
