---
tags: 你所不知道的 C 語言, 進階電腦系統理論與實作, NCKU Linux Kernel Internals, 作業系統
---

# Dynamic Linker, Linker and Executable Information, C Runtime Library (CRT)
contributed by <`RusselCK` >

###### tags: `RusselCK`

## [Dynamic Linker](https://hackmd.io/@sysprog/c-dynamic-linkage?type=view#你所不知道的-C-語言:動態連結器篇)

The term "lingua franca" (IPA [ˌlɪŋgwə ˈfræŋkə]) comes from 17th-century Italian for "Frankish language/accent" and later came to mean a bridging language.
* Modern English plays this role: people from different countries and cultures communicate through a shared English.
    * [How is English Used as a Lingua Franca Today?](https://www.altalang.com/beyond-words/how-is-english-used-as-a-lingua-franca-today/)
    * Beijing Mandarin $\leftrightarrow$ Taiwanese Mandarin $\leftrightarrow$ Taiwanese Hokkien
* For modern programming languages, the lingua franca is C.

Take Java as an example. Java has its virtual machine, and a Java virtual machine can even be written in Java (for example [Jikes RVM](http://www.jikesrvm.org/), [Maxine VM](https://en.wikipedia.org/wiki/Maxine_Virtual_Machine), [Graal VM](http://openjdk.java.net/projects/graal/)), yet anything that touches the operating system still has to go through C (or C++), including calls into libraries originally written in C/C++.

:::warning
* [P-code 與 Kenneth Bowles 教授](https://www.facebook.com/JservFans/posts/1711808435612150)
* [UCSD Pascal pioneer Ken Bowles has died](https://news.ycombinator.com/item?id=18161217)

- If the P-code interpreter in UCSD Pascal were written in C, it would use the C library. Could it then be linked with other programs written in C, so that each can use the other's functionality?
    * Yes, but the dynamic linking problem has to be solved first.
:::

### Doing naughty things with `LD_PRELOAD`

#### How do we count calls to malloc/free?

##### The simple approach (macros)

```c
#include <stdlib.h>
int malloc_count = 0, free_count = 0;
/* Comma expressions keep the macros usable wherever malloc()/free() were used */
#define MALLOC(x) (malloc_count++, malloc(x))
#define FREE(p) (free_count++, free(p))
```

:::danger
1. The source code must be rewritten to replace `malloc` with the `MALLOC` macro.
2. It does not work for C++ (`new` and `delete`)
    * even though the underlying [libstdc++](https://gcc.gnu.org/onlinedocs/libstdc++/) implements new and delete with malloc()/free().
3. Calls to malloc()/free() made inside the libraries the program uses (static or dynamic) cannot be tracked either.
:::

##### Interpositioning (using the dynamic linker)

* Taking GNU/Linux with [glibc](https://www.gnu.org/software/libc/) as the example, create the file **malloc_count.c**

```clike
#include <stddef.h>
#include <string.h>
#include <stdio.h>
#include <unistd.h> /* write() */
#include <dlfcn.h>

void *malloc(size_t size)
{
    char buf[64]; /* large enough for the message plus a 64-bit size */
    static void *(*real_malloc)(size_t) = NULL;
    if (real_malloc == NULL) {
        real_malloc = dlsym(RTLD_NEXT, "malloc");
    }
    /* format into a stack buffer and emit it with write(2) on stderr */
    snprintf(buf, sizeof(buf), "malloc called, size = %zu\n", size);
    write(2, buf, strlen(buf));
    return real_malloc(size);
}
```

The goal is to replace the original `malloc` with our new `malloc`.

:::info
* **`void *dlsym(void *handle, const char *symbol);`**
    * obtain **address of a symbol** in a shared object or executable
    * shared object: a shared library file (DLL-like), ...
    * executable: an executable file
* RTLD_NEXT
    * tells the dynamic linker that we want the address of `malloc` from the next shared library in the load order
:::

:::warning
Why does **malloc_count.c** not need `void *p = dlopen("libc.so.6", RTLD_LAZY);`?
* Because it is already open.
    * **malloc_count.c** is written in C, so it needs the C library anyway.

For the same reason, there is no need to close it explicitly, because `libc` is shared.
:::

* Compile and run:

```shell
$ gcc -D_GNU_SOURCE -shared -fPIC -o /tmp/libmcount.so malloc_count.c -ldl
$ LD_PRELOAD=/tmp/libmcount.so ls
```

This prints the argument of every `malloc()` call, and could even be extended to collect allocation statistics, all without changing the original source code.

:::info
* -shared
* -fPIC
* LD_PRELOAD
:::

:::success
[Flow walkthrough]
* When the LD_PRELOAD environment variable is set, glibc's dynamic linker ([ld.so](http://man7.org/linux/man-pages/man8/ld.so.8.html)) loads our `/tmp/libmcount.so` shared library **before** it loads and relocates `libc.so`.
* As a result, our malloc is loaded ahead of the malloc provided by `libc.so`.
* We still need the "real" malloc, otherwise nothing would work, so we use dlsym to fetch the malloc code from `libc.so`.
:::

* [glibc-abi](https://pagure.io/glibc-abi)
    * [application binary interface, ABI](https://zh.wikipedia.org/wiki/应用二进制接口)

#### Unrandomize (random numbers that are temporarily not random)

- [ ] [Dynamic linker tricks: Using LD_PRELOAD to cheat, inject features and investigate programs](https://rafalcieslak.wordpress.com/2013/04/02/dynamic-linker-tricks-using-ld_preload-to-cheat-inject-features-and-investigate-programs/)

```c
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

int main(void)
{
    srand(time(NULL));
    int i = 10;
    while (i--)
        printf("%d\n", rand() % 100);
    return 0;
}
```

* Create the file **unrandom.c**

```c
int rand(void)
{
    return 42; // the most random number in the universe
}
```

* Compile and run:

```shell
gcc -shared -fPIC unrandom.c -o unrandom.so
LD_PRELOAD=$PWD/unrandom.so ./random_nums
```

### Other uses of interpositioning
- Game cracking
- Run-time tracing
- sandboxing / software fault isolation (SFI)
- profiling
- Performance-optimized libraries (such as [TCMalloc](http://goog-perftools.sourceforge.net/doc/tcmalloc.html)).

Further reading:
* [Tutorial: Function Interposition in Linux](http://jayconrod.com/posts/23/tutorial-function-interposition-in-linux)
* [List of resources related to LD_PRELOAD](https://github.com/gaul/awesome-ld-preload)

`ld --wrap=symbol` is another option; see [How to wrap a system call (libc function) in Linux](http://samanbarghi.com/blog/2014/09/05/how-to-wrap-a-system-call-libc-function-in-linux/).

### Symbolism (what does `-Bsymbolic` do?)
- GNU ld has an option, `-Bsymbolic-functions`, that affects the behavior of LD_PRELOAD
>When creating a shared library, bind references to global function symbols to the definition within the shared library, if any. This option is only meaningful on ELF platforms which support shared libraries.

- [ ] [Symbolism and ELF files (or, what does -Bsymbolic do?)](https://blog.flameeyes.eu/2012/10/symbolism-and-elf-files-or-what-does-bsymbolic-do)

(Left as self-study.)

### ELF files ("No such file or directory" may not mean what you think)
- [ ] [where is ELF interpreter](https://github.com/imay/imay.github.io/blob/master/_posts/2014-11-02-linker-loader.md)
- [ ] [PatchELF](https://nixos.org/patchelf.html)

==In a dynamically linked environment, the ELF interpreter is in fact the dynamic linker!==
> Main binary specifies which loader to use

* Linux kernel source: [fs/binfmt_elf.c](http://lxr.free-electrons.com/source/fs/binfmt_elf.c)

```c=2
/*
 * linux/fs/binfmt_elf.c
 *
 * These are the functions used to load ELF format executables as used
 * on SVr4 machines.  Information on the format may be found in the book
 * "UNIX SYSTEM V RELEASE 4 Programmers Guide: Ansi C and Programming Support
 * Tools".
 *
 * Copyright 1993, 1994: Eric Youngdale (ericy@cais.com).
 */
```

```c=690
static int load_elf_binary(struct linux_binprm *bprm)
```

```clike=751
		if (elf_ppnt->p_type == PT_INTERP) {
			/* This is the program interpreter used for
			 * shared libraries - for now assume that this
			 * is an a.out format binary
			 */
			...
			elf_interpreter = kmalloc(elf_ppnt->p_filesz,
						  GFP_KERNEL);
			if (!elf_interpreter)
				goto out_free_ph;

			retval = kernel_read(bprm->file, elf_ppnt->p_offset,
					     elf_interpreter,
					     elf_ppnt->p_filesz);
```

The program interpreter works out which shared libraries the program needs.

:::info
* [UNIX System V](https://zh.wikipedia.org/wiki/UNIX_System_V)
    * System V is an enhancement of AT&T's first **commercial** UNIX release
:::

Further reading:
- [ ] [Hacking Your ELF For Fun And Profit](http://mgalgs.github.io/2013/05/10/hacking-your-ELF-for-fun-and-profit.html)
- [ ] 《[Binary Hacks](https://ncku365-my.sharepoint.com/:b:/g/personal/p76091624_ncku_edu_tw/Ef-hwrKrKG5MvCwSIf688HUBiP3Ot2qzjzne0K1ttGH7BQ?e=o35u4p)》
    * [ELF Hacks](https://maskray.me/blog/2015-03-26-elf-hacks)

### Review: compiler optimization principles

:dart: [編譯器最佳化原理 筆記](https://hackmd.io/FspGCG57QfO5u6oaIYj0uQ?both#編譯器與最佳化原理、案例分析)

* Compilation units
* LTO (Link Time Optimization)

Video 1:16:00 ~ 1:37:00

### 《[Modern C](https://modernc.gforge.inria.fr/)》

* [C/C++ 中的 static, extern 的變數](https://medium.com/@alan81920/c-c-中的-static-extern-的變數-9b42d000688f)

Selected rules

[ **Rule 4.22.2.1** ] File scope static const objects may be replicated in all compilation units that use them. (Page 169)
>In English, "replicate" means to copy or to repeat.

==Once an object is declared `static const`, the compiler is free to apply a wider range of optimizations.==

```c
static const int x = 42; // the compiler may replace every `x` in the program with `42`
```

[ **Rule 4.22.2.2** ] File scope static const objects cannot be used inside inline functions with external linkage.
Another way is to declare them

```clike
extern listElem const singleton;
```

and to define them in one of the compilation units:

```clike
listElem const singleton = { 0 };
```

This second method has the big **disadvantage** that **the value of the object is not available in the other units that include the declaration**. Therefore we may miss some opportunities for the compiler to optimize our code.

Consider the following code:

```clike
inline listElem *listElem_init(listElem *el)
{
    if (el)
        *el = singleton;
    return el;
}
```

If the compiler already knows the value of `singleton`, the assignment does not have to reload that value from memory each time, and the call sites of `listElem_init()` can be more compact, which helps both performance and program tracing.

### Symbol Visibility

By default, every symbol (function or variable) that is not `static` is accessible from other compilation units. This behavior is called "**export**".
* Once a symbol is exported, it can be subject to the interpositioning described above, which may well lead to unexpected behavior.

The fix is to set symbol visibility properly.

:::info
==Both gcc and clang support the [visibility](https://gcc.gnu.org/wiki/Visibility) attribute and the **`-fvisibility`** compiler option==, which sets the default for an entire object file:
* **default** : leaves visibility unchanged
* **hidden** : affects visibility much like the `static` storage-class specifier does. The symbol is not placed in the dynamic symbol table, so other shared libraries and executables cannot see it (unlike `static`, it can still be referenced from other object files linked into the same shared object).
:::

> [Standard Template Library, STL](https://zh.wikipedia.org/wiki/标准模板库)
> 《[The Annotated STL Source](http://read.pudn.com/downloads190/ebook/894697/STL-InsightByFU.pdf)》 by 侯傑

Example: libchewing (新酷音) [ [source](https://github.com/chewing/libchewing/blob/master/include/global.h) ]

```clike
#if (__GNUC__ > 3) && (defined(__ELF__) || defined(__PIC__))
#    define CHEWING_API __attribute__((__visibility__("default")))
#    define CHEWING_PRIVATE __attribute__((__visibility__("hidden")))
#else
#    define CHEWING_API
#    define CHEWING_PRIVATE
#endif
```

* [C 语言中__attribute__的作用](https://winddoing.github.io/post/12087.html)

Experiment:
- [ ] [Why symbol visibility is good](https://www.technovelty.org/code/why-symbol-visibility-is-good.html)

* Consider the following code: (syms.c)

```clike
static int local(void) { }

int global(void) { }

int __attribute__((weak)) weak(void) { }
```

:::info
* **`__attribute__((weak))`**
    * If the linker finds that another object defines a (strong) symbol with the same name, the other definition is used instead of this one.
:::

* Compile and inspect: (Symbol table)

```shell
$ gcc -o syms.o -c syms.c
$ LC_ALL=C readelf --syms syms.o

Symbol table '.symtab' contains 11 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
     0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND
     1: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS syms.c
     2: 0000000000000000     0 SECTION LOCAL  DEFAULT    1
     3: 0000000000000000     0 SECTION LOCAL  DEFAULT    2
     4: 0000000000000000     0 SECTION LOCAL  DEFAULT    3
     5: 0000000000000000     7 FUNC    LOCAL  DEFAULT    1 local
     6: 0000000000000000     0 SECTION LOCAL  DEFAULT    5
     7: 0000000000000000     0 SECTION LOCAL  DEFAULT    6
     8: 0000000000000000     0 SECTION LOCAL  DEFAULT    4
     9: 0000000000000007     7 FUNC    GLOBAL DEFAULT    1 global
    10: 000000000000000e     7 FUNC    WEAK   DEFAULT    1 weak
```

Compare with the earlier malloc_count:

```shell
$ readelf --syms /tmp/libmcount.so | grep malloc
    15: 00000000000007c0   163 FUNC    GLOBAL DEFAULT   12 malloc
    35: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS malloc_count.c
    36: 0000000000201050     8 OBJECT  LOCAL  DEFAULT   24 real_malloc.3854
    61: 00000000000007c0   163 FUNC    GLOBAL DEFAULT   12 malloc
```

* Modify malloc_count.c so that the definition becomes:

```clike
__attribute__((visibility("hidden"))) void *malloc(size_t size)
{
    /* ... the rest is unchanged ... */
}
```

Now `LD_PRELOAD=/tmp/libmcount.so ls` has no effect.
In other words, our `malloc` has become local and no longer affects other shared libraries and executables.

* Inspect again:

```shell
$ readelf --syms /tmp/libmcount.so | grep malloc
    35: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS malloc_count.c
    36: 0000000000201050     8 OBJECT  LOCAL  DEFAULT   24 real_malloc.3854
    46: 00000000000007a0   163 FUNC    LOCAL  DEFAULT   12 malloc
```

The binding (Bind column) of `malloc` has changed from GLOBAL to LOCAL, and only one `malloc` entry remains.

### Dynamic Linking support

- [ ] [Linking](http://www.scs.stanford.edu/15wi-cs140/notes/linkers.pdf) (the version used in the lecture video)
- [ ] [Linking](http://www.scs.stanford.edu/18wi-cs140/notes/linking.pdf) (new)

![](https://i.imgur.com/kL6uJkE.png)
* `f.o` and `c.o` reference each other at link time

![](https://i.imgur.com/SZmL379.png)
* For dynamic linking, we focus on load/run time, which comes after compile time

![](https://i.imgur.com/aJevea4.png)
* At this point the real address of `printf` is not known
    * because the library that provides `printf` has not been referenced yet

![](https://i.imgur.com/mYBYuGD.png)
> The real address of `printf` is still unknown at this point

![](https://i.imgur.com/pM84yO3.png)
* Once linking has reached the shared libraries, glibc and so on, the exact address is known (Relocation)

### Shared Libraries
* Make upgrading, bug fixing, and security patches easier
* Reduces total code size installed
* Plugins

![](https://i.imgur.com/8vmvH1X.png)
* `libc.a` appears many times

:::info
* [ar](https://linux.die.net/man/1/ar)
    * create, modify, and extract from **archives**
    * an archive holds **relocatable objects**
> `printf.o`, `scanf.o` ...
:::

![](https://i.imgur.com/xBCjMtv.png)

:::info
* -fpic
    * allows addresses to be adjusted at link time

![](https://i.imgur.com/GAZ515M.png)
:::

### How are symbols loaded dynamically? (PLT, GOT)

![](https://i.imgur.com/nLzi2FZ.png)
* A table lookup (PLT)
* with another table inside the table (GOT)

:::info
* **Procedure Linkage Table, PLT**
    * entry for the function so that the dynamic loader can re-direct the function call.
* **Global Offset Table, GOT**
    * The PLT is a trampoline which gets the correct address of the function being called (from the _global offset table_, GOT) and bounces the function call to the right place.
![](https://i.imgur.com/lfKuo8e.png)
:::

![](https://i.imgur.com/59c2exc.png)
* The `&dlfixup` here is a placeholder address (Lazy Dynamic Linking)
* After the program is loaded, the real address is looked up and filled in **only when it is needed**

![](https://i.imgur.com/sGmj8uf.png)

The lecture goes on to cover security-related material.

### Other must-reads
- [ ] [Computer Science from the Bottom Up](https://www.bottomupcs.com/) (an e-book by [Ian Wienand](https://github.com/ianw))
    * [Chapter 9\. Dynamic Linking](https://www.bottomupcs.com/chapter08.xhtml)
- [ ] [C語言編程透視](https://github.com/tinyclub/open-c-book) (e-book)
    * [Chap 4](https://github.com/tinyclub/open-c-book/blob/master/zh/chapters/02-chapter4.markdown)
- [ ] [Anatomy of Linux dynamic libraries](http://www.ibm.com/developerworks/library/l-dynamic-libraries/)

The original shared notes contain more further reading and real-world applications.

## [Linker and Executable Information](https://hackmd.io/@sysprog/c-linker-loader?type=view#你所不知道的-C-語言:連結器和執行檔資訊)

This part looks at how gold (yes, there really is a linker named "gold") lets the Linux kernel benefit from Link-Time Optimization (LTO), producing a smaller and more efficient kernel image.

### Embedding a binary file in an executable

#### Example I:

Suppose there is a binary file named `blob`. The `xxd` tool comes in handy:

```shell
$ xxd --include blob
```

:::info
* $ [xxd](https://linux.die.net/man/1/xxd)
    * make a hexdump or do the reverse
    * -i
        * include
        * output in C include file style
:::

To make the explanation easier, first create a file and record its length:

```shell
$ uname -a > blob
$ uname -a | wc -c    # check the string length (bytes)
105
```

:::info
* $ [uname](https://linux.die.net/man/1/uname)
    * print system information
    * -a
        * print all information
:::

* Put `blob` into an ELF file:

```shell
$ ld -s -r -b binary -o blob.o blob
```

:::info
- [ ] [10分鐘讀懂 linker scripts](https://blog.louie.lu/2016/11/06/10%E5%88%86%E9%90%98%E8%AE%80%E6%87%82-linker-scripts/)
- [ ] [Linker Script 初探 - GNU Linker ld 手冊略讀](http://wen00072.github.io/blog/2014/03/14/study-on-the-linker-script/)
* $ [ld](https://linux.die.net/man/1/ld)
    * The GNU linker
    * ld combines a number of object and archive files, **relocates** their data and ties up symbol references.
    * Usually the linker is invoked with at least one object file, but you can specify other forms of binary input files
    * -b
        * input-format
        * here it specifies that the input is a raw binary file (binary)
    * -o
        * output
:::

See which symbols the generated `blob.o` contains:

```shell
$ objdump -t blob.o

blob.o:     file format elf64-x86-64

SYMBOL TABLE:
0000000000000000 l    d  .data  0000000000000000 .data
0000000000000069 g       .data  0000000000000000 _binary_blob_end
0000000000000000 g       .data  0000000000000000 _binary_blob_start
0000000000000069 g       *ABS*  0000000000000000 _binary_blob_size
```

How to link it into a program:
* Write a test program (`test.c`):

```clike=
#include <stdio.h>
extern char _binary_blob_start[], _binary_blob_end[];
int main(void)
{
    printf("Data: %p..%p (%zu bytes)\n", (void *) _binary_blob_start,
           (void *) _binary_blob_end,
           (size_t) (_binary_blob_end - _binary_blob_start));
    return 0;
}
```

:::warning
* Line 2: `_binary_blob_start` and `_binary_blob_end` are symbols defined in `blob.o`, not in `test.c`, so they have to be declared with `extern`
:::

* Compile, link, and run:

```shell
$ gcc test.c blob.o -o test
$ ./test
Data: 0x55ed5ed15010..0x55ed5ed15079 (105 bytes)
```

This matches the `105` bytes measured above.

:::info
* $ [strings](https://linux.die.net/man/1/strings)
    * print the strings of printable characters in files.
:::

Going back to the `blob.o` generated earlier:

```shell
$ readelf -S blob.o
There are 5 section headers, starting at offset 0x180:

Section Headers:
  [Nr] Name              Type             Address           Offset
       Size              EntSize          Flags  Link  Info  Align
  [ 0]                   NULL             0000000000000000  00000000
       0000000000000000  0000000000000000           0     0     0
  [ 1] .data             PROGBITS         0000000000000000  00000040
       0000000000000069  0000000000000000  WA       0     0     1
  [ 2] .symtab           SYMTAB           0000000000000000  000000b0
       0000000000000078  0000000000000018           3     2     8
  [ 3] .strtab           STRTAB           0000000000000000  00000128
       0000000000000037  0000000000000000           0     0     1
  [ 4] .shstrtab         STRTAB           0000000000000000  0000015f
       0000000000000021  0000000000000000           0     0     1
```

:::info
* $ [readelf](https://linux.die.net/man/1/readelf)
    * Displays information about ELF files.
    * -S
        * sections, section-headers
        * Displays the information contained in the file's section headers
* A name that starts with `.` denotes a section
:::

#### Example II: ( [objcopy_to_carray](https://github.com/vogelchr/objcopy_to_carray) )

* Look at the **Makefile**

```shell=20
passwd.o : /etc/passwd
	objcopy -I binary $(OBJCOPY_ARCH) \
		--rename-section .data=.rodata,alloc,load,readonly,data,contents \
		--add-section ".note.GNU-stack"=/dev/null \
		--set-section-flags ".note.GNU-stack"=contents,readonly \
		$< $@ || (rm -f $@ ; exit 1)
```

This copies the contents of /etc/passwd and produces **passwd.o**.

:::info
* $ [objcopy](https://linux.die.net/man/1/objcopy)
    * **copy** and translate **object files**
    * objcopy uses the **GNU BFD** Library to read and write the object files.
:::

- [ ] [`readelf` vs. `objdump`: why are both needed](https://stackoverflow.com/questions/8979664/readelf-vs-objdump-why-are-both-needed)

* **testme.c**

```c
#include <unistd.h>
#include <stdlib.h>
#include <stdio.h>

/* Only the addresses of these linker-generated symbols are meaningful */
extern char _binary__etc_passwd_size;
extern char _binary__etc_passwd_start;
extern char _binary__etc_passwd_end;

int main(int argc, char **argv)
{
    (void) argc;
    (void) argv;

    /* try one of the following two lines */
    // size_t size = (size_t) &_binary__etc_passwd_size;
    size_t size = &_binary__etc_passwd_end - &_binary__etc_passwd_start;

    printf("Dumping /etc/passwd, in memory @%p, size is %zu.\n",
           (void *) &_binary__etc_passwd_start, size);
    write(1, &_binary__etc_passwd_start, size); // print the contents
    exit(0);
}
```

### Init hooks/script (letting specific code run early during kernel startup)

- [ ] [10分鐘讀懂 linker scripts](https://blog.louie.lu/2016/11/06/10%E5%88%86%E9%90%98%E8%AE%80%E6%87%82-linker-scripts/)
- [ ] [Linker Script 初探 - GNU Linker ld 手冊略讀](http://wen00072.github.io/blog/2014/03/14/study-on-the-linker-script/)

The [F9 microkernel](https://github.com/f9micro/f9-kernel) has a feature called [Init hooks](https://github.com/f9micro/f9-kernel/blob/master/Documentation/init-hooks.txt), which **lets specific code run early during kernel startup**.

Usage:

```clike
#include <init_hook.h>
#include <debug.h>

void hook_test(void)
{
    dbg_printf(DL_EMERG, "hook test\n");
}
INIT_HOOK(hook_test, INIT_LEVEL_PLATFORM - 1)
```

* A GNU extension is used to choose the ELF section:
[include/init_hook.h](https://github.com/f9micro/f9-kernel/blob/master/include/init_hook.h):

```clike
#define INIT_HOOK(_hook, _level)                    \
    const init_struct _init_struct_##_hook          \
    __attribute__((section(".init_hook"))) = {      \
        .level = _level,                            \
        .hook = _hook,                              \
        .hook_name = #_hook,                        \
    };
```

Every symbol defined through `INIT_HOOK`, no matter where it is used, is placed in `.init_hook`.

* [platform/stm32f4/f9.ld](https://github.com/f9micro/f9-kernel/blob/master/platform/stm32f4/f9.ld) reserves the space for `.init_hook`:

```clike
init_hook_start = .;
KEEP(*(.init_hook))
init_hook_end = .;
```

`init_hook_start` and `init_hook_end` bracket the `.init_hook` region.

* Finally, [kernel/init.c](https://github.com/f9micro/f9-kernel/blob/master/kernel/init.c) makes everything clear:

```clike
extern const init_struct init_hook_start[];
extern const init_struct init_hook_end[];
static unsigned int last_level = 0;

int run_init_hook(unsigned int level)
{
    unsigned int max_called_level = last_level;

    for (const init_struct *ptr = init_hook_start; ptr != init_hook_end; ++ptr)
        if ((ptr->level > last_level) && (ptr->level <= level)) {
            max_called_level = MAX(max_called_level, ptr->level);
            ptr->hook();
        }

    last_level = max_called_level;
    return last_level;
}
```

### The linker plays an important role in software optimization

In **cloud computing** (with a huge number of servers), every (small) program matters, and obvious performance bottlenecks are hard to find.
* Linker-level optimization is a good place to start
    * a program that optimizes programs
* Even a 1% improvement helps a lot

:dart: [Linker 筆記](https://hackmd.io/FspGCG57QfO5u6oaIYj0uQ#Linker)
* The linker command line is a language in its own right
* GNU ld linker
* Google gold linker

### Teaching material: Computer Science from the Bottom Up
- [ ] [Computer Science from the Bottom Up](https://www.bottomupcs.com/) (an e-book by [Ian Wienand](https://github.com/ianw))
    * [Chapter 7.
      The Toolchain](https://www.bottomupcs.com/linker.xhtml)

Terminology
* [name decoration](https://zh.wikipedia.org/wiki/名字修饰)

Symbol handling
* [compilation example](https://www.bottomupcs.com/compilation_example.xhtml)
* UND : undefined

**hello.c**
```c
#include <stdio.h>
#include <stdlib.h> /* exit() */

/* We need a prototype so the compiler knows what types function() takes */
int function(char *input);

/* Since this is static, we can define it in both hello.c and function.c */
static int i = 100;

/* This is a global variable */
int global = 10;

int main(void)
{
    /* function() should return the value of global */
    int ret = function("Hello, World!");
    exit(ret);
}
```

**function.c**
```c
#include <stdio.h>

static int i = 100;

/* Declared as extern since defined in hello.c */
extern int global;

int function(char *input)
{
    printf("%s\n", input);
    return global;
}
```

Compiling
```shell
$ gcc -S hello.c
$ gcc -S function.c
```

Assembly
```shell
$ as -o function.o function.s
$ as -o hello.o hello.s
$ ls
function.c  function.o  function.s  hello.c  hello.o  hello.s
```

```shell
$ readelf --symbols ./hello.o

Symbol table '.symtab' contains 15 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
     0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND
     1: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS hello.c
     2: 0000000000000000     0 SECTION LOCAL  DEFAULT    1
     3: 0000000000000000     0 SECTION LOCAL  DEFAULT    3
     4: 0000000000000000     0 SECTION LOCAL  DEFAULT    4
     5: 0000000000000000     0 SECTION LOCAL  DEFAULT    5
     6: 0000000000000000     4 OBJECT  LOCAL  DEFAULT    5 i
     7: 0000000000000000     0 SECTION LOCAL  DEFAULT    6
     8: 0000000000000000     0 SECTION LOCAL  DEFAULT    7
     9: 0000000000000000     0 SECTION LOCAL  DEFAULT    8
    10: 0000000000000000     0 SECTION LOCAL  DEFAULT   10
    11: 0000000000000004     4 OBJECT  GLOBAL DEFAULT    5 global
    12: 0000000000000000    96 FUNC    GLOBAL DEFAULT    1 main
    13: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND function
    14: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND exit
```
* In `hello.o`, `function` is undefined

```shell
$ readelf --symbols ./function.o

Symbol table '.symtab' contains 14 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
     0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND
     1: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS function.c
     2: 0000000000000000     0 SECTION LOCAL  DEFAULT    1
     3: 0000000000000000     0 SECTION LOCAL  DEFAULT    3
     4: 0000000000000000     0 SECTION LOCAL  DEFAULT    4
     5: 0000000000000000     0 SECTION LOCAL  DEFAULT    5
     6: 0000000000000000     4 OBJECT  LOCAL  DEFAULT    5 i
     7: 0000000000000000     0 SECTION LOCAL  DEFAULT    6
     8: 0000000000000000     0 SECTION LOCAL  DEFAULT    7
     9: 0000000000000000     0 SECTION LOCAL  DEFAULT    8
    10: 0000000000000000     0 SECTION LOCAL  DEFAULT   10
    11: 0000000000000000   128 FUNC    GLOBAL DEFAULT    1 function
    12: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND printf
    13: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND global
```
* In `function.o`, `function` is defined

Linking
:::info
we can spy on what gcc is doing under the hood with the `-v` (verbose) flag.
:::

```shell
/usr/lib/gcc-lib/ia64-linux/3.3.5/collect2 -static /usr/lib/gcc-lib/ia64-linux/3.3.5/../../../crt1.o /usr/lib/gcc-lib/ia64-linux/3.3.5/../../../crti.o /usr/lib/gcc-lib/ia64-linux/3.3.5/crtbegin.o -L/usr/lib/gcc-lib/ia64-linux/3.3.5 -L/usr/lib/gcc-lib/ia64-linux/3.3.5/../../.. hello.o function.o --start-group -lgcc -lgcc_eh -lunwind -lc --end-group /usr/lib/gcc-lib/ia64-linux/3.3.5/crtend.o /usr/lib/gcc-lib/ia64-linux/3.3.5/../../../crtn.o
```
* crtbegin.o / crtend.o
* -lgcc / -lgcc_eh (exception handling)

The first thing you notice is that a program called `collect2` is being called. This is a simple wrapper around `ld` that is used internally by gcc.
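
To make the `UND` entries in the two symbol tables above more concrete, here is a toy model of symbol resolution. It is only a sketch: `toy_sym`, `toy_find_definition`, and `toy_count_unresolved` are hypothetical names made up for this note, the struct is a drastic simplification of a real ELF symbol entry, and real linkers such as `ld` do far more (weak symbols, archives, relocation). The two tables are transcribed from the `readelf` output of `hello.o` and `function.o`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified symbol entry (not the real Elf64_Sym layout). */
enum toy_bind { TOY_LOCAL, TOY_GLOBAL };

struct toy_sym {
    const char *name;
    enum toy_bind bind;
    int defined; /* 0 plays the role of "Ndx == UND" in readelf output */
};

/* Return the index of a GLOBAL definition of `name` in `syms`, or -1. */
int toy_find_definition(const struct toy_sym *syms, size_t n, const char *name)
{
    for (size_t i = 0; i < n; i++)
        if (syms[i].defined && syms[i].bind == TOY_GLOBAL &&
            strcmp(syms[i].name, name) == 0)
            return (int) i;
    return -1;
}

/* Count the undefined references in `obj` that `other` cannot satisfy. */
size_t toy_count_unresolved(const struct toy_sym *obj, size_t n_obj,
                            const struct toy_sym *other, size_t n_other)
{
    assert(obj != NULL && other != NULL);
    size_t unresolved = 0;
    for (size_t i = 0; i < n_obj; i++)
        if (!obj[i].defined &&
            toy_find_definition(other, n_other, obj[i].name) < 0)
            unresolved++;
    return unresolved;
}

/* Named symbols of hello.o, taken from the readelf table above. */
const struct toy_sym hello_syms[] = {
    {"i", TOY_LOCAL, 1},         {"global", TOY_GLOBAL, 1},
    {"main", TOY_GLOBAL, 1},     {"function", TOY_GLOBAL, 0},
    {"exit", TOY_GLOBAL, 0},
};

/* Named symbols of function.o, taken from the readelf table above. */
const struct toy_sym function_syms[] = {
    {"i", TOY_LOCAL, 1},         {"function", TOY_GLOBAL, 1},
    {"printf", TOY_GLOBAL, 0},   {"global", TOY_GLOBAL, 0},
};
```

In this model, `hello.o` finds `function` in `function.o` and `function.o` finds `global` in `hello.o`, while `exit` and `printf` stay unresolved until the C library (`-lc` in the `collect2` command line) is added. The two LOCAL `i` objects never take part in resolution, which is why both files can define them.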
### Startup Time Issue

- [ ] [Optimizing large applications](http://www.ucw.cz/~hubicka/slides/labs2013.pdf) (2013)

Firefox
- Mozilla
- gecko
- [XUL:Lib XUL](https://wiki.mozilla.org/XUL:Lib_XUL)

- [ ] [使用 Prelink 加速程序啓動](https://www.twblogs.net/a/5c7d7b8ebd9eee114211f91d) (2004)
- [ ] .gnu.hash (2006)

* The original execution path

![](https://i.imgur.com/nueVA8l.png)
![](https://i.imgur.com/gjXPT07.png)
* At startup, the program is kept busy in the sections related to dynamic linking
    * `.data.rel.ro` and `.rela.dyn`

- [ ] [elfhack](https://wiki.mozilla.org/Elfhack) (2010)

After the adjustment
- [ ] [Improving libxul startup I/O by hacking the ELF format](https://glandium.org/blog/?p=1177)

![](https://i.imgur.com/9k8f43r.png)
* At startup, the program gets a chance to run the sections that hold its actual code
    * `.text`

- 20% of the Firefox libxul image is relocations
    - **relocations take time**
- ELF relocations are not terribly size optimized
    - **relocations take space**
    * REL relocations on x86 take 8 bytes
    * RELA relocations on x86-64 take 24 bytes
- Elfhack compresses the relocations
    * ELFhack removes IP relative ELF relocations and stores them in a compact custom format. It handles sequences of IP relative relocations in **vtables** well.
    * After ELF linking, ELFhack linking completes the process.
    * ELFhack is a general tool but is not compatible with the -z relro security feature.
- `7.5` MB of relocations → `0.3` MB.

- [ ] Feedback directed reordering (2013)
- [ ] [valgrind](https://zh.wikipedia.org/wiki/Valgrind)
- [ ] GCC FDO

![](https://i.imgur.com/JOGlITd.png)

- [ ] [AutoFDO](https://gcc.gnu.org/wiki/AutoFDO/Tutorial)
- [ ] [AutoFDO: Automatic feedback-directed optimization for warehouse-scale applications](https://static.googleusercontent.com/media/research.google.com/zh-TW//pubs/archive/45290.pdf)

### Linktime optimization (LTO)
* First released in GCC 4.5.
* Needs more CPU and memory resources
    - but has a chance of producing a more efficient program

![](https://i.imgur.com/qtxShw3.png)

- [ ] Linktime optimization in GCC (2014)
    - [ ] [part 1 - brief history](http://hubicka.blogspot.com/2014/04/linktime-optimization-in-gcc-1-brief.html)
        * [Tree-SSA](http://gcc.gnu.org/projects/tree-ssa/)
    - [ ] [part 2 - Firefox](http://hubicka.blogspot.com/2014/04/linktime-optimization-in-gcc-2-firefox.html)
        * contains charts comparing various combinations of optimizations
    - [ ] [part 3 - LibreOffice](http://hubicka.blogspot.com/2014/09/linktime-optimization-in-gcc-part-3.html)
        * contains charts comparing various combinations of optimizations

### [LLD - The LLVM Linker](http://lld.llvm.org)
* LLD is a drop-in replacement for the GNU linkers that accepts the same command line arguments and linker scripts as GNU.

- [ ] [What makes LLD so fast?](https://ncku365-my.sharepoint.com/:b:/g/personal/p76091624_ncku_edu_tw/EYeH1p1K7stPq_LixdYyH1MB_aWvA0ru9dBU_FIR3bVf8g?e=cAEXmk) / [WebM recording](https://video.fosdem.org/2019/K.4.201/llvm_lld.webm)
- [ ] [Linkers and Loaders](https://wh0rd.org/books/linkers-and-loaders/linkers_and_loaders.pdf)

* Peter Smith claims that [lld](https://lld.llvm.org/) is 2 to 3 times faster than the GNU gold linker, and 5 to 10 times faster than the standard `ld.bfd`

![](https://i.imgur.com/577GpRA.png)

### Further reading
- [ ] [The missing link: explaining ELF static linking, semantically](http://dominic-mulligan.co.uk/wp-content/uploads/2011/08/oopsla-elf-linking-2016.pdf)
> In the C programming language, a simple program such as 'hello, world!' exercises very few features of the language, and can be compiled even by a toy compiler. However, for a linker, even the smallest C program amounts to a complex job, since it links with the C library—one of the most complex libraries on the system, in terms of the linker features it exercises.
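* formal proof

### Use cases in the Linux kernel
- [ ] [Shrinking the kernel with link-time garbage collection](https://lwn.net/Articles/741494/)
    * Removes unused symbols
- [ ] [Shrinking the kernel with link-time optimization](https://lwn.net/Articles/744507/)
    * [STM32](https://en.wikipedia.org/wiki/STM32)
        * no MMU
        - [ ] [STM32F429](http://wiki.csie.ncku.edu.tw/embedded/STM32F429)
    * Dead-code elimination
    * LTO and the kernel
- [ ] [Shrinking the kernel with an axe](https://lwn.net/Articles/746780/)

### Other cases
* [tramp-test](https://github.com/ncultra/tramp-test)
```shell
$ ./dis.sh
$ cat tramp_test.o.asm
$ cat trampoline.o.asm
```
* [libelfmaster](https://github.com/elfmaster/libelfmaster): Secure ELF parsing/loading library for forensics reconstruction of malware, and robust reverse engineering tools
* [dt_infect](https://github.com/elfmaster/dt_infect): ELF Shared library injector using DT_NEEDED precedence infection. Acts as a permanent LD_PRELOAD
* [How the GNU C Library handles backward compatibility](https://developers.redhat.com/blog/2019/08/01/how-the-gnu-c-library-handles-backward-compatibility/)

## [C Runtime Library (CRT)](https://hackmd.io/@sysprog/c-runtime?type=view#你所不知道的-C-語言-執行階段程式庫-CRT)
Have you ever wondered how `int main(int argc, char *argv[])` works behind the scenes in a C program?
If a C program "starts executing from main()", where does the content of argv[] come from?
And when main finally returns a value, what processing lets the operating system learn the result of the program?
How do functions such as `atexit()` guarantee that the registered functions run during the termination phase of a C program?

:::info
- [ ] [使用 atexit() 讓程式被關閉時做對應的動作](https://kheresy.wordpress.com/2013/11/28/c-atexit/) (using atexit() to run actions when a program exits)
    * [`int atexit(void (*function)(void));`](https://linux.die.net/man/3/atexit)
        * register a function to be called at **normal process termination**
            * e.g. `return 0;` from `main()`
        * Functions so registered are called in the reverse order of their registration; no arguments are passed.
:::

- [ ] [The Development of the C Language](https://www.bell-labs.com/usr/dmr/www/chist.html)
    The slow process of standardizing the C language
- [ ] [Re: [問卦] 寫程式語言的語言是怎麼來的?](https://disp.cc/b/163-aR41) (PTT thread: where does the language used to write programming languages come from?)
- [ ] [Re: [新聞] 她會個屁程式設計! 維密超模驚人簡歷](https://pttweb.tw/s/38ewP) (PTT thread on a news story about a Victoria's Secret model's programming résumé)

### First, what the Microsoft documentation says
- [ ] [DLL 和 Visual C++ 執行階段程式庫行為](https://docs.microsoft.com/zh-tw/cpp/build/run-time-library-behavior?view=vs-2017) (DLLs and Visual C++ run-time library behavior)
    * When you build a dynamic-link library (DLL), the linker by default includes the Visual C++ run-time library (**VCRuntime**)
    * VCRuntime contains the code required to initialize and terminate a C/C++ executable.
    * The VCRuntime code provides an internal ==DLL entry-point function==
        * The `_DllMainCRTStartup` function performs essential tasks such as stack buffer security setup, C run-time library (CRT) initialization and termination, and calls to ==constructors and destructors==
        * `_DllMainCRTStartup` also calls hook functions for other libraries such as WinRT, MFC, and ATL so that they can perform their own initialization and termination. Without this initialization, the CRT and other libraries, as well as static variables, would be left in an uninitialized state.
        * The same `VCRuntime` internal initialization and termination routines are called whether your DLL uses a **statically linked CRT** or a **dynamically linked CRT DLL**.

:::info
* [Constructors and Destructors in C++](https://www.studytonight.com/cpp/constructors-and-destructors-in-cpp.php)
    * When defining a class, you can use a constructor to initialize an object; before the object releases its resources, you can use a destructor to do cleanup work, such as freeing dynamically allocated memory, saving files, or writing log files.
:::

- [ ] [How To Use the C Run-Time](https://support.microsoft.com/en-us/help/94248/how-to-use-the-c-run-time)

:::info
* [**`perror()`**](https://man7.org/linux/man-pages/man3/perror.3.html)
    * print a system error message
    * produces a message on standard error describing the last error encountered during a call to a system or library function.
:::

- [ ] [MSVC與CRT的恩怨情仇](https://www.cnblogs.com/shijingjing07/p/5509640.html) (the love-hate history between MSVC and the CRT)
- [ ] [深入淺出 MFC](https://wizardforcel.gitbooks.io/jjhou-mfc/content/0.html) (Dissecting MFC) by J.J. Hou (侯捷)

### Reviewing the compilation flow
When you run `$ gcc -o hello hello.c`, the following steps take place:
1. Pre-process the given source code: remove the comments and apply other transformations, such as expanding C macros
2. Check whether the program's syntax actually follows the C/C++ rules; if it does not, the compiler reports a diagnostic
3. Translate the source code into assembly language, which is very close to machine code but still within what humans can read
4. Translate the assembly language into machine code; yes, the machine code here is the bits and bytes people often mention, that is, specific sequences of 0s and 1s
5. Check that the function calls and global variables used in the program are correct; for example, calling a function that does not exist makes the toolchain report a diagnostic
6. If the program is built from several source files, the toolchain combines them
7. The toolchain produces what the system's run-time loader needs to load the program into memory and execute it
8. Finally, the resulting executable is stored at the specified location

"Compilation" usually refers to steps 1 through 4 above, and the rest is called "linking". Step 1 is sometimes singled out as "pre-processing", and the transition from step 3 to step 4 (assembly language to machine code) as "assembling".

![](https://i.imgur.com/Ic9vGte.png)

Consider how the following program executes: (**`hello.c`**)
```clike
int main() { return 1; }
```
* Compile and run
```shell
$ gcc -o hello hello.c
$ ./hello
```
Apparently nothing at all happens (or rather, there is no visible I/O)
* Use `echo $?` to obtain the program's return value
```shell
$ echo $?
1
```
This **`hello`** has already finished running, so how is the return value preserved?
When the program ends, the CRT automatically calls `exit()`, and the `1` given to `return` becomes the **exit status**

:::info
* [**`void _exit(int status);`**](https://man7.org/linux/man-pages/man2/exit.2.html)
    * terminate the calling process
    * These functions do not return.
    * The value `status & 0xFF` is returned to the parent process as the **process's exit status**, ...
:::

:::warning
* When a program goes wrong, `atexit()` can be used to check whether the problem lies in the functions being used
:::

### Why is a C runtime needed?
One important difference between C and most other programming languages is that C was designed to meet the needs of system programs
* Operating system kernels
* A family of utility programs, such as ls, find, grep, and so on

Setting aside features such as pointer operations, which map almost directly onto assembly instructions, C needs a runtime in the following places:
1. `int main() { return 1; }`: after `main()` finishes, **some code has to pass the exit code to the operating system**; this part lives in `crt0`
2. **exception handling**: make no mistake, C does have this feature, only it is reached through the `setjmp` and `longjmp` functions, which need an additional library (such as libgcc) to help handle the stack
3. **arithmetic operations**: especially when the hardware has no **FPU**, [libgcc](https://gcc.gnu.org/onlinedocs/gccint/Libgcc.html) is needed to provide floating-point support
> - [ ] [What's the difference between hard and soft floating point numbers?](https://stackoverflow.com/questions/3321468/whats-the-difference-between-hard-and-soft-floating-point-numbers)
> - [ ] [ArmHardFloatPort](https://wiki.debian.org/ArmHardFloatPort/VfpComparison) points out that gcc has three floating-point mechanisms on the Arm platform, which are the values of the `-mfloat-abi` option:
>     * **`soft`** - this is pure software
>     * **`softfp`** - this supports a hardware FPU, but the ABI is soft compatible.
>     * **`hard`** - the ABI uses float or VFP registers.

### Verifying the executable
* In the kernel's "verify the executable" step, what exactly is being verified?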
    * A UNIX "executable" can take many forms: it may be machine code stored in a specific format, or a shell script interpreted by another program. The kernel must first parse the file and confirm that it is a valid executable before it can start running it
    * More recently there are also mechanisms for signing executables, see: [ELF executable signing and verification ](https://lwn.net/Articles/532710/)
        * secureboot mode
        * signelf
        * openssl

### The story behind `int main(int argc, char *argv[])`
:::danger
Some books declare the function as `void main()`; this is wrong.
:::

Bjarne Stroustrup, the creator of C++, states explicitly in his [C++ Style and Technique FAQ](http://www.stroustrup.com/bs_faq2.html#void-main):
> "The definition void main( ) { /* … */ } is not and never has been C++, nor has it even been C."

C language specification, 5.1.2.2.1:
> It shall be defined with **a return type of int** and with **no parameters** or with **two parameters (argc and argv)**

Terminology review:
![](http://i.imgur.com/K2XsAUi.png)
* "argument" emphasizes "the form passed to the function"
> expressions passed into a function

* `argc` stands for argument count
* `argv` stands for argument vector

:::info
* [**`write()`**](https://www.man7.org/linux/man-pages/man2/write.2.html)
* [**`writev()`**](https://linux.die.net/man/2/writev)
    * writes data gathered from multiple buffers in a single call
:::

- "parameter" emphasizes "the values received"
> values received by the function

For example, C++ has [parameterized types](https://isocpp.org/wiki/faq/templates#param-types), meaning that one type can serve as a "parameter" of another type; put another way, a "type" becomes a parameter much like a value.
* Caller
> The one making the call (usually a function that calls another function)

- Callee
> The one being called (the function that gets called)

- [ ] [Parameter (computer programming)](https://en.wikipedia.org/wiki/Parameter_(computer_programming))

### How function calls are implemented in assembly language
We see a declaration that includes **`envp`**:
```cpp
int main(int argc, char *argv[], char *envp[]) {
    ...
}
```
- [ ] [(六) 一起学 Unix 环境高级编程 (APUE) 之 进程控制](https://www.cnblogs.com/0xcafebabe/p/4434218.html) (APUE study notes, part 6: process control)
* Deliberately rewritten as follows: (**`x.c`**)
```cpp
#include <stdio.h>
int main(int argc, char (*argv)[0])
{
    puts(((char **) argv)[0]);
    return 0;
}
```
* Observe with gdb:
```shell
$ gcc -o x x.c -g
$ gdb -q x
```
```shell
(gdb) b main
(gdb) r
(gdb) print *((char **) argv)
$1 = 0x7fffffffe7c9 "/tmp/x"
```
This matches expectations, but next:
```shell
(gdb) x/4s (char **) argv
0x7fffffffe558: "\311\347\377\377\377\177"
0x7fffffffe55f: ""
0x7fffffffe560: ""
0x7fffffffe561: ""
```
```shell
(gdb) x/4s (argv)
0x7fffffffe558: "\311\347\377\377\377\177"
0x7fffffffe55f: ""
0x7fffffffe560: ""
0x7fffffffe561: ""
```
This is unreadable because `x/4s` decodes the pointer array itself as strings (the octal bytes `\311\347\377\377\377\177` are the little-endian address `0x7fffffffe7c9`), so try another way:
```shell
(gdb) x/4s ((char **) argv)[0]
0x7fffffffe7c9: "/tmp/x"
0x7fffffffe7d0: "LC_PAPER=zh_TW"
0x7fffffffe7df: "XDG_SESSION_ID=91"
0x7fffffffe7f1: "LC_ADDRESS=zh_TW"
```
![](https://i.imgur.com/QndglVq.png)

So the last 3 entries are envp (environment variables); what the C run-time passes in matches the output of `printenv`
```shell
(gdb) shell printenv
```
**The following starts at 1:35:00 in the video**

Assuming PID = 31114, we can observe:
```shell
cat /proc/31114/cmdline
cat /proc/31114/environ
```

:::info
Reading `argv[0]` and `cmdline` to determine the name of the running program has a very clever application:
- [ ] [BusyBox - The Swiss Army Knife of Embedded Linux](https://busybox.net/downloads/BusyBox.html)
:::

- [ ] [深度剖析 C 語言 main 函式](https://blog.csdn.net/z_ryan/article/details/80985101) (an in-depth look at the C main function)
    * `_start()`
    * `__attribute__()`

### GNU Toolchain
![](https://i.imgur.com/jDdgIaH.png)
* gcc : GNU compiler **collection**
* as : GNU assembler
* ld : GNU linker
* gdb : GNU debugger

The (overly simplified) compilation flow and file formats
![](http://i.imgur.com/9c0k1v0.png)

.coff and .elf stand for the following:
* [COFF (common object file format)](https://en.wikipedia.org/wiki/COFF) : a file format for executables, object code, and shared libraries
* [ELF (Executable and Linkable Format)](https://en.wikipedia.org/wiki/Elf) : the most common executable format on GNU/Linux and \*BSD, a standard file format for executables, object code, shared libraries, and core dumps, **intended to replace COFF**

- [ ] [Computer Science from the Bottom Up](https://www.bottomupcs.com/)
    - [ ] [Starting a process](https://www.bottomupcs.com/starting_a_process.xhtml)
        * `__libc_csu_init` / `__libc_csu_fini`

### The hidden `crt0`
[crt0](https://en.wikipedia.org/wiki/Crt0) (also known as `c0`)
```=
.text

.globl _start

_start: # _start is the entry point known to the linker
    mov %rsp, %rbp    # setup a new stack frame
    mov 0(%rbp), %rdi # get argc from the stack
    lea 8(%rbp), %rsi # get argv from the stack
    call main         # %rdi, %rsi are the first two args to main

    mov %rax, %rdi    # mov the return of main to the first argument
    call exit         # terminate the program
```
* `#6` to `#8` : set up a stack frame, then load `argc` and the address of `argv[]` from the stack into `%rdi` and `%rsi`
* `#11` : move the return value of `main()` from `%rax` into `%rdi`, the first argument of `exit`
* `#12` : call `exit`

[Newlib](https://en.wikipedia.org/wiki/Newlib)
[newlib/i386 implementation](https://github.com/eblot/newlib/blob/master/newlib/libc/sys/linux/machine/i386/crt0.c)
```C
extern char **environ;
extern int main(int argc, char **argv, char **envp);
extern char _end;
extern char __bss_start;

void _start(int args)
{
    /*
     * The argument block begins above the current stack frame, because we
     * have no return address. The calculation assumes that sizeof(int) ==
     * sizeof(void *). This is okay for i386 user space, but may be invalid in
     * other cases.
     */
    int *params = &args - 1;
    int argc = *params;
    char **argv = (char **) (params + 1);

    environ = argv + argc + 1;

    /* Note: do not clear the .bss section. When running with shared
     * libraries, certain data items such __mb_cur_max or environ
     * may get placed in the .bss, even though they are initialized
     * to non-zero values. Clearing the .bss will end up zeroing
     * out their initial values. */
    tzset(); /* initialize timezone info */
    exit(main(argc, argv, environ));
}
```
Further reading:
* [Creating a C Library](https://wiki.osdev.org/Creating_a_C_Library)
