# Dynamic library support for shecc
> 黎詠哲, 李協儒

## Motivation
Currently, [shecc](https://github.com/sysprog21/shecc) can only generate statically-linked ELF executables. This results in inefficient space utilization. Therefore, the task is to provide an option to produce dynamically-linked ELF executables that link to `glibc`.

## Methodology
```
The two ELF's "views" are as follows:

                    +-----------------+
               +----| ELF File Header |----+
               |    +-----------------+    |
               v                           v
       +-----------------+      +-----------------+
       | Program Headers |      | Section Headers |
       +-----------------+      +-----------------+
            ||                               ||
            ||                               ||
            ||                               ||
            ||   +------------------------+  ||
            +--> | Contents (Byte Stream) |<--+
                 +------------------------+

In reality, the layout of a typical ELF executable binary on a disk file is like this:

    +-------------------------------+
    | ELF File Header               |
    +-------------------------------+
    | Program Header for segment #1 |
    +-------------------------------+
    | Program Header for segment #2 |
    +-------------------------------+
    | ...                           |
    +-------------------------------+
    | Contents (Byte Stream)        |
    | ...                           |
    +-------------------------------+
    | Section Header for section #1 |
    +-------------------------------+
    | Section Header for section #2 |
    +-------------------------------+
    | ...                           |
    +-------------------------------+
    | ".shstrtab" section           |
    +-------------------------------+
    | ".symtab" section             |
    +-------------------------------+
    | ".strtab" section             |
    +-------------------------------+
```
To support dynamic linking, we need to add headers and sections to the ELF file. We follow the order of the ELF layout above and write the necessary data to the output file piece by piece. The headers and sections we need to write are listed below.

**`Program headers`**
- `DYNAMIC`: For dynamic binaries, this segment holds dynamic linking information and is usually the same as the `.dynamic` section in ELF's linking view.
- `INTERP`: For dynamic binaries, this holds the full pathname of the runtime linker `ld.so`.
  This segment is the same as the `.interp` section in ELF's linking view.

**`Section headers`**
- `.dynamic`: For dynamic binaries, this section holds dynamic linking information used by `ld.so`.
- `.dynstr`: NULL-terminated strings of the names of symbols in the `.dynsym` section.
- `.dynsym`: Runtime/Dynamic symbol table. For dynamic binaries, this section is the symbol table of globally visible symbols. For example, if a dynamic link library wants to export its symbols, these symbols are stored here. Conversely, if a dynamic executable uses symbols from a dynamic link library, those symbols are stored here too. The symbol names (as NULL-terminated strings) are stored in the `.dynstr` section.
- `.got`: For dynamic binaries, this Global Offset Table holds the addresses of variables which are relocated upon loading.
- `.got.plt`: For dynamic binaries, this Global Offset Table holds the addresses of functions in dynamic libraries. They are used by the trampoline code in the `.plt` section. If the `.got.plt` section is present, it contains at least three entries, which have special meanings.
- `.interp`: For dynamic binaries, this holds the full pathname of the runtime linker `ld.so`.
- `.plt`: For dynamic binaries, this Procedure Linkage Table holds the trampoline/linkage code.
- `.rela.dyn`: Runtime/Dynamic relocation table. For dynamic binaries, this relocation table holds information about variables which must be relocated upon loading. Each entry in this table is a struct `Elf64_Rela` on 64-bit targets (see `/usr/include/elf.h`); 32-bit targets use the `Elf32_Rel`/`Elf32_Rela` counterparts.
- `.rela.plt`: Runtime/Dynamic relocation table. This relocation table is similar to the one in the `.rela.dyn` section; the difference is that this one is for functions, not variables.

## Current Progress (Updated in real time)
- Wrote the program header `INTERP` to the ELF file, but doing so broke the section headers.
- Listed all the necessary headers and sections for dynamic linking.
- We know which headers and sections `SHECC` needs to write to the ELF file.
  However, we still struggle with the details of each header and section when writing the data into the ELF file. We are therefore writing the headers one by one and checking that each addition does not break the others.
- Implementation and testing will take some time. We will keep updating the progress.

## Implementation Records
### Record 01: Tried to write data to generate dynamically-linked ELF but failed.
We tried to write the necessary data to the ELF file and used `readelf -a` to check whether it was written correctly. Below is part of the ELF file that `SHECC` generated.
```
  Entry point address:               0x10068
  Start of program headers:          52 (bytes into file)
  Start of section headers:          12627 (bytes into file)
  Flags:                             0x5000200, Version5 EABI, soft-float ABI
  Size of this header:               52 (bytes)
  Size of program headers:           32 (bytes)
  Number of program headers:         2
  Size of section headers:           40 (bytes)
  Number of section headers:         6
  Section header string table index: 5
...
Section Headers:
  [Nr] Name              Type            Addr     Off    Size   ES Flg       Lk    Inf   Al
  [ 0] <no-strings>      LOPROC+0x274732 00000000 000000 000000 00 XMSIOxxxo     0     0     0
  [ 1] <no-strings>      NULL            0000000b 000001 000007 00           65620    84 12388
  [ 2] <no-strings>      RELA            00000011 000001 000003 0c           78008 12472    96
  [ 3] <no-strings>      RELA            00000017 000002 000000 0c               0 12568     0
  [ 4] <no-strings>      RELA            0000001f 000003 000000 0c   M           0 12568     0
  [ 5] <no-strings>      PROGBITS        00000001 000003 000000 00               0 12568    39
...
Program Headers:
  Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
  INTERP         0x000054 0x00010054 0x00010054 0x00014 0x00014 R   0x4
      [Requesting program interpreter: ]
  LOAD           0x000054 0x00010054 0x00010054 0x030c4 0x030c4 RWE 0x4
```
We wrote the program header `INTERP` to the ELF file successfully, but this also broke the section headers. The likely cause is that we did not set the ELF header field `Start of section headers` (`e_shoff`) correctly.
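One way to catch this kind of mistake early is to check that the tables announced by the ELF header actually fit inside the file. The sketch below is not part of shecc: the names `elf32_check_header` and `elf32_make_header` and the negative error codes are hypothetical, and the code only assumes the standard 52-byte little-endian ELF32 header layout with 32-byte program headers and 40-byte section headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Little-endian accessors for a raw header image. */
static uint16_t rd16(const unsigned char *p)
{
    return (uint16_t) (p[0] | (p[1] << 8));
}

static uint32_t rd32(const unsigned char *p)
{
    return (uint32_t) p[0] | ((uint32_t) p[1] << 8) |
           ((uint32_t) p[2] << 16) | ((uint32_t) p[3] << 24);
}

static void wr16(unsigned char *p, uint16_t v)
{
    p[0] = (unsigned char) (v & 0xff);
    p[1] = (unsigned char) (v >> 8);
}

static void wr32(unsigned char *p, uint32_t v)
{
    for (int i = 0; i < 4; i++)
        p[i] = (unsigned char) ((v >> (8 * i)) & 0xff);
}

/* Hypothetical checker: returns 0 if the header is plausible for a file of
 * file_size bytes, or a negative code naming the first failed check.
 * It assumes the file is expected to carry section headers. */
int elf32_check_header(const unsigned char *img, size_t file_size)
{
    assert(img != NULL);
    if (file_size < 52 || rd32(img) != 0x464c457f)
        return -1; /* too short, or magic is not 0x7F 'E' 'L' 'F' */
    if (rd16(img + 40) != 52 || rd16(img + 42) != 32 || rd16(img + 46) != 40)
        return -2; /* e_ehsize, e_phentsize or e_shentsize is wrong */

    /* e_phoff + e_phnum * 32 and e_shoff + e_shnum * 40 */
    uint64_t ph_end = (uint64_t) rd32(img + 28) + (uint64_t) rd16(img + 44) * 32;
    uint64_t sh_end = (uint64_t) rd32(img + 32) + (uint64_t) rd16(img + 48) * 40;
    if (ph_end > file_size)
        return -3; /* program header table runs past end of file */
    if (sh_end > file_size)
        return -4; /* section header table runs past end of file */
    if (rd16(img + 50) >= rd16(img + 48))
        return -5; /* e_shstrndx does not name an existing section */
    return 0;
}

/* Hypothetical helper that fills a minimal header for experiments. */
void elf32_make_header(unsigned char img[52], uint32_t shoff, uint16_t phnum,
                       uint16_t shnum, uint16_t shstrndx)
{
    memset(img, 0, 52);
    wr32(img, 0x464c457f);
    wr32(img + 28, 52); /* e_phoff: program headers follow the header */
    wr32(img + 32, shoff);
    wr16(img + 40, 52);
    wr16(img + 42, 32);
    wr16(img + 44, phnum);
    wr16(img + 46, 40);
    wr16(img + 48, shnum);
    wr16(img + 50, shstrndx);
}
```

With the values from the dump above (`e_shoff` = 12627, six section headers), the file must be at least 12627 + 6 × 40 = 12867 bytes long for the section header table to fit; a shorter file means `e_shoff` or the section count is wrong.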
:::warning
Prepare skeleton code like [listing.s](https://github.com/cheery/riscv-mini-hello/blob/master/listing.s) and tweak the RISC-V code generation of shecc to adapt ELF headers and sections.
:::

## Record 02: ELF Header Modifications
We modified the `elf_generate_header()` function to support dynamic linking by implementing conditional ELF type selection. Below are the modified `elf_generate_header()` and the ELF header that `SHECC` generated.
```cpp =
void elf_generate_header(int dynamic_linking_enabled)
{
    // ELF Magic number and identification
    elf_write_header_int(0x464c457f); // Magic: 0x7F followed by ELF
    elf_write_header_byte(1);         // 32-bit
    elf_write_header_byte(1);         // little-endian
    elf_write_header_byte(1);         // EI_VERSION
    elf_write_header_byte(0);         // System V

    // Key modification: Conditional ELF type selection
    elf_write_header_int(dynamic_linking_enabled ? 3 : 2); // ET_DYN or ET_EXEC
}
```
Verification of the header modification showed a successful type change to `DYN`:
```
ELF Header:
  Magic:   7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00
  Class:                             ELF32
  Data:                              2's complement, little endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - System V
  ABI Version:                       0
  Type:                              DYN (Shared object file)
  Machine:                           None
  Version:                           0x1002800
  Entry point address:               0x54000000
```

## Record 03: Dynamic Linking Section Implementation
### Section String Table Additions
Added the following dynamic-linking-related section names to the string table:
- `.dynstr`
- `.dynsym`
- `.dynamic`
- `.interp`
- `.rela.plt`
- `.plt`
- `.got`

**shstr section table**
```cpp =
/* shstr section; len = 39(static), 93(dynamic) */
elf_write_section_byte(0);
elf_write_section_str(".shstrtab", 9);
elf_write_section_byte(0);
elf_write_section_str(".text", 5);
elf_write_section_byte(0);
elf_write_section_str(".data", 5);
elf_write_section_byte(0);
if (dynamic_linking_enabled) {
    elf_write_section_str(".dynstr", 7);
    elf_write_section_byte(0);
    elf_write_section_str(".dynsym", 7);
    elf_write_section_byte(0);
    elf_write_section_str(".dynamic", 8);
    elf_write_section_byte(0);
    /* 8 rather than 7: the sh_name offsets used later (57, 67, 72)
     * account for this extra byte */
    elf_write_section_str(".interp", 8);
    elf_write_section_byte(0);
    elf_write_section_str(".rela.plt", 9);
    elf_write_section_byte(0);
    elf_write_section_str(".plt", 4);
    elf_write_section_byte(0);
    elf_write_section_str(".got", 4);
    elf_write_section_byte(0);
}
elf_write_section_str(".symtab", 7);
elf_write_section_byte(0);
elf_write_section_str(".strtab", 7);
elf_write_section_byte(0);
```
Implemented section headers following the `Elf32_Shdr` structure:

**Elf32_Shdr**
```cpp =
typedef struct {
    Elf32_Word sh_name;
    Elf32_Word sh_type;
    Elf32_Word sh_flags;
    Elf32_Addr sh_addr;
    Elf32_Off sh_offset;
    Elf32_Word sh_size;
    Elf32_Word sh_link;
    Elf32_Word sh_info;
    Elf32_Word sh_addralign;
    Elf32_Word sh_entsize;
} Elf32_Shdr;
```
**Dynamic Sections Generation**
```cpp =
if (dynamic_linking_enabled) {
    /* .interp section header */
    elf_write_section_int(48);                         /* sh_name */
    elf_write_section_int(1);                          /* sh_type: SHT_PROGBITS */
    elf_write_section_int(2);                          /* sh_flags: SHF_ALLOC */
    elf_write_section_int(ELF_START + elf_header_len); /* sh_addr */
    elf_write_section_int(elf_header_len);             /* sh_offset */
    elf_write_section_int(19);                         /* sh_size */
    elf_write_section_int(0);                          /* sh_link */
    elf_write_section_int(0);                          /* sh_info */
    elf_write_section_int(1);                          /* sh_addralign */
    elf_write_section_int(0);                          /* sh_entsize */

    /* .dynsym section header */
    elf_write_section_int(31);
    elf_write_section_int(11);
    elf_write_section_int(2);
    elf_write_section_int(ELF_START + elf_header_len + 19);
    elf_write_section_int(elf_header_len + 19);
    elf_write_section_int(33);
    elf_write_section_int(4);
    elf_write_section_int(1);
    elf_write_section_int(4);
    elf_write_section_int(16);

    /* .dynstr section header */
    elf_write_section_int(23);
    elf_write_section_int(3);
    elf_write_section_int(2);
    elf_write_section_int(ELF_START + elf_header_len + 19 + 33);
    elf_write_section_int(elf_header_len + 19 + 33);
    elf_write_section_int(10);
    elf_write_section_int(0);
    elf_write_section_int(0);
    elf_write_section_int(1);
    elf_write_section_int(0);

    /* .dynamic section header */
    elf_write_section_int(39);
    elf_write_section_int(6);
    elf_write_section_int(3);
    elf_write_section_int(ELF_START + elf_header_len +
                          62);
    elf_write_section_int(elf_header_len + 62);
    elf_write_section_int(16);
    elf_write_section_int(4);
    elf_write_section_int(0);
    elf_write_section_int(4);
    elf_write_section_int(8);

    /* .rela.plt section header */
    elf_write_section_int(57);
    elf_write_section_int(4);
    elf_write_section_int(66);
    elf_write_section_int(ELF_START + elf_header_len + 78);
    elf_write_section_int(elf_header_len + 78);
    elf_write_section_int(12);
    elf_write_section_int(2);
    elf_write_section_int(8);
    elf_write_section_int(4);
    elf_write_section_int(12);

    /* .plt section header */
    elf_write_section_int(67);
    elf_write_section_int(1);
    elf_write_section_int(6);
    elf_write_section_int(ELF_START + elf_header_len + 90);
    elf_write_section_int(elf_header_len + 90);
    elf_write_section_int(12);
    elf_write_section_int(0);
    elf_write_section_int(0);
    elf_write_section_int(4);
    elf_write_section_int(16);

    /* .got section header */
    elf_write_section_int(72);
    elf_write_section_int(1);
    elf_write_section_int(3);
    elf_write_section_int(ELF_START + elf_header_len + 102);
    elf_write_section_int(elf_header_len + 102);
    elf_write_section_int(12);
    elf_write_section_int(0);
    elf_write_section_int(0);
    elf_write_section_int(4);
    elf_write_section_int(4);
}
```
Next, we implemented `elf_generate_dynamic_sections()` to create:
- `.dynamic` section with `DT_NEEDED` entries
- `.interp` section pointing to `/lib/ld-linux.so.3`
- `.dynstr` section containing `libc.so.6`
- `.dynsym` section with global symbol entries
- `.rela.plt` section for relocation entries
- `.plt` section with ARM32 PLT entries
- `.got` section with three Global Offset Table entries

**void elf_generate_dynamic_sections()**
```cpp =
void elf_generate_dynamic_sections()
{
    /* .dynamic section */
    elf_write_section_int(1);                /* DT_NEEDED */
    elf_write_section_int(elf_strtab_index); /* offset in .dynstr */
    elf_write_section_int(0);                /* DT_NULL */
    elf_write_section_int(0);                /* End of .dynamic */

    /* .interp section */
    elf_write_section_str("/lib/ld-linux.so.3", 18); /* interpreter */
    elf_write_section_int(0); /* End of .interp */

    /* .dynstr section */
    elf_write_section_str("libc.so.6", 9); /* dynamic linked library */
    elf_write_section_byte(0);             /* End of .dynstr */

    /* .dynsym section */
    elf_write_section_int(0); /* NULL entry */
    elf_write_section_int(0);
    elf_write_section_int(0);
    elf_write_section_byte(0);
    elf_write_section_byte(0);
    elf_write_section_byte(0 & 0xFF);
    elf_write_section_byte(0 >> 8 & 0xFF); /* SHN_UNDEF */

    elf_write_section_int(1); /* offset to "libc.so.6" */
    elf_write_section_int(0);
    elf_write_section_int(0);
    elf_write_section_byte(0x10); /* (STB_GLOBAL << 4) | STT_NOTYPE */
    elf_write_section_byte(0);
    elf_write_section_byte(0 & 0xFF);
    elf_write_section_byte(0 >> 8 & 0xFF); /* SHN_UNDEF */
    elf_write_section_int(0);              /* End of .dynsym */

    /* .rela.plt section */
    elf_write_section_int(0);
    elf_write_section_int(0x16); /* R_ARM_JUMP_SLOT */
    elf_write_section_int(0);    /* r_addend */

    /* .plt section */
    /* ARM32 PLT entry */
    elf_write_code_int(0xe28fc600); /* add ip, pc, #0 */
    elf_write_code_int(0xe28cca00); /* add ip, ip, #0 */
    elf_write_code_int(0xe5bcf000); /* ldr pc, [ip, #0]! */

    /* .got section */
    elf_write_section_int(0); /* GOT[0]: Reserved */
    elf_write_section_int(0); /* GOT[1]: Reserved */
    elf_write_section_int(0); /* GOT[2]: "libc.so.6" entry */
}
```
At the start of `elf_generate_sections()`, the compiler generates the dynamic sections first.
- `void elf_generate_sections`
```cpp =
void elf_generate_sections(int dynamic_linking_enabled)
{
    if (dynamic_linking_enabled) {
        elf_generate_dynamic_sections();
    }
    /* existing code ... */
}
```
### Current Issues
1. Build Failures

   The implementation currently produces build errors in stage 2:
   ```
   $ make
     GEN     out/libc.inc
     CC      out/src/main.o
     LD      out/shecc
     SHECC   out/shecc-stage1.elf
     SHECC   out/shecc-stage2.elf
   qemu-arm: out/shecc-stage1.elf: Invalid ELF image for this architecture
   make: *** [Makefile:114: out/shecc-stage2.elf] Error 255
   ```
2. 
   ELF Validation Errors

   `readelf` analysis reveals several critical issues:
   ```
   $ readelf -a
   readelf: Error: Too many program headers - 0x2000 - the file is not that big
   readelf: Warning: The e_shentsize field in the ELF header is larger than the size of an ELF section header
   readelf: Error: Reading 2621440 bytes extends past end of file for section headers
   readelf: Error: Section headers are not available!
   readelf: Error: Too many program headers - 0x2000 - the file is not that big
   readelf: Error: Too many program headers - 0x2000 - the file is not that big
   ```
   This indicates:
   1. Invalid program header count (0x2000)
   2. Incorrect `e_shentsize` in the ELF header
   3. Section header offset extends beyond the file size
   4. Incorrect section count (10240)

   - ELF file
   ```
   ELF Header:
     Magic:   7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00
     Class:                             ELF32
     Data:                              2's complement, little endian
     Version:                           1 (current)
     OS/ABI:                            UNIX - System V
     ABI Version:                       0
     Type:                              DYN (Shared object file)
     Machine:                           None
     Version:                           0x1002800
     Entry point address:               0x54000000
     Start of program headers:          872415488 (bytes into file)
     Start of section headers:          1056964608 (bytes into file)
     Flags:                             0x31
     Size of this header:               2 (bytes)
     Size of program headers:           13317 (bytes)
     Number of program headers:         8192
     Size of section headers:           256 (bytes)
     Number of section headers:         10240
     Section header string table index: 1536

   There is no dynamic section in this file.
   ```
   It seems that the ELF header does not carry the correct program header and section header values, in either size or count.

### Next Steps
To fix these problems, we need to correct the calculation of:
1. Program header count and size
2. Section header count and size
3. File offsets for all sections

### Action Items
1. First, produce a correct, valid executable using a simple approach.
2. Determine whether each section is actually needed, and identify which ELF sections are required.
3. `.plt` acts like a cache that assists `.got`; like a cache, its symbols may be replaced (this applies to a full ELF loader).
4. `.got` establishes the association with symbol names.
5. `.shtab` is required.
6. `rela.plt` needs machine code because the `__libc_start_main()` entry point requires `argc` and `argv` to be pushed onto the stack (calling convention).
7. 
   Work out how to perform relocation with `.rel` (consult the specification), including alignment issues. The relocation for `__libc_start_main` must be done correctly.
8. Page alignment (see `amacc/amacc.c#2013`).

## Record 04: Dynamic Linking Section Implementation
Following the previous action items, I first produced a valid reference ELF file with gcc, then made shecc generate a correct ELF header and add a `.dynamic` entry to both the program header table and the section header table. The remaining problem is that the section names are not displayed correctly in the section header table.
- **ELF generated by `shecc`**
```
ELF Header:
  Magic:   7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00
  Class:                             ELF32
  Data:                              2's complement, little endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - System V
  ABI Version:                       0
  Type:                              DYN (Shared object file)
  Machine:                           ARM
  Version:                           0x1
  Entry point address:               0x10054
  Start of program headers:          52 (bytes into file)
  Start of section headers:          12664 (bytes into file)
  Flags:                             0x5000200, Version5 EABI, soft-float ABI
  Size of this header:               52 (bytes)
  Size of program headers:           32 (bytes)
  Number of program headers:         2
  Size of section headers:           40 (bytes)
  Number of section headers:         7
  Section header string table index: 6

Section Headers:
  [Nr] Name              Type            Addr     Off    Size   ES Flg Lk Inf Al
  [ 0] ree detected^J    NULL            00000000 000000 000000 00      0   0  0
  [ 1] d^J               PROGBITS        00010054 000054 003064 00 WAX  0   0  4
  [ 2]                   PROGBITS        000130b8 0030b8 000060 00  WA  0   0  4
  [ 3] World^J           DYNAMIC         00000000 000000 000010 08  WA  4   0  4
  [ 4] ^A                SYMTAB          00000000 003118 000000 10      4   0  4
  [ 5]                   STRTAB          00000000 003128 000000 00      0   0  1
  [ 6] ee detected^J     STRTAB          00000000 003118 000030 00      0   0  1
Key to Flags:
  W (write), A (alloc), X (execute), M (merge), S (strings), I (info),
  L (link order), O (extra OS processing required), G (group), T (TLS),
  C (compressed), x (unknown), o (OS specific), E (exclude),
  D (mbind), y (purecode), p (processor specific)

There are no section groups in this file.

Program Headers:
  Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
  LOAD           0x000054 0x00010054 0x00010054 0x030c4 0x030c4 RWE 0x4
  DYNAMIC        0x003118 0x00013118 0x00013118 0x00010 0x00010 RW  0x4

 Section to Segment mapping:
  Segment Sections...
   00     d^J
   01

Dynamic section at offset 0x3118 contains 2 entries:
  Tag        Type                         Name/Value
 0x20656572 (<unknown>: 20656572)         0x65746564
 0x64657463 (Operating System specific: 64657463) 0x6425000a

There are no relocations in this file.

Symbol table '^A' contains 0 entries:
   Num:    Value  Size Type    Bind   Vis      Ndx Name

No version information found in this file.
```
- **[/src/elf.h](https://github.com/banglday/shecc/commit/277eeb3398b035d0d49e235f2f0573848d9ad3a0)**
```diff =
 void elf_generate_header()
 {
     /* ELF header */
     elf_write_header_int(0x464c457f); /* Magic: 0x7F followed by ELF */
     elf_write_header_byte(1);         /* 32-bit */
@@ -76,14 +65,14 @@ void elf_generate_header()
     elf_write_header_byte(0); /* System V */
     elf_write_header_int(0);  /* EI_ABIVERSION */
     elf_write_header_int(0);  /* EI_PAD: unused */
-    elf_write_header_byte(2); /* ET_EXEC */
+    elf_write_header_byte(3); /* ET_EXEC */
     elf_write_header_byte(0);
     elf_write_header_byte(ELF_MACHINE);
     elf_write_header_byte(0);
     elf_write_header_int(1);                          /* ELF version */
     elf_write_header_int(ELF_START + elf_header_len); /* entry point */
-    elf_write_header_int(0x34); /* program header offset */
-    elf_write_header_int(elf_header_len + elf_code_idx + elf_data_idx + 39 +
+    elf_write_header_int(0x34); /* program header offset */
+    elf_write_header_int(elf_header_len + elf_code_idx + elf_data_idx + 48 + 32 + 16 +
                          elf_symtab_index + elf_strtab_index); /* section header offset */
     /* flags */
@@ -92,13 +81,13 @@ void elf_generate_header()
     elf_write_header_byte(0);
     elf_write_header_byte(0x20); /* program header size */
     elf_write_header_byte(0);
-    elf_write_header_byte(1); /* number of program headers */
+    elf_write_header_byte(2); /* number of program headers */
     elf_write_header_byte(0);
     elf_write_header_byte(0x28); /* section header size */
     elf_write_header_byte(0);
-    elf_write_header_byte(6); /* number of sections */
+    elf_write_header_byte(7); /* number of sections */
     elf_write_header_byte(0);
-    elf_write_header_byte(5); /* section index with names */
+    elf_write_header_byte(6); /* section index with names */
     elf_write_header_byte(0);
     /* program header - code and data combined */
@@ -110,10 +99,26 @@ void elf_generate_header()
     elf_write_header_int(elf_code_idx + elf_data_idx); /* size in memory */
     elf_write_header_int(7);                           /* flags */
     elf_write_header_int(4);                           /* alignment */
+    /* program header - dynamic segment */
+    elf_write_header_int(2); /* PT_DYNAMIC */
+    elf_write_header_int(elf_header_len + elf_code_idx + elf_data_idx); /* offset of segment */
+    elf_write_header_int(ELF_START + elf_header_len + elf_code_idx + elf_data_idx); /* virtual address */
+    elf_write_header_int(ELF_START + elf_header_len + elf_code_idx + elf_data_idx); /* physical address */
+    elf_write_header_int(16); /* size in file */
+    elf_write_header_int(16); /* size in memory */
+    elf_write_header_int(6);  /* flags */
+    elf_write_header_int(4);  /* alignment */
 }
 void elf_generate_sections()
 {
+    /* .dynamic section*/
+    elf_write_section_int(1); /* DT_NEEDED */
+    elf_write_section_int(elf_header_len + elf_code_idx + elf_data_idx +
+                          elf_symtab_index + 16); /* offset in .dynstr */
+    elf_write_section_int(0); /* DT_NULL */
+    elf_write_section_int(0); /* End of .dynamic */
+
     /* symtab section */
     for (int b = 0; b < elf_symtab_index; b++)
         elf_write_section_byte(elf_symtab[b]);
@@ -130,6 +135,8 @@ void elf_generate_sections()
     elf_write_section_byte(0);
     elf_write_section_str(".data", 5);
     elf_write_section_byte(0);
+    elf_write_section_str(".dynamic", 8);
+    elf_write_section_byte(0);
     elf_write_section_str(".symtab", 7);
     elf_write_section_byte(0);
     elf_write_section_str(".strtab", 7);
@@ -173,8 +180,20 @@ void elf_generate_sections()
     elf_write_section_int(4);
     elf_write_section_int(0);
+    /* .dynamic section header */
+    elf_write_section_int(23);
+    elf_write_section_int(6);
+    elf_write_section_int(3);
+    elf_write_section_int(0); // sh_addr
+    elf_write_section_int(0); // sh_offset
+    elf_write_section_int(16);
+    elf_write_section_int(4);
+    elf_write_section_int(0);
+    elf_write_section_int(4);
+    elf_write_section_int(8);
+
     /* .symtab */
-    elf_write_section_int(0x17);
+    elf_write_section_int(32);
     elf_write_section_int(2);
     elf_write_section_int(0);
     elf_write_section_int(0);
@@ -186,12 +205,12 @@ void elf_generate_sections()
     elf_write_section_int(16);
     /* .strtab */
-    elf_write_section_int(0x1f);
+    elf_write_section_int(40);
     elf_write_section_int(3);
     elf_write_section_int(0);
     elf_write_section_int(0);
     elf_write_section_int(elf_header_len + elf_code_idx + elf_data_idx +
-                          elf_symtab_index);
+                          elf_symtab_index + 16);
     elf_write_section_int(elf_strtab_index); /* size */
     elf_write_section_int(0);
     elf_write_section_int(0);
@@ -205,15 +224,14 @@ void elf_generate_sections()
     elf_write_section_int(0);
     elf_write_section_int(elf_header_len + elf_code_idx + elf_data_idx +
                           elf_symtab_index + elf_strtab_index);
-    elf_write_section_int(39);
+    elf_write_section_int(48);
     elf_write_section_int(0);
     elf_write_section_int(0);
     elf_write_section_int(1);
     elf_write_section_int(0);
 }
```
Another action item was to decide which sections are necessary. The sections that must be added for dynamic linking are `.dynamic`, `.interp`, `.got`, `.rela.plt`, and `.rela`.

## Record 04: Generate Correct Dynamic Program Header and Section Table
**[Github Repo](https://github.com/banglday/shecc/blob/test/dynamic-link/src/elf.c)**

**Goal:** Add `PT_INTERP` and `PT_DYNAMIC` entries to the program header and define the corresponding `.interp` and `.dynamic` sections, so that the generated ELF file is properly linked to the shared library `libc.so.6` and uses the dynamic linker `/lib/ld-linux.so.3`.

1. Adjusting the ELF Header Length

   In `src/globals.c`, we increase `elf_header_len` from 0x54 to 0x94.
   The reason is that each program header entry is 32 (0x20) bytes, and we need two additional entries (for `PT_INTERP` and `PT_DYNAMIC`), so the length grows by 2 × 0x20 from 0x54 to 0x94.
- `src/globals.c`
```diff!
 /* Existing Code */
 /* ELF sections */
 char *elf_code;
 int elf_code_idx = 0;
 char *elf_data;
 int elf_data_idx = 0;
 char *elf_header;
 int elf_header_idx = 0;
-int elf_header_len = 0x54; /* ELF fixed: 0x34 + 1 * 0x20 */
+int elf_header_len = 0x94; /* ELF fixed: 0x34 + 3 * 0x20 */
 int elf_code_start;
 int elf_data_start;
 char *elf_symtab;
 char *elf_strtab;
 char *elf_section;
 /* Existing Code */
```
2. Adding the `PT_INTERP` and `PT_DYNAMIC` Segments

   Next, we add the two new program header entries immediately after the existing `PT_LOAD` entry. This ensures that the ELF loader knows:
   1. Which dynamic linker to invoke (`PT_INTERP`).
   2. How to handle dynamic linking information (`PT_DYNAMIC`).
```cpp
/* program header - interpreter segment */
elf_write_header_int(3); /* PT_INTERP */
elf_write_header_int(elf_header_len + elf_code_idx + elf_data_idx); /* p_offset */
elf_write_header_int(ELF_START + elf_header_len + elf_code_idx + elf_data_idx); /* p_vaddr */
elf_write_header_int(ELF_START + elf_header_len + elf_code_idx + elf_data_idx); /* p_paddr */
elf_write_header_int(22); /* p_filesz */
elf_write_header_int(22); /* p_memsz */
elf_write_header_int(4);  /* p_flags */
elf_write_header_int(4);  /* p_align */

/* program header - dynamic segment */
elf_write_header_int(2); /* PT_DYNAMIC */
elf_write_header_int(elf_header_len + elf_code_idx + elf_data_idx + 22); /* p_offset */
elf_write_header_int(ELF_START + elf_header_len + elf_code_idx + elf_data_idx + 22); /* p_vaddr */
elf_write_header_int(ELF_START + elf_header_len + elf_code_idx + elf_data_idx + 22); /* p_paddr */
elf_write_header_int(40); /* p_filesz */
elf_write_header_int(16); /* p_memsz: still 16 here although p_filesz is 40; the two should match */
elf_write_header_int(6);  /* p_flags */
elf_write_header_int(4);  /* p_align */
```

---

3. 
   Defining the `.interp` and `.dynamic` Sections

   We also need to create the corresponding sections in the section header table. The `.interp` section holds the path to the dynamic linker (`/lib/ld-linux.so.3`), and the `.dynamic` section holds the runtime link information, including the necessary reference to `libc.so.6`.
```cpp
void elf_generate_sections()
{
    /* .interp section (length = 22) */
    elf_write_section_str("/lib/ld-linux.so.3", 18);
    elf_write_section_int(0);

    /* .dynamic section (length = 40) */
    elf_write_section_int(6); /* DT_SYMTAB */
    elf_write_section_int(elf_header_len + elf_code_idx + elf_data_idx + 22 + 40 + 28);
    elf_write_section_int(5); /* DT_STRTAB */
    elf_write_section_int(elf_header_len + elf_code_idx + elf_data_idx + 22 + 40);
    elf_write_section_int(11); /* DT_SYMENT */
    elf_write_section_int(16); /* Symbol entry size */
    elf_write_section_int(1);  /* DT_NEEDED */
    elf_write_section_int(1);  /* Offset in .dynstr */
    elf_write_section_int(0);  /* DT_NULL */
    elf_write_section_int(0);  /* End of .dynamic */

    /* .dynstr section (length = 28) */
    elf_write_section_byte(0); /* NULL terminator */
    elf_write_section_str("libc.so.6", 9);
    elf_write_section_byte(0);
    /* note: no trailing NUL is written here; 28 counts only the bytes above */
    elf_write_section_str("__libc_start_main", 17);
}

/* Existing Code */

/* Add to .shstr tab and its string */
elf_write_section_str(".interp", 7);
elf_write_section_byte(0);
elf_write_section_str(".dynamic", 8);
elf_write_section_byte(0);
elf_write_section_str(".dynstr", 7);

/* .interp section header */
elf_write_section_int(23); /* .sh_name offset in .shstrtab */
elf_write_section_int(1);  /* SHT_PROGBITS */
elf_write_section_int(2);  /* SHF_ALLOC */
elf_write_section_int(ELF_START + elf_header_len + elf_code_idx + elf_data_idx); /* sh_addr */
elf_write_section_int(elf_header_len + elf_code_idx + elf_data_idx); /* sh_offset */
elf_write_section_int(22); /* sh_size */
elf_write_section_int(0);
elf_write_section_int(0);
elf_write_section_int(1); /* sh_addralign */
elf_write_section_int(0);

/* .dynamic section
   header */
elf_write_section_int(31); /* .sh_name offset in .shstrtab */
elf_write_section_int(6);  /* SHT_DYNAMIC */
elf_write_section_int(3);  /* SHF_ALLOC | SHF_WRITE */
elf_write_section_int(ELF_START + elf_header_len + elf_code_idx + elf_data_idx + 22); /* sh_addr */
elf_write_section_int(elf_header_len + elf_code_idx + elf_data_idx + 22); /* sh_offset */
elf_write_section_int(40); /* sh_size */
elf_write_section_int(5);  /* sh_link = .dynstr */
elf_write_section_int(0);
elf_write_section_int(4); /* sh_addralign */
elf_write_section_int(0);

/* .dynstr section header */
elf_write_section_int(40); /* .sh_name offset */
elf_write_section_int(3);  /* SHT_STRTAB */
elf_write_section_int(2);  /* SHF_ALLOC */
elf_write_section_int(ELF_START + elf_header_len + elf_code_idx + elf_data_idx + 22 + 40);
elf_write_section_int(elf_header_len + elf_code_idx + elf_data_idx + 22 + 40);
elf_write_section_int(28); /* sh_size */
elf_write_section_int(0);
elf_write_section_int(0);
elf_write_section_int(1); /* sh_addralign */
elf_write_section_int(0);
```
4. Verifying with `readelf`

   After compiling and generating this ELF, running `readelf -a` confirms the presence of the additional segments and sections:
```text
ELF Header:
  Magic:   7f 45 4c 46 01 01 01 00 ...
  Class:                             ELF32
  Data:                              2's complement, little endian
  ...
  Type:                              DYN (Shared object file)
  Machine:                           ARM
  ...
  Entry point address:               0x10094
  Start of program headers:          52 (bytes into file)
  Start of section headers:          12777 (bytes into file)
  ...
  Size of program headers:           32 (bytes)
  Number of program headers:         3
  ...
Section Headers:
  [ 3] .interp           PROGBITS        00013160 003160 00001d 00   A  0   0  1
  [ 4] .dynamic          DYNAMIC         0001317d 00317d 000010 00  WA  5   0  4
  [ 5] .dynstr           STRTAB          0001318d 00318d 00001c 00   A  0   0  1
  ...
Program Headers:
  Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
  LOAD           0x000094 0x00010094 0x00010094 ...
  INTERP         0x003160 0x00013160 0x00013160 ...
      [Requesting program interpreter: /lib/ld-linux.so.3]
  DYNAMIC        0x00317d 0x0001317d 0x0001317d ...
Dynamic section at offset 0x317d contains 2 entries:
  Tag        Type                         Name/Value
 0x00000001 (NEEDED)                     Shared library: [libc.so.6]
 ...
```
As shown, the `.interp` section correctly specifies `/lib/ld-linux.so.3`, and the dynamic section indicates that it needs `libc.so.6`. These entries verify that the ELF is now dynamically linked and will load the shared library and dynamic linker at runtime.

## Record 05: Generating `.dynsym` and `.dynstr` Sections, and Issues with Adding `.got`, `.plt`, and `rel.plt`
In this stage, we add the `.dynstr` and `.dynsym` sections to the ELF. While these sections appear correctly in `readelf`, there is a lingering issue where the `.strtab` entry in the section header table appears as `<corrupt>`.

1. Code Changes

   Below is the relevant code that creates the `.dynsym` entries in `elf_generate_sections()`:
```cpp
/* First entry must be NULL */
void elf_generate_sections()
{
    /* Existing Code ...
     */
    elf_write_section_int(0);  // st_name
    elf_write_section_int(0);  // st_value
    elf_write_section_int(0);  // st_size
    elf_write_section_byte(0); // st_info
    elf_write_section_byte(0); // st_other
    elf_write_section_byte(0); // st_shndx
    elf_write_section_byte(0);

    /* Second entry is the libc.so.6 */
    elf_write_section_int(1);     // st_name
    elf_write_section_int(0);     // st_value
    elf_write_section_int(0);     // st_size
    elf_write_section_byte(0x12); // st_info
    elf_write_section_byte(0);    // st_other
    elf_write_section_byte(0);    // st_shndx
    elf_write_section_byte(0);

    /* Third entry is __libc_start_main */
    elf_write_section_int(11);    // st_name
    elf_write_section_int(0);     // st_value
    elf_write_section_int(0);     // st_size
    elf_write_section_byte(0x12); // st_info
    elf_write_section_byte(0);    // st_other
    elf_write_section_byte(0);    // st_shndx
    elf_write_section_byte(0);

    /* Existing Code */

    /* Add to .shstr tab and its string */
    elf_write_section_str(".dynsym", 7);
    elf_write_section_byte(0);

    /* .dynsym section header */
    elf_write_section_int(48);
    elf_write_section_int(11);
    elf_write_section_int(0);
    elf_write_section_int(ELF_START + elf_header_len + elf_code_idx + elf_data_idx + 22 + 40 + 28); // sh_addr
    elf_write_section_int(elf_header_len + elf_code_idx + elf_data_idx + 22 + 40 + 28); // sh_offset
    elf_write_section_int(48);
    elf_write_section_int(5);
    elf_write_section_int(2);
    elf_write_section_int(4);
    elf_write_section_int(16);

    /* Existing Code */
}
```
2. Verifying with `readelf`

   After compiling and generating this ELF, running `readelf -a` shows the following (truncated for brevity):
```
ELF Header:
  Magic:   7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00
  Class:                             ELF32
  Data:                              2's complement, little endian
  ...
  Size of this header:               52 (bytes)
  Size of program headers:           32 (bytes)
  Number of program headers:         3
  Size of section headers:           40 (bytes)
  Number of section headers:         10
  Section header string table index: 9

Section Headers:
  [Nr] Name              Type            Addr     Off    Size   ES Flg Lk Inf Al
  [ 0]                   NULL            00000000 000000 000000 00
  [ 1] .text             PROGBITS        00010094 000094 003064 00 WAX
  [ 2] .data             PROGBITS        000130f8 0030f8 000068 00  WA
  [ 3] .interp           PROGBITS        00013160 003160 000016 00   A
  [ 4] .dynamic          DYNAMIC         00013176 003176 000028 00  WA
  [ 5] .dynstr           STRTAB          0001319e 00319e 00001c 00   A
  [ 6] .dynsym           DYNSYM          000131ba 0031ba 000030 10
  [ 7] .symtab           SYMTAB          00000000 0031ea 000000 10
  [ 8] <corrupt>         STRTAB          00000000 0031ea 000000 00
  [ 9] .shstrtab         STRTAB          00000000 0031ea 000040 00

Program Headers:
  Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
  ...
  INTERP         0x003160 0x00013160 0x00013160 0x00016 0x00016 R   0x4
      [Requesting program interpreter: /lib/ld-linux.so.3]
  DYNAMIC        0x003176 0x00013176 0x00013176 0x00028 0x00010 RW  0x4

Dynamic section at offset 0x3176 contains 5 entries:
  Tag        Type                         Name/Value
 0x00000006 (SYMTAB)                     0x31ba
 0x00000005 (STRTAB)                     0x319e
 0x0000000b (SYMENT)                     16 (bytes)
 0x00000001 (NEEDED)                     Shared library: [libc.so.6]
 0x00000000 (NULL)                       0x0

Symbol table '.dynsym' contains 3 entries:
   Num:    Value  Size Type    Bind   Vis      Ndx Name
     0: 00000000     0 NOTYPE  LOCAL  DEFAULT  UND
     1: 00000000     0 FUNC    GLOBAL DEFAULT  UND libc.so.6
     2: 00000000     0 FUNC    GLOBAL DEFAULT  UND __libc_start_main
```
**Observation**: The `.dynsym` table is correct and references the strings in `.dynstr`. However, section header entry `[8]` is displayed as `<corrupt>`. This suggests an error in how `.strtab` or the other string tables are appended to the ELF, causing the section header data to be misaligned or overwritten.

3. Next Steps

   1. **Investigate `.strtab` Corruption**
      - Check how the `.symtab` and `.strtab` sections are added to the ELF. There might be a miscalculation in offsets or lengths, leading to existing data being overwritten.
      - Verify the sizes and offsets for all sections after `.dynsym` to ensure they do not overlap each other.
   2. **Implement External Symbol Handling**
      - Create a function to load external function names and sizes. This will help populate `.dynstr`, `.dynsym`, `.got`, and `.plt`.
   3. **Implement `rel.plt`**
      - Use `R_ARM_JUMP_SLOT` relocations in `rel.plt`.
      - Provide a `plt` stub for `__libc_start_main()` (and any other external functions) to handle calling conventions and symbol resolution at runtime.

## Useful materials
- [[Spec] RISC-V ABIs Specification](https://uim.fei.stuba.sk/wp-content/uploads/2022/10/riscv-abi-2022.pdf)
- [[Spec] Acronyms relevant to Executable and Linkable Format (ELF)](https://web.archive.org/web/20190428202733/https://www.cs.stevens.edu/~jschauma/631/elf.html)
- [[Blog] RISC-V from scratch 2: Hardware layouts, linker scripts, and C runtimes](https://twilco.github.io/riscv-from-scratch/2019/04/27/riscv-from-scratch-2.html)
- [[Blog] What is SHECC](https://hackmd.io/@sysprog/HkqE5DKqP)
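
As a note on the `rel.plt` step listed under Record 05's next steps, the encoding of a single `R_ARM_JUMP_SLOT` entry can be sketched as below. This is only an illustration, not code from shecc: `emit_rel_entry` and `elf32_r_info` are hypothetical helper names, the GOT slot address in the usage note is made up, and the sketch assumes the ARM32 convention of 8-byte `Elf32_Rel` entries (`r_offset`, `r_info`) without an addend.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#ifndef R_ARM_JUMP_SLOT
#define R_ARM_JUMP_SLOT 22 /* 0x16, same value as in <elf.h> */
#endif

/* ELF32 r_info packing: symbol index in the upper 24 bits,
 * relocation type in the low 8 bits. */
static uint32_t elf32_r_info(uint32_t sym, uint32_t type)
{
    return (sym << 8) | (type & 0xff);
}

/* Hypothetical helper: write one little-endian Elf32_Rel entry.
 * got_slot_addr is the virtual address of the GOT slot to patch;
 * dynsym_index is the symbol's index in .dynsym. */
void emit_rel_entry(unsigned char out[8], uint32_t got_slot_addr,
                    uint32_t dynsym_index)
{
    assert(out != NULL);
    uint32_t info = elf32_r_info(dynsym_index, R_ARM_JUMP_SLOT);
    for (int i = 0; i < 4; i++) {
        out[i] = (unsigned char) ((got_slot_addr >> (8 * i)) & 0xff); /* r_offset */
        out[4 + i] = (unsigned char) ((info >> (8 * i)) & 0xff);      /* r_info */
    }
}
```

For example, `__libc_start_main` is entry 2 of `.dynsym` in the Record 05 dump, so `emit_rel_entry(buf, 0x00013200, 2)` (with `0x00013200` standing in for a GOT slot address) would store an `r_info` of `0x216`.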