
Dynamic library support for shecc

黎詠哲, 李協儒

Motivation

Currently, shecc can only generate statically-linked ELF executables. This results in inefficient space utilization. Therefore, your task is to provide an option to produce dynamically-linked ELF executables that link to glibc.

Methodology

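ELF offers two "views" of the same file, the execution view described by program headers and the linking view described by section headers, as shown below: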

              +-----------------+
         +----| ELF File Header |----+
         |    +-----------------+    |
         v                           v
 +-----------------+      +-----------------+
 | Program Headers |      | Section Headers |
 +-----------------+      +-----------------+
      ||                               ||
      ||                               ||
      ||                               ||
      ||   +------------------------+  ||
      +--> | Contents (Byte Stream) |<--+
           +------------------------+
In practice, a typical ELF executable is laid out on disk like this:

    +-------------------------------+
    | ELF File Header               |
    +-------------------------------+
    | Program Header for segment #1 |
    +-------------------------------+
    | Program Header for segment #2 |
    +-------------------------------+
    | ...                           |
    +-------------------------------+
    | Contents (Byte Stream)        |
    | ...                           |
    +-------------------------------+
    | Section Header for section #1 |
    +-------------------------------+
    | Section Header for section #2 |
    +-------------------------------+
    | ...                           |
    +-------------------------------+
    | ".shstrtab" section           |
    +-------------------------------+
    | ".symtab"   section           |
    +-------------------------------+
    | ".strtab"   section           |
    +-------------------------------+
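
Because the regions are written back to back, every file offset follows from the sizes of the regions before it. The sketch below is a minimal illustration, not shecc code: `elf32_layout_t` and `elf32_layout()` are hypothetical names, and it assumes all section contents are placed before the section header table (which is how the shecc output later in this note is arranged).

```c
#include <stdint.h>

/* Sizes fixed by the ELF32 format. */
enum { EHDR_SIZE = 52, PHDR_SIZE = 32, SHDR_SIZE = 40 };

/* Hypothetical helper type: file offsets of each region. */
typedef struct {
    uint32_t phoff;   /* start of the program header table */
    uint32_t content; /* start of the byte stream (code, data, ...) */
    uint32_t shoff;   /* start of the section header table */
    uint32_t end;     /* total file size */
} elf32_layout_t;

/* content_len covers everything between the program headers and the
 * section header table. */
static elf32_layout_t elf32_layout(uint32_t phnum, uint32_t content_len,
                                   uint32_t shnum)
{
    elf32_layout_t l;
    l.phoff = EHDR_SIZE;
    l.content = l.phoff + phnum * PHDR_SIZE;
    l.shoff = l.content + content_len;
    l.end = l.shoff + shnum * SHDR_SIZE;
    return l;
}
```

Adding a program header moves `content` and `shoff` by 32 bytes each, so every stored offset that depends on them has to be recomputed as well.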

To support dynamic linking, we need to add headers and sections to the ELF file. We follow the order of the layout above and write the necessary data to the output ELF file piece by piece. The headers and sections we need to write are listed below.

Program headers

  • DYNAMIC: For dynamic binaries, this segment holds dynamic linking information and usually coincides with the .dynamic section in ELF's linking view.
  • INTERP: For dynamic binaries, this holds the full pathname of the runtime linker ld.so. This segment coincides with the .interp section in ELF's linking view.

Section headers

  • .dynamic: For dynamic binaries, this section holds dynamic linking information used by ld.so.
  • .dynstr: NULL-terminated strings of names of symbols in .dynsym section.
  • .dynsym: Runtime/Dynamic symbol table. For dynamic binaries, this section is the symbol table of globally visible symbols. For example, if a dynamic link library wants to export its symbols, these symbols will be stored here. On the other hand, if a dynamic executable binary uses symbols from a dynamic link library, then these symbols are stored here too. The symbol names (as NULL-terminated strings) are stored in .dynstr section.
  • .got: For dynamic binaries, this Global Offset Table holds the addresses of variables which are relocated upon loading.
  • .got.plt: For dynamic binaries, this Global Offset Table holds the addresses of functions in dynamic libraries. They are used by trampoline code in .plt section. If .got.plt section is present, it contains at least three entries, which have special meanings.
  • .interp: For dynamic binaries, this holds the full pathname of runtime linker ld.so.
  • .plt: For dynamic binaries, this Procedure Linkage Table holds the trampoline/linkage code.
  • .rela.dyn: Runtime/Dynamic relocation table. For dynamic binaries, this relocation table holds information about variables which must be relocated upon loading. Each entry in this table is a struct Elf64_Rela, or Elf32_Rela in a 32-bit ELF file (see /usr/include/elf.h).
  • .rela.plt: Runtime/Dynamic relocation table. This relocation table is similar to the one in .rela.dyn section; the difference is this one is for functions, not variables.
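
To make the relation between .dynsym and .dynstr concrete: a symbol entry stores only a byte offset, and the name is the NUL-terminated string found at that offset in the string table. The lookup can be sketched as follows; `strtab_get()` is a hypothetical helper written for this note, not part of shecc or glibc.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Return the string at byte offset `off` in a string table, or NULL if
 * the offset is out of range or the string is not NUL-terminated inside
 * the table. */
static const char *strtab_get(const char *tab, size_t size, uint32_t off)
{
    if (off >= size)
        return NULL;
    if (memchr(tab + off, '\0', size - off) == NULL)
        return NULL;
    return tab + off;
}
```

The same lookup applies to section names: sh_name in a section header is an offset into .shstrtab.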

Current Progress (Updated in real time)

  • Wrote the INTERP program header to the ELF file, but doing so broke the section headers.
  • Listed all the necessary headers and sections for dynamic linking.
  • We know which headers and sections shecc needs to write to the ELF file, but we are still working out the details of each one. For now we write these headers to the ELF file one at a time and check that each addition does not break the others.
  • Implementation and testing will take some time. We will keep updating the progress.

Implementation Records

Record 01: Tried to generate a dynamically-linked ELF file, but failed.

We wrote the necessary data to the ELF file and used readelf -a to check whether it was written correctly. Below is part of the readelf output for the ELF file generated by our modified shecc.

  Entry point address:               0x10068
  Start of program headers:          52 (bytes into file)
  Start of section headers:          12627 (bytes into file)
  Flags:                             0x5000200, Version5 EABI, soft-float ABI
  Size of this header:               52 (bytes)
  Size of program headers:           32 (bytes)
  Number of program headers:         2
  Size of section headers:           40 (bytes)
  Number of section headers:         6
  Section header string table index: 5
...

Section Headers:
  [Nr] Name              Type            Addr     Off    Size   ES Flg Lk Inf Al
  [ 0] <no-strings>      LOPROC+0x274732 00000000 000000 000000 00 XMSIOxxxo  0   0  0
  [ 1] <no-strings>      NULL            0000000b 000001 000007 00     65620  84 12388
  [ 2] <no-strings>      RELA            00000011 000001 000003 0c     78008 12472 96
  [ 3] <no-strings>      RELA            00000017 000002 000000 0c      0 12568  0
  [ 4] <no-strings>      RELA            0000001f 000003 000000 0c   M  0 12568  0
  [ 5] <no-strings>      PROGBITS        00000001 000003 000000 00      0 12568 39
...

Program Headers:
  Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
  INTERP         0x000054 0x00010054 0x00010054 0x00014 0x00014 R   0x4
      [Requesting program interpreter: ]
  LOAD           0x000054 0x00010054 0x00010054 0x030c4 0x030c4 RWE 0x4

We wrote the INTERP program header to the ELF file successfully, but broke the section headers in the process. The likely reason is that we did not update the ELF header field Start of section headers (e_shoff) correctly.

Prepare skeleton code like listing.s and tweak shecc's RISC-V code generation to adapt the ELF headers and sections.

Record 02: ELF Header Modifications

We modified the elf_generate_header() function to support dynamic linking by selecting the ELF type conditionally. Below are the modified elf_generate_header() and the header of the ELF file that shecc then generated.

void elf_generate_header(int dynamic_linking_enabled) {
    // ELF Magic number and identification
    elf_write_header_int(0x464c457f);    // Magic: 0x7F followed by ELF
    elf_write_header_byte(1);            // 32-bit
    elf_write_header_byte(1);            // little-endian
    elf_write_header_byte(1);            // EI_VERSION
    elf_write_header_byte(0);            // System V
    
    // Key modification: Conditional ELF type selection
    elf_write_header_int(dynamic_linking_enabled ? 3 : 2);  // ET_DYN or ET_EXEC
}
  

Verifying the header modification showed that the type changed to DYN:

ELF Header:
  Magic:   7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00 
  Class:                             ELF32
  Data:                              2's complement, little endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - System V
  ABI Version:                       0
  Type:                              DYN (Shared object file)
  Machine:                           None
  Version:                           0x1002800
  Entry point address:               0x54000000
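
One detail worth checking against the output above: in ELF32, e_type and e_machine are 2-byte fields and e_version is a 4-byte field. The "Machine: None" line and the odd version and entry point values are what one would expect if the bytes after e_type were shifted. The following is a minimal sketch of writing these three fields with explicit widths. The helper names (`put_le16`, `put_le32`, `put_type_machine_version`) are hypothetical and do not exist in shecc.

```c
#include <stddef.h>
#include <stdint.h>

/* Append a 16-bit value in little-endian order; returns the new length. */
static size_t put_le16(uint8_t *buf, size_t len, uint16_t v)
{
    buf[len++] = (uint8_t)(v & 0xff);
    buf[len++] = (uint8_t)(v >> 8);
    return len;
}

/* Append a 32-bit value in little-endian order; returns the new length. */
static size_t put_le32(uint8_t *buf, size_t len, uint32_t v)
{
    len = put_le16(buf, len, (uint16_t)(v & 0xffff));
    return put_le16(buf, len, (uint16_t)(v >> 16));
}

/* Write e_type, e_machine and e_version (the 8 bytes after e_ident). */
static size_t put_type_machine_version(uint8_t *buf, size_t len,
                                       int dynamic, uint16_t machine)
{
    len = put_le16(buf, len, dynamic ? 3 : 2); /* ET_DYN : ET_EXEC */
    len = put_le16(buf, len, machine);         /* e.g. 40 = EM_ARM */
    return put_le32(buf, len, 1);              /* EV_CURRENT */
}
```

With widths spelled out per field, the three fields always occupy exactly 8 bytes, so the fields that follow stay at their expected offsets.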

Record 03: Dynamic Linking Section Implementation

Section String Table Additions

Added the following dynamic linking related sections to the string table:

  • .dynstr
  • .dynsym
  • .dynamic
  • .interp
  • .rela.plt
  • .plt
  • .got

shstr section table

/* shstr section; len = 39(static), 93(dynamic) */
    elf_write_section_byte(0);
    elf_write_section_str(".shstrtab", 9);
    elf_write_section_byte(0);
    elf_write_section_str(".text", 5);
    elf_write_section_byte(0);
    elf_write_section_str(".data", 5);
    elf_write_section_byte(0);
    if (dynamic_linking_enabled) {
        elf_write_section_str(".dynstr", 7);
        elf_write_section_byte(0);
        elf_write_section_str(".dynsym", 7);
        elf_write_section_byte(0);
        elf_write_section_str(".dynamic", 8);
        elf_write_section_byte(0);
        elf_write_section_str(".interp", 8); /* length 8 also copies the NUL, so later name offsets shift by one */
        elf_write_section_byte(0);
        elf_write_section_str(".rela.plt", 9);
        elf_write_section_byte(0);
        elf_write_section_str(".plt", 4);
        elf_write_section_byte(0);
        elf_write_section_str(".got", 4);
        elf_write_section_byte(0);
    }
    elf_write_section_str(".symtab", 7);
    elf_write_section_byte(0);
    elf_write_section_str(".strtab", 7);
    elf_write_section_byte(0);
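
The hand-counted lengths and offsets above are easy to get wrong. One possible alternative, shown here only as a sketch, is a small builder that appends each name and returns its offset, so sh_name values never need to be hard-coded. `strtab_t`, `strtab_init()` and `strtab_add()` are hypothetical names, and the fixed 256-byte capacity is an arbitrary assumption.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical fixed-capacity string table. */
typedef struct {
    char buf[256];
    size_t len;
} strtab_t;

static void strtab_init(strtab_t *t)
{
    t->buf[0] = '\0'; /* index 0 is the empty name */
    t->len = 1;
}

/* Append `name` plus its NUL; return its offset, or 0 if it does not fit. */
static size_t strtab_add(strtab_t *t, const char *name)
{
    size_t n = strlen(name) + 1;
    if (t->len + n > sizeof t->buf)
        return 0;
    size_t off = t->len;
    memcpy(t->buf + off, name, n);
    t->len += n;
    return off;
}
```

For the five static names (.shstrtab, .text, .data, .symtab, .strtab) this yields offsets 1, 11, 17, 23 and 31 and a total length of 39, matching the static case in the comment above.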

Implemented section headers following the Elf32_Shdr structure:

Elf32_Shdr

typedef struct {
        Elf32_Word      sh_name;
        Elf32_Word      sh_type;
        Elf32_Word      sh_flags;
        Elf32_Addr      sh_addr;
        Elf32_Off       sh_offset;
        Elf32_Word      sh_size;
        Elf32_Word      sh_link;
        Elf32_Word      sh_info;
        Elf32_Word      sh_addralign;
        Elf32_Word      sh_entsize;
} Elf32_Shdr;

Dynamic Sections Generation

if (dynamic_linking_enabled) {
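        /* .interp section header */
        elf_write_section_int(48);                         /* sh_name: offset in .shstrtab */
        elf_write_section_int(1);                          /* sh_type: SHT_PROGBITS */
        elf_write_section_int(2);                          /* sh_flags: SHF_ALLOC */
        elf_write_section_int(ELF_START + elf_header_len); /* sh_addr */
        elf_write_section_int(elf_header_len);             /* sh_offset */
        elf_write_section_int(19);                         /* sh_size */
        elf_write_section_int(0);                          /* sh_link */
        elf_write_section_int(0);                          /* sh_info */
        elf_write_section_int(1);                          /* sh_addralign */
        elf_write_section_int(0);                          /* sh_entsize */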

        /* .dynsym section header */
        elf_write_section_int(31);
        elf_write_section_int(11);
        elf_write_section_int(2);
        elf_write_section_int(ELF_START + elf_header_len + 19);
        elf_write_section_int(elf_header_len + 19);
        elf_write_section_int(33);
        elf_write_section_int(4);
        elf_write_section_int(1);
        elf_write_section_int(4);
        elf_write_section_int(16);

        /* .dynstr section header */
        elf_write_section_int(23);
        elf_write_section_int(3);
        elf_write_section_int(2);
        elf_write_section_int(ELF_START + elf_header_len + 19 + 33);
        elf_write_section_int(elf_header_len + 19 + 33);
        elf_write_section_int(10);
        elf_write_section_int(0);
        elf_write_section_int(0);
        elf_write_section_int(1);
        elf_write_section_int(0);

        /* .dynamic section header */
        elf_write_section_int(39);
        elf_write_section_int(6);
        elf_write_section_int(3);
        elf_write_section_int(ELF_START + elf_header_len + 62);
        elf_write_section_int(elf_header_len + 62);
        elf_write_section_int(16);
        elf_write_section_int(4);
        elf_write_section_int(0);
        elf_write_section_int(4);
        elf_write_section_int(8);

        /* .rela.plt section header */
        elf_write_section_int(57);
        elf_write_section_int(4);
        elf_write_section_int(66);
        elf_write_section_int(ELF_START + elf_header_len + 78);
        elf_write_section_int(elf_header_len + 78);
        elf_write_section_int(12);
        elf_write_section_int(2);
        elf_write_section_int(8);
        elf_write_section_int(4);
        elf_write_section_int(12);

        /* .plt section header */
        elf_write_section_int(67);
        elf_write_section_int(1);
        elf_write_section_int(6);
        elf_write_section_int(ELF_START + elf_header_len + 90);
        elf_write_section_int(elf_header_len + 90);
        elf_write_section_int(12);
        elf_write_section_int(0);
        elf_write_section_int(0);
        elf_write_section_int(4);
        elf_write_section_int(16);

        /* .got section header */
        elf_write_section_int(72);
        elf_write_section_int(1);
        elf_write_section_int(3);
        elf_write_section_int(ELF_START + elf_header_len + 102);
        elf_write_section_int(elf_header_len + 102);
        elf_write_section_int(12);
        elf_write_section_int(0);
        elf_write_section_int(0);
        elf_write_section_int(4);
        elf_write_section_int(4);
    }
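
Each header above is ten consecutive 32-bit words in Elf32_Shdr field order. A sketch of the same idea with named fields might look like this. `shdr32_t` and `put_shdr()` are hypothetical and write into a plain byte buffer rather than through shecc's elf_write_section_int().

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint32_t sh_name, sh_type, sh_flags, sh_addr, sh_offset;
    uint32_t sh_size, sh_link, sh_info, sh_addralign, sh_entsize;
} shdr32_t; /* same field order as Elf32_Shdr */

/* Serialize one header as ten little-endian words; returns new length. */
static size_t put_shdr(uint8_t *buf, size_t len, const shdr32_t *h)
{
    const uint32_t f[10] = { h->sh_name, h->sh_type, h->sh_flags,
                             h->sh_addr, h->sh_offset, h->sh_size,
                             h->sh_link, h->sh_info, h->sh_addralign,
                             h->sh_entsize };
    for (int i = 0; i < 10; i++)
        for (int b = 0; b < 4; b++)
            buf[len++] = (uint8_t)(f[i] >> (8 * b));
    return len;
}
```

Naming the fields at the call site makes it harder to swap, say, sh_link and sh_info, and each call always emits exactly 40 bytes.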

Next, we implemented elf_generate_dynamic_sections() to create:

  • .dynamic section with DT_NEEDED entries
  • .interp section pointing to "/lib/ld-linux.so.3"
  • .dynstr section containing "libc.so.6"
  • .dynsym section with global symbol entries
  • .rela.plt section for relocation entries
  • .plt section with ARM32 PLT entries
  • .got section with three global offset table entries

void elf_generate_dynamic_sections()

void elf_generate_dynamic_sections() {
    /* .dynamic section*/
    elf_write_section_int(1);                /* DT_NEEDED */
    elf_write_section_int(elf_strtab_index); /* offset in .dynstr */
    elf_write_section_int(0);                /* DT_NULL */
    elf_write_section_int(0);                /* End of .dynamic */

    /* .interp section */
    elf_write_section_str("/lib/ld-linux.so.3", 18); /* interpreter */
    elf_write_section_int(0);                        /* End of .interp */

    /* .dynstr section */
    elf_write_section_str("libc.so.6", 9); /* dynamic linked library */
    elf_write_section_byte(0);             /* NUL terminator of "libc.so.6" */

    /* dynsym section*/
    elf_write_section_int(0); /* NULL entry */
    elf_write_section_int(0);
    elf_write_section_int(0);
    elf_write_section_byte(0);
    elf_write_section_byte(0);
    elf_write_section_byte(0 & 0xFF);
    elf_write_section_byte(0 >> 8 & 0xFF); /* SHN_UNDEF*/

    elf_write_section_int(1); /* offset to "libc.so.6" */
    elf_write_section_int(0);
    elf_write_section_int(0);
    elf_write_section_byte(0x10); /* st_info: (STB_GLOBAL << 4) | STT_NOTYPE */
    elf_write_section_byte(0);
    elf_write_section_byte(0 & 0xFF);
    elf_write_section_byte(0 >> 8 & 0xFF); /* SHN_UNDEF */

    elf_write_section_int(0); /* End of .dynsym*/

    /* .rela.plt section */
    elf_write_section_int(0);
    elf_write_section_int(0x16); /* R_ARM_JUMP_SLOT */
    elf_write_section_int(0);    /* r_addend*/

    /* .plt section */
    /* ARM32 PLT entry */
    elf_write_code_int(0xe28fc600); /* add ip, pc, #0 */
    elf_write_code_int(0xe28cca00); /* add ip, ip, #0 */
    elf_write_code_int(0xe5bcf000); /* ldr pc, [ip, #0]! */

    /* .got section */
    elf_write_section_int(0); /* GOT[0]: Reserved */
    elf_write_section_int(0); /* GOT[1]: Reserved */
    elf_write_section_int(0); /* GOT[2]: "libc.so.6" entry */
}
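
The st_info byte of a symbol entry packs the binding into the upper four bits and the type into the lower four, which is what the ELF32_ST_INFO macro in <elf.h> computes. A minimal sketch follows. The enum names are local stand-ins for STB_* and STT_*, chosen so the block compiles without <elf.h>.

```c
#include <stdint.h>

/* Values from the ELF specification (named STB_* / STT_* in <elf.h>). */
enum { BIND_LOCAL = 0, BIND_GLOBAL = 1 };
enum { TYPE_NOTYPE = 0, TYPE_OBJECT = 1, TYPE_FUNC = 2 };

/* Pack binding and type the way ELF32_ST_INFO does. */
static uint8_t sym_info(unsigned bind, unsigned type)
{
    return (uint8_t)((bind << 4) | (type & 0xf));
}
```

So a global symbol with no type gives 0x10, a global object gives 0x11, and a global function gives 0x12.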

At the start of elf_generate_sections(), the compiler generates the dynamic sections first.

  • void elf_generate_sections
void elf_generate_sections(int dynamic_linking_enabled) {
    if (dynamic_linking_enabled) {
        elf_generate_dynamic_sections();
    }
    /* existing code ... */

Current Issues

  1. Build Failures
    The implementation currently produces build errors in stage 2:
$ make
GEN   out/libc.inc
CC    out/src/main.o
LD    out/shecc
SHECC out/shecc-stage1.elf
SHECC out/shecc-stage2.elf
qemu-arm: out/shecc-stage1.elf: Invalid ELF image for this architecture
make: *** [Makefile:114: out/shecc-stage2.elf] Error 255
  2. ELF Validation Errors
    readelf analysis reveals several critical issues:
$ readelf -a
readelf: Error: Too many program headers - 0x2000 - the file is not that big
readelf: Warning: The e_shentsize field in the ELF header is larger than the size of an ELF section header
readelf: Error: Reading 2621440 bytes extends past end of file for section headers
readelf: Error: Section headers are not available!
readelf: Error: Too many program headers - 0x2000 - the file is not that big
readelf: Error: Too many program headers - 0x2000 - the file is not that big

This indicates:

  1. Invalid program header count (0x2000)
  2. Incorrect e_shentsize in ELF header
  3. Section header offset extends beyond file size
  4. Incorrect section count (10240)
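
Checks of this kind can also be run on the header fields before the file is written. The sketch below approximates the sanity checks listed above. It is not readelf's actual logic, and `elf32_tables_t` and `elf32_check_tables()` are hypothetical names.

```c
#include <stdint.h>

/* Subset of the ELF32 header fields that describe the two tables. */
typedef struct {
    uint32_t e_phoff, e_shoff;
    uint16_t e_phentsize, e_phnum, e_shentsize, e_shnum, e_shstrndx;
} elf32_tables_t;

/* Return 0 if the tables fit in a file of `file_size` bytes, else a
 * small error code identifying the first failed check. */
static int elf32_check_tables(const elf32_tables_t *h, uint64_t file_size)
{
    if (h->e_phentsize != 32)
        return 1; /* wrong program header entry size */
    if (h->e_shentsize != 40)
        return 2; /* wrong section header entry size */
    if ((uint64_t)h->e_phoff + (uint64_t)h->e_phnum * 32 > file_size)
        return 3; /* program header table extends past end of file */
    if ((uint64_t)h->e_shoff + (uint64_t)h->e_shnum * 40 > file_size)
        return 4; /* section header table extends past end of file */
    if (h->e_shnum != 0 && h->e_shstrndx >= h->e_shnum)
        return 5; /* string table index out of range */
    return 0;
}
```

Running such a check in a debug build would flag a header like the one in this record immediately, without a round trip through readelf.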
  • ELF file
ELF Header:
  Magic:   7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00 
  Class:                             ELF32
  Data:                              2's complement, little endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - System V
  ABI Version:                       0
  Type:                              DYN (Shared object file)
  Machine:                           None
  Version:                           0x1002800
  Entry point address:               0x54000000
  Start of program headers:          872415488 (bytes into file)
  Start of section headers:          1056964608 (bytes into file)
  Flags:                             0x31
  Size of this header:               2 (bytes)
  Size of program headers:           13317 (bytes)
  Number of program headers:         8192
  Size of section headers:           256 (bytes)
  Number of section headers:         10240
  Section header string table index: 1536

There is no dynamic section in this file.

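The ELF header does not report the correct size or count for either the program headers or the section headers. The values look as if every field after e_type has been shifted by three bytes: for example, the entry point 0x54000000 is the low byte of 0x10054 pushed to the top of the word. That is consistent with the 2-byte e_type field being written through the 4-byte elf_write_header_int() in Record 02.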

Next Steps

To fix these problems, we need to correct the calculation of:

  1. Program header count and size
  2. Section header count and size
  3. File offsets for all sections

Action Items

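  1. First, use a simple approach to produce a correct, valid executable.
  2. Decide whether each section really needs to exist, and confirm which ELF sections are required.
  3. .plt acts like a cache that assists .got. Like a cache, it can have its symbols replaced (this applies to a full ELF loader).
  4. .got establishes the association with symbol names.
  5. .shtab is required.
  6. rela.plt needs machine code because the __libc_start_main() entry point requires argc and argv to be pushed onto the stack (calling convention).
  7. Work out how to perform relocation with .rel (see the specification), including alignment issues. The relocation for "__libc_start_main" must be done correctly.
  8. Page alignment (see amacc/amacc.c#2013).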
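
For the page alignment item, two small helpers capture the arithmetic. They are a generic sketch and are not taken from amacc or shecc. The rule encoded in `load_congruent()` is the ELF requirement that, for a loadable segment, p_offset and p_vaddr are congruent modulo p_align.

```c
#include <stdint.h>

/* Round `x` up to a multiple of `align`, which must be a power of two. */
static uint32_t align_up(uint32_t x, uint32_t align)
{
    return (x + align - 1) & ~(align - 1);
}

/* For a loadable segment, the file offset and virtual address must be
 * congruent modulo the alignment. */
static int load_congruent(uint32_t offset, uint32_t vaddr, uint32_t align)
{
    return align <= 1 || (offset % align) == (vaddr % align);
}
```

For example, a segment at file offset 0x54 and address 0x10054 passes the check for a 0x1000-byte page, while the same offset mapped at 0x10000 does not.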

Record 04: Dynamic Linking Section Implementation

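Following the first action item, I generated a correct and valid ELF file with gcc as a reference, then made shecc generate a correct ELF header and add .dynamic to both the program header table and the section header table. The current problem is that the section names are not displayed correctly in the section header table. In the output below, .shstrtab (section 6) is reported at offset 0x3118, while the section header table starts at 12664 (0x3178), so a 0x30-byte .shstrtab placed directly before it would begin at 0x3148. The 0x30-byte gap matches the added 32-byte program header plus the 16-byte .dynamic section, which suggests the .shstrtab offset was not updated and the names are being read from .data.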

  • ELF generated by shecc
ELF Header:
Magic:   7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00 
Class:                             ELF32
Data:                              2's complement, little endian
Version:                           1 (current)
OS/ABI:                            UNIX - System V
ABI Version:                       0
Type:                              DYN (Shared object file)
Machine:                           ARM
Version:                           0x1
Entry point address:               0x10054
Start of program headers:          52 (bytes into file)
Start of section headers:          12664 (bytes into file)
Flags:                             0x5000200, Version5 EABI, soft-float ABI
Size of this header:               52 (bytes)
Size of program headers:           32 (bytes)
Number of program headers:         2
Size of section headers:           40 (bytes)
Number of section headers:         7
Section header string table index: 6

Section Headers:
  [Nr] Name              Type            Addr     Off    Size   ES Flg Lk Inf Al
  [ 0] ree detected^J    NULL            00000000 000000 000000 00      0   0  0
  [ 1] d^J               PROGBITS        00010054 000054 003064 00 WAX  0   0  4
  [ 2]                   PROGBITS        000130b8 0030b8 000060 00  WA  0   0  4
  [ 3]  World^J          DYNAMIC         00000000 000000 000010 08  WA  4   0  4
  [ 4] ^A                SYMTAB          00000000 003118 000000 10      4   0  4
  [ 5]                   STRTAB          00000000 003128 000000 00      0   0  1
  [ 6] ee detected^J     STRTAB          00000000 003118 000030 00      0   0  1
Key to Flags:
  W (write), A (alloc), X (execute), M (merge), S (strings), I (info),
  L (link order), O (extra OS processing required), G (group), T (TLS),
  C (compressed), x (unknown), o (OS specific), E (exclude),
  D (mbind), y (purecode), p (processor specific)

There are no section groups in this file.

Program Headers:
  Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
  LOAD           0x000054 0x00010054 0x00010054 0x030c4 0x030c4 RWE 0x4
  DYNAMIC        0x003118 0x00013118 0x00013118 0x00010 0x00010 RW  0x4

 Section to Segment mapping:
  Segment Sections...
   00     d^J  
   01     

Dynamic section at offset 0x3118 contains 2 entries:
  Tag        Type                         Name/Value
 0x20656572 (<unknown>: 20656572)        0x65746564
 0x64657463 (Operating System specific: 64657463)        0x6425000a

There are no relocations in this file.

Symbol table '^A' contains 0 entries:
   Num:    Value  Size Type    Bind   Vis      Ndx Name

No version information found in this file.

void elf_generate_header() {
     /* ELF header */
     elf_write_header_int(0x464c457f); /* Magic: 0x7F followed by ELF */
     elf_write_header_byte(1);         /* 32-bit */
@@ -76,14 +65,14 @@ void elf_generate_header()
     elf_write_header_byte(0);         /* System V */
     elf_write_header_int(0);          /* EI_ABIVERSION */
     elf_write_header_int(0);          /* EI_PAD: unused */
-    elf_write_header_byte(2);         /* ET_EXEC */
+    elf_write_header_byte(3);         /* ET_DYN */
     elf_write_header_byte(0);
     elf_write_header_byte(ELF_MACHINE);
     elf_write_header_byte(0);
     elf_write_header_int(1);                          /* ELF version */
     elf_write_header_int(ELF_START + elf_header_len); /* entry point */
-    elf_write_header_int(0x34); /* program header offset */
-    elf_write_header_int(elf_header_len + elf_code_idx + elf_data_idx + 39 +
+    elf_write_header_int(0x34);                       /* program header offset */
+    elf_write_header_int(elf_header_len + elf_code_idx + elf_data_idx + 48 + 32 + 16 +
                          elf_symtab_index +
                          elf_strtab_index); /* section header offset */
     /* flags */
@@ -92,13 +81,13 @@ void elf_generate_header()
     elf_write_header_byte(0);
     elf_write_header_byte(0x20); /* program header size */
     elf_write_header_byte(0);
-    elf_write_header_byte(1); /* number of program headers */
+    elf_write_header_byte(2); /* number of program headers */
     elf_write_header_byte(0);
     elf_write_header_byte(0x28); /* section header size */
     elf_write_header_byte(0);
-    elf_write_header_byte(6); /* number of sections */
+    elf_write_header_byte(7); /* number of sections */
     elf_write_header_byte(0);
-    elf_write_header_byte(5); /* section index with names */
+    elf_write_header_byte(6); /* section index with names */
     elf_write_header_byte(0);

     /* program header - code and data combined */
@@ -110,10 +99,26 @@ void elf_generate_header()
     elf_write_header_int(elf_code_idx + elf_data_idx); /* size in memory */
     elf_write_header_int(7);                           /* flags */
     elf_write_header_int(4);                           /* alignment */

+    /* program header - dynamic segment */
+    elf_write_header_int(2);                                                        /* PT_DYNAMIC */
+    elf_write_header_int(elf_header_len + elf_code_idx + elf_data_idx);             /* offset of segment */
+    elf_write_header_int(ELF_START + elf_header_len + elf_code_idx + elf_data_idx); /* virtual address */
+    elf_write_header_int(ELF_START + elf_header_len + elf_code_idx + elf_data_idx); /* physical address */
+    elf_write_header_int(16);                                                       /* size in file */
+    elf_write_header_int(16);                                                       /* size in memory */
+    elf_write_header_int(6);                                                        /* flags */
+    elf_write_header_int(4);                                                        /* alignment */
 }

void elf_generate_sections() {
+    /* .dynamic section*/
+    elf_write_section_int(1); /* DT_NEEDED */
+    elf_write_section_int(elf_header_len + elf_code_idx + elf_data_idx +
+                          elf_symtab_index + 16); /* offset in .dynstr */
+    elf_write_section_int(0);                     /* DT_NULL */
+    elf_write_section_int(0);                     /* End of .dynamic */
+
     /* symtab section */
     for (int b = 0; b < elf_symtab_index; b++)
         elf_write_section_byte(elf_symtab[b]);
@@ -130,6 +135,8 @@ void elf_generate_sections()
     elf_write_section_byte(0);
     elf_write_section_str(".data", 5);
     elf_write_section_byte(0);
+    elf_write_section_str(".dynamic", 8);
+    elf_write_section_byte(0);
     elf_write_section_str(".symtab", 7);
     elf_write_section_byte(0);
     elf_write_section_str(".strtab", 7);
@@ -173,8 +180,20 @@ void elf_generate_sections()
     elf_write_section_int(4);
     elf_write_section_int(0);

+    /* .dynamic section header */
+    elf_write_section_int(23);
+    elf_write_section_int(6);
+    elf_write_section_int(3);
+    elf_write_section_int(0);  // sh_addr
+    elf_write_section_int(0);  // sh_offset
+    elf_write_section_int(16);
+    elf_write_section_int(4);
+    elf_write_section_int(0);
+    elf_write_section_int(4);
+    elf_write_section_int(8);
+
     /* .symtab */
-    elf_write_section_int(0x17);
+    elf_write_section_int(32);
     elf_write_section_int(2);
     elf_write_section_int(0);
     elf_write_section_int(0);
@@ -186,12 +205,12 @@ void elf_generate_sections()
     elf_write_section_int(16);

     /* .strtab */
-    elf_write_section_int(0x1f);
+    elf_write_section_int(40);
     elf_write_section_int(3);
     elf_write_section_int(0);
     elf_write_section_int(0);
     elf_write_section_int(elf_header_len + elf_code_idx + elf_data_idx +
-                          elf_symtab_index);
+                          elf_symtab_index + 16);
     elf_write_section_int(elf_strtab_index); /* size */
     elf_write_section_int(0);
     elf_write_section_int(0);
@@ -205,15 +224,14 @@ void elf_generate_sections()
     elf_write_section_int(0);
     elf_write_section_int(elf_header_len + elf_code_idx + elf_data_idx +
                           elf_symtab_index + elf_strtab_index);
-    elf_write_section_int(39);
+    elf_write_section_int(48);
     elf_write_section_int(0);
     elf_write_section_int(0);
     elf_write_section_int(1);
     elf_write_section_int(0);
 }

Another action item was to decide which sections are necessary. The sections that must be added for dynamic linking are .dynamic, .interp, .got, .rela.plt and .rela.

Record 05: Generate Correct Dynamic Program Header and Section Table

Github Repo
Goal: Add PT_INTERP and PT_DYNAMIC entries to the program header and define the corresponding .interp and .dynamic sections, so the generated ELF file is properly linked to the shared library libc.so.6 and uses the dynamic linker /lib/ld-linux.so.3.

  1. Adjusting the ELF Header Length

    In src/globals.c, we increase elf_header_len from 0x54 to 0x94. The reason is that each program header entry is 32 bytes, and we need two additional entries (for PT_INTERP and PT_DYNAMIC).

  • src/globals.c
/* Existing Code */
/* ELF sections */

char *elf_code;
int elf_code_idx = 0;
char *elf_data;
int elf_data_idx = 0;
char *elf_header;
int elf_header_idx = 0;
- int elf_header_len = 0x54; /* ELF fixed: 0x34 + 1 * 0x20 */
+int elf_header_len = 0x94; /* ELF fixed: 0x34 + 3 * 0x20 */
int elf_code_start;
int elf_data_start;
char *elf_symtab;
char *elf_strtab;
char *elf_section;

/* Existing Code */

  2. Adding the PT_INTERP and PT_DYNAMIC Segments

    Next, we add the two new program header entries immediately after the existing PT_LOAD entry. This ensures that the ELF loader knows:

    1. Which dynamic linker to invoke (PT_INTERP).
    2. How to handle dynamic linking information (PT_DYNAMIC).
/* program header - interpreter segment */
elf_write_header_int(3);                                                        /* PT_INTERP */
elf_write_header_int(elf_header_len + elf_code_idx + elf_data_idx);             /* p_offset */
elf_write_header_int(ELF_START + elf_header_len + elf_code_idx + elf_data_idx); /* p_vaddr */
elf_write_header_int(ELF_START + elf_header_len + elf_code_idx + elf_data_idx); /* p_paddr */
elf_write_header_int(22);                                                       /* p_filesz */
elf_write_header_int(22);                                                       /* p_memsz */
elf_write_header_int(4);                                                        /* p_flags */
elf_write_header_int(4);                                                        /* p_align */

/* program header - dynamic segment */
elf_write_header_int(2);                                                             /* PT_DYNAMIC */
elf_write_header_int(elf_header_len + elf_code_idx + elf_data_idx + 22);             /* p_offset */
elf_write_header_int(ELF_START + elf_header_len + elf_code_idx + elf_data_idx + 22); /* p_vaddr */
elf_write_header_int(ELF_START + elf_header_len + elf_code_idx + elf_data_idx + 22); /* p_paddr */
elf_write_header_int(40);                                                            /* p_filesz */
elf_write_header_int(40);                                                            /* p_memsz: must be >= p_filesz */
elf_write_header_int(6);                                                             /* p_flags */
elf_write_header_int(4);                                                             /* p_align */
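
The two entries above share the same shape: type, offset, address (twice), size (twice), flags, alignment. As a sketch only, the pattern can be captured in one constructor. `phdr32_t` and `make_segment()` are hypothetical, and `base` stands for the load address of file offset 0, which is assumed here to play the role of ELF_START.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint32_t p_type, p_offset, p_vaddr, p_paddr;
    uint32_t p_filesz, p_memsz, p_flags, p_align;
} phdr32_t; /* same field order as Elf32_Phdr */

enum { SEG_DYNAMIC = 2, SEG_INTERP = 3 }; /* PT_DYNAMIC, PT_INTERP */
enum { SEG_X = 1, SEG_W = 2, SEG_R = 4 }; /* PF_X, PF_W, PF_R */

/* Describe a segment whose bytes are fully present in the file, so
 * p_memsz equals p_filesz. */
static phdr32_t make_segment(uint32_t type, uint32_t base, uint32_t offset,
                             uint32_t size, uint32_t flags, uint32_t align)
{
    phdr32_t p = { type, offset, base + offset, base + offset,
                   size, size, flags, align };
    return p;
}
```

Deriving p_vaddr from p_offset and setting both sizes from one argument keeps the two copies of each value from drifting apart.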

  3. Defining the .interp and .dynamic Sections

    We also need to create the corresponding sections in the section header table. The .interp section holds the path to the dynamic linker (/lib/ld-linux.so.3), and the .dynamic section holds the runtime link information, including the necessary reference to libc.so.6.

void elf_generate_sections() {
    /* .interp section (length = 22) */
    elf_write_section_str("/lib/ld-linux.so.3", 18);
    elf_write_section_int(0);

    /* .dynamic section (length = 40) */
    elf_write_section_int(6);  /* DT_SYMTAB */
    elf_write_section_int(elf_header_len + elf_code_idx + elf_data_idx + 22 + 40 + 28);
    elf_write_section_int(5);  /* DT_STRTAB */
    elf_write_section_int(elf_header_len + elf_code_idx + elf_data_idx + 22 + 40);
    elf_write_section_int(11); /* DT_SYMENT */
    elf_write_section_int(16); /* Symbol entry size */
    elf_write_section_int(1);  /* DT_NEEDED */
    elf_write_section_int(1);  /* Offset in .dynstr */
    elf_write_section_int(0);  /* DT_NULL */
    elf_write_section_int(0);  /* End of .dynamic */

    /* .dynstr section (length = 28) */
    elf_write_section_byte(0); /* index 0: the empty string */
    elf_write_section_str("libc.so.6", 9);
    elf_write_section_byte(0);
    elf_write_section_str("__libc_start_main", 17); /* no trailing NUL is written here */
}
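
The .dynamic section is an array of (tag, value) pairs terminated by a DT_NULL entry, so its size is always a multiple of 8 bytes in ELF32. The same five entries can be sketched with a small table builder. `dyn32_t` and `build_dynamic()` are hypothetical names, and the TAG_* constants are local stand-ins for the DT_* values in <elf.h>.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct { int32_t d_tag; uint32_t d_val; } dyn32_t; /* like Elf32_Dyn */

/* Tag values from the ELF specification. */
enum { TAG_NULL = 0, TAG_NEEDED = 1, TAG_STRTAB = 5, TAG_SYMTAB = 6,
       TAG_SYMENT = 11 };

/* Fill `out` (capacity at least 5) and return the table size in bytes. */
static size_t build_dynamic(dyn32_t *out, uint32_t symtab_addr,
                            uint32_t strtab_addr, uint32_t needed_off)
{
    size_t n = 0;
    out[n++] = (dyn32_t){ TAG_SYMTAB, symtab_addr };
    out[n++] = (dyn32_t){ TAG_STRTAB, strtab_addr };
    out[n++] = (dyn32_t){ TAG_SYMENT, 16 };         /* sizeof(Elf32_Sym) */
    out[n++] = (dyn32_t){ TAG_NEEDED, needed_off }; /* offset in .dynstr */
    out[n++] = (dyn32_t){ TAG_NULL, 0 };            /* terminator */
    return n * sizeof(dyn32_t);
}
```

Five entries give 40 bytes, which is the length noted in the comment above. Returning the size from the builder means the section header and the PT_DYNAMIC segment can both take it from one place.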

/* Existing Code */

/* Add to .shstr tab and its string */
elf_write_section_str(".interp", 7);
elf_write_section_byte(0);
elf_write_section_str(".dynamic", 8);
elf_write_section_byte(0);
elf_write_section_str(".dynstr", 7);

/* .interp section header */
elf_write_section_int(23); /* sh_name: offset in .shstrtab */
elf_write_section_int(1);  /* sh_type: SHT_PROGBITS */
elf_write_section_int(2);  /* sh_flags: SHF_ALLOC */
elf_write_section_int(ELF_START + elf_header_len + elf_code_idx + elf_data_idx); /* sh_addr */
elf_write_section_int(elf_header_len + elf_code_idx + elf_data_idx);             /* sh_offset */
elf_write_section_int(22); /* sh_size */
elf_write_section_int(0);  /* sh_link */
elf_write_section_int(0);  /* sh_info */
elf_write_section_int(1);  /* sh_addralign */
elf_write_section_int(0);  /* sh_entsize */

/* .dynamic section header */
elf_write_section_int(31); /* sh_name: offset in .shstrtab */
elf_write_section_int(6);  /* sh_type: SHT_DYNAMIC */
elf_write_section_int(3);  /* sh_flags: SHF_ALLOC | SHF_WRITE */
elf_write_section_int(ELF_START + elf_header_len + elf_code_idx + elf_data_idx + 22); /* sh_addr */
elf_write_section_int(elf_header_len + elf_code_idx + elf_data_idx + 22);             /* sh_offset */
elf_write_section_int(40); /* sh_size */
elf_write_section_int(5);  /* sh_link = .dynstr */
elf_write_section_int(0);  /* sh_info */
elf_write_section_int(4);  /* sh_addralign */
elf_write_section_int(0);  /* sh_entsize */

/* .dynstr section header */
elf_write_section_int(40); /* sh_name: offset in .shstrtab */
elf_write_section_int(3);  /* sh_type: SHT_STRTAB */
elf_write_section_int(2);  /* sh_flags: SHF_ALLOC */
elf_write_section_int(ELF_START + elf_header_len + elf_code_idx + elf_data_idx + 22 + 40); /* sh_addr */
elf_write_section_int(elf_header_len + elf_code_idx + elf_data_idx + 22 + 40);             /* sh_offset */
elf_write_section_int(28); /* sh_size */
elf_write_section_int(0);  /* sh_link */
elf_write_section_int(0);  /* sh_info */
elf_write_section_int(1);  /* sh_addralign */
elf_write_section_int(0);  /* sh_entsize */
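
The hard-coded string offsets (1 for libc.so.6 in .dynstr; 23, 31, and 40 for the new names in .shstrtab) can be derived instead of counted by hand. The following is a minimal sketch of an append-only string table. `strtab_t`, `strtab_init`, and `strtab_add` are hypothetical names used for illustration and are not part of shecc.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not shecc code): a fixed-capacity ELF string table. */
typedef struct {
    char data[256];
    size_t len;
} strtab_t;

/* Start the table with the mandatory leading NUL (empty string at offset 0). */
static void strtab_init(strtab_t *t)
{
    assert(t != NULL);
    t->data[0] = '\0';
    t->len = 1;
}

/* Append a NUL-terminated string; return its offset, or (size_t) -1 if full. */
static size_t strtab_add(strtab_t *t, const char *s)
{
    assert(t != NULL && s != NULL);
    size_t n = strlen(s) + 1; /* include the terminator */
    if (t->len + n > sizeof(t->data))
        return (size_t) -1;
    size_t off = t->len;
    memcpy(t->data + off, s, n);
    t->len += n;
    return off;
}
```

Adding "libc.so.6" and then "__libc_start_main" to a fresh table returns offsets 1 and 11. The resulting length is 29 bytes rather than the 28 written above, because this sketch also terminates the last string, which the code above does not do explicitly.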
  1. Verifying with readelf

    After compiling and generating this ELF, running readelf -a confirms the presence of the additional segments and sections:

ELF Header:
  Magic:   7f 45 4c 46 01 01 01 00 ...
  Class:                             ELF32
  Data:                              2's complement, little endian
  ...
  Type:                              DYN (Shared object file)
  Machine:                           ARM
  ...
  Entry point address:               0x10094
  Start of program headers:          52 (bytes into file)
  Start of section headers:          12777 (bytes into file)
  ...
  Size of program headers:           32 (bytes)
  Number of program headers:         3
  ...

Section Headers:
  [ 3] .interp           PROGBITS        00013160 003160 00001d 00   A  0   0  1
  [ 4] .dynamic          DYNAMIC         0001317d 00317d 000010 00  WA  5   0  4
  [ 5] .dynstr           STRTAB          0001318d 00318d 00001c 00   A  0   0  1
  ...

Program Headers:
  Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
  LOAD           0x000094 0x00010094 0x00010094 ...
  INTERP         0x003160 0x00013160 0x00013160 ...
      [Requesting program interpreter: /lib/ld-linux.so.3]
  DYNAMIC        0x00317d 0x0001317d 0x0001317d ...

Dynamic section at offset 0x317d contains 2 entries:
  Tag        Type                         Name/Value
 0x00000001 (NEEDED)                     Shared library: [libc.so.6]
 ...

As shown, the INTERP program header requests /lib/ld-linux.so.3, and the dynamic section lists libc.so.6 as a needed library. Together, these entries show that the ELF is now marked as dynamically linked: at load time the kernel hands control to the dynamic linker, which in turn loads the shared library.
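
The same check can be done programmatically rather than by reading readelf output. The sketch below assumes a little-endian ELF32 image already loaded into memory and walks the program headers looking for PT_INTERP. `elf32_interp`, `rd16`, and `rd32` are hypothetical helper names. The field offsets follow the ELF32 header and program header layouts.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PT_INTERP_TYPE 3 /* value of PT_INTERP in the ELF specification */

/* Read 16- and 32-bit little-endian values from a byte buffer. */
static uint32_t rd16(const uint8_t *p) { return p[0] | (uint32_t) p[1] << 8; }
static uint32_t rd32(const uint8_t *p) { return rd16(p) | rd16(p + 2) << 16; }

/* Return the interpreter path stored in a little-endian ELF32 image,
 * or NULL if there is no PT_INTERP entry or the image looks malformed. */
static const char *elf32_interp(const uint8_t *img, size_t len)
{
    assert(img != NULL);
    if (len < 52 || memcmp(img, "\x7f" "ELF", 4) != 0)
        return NULL;
    if (img[4] != 1 || img[5] != 1) /* ELFCLASS32, ELFDATA2LSB */
        return NULL;

    uint32_t phoff = rd32(img + 28);     /* e_phoff */
    uint32_t phentsize = rd16(img + 42); /* e_phentsize */
    uint32_t phnum = rd16(img + 44);     /* e_phnum */
    if (phentsize < 32)
        return NULL;

    for (uint32_t i = 0; i < phnum; i++) {
        size_t ph = (size_t) phoff + (size_t) i * phentsize;
        if (ph + 32 > len)
            return NULL;
        if (rd32(img + ph) != PT_INTERP_TYPE) /* p_type */
            continue;
        uint32_t off = rd32(img + ph + 4);     /* p_offset */
        uint32_t filesz = rd32(img + ph + 16); /* p_filesz */
        if (filesz == 0 || (size_t) off + filesz > len)
            return NULL;
        if (img[off + filesz - 1] != '\0') /* path must be NUL-terminated */
            return NULL;
        return (const char *) img + off;
    }
    return NULL;
}
```

For the binary generated here, such a function would be expected to return "/lib/ld-linux.so.3" if the INTERP header and the .interp bytes agree.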

Record 05: Generating .dynsym and .dynstr Sections, and Issues with Adding .got, .plt, and .rel.plt

In this stage, we add .dynstr and .dynsym sections to the ELF. While these sections appear correctly in readelf, there is a lingering issue where the .strtab entry in the section header appears as <corrupt>.

  1. Code Changes

    Below is the relevant code that creates the .dynsym entries in elf_generate_sections():

void elf_generate_sections() {
    /* Existing Code ... */

    /* .dynsym section (3 entries x 16 bytes = 48) */
    /* First entry must be NULL */
    elf_write_section_int(0);   // st_name
    elf_write_section_int(0);   // st_value
    elf_write_section_int(0);   // st_size
    elf_write_section_byte(0);  // st_info
    elf_write_section_byte(0);  // st_other
    elf_write_section_byte(0);  // st_shndx (low byte)
    elf_write_section_byte(0);  // st_shndx (high byte)

    /* Second entry is libc.so.6 */
    elf_write_section_int(1);      // st_name: offset in .dynstr
    elf_write_section_int(0);      // st_value
    elf_write_section_int(0);      // st_size
    elf_write_section_byte(0x12);  // st_info: (STB_GLOBAL << 4) | STT_FUNC
    elf_write_section_byte(0);     // st_other
    elf_write_section_byte(0);     // st_shndx (low byte), SHN_UNDEF
    elf_write_section_byte(0);     // st_shndx (high byte)

    /* Third entry is __libc_start_main */
    elf_write_section_int(11);     // st_name: offset in .dynstr
    elf_write_section_int(0);      // st_value
    elf_write_section_int(0);      // st_size
    elf_write_section_byte(0x12);  // st_info: (STB_GLOBAL << 4) | STT_FUNC
    elf_write_section_byte(0);     // st_other
    elf_write_section_byte(0);     // st_shndx (low byte), SHN_UNDEF
    elf_write_section_byte(0);     // st_shndx (high byte)

    /* Existing Code */

    /* Append the new section name to .shstrtab */
    elf_write_section_str(".dynsym", 7);
    elf_write_section_byte(0);

    /* .dynsym section header */
    elf_write_section_int(48);  // sh_name: offset in .shstrtab
    elf_write_section_int(11);  // sh_type: SHT_DYNSYM
    elf_write_section_int(0);   // sh_flags
    elf_write_section_int(ELF_START + elf_header_len + elf_code_idx + elf_data_idx
                         + 22 + 40 + 28);  // sh_addr
    elf_write_section_int(elf_header_len + elf_code_idx + elf_data_idx
                         + 22 + 40 + 28);  // sh_offset
    elf_write_section_int(48);  // sh_size: 3 entries x 16 bytes
    elf_write_section_int(5);   // sh_link = .dynstr
    elf_write_section_int(2);   // sh_info
    elf_write_section_int(4);   // sh_addralign
    elf_write_section_int(16);  // sh_entsize

    /* Existing Code */
}
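
The 16-byte symbol entries above can also be produced by one routine instead of repeated byte writes. The sketch below packs st_info the way the ELF32_ST_INFO macro does and serializes one Elf32_Sym in little-endian order. `sym_info`, `put32`, `write_sym`, and the enum names are hypothetical and not part of shecc.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { SYM_BIND_GLOBAL = 1 /* STB_GLOBAL */, SYM_TYPE_FUNC = 2 /* STT_FUNC */ };

/* Pack binding (high nibble) and type (low nibble) into st_info. */
static uint8_t sym_info(unsigned bind, unsigned type)
{
    return (uint8_t) ((bind << 4) | (type & 0xf));
}

/* Store a 32-bit value in little-endian byte order. */
static void put32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t) v;
    p[1] = (uint8_t) (v >> 8);
    p[2] = (uint8_t) (v >> 16);
    p[3] = (uint8_t) (v >> 24);
}

/* Serialize one Elf32_Sym (16 bytes) into out. */
static void write_sym(uint8_t out[16], uint32_t name, uint32_t value,
                      uint32_t size, uint8_t info, uint8_t other,
                      uint16_t shndx)
{
    assert(out != NULL);
    put32(out, name);      /* st_name: offset in .dynstr */
    put32(out + 4, value); /* st_value */
    put32(out + 8, size);  /* st_size */
    out[12] = info;        /* st_info */
    out[13] = other;       /* st_other */
    out[14] = (uint8_t) (shndx & 0xff); /* st_shndx, low byte */
    out[15] = (uint8_t) (shndx >> 8);   /* st_shndx, high byte */
}
```

With these helpers, the __libc_start_main entry corresponds to `write_sym(buf, 11, 0, 0, sym_info(SYM_BIND_GLOBAL, SYM_TYPE_FUNC), 0, 0)`, where `sym_info` yields the 0x12 used above.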

  2. Verifying with readelf

    After compiling and generating this ELF, running readelf -a shows the following (truncated for brevity):

ELF Header:
  Magic:   7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00 
  Class:                             ELF32
  Data:                              2's complement, little endian
  ...
  Size of this header:               52 (bytes)
  Size of program headers:           32 (bytes)
  Number of program headers:         3
  Size of section headers:           40 (bytes)
  Number of section headers:         10
  Section header string table index: 9

Section Headers:
  [Nr] Name              Type            Addr     Off    Size   ES Flg Lk Inf Al
  [ 0]                   NULL            00000000 000000 000000 00      
  [ 1] .text             PROGBITS        00010094 000094 003064 00 WAX  
  [ 2] .data             PROGBITS        000130f8 0030f8 000068 00  WA  
  [ 3] .interp           PROGBITS        00013160 003160 000016 00   A  
  [ 4] .dynamic          DYNAMIC         00013176 003176 000028 00  WA  
  [ 5] .dynstr           STRTAB          0001319e 00319e 00001c 00   A  
  [ 6] .dynsym           DYNSYM          000131ba 0031ba 000030 10      
  [ 7] .symtab           SYMTAB          00000000 0031ea 000000 10      
  [ 8] <corrupt>         STRTAB          00000000 0031ea 000000 00      
  [ 9] .shstrtab         STRTAB          00000000 0031ea 000040 00      

Program Headers:
  Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
  ...
  INTERP         0x003160 0x00013160 0x00013160 0x00016 0x00016 R   0x4
      [Requesting program interpreter: /lib/ld-linux.so.3]
  DYNAMIC        0x003176 0x00013176 0x00013176 0x00028 0x00010 RW  0x4

Dynamic section at offset 0x3176 contains 5 entries:
  Tag        Type                         Name/Value
  0x00000006 (SYMTAB)                     0x31ba
  0x00000005 (STRTAB)                     0x319e
  0x0000000b (SYMENT)                     16 (bytes)
  0x00000001 (NEEDED)                     Shared library: [libc.so.6]
  0x00000000 (NULL)                       0x0

Symbol table '.dynsym' contains 3 entries:
   Num:    Value  Size Type    Bind   Vis      Ndx Name
     0: 00000000     0 NOTYPE  LOCAL  DEFAULT  UND 
     1: 00000000     0 FUNC    GLOBAL DEFAULT  UND libc.so.6
     2: 00000000     0 FUNC    GLOBAL DEFAULT  UND __libc_start_main

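Observation:
readelf parses the .dynsym table and resolves its names through .dynstr as intended. However, section header entry [8], which should be .strtab, is displayed as <corrupt>. In the listing, entries [7] through [9] all share file offset 0x31ea, and both .symtab and the corrupt entry report a size of 0. This suggests an error in how .symtab, .strtab, or the section name strings are appended after the new sections, for example a stale sh_name offset into .shstrtab, or a miscalculated offset or length that misaligns or overwrites section header data.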
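
One way to narrow this down is to check the section headers mechanically before writing them out. The sketch below rests on the assumption that readelf reports a name as <corrupt> when sh_name does not fall inside .shstrtab. It flags such entries and also detects sections whose file ranges overlap. `shdr_view_t`, `first_bad_name`, and `has_overlap` are hypothetical names, not shecc functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reduced view of a section header: only the fields these checks need. */
typedef struct {
    uint32_t sh_name;   /* offset of the name in .shstrtab */
    uint32_t sh_offset; /* file offset of the section contents */
    uint32_t sh_size;   /* size of the section contents in bytes */
} shdr_view_t;

/* Return the index of the first header whose sh_name lies outside
 * .shstrtab, or -1 if every name offset is in range. */
static int first_bad_name(const shdr_view_t *sh, int n, uint32_t shstrtab_size)
{
    assert(sh != NULL);
    for (int i = 0; i < n; i++)
        if (sh[i].sh_name >= shstrtab_size)
            return i;
    return -1;
}

/* Return 1 if any two non-empty sections share file bytes, else 0. */
static int has_overlap(const shdr_view_t *sh, int n)
{
    assert(sh != NULL);
    for (int i = 0; i < n; i++) {
        for (int j = i + 1; j < n; j++) {
            if (sh[i].sh_size == 0 || sh[j].sh_size == 0)
                continue; /* empty sections occupy no bytes */
            uint64_t a0 = sh[i].sh_offset, a1 = a0 + sh[i].sh_size;
            uint64_t b0 = sh[j].sh_offset, b1 = b0 + sh[j].sh_size;
            if (a0 < b1 && b0 < a1)
                return 1;
        }
    }
    return 0;
}
```

Running both checks over the headers just before they are emitted, with the actual .shstrtab length, would show whether the .strtab header carries an out-of-range sh_name or collides with a neighbouring section.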

  3. Next Steps

    1. Investigate .strtab Corruption

      • Check how the .symtab and .strtab sections are being added to the ELF. There might be a miscalculation in offsets or lengths, leading to overwriting existing data.
      • Verify the sizes and offsets for all sections after .dynsym to ensure they do not overlap each other.
    2. Implement External Symbol Handling

      • Create a function to load external function names and sizes. This will help populate .dynstr, .dynsym, .got, and .plt.
    3. Implement .rel.plt

      • Use R_ARM_JUMP_SLOT relocations in .rel.plt.
      • Provide a PLT stub for __libc_start_main() (and any other external functions) to handle calling conventions and symbol resolution at runtime.
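
As a starting point for the .rel.plt step, each entry is an 8-byte Elf32_Rel: r_offset (the address of the GOT slot to patch) followed by r_info, which combines the .dynsym index and the relocation type. The sketch below shows one way to build such an entry. `rel_info`, `put32`, and `write_rel` are hypothetical helpers, and the GOT slot address in the usage note is a made-up value for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RELOC_ARM_JUMP_SLOT 22 /* value of R_ARM_JUMP_SLOT in the ARM ELF ABI */

/* Combine symbol index and relocation type as ELF32_R_INFO does. */
static uint32_t rel_info(uint32_t sym, uint32_t type)
{
    return (sym << 8) | (type & 0xff);
}

/* Store a 32-bit value in little-endian byte order. */
static void put32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t) v;
    p[1] = (uint8_t) (v >> 8);
    p[2] = (uint8_t) (v >> 16);
    p[3] = (uint8_t) (v >> 24);
}

/* Serialize one Elf32_Rel (8 bytes) that asks the dynamic linker to
 * fill the GOT slot at got_slot_addr with the address of symbol sym. */
static void write_rel(uint8_t out[8], uint32_t got_slot_addr, uint32_t sym)
{
    assert(out != NULL);
    put32(out, got_slot_addr);                           /* r_offset */
    put32(out + 4, rel_info(sym, RELOC_ARM_JUMP_SLOT));  /* r_info */
}
```

For example, `write_rel(buf, 0x00013200, 2)` would describe a jump slot for symbol index 2, which is __libc_start_main in the .dynsym table above; the address 0x00013200 is hypothetical and would have to be replaced by the real .got entry address once that section exists.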

Useful materials