黎詠哲, 李協儒
Currently, shecc can only generate statically-linked ELF executables. This results in inefficient space utilization. Therefore, your task is to provide an option to produce dynamically-linked ELF executables that link to glibc.
To support dynamic linking, we need to add headers and sections to the ELF file. We follow the order of the ELF views above and write the necessary data to the output ELF file one piece at a time. Below is the list of headers and sections that we need to write.
Program headers
- DYNAMIC: For dynamic binaries, this segment holds dynamic linking information and is usually the same as the .dynamic section in ELF's linking view.
- INTERP: For dynamic binaries, this holds the full pathname of the runtime linker ld.so. This segment is the same as the .interp section in ELF's linking view.
Section headers
- .dynamic: For dynamic binaries, this section holds dynamic linking information used by ld.so.
- .dynstr: NULL-terminated strings naming the symbols in the .dynsym section.
- .dynsym: Runtime/dynamic symbol table. For dynamic binaries, this section is the symbol table of globally visible symbols. For example, if a dynamic link library wants to export its symbols, these symbols are stored here. Conversely, if a dynamic executable uses symbols from a dynamic link library, those symbols are stored here too. The symbol names (as NULL-terminated strings) are stored in the .dynstr section.
- .got: For dynamic binaries, this Global Offset Table holds the addresses of variables that are relocated upon loading.
- .got.plt: For dynamic binaries, this Global Offset Table holds the addresses of functions in dynamic libraries. They are used by the trampoline code in the .plt section. If the .got.plt section is present, it contains at least three entries, which have special meanings.
- .interp: For dynamic binaries, this holds the full pathname of the runtime linker ld.so.
- .plt: For dynamic binaries, this Procedure Linkage Table holds the trampoline/linkage code.
- .rela.dyn: Runtime/dynamic relocation table. For dynamic binaries, this relocation table holds information about variables that must be relocated upon loading. Each entry in this table is a struct Elf64_Rela (see /usr/include/elf.h).
- .rela.plt: Runtime/dynamic relocation table. This relocation table is similar to the one in the .rela.dyn section; the difference is that this one is for functions, not variables.
We wrote the program header INTERP
to the ELF file, but this broke the section headers.
We identified the headers and sections that shecc needs to write to the ELF file, but we still struggled to work out the details of each header and section when putting the data into the ELF. We are now writing these headers to the ELF file one at a time, making sure each one does not break the others.
We wrote the necessary data to the ELF file and used the readelf -a command to check that it was written correctly. Below is part of the ELF file generated by shecc.
We wrote the program header INTERP to the ELF file successfully, but this also broke the section headers. The likely cause is that we did not set the ELF header field Start of section headers (e_shoff) correctly.
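One way to reason about the Start of section headers field is to derive it from the file layout instead of hard-coding it. The sketch below assumes a hypothetical layout in which the section header table follows all section contents; compute_e_shoff, align_up, and the 4-byte alignment are illustrative choices, not shecc's actual code.

```c
#include <assert.h>
#include <stdint.h>

enum { EHDR_SIZE = 52, PHDR_SIZE = 32 }; /* ELF32 header and program header sizes */

/* Round `off` up to a multiple of `align` (which must be a power of two). */
static uint32_t align_up(uint32_t off, uint32_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (off + align - 1) & ~(align - 1);
}

/* Assumed layout: ELF header, `phnum` program headers, `contents_size`
 * bytes of section contents, then the section header table. */
static uint32_t compute_e_shoff(uint32_t phnum, uint32_t contents_size)
{
    uint32_t end_of_contents = EHDR_SIZE + phnum * PHDR_SIZE + contents_size;
    return align_up(end_of_contents, 4);
}
```

Whenever a program header or section is added, the value returned here grows with it, which is the adjustment that appears to have been missed when INTERP was introduced.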
Prepare skeleton code like listing.s and tweak shecc's RISC-V code generation to accommodate the new ELF headers and sections.
We modified the elf_generate_header() function to support dynamic linking by implementing conditional ELF type selection.
[Listing: elf_generate_header and the ELF file generated by shecc]
Verification of the header modification showed that the type successfully changed to DYN.
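The conditional type selection can be sketched roughly as follows. The dynlink flag and the helper names are hypothetical stand-ins, since the original listing is not available here; only the e_type values and the field offset come from the ELF specification.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum { ET_EXEC = 2, ET_DYN = 3 }; /* e_type values from the ELF specification */
enum { E_TYPE_OFFSET = 16 };      /* e_type follows the 16-byte e_ident array */

/* `dynlink` is a hypothetical flag standing in for the new compiler option. */
static uint16_t select_elf_type(bool dynlink)
{
    return dynlink ? ET_DYN : ET_EXEC;
}

/* Patch e_type (16-bit, little-endian) into an in-memory ELF header. */
static void set_e_type(uint8_t *ehdr, bool dynlink)
{
    assert(ehdr != NULL);
    uint16_t type = select_elf_type(dynlink);
    ehdr[E_TYPE_OFFSET] = (uint8_t) (type & 0xff);
    ehdr[E_TYPE_OFFSET + 1] = (uint8_t) (type >> 8);
}
```

readelf prints ET_DYN as DYN, which matches the verification result above. Note that a dynamically linked executable that is not position-independent can also keep ET_EXEC, so which value is appropriate depends on how the code is meant to be loaded.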
Added the following dynamic-linking sections to the string table:
- .dynstr
- .dynsym
- .dynamic
- .interp
- .rela.plt
- .plt
- .got
[Listing: shstr section table]
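A section name string table is a sequence of NUL-terminated names, and each section header stores the byte offset of its name in sh_name. The following is a minimal sketch of that bookkeeping; strtab_t, strtab_init, and strtab_add are hypothetical helpers, not functions from shecc.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef struct {
    char data[256];
    uint32_t len;
} strtab_t;

/* A string table starts with a NUL byte so that offset 0 means "no name". */
static void strtab_init(strtab_t *t)
{
    t->data[0] = '\0';
    t->len = 1;
}

/* Append `name` with its terminating NUL and return the offset to store in
 * sh_name. Returns 0 if the table is full. */
static uint32_t strtab_add(strtab_t *t, const char *name)
{
    assert(name != NULL);
    uint32_t n = (uint32_t) strlen(name) + 1;
    if (t->len + n > sizeof(t->data))
        return 0;
    uint32_t off = t->len;
    memcpy(t->data + off, name, n);
    t->len += n;
    return off;
}
```

If a section header's sh_name does not match the offset returned when the name was appended, readelf shows a wrong or corrupt section name, so recording the returned offsets is safer than computing them by hand.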
Implemented section headers following the ELF32_Shdr structure:
[Listing: ELF32_Shdr]
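For reference, the 32-bit section header layout (named Elf32_Shdr in /usr/include/elf.h) can be sketched as below. The shdr32_t type and the make_interp_shdr helper are illustrative names, and the field values passed in are assumptions rather than the ones shecc emits.

```c
#include <assert.h>
#include <stdint.h>

/* Same field order as Elf32_Shdr in /usr/include/elf.h. */
typedef struct {
    uint32_t sh_name;      /* offset of the name in the section name string table */
    uint32_t sh_type;      /* SHT_* constant */
    uint32_t sh_flags;     /* SHF_* bits */
    uint32_t sh_addr;      /* virtual address when loaded, or 0 */
    uint32_t sh_offset;    /* file offset of the section contents */
    uint32_t sh_size;      /* size of the contents in bytes */
    uint32_t sh_link;
    uint32_t sh_info;
    uint32_t sh_addralign;
    uint32_t sh_entsize;   /* entry size for table-like sections, else 0 */
} shdr32_t;

_Static_assert(sizeof(shdr32_t) == 40, "an ELF32 section header is 40 bytes");

enum { SHT_PROGBITS = 1, SHF_ALLOC = 0x2 };

/* Hypothetical helper: describe an .interp section holding `size` bytes. */
static shdr32_t make_interp_shdr(uint32_t name_off, uint32_t addr,
                                 uint32_t file_off, uint32_t size)
{
    assert(size > 0);
    shdr32_t sh = {0};
    sh.sh_name = name_off;
    sh.sh_type = SHT_PROGBITS;
    sh.sh_flags = SHF_ALLOC;
    sh.sh_addr = addr;
    sh.sh_offset = file_off;
    sh.sh_size = size;
    sh.sh_addralign = 1;
    return sh;
}
```

The 40-byte size is also the value that e_shentsize in the ELF header is expected to carry for a 32-bit file.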
Dynamic Sections Generation
Next, we implemented elf_generate_dynamic_sections() to create:
- .dynamic section with DT_NEEDED entries
- .interp section pointing to "/lib/ld-linux.so.3"
- .dynstr section containing "libc.so.6"
- .dynsym section with global symbol entries
- .rela.plt section for relocation entries
- .plt section with ARM32 PLT entries
- .got section with three global offset table entries
[Listing: void elf_generate_dynamic_sections()]
At the start of elf_generate_sections(), the compiler generates the dynamic sections first.
[Listing: void elf_generate_sections]
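The .dynamic section is an array of tag/value pairs terminated by a DT_NULL entry. The sketch below shows one plausible minimal set of entries; dyn32_t and build_dynamic are hypothetical names, and the addresses passed in are placeholders rather than the ones shecc computes.

```c
#include <assert.h>
#include <stdint.h>

/* Same layout as Elf32_Dyn: a tag followed by a value or address. */
typedef struct {
    int32_t d_tag;
    uint32_t d_val;
} dyn32_t;

enum {
    DT_NULL = 0,   /* terminates the array */
    DT_NEEDED = 1, /* .dynstr offset of a required library name */
    DT_STRTAB = 5, /* address of .dynstr */
    DT_SYMTAB = 6, /* address of .dynsym */
    DT_STRSZ = 10, /* size of .dynstr in bytes */
    DT_SYMENT = 11 /* size of one .dynsym entry */
};

/* Fill `out` with a minimal .dynamic array; returns the entry count,
 * or -1 if `cap` is too small. */
static int build_dynamic(dyn32_t *out, int cap, uint32_t needed_name_off,
                         uint32_t dynstr_addr, uint32_t dynstr_size,
                         uint32_t dynsym_addr)
{
    const dyn32_t entries[] = {
        {DT_NEEDED, needed_name_off},
        {DT_STRTAB, dynstr_addr},
        {DT_SYMTAB, dynsym_addr},
        {DT_STRSZ, dynstr_size},
        {DT_SYMENT, 16},
        {DT_NULL, 0},
    };
    int n = (int) (sizeof(entries) / sizeof(entries[0]));
    assert(out != 0);
    if (n > cap)
        return -1;
    for (int i = 0; i < n; i++)
        out[i] = entries[i];
    return n;
}
```

A complete dynamic executable also needs entries describing the PLT relocations (for example DT_JMPREL, DT_PLTRELSZ, DT_PLTREL, and DT_PLTGOT); they are omitted here to keep the sketch short.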
readelf analysis reveals several critical issues. It indicates:
- a problem with e_shentsize in the ELF header
It seems that the ELF does not have the right program headers and section headers, in both size and count.
To fix these problems, we need to correct how these sizes and counts are calculated.
- .plt acts like a cache that assists .got. As with a cache, its symbols may be replaced (this matters for a full ELF loader).
- .got establishes the association with symbol names.
- .shtab is required.
- rela.plt needs machine code because the __libc_start_main() entry point has to push argc and argv onto the stack (calling convention).
- .rel: work out how to perform relocation (see the specification) and handle alignment issues. The relocation for "__libc_start_main" must be done correctly (amacc/amacc.c#2013).
Following the previous action item, I generated a valid reference ELF file with gcc and then made shecc generate the correct ELF header and add the .dynamic section to both the program header table and the section header table. The current problem is that section names are not displayed correctly in the section header table.
shecc
Another action item is to decide which sections are necessary. The sections that must be added for the dynamic linking feature are .dynamic, .interp, .got, .rela.plt, and .rela.
Github Repo
Goal: Add PT_INTERP and PT_DYNAMIC entries to the program header and define the corresponding .interp and .dynamic sections, so the generated ELF file is properly linked to the shared library libc.so.6 and uses the dynamic linker /lib/ld-linux.so.3.
Adjusting the ELF Header Length
In src/globals.c, we increase elf_header_len from 0x54 to 0x94. The reason is that each program header entry is 32 bytes, and we need two additional entries (for PT_INTERP and PT_DYNAMIC).
src/globals.c
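The two values of elf_header_len follow directly from the ELF32 structure sizes: a 52-byte (0x34) file header plus one 32-byte (0x20) entry per program header. A small sketch of the arithmetic, where elf_header_len_for is a hypothetical helper and not part of shecc:

```c
#include <assert.h>

enum {
    ELF32_EHDR_SIZE = 0x34, /* 52-byte ELF32 file header */
    ELF32_PHDR_SIZE = 0x20  /* 32-byte program header entry */
};

/* Bytes occupied by the ELF header plus `phnum` program header entries. */
static int elf_header_len_for(int phnum)
{
    assert(phnum >= 0);
    return ELF32_EHDR_SIZE + phnum * ELF32_PHDR_SIZE;
}
```

With only PT_LOAD this gives 0x34 + 0x20 = 0x54, and with PT_LOAD, PT_INTERP, and PT_DYNAMIC it gives 0x34 + 3 × 0x20 = 0x94.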
Adding the PT_INTERP and PT_DYNAMIC Segments
Next, we add the two new program header entries immediately after the existing PT_LOAD entry. This ensures that the ELF loader knows:
- which dynamic linker to use (PT_INTERP).
- where the dynamic linking information is located (PT_DYNAMIC).
Defining the .interp and .dynamic Sections
We also need to create the corresponding sections in the section header table. The .interp section holds the path to the dynamic linker (/lib/ld-linux.so.3), and the .dynamic section holds the runtime link information, including the necessary reference to libc.so.6.
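The relationship between the two new program headers and the sections they cover can be sketched as follows. phdr32_t mirrors the Elf32_Phdr layout; the helper names, the flags, and the idea of deriving the virtual address from a single load base are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Same field order as Elf32_Phdr (32 bytes). */
typedef struct {
    uint32_t p_type, p_offset, p_vaddr, p_paddr;
    uint32_t p_filesz, p_memsz, p_flags, p_align;
} phdr32_t;

enum { PT_LOAD = 1, PT_DYNAMIC = 2, PT_INTERP = 3 };
enum { PF_W = 2, PF_R = 4 };

static const char interp_path[] = "/lib/ld-linux.so.3";

/* PT_INTERP covers exactly the bytes of the .interp section, NUL included. */
static phdr32_t make_interp_phdr(uint32_t file_off, uint32_t load_base)
{
    phdr32_t ph = {0};
    ph.p_type = PT_INTERP;
    ph.p_offset = file_off;
    ph.p_vaddr = ph.p_paddr = load_base + file_off;
    ph.p_filesz = ph.p_memsz = (uint32_t) sizeof(interp_path);
    ph.p_flags = PF_R;
    ph.p_align = 1;
    return ph;
}

/* PT_DYNAMIC covers exactly the bytes of the .dynamic section. */
static phdr32_t make_dynamic_phdr(uint32_t file_off, uint32_t load_base,
                                  uint32_t size)
{
    assert(size % 8 == 0); /* whole number of 8-byte Elf32_Dyn entries */
    phdr32_t ph = {0};
    ph.p_type = PT_DYNAMIC;
    ph.p_offset = file_off;
    ph.p_vaddr = ph.p_paddr = load_base + file_off;
    ph.p_filesz = ph.p_memsz = size;
    ph.p_flags = PF_R | PF_W;
    ph.p_align = 4;
    return ph;
}
```

Both segments must lie inside a PT_LOAD segment so that their contents are actually mapped. The ELF specification also states that a PT_INTERP entry, if present, should precede any loadable segment entry, which is worth keeping in mind if a loader rejects the ordering used here.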
Verifying with readelf
After compiling and generating this ELF, running readelf -a confirms the presence of the additional segments and sections:
As shown, the .interp section correctly specifies /lib/ld-linux.so.3, and the dynamic section indicates that it needs libc.so.6. These entries verify that the ELF is now dynamically linked and will load the shared library and dynamic linker at runtime.
.dynsym and .dynstr Sections, and Issues with Adding .got, .plt, and rel.plt
In this stage, we add the .dynstr and .dynsym sections to the ELF. While these sections appear correctly in readelf, there is a lingering issue: the .strtab entry in the section header table appears as <corrupt>.
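To illustrate how a .dynsym entry refers to its name in .dynstr, here is a minimal sketch of an undefined external function symbol. sym32_t mirrors the Elf32_Sym layout; make_undef_func and the name offset are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Same field order as Elf32_Sym (16 bytes). */
typedef struct {
    uint32_t st_name;  /* offset of the symbol name in .dynstr */
    uint32_t st_value;
    uint32_t st_size;
    uint8_t st_info;   /* binding in the high 4 bits, type in the low 4 bits */
    uint8_t st_other;
    uint16_t st_shndx; /* defining section index, SHN_UNDEF if external */
} sym32_t;

_Static_assert(sizeof(sym32_t) == 16, "an ELF32 symbol is 16 bytes");

enum { STB_GLOBAL = 1, STT_FUNC = 2, SHN_UNDEF = 0 };

#define ST_INFO(bind, type) ((uint8_t) (((bind) << 4) | ((type) & 0xf)))

/* Describe an external function such as __libc_start_main whose name is
 * stored at `name_off` in .dynstr. */
static sym32_t make_undef_func(uint32_t name_off)
{
    assert(name_off != 0); /* offset 0 is the empty name */
    sym32_t s = {0};
    s.st_name = name_off;
    s.st_info = ST_INFO(STB_GLOBAL, STT_FUNC);
    s.st_shndx = SHN_UNDEF;
    return s;
}
```

Entry 0 of .dynsym is reserved and must be all zeros, so the first real symbol lives at index 1.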
Code Changes
Below is the relevant code that creates the .dynsym entries in elf_generate_sections():
Verifying with readelf
After compiling and generating this ELF, running readelf -a shows the following (truncated for brevity):
Observation:
The .dynsym table is correct and references the strings in .dynstr. However, section header entry [8] is displayed as <corrupt>. This suggests there might be an error in how the .strtab or other string tables are being appended to the ELF, causing the section header data to be misaligned or overwritten.
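One way to narrow down this kind of corruption is to check that no two sections claim overlapping byte ranges in the file. The following is a small sketch of such a check; range_t and find_overlap are hypothetical debugging helpers, and the offsets would come from whatever bookkeeping the generator keeps.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    const char *name;
    uint32_t offset; /* file offset of the section contents */
    uint32_t size;   /* size in bytes */
} range_t;

/* Return the index of the first section whose file range overlaps an
 * earlier one, or -1 if all ranges are disjoint. Empty sections are
 * ignored. */
static int find_overlap(const range_t *r, int n)
{
    assert(r != 0 && n >= 0);
    for (int i = 1; i < n; i++) {
        for (int j = 0; j < i; j++) {
            uint32_t i_end = r[i].offset + r[i].size;
            uint32_t j_end = r[j].offset + r[j].size;
            if (r[i].size && r[j].size && r[i].offset < j_end &&
                r[j].offset < i_end)
                return i;
        }
    }
    return -1;
}
```

Running a check like this before writing the section header table would flag the first section whose data lands on top of another, which is a plausible cause of the corrupt entry.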
Next Steps
Investigate .strtab Corruption
- Review how the .symtab and .strtab sections are being added to the ELF. There might be a miscalculation in offsets or lengths, leading to overwriting existing data.
- Check the offset and size of each section, including .dynsym, to ensure they do not overlap each other.
Implement External Symbol Handling
- Handle external symbols through .dynstr, .dynsym, .got, and .plt.
Implement rel.plt
- Emit R_ARM_JUMP_SLOT relocations in rel.plt.
- Provide a .plt stub for __libc_start_main() (and any other external functions) to handle calling conventions and symbol resolution at runtime.
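As a starting point for the rel.plt step, a single R_ARM_JUMP_SLOT entry can be sketched as below. rel32_t mirrors the Elf32_Rel layout, which has no addend field and is the form ARM32 conventionally uses; make_jump_slot and the example addresses are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Same layout as Elf32_Rel: 8 bytes, no addend field. */
typedef struct {
    uint32_t r_offset; /* address of the GOT slot to be patched */
    uint32_t r_info;   /* symbol index in the high 24 bits, type in the low 8 */
} rel32_t;

enum { R_ARM_JUMP_SLOT = 22 };

#define R_INFO(sym, type) (((uint32_t) (sym) << 8) | ((type) & 0xff))

/* Ask the dynamic linker to store the address of symbol number
 * `dynsym_index` (an index into .dynsym) in the GOT slot at
 * `got_slot_addr`. */
static rel32_t make_jump_slot(uint32_t got_slot_addr, uint32_t dynsym_index)
{
    assert(dynsym_index != 0); /* index 0 is the reserved null symbol */
    rel32_t r;
    r.r_offset = got_slot_addr;
    r.r_info = R_INFO(dynsym_index, R_ARM_JUMP_SLOT);
    return r;
}
```

Each external function called through the PLT needs one such entry, and the matching .plt stub loads its target from the same GOT slot.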