[Record] Utilization of GNU Binutils
===
###### tags: `record` `linux` `nm` `objdump` `readelf` `strip` `strings`
[Toc]
# Preface
While porting our system from an ARM-based architecture to a MIPS-based one, I used GNU binary utilities such as objdump, nm, and strip to check that the binaries produced by the cross-compiler were what I expected, to see which symbols a binary contained, and to strip unneeded symbols from it.
# Introduction[^ref1]
>The GNU Binutils are a collection of binary tools. The main ones are:
>
>- ld - the GNU linker.
>- as - the GNU assembler.
>
>But they also include:
>
>- addr2line - Converts addresses into filenames and line numbers.
>- ar - A utility for creating, modifying and extracting from archives.
>- c++filt - Filter to demangle encoded C++ symbols.
>- dlltool - Creates files for building and using DLLs.
>- gold - A new, faster, ELF only linker, still in beta test.
>- gprof - Displays profiling information.
>- nlmconv - Converts object code into an NLM.
>- nm - Lists symbols from object files.
>- objcopy - Copies and translates object files.
>- objdump - Displays information from object files.
>- ranlib - Generates an index to the contents of an archive.
>- readelf - Displays information from any ELF format object file.
>- size - Lists the section sizes of an object or archive file.
>- strings - Lists printable strings from files.
>- strip - Discards symbols.
>- windmc - A Windows compatible message compiler.
>- windres - A compiler for Windows resource files.
>
>Most of these programs use BFD, the Binary File Descriptor library, to do low-level manipulation. Many of them also use the opcodes library to assemble and disassemble machine instructions.
>
>The binutils have been ported to most major Unix variants as well as Wintel systems, and their main reason for existence is to give the GNU system (and GNU/Linux) the facility to compile and link programs.
# Notes on using binutils
## nm
```
Usage: nm [option(s)] [file(s)]
List symbols in [file(s)] (a.out by default).
The options are:
-a, --debug-syms Display debugger-only symbols
-A, --print-file-name Print name of the input file before every symbol
-B Same as --format=bsd
-C, --demangle[=STYLE] Decode low-level symbol names into user-level names
The STYLE, if specified, can be `auto' (the default),
`gnu', `lucid', `arm', `hp', `edg', `gnu-v3', `java'
or `gnat'
--no-demangle Do not demangle low-level symbol names
-D, --dynamic Display dynamic symbols instead of normal symbols
--defined-only Display only defined symbols
-e (ignored)
-f, --format=FORMAT Use the output format FORMAT. FORMAT can be `bsd',
`sysv' or `posix'. The default is `bsd'
-g, --extern-only Display only external symbols
-l, --line-numbers Use debugging information to find a filename and
line number for each symbol
-n, --numeric-sort Sort symbols numerically by address
-o Same as -A
-p, --no-sort Do not sort the symbols
-P, --portability Same as --format=posix
-r, --reverse-sort Reverse the sense of the sort
--plugin NAME Load the specified plugin
-S, --print-size Print size of defined symbols
-s, --print-armap Include index for symbols from archive members
--size-sort Sort symbols by size
--special-syms Include special symbols in the output
--synthetic Display synthetic symbols as well
-t, --radix=RADIX Use RADIX for printing symbol values
--target=BFDNAME Specify the target object format as BFDNAME
-u, --undefined-only Display only undefined symbols
-X 32_64 (ignored)
@FILE Read options from FILE
-h, --help Display this information
-V, --version Display this program's version number
nm: supported targets:
elf32-tradbigmips
elf32-tradlittlemips
ecoff-bigmips
ecoff-littlemips
elf32-ntradbigmips
elf64-tradbigmips
elf32-ntradlittlemips
elf64-tradlittlemips
elf64-little elf64-big
elf32-little elf32-big
plugin
srec
symbolsrec
verilog
tekhex
binary
ihex
```
**++Example++**
List the dynamic symbols of the shared library `libssp.so`:
```
~ # nm -D libssp.so
00000d24 T __chk_fail
U close
w __cxa_finalize
U _exit
U fgets
U free
U gets
00000d80 T __gets_chk
w _ITM_deregisterTMCloneTable
w _ITM_registerTMCloneTable
w _Jv_RegisterClasses
00000000 A LIBSSP_1.0
U malloc
U memcpy
00000ef0 T __memcpy_chk
U memmove
00000f30 T __memmove_chk
00000f70 T __mempcpy_chk
U memset
00000fd0 T __memset_chk
U open
U __progname
v program_invocation_short_name
U read
00000cec T __stack_chk_fail
00011650 B __stack_chk_guard
U stdin
00001010 T __stpcpy_chk
000010a0 T __strcat_chk
00001130 T __strcpy_chk
U strlen
000011b0 T __strncat_chk
U strncpy
000012e0 T __strncpy_chk
U syslog
U write
```
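When checking many cross-compiled libraries, it can help to post-process a listing like the one above in a script. The following is a minimal Python sketch, assuming the default BSD output format with hexadecimal values; `NmSymbol` and `parse_nm_line` are hypothetical helper names, not part of binutils.

```python
from typing import NamedTuple, Optional


class NmSymbol(NamedTuple):
    value: Optional[int]  # None when nm prints no address (e.g. undefined symbols)
    type: str             # single type letter such as "T", "U" or "w"
    name: str


def parse_nm_line(line: str) -> Optional[NmSymbol]:
    """Parse one line of default (BSD-format) nm output.

    Assumes hexadecimal values, i.e. no -t/--radix option was given.
    """
    fields = line.split()
    if len(fields) == 3 and len(fields[1]) == 1:
        value, sym_type, name = fields
        return NmSymbol(int(value, 16), sym_type, name)
    if len(fields) == 2 and len(fields[0]) == 1:
        sym_type, name = fields
        return NmSymbol(None, sym_type, name)
    return None  # blank line or a layout this sketch does not handle


# A few lines taken from the libssp.so listing above.
sample = """\
00000d24 T __chk_fail
U close
w __cxa_finalize
00000000 A LIBSSP_1.0
00011650 B __stack_chk_guard
"""

symbols = [s for s in map(parse_nm_line, sample.splitlines()) if s]
undefined = [s.name for s in symbols if s.type == "U"]
print(undefined)  # ['close']
```

In practice the text could come from something like `subprocess.run(["nm", "-D", path], capture_output=True, text=True).stdout`, using whichever cross-toolchain `nm` matches the target.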
>For each symbol, nm shows: [^ref2]
>
>The symbol value, in the radix selected by options (see below), or hexadecimal by default.
The symbol type. At least the following types are used; others are, as well, depending on the object file format. If lowercase, the symbol is usually local; if uppercase, the symbol is global (external). There are however a few lowercase symbols that are shown for special global symbols (u, v and w).
>A
The symbol’s value is absolute, and will not be changed by further linking.
>
>B
b
The symbol is in the uninitialized data section (known as BSS).
>
>C
The symbol is common. Common symbols are uninitialized data. When linking, multiple common symbols may appear with the same name. If the symbol is defined anywhere, the common symbols are treated as undefined references. For more details on common symbols, see the discussion of `--warn-common` in Linker options in The GNU linker.
>
>D
d
The symbol is in the initialized data section.
>
>G
g
The symbol is in an initialized data section for small objects. Some object file formats permit more efficient access to small data objects, such as a global int variable as opposed to a large global array.
>
>i
For PE format files this indicates that the symbol is in a section specific to the implementation of DLLs. For ELF format files this indicates that the symbol is an indirect function. This is a GNU extension to the standard set of ELF symbol types. It indicates a symbol which if referenced by a relocation does not evaluate to its address, but instead must be invoked at runtime. The runtime execution will then return the value to be used in the relocation.
>
>I
The symbol is an indirect reference to another symbol.
>
>N
The symbol is a debugging symbol.
>
>p
The symbol is in a stack unwind section.
>
>R
r
The symbol is in a read only data section.
>
>S
s
The symbol is in an uninitialized data section for small objects.
>
>T
t
The symbol is in the text (code) section.
>
>U
The symbol is undefined.
>
>u
The symbol is a unique global symbol. This is a GNU extension to the standard set of ELF symbol bindings. For such a symbol the dynamic linker will make sure that in the entire process there is just one symbol with this name and type in use.
>
>V
v
The symbol is a weak object. When a weak defined symbol is linked with a normal defined symbol, the normal defined symbol is used with no error. When a weak undefined symbol is linked and the symbol is not defined, the value of the weak symbol becomes zero with no error. On some systems, uppercase indicates that a default value has been specified.
>
>W
w
The symbol is a weak symbol that has not been specifically tagged as a weak object symbol. When a weak defined symbol is linked with a normal defined symbol, the normal defined symbol is used with no error. When a weak undefined symbol is linked and the symbol is not defined, the value of the symbol is determined in a system-specific manner without error. On some systems, uppercase indicates that a default value has been specified.
>
>`-`
>The symbol is a stabs symbol in an a.out object file. In this case, the next values printed are the stabs other field, the stabs desc field, and the stab type. Stabs symbols are used to hold debugging information.
>
>?
The symbol type is unknown, or object file format specific.
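
The case rule quoted above (uppercase is global, lowercase is usually local, with `u`, `v` and `w` as special global symbols) can be written as a small helper. This is only a sketch of that rule; `likely_binding` is a hypothetical name, and the result for lowercase letters is a heuristic because the documentation itself says "usually".

```python
# Lowercase letters that nm nevertheless uses for special global symbols.
SPECIAL_GLOBAL = {"u", "v", "w"}


def likely_binding(type_letter: str) -> str:
    """Guess the binding of a symbol from its nm type letter."""
    if type_letter in SPECIAL_GLOBAL or type_letter.isupper():
        return "global"
    if type_letter.islower():
        return "local"  # "usually local" according to the nm documentation
    return "unknown"    # e.g. '-' (stabs) or '?'


for letter in ("T", "t", "w", "U", "?"):
    print(letter, likely_binding(letter))
```

For the `libssp.so` listing earlier in this section, every symbol shown would be reported as global, which is expected for a dynamic symbol table.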
---
## objdump
```
objdump <option(s)> <file(s)>
Display information from object <file(s)>.
At least one of the following switches must be given:
-a, --archive-headers Display archive header information
-f, --file-headers Display the contents of the overall file header
-p, --private-headers Display object format specific file header contents
-P, --private=OPT,OPT... Display object format specific contents
-h, --[section-]headers Display the contents of the section headers
-x, --all-headers Display the contents of all headers
-d, --disassemble Display assembler contents of executable sections
-D, --disassemble-all Display assembler contents of all sections
-S, --source Intermix source code with disassembly
-s, --full-contents Display the full contents of all sections requested
-g, --debugging Display debug information in object file
-e, --debugging-tags Display debug information using ctags style
-G, --stabs Display (in raw form) any STABS info in the file
-W[lLiaprmfFsoRt] or
--dwarf[=rawline,=decodedline,=info,=abbrev,=pubnames,=aranges,=macro,=frames,
=frames-interp,=str,=loc,=Ranges,=pubtypes,
=gdb_index,=trace_info,=trace_abbrev,=trace_aranges,
=addr,=cu_index]
Display DWARF info in the file
-t, --syms Display the contents of the symbol table(s)
-T, --dynamic-syms Display the contents of the dynamic symbol table
-r, --reloc Display the relocation entries in the file
-R, --dynamic-reloc Display the dynamic relocation entries in the file
@<file> Read options from <file>
-v, --version Display this program's version number
-i, --info List object formats and architectures supported
-H, --help Display this information
```
**++Example++**
Get the compiler information from an object file.
```
~ # objdump -s --section .comment a.out
a.out: file format elf64-x86-64
Contents of section .comment:
0000 4743433a 20285562 756e7475 20372e33 GCC: (Ubuntu 7.3
0010 2e302d32 37756275 6e747531 7e31382e .0-27ubuntu1~18.
0020 30342920 372e332e 3000 04) 7.3.0.
```
This shows that the binary a.out was built with ***GCC (Ubuntu 7.3.0-27ubuntu1~18.04) 7.3.0***.
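
The hex columns of the dump can also be decoded back into the string by a script. Below is a minimal Python sketch; `decode_full_contents` is a hypothetical helper, and it assumes that each data line starts with an offset followed by at most four hex groups, stopping at the first token that is not pure hex. An ASCII column that happens to begin with a hex-looking word would confuse it.

```python
import re

HEX_GROUP = re.compile(r"[0-9a-fA-F]+")


def decode_full_contents(dump_lines):
    """Rebuild section bytes from the hex columns of `objdump -s` output."""
    data = bytearray()
    for line in dump_lines:
        tokens = line.split()
        # Data lines start with a hex offset; header lines do not.
        if len(tokens) < 2 or not HEX_GROUP.fullmatch(tokens[0]):
            continue
        for token in tokens[1:5]:  # at most four hex groups per line
            if not HEX_GROUP.fullmatch(token) or len(token) % 2:
                break  # reached the ASCII column
            data += bytes.fromhex(token)
    return bytes(data)


# The dump shown above, copied as text.
dump = """\
a.out: file format elf64-x86-64
Contents of section .comment:
0000 4743433a 20285562 756e7475 20372e33 GCC: (Ubuntu 7.3
0010 2e302d32 37756275 6e747531 7e31382e .0-27ubuntu1~18.
0020 30342920 372e332e 3000 04) 7.3.0.
"""

raw = decode_full_contents(dump.splitlines())
text = raw.rstrip(b"\x00").decode("ascii")
print(text)  # GCC: (Ubuntu 7.3.0-27ubuntu1~18.04) 7.3.0
```

The trailing `00` byte in the dump is the string terminator, which is why it is stripped before decoding.
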
**++Note++**
The `.comment` section of an ELF file can be removed by other tools, for example with objcopy:
```
objcopy --remove-section .comment a.out
```
Afterwards, objdump reports that the requested section cannot be found:
```
~ # objdump -s --section .comment a.out
a.out: file format elf64-x86-64
objdump: section '.comment' mentioned in a -j option, but not found in any input file
```
---
## readelf
```
Usage: readelf <option(s)> elf-file(s)
Display information about the contents of ELF format files
Options are:
-a --all Equivalent to: -h -l -S -s -r -d -V -A -I
-h --file-header Display the ELF file header
-l --program-headers Display the program headers
--segments An alias for --program-headers
-S --section-headers Display the sections' header
--sections An alias for --section-headers
-g --section-groups Display the section groups
-t --section-details Display the section details
-e --headers Equivalent to: -h -l -S
-s --syms Display the symbol table
--symbols An alias for --syms
--dyn-syms Display the dynamic symbol table
-n --notes Display the core notes (if present)
-r --relocs Display the relocations (if present)
-u --unwind Display the unwind info (if present)
-d --dynamic Display the dynamic section (if present)
-V --version-info Display the version sections (if present)
-A --arch-specific Display architecture specific information (if any)
-c --archive-index Display the symbol/file index in an archive
-D --use-dynamic Use the dynamic section info when displaying symbols
-x --hex-dump=<number|name>
Dump the contents of section <number|name> as bytes
-p --string-dump=<number|name>
Dump the contents of section <number|name> as strings
-R --relocated-dump=<number|name>
Dump the contents of section <number|name> as relocated bytes
-w[lLiaprmfFsoRt] or
--debug-dump[=rawline,=decodedline,=info,=abbrev,=pubnames,=aranges,=macro,=frames,
=frames-interp,=str,=loc,=Ranges,=pubtypes,
=gdb_index,=trace_info,=trace_abbrev,=trace_aranges,
=addr,=cu_index]
Display the contents of DWARF2 debug sections
--dwarf-depth=N Do not display DIEs at depth N or greater
--dwarf-start=N Display DIEs starting with N, at the same depth
or deeper
-I --histogram Display histogram of bucket list lengths
-W --wide Allow output width to exceed 80 characters
@<file> Read options from <file>
-H --help Display this information
-v --version Display the version number of readelf
```
**++Example++**
Another way to get the compiler information from an ELF binary is the readelf utility.
```
~ # readelf -p .comment a.out
String dump of section '.comment':
[ 0] GCC: (Ubuntu 7.3.0-27ubuntu1~18.04) 7.3.0
```
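
The string dump is easy to read back in a script. The sketch below is a minimal Python example; `parse_string_dump` is a hypothetical helper, and it assumes each entry looks like `[ offset] text` with the offset printed in hexadecimal.

```python
import re

# One entry per line: "[", optional padding, a hex offset, "]", then the string.
STRING_ENTRY = re.compile(r"^\s*\[\s*([0-9a-fA-F]+)\]\s+(.*)$")


def parse_string_dump(output: str):
    """Return (offset, string) pairs from `readelf -p <section>` output."""
    entries = []
    for line in output.splitlines():
        match = STRING_ENTRY.match(line)
        if match:
            entries.append((int(match.group(1), 16), match.group(2).rstrip()))
    return entries


# The readelf output shown above, copied as text.
output = """\
String dump of section '.comment':
[ 0] GCC: (Ubuntu 7.3.0-27ubuntu1~18.04) 7.3.0
"""

entries = parse_string_dump(output)
print(entries[0][1])  # GCC: (Ubuntu 7.3.0-27ubuntu1~18.04) 7.3.0
```

A `.comment` section may hold more than one string when objects built by different compilers are linked together, so the helper returns a list rather than a single value.
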
**++Note++**
If the `.comment` section has been stripped from the binary, readelf prints the warning below.
```
~ # readelf -p .comment a.out
readelf: Warning: Section '.comment' was not dumped because it does not exist!
```
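
When verifying that a build really dropped the section, a script can look for the diagnostics recorded in this note. This is a rough Python sketch; `section_reported_missing` is a hypothetical helper, and it assumes the message wording stays the same as in the transcripts here, which may differ between binutils versions.

```python
# Fragments of the diagnostics shown in this note.
MISSING_MARKERS = (
    "was not dumped because it does not exist",  # readelf -p
    "but not found in any input file",           # objdump -s --section
)


def section_reported_missing(tool_output: str) -> bool:
    """Return True if readelf/objdump output says the section is absent."""
    return any(marker in tool_output for marker in MISSING_MARKERS)


print(section_reported_missing(
    "readelf: Warning: Section '.comment' was not dumped because it does not exist!"
))  # True
```

The tools may write these diagnostics to standard error, so the text passed in should include both captured streams.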
# Reference
[^ref1]: https://www.gnu.org/software/binutils/
[^ref2]: https://sourceware.org/binutils/docs/binutils/nm.html