# Characteristics of Nintendo 64 compiler output

Or: heuristics for distinguishing them.

## General compiler stuff

### IDO

* Unconditional branches use `b`.
* Float literals can be loaded with `lui` followed by `mtc1`, or from rodata with `lui` and `lwc1`. These can be reordered among other instructions in the function. (It seems to never use `lui`, `ori`, `mtc1`, but this needs to be verified.)
* This is the only compiler where an instruction that isn't `mfhi` or `mflo` can follow a `break 6`.
* The expansion of division and modulus for IDO in general allows for reorderings that the other compilers don't have, but the `break 6` check is the easiest one to make.
* (`-g` only) The compiler will emit `b after; nop; after:` frequently. These are the words 0x10000001 0x00000000.

### KMC GCC (2.7.2)

* Unconditional branches use `j`.
* Float literals will always be loaded with a sequence of `lui`, an optional `ori`, and `mtc1`, with the GPR used for all three being $at. These are always directly sequential.
* `break 6` will always be followed by `mfhi` or `mflo`. `break 7` will always be followed by `mfhi`, `mflo`, or `addiu $at, $zero, -0x1` (exactly).

### GCC 2.8(?)

* Function epilogue (`addiu $sp, $sp, X`, `jr $ra`) can be omitted after an infinite loop.

### SN64

* Unconditional branches use `j`.
* Float literals will always be loaded with a sequence of `lui` and `lwc1`, with the GPR used for both being $at. These are always directly sequential.
* `mtc1 $at, ...` is *never* emitted.
* `break 6` will always be followed by `mfhi` or `mflo`. `break 7` will always be followed by `mfhi`, `mflo`, or `addiu $at, $zero, -0x1` (exactly).

## MIPS levels

### MIPS I

* Requires a `nop` after certain loads (TODO: which?)

### MIPS III

* An unencumbered `-mips3` (not patched as in e.g. KMC) will use `daddiu` etc. for loads (see e.g. iQue).

# Entrypoints

## makerom

Original assembly looks like

```mips
    la $8 <BssStart>
    li $9 <bssSize>
1:
    sw $0, 0($8)
    sw $0, 4($8)
    addi $8, 8
    addi $9, 0xfff8
    bne $9, $0, 1b
    la $29 <bootStack>
    la $10 <bootEntry>
    j $10
```

The symbols in `<>` are all written as numeric constants. IDO will expand `la` to `(lui)/addiu` and `li` to `(lui)/ori` in such cases; `j` expands to `jr`.

## mild

```mips
    la $8,_%sSegmentBssStart
    la $9,_%sSegmentBssSize
1:
    sw $0, 0($8)
    sw $0, 4($8)
    addi $8, 8
    addi $9, 0xfff8
    bne $9, $0, 1b
    la $10,%s # bootEntry
    la $29,%s # bootStack
    jr $10
```

(Not 100% confirmed, but very likely, based on reading it from the binary.) The instructions all use *symbols*, which GCC expands as `lui/addiu`.

## SN64

There are several versions of the entrypoint for SN64.

* The main distinguishing factors they have in common are that they all have `li $sp, 0x803FFFF0` (aka `lui $sp, 0x803F` followed by `ori $sp, $sp, 0xFFF0`) and that they use a `jal` for calling the bootproc instead of `jr`.
* `break` also appears frequently, although some custom versions do not use it.
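
Several of the checks in this document work on raw instruction words, so they can be automated. The following Python is a minimal sketch, not an established tool. All function names are hypothetical, and it assumes that the input is a big-endian code section, that `break 6` is encoded as 0x0006000D, and that the SN64 `lui`/`ori` pair for `$sp` is directly adjacent. Data words embedded in the code can cause false positives, so treat the results as hints only.

```python
import struct

BREAK_6 = 0x0006000D       # assumed encoding of `break 6`
IDO_DEBUG_B = 0x10000001   # `b after` with a one-instruction offset
NOP = 0x00000000
LUI_SP_803F = 0x3C1D803F   # lui $sp, 0x803F
ORI_SP_FFF0 = 0x37BDFFF0   # ori $sp, $sp, 0xFFF0


def read_words(data):
    """Decode big-endian 32-bit words; trailing bytes are ignored."""
    count = len(data) // 4
    return list(struct.unpack(">%dI" % count, data[:count * 4]))


def pairs(words):
    """Yield each word together with the word that follows it."""
    return zip(words, words[1:])


def count_branch_styles(words):
    """Return (b_count, j_count): `beq $zero, $zero` versus `j`."""
    b_count = sum(1 for w in words if (w >> 16) == 0x1000)
    j_count = sum(1 for w in words if (w >> 26) == 0x02)
    return b_count, j_count


def has_ido_debug_branch(words):
    """Look for the `b after; nop` pair that IDO emits with -g."""
    return any(a == IDO_DEBUG_B and b == NOP for a, b in pairs(words))


def is_mfhi_or_mflo(word):
    """True for `mfhi rd` or `mflo rd`, for any destination register."""
    return (word & 0xFFFF07FF) in (0x00000010, 0x00000012)


def break6_followed_by_other(words):
    """True if some `break 6` is not followed by mfhi/mflo (IDO hint)."""
    return any(a == BREAK_6 and not is_mfhi_or_mflo(b)
               for a, b in pairs(words))


def has_sn64_stack_setup(words):
    """Look for `li $sp, 0x803FFFF0` expanded as an adjacent lui/ori."""
    return any(a == LUI_SP_803F and b == ORI_SP_FFF0
               for a, b in pairs(words))
```

For example, `has_ido_debug_branch(read_words(code_bytes))` could be run over a function's bytes, where `code_bytes` is whatever slice of the ROM is being examined. A high `b` count with few `j` instructions points towards IDO, and the reverse towards the GCC-based compilers or SN64, but none of these checks is conclusive on its own.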