
LLVM-based GBA Emulator

tags: llvm GBA

Source Code

TODO

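  • Continue working on the ARM -> LLVM IR mapping
  • I forgot that Thumb mode instructions are 16 bits wide, so quite a lot has to change now
  • The mapping of the value system needs to be sorted out again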
  • CPSR
  • MMIO
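
To make the Thumb and CPSR items above concrete, here is a minimal sketch of how the fetch width could depend on the T bit (bit 5 of the CPSR). It is not taken from the emulator's source: `FetchedInstr`, `fetch`, and the flat `mem` byte vector are hypothetical names, and a real implementation would go through the memory/MMIO layer instead.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Bit 5 of the CPSR is the T (Thumb state) flag.
constexpr std::uint32_t kCpsrThumbBit = 1u << 5;

// Hypothetical result type: the raw instruction bits and their size in bytes.
struct FetchedInstr {
    std::uint32_t bits;  // 16 significant bits in Thumb state, 32 in ARM state
    std::uint32_t size;  // 2 or 4; also the amount the PC advances by
};

// Reads one little-endian instruction from `mem` at byte offset `pc`.
// `mem` is an assumption standing in for the emulator's real memory layer.
inline FetchedInstr fetch(const std::vector<std::uint8_t> &mem,
                          std::uint32_t pc, std::uint32_t cpsr) {
    const bool thumb = (cpsr & kCpsrThumbBit) != 0;
    const std::uint32_t size = thumb ? 2u : 4u;
    assert(pc % size == 0 && "instruction address must be aligned");
    assert(static_cast<std::size_t>(pc) + size <= mem.size());

    std::uint32_t bits = 0;
    for (std::uint32_t i = 0; i < size; ++i)
        bits |= static_cast<std::uint32_t>(mem[pc + i]) << (8 * i);
    return {bits, size};
}
```

Anything downstream that assumed a fixed 4-byte instruction (PC increments, branch offsets, the decoder entry point) would have to take `size` into account.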

Reference

Info

  • An MCInst is nothing more than an opcode plus a list of operands
  • ARMGenInstrInfo.inc lists every opcode the ARM backend recognizes
  • At which step does the LLVM backend generate MCInst, and how?

lib/Target/ARM/ARMAsmPrinter.cpp:1289

void ARMAsmPrinter::emitInstruction(const MachineInstr *MI) {

It first tries to convert with a TableGen-generated function:

bool ARMAsmPrinter::
emitPseudoExpansionLowering(MCStreamer &OutStreamer,
                            const MachineInstr *MI) {
  switch (MI->getOpcode()) {
  default: return false;
  case ARM::B: {
    MCInst TmpInst;
    MCOperand MCOp;
    TmpInst.setOpcode(ARM::Bcc);
    // Operand: target
    lowerOperand(MI->getOperand(0), MCOp);
    TmpInst.addOperand(MCOp);
    // Operand: p
    TmpInst.addOperand(MCOperand::createImm(14));
    TmpInst.addOperand(MCOperand::createReg(0));
    EmitToStreamer(OutStreamer, TmpInst);
    break;
  }

If that function cannot handle the instruction, it is converted by hand-written code.
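
The two-stage flow above (generated lowering first, manual lowering as a fallback) can be sketched without LLVM. Everything here is a toy stand-in: `ToyMCInst`, `ToyMachineInstr`, `generatedLowering`, and `manualLowering` are hypothetical names, not LLVM's API, and the manual stage is reduced to a plain copy. The `B` to `Bcc` case mirrors the excerpt, where 14 is the "always" condition code.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Toy opcode numbering; not LLVM's real values.
enum ToyOpcode : unsigned { TOY_B, TOY_Bcc, TOY_MOV };

struct ToyOperand {
    enum Kind { Reg, Imm } kind;
    std::int64_t value;
};

// In this sketch an "MCInst" is just an opcode and a list of operands.
struct ToyMCInst {
    unsigned opcode = 0;
    std::vector<ToyOperand> operands;
};

// Same shape for the machine-level input, to keep the sketch short.
struct ToyMachineInstr {
    unsigned opcode = 0;
    std::vector<ToyOperand> operands;
};

// Stage 1: plays the role of the TableGen-generated function.
// Returns false when it has no rule for the opcode.
inline bool generatedLowering(const ToyMachineInstr &mi,
                              std::vector<ToyMCInst> &out) {
    switch (mi.opcode) {
    default:
        return false;
    case TOY_B: {
        assert(!mi.operands.empty() && "B needs a target operand");
        ToyMCInst tmp;
        tmp.opcode = TOY_Bcc;
        tmp.operands.push_back(mi.operands[0]);         // target
        tmp.operands.push_back({ToyOperand::Imm, 14});  // predicate: always
        tmp.operands.push_back({ToyOperand::Reg, 0});   // no predicate register
        out.push_back(tmp);
        break;
    }
    }
    return true;
}

// Stage 2: hand-written fallback, reduced here to a one-to-one copy.
inline void manualLowering(const ToyMachineInstr &mi,
                           std::vector<ToyMCInst> &out) {
    ToyMCInst inst;
    inst.opcode = mi.opcode;
    inst.operands = mi.operands;
    out.push_back(inst);
}

// Try the generated rules first; fall back to the manual path otherwise.
inline void emitInstruction(const ToyMachineInstr &mi,
                            std::vector<ToyMCInst> &out) {
    if (generatedLowering(mi, out))
        return;
    manualLowering(mi, out);
}
```

Feeding a `TOY_B` through `emitInstruction` yields a single `TOY_Bcc` with three operands, while a `TOY_MOV` takes the fallback path unchanged.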

  • MCInstrDesc ?
/// Describe properties that are true of each instruction in the target
/// description file.  This captures information about side effects, register
/// use and many other things.  There is one instance of this struct for each
/// target instruction class, and the MachineInstr class points to this struct
/// directly to describe itself.
class MCInstrDesc {
public:
  unsigned short Opcode;         // The opcode number
  unsigned short NumOperands;    // Num of args (may be more if variable_ops)
  unsigned char NumDefs;         // Num of args that are definitions
  unsigned char Size;            // Number of bytes in encoding.
  unsigned short SchedClass;     // enum identifying instr sched class
  uint64_t Flags;                // Flags identifying machine instr class
  uint64_t TSFlags;              // Target Specific Flag values
  const MCPhysReg *ImplicitUses; // Registers implicitly read by this instr
  const MCPhysReg *ImplicitDefs; // Registers implicitly defined by this instr
  const MCOperandInfo *OpInfo;   // 'NumOperands' entries about operands
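  • The "target description file" is, I guess, the .td file.
  • Every instruction class has one MCInstrDesc instance. "Instruction class" here appears to mean an individual instruction definition rather than a broad category (data processing, branch, cmp, etc.): one opcode corresponds to one MCInstrDesc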

:exclamation: This opcode is not the opcode field of the final instruction encoding; it is an opcode number that LLVM defines for itself
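
A small sketch of that distinction, assuming a descriptor table indexed by an LLVM-style opcode number. `DescOpcode`, `ToyInstrDesc`, `kDescs`, and `getDesc` are hypothetical, and the operand counts are illustrative rather than LLVM's real values for the ARM backend. The only fact taken from the ARM encoding is that ADD uses `0b0100` in the 4-bit opcode field (bits 24..21) of a data-processing instruction.

```cpp
#include <cassert>

// LLVM-style opcodes are just consecutive indices into a descriptor table.
enum DescOpcode : unsigned { OP_ADDrr, OP_Bcc, OP_CMPrr, OP_NUM_OPCODES };

// Trimmed-down stand-in for MCInstrDesc; field values below are illustrative.
struct ToyInstrDesc {
    unsigned short opcode;       // the LLVM-style opcode number (table index)
    unsigned short numOperands;  // number of operands
    unsigned char numDefs;       // how many of those operands are definitions
    unsigned char size;          // bytes in the encoding
};

// One descriptor per opcode, stored at the index equal to the opcode.
constexpr ToyInstrDesc kDescs[OP_NUM_OPCODES] = {
    {OP_ADDrr, 3, 1, 4},
    {OP_Bcc,   3, 0, 4},
    {OP_CMPrr, 2, 0, 4},
};

inline const ToyInstrDesc &getDesc(unsigned opcode) {
    assert(opcode < OP_NUM_OPCODES && "unknown opcode");
    return kDescs[opcode];
}

// Opcode field (bits 24..21) of the ARM data-processing encoding for ADD.
constexpr unsigned kArmAddOpcodeField = 0x4;

// The table index and the encoding field are unrelated numbers.
static_assert(OP_ADDrr != kArmAddOpcodeField,
              "table index is not the encoding's opcode field");
```

Looking up `getDesc(OP_ADDrr)` returns the descriptor at index 0, which says nothing about the bits that end up in the encoded instruction.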

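  • MachineInstr holds a pointer to this class, which suggests that it is probably the last stage of codegen.
  • IRTranslator maps each LLVM IR instruction listed in Instruction.def to a gMIR opcode in TargetOpcodes.def. Every gMIR opcode has a corresponding MCInstrDesc, and MCInstrDesc is defined by the backend
  • A Value does not carry any number of its own.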
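
Since a Value has no number of its own, any numbering has to live in a side table keyed by the value's address. The sketch below shows one way to do that. `ToyValue` and `ToySlotTracker` are hypothetical names for illustration, not LLVM classes, and numbers are handed out in first-request order, which is an assumption of this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>

// A value with no id field: its identity is its address.
struct ToyValue {};

// Assigns numbers on demand and remembers them in a side table.
class ToySlotTracker {
    std::unordered_map<const ToyValue *, std::size_t> slots_;

public:
    // Returns the slot for `v`, allocating the next free number on first use.
    std::size_t slotFor(const ToyValue *v) {
        assert(v != nullptr && "cannot number a null value");
        auto it = slots_.find(v);
        if (it != slots_.end())
            return it->second;
        const std::size_t next = slots_.size();
        slots_.emplace(v, next);
        return next;
    }
};
```

Two trackers may assign different numbers to the same value, so the number belongs to the tracker and not to the value.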

Question

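  • What data does a code generator need?
  • Following on from that, which data and code can TableGen generate, and which have to be written by hand?
  • Following on from that, what has to be prepared so that TableGen can generate the required data and code?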