# Discussing the structure of RzIL ops

We decided to use Core Theory as the basis for our IL opcodes. Our task is to implement the interfaces declared on this [page](http://binaryanalysisplatform.github.io/bap/api/master/bap-core-theory/Bap_core_theory/Theory/module-type-Minimal/index.html).

We may have to create these types in C ([link](http://binaryanalysisplatform.github.io/bap/api/odoc/bap-core-theory/Bap_core_theory/index.html#hierarchy-of-terms)).

## RzIL opcode : enums and structs

### enums

The enum may be defined like this:

```c=
typedef enum {
    // Init
    OP_VAR,
    OP_UNK,
    OP_ITE,

    // Bool
    OP_AND,
    OP_OR,
    // ...

    // BitVector
    OP_INT,
    // ...

    // Memory
    OP_LOAD,
    OP_STORE,

    // Effects (opcodes with side effects)
    OP_PERFORM,
    OP_SET,
    // ...
} CoreTheoryOp;
```

We use the struct `RzILOp` as our internal data structure to represent an instruction (aka `op`) instead of using a string:

```c=
typedef enum {/*...*/} CoreTheoryOp; // the enum shown above

/* define a struct for every CoreTheory opcode */
// for example : ite in OCaml
// val ite : bool -> 'a pure -> 'a pure -> 'a pure
// ite c x y is x if c evaluates to b1 else y.

// we create the following struct
// RzIL_PURE expands to nothing; it is only a marker to remind
// the developer that the field is a pure type
// RzILVal is discussed in the next section
#define RzIL_PURE
typedef struct {
    // 3 arguments
    RzIL_PURE RzILVal arg1; // the two pure branches (x and y)
    RzIL_PURE RzILVal arg2;
    Bool arg3;              // the condition (c)

    // TODO: not sure whether the return value should be stored here
    // RzIL_PURE RzILVal ret;
} Op_ite;

// ... more opcodes

// then define a union over all of these structs
typedef union {
    Op_ite ite;
    Op_load load;
    // ... more
} _RzILOp;
```

The wrapper struct of an RzIL instruction looks like this:

```c=
// the final structure
typedef struct RzILOp_t {
    ut64 id;
    CoreTheoryOp code;
    _RzILOp op;
} RzILOp;
```

- `code` : enum of the Core Theory op. It can be used to look up the corresponding handler function, `op_A_funcp = func_table[opcode];`, and it may also be used in the VM (switch case etc.).
- `id` : identifier of a specific instruction. It should be unique.
  Anton suggested adding this attribute. In the uplifting stage, one assembly instruction can be translated to multiple RzIL instructions, or multiple assembly instructions can be translated to a single RzIL instruction, so we may use this identifier with a hash table to find a specific RzIL instruction.
- `op` : the `_RzILOp` union.

The VM is a fetch-and-exec loop, but an Effect structure is obtained before `exec`:

```c=
while (true) {
    RzILOp *cur_op = fetch_one();
    // get possible side effects
    RzILEffect eff = parse(cur_op);
    handle_effect(eff);
    exec(cur_op);
}
```

`RzILEffect` is not designed yet. Anton mentioned the delay slot case in MIPS, which should be taken into account.

```c=
RzILEffect exec(RzILOp *op) {
    // look up the handler by opcode and let it execute the op
    CoreTheoryOp opcode = op->code;
    RzILfuncp handler = func_table[opcode];
    return handler(op);
}
```

## Val and Var structures

We should implement these two to support some features of OCaml types.

- Val -> stores the value/data
- Var -> meta info about a variable

```c=
typedef struct {
    union {
        // CoreTheory types
        BitVector bv;
        Bool bl;
        // C predefined types
        ut8 u8;
        int i;
        float f;
    } data; // member names are placeholders
} RzILVal;

typedef struct {
    char *type;    // type name
    char *varname; // variable name
} RzILVar;
```

From my POV, it would be better to build a wrapper around these two:

```c=
typedef struct {
    RzILVal val;
    RzILVar var;
} RzILVariable;
```

## function prototype

### bitvector mod && Bool mod

The function prototypes in these two modules are straightforward:

```c=
Bool rz_il_op_inv (Bool);
Bool rz_il_op_and (Bool, Bool);
Bool rz_il_op_or (Bool, Bool);
// ...
Bool rz_il_op_lsb (BitVector);
BitVector rz_il_op_add (BitVector, BitVector);
```

### memory Mod

Only two instructions: `load` and `store` -> we should discuss the memory design of our new VM.

### Init Mod

1. `unk` is like NULL, just an identifier.
2. `RzILVal rz_il_op_var (RzILVar variable);` gets the value of a variable.
3. `RzILVariable op_ite(Bool condition, RzILVariable x, RzILVariable y)` => `condition ? x : y`

### Effect Mod

No good idea yet, need to think more.
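
## Sketch: dispatching through the opcode table

To make the enum + union + `func_table` idea from the sections above more concrete, here is a minimal, self-contained sketch. It is not the final design, and several things are assumptions for illustration only: `RzILVal` is reduced to a single 64-bit payload (no `BitVector` yet), `uint64_t` stands in for `ut64`, the handlers return a value instead of an `RzILEffect`, and the names `Op_bool2`, `OpArgs`, `il_handler`, `il_exec`, `make_ite` and `OP_COUNT` are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Trimmed-down opcode set; OP_COUNT is a hypothetical sentinel
 * used only to size the handler table. */
typedef enum { OP_ITE, OP_AND, OP_OR, OP_COUNT } CoreTheoryOp;

/* Simplified stand-in for RzILVal: one 64-bit payload. */
typedef struct { uint64_t bits; } RzILVal;

/* Per-opcode argument structs. */
typedef struct { bool cond; RzILVal x; RzILVal y; } Op_ite;
typedef struct { bool a; bool b; } Op_bool2; /* shared by and / or */

/* Union over all argument structs (plays the role of _RzILOp). */
typedef union {
    Op_ite ite;
    Op_bool2 and_;
    Op_bool2 or_;
} OpArgs;

/* Wrapper: unique id, opcode tag, and the arguments. */
typedef struct {
    uint64_t id;
    CoreTheoryOp code;
    OpArgs op;
} RzILOp;

/* Every handler takes the op and returns the computed value. */
typedef RzILVal (*il_handler)(const RzILOp *op);

static RzILVal handle_ite(const RzILOp *op) {
    /* ite c x y is x if c is true, else y */
    return op->op.ite.cond ? op->op.ite.x : op->op.ite.y;
}

static RzILVal handle_and(const RzILOp *op) {
    RzILVal v = { op->op.and_.a && op->op.and_.b };
    return v;
}

static RzILVal handle_or(const RzILOp *op) {
    RzILVal v = { op->op.or_.a || op->op.or_.b };
    return v;
}

/* Handler table indexed by opcode. */
static const il_handler func_table[OP_COUNT] = {
    [OP_ITE] = handle_ite,
    [OP_AND] = handle_and,
    [OP_OR] = handle_or,
};

/* Look up the handler for op->code and run it. */
RzILVal il_exec(const RzILOp *op) {
    assert((unsigned)op->code < OP_COUNT);
    return func_table[op->code](op);
}

/* Convenience constructor for an ite op. */
RzILOp make_ite(uint64_t id, bool cond, uint64_t x, uint64_t y) {
    RzILOp op = { .id = id, .code = OP_ITE };
    op.op.ite.cond = cond;
    op.op.ite.x.bits = x;
    op.op.ite.y.bits = y;
    return op;
}
```

For example, `il_exec` on the op built by `make_ite(1, true, 10, 20)` selects the `x` branch, so the returned `bits` is 10. The table lookup keeps the VM loop free of a large switch. The cost is that every handler must share one signature, which is one more reason the shape of the return type (value vs. `RzILEffect`) needs to be settled early.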