# Superscalar out-of-order RISC-V capable of booting Linux

> 林志懋

Make `soomrv` support RISC-V rv64 execution.

## Todo
- [ ] uOPs definition

Frontend:
- [ ] Inst fetch
- [ ] Decoder
- [ ] Rename
- [ ] Branch prediction logic

Execution:
- [ ] Issue Queue
- [ ] Operand collect
- [ ] ALU
- [ ] Branch
- [ ] CSR

## Decode
Since the implementation of the fetch stage is tightly coupled with the memory subsystem, I will skip that part for now.

The decode stage mainly produces two outputs: the decoded internal representation, `D_UOp`, and information about special instructions that change the control flow, such as `wfi`, `ecall`, ... Adding rv64 instructions won't change the latter.

Changes have been committed to [commit](https://github.com/millaker/SoomRV/commit/337226a89d8edd46c446785fe5f9ff571db10b1b).

## Rename
For every valid decoded uOp (`D_UOp`), the rename module allocates a ROB entry, identified by the sequence number `SqN`. At the same time, the availability of each source operand is looked up in the `RenameTable`, i.e. the register alias table (RAT). Sources that are already present in the physical register file are marked valid in the output `R_UOp` (`availA`, `availB`, `availC`), and the index into the ROB is also stored in the uOp.

## IssueQueue
The rename module enqueues the renamed `R_UOp`s into one of the `IssueQueue`s. `R_UOp`s that reside in the queue update their source availability every cycle. When an instruction is ready, it is issued to its target functional unit.

To support rv64, immediates will need 64 bits to encode. While modifying the bit width of the immediates in `IssueQueue`, I came across the following section and couldn't figure out the special encoding used for these three opcodes. I will need to look into the branch unit, where these immediates are decoded and used.

```verilog
// Special handling for jalr
if (HasFU(FU_BRANCH) && enqCandidates[i].fu == FU_BRANCH &&
    (enqCandidates[i].opcode == BR_V_JALR ||
     enqCandidates[i].opcode == BR_V_JR ||
     enqCandidates[i].opcode == BR_V_RET)
) begin
    assert(IMM_BITS == 36);
    assert(NUM_OPERANDS == 2);
    // Use {imm[0], tags[1]} to encode 8 bits of imm12
    temp.tags[NUM_OPERANDS-1] = Tag'(enqCandidates[i].imm12[6:0]);
    temp.imm[0] = enqCandidates[i].imm12[7];
    // rest goes into upper 4 bits of 36 (!) immediate bits
    temp.imm[IMM_BITS-1-:4] = enqCandidates[i].imm12[11:8];
    // tags[1] is not used for register encoding, thus is always valid
    temp.avail[NUM_OPERANDS-1] = 1;
end
```

So this is just to save registers. The predicted address of a `jalr` has to be sent down the pipeline, so the otherwise unused fields store the immediate, which is restored when the uOp is dequeued.

```verilog
typedef struct packed
{
    logic[IMM_BITS-1:0] imm;
    logic[NUM_OPERANDS-1:0] avail;
    Tag[NUM_OPERANDS-1:0] tags;
    ...
} R_ST_UOp;
```

We need to extend `IMM_BITS`, since the bit width of `tags` won't change.

[Commit](https://github.com/millaker/SoomRV/commit/6608361c213829c5ab6a4d3c4e4abd8c9bb2dc4f)

## Load
uOps piped through the `IssueQueue` only carry register file tags, not the actual register values, so the `Load` stage is responsible for reading the values from the physical register file. The validity of each value is resolved in the `IssueQueue`, so `Load` is guaranteed to read the up-to-date value.

[Commit](https://github.com/millaker/SoomRV/commit/11f4a59e98b095e1d4f6b08ae877b6b3e6ff7ca7)

## IntALU
The ALU now operates on 64-bit instead of 32-bit values.

[Commit](https://github.com/millaker/SoomRV/commit/32dd8e0f382eb89ad2e7f3b687961305b307c244)

## Multiplier
rv64 adds a new instruction, `MULW`, which multiplies the lower 32 bits of the source registers and sign-extends the 32-bit result to 64 bits. All other `MUL*` instructions operate on 64-bit values.

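The `MULW` behaviour described above can be sketched as a small combinational block. This is only an illustration of the instruction semantics, not the multiplier from the SoomRV sources: the module name `MulwSketch` and its ports `srcA`, `srcB` and `result` are made up for this example.

```verilog
// Hypothetical helper module illustrating MULW semantics.
// Not taken from the SoomRV sources.
module MulwSketch (
    input  logic [63:0] srcA,
    input  logic [63:0] srcB,
    output logic [63:0] result
);
    // Only the lower 32 bits of each operand take part in the multiply,
    // and the product is truncated to 32 bits by the assignment width.
    logic [31:0] low;
    assign low = srcA[31:0] * srcB[31:0];

    // Sign-extend bit 31 of the truncated product to 64 bits.
    assign result = {{32{low[31]}}, low};
endmodule
```

The lower 32 bits of a product are the same for signed and unsigned operands, which is why a single variant of the instruction is enough.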
[Commit](https://github.com/millaker/SoomRV/commit/310aaf05c8c1f71279438d6d31ede0acf82e4f1a)

## Divider
As with the multiplier, the divider gains several new instructions that operate on 32-bit values: `DIVW`, `DIVUW`, `REMW`, `REMUW`.

[Commit](https://github.com/millaker/SoomRV/commit/91a60ff2e292fa64d0beadc782e70660e71c3970)

## CSR
Because rv64 can operate on 64-bit values directly, there is no need to split some CSRs into two halves. `mstatus`, `mcycle`, and other CSRs no longer require a separate read of the `*h` version to form the full CSR value.

[Commit](https://github.com/millaker/SoomRV/commit/261c204ceef00c6045894d97ea41d7b782884e5f)

I have left some CSRs unchanged because I'm not interested in them for now.

## LoadStore system
Unlike arithmetic instructions, load/store instructions require address translation when virtual memory is involved. Starting from the address generation unit (`AGU`), the virtual address is translated into a physical address, either by looking up the TLB or by a page table walk performed by a hardware `PageTableWalker`. `SoomRV` employs a VIPT (virtually indexed, physically tagged) cache, so the cache is indexed by the virtual address and an early load signal, `eldUOp`, is used to signal the cache.

[Commit](https://github.com/millaker/SoomRV/commit/8c879158f26b129fe536e8571ec354cf27f4dc8f)

After the physical address is obtained, the command is enqueued into either the load queue or the store queue. The store queue makes sure that memory stays consistent with the committed instructions: only when a store instruction is committed can other components see its change to memory. The load queue enables load bypass, which checks whether the load value can be forwarded from the store queue.

[Commit](https://github.com/millaker/SoomRV/commit/7712c5b55f82f14de9f399d9566e72c9848da5de)
[Commit](https://github.com/millaker/SoomRV/commit/fffde980d32bb4f3db3f0a448825d17a4f696098)
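
The forwarding check between a load and the store queue can be sketched roughly as below. This is a simplified model and not SoomRV's actual load/store queue. The module name `StoreForwardSketch`, its ports, and the flat entry ordering (index 0 is the oldest store) are assumptions made for illustration. It also assumes that every entry is older than the load and that accesses are full 64-bit words, so access sizes, byte masks and partial overlaps are ignored.

```verilog
// Hypothetical sketch of store-to-load forwarding; not SoomRV's implementation.
// Assumes entry 0 is the oldest pending store and NUM_ENTRIES-1 the youngest.
module StoreForwardSketch #(
    parameter int NUM_ENTRIES = 4
) (
    input  logic [NUM_ENTRIES-1:0]    valid,     // entry holds a pending store
    input  logic [NUM_ENTRIES*64-1:0] addr,      // flattened store addresses
    input  logic [NUM_ENTRIES*64-1:0] data,      // flattened store data
    input  logic [63:0]               loadAddr,  // physical address of the load
    output logic                      hit,       // some pending store matches
    output logic [63:0]               fwdData    // data of the youngest match
);
    always_comb begin
        hit     = 1'b0;
        fwdData = '0;
        // Scan oldest to youngest so a younger match overrides an older one.
        for (int i = 0; i < NUM_ENTRIES; i++) begin
            if (valid[i] && addr[i*64 +: 64] == loadAddr) begin
                hit     = 1'b1;
                fwdData = data[i*64 +: 64];
            end
        end
    end
endmodule
```

When `hit` is low, the load would have to take its value from the cache instead. A real queue is typically circular and has to compare sequence numbers to decide which stores are older than the load.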