
Enhance visualized RISC-V simulation

陳致翰, 孫禾洵

Demo

The demo of our final result is available here:

RV32I Instruction Set

RV32I is the 32-bit base integer instruction set architecture (ISA) on which the rest of RISC-V builds. Designed as a minimal ISA, it provides the functionality needed for general-purpose computing.

Instruction Format

RV32I instructions are encoded in a fixed 32-bit format, supporting multiple types:

  • R-Type: Used for register-to-register operations (e.g., addition, bitwise AND).
  • I-Type: Used for operations involving immediates and load instructions.
  • S-Type: Encodes store instructions.
  • B-Type: Handles conditional branching.
  • U-Type: Carries a 20-bit upper immediate, used by lui (load upper immediate) and auipc (add upper immediate to pc).
  • J-Type: Encodes jump instructions.

Depending on its type, an instruction combines some of the following fields:

  • Opcode: Specifies the operation.
  • Funct3/Funct7: Additional function codes for specifying variations.
  • Registers: rs1 (source 1), rs2 (source 2), rd (destination).
  • Immediate Values: Encoded constants for arithmetic or memory access.
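
As a rough illustration of how these fields sit inside a 32-bit word, the sketch below extracts them with shifts and masks. The function name decodeFields is an assumption made for this example and is not part of emulsiV; the bit positions follow the R-Type layout.

```javascript
// Hypothetical helper (not from emulsiV): split a 32-bit instruction word
// into the fixed-position fields used by the R-Type format.
function decodeFields(word) {
    return {
        opcode: word & 0x7f,           // bits 6..0
        rd:     (word >>> 7)  & 0x1f,  // bits 11..7
        funct3: (word >>> 12) & 0x07,  // bits 14..12
        rs1:    (word >>> 15) & 0x1f,  // bits 19..15
        rs2:    (word >>> 20) & 0x1f,  // bits 24..20
        funct7: (word >>> 25) & 0x7f,  // bits 31..25
    };
}

// 0x002081b3 is the encoding of "add x3, x1, x2".
const addFields = decodeFields(0x002081b3);
console.log(addFields);
```

For the other formats, the register fields that exist stay in the same positions; only the immediate bits are arranged differently.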

Instruction Categories

The RV32I ISA is divided into functional groups:

  1. Arithmetic Operations
    • Addition/Subtraction: add, sub.
    • Logical Operations: and, or, xor.
    • Shift Operations: sll (shift left logical), srl (shift right logical), sra (shift right arithmetic).
    • Comparison: slt (set if less than), sltu (unsigned comparison).
  2. Memory Access
    • Load Instructions: lb, lh, lw (load byte, halfword, word), lbu, lhu (unsigned).
    • Store Instructions: sb, sh, sw (store byte, halfword, word).
  3. Control Transfer
    • Branching: beq, bne (branch if equal/not equal), blt, bge (branch if less/greater or equal), bltu, bgeu (unsigned variants).
    • Jump Instructions: jal (jump and link), jalr (jump and link register).
  4. Immediate Instructions
    • Arithmetic with Immediate: addi, andi, ori, xori.
    • Shift Immediate: slli, srli, srai.

RV32IM Instruction Set

M Extension

Building on the RV32I base instruction set, RV32IM adds support for integer multiplication and division, including both signed and unsigned operations.

Integer Multiplication Instructions

  1. mul: Signed multiplication that stores the lower 32 bits of the product in the destination register.

    Instruction: mul x3, x1, x2
    Meaning: x3 = low 32 bits of (x1 × x2)
  2. mulh: Signed multiplication that stores the upper 32 bits of the product in the destination register.

    Instruction: mulh x4, x1, x2
    Meaning: x4 = high 32 bits of (x1 × x2), both operands signed
  3. mulhu: Unsigned multiplication that stores the upper 32 bits of the product in the destination register.

    Instruction: mulhu x5, x1, x2
    Meaning: x5 = high 32 bits of (x1 × x2), both operands unsigned
  4. mulhsu: Mixed multiplication where the first operand is signed and the second is unsigned; the upper 32 bits of the product are stored in the destination register.

    Instruction: mulhsu x6, x1, x2
    Meaning: x6 = high 32 bits of (x1 × x2)
    • In this instruction, x1 is signed, and x2 is unsigned.
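
To make the difference between the four variants concrete, the following sketch applies each of them to the same pair of bit patterns. The helper names (asSigned, asUnsigned, high32) are assumptions introduced for this example only; they are not emulsiV functions.

```javascript
// Hypothetical helpers: view a 32-bit pattern as a signed or unsigned BigInt.
const asSigned   = (x) => BigInt(x | 0);
const asUnsigned = (x) => BigInt(x >>> 0);

// Keep the upper 32 bits of a 64-bit product as an unsigned 32-bit number.
const high32 = (product) => Number(BigInt.asUintN(32, product >> 32n));

const opA = 0xffffffff; // -1 when signed, 4294967295 when unsigned
const opB = 0x80000000; // -2147483648 when signed, 2147483648 when unsigned

const mulResults = {
    mul:    Math.imul(opA, opB) >>> 0,                     // low 32 bits:  0x80000000
    mulh:   high32(asSigned(opA)   * asSigned(opB)),       // signed x signed:     0x00000000
    mulhu:  high32(asUnsigned(opA) * asUnsigned(opB)),     // unsigned x unsigned: 0x7fffffff
    mulhsu: high32(asSigned(opA)   * asUnsigned(opB)),     // signed x unsigned:   0xffffffff
};

for (const [name, value] of Object.entries(mulResults)) {
    console.log(name, "0x" + value.toString(16).padStart(8, "0"));
}
```

The same two register values give four different results, which is why the ISA needs a separate instruction for each interpretation of the operands.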

Integer Division Instructions

  1. div: Signed integer division, rounding toward zero.

    Instruction: div x7, x1, x2
    Meaning: x7 = x1 ÷ x2
  2. divu: Unsigned integer division.

    Instruction: divu x8, x1, x2
    Meaning: x8 = x1 ÷ x2
    • Both operands and the result are treated as unsigned.
  3. rem: Signed remainder operation.

    Instruction: rem x9, x1, x2
    Meaning: x9 = x1 mod x2
  4. remu: Unsigned remainder operation.

    Instruction: remu x10, x1, x2
    Meaning: x10 = x1 mod x2
    • Both operands and the result are treated as unsigned.
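
The sketch below evaluates the four operations on one pair of bit patterns to show how the signed and unsigned readings differ. It is a plain JavaScript illustration, not emulsiV code, and the variable names are chosen for this example.

```javascript
const dividend = 0xfffffff9; // -7 when signed, 4294967289 when unsigned
const divisor  = 2;

const sDividend = dividend | 0;     // signed view
const sDivisor  = divisor | 0;
const uDividend = dividend >>> 0;   // unsigned view
const uDivisor  = divisor >>> 0;

const divResult  = Math.trunc(sDividend / sDivisor);  // -3: rounds toward zero
const remResult  = sDividend % sDivisor;              // -1: takes the sign of the dividend
const divuResult = Math.floor(uDividend / uDivisor);  // 2147483644
const remuResult = uDividend % uDivisor;              // 1

console.log(divResult, remResult, divuResult, remuResult);
```

In both cases the results satisfy dividend = quotient × divisor + remainder.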

Instruction Format

These instructions follow the R-Type format, consistent with the RV32I base set. The structure is as follows:

Field:  funct7   rs2      rs1      funct3   rd       opcode
Bits:   31-25    24-20    19-15    14-12    11-7     6-0
Width:  7 bits   5 bits   5 bits   3 bits   5 bits   7 bits
  • opcode: All instructions in the M Extension use the 0110011 opcode, the same opcode as the RV32I register-to-register (R-Type) instructions.
  • funct7: All M Extension instructions use funct7 = 0000001, which separates them from the base instructions that share this opcode.
  • funct3: Selects the individual multiplication, division, or remainder operation.
    Example:
    • funct3 = 000 for mul.
    • funct3 = 100 for div and funct3 = 101 for divu.
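
As a minimal sketch of this layout, the function below packs the six fields into a 32-bit word. The names encodeRType, OPCODE_OP and FUNCT7_M are assumptions made for this example and do not come from emulsiV.

```javascript
// Hypothetical encoder for this example (not emulsiV's own code).
const OPCODE_OP = 0b0110011; // opcode shared by base and M-extension R-Type instructions
const FUNCT7_M  = 0b0000001; // funct7 value used by every M-extension instruction

function encodeRType(funct7, rs2, rs1, funct3, rd, opcode) {
    // ">>> 0" keeps the result an unsigned 32-bit number.
    return ((funct7 << 25) | (rs2 << 20) | (rs1 << 15) |
            (funct3 << 12) | (rd << 7) | opcode) >>> 0;
}

const mulWord = encodeRType(FUNCT7_M, 2, 1, 0b000, 3, OPCODE_OP); // mul x3, x1, x2
const divWord = encodeRType(FUNCT7_M, 2, 1, 0b100, 7, OPCODE_OP); // div x7, x1, x2

console.log(mulWord.toString(16).padStart(8, "0")); // 022081b3
console.log(divWord.toString(16).padStart(8, "0")); // 0220c3b3
```

The two words differ only in the funct3 and rd fields, since opcode, funct7 and the source registers are identical.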

emulsiV

Introduction

emulsiV is a visual simulator for Virgule, a minimal CPU core implementation based on the RISC-V architecture. This simulator is intended to be used as a tool for teaching the basics of computer architecture.

Extend emulsiV

  • To make emulsiV support the RV32IM instruction set, the new mnemonics must first be registered in ASM_TABLE, which maps each instruction name to its operand syntax.
const ASM_TABLE = {
    lui   : "du",
    auipc : "du",
    jal   : "dp",
    jalr  : "d1i",
    beq   : "12p",
    bne   : "12p",
    blt   : "12p",
    bge   : "12p",
    bltu  : "12p",
    bgeu  : "12p",
    lb    : "da",
    lh    : "da",
    lw    : "da",
    lbu   : "da",
    lhu   : "da",
    sb    : "2a",
    sh    : "2a",
    sw    : "2a",
    addi  : "d1i",
    slli  : "d1i",
    slti  : "d1i",
    sltiu : "d1i",
    xori  : "d1i",
    srli  : "d1i",
    srai  : "d1i",
    ori   : "d1i",
    andi  : "d1i",
    add   : "d12",
    sub   : "d12",
    sll   : "d12",
    slt   : "d12",
    sltu  : "d12",
    xor   : "d12",
    srl   : "d12",
    sra   : "d12",
    or    : "d12",
    and   : "d12",
    mul   : "d12",  // Add MUL instruction
    mulh  : "d12",  // Add MULH instruction
    mulhu : "d12",  // Add MULHU instruction
    mulhsu: "d12",  // Add MULHSU instruction
    div   : "d12",  // Add DIV instruction
    divu  : "d12",  // Add DIVU instruction
    rem   : "d12",  // Add REM instruction
    remu  : "d12",  // Add REMU instruction
};

Assembly operand syntax:

  • Each ASM_TABLE entry is a string of operand codes with the following meaning:
    • d = destination register (rd).
    • 1 = first source register (rs1).
    • 2 = second source register (rs2).
    • i = immediate.
    • u = upper immediate.
    • p = offset immediate.
    • a = indirect address.
  • The format d12 follows the pattern:
    operation rd, rs1, rs2

  • All eight M-extension instructions are R-type (register-to-register) instructions, so the d12 format is sufficient for all of them.
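
To illustrate how such an operand string can drive parsing, here is a small sketch that handles only the register codes d, 1 and 2. The function parseRegisterOperands and the SLOT_NAMES table are hypothetical names for this example; emulsiV's real assembler also handles immediates, offsets and addresses.

```javascript
// Hypothetical parser for this example; only "d", "1" and "2" are supported.
const SLOT_NAMES = { d: "rd", 1: "rs1", 2: "rs2" };

function parseRegisterOperands(line, pattern) {
    const [mnemonic, ...operands] = line.trim().split(/[\s,]+/);
    if (operands.length !== pattern.length) {
        throw new Error(`${mnemonic}: expected ${pattern.length} operands`);
    }
    const result = { mnemonic };
    [...pattern].forEach((code, i) => {
        const slot = SLOT_NAMES[code];
        if (!slot) throw new Error(`unsupported operand code: ${code}`);
        // Accept register names of the form x0..x31.
        const match = /^x(\d+)$/.exec(operands[i]);
        if (!match || Number(match[1]) > 31) {
            throw new Error(`invalid register: ${operands[i]}`);
        }
        result[slot] = Number(match[1]);
    });
    return result;
}

const parsedMul = parseRegisterOperands("mul x3, x1, x2", "d12");
console.log(parsedMul); // { mnemonic: 'mul', rd: 3, rs1: 1, rs2: 2 }
```

Because all eight new instructions use the same d12 string, a parser built this way needs no change beyond the new table entries.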

Implementation of RV32IM instruction set extensions

  1. MUL
  2. MULH
  3. MULHU
  4. MULHSU
  5. DIV
  6. DIVU
  7. REM
  8. REMU

Funct3 & Funct7 values

The M-extension of RISC-V provides instructions for integer multiplication and division. Below is an explanation of how these instructions are encoded, focusing on the values assigned to funct3 and funct7 in the provided code.

Funct3 Settings for M-Extension
The funct3 field specifies the exact operation type for M-extension instructions. It differentiates between multiplication, division, and remainder operations.

const F3_MUL    = 0;
const F3_MULH   = 1;
const F3_MULHSU = 2;
const F3_MULHU  = 3;
const F3_DIV    = 4; 
const F3_DIVU   = 5; 
const F3_REM    = 6;
const F3_REMU   = 7;

Funct7 Settings for M-Extension
The funct7 field separates instructions that share the same opcode and funct3. Every M-extension instruction uses funct7 = 1 (binary 0000001), which distinguishes, for example, mul from add: both have opcode 0110011 and funct3 = 000.

const F7_MUL    = 1;
const F7_MULH   = 1;
const F7_MULHU  = 1;
const F7_MULHSU = 1;
const F7_DIV    = 1; 
const F7_DIVU   = 1; 
const F7_REM    = 1;
const F7_REMU   = 1;

Map instruction names to fixed field values

INSTR_NAME_TO_FIELDS is a mapping that associates instruction names (like mul, div, rem, etc.) with their corresponding opcode, funct3, and funct7 fields. This mapping is essential for the proper encoding and decoding of instructions in the RISC-V architecture.

const INSTR_NAME_TO_FIELDS = {
    lui   : [OP_LUI],
    auipc : [OP_AUIPC],
    jal   : [OP_JAL],
    jalr  : [OP_JALR  , F3_JALR],
    beq   : [OP_BRANCH, F3_BEQ],
    bne   : [OP_BRANCH, F3_BNE],
    blt   : [OP_BRANCH, F3_BLT],
    bge   : [OP_BRANCH, F3_BGE],
    bltu  : [OP_BRANCH, F3_BLTU],
    bgeu  : [OP_BRANCH, F3_BGEU],
    lb    : [OP_LOAD  , F3_B],
    lh    : [OP_LOAD  , F3_H],
    lw    : [OP_LOAD  , F3_W],
    lbu   : [OP_LOAD  , F3_BU],
    lhu   : [OP_LOAD  , F3_HU],
    sb    : [OP_STORE , F3_B],
    sh    : [OP_STORE , F3_H],
    sw    : [OP_STORE , F3_W],
    addi  : [OP_IMM   , F3_ADD],
    slli  : [OP_IMM   , F3_SL, F7_L],
    slti  : [OP_IMM   , F3_SLT],
    sltiu : [OP_IMM   , F3_SLTU],
    xori  : [OP_IMM   , F3_XOR],
    srli  : [OP_IMM   , F3_SR       , F7_L],
    srai  : [OP_IMM   , F3_SR       , F7_A],
    ori   : [OP_IMM   , F3_OR],
    andi  : [OP_IMM   , F3_AND],
    add   : [OP_REG   , F3_ADD      , F7_L],
    sub   : [OP_REG   , F3_ADD      , F7_A],
    sll   : [OP_REG   , F3_SL       , F7_L],
    slt   : [OP_REG   , F3_SLT      , F7_L],
    sltu  : [OP_REG   , F3_SLTU     , F7_L],
    xor   : [OP_REG   , F3_XOR      , F7_L],
    srl   : [OP_REG   , F3_SR       , F7_L],
    sra   : [OP_REG   , F3_SR       , F7_A],
    or    : [OP_REG   , F3_OR       , F7_L],
    and   : [OP_REG   , F3_AND      , F7_L],
    mret  : [OP_SYSTEM, F3_MRET     , F7_MRET, RS2_MRET, RS1_MRET, RD_MRET],
    mul   : [OP_REG   , F3_MUL      , F7_MUL],
    mulh  : [OP_REG   , F3_MULH     , F7_MULH],
    mulhu : [OP_REG   , F3_MULHU    , F7_MULHU],
    mulhsu: [OP_REG   , F3_MULHSU   , F7_MULHSU],
    div   : [OP_REG   , F3_DIV      , F7_DIV],
    divu  : [OP_REG   , F3_DIVU     , F7_DIVU],
    rem   : [OP_REG   , F3_REM      , F7_REM],
    remu  : [OP_REG   , F3_REMU     , F7_REMU]
};

All M-extension instructions share the OP_REG opcode and funct7 = 1, so it is the combination of funct3 and funct7 that distinguishes each of them from every other instruction.
INSTR_NAME_TO_FIELDS provides these values, making it possible to encode instructions into binary or decode binary instructions back into their assembly form.
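
The decoding direction can be sketched as a reverse lookup over such a table. The excerpt below uses literal field values from the RISC-V encoding instead of emulsiV's named constants, and findInstructionName is a hypothetical function written for this example.

```javascript
// Excerpt with literal values; the real table uses OP_*, F3_* and F7_* constants.
const OP_REG_VALUE = 0x33;
const FIELDS_EXCERPT = {
    add : [OP_REG_VALUE, 0, 0x00],
    sub : [OP_REG_VALUE, 0, 0x20],
    mul : [OP_REG_VALUE, 0, 0x01],
    xor : [OP_REG_VALUE, 4, 0x00],
    div : [OP_REG_VALUE, 4, 0x01],
};

// Hypothetical reverse lookup: return the name whose fixed fields all match.
function findInstructionName(opcode, funct3, funct7) {
    const entry = Object.entries(FIELDS_EXCERPT).find(
        ([, [op, f3, f7]]) => op === opcode && f3 === funct3 && f7 === funct7
    );
    return entry ? entry[0] : "invalid";
}

console.log(findInstructionName(0x33, 0, 0x00)); // add
console.log(findInstructionName(0x33, 0, 0x01)); // mul
console.log(findInstructionName(0x33, 4, 0x01)); // div
```

add and mul differ only in funct7, and div and xor likewise, which shows why funct3 alone is not enough to identify an M-extension instruction.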

ACTION_TABLE

ACTION_TABLE defines the execution logic for each instruction. Specifically, for the M-extension instructions, it outlines how the CPU simulator should handle the operands and perform the specified ALU operation.

const ACTION_TABLE = {
    lui     : {            src2: "imm", aluOp: "b",    wbMem: "r"                 },
    auipc   : {src1: "pc", src2: "imm", aluOp: "add",  wbMem: "r"                 },
    jal     : {src1: "pc", src2: "imm", aluOp: "add",  wbMem: "pc+", branch: "al" },
    jalr    : {src1: "x1", src2: "imm", aluOp: "add",  wbMem: "pc+", branch: "al" },
    beq     : {src1: "pc", src2: "imm", aluOp: "add",                branch: "eq" },
    bne     : {src1: "pc", src2: "imm", aluOp: "add",                branch: "ne" },
    blt     : {src1: "pc", src2: "imm", aluOp: "add",                branch: "lt" },
    bge     : {src1: "pc", src2: "imm", aluOp: "add",                branch: "ge" },
    bltu    : {src1: "pc", src2: "imm", aluOp: "add",                branch: "ltu"},
    bgeu    : {src1: "pc", src2: "imm", aluOp: "add",                branch: "geu"},
    lb      : {src1: "x1", src2: "imm", aluOp: "add",  wbMem: "lb"                },
    lh      : {src1: "x1", src2: "imm", aluOp: "add",  wbMem: "lh"                },
    lw      : {src1: "x1", src2: "imm", aluOp: "add",  wbMem: "lw"                },
    lbu     : {src1: "x1", src2: "imm", aluOp: "add",  wbMem: "lbu"               },
    lhu     : {src1: "x1", src2: "imm", aluOp: "add",  wbMem: "lhu"               },
    sb      : {src1: "x1", src2: "imm", aluOp: "add",  wbMem: "sb"                },
    sh      : {src1: "x1", src2: "imm", aluOp: "add",  wbMem: "sh"                },
    sw      : {src1: "x1", src2: "imm", aluOp: "add",  wbMem: "sw"                },
    addi    : {src1: "x1", src2: "imm", aluOp: "add",  wbMem: "r"                 },
    slli    : {src1: "x1", src2: "imm", aluOp: "sll",  wbMem: "r"                 },
    slti    : {src1: "x1", src2: "imm", aluOp: "slt",  wbMem: "r"                 },
    sltiu   : {src1: "x1", src2: "imm", aluOp: "sltu", wbMem: "r"                 },
    xori    : {src1: "x1", src2: "imm", aluOp: "xor",  wbMem: "r"                 },
    srli    : {src1: "x1", src2: "imm", aluOp: "srl",  wbMem: "r"                 },
    srai    : {src1: "x1", src2: "imm", aluOp: "sra",  wbMem: "r"                 },
    ori     : {src1: "x1", src2: "imm", aluOp: "or",   wbMem: "r"                 },
    andi    : {src1: "x1", src2: "imm", aluOp: "and",  wbMem: "r"                 },
    add     : {src1: "x1", src2: "x2",  aluOp: "add",  wbMem: "r"                 },
    sub     : {src1: "x1", src2: "x2",  aluOp: "sub",  wbMem: "r"                 },
    sll     : {src1: "x1", src2: "x2",  aluOp: "sll",  wbMem: "r"                 },
    slt     : {src1: "x1", src2: "x2",  aluOp: "slt",  wbMem: "r"                 },
    sltu    : {src1: "x1", src2: "x2",  aluOp: "sltu", wbMem: "r"                 },
    xor     : {src1: "x1", src2: "x2",  aluOp: "xor",  wbMem: "r"                 },
    srl     : {src1: "x1", src2: "x2",  aluOp: "srl",  wbMem: "r"                 },
    sra     : {src1: "x1", src2: "x2",  aluOp: "sra",  wbMem: "r"                 },
    or      : {src1: "x1", src2: "x2",  aluOp: "or",   wbMem: "r"                 },
    mul     : {src1: "x1", src2: "x2",  aluOp: "mul",  wbMem: "r"                 }, // mul instruction
    mulh    : {src1: "x1", src2: "x2",  aluOp: "mulh", wbMem: "r"                 }, // mulh instruction
    mulhu   : {src1: "x1", src2: "x2",  aluOp: "mulhu",wbMem: "r"                 }, // mulhu instruction
    mulhsu  : {src1: "x1", src2: "x2",  aluOp: "mulhsu",wbMem:"r"                 }, // mulhsu instruction
    div     : {src1: "x1", src2: "x2",  aluOp: "div",  wbMem: "r"                 }, // div instruction
    divu    : {src1: "x1", src2: "x2",  aluOp: "divu", wbMem: "r"                 }, // divu instruction
    rem     : {src1: "x1", src2: "x2",  aluOp: "rem",  wbMem: "r"                 }, // rem instruction
    remu    : {src1: "x1", src2: "x2",  aluOp: "remu", wbMem: "r"                 }, // remu instruction
    and     : {src1: "x1", src2: "x2",  aluOp: "and",  wbMem: "r"                 },
    mret    : {                                                                   },
    invalid : {                                                                   },
};

All M-extension instructions share the same operand sources (src1, src2) and the same write-back behavior (wbMem), which stores the result in rd. Only the aluOp field differs.
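
The following miniature shows how a table of this shape can drive execution. It is a sketch under stated assumptions, not emulsiV's implementation: the names ACTIONS, ALU_OPS and executeStep are hypothetical, "x1"/"x2" are read as "the value of rs1/rs2", and wbMem: "r" is read as "write the ALU result to rd".

```javascript
// Hypothetical miniature of a table-driven execution step (not emulsiV's code).
const ACTIONS = {
    addi: { src1: "x1", src2: "imm", aluOp: "add", wbMem: "r" },
    mul : { src1: "x1", src2: "x2",  aluOp: "mul", wbMem: "r" },
};

const ALU_OPS = {
    add: (a, b) => (a + b) | 0,     // wrap to 32 bits
    mul: (a, b) => Math.imul(a, b), // low 32 bits of the product
};

function executeStep(regs, { name, rd, rs1, rs2, imm }) {
    const action = ACTIONS[name];
    // Select the ALU inputs named by the table row.
    const a = action.src1 === "x1" ? regs[rs1] : 0;
    const b = action.src2 === "x2" ? regs[rs2] : imm;
    const result = ALU_OPS[action.aluOp](a, b);
    // Assumed meaning of wbMem "r": write the result to rd (x0 stays zero).
    if (action.wbMem === "r" && rd !== 0) regs[rd] = result;
}

const regs = new Int32Array(32);
executeStep(regs, { name: "addi", rd: 1, rs1: 0, imm: 6 }); // x1 = 6
executeStep(regs, { name: "addi", rd: 2, rs1: 0, imm: 7 }); // x2 = 7
executeStep(regs, { name: "mul",  rd: 3, rs1: 1, rs2: 2 }); // x3 = x1 * x2
console.log(regs[3]); // 42
```

With this structure, supporting a new R-type instruction amounts to one table row plus one ALU operation.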

Compute

The compute function in emulsiV is responsible for executing the ALU operations defined by the aluOp field in ACTION_TABLE.

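  1. MUL

    case "mul":  this.aluResult = Math.imul(a, b);  break;

    The MUL instruction multiplies two 32-bit operands and produces the low 32 bits of the product. Math.imul performs the multiplication with 32-bit integer semantics; a plain signed(a) * signed(b) would lose low-order bits whenever the product exceeds the 53 bits that a JavaScript Number can represent exactly.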

  2. MULH

    case "mulh": this.aluResult = Number((BigInt(signed(a)) * BigInt(signed(b))) >> 32n);  break;

    • The MULH instruction computes the high 32 bits of the signed 64-bit product of two 32-bit integers.
    • JavaScript's Number type represents integers exactly only up to 53 bits, which is insufficient for the full 64-bit product of two 32-bit integers.
    • BigInt provides arbitrary-precision integers, so the 64-bit product is computed without precision loss.
    • >> 32n shifts the product right by 32 bits to extract the high 32 bits. On BigInt values this shift is arithmetic (sign-preserving), which is what a signed product requires.
  3. MULHU

    case "mulhu": this.aluResult = Number((BigInt(unsigned(a)) * BigInt(unsigned(b))) >> 32n);  break;

    • The MULHU instruction computes the high 32 bits of the product of two unsigned 32-bit integers.
    • The computation is the same as for MULH, except that both operands are treated as unsigned.
    • unsigned(a) and unsigned(b) convert the operands a and b into unsigned 32-bit integers.
  4. MULHSU

    case "mulhsu": this.aluResult = Number((BigInt(signed(a)) * BigInt(unsigned(b))) >> 32n);  break;

    • The MULHSU instruction computes the high 32 bits of the product of a signed 32-bit integer and an unsigned 32-bit integer. It combines signed and unsigned arithmetic in a single operation.
    • The computation is the same as for MULH, except that MULHSU operates on a signed operand (rs1) and an unsigned operand (rs2).
    • signed(a) converts operand a into a signed 32-bit integer, and unsigned(b) converts operand b into an unsigned 32-bit integer.
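  5. DIV

    case "div": this.aluResult = signed(b) === 0 ? 0xFFFFFFFF : Math.trunc(signed(a)/signed(b));  break;

    • The DIV instruction performs signed integer division of two 32-bit integers.
      • Operand a (rs1) is the dividend.
      • Operand b (rs2) is the divisor.
    • The result is the integer quotient of a ÷ b, rounded toward zero.
    • If divisor b is zero, the result is defined as 0xFFFFFFFF (all bits set, i.e. -1) by the RISC-V specification.
    • Math.trunc removes any fractional part of the result, which rounds the quotient toward zero.
    • The only overflow case, -2^31 ÷ -1, is specified to return -2^31. Math.trunc yields 2^31 here, which has the same 32-bit pattern (0x80000000) once the result is truncated to 32 bits.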
  6. DIVU

    case "divu": this.aluResult = unsigned(b) === 0 ? 0xFFFFFFFF : Math.floor(unsigned(a)/unsigned(b));  break;

    • The DIVU instruction performs unsigned integer division of two 32-bit integers.
      • Operand a (rs1) is the dividend.
      • Operand b (rs2) is the divisor.
    • The result is the integer quotient of a ÷ b.
    • Using unsigned(a) and unsigned(b) ensures the operands are treated as unsigned values.
    • If divisor b is zero, the result is defined as 0xFFFFFFFF by the RISC-V specification.
    • Math.floor rounds the result down to the nearest integer; for non-negative operands this is the same as rounding toward zero.
  7. REM

    case "rem":  this.aluResult = signed(b) === 0 ? signed(a) : (signed(a) % signed(b));  break;

    • The REM instruction computes the signed integer remainder when dividing two 32-bit integers:
      • Operand a (rs1) is the dividend.
      • Operand b (rs2) is the divisor.
    • The result is the remainder after performing signed division, which satisfies:
      a = (a ÷ b) × b + remainder
    • The remainder takes the sign of the dividend, which is also how the JavaScript % operator behaves.
    • If divisor b is zero, the result is defined as dividend a by the RISC-V specification.
  8. REMU

    case "remu":  this.aluResult = unsigned(b) === 0 ? unsigned(a) : (unsigned(a) % unsigned(b));  break;

    • The REMU instruction computes the unsigned integer remainder when dividing two 32-bit unsigned integers:
      • Operand a (rs1) is the dividend.
      • Operand b (rs2) is the divisor.
    • The result is the remainder after performing unsigned division, which satisfies:
      a = (a ÷ b) × b + remainder
    • Using unsigned(a) and unsigned(b) ensures the operands are treated as unsigned values.
    • If divisor b is zero, the result is defined as dividend a by the RISC-V specification.
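
The corner cases discussed above can be checked with a stand-alone sketch. The functions below are hypothetical re-implementations written for this example, returning unsigned 32-bit numbers; they do not reproduce emulsiV's compute function or its signed/unsigned helpers.

```javascript
// Hypothetical stand-alone versions of the signed division cases.
function divSigned(a, b) {
    const sa = a | 0, sb = b | 0;
    if (sb === 0) return 0xffffffff;      // division by zero: all bits set
    return Math.trunc(sa / sb) >>> 0;     // ">>> 0" wraps -2^31 / -1 to 0x80000000
}

function remSigned(a, b) {
    const sa = a | 0, sb = b | 0;
    if (sb === 0) return sa >>> 0;        // division by zero: the dividend
    return (sa % sb) >>> 0;               // -2^31 % -1 gives 0
}

const MIN_INT = 0x80000000; // bit pattern of -2^31
const divByZero   = divSigned(42, 0);
const remByZero   = remSigned(42, 0);
const divOverflow = divSigned(MIN_INT, 0xffffffff); // -2^31 / -1
const remOverflow = remSigned(MIN_INT, 0xffffffff);

// Low 32 bits of a large product: a double-precision product loses the low bit.
const big = 0x7fffffff;
const naiveLow = (big * big) % 2 ** 32;    // computed on a rounded product
const exactLow = Math.imul(big, big) >>> 0; // 1, the true low 32 bits

console.log(divByZero.toString(16), remByZero, divOverflow.toString(16), remOverflow);
console.log(naiveLow, exactLow);
```

The last two lines illustrate why the low word of a product is better computed with 32-bit integer multiplication than with an ordinary Number product.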

Reference

  1. RV32IM assembly instructions reference card
  2. The RISC-V Instruction Set Manual Volume I

Always refer to the official RISC-V resources!