黃若綾, 蔡雅彤, 林靖婷
This CPU is designed as an auxiliary processor in FPGA designs and ASICs. It supports a wide range of configurations for flexibility in performance, size, and feature set. Example configurations include:
PicoRV32 comes in three core variations:

- `picorv32`: Simple native memory interface.
- `picorv32_axi`: AXI4-Lite Master interface for compatibility with AXI-based systems.
- `picorv32_wb`: Wishbone Master interface.

Additional modules include an AXI4 adapter and PCPI cores for implementing custom instructions.

Module | Description |
---|---|
picorv32 | The PicoRV32 CPU |
picorv32_axi | CPU with AXI4-Lite interface |
picorv32_axi_adapter | Adapter from PicoRV32 Memory Interface to AXI4-Lite |
picorv32_wb | CPU with Wishbone Master interface |
picorv32_pcpi_mul | PCPI core implementing MUL[H[SU\|U]] instructions |
picorv32_pcpi_fast_mul | Single-cycle multiplier version of pcpi_mul |
picorv32_pcpi_div | PCPI core implementing DIV[U]/REM[U] instructions |

- `firmware/`: Simple test firmware for IRQ handling and PCPI cores.
- `tests/`: Instruction-level tests from riscv-tests.
- `dhrystone/`: Dhrystone benchmark.
- `picosoc/`: Example SoC using PicoRV32.
- `scripts/`: Synthesis and hardware configuration scripts.

The RISC-V GNU toolchain and libraries will be installed in `/opt/riscv32i`.

Run `make test_vcd` in the `picorv32` folder to run the testbench and produce `testbench.vcd`.

Run `gtkwave testbench.vcd` to inspect the waveform file.

The B (bit-manipulation) extension is intended to provide some combination of code size reduction, performance improvement, and energy reduction. Its instructions are grouped by the kind of operation they perform into four sub-extensions: Zba, Zbb, Zbc, and Zbs.
Details can be found in the specification document.
Extension | Operation |
---|---|
Zba | Address generation instructions |
Zbb | Basic bit-manipulation |
Zbc | Carry-less multiplication |
Zbs | Single-bit instructions |
Zba extension for RV32 includes the following instructions:
Mnemonic | Instruction | Type |
---|---|---|
sh1add rd, rs1, rs2 | Shift left by 1 and add | R-type |
sh2add rd, rs1, rs2 | Shift left by 2 and add | R-type |
sh3add rd, rs1, rs2 | Shift left by 3 and add | R-type |
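
To make the semantics of the table above concrete, here is a minimal C reference model of the three Zba instructions on RV32. The function names (`zba_sh1add` and so on) are hypothetical helpers introduced for illustration; they model the architectural result only, not how PicoRV32 implements the instructions.

```c
#include <stdint.h>

/* Hypothetical reference models for Zba on RV32.
 * Each instruction shifts rs1 left by a fixed amount and adds rs2;
 * the result wraps modulo 2^32, as unsigned arithmetic does in C. */
uint32_t zba_sh1add(uint32_t rs1, uint32_t rs2) { return rs2 + (rs1 << 1); }
uint32_t zba_sh2add(uint32_t rs1, uint32_t rs2) { return rs2 + (rs1 << 2); }
uint32_t zba_sh3add(uint32_t rs1, uint32_t rs2) { return rs2 + (rs1 << 3); }
```

A typical use is address generation: with `rs1` holding an index and `rs2` a base address, `sh2add` yields the address of a 4-byte array element in one instruction.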
Zbb extension for RV32 includes the following instructions:
Mnemonic | Instruction | Type |
---|---|---|
andn rd, rs1, rs2 | AND with inverted operand | R-type |
orn rd, rs1, rs2 | OR with inverted operand | R-type |
xnor rd, rs1, rs2 | Exclusive NOR | R-type |
max rd, rs1, rs2 | Maximum | R-type |
maxu rd, rs1, rs2 | Unsigned maximum | R-type |
min rd, rs1, rs2 | Minimum | R-type |
minu rd, rs1, rs2 | Unsigned minimum | R-type |
rol rd, rs1, rs2 | Rotate left (Register) | R-type |
ror rd, rs1, rs2 | Rotate right (Register) | R-type |
clz rd, rs | Count leading zero bits | I-type |
ctz rd, rs | Count trailing zero bits | I-type |
cpop rd, rs | Count set bits | I-type |
sext.b rd, rs | Sign-extend byte | I-type |
sext.h rd, rs | Sign-extend halfword | I-type |
zext.h rd, rs | Zero-extend halfword | I-type |
rori rd, rs1, shamt | Rotate right (Immediate) | I-type |
orc.b rd, rs | Bitwise OR-Combine, byte granule | I-type |
rev8 rd, rs | Byte-reverse register | I-type |
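
The Zbb operations above can be checked against a small C reference model. The sketch below is an illustration under stated assumptions: the `zbb_*` names are hypothetical, signed comparisons assume the usual two's-complement conversion from `uint32_t` to `int32_t`, and the loops favour clarity over speed.

```c
#include <stdint.h>

/* Hypothetical reference models for selected Zbb instructions on RV32. */
uint32_t zbb_andn(uint32_t a, uint32_t b) { return a & ~b; }
uint32_t zbb_orn(uint32_t a, uint32_t b)  { return a | ~b; }
uint32_t zbb_xnor(uint32_t a, uint32_t b) { return ~(a ^ b); }

/* Signed and unsigned min/max. */
uint32_t zbb_max(uint32_t a, uint32_t b)  { return (int32_t)a > (int32_t)b ? a : b; }
uint32_t zbb_min(uint32_t a, uint32_t b)  { return (int32_t)a < (int32_t)b ? a : b; }
uint32_t zbb_maxu(uint32_t a, uint32_t b) { return a > b ? a : b; }
uint32_t zbb_minu(uint32_t a, uint32_t b) { return a < b ? a : b; }

/* Rotates use only the low 5 bits of the shift amount. */
uint32_t zbb_rol(uint32_t a, uint32_t b) {
    uint32_t s = b & 31u;
    return (a << s) | (a >> ((32u - s) & 31u));
}
uint32_t zbb_ror(uint32_t a, uint32_t b) {
    uint32_t s = b & 31u;
    return (a >> s) | (a << ((32u - s) & 31u));
}

/* Bit counting; clz and ctz return 32 for a zero input. */
uint32_t zbb_clz(uint32_t x) {
    uint32_t n = 0;
    while (n < 32u && !(x & (0x80000000u >> n))) n++;
    return n;
}
uint32_t zbb_ctz(uint32_t x) {
    uint32_t n = 0;
    while (n < 32u && !(x & (1u << n))) n++;
    return n;
}
uint32_t zbb_cpop(uint32_t x) {
    uint32_t n = 0;
    while (x) { x &= x - 1u; n++; }  /* clear the lowest set bit */
    return n;
}

/* Sign and zero extension, done in unsigned arithmetic. */
uint32_t zbb_sext_b(uint32_t x) { return ((x & 0xFFu) ^ 0x80u) - 0x80u; }
uint32_t zbb_sext_h(uint32_t x) { return ((x & 0xFFFFu) ^ 0x8000u) - 0x8000u; }
uint32_t zbb_zext_h(uint32_t x) { return x & 0xFFFFu; }

/* orc.b: every non-zero byte becomes 0xFF, every zero byte stays 0x00. */
uint32_t zbb_orc_b(uint32_t x) {
    uint32_t r = 0;
    for (int i = 0; i < 32; i += 8)
        if ((x >> i) & 0xFFu) r |= 0xFFu << i;
    return r;
}

/* rev8: reverse the order of the four bytes. */
uint32_t zbb_rev8(uint32_t x) {
    return ((x & 0xFFu) << 24) | ((x & 0xFF00u) << 8) |
           ((x >> 8) & 0xFF00u) | (x >> 24);
}
```

`rori` behaves like `zbb_ror` with the shift amount taken from the immediate field instead of a register.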

Zbc extension for RV32 includes the following instructions:
Mnemonic | Instruction | Type |
---|---|---|
clmul rd, rs1, rs2 | Carry-less multiply (low-part) | R-type |
clmulh rd, rs1, rs2 | Carry-less multiply (high-part) | R-type |
clmulr rd, rs1, rs2 | Carry-less multiply (reversed) | R-type |
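
Carry-less multiplication is easiest to understand as a shift-and-XOR loop. The following C sketch models the three Zbc instructions on RV32; the `zbc_*` names are hypothetical, and the bit-serial loops are meant as a reference for checking results, not as a description of any hardware implementation.

```c
#include <stdint.h>

/* Hypothetical reference models for Zbc on RV32.
 * The full carry-less product of two 32-bit values is 63 bits wide. */

/* clmul: bits 31..0 of the product. */
uint32_t zbc_clmul(uint32_t rs1, uint32_t rs2) {
    uint32_t r = 0;
    for (int i = 0; i < 32; i++)
        if ((rs2 >> i) & 1u) r ^= rs1 << i;
    return r;
}

/* clmulh: bits 63..32 of the product (bit 63 is always zero). */
uint32_t zbc_clmulh(uint32_t rs1, uint32_t rs2) {
    uint32_t r = 0;
    for (int i = 1; i < 32; i++)
        if ((rs2 >> i) & 1u) r ^= rs1 >> (32 - i);
    return r;
}

/* clmulr: bits 62..31 of the product. */
uint32_t zbc_clmulr(uint32_t rs1, uint32_t rs2) {
    uint32_t r = 0;
    for (int i = 0; i < 32; i++)
        if ((rs2 >> i) & 1u) r ^= rs1 >> (31 - i);
    return r;
}
```

For example, `zbc_clmul(3, 3)` is `3 ^ (3 << 1)`, which is 5 rather than the integer product 9, because partial products are combined with XOR instead of addition.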

Zbs extension for RV32 includes the following instructions:
Mnemonic | Instruction | Type |
---|---|---|
bclr rd, rs1, rs2 | Single-Bit Clear (Register) | R-type |
bext rd, rs1, rs2 | Single-Bit Extract (Register) | R-type |
binv rd, rs1, rs2 | Single-Bit Invert (Register) | R-type |
bset rd, rs1, rs2 | Single-Bit Set (Register) | R-type |
bclri rd, rs1, imm | Single-Bit Clear (Immediate) | I-type |
bexti rd, rs1, imm | Single-Bit Extract (Immediate) | I-type |
binvi rd, rs1, imm | Single-Bit Invert (Immediate) | I-type |
bseti rd, rs1, imm | Single-Bit Set (Immediate) | I-type |
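
The single-bit instructions reduce to one mask operation each. The C sketch below models the register forms on RV32; the `zbs_*` names are hypothetical, and the immediate forms (`bclri`, `bexti`, `binvi`, `bseti`) compute the same results with the bit index taken from the immediate.

```c
#include <stdint.h>

/* Hypothetical reference models for Zbs on RV32.
 * Only the low 5 bits of rs2 select the bit position. */
uint32_t zbs_bclr(uint32_t rs1, uint32_t rs2) { return rs1 & ~(1u << (rs2 & 31u)); }
uint32_t zbs_bext(uint32_t rs1, uint32_t rs2) { return (rs1 >> (rs2 & 31u)) & 1u; }
uint32_t zbs_binv(uint32_t rs1, uint32_t rs2) { return rs1 ^ (1u << (rs2 & 31u)); }
uint32_t zbs_bset(uint32_t rs1, uint32_t rs2) { return rs1 | (1u << (rs2 & 31u)); }
```

Because the index is masked to 5 bits, `zbs_bset(0, 35)` sets bit 3, the same as `zbs_bset(0, 3)`.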
Download the official test suite from riscv-tests. The `isa` folder contains the test cases for the Zba, Zbb, Zbc, and Zbs extensions (rv32uzba, rv32uzbb, rv32uzbc, rv32uzbs). Copy the assembly files into `picorv32/tests/` (some files need to be adapted from their 64-bit versions).
Run `make test_vcd` to check whether the added instructions operate correctly.
We validated the CLZ instruction from the B extension and compared it with a software CLZ implementation built from base RV32I instructions.
CLZ C implementation
Below is the optimized CLZ function in C code:
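
For illustration, a software CLZ that compiles to base RV32I instructions can be sketched as a binary search over the upper bits. This is an assumption-based sketch: the name `clz32_soft` and the exact approach are hypothetical and not necessarily the function measured in this validation.

```c
#include <stdint.h>

/* Hypothetical software count-leading-zeros for 32-bit values.
 * Each step tests whether the upper half of the remaining window is
 * empty; if so, it adds the window size to the count and shifts the
 * value up. Returns 32 for x == 0, matching the Zbb clz instruction. */
uint32_t clz32_soft(uint32_t x) {
    if (x == 0) return 32;
    uint32_t n = 0;
    if ((x & 0xFFFF0000u) == 0) { n += 16; x <<= 16; }
    if ((x & 0xFF000000u) == 0) { n += 8;  x <<= 8;  }
    if ((x & 0xF0000000u) == 0) { n += 4;  x <<= 4;  }
    if ((x & 0xC0000000u) == 0) { n += 2;  x <<= 2;  }
    if ((x & 0x80000000u) == 0) { n += 1; }
    return n;
}
```

A routine like this needs several compare, branch, and shift instructions per call, which is the overhead a single hardware `clz` instruction is meant to remove.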
CLZ assembly
The corresponding assembly for the CLZ C code is:
Validation result
We implemented clztest.c to run both versions and measure their cycle counts.
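
One way such a cycle comparison could be structured is sketched below. It is not the contents of clztest.c: the helper names (`read_cycle_counter`, `measure_clz`, `clz_loop`) are hypothetical, and the sketch assumes the core is built with its cycle counter enabled so that `rdcycle` is available. On non-RISC-V hosts it falls back to `clock()` so that the code still compiles and runs.

```c
#include <stdint.h>
#include <time.h>

/* Signature shared by every CLZ variant under test (hypothetical). */
typedef uint32_t (*clz_fn)(uint32_t);

/* Read the low 32 bits of the cycle counter.
 * Assumption: rdcycle is implemented on the target core. */
uint32_t read_cycle_counter(void) {
#if defined(__riscv)
    uint32_t cycles;
    __asm__ volatile ("rdcycle %0" : "=r"(cycles));
    return cycles;
#else
    return (uint32_t)clock();  /* host fallback, not a true cycle count */
#endif
}

/* Simple loop-based CLZ used here as one candidate implementation. */
uint32_t clz_loop(uint32_t x) {
    uint32_t n = 0;
    while (n < 32u && !(x & (0x80000000u >> n))) n++;
    return n;
}

/* Run fn on input, store its result, and return the elapsed count.
 * Unsigned subtraction stays correct across one counter wrap-around. */
uint32_t measure_clz(clz_fn fn, uint32_t input, uint32_t *result) {
    uint32_t start = read_cycle_counter();
    *result = fn(input);
    uint32_t end = read_cycle_counter();
    return end - start;
}
```

Calling `measure_clz` once per implementation with the same inputs gives directly comparable counts, provided the measurement overhead of the two counter reads is the same in both cases.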
The validation result below shows the cycle counts for five CLZ examples.