# RISC-V core
###### tags: `110-1_RISC-V`、`CA_Final Project`

## Overview
![](https://i.imgur.com/GOY5fom.png)

## Feature
* 32-bit RISC-V ISA CPU core.
* Supports the RISC-V integer (I), multiplication and division (M), and CSR instruction (Zicsr) extensions (RV32IMZicsr).
* Supports user, supervisor and machine mode privilege levels.
* Supports instruction / data caches, AXI bus interfaces or tightly coupled memories.
* CoreMark: 2.94 CoreMark/MHz
* Dhrystone: 1.25 DMIPS/MHz

:::info
This document focuses on the implementation of the AXI bus and the caches.
:::

## Environment Setup
1. Git clone the project
    ```
    $ git clone https://github.com/ultraembedded/riscv.git
    ```
2. Install the required packages
    ```
    $ sudo apt-get install libelf-dev binutils-dev
    ```
3. Install GTKWave
    ```
    $ sudo apt install gtkwave
    ```
4. Install SystemC and Verilator
    * [SystemC 2.3.1](https://www.accellera.org/images/downloads/standards/systemc/systemc-regressions-2.3.1a.tar.gz)
    ```
    tar -xvf systemc-2.3.1a.tar.gz
    ```
    * After extracting the archive, follow the `INSTALL` document in the systemc directory to install SystemC properly
    * [Verilator](https://verilator.org/guide/latest/install.html)
    * Switch the Verilator version to `V4.012` and install it
    ```
    # Prerequisites:
    #sudo apt-get install git perl python3 make autoconf g++ flex bison ccache
    #sudo apt-get install libgoogle-perftools-dev numactl perl-doc
    #sudo apt-get install libfl2  # Ubuntu only (ignore if gives error)
    #sudo apt-get install libfl-dev  # Ubuntu only (ignore if gives error)
    #sudo apt-get install zlibc zlib1g zlib1g-dev  # Ubuntu only (ignore if gives error)

    git clone https://github.com/verilator/verilator   # Only first time

    # Every time you need to build:
    unsetenv VERILATOR_ROOT  # For csh; ignore error if on bash
    unset VERILATOR_ROOT  # For bash
    cd verilator
    git pull         # Make sure git repository is up-to-date
    git tag          # See what versions exist
    #git checkout master      # Use development branch (e.g. recent bug fixes)
    #git checkout stable      # Use most recent stable release
    #git checkout v{version}  # Switch to specified release version

    autoconf         # Create ./configure script
    ./configure      # Configure and create Makefile
    make -j `nproc`  # Build Verilator itself (if error, try just 'make')
    sudo make install
    ```

## Modify Makefile
* Modify the `makefile` in `/riscv/top_tcm_axi/tb/` and in `/riscv/top_cache_axi/tb/`
* Change `VERILATOR_SRC` and `SYSTEMC_HOME` to the paths where you installed Verilator and SystemC

```c=
###############################################################################
## Tool paths
###############################################################################
VERILATOR_SRC ?= /usr/local/share/verilator/include
SYSTEMC_HOME  ?= /usr/local/systemc-2.3.1
TEST_IMAGE    ?= ../../isa_sim/images/basic.elf
CORE          ?= riscv

export VERILATOR_SRC
export SYSTEMC_HOME

###############################################################################
## Makefile
###############################################################################
all: build

build:
	make -C ../../isa_sim lib
	make -f makefile.generate_verilated CORE=$(CORE)
	make -f makefile.build_verilated
	make -f makefile.build_sysc_tb

clean:
	make -f makefile.generate_verilated CORE=$(CORE) $@
	make -f makefile.build_verilated $@
	make -f makefile.build_sysc_tb $@
	-rm *.vcd

run: build
	./build/test.x -f $(TEST_IMAGE)
```

## Run the program
* Run top_tcm_axi/tb
```
$ cd riscv/top_tcm_axi/tb/
$ make
$ make run
```
* Result
```
./build/test.x -f ../../isa_sim/images/basic.elf

SystemC 2.3.1-Accellera --- Jan 4 2022 15:18:31
Copyright (c) 1996-2014 by all Contributors,
ALL RIGHTS RESERVED

Info: (I702) default timescale unit used for tracing: 1 ns (sysc_wave.vcd)
Memory: 0x2000 - 0x3cd3 (Size=7KB) [.text]
Memory: 0x3cd4 - 0x3ce7 (Size=0KB) [.data]
Memory: 0x3ce8 - 0x4d07 (Size=4KB) [.bss]
Starting from 0x00002000

Test:
1. Initialised data
2. Multiply
3. Divide
4. Shift left
5. Shift right
6. Shift right arithmetic
7. Signed comparision
8. Word access
9. Byte access
10. Comparision
TB: Aborted at 109020 ns
```
* Run top_cache_axi/tb
```
$ cd riscv/top_cache_axi/tb/
$ make
$ make run
```
* Result
```
./build/test.x -f ../../isa_sim/images/basic.elf

SystemC 2.3.1-Accellera --- Jan 4 2022 15:18:31
Copyright (c) 1996-2014 by all Contributors,
ALL RIGHTS RESERVED

Info: (I702) default timescale unit used for tracing: 1 ns (sysc_wave.vcd)
Memory: 0x2000 - 0x3cd3 (Size=7KB) [.text]
Memory: 0x3cd4 - 0x3ce7 (Size=0KB) [.data]
Memory: 0x3ce8 - 0x4d07 (Size=4KB) [.bss]
Starting from 0x00002000

Test:
1. Initialised data
2. Multiply
3. Divide
4. Shift left
5. Shift right
6. Shift right arithmetic
7. Signed comparision
8. Word access
9. Byte access
10. Comparision
TB: Aborted at 157910 ns
```

:::info
Running `make run` generates **sysc_wave.vcd**, which can be opened with **gtkwave** to observe the waveform of the testbench
:::

## AXI4 Bus protocol
* [AXI4 spec](http://www.gstitt.ece.ufl.edu/courses/fall15/eel4720_5721/labs/refs/AXI4_specification.pdf)
* The AXI4 protocol uses a mechanism called `Handshake` on every channel to transfer data and control information reliably
* The `source` generates the `VALID` signal to indicate **when the address, data or control information is available**. The `destination` generates the `READY` signal to indicate that **it can accept the information**. A transfer occurs only when **both the VALID and READY signals** are **HIGH**

:::info
The AXI4 bus implementation uses five transaction channels:
1. AR Channel: Sends the **read address** and **read control signals** (e.g. arvalid, arlen, arsize, arid)
2. R Channel: Sends back the **requested data** and the **read response** (e.g. rresp, rlast)
3. AW Channel: Sends the **write address** and **write control signals** (e.g. awvalid, awlen, awsize, awid)
4. W Channel: Sends the **data to be written** into memory together with its byte strobes and the **wlast** signal (e.g. wdata, wstrb, wlast)
5. B Channel: Sends back the **write response** (e.g.
bvalid, bready, bid, bresp)
:::

### Example of handshake process
* In Figure A3-2, the source presents the address, data or control information after T1 and asserts the **VALID** signal. The destination asserts the **READY** signal after T2, and the source must **keep its information stable until the transfer occurs at T3**, when this assertion is recognized
![](https://i.imgur.com/kKsbH3R.png)
* In Figure A3-3, the destination asserts **READY** after T1, before the address, data or control information is valid, indicating that it can accept the information. The source presents the information and asserts **VALID** after T2, and the transfer occurs at T3, when this assertion is recognized. In this case, the **transfer occurs in a single cycle.**
![](https://i.imgur.com/X0fmd4e.png)
* In Figure A3-4, **both the source and destination** happen to indicate, after T1, that they can transfer the address, data or control information. In this case the transfer occurs at the rising clock edge when the assertion of **both VALID and READY** can be recognized. This means the transfer occurs at T2.
![](https://i.imgur.com/jNPqIRI.png)

:::info
* A **source** is **not permitted** to wait until **READY** is asserted before asserting **VALID**
* A **destination is permitted** to wait for **VALID** to be asserted before asserting the corresponding **READY**
* Once **VALID** is asserted it must **remain asserted** until the handshake occurs
:::

### Dependencies between channel handshake signals
:::warning
* **Single-headed arrows** point to signals that can be asserted **before or after the signal** at the start of the arrow
* **Double-headed arrows** point to signals that must be asserted **only after** assertion of the signal at the start of the arrow.
:::

#### **Read transaction dependencies**
* The `master` **must not wait** for the `slave` to assert **ARREADY** before asserting **ARVALID**
* The `slave` **can wait** for **ARVALID** to be asserted before it asserts **ARREADY**
* The `slave` can assert **ARREADY** before **ARVALID** is asserted
* The `slave` **must wait** for **both ARVALID and ARREADY** to be asserted before it asserts **RVALID** to indicate that valid data is available
* The `slave` **must not wait** for the `master` to assert **RREADY** before asserting **RVALID**
* The `master` **can wait** for **RVALID** to be asserted before it asserts **RREADY**
* The `master` **can assert RREADY** before **RVALID** is asserted.

![](https://i.imgur.com/xCmGWzw.png)

#### **Write transaction dependencies**
* The `master` **must not wait** for the `slave` to assert **AWREADY** or **WREADY** before asserting **AWVALID** or **WVALID**
* The `slave` **can wait** for **AWVALID** or **WVALID**, or **both**, before asserting **AWREADY**
* The `slave` **can assert AWREADY** before **AWVALID** or **WVALID**, or **both**, are asserted
* The `slave` **can wait** for **AWVALID** or **WVALID**, or **both**, before asserting **WREADY**
* The `slave` **can assert WREADY** before **AWVALID** or **WVALID**, or **both**, are asserted
* The `slave` **must wait** for **both WVALID and WREADY** to be asserted before asserting **BVALID**
* The `slave` **must wait** for **WLAST** to be asserted before asserting **BVALID**, because the write response, **BRESP**, must be signaled **only after the last data transfer** of a write transaction
* The `slave` **must not wait** for the `master` to assert **BREADY** before asserting **BVALID**
* The `master` **can wait** for **BVALID** before asserting **BREADY**
* The `master` **can assert BREADY** before **BVALID** is asserted.
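
The handshake and write dependency rules listed above can be checked on recorded waveforms with a small behavioral model. The following is only a minimal sketch in Python, not part of the core or of the AXI4 spec: the function names are hypothetical, each signal is assumed to be a list with one 0/1 sample per clock cycle, and accepting **BVALID** in the same cycle as the **WLAST** beat is an assumption of this sketch.

```python
def handshake_cycles(valid, ready):
    """Return the cycle indices where a transfer occurs (VALID and READY both high)."""
    return [t for t, (v, r) in enumerate(zip(valid, ready)) if v and r]


def valid_held_until_handshake(valid, ready):
    """Check that VALID, once asserted, stays high until the handshake happens."""
    pending = False  # VALID was high in the previous cycle but READY was low
    for v, r in zip(valid, ready):
        if pending and not v:
            return False  # the source dropped VALID before the transfer occurred
        pending = bool(v and not r)
    return True


def bvalid_is_legal(wvalid, wready, wlast, bvalid):
    """Check that BVALID is never high before the WLAST data beat has been transferred.

    Assumption of this sketch: BVALID in the same cycle as the WLAST beat is accepted.
    """
    last_beat_done = False
    for wv, wr, wl, bv in zip(wvalid, wready, wlast, bvalid):
        if wv and wr and wl:
            last_beat_done = True  # last data beat handshaken in this cycle
        if bv and not last_beat_done:
            return False  # write response signalled too early
    return True


# Hypothetical two-beat write burst: beats transfer in cycles 1 and 3,
# and the response arrives in cycle 4.
wvalid = [1, 1, 1, 1, 0]
wready = [0, 1, 0, 1, 0]
wlast = [0, 0, 1, 1, 0]
bvalid = [0, 0, 0, 0, 1]

beats = handshake_cycles(wvalid, wready)
burst_ok = valid_held_until_handshake(wvalid, wready) and bvalid_is_legal(
    wvalid, wready, wlast, bvalid
)
```

Sampled values exported from **sysc_wave.vcd** could be fed to the same functions, provided they are first converted to one sample per rising clock edge.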
![](https://i.imgur.com/wDJBtFU.png)

#### **Write response dependency**
This list repeats the write transaction rules with one difference: before asserting **BVALID**, the slave must also wait for the address handshake (**AWVALID** and **AWREADY**), not only for the data handshake.
* The `master` **must not wait** for the `slave` to assert **AWREADY** or **WREADY** before asserting **AWVALID** or **WVALID**
* The `slave` **can wait** for **AWVALID** or **WVALID**, or **both**, before asserting **AWREADY**
* The `slave` **can assert AWREADY** before **AWVALID** or **WVALID**, or **both**, are asserted
* The `slave` **can wait** for **AWVALID** or **WVALID**, or **both**, before asserting **WREADY**
* The `slave` **can assert WREADY** before **AWVALID** or **WVALID**, or **both**, are asserted
* The `slave` **must wait** for **AWVALID, AWREADY, WVALID**, and **WREADY** to be asserted before asserting **BVALID**
* The `slave` **must wait** for **WLAST** to be asserted before asserting **BVALID**, because the write response, **BRESP**, must be signaled **only after the last data transfer** of a write transaction
* The `slave` **must not wait** for the `master` to assert **BREADY** before asserting **BVALID**
* The `master` **can wait** for **BVALID** before asserting **BREADY**
* The `master` **can assert BREADY** before **BVALID** is asserted

![](https://i.imgur.com/dAbOoao.png)

### Verilog Code of AXI write/read transaction
* Write transaction
```c=
wire req_is_read_w  = ((req_valid_w & !request_in_progress_w) ? req_w[68] : 1'b0);
wire req_is_write_w = ((req_valid_w & !request_in_progress_w) ?
                                                                 ~req_w[68] : 1'b0);

reg awvalid_inhibit_q;
reg wvalid_inhibit_q;

// AW was accepted but W was not: stop re-issuing AWVALID until W is accepted
always @ (posedge clk_i or posedge rst_i)
if (rst_i)
    awvalid_inhibit_q <= 1'b0;
else if (axi_awvalid_o && axi_awready_i && axi_wvalid_o && !axi_wready_i)
    awvalid_inhibit_q <= 1'b1;
else if (axi_wvalid_o && axi_wready_i)
    awvalid_inhibit_q <= 1'b0;

// W was accepted but AW was not: stop re-issuing WVALID until AW is accepted
always @ (posedge clk_i or posedge rst_i)
if (rst_i)
    wvalid_inhibit_q <= 1'b0;
else if (axi_wvalid_o && axi_wready_i && axi_awvalid_o && !axi_awready_i)
    wvalid_inhibit_q <= 1'b1;
else if (axi_awvalid_o && axi_awready_i)
    wvalid_inhibit_q <= 1'b0;

assign axi_awvalid_o = req_is_write_w && !awvalid_inhibit_q;
assign axi_awaddr_o  = {req_w[31:2], 2'b0};   // word-aligned address
assign axi_wvalid_o  = req_is_write_w && !wvalid_inhibit_q;
assign axi_wdata_o   = req_w[63:32];
assign axi_wstrb_o   = req_w[67:64];

assign axi_bready_o  = 1'b1;                  // always ready for the write response

// The write is complete once both the AW and the W channel have been accepted
assign write_complete_w = (awvalid_inhibit_q || axi_awready_i) &&
                          (wvalid_inhibit_q || axi_wready_i) && req_is_write_w;
```
* Read transaction
```c=
assign axi_arvalid_o = req_is_read_w;
assign axi_araddr_o  = {req_w[31:2], 2'b0};   // word-aligned address

assign axi_rready_o  = 1'b1;                  // always ready for read data

assign mem_data_rd_o = axi_rdata_i;

// The read request is complete once the AR channel handshake occurs
assign read_complete_w = axi_arvalid_o && axi_arready_i;
```

## Interfaces
| Name | Description |
| -------- | -------- |
| clk_i | Clock input |
| rst_i | Async reset, active-high. Reset memory / AXI interface |
| rst_cpu_i | Async reset, active-high. Reset CPU core (excluding AXI / memory) |
| axi_d_* | AXI4 slave interface for access to data memory |
| axi_i_* | AXI4 slave interface for access to instruction memory |
| intr_i | Active-high interrupt input (for connection to an external interrupt controller) |
| mem_i_* | AXI4-Lite master interface for CPU access to instruction cache |
| mem_d_* | AXI4-Lite master interface for CPU access to data cache |

## Waveform of AXI4 BUS
* `basic.elf` starts at address `0x2000`, and the instruction at that address is `0x0258006f`

![](https://i.imgur.com/NhgqZsh.png)

### Read Instruction Memory
![](https://i.imgur.com/9qi7lJs.png)
* As the picture above shows, when the master sends `araddr` to the slave, it also asserts `arvalid` to tell the slave that it wants to read data from that address.
* At the same time, the slave asserts `arready` to indicate that it is ready to receive the request

![](https://i.imgur.com/UYUyFS5.png)
* As a consequence, the **handshake** of the control signals completes in the red box above
* Hence, in the next cycle, the slave returns the corresponding instruction `0258006f` together with `rvalid`
* In the picture below, the master always drives `rready` high (=1'b1), so once the slave asserts `rvalid`, the **handshake** of the data transfer completes **(see red box below)**

![](https://i.imgur.com/fhVfw25.png)
* In the picture above, `rlast` has not been asserted yet (the data transfer is not over), so the AXI bus keeps transferring data through the **handshake** process **(see the blue box below)**
* In the end, when all data has been transferred, the slave asserts `rlast` to tell the master that the read request is complete **(see orange box below)**

![](https://i.imgur.com/0C7auNz.png)

### Read Data Memory
* Take araddr=0x3CE0 as an example

![](https://i.imgur.com/aKb1TXi.png)
* In the picture below, when the master wants to access data or peripheral memories, it sends `araddr` and `arvalid` to the slave, telling the slave
that it wants to read its data
* At the same time, the slave asserts `arready` to indicate that it is ready to handle the request from the master
* Consequently, the **handshake** completes once the master asserts `arvalid` and the slave returns `arready` **(see red box below)**

![](https://i.imgur.com/ZqGG9wV.png)
* In the next cycle, the slave returns the requested data together with `rvalid`. Because the master always drives `rready` high (=1'b1), the **handshake** completes as soon as the slave returns the data (=0x1234) to the master **(see red box below)**

![](https://i.imgur.com/QqYXc3z.png)
* In the waveform above, `rlast` has not been asserted yet, which means the data transfer is not done
* Therefore, the slave keeps transferring data (=0x00000000) to the master through the **handshake** process **(see blue box below)**
* Finally, when all requested data has been transferred, the slave asserts `rlast` to tell the master that this is the last data beat **(see red box below)**
* At this moment, the read data request is complete

![](https://i.imgur.com/j4aVbhD.png)

### Write Data Memory
* For writing data memory, we take `awaddr=0x00004020` as an example

![](https://i.imgur.com/TM2J0ny.png)
* In the picture below, when the master wants to write to data or peripheral memory, it sends `awvalid` together with `awaddr` to the slave
* At this moment the slave is not ready to answer the request, so it does not assert `awready`
* Three clock cycles later, the slave can accept the request from the master, so it asserts `awready`
* Consequently, the **handshake** completes at this moment and the slave receives the **address** and **control signals** from the master **(see red box below)**

![](https://i.imgur.com/LwhaSEU.png)
* After the **handshake** of the AW channel, the slave needs to write the data to the corresponding address
* However, in the cycle right after the **AW channel handshake**, the slave is not yet able to write the data into memory
* So the master
will keep `wvalid` high and hold `wdata=0x00000000` stable until the **W channel handshake** completes
* As the picture below shows, after several cycles the slave is able to write the data into memory
* Therefore, it asserts `wready` to indicate that it is ready to write the data into memory
* Once the slave asserts `wready`, the **handshake** of the W channel completes (because `wvalid` is held high) **(see red box below)**

![](https://i.imgur.com/ev9AZ6E.png)
* In the waveform below, `wlast`、`bvalid` and `bready` are not asserted yet, which means the master has not written all of its data into memory
* As a consequence, the slave keeps asserting `wready` whenever it is able to respond to the master's request **(see blue box below)**
* In the end, when `bready`、`bvalid` and `wlast` are asserted simultaneously, all write data has been transferred **(see red box below)** and the **handshake** of the B channel is also complete
* After the **B channel handshake** completes, the write data memory request is satisfied

![](https://i.imgur.com/wEt6ldt.png)

## Instruction Cache and Data Cache
### Address mapping
* According to the **Address mapping example** in [icache.v](https://github.com/ultraembedded/riscv/blob/master/top_cache_axi/src_v/icache.v), the instruction cache maps addresses as follows:

![](https://i.imgur.com/azM7aVD.png)

### Data RAM / Tag RAM size
* The **Tag RAM** of both the **Instruction Cache** and the **Data Cache** is 0KB

![](https://i.imgur.com/laFnHix.png)
* The **Data RAM** of both the **Instruction Cache** and the **Data Cache** is 12KB

![](https://i.imgur.com/3MbaRFE.png)

### Replacement policy of icache and dcache - Random Replacement Policy
* **Randomly** selects a candidate item and discards it to make space when necessary.
This algorithm **does not require keeping any information about the access history**
* Implementation of the **Random Replacement Policy** in **icache**

![](https://i.imgur.com/9Z9gY0n.png)
* Implementation of the **Random Replacement Policy** in **dcache**

![](https://i.imgur.com/UFdFomH.png)

### Finite State Machine(FSM) of Instruction Cache
#### STATE_FLUSH
* When the `rst_i` signal arrives, the icache state is set to **STATE_FLUSH**, which resets the data in icache_data_ram or flushes a line in icache_data_ram
* After that, when the icache receives `req_invalidate_i` and `req_accept_o` from the CPU, or when `flush_addr_q` equals 8'b1, it changes state to **STATE_LOOKUP**
#### STATE_LOOKUP
* In **STATE_LOOKUP**, the icache compares the `tag` of the **instruction** address with all the `tag` entries in the **tag ram**
* If the tag does not match any `tag` in the tag array, the icache drives `tag_hit_any_w` low and changes state to **STATE_REFILL**
* Alternatively, if the line selected by the `instruction tag` is invalid, or the icache receives the `req_flush_i` signal from the CPU, it changes state to **STATE_FLUSH**
#### STATE_REFILL
* In **STATE_REFILL**, the icache keeps storing data into icache_data_ram until it receives `axi_rlast_i` from the slave, which means all the requested data has been transferred, and then changes state to **STATE_RELOOKUP**
#### STATE_RELOOKUP
* In **STATE_RELOOKUP**, the icache checks `lookup_addr_q` against the `tag` again and switches to **STATE_LOOKUP** in the next cycle

![](https://i.imgur.com/2QLDCe3.png)

### Waveform of Instruction Cache
* Here we start from `program counter=0x00002000`, which is the first instruction of `basic.elf`

![](https://i.imgur.com/rhMAako.png)
* In the waveform below, when the CPU sends `mem_i_pc_o=0x00002000` to the icache, it also asserts `mem_i_rd_o`, which indicates that it wants to read an instruction from the icache
* In the next cycle, the icache asserts `mem_i_accept_i`, which means it accepts the request and can
return the required data to the CPU
* The corresponding instruction is returned in the same cycle in which `mem_i_accept_i` is asserted
* From the instruction sequence above, we know that after `pc=0x00002000` the program jumps to `pc=0x00002258`
* Consequently, while the CPU keeps sending program counters on the wrong path (e.g. 0x2004, 0x2008), the icache returns the instruction `0x00000000` to the CPU
* Only when the program counter equals `0x2258` does the icache return the instruction `0x00005137` to the CPU

![](https://i.imgur.com/n3GpOby.png)
* Finally, in the waveform below, when the CPU does not want to fetch from the icache, it drives `mem_i_rd_o` low, which tells the icache that the CPU does not need instructions anymore
* One clock cycle later, the icache drives `mem_i_valid_i` low
* After `mem_i_valid_i` is set to `0`, no more instructions are sent back to the CPU

![](https://i.imgur.com/pDIRGTN.png)

### Finite State Machine(FSM) of Data Cache
#### STATE_RESET
* When the `rst_i` signal arrives, the state is set to **STATE_RESET**
* In **STATE_RESET**, the cache clears all the data in the `data ram` and `tag ram`
* The `flush_last_q` signal indicates whether the final line of the cache has been reached; once it has, the state changes to **STATE_LOOKUP**
#### STATE_LOOKUP
* In **STATE_LOOKUP**, the cache first checks whether the **previous access missed in the cache**. If it missed and there is a **dirty line**, the state changes to **STATE_EVICT**; if it missed but there is no dirty line, the state changes to **STATE_REFILL**
* Second, it checks whether the controller sends the `mem_writeback_i` signal; if so, it changes to **STATE_WRITEBACK** (write back a single line)
* Third, it checks whether the controller sends the `mem_flush_i` signal; if so, it changes to **STATE_FLUSH_ADDR** (flush the whole cache)
* Finally, it checks whether the controller sends the `mem_invalidate_i` signal; if so, it changes to **STATE_INVALIDATE** (invalidate a line, even if it is dirty)
* If none of the above signals arrives during **STATE_LOOKUP**, the cache stays in **STATE_LOOKUP** and compares the `tag` of the request address with the `tag` in the **tag ram**
#### STATE_INVALIDATE
* In **STATE_INVALIDATE**, the cache invalidates a cache line even if it is dirty, and then returns to **STATE_LOOKUP**
#### STATE_WRITEBACK
* In **STATE_WRITEBACK**, the cache checks the `tag_hit_and_dirty_m_w` signal, which indicates whether the line that hit is dirty. If it is dirty, the state changes to **STATE_EVICT**; otherwise it returns to **STATE_LOOKUP**
#### STATE_EVICT
* In **STATE_EVICT**, the cache evicts the dirty line and checks whether the eviction is complete. When the eviction has finished, the state changes to **STATE_EVICT_WAIT** to wait for the write operation to complete
#### STATE_EVICT_WAIT
* In **STATE_EVICT_WAIT**, the cache waits for the write operation to complete. When it finishes, the controller uses three kinds of signals to tell the cache which state to enter
* If the cache receives `mem_writeback_m_q` and `pmem_ack_w`, the controller wants to write back a single line, so the cache first changes to **STATE_LOOKUP** and then goes to **STATE_WRITEBACK**
* If the cache receives `flushing_q` and `pmem_ack_w`, the controller wants to flush the dirty line, so the cache changes to **STATE_FLUSH_ADDR**
* Otherwise, if the cache receives only `pmem_ack_w`, the controller wants to refill the cache now, so the cache switches to **STATE_REFILL**
* If the cache receives none of the above signals, it simply stays in **STATE_EVICT_WAIT** and waits for the write operation to complete
#### STATE_WRITE or STATE_READ
* In **STATE_WRITE** or **STATE_READ**, the controller simply writes data to or reads data from the cache, and then goes back to **STATE_LOOKUP**
#### STATE_REFILL
* In **STATE_REFILL**, the controller uses `pmem_last_w` to determine whether the refill operation is complete; if so, it goes on to check the `mem_wr_m_q` signal.
If `mem_wr_m_q` isn't equal to `4'b0`, the refill was triggered by a write, so the state switches to **STATE_WRITE**; otherwise it switches to **STATE_READ**
#### STATE_FLUSH_ADDR
* In **STATE_FLUSH_ADDR**, the controller adds one to `flush_addr_q`, then changes to **STATE_FLUSH** in the next cycle
#### STATE_FLUSH
* In **STATE_FLUSH**, `tag_dirty_any_m_w` first detects whether there is a dirty line. If so, the controller either sends the `evict_way_w` signal to evict the dirty line and switches to **STATE_EVICT**, or waits for the dirty way to be selected (staying in **STATE_FLUSH**)
* Second, if the controller has reached the final line and found no dirty line, it switches to **STATE_LOOKUP**; otherwise it switches to **STATE_FLUSH_ADDR** to increment `flush_addr_q` and keeps checking the lines in the cache

![](https://i.imgur.com/tQWbNE2.png)

### Waveform of Data Cache
#### Write into data cache
* From the instructions below, when `mem_i_pc_o=0x000022BC`, the MEM stage is processing the instruction `22b4: 00112623 sw ra,12(sp)`
* Hence, we use this instruction to explain how data is written into the dcache
```c=
000022b0 <init>:
    22b0:	ff010113          	addi	sp,sp,-16
    22b4:	00112623          	sw	ra,12(sp)
    22b8:	198000ef          	jal	ra,2450 <serial_init>
    22bc:	00c12083          	lw	ra,12(sp)
    22c0:	00002537          	lui	a0,0x2
    22c4:	42c50513          	addi	a0,a0,1068 # 242c <serial_putchar>
    22c8:	01010113          	addi	sp,sp,16
    22cc:	5d10006f          	j	309c <printf_register>
```
![](https://i.imgur.com/ZQ9iz5p.png)
* In the waveform below, unlike an icache access, the CPU does not assert `mem_d_rd_o` when it wants to write into the dcache. Instead, it drives `mem_d_rd_o` low and sets `mem_d_wr_o` to `4'hf`, which tells the dcache controller that the CPU wants to perform a write operation
* In the next cycle, the dcache controller asserts `mem_d_ack_i`, which indicates that the dcache is active and is writing the data (here 0x00000000) into a dcache line
* Once `mem_d_ack_i` goes low, the whole write dcache operation is
complete

![](https://i.imgur.com/0sICFjb.png)

#### Read the data cache
* In the instruction sequence above, when `mem_i_pc_o=0x000022C4`, the MEM stage is processing the instruction `22bc: 00c12083 lw ra,12(sp)`
* Therefore, I use this instruction to explain the read operation on the data cache

![](https://i.imgur.com/7F8Vgkj.png)
* As the waveform below shows, just like an icache access, when the CPU wants to read data from the cache, it asserts `mem_d_rd_o` with `mem_d_wr_o=4'b0`
* This tells the dcache controller that the CPU wants to perform a read operation on the dcache
* One clock cycle later, the dcache controller asserts `mem_d_ack_i`, which means the dcache is active and, for this operation, is reading data

![](https://i.imgur.com/kArc5Eu.png)
* After `mem_d_ack_i` goes high, the data is read out. Because the previous instruction wrote `0x00000000` into the dcache, the data read out now should be `0x00000000` **(see red box below)**
* Once the data has been read out, the whole read dcache operation is complete

![](https://i.imgur.com/88VptWn.png)
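
The write-then-read sequence walked through above can be summarized with a small behavioral model. This is only a rough sketch in Python, not the dcache RTL: `classify_request`, `merge_write`, `TinyDataStore` and the address `0x4000` are hypothetical, and the sketch assumes that each bit of the 4-bit `mem_d_wr_o` enables one byte lane of the 32-bit word (the document only shows the value `4'hf`).

```python
def classify_request(rd, wr):
    """Decode a CPU request from mem_d_rd_o (rd) and the 4-bit mem_d_wr_o (wr)."""
    if wr & 0xF:
        return "write"  # any non-zero write mask means a store
    if rd:
        return "read"   # rd high with wr == 0 means a load
    return "idle"


def merge_write(old_word, new_word, wr):
    """Merge new_word into old_word, byte lane by byte lane.

    Assumption: bit i of wr enables byte i of the 32-bit word.
    """
    result = old_word
    for byte in range(4):
        if wr & (1 << byte):
            mask = 0xFF << (8 * byte)
            result = (result & ~mask) | (new_word & mask)
    return result & 0xFFFFFFFF


class TinyDataStore:
    """Word-addressed storage standing in for the dcache data RAM (no tags, no misses)."""

    def __init__(self):
        self.words = {}

    def access(self, addr, rd, wr, wdata=0):
        kind = classify_request(rd, wr)
        word_addr = addr & ~0x3  # drop the byte offset inside the word
        if kind == "write":
            old = self.words.get(word_addr, 0)
            self.words[word_addr] = merge_write(old, wdata, wr)
            return None
        if kind == "read":
            return self.words.get(word_addr, 0)
        return None


# Mirror the sw / lw pair from the walkthrough at a hypothetical address.
store = TinyDataStore()
store.access(0x4000, rd=0, wr=0xF, wdata=0x00000000)  # sw ra,12(sp)
loaded = store.access(0x4000, rd=1, wr=0x0)           # lw ra,12(sp)
```

The model leaves out `mem_d_ack_i` timing and all FSM states; it only captures which value a load should observe after a store to the same word.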