
# Assignment1: RISC-V Assembly and Instruction Pipeline

contributed by < [`pine0113`](https://github.com/pine0113) >

# NOT Finished yet

###### tags: `RISC-V`

## Implement maxpool operator using bfloat16

>The maximum pooling operation performs downsampling by dividing the input into pooling regions and computing the maximum value of each region. The maxpool function applies the maximum pooling operation to dlarray data.

## Motivation

BF16 is a floating-point format designed for AI computing, and operators such as conv, gemm, and maxpool are common building blocks of neural networks. In this homework I would like to implement one of these operators, maxpool, and see where its efficiency can be improved.

## Implementation

You can find the source code [here](https://github.com/pine0113/Computer-Architecture/tree/master/hw1). Feel free to fork and modify it.

### C code

* first version

```clike=
void maxpool(float* output, float* input,
             int inputRow, int inputCol,
             int kernelRow, int kernelCol)
{
    /* The window slides with stride 1 and no padding. */
    int outputRow = inputRow - kernelRow + 1;
    int outputCol = inputCol - kernelCol + 1;

    for (int i = 0; i < outputRow; i++) {
        for (int j = 0; j < outputCol; j++) {
            /* Start from the top-left element of the window. */
            float tmp = input[i * inputCol + j];
            for (int k = 0; k < kernelRow; k++) {
                for (int l = 0; l < kernelCol; l++) {
                    if (input[(i + k) * inputCol + (j + l)] > tmp) {
                        tmp = input[(i + k) * inputCol + (j + l)];
                    }
                }
            }
            output[i * outputCol + j] = tmp;
        }
    }
}
```

* second version

### Assembly code

In this example,

```clike=
.data
arr1: .word 1, 2, 3    # a[3] = {1, 2, 3}
arr2: .word 4, 5, 6    # b[3] = {4, 5, 6}
len:  .word 3          # array length = 3
str:  .string "The inner product of two vectors is "

.text
# s1 = arr1 base address
# s2 = arr2 base address
# s3 = array length
# s4 = sum
# t0 = i
# t1 = a[i]
# t2 = b[i]
# t3 = a[i] * b[i] (assume no overflow, lower 32 bits)
main:
    ecall
loop:
    ret
```

## Analysis

We test our code using the [Ripes](https://github.com/mortbopet/Ripes) simulator.
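The first C version above works on `float`, while the stated goal is a bfloat16 maxpool. Below is a minimal sketch of how the comparison could be done on raw 16-bit patterns, so that only integer operations are needed when the code is later translated to assembly. Everything here is an assumption for illustration and not the author's code: the names `bf16_t`, `fp32_to_bf16`, `bf16_to_fp32`, `bf16_order_key`, and `maxpool_bf16` are hypothetical, the conversion truncates instead of rounding, and NaN inputs are not handled.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical type: a bfloat16 kept as the upper 16 bits of a float32. */
typedef uint16_t bf16_t;

/* Convert by truncation (no rounding); an assumption made for brevity. */
bf16_t fp32_to_bf16(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return (bf16_t)(bits >> 16);
}

float bf16_to_fp32(bf16_t h)
{
    uint32_t bits = (uint32_t)h << 16;
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* Map a bfloat16 bit pattern to an unsigned key with the same ordering:
 * negative values have all bits flipped, non-negative values get the
 * sign bit set. NaN is not handled. */
static inline uint16_t bf16_order_key(bf16_t h)
{
    return (h & 0x8000u) ? (uint16_t)~h : (uint16_t)(h | 0x8000u);
}

/* Same loop structure as the float version (stride 1, no padding),
 * but each comparison is a plain unsigned integer comparison. */
void maxpool_bf16(bf16_t *output, const bf16_t *input,
                  int inputRow, int inputCol,
                  int kernelRow, int kernelCol)
{
    assert(kernelRow <= inputRow && kernelCol <= inputCol);
    int outputRow = inputRow - kernelRow + 1;
    int outputCol = inputCol - kernelCol + 1;

    for (int i = 0; i < outputRow; i++) {
        for (int j = 0; j < outputCol; j++) {
            bf16_t best = input[i * inputCol + j];
            for (int k = 0; k < kernelRow; k++) {
                for (int l = 0; l < kernelCol; l++) {
                    bf16_t cur = input[(i + k) * inputCol + (j + l)];
                    if (bf16_order_key(cur) > bf16_order_key(best)) {
                        best = cur;
                    }
                }
            }
            output[i * outputCol + j] = best;
        }
    }
}
```

For a 3x3 input holding 1 to 9 and a 2x2 kernel, this sketch yields the 2x2 output 5, 6, 8, 9, which can serve as a reference when checking register and memory contents in the simulator.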
### Pseudo instruction

Put the code above into the editor and we will see that Ripes doesn't execute it literally. Instead, it replaces each [pseudo instruction (p.110)](https://riscv.org//wp-content/uploads/2017/05/riscv-spec-v2.2.pdf) with an equivalent real instruction, and changes register names from ABI names to sequential ones. The translated code looks like:

```
00000000 <main>:
```

Each row shows the address in instruction memory, the instruction's machine code (in hex), and the instruction itself, respectively.

* Explain your program with the visualization for multiplexer input selection, register write/enable signals and more. You have to illustrate each stage such as IF, ID, EX, MEM, and WB. In addition, you should discuss the steps of memory updates accordingly.

## Reference

* [RISC-V Instruction Set Manual](https://riscv.org//wp-content/uploads/2017/05/riscv-spec-v2.2.pdf)
* [RISC-V Green Card](https://www.cl.cam.ac.uk/teaching/1617/ECAD+Arch/files/docs/RISCVGreenCardv8-20151013.pdf)
* [Environmental Calls](https://github.com/mortbopet/Ripes/wiki/Environment-calls)
* [fast max pooling](https://github.com/mratsim/Arraymancer/issues/174)
