# Assignment 1 : Convert Binary Number in a Linked List to Integer
contributed by [`popo8712`](https://github.com/popo8712/Computer-Architecture-1)
# A. quiz1 - problem B
The problem requires converting a 32-bit floating point number (IEEE 754 float format) into a 16-bit bfloat16 format.
## 1. Purpose
- Reduce memory usage: bfloat16 uses 16 bits instead of 32 bits.
- Increase computational speed: Faster processing due to reduced data size.
- Maintain dynamic range: Exponent bits remain the same, allowing for the same numerical range.
- Sacrifice precision: bfloat16 retains only 7 bits for the mantissa, resulting in lower precision, but generally sufficient.
## 2. C Code
```c
static inline bf16_t fp32_to_bf16(float s) {
union {
float f;
uint32_t i;
} u = {.f = s}; // Use union to handle both float and integer representations
// Check for NaN case
if ((u.i & 0x7FFFFFFF) > 0x7F800000) { // NaN check
bf16_t h;
h.bits = (u.i >> 16) | 0x0040; // Force it to be quiet NaN
return h;
}
// Normal case
bf16_t h;
h.bits = (u.i >> 16) + ((u.i >> 15) & 1); // Perform rounding
return h;
}
```
## 3. Assembly Code
```s
.data
argument: .word 0x3E800000 # float 0.25 (32-bit IEEE 754)
.text
_start:
j main
fp32_to_bf16:
la a0, argument # Load the address of the argument
lw a1, 0(a0) # Load float 0.25 (32-bit IEEE 754) from memory
# NaN Check
li a2, 0x7F800000 # NaN threshold
li a3, 0x7FFFFFFF # Mask to remove the sign bit
and a4, a1, a3 # u.i & 0x7FFFFFFF
bgtu a4, a2, non_case # If u.i > 0x7F800000, handle NaN case
# Rounding and conversion to bfloat16
li a2, 0x7FFF # Constant 0x7FFF
srli a3, a1, 16 # Shift right to extract the top 16 bits of u.i
li t0, 0x8000 # Load the 0x8000 value for rounding
and a3, a1, t0 # Check the 17th bit for rounding (u.i >> 15) & 1
add a3, a3, a2 # 0x7FFF + (u.i >> 15) & 1 (Rounding)
add a3, a3, a1 # u.i + (0x7FFF + (u.i >> 15) & 1)
srli a0, a3, 16 # Shift right to get the final bfloat16 value
ret # Return the result in a0
non_case:
# Handle NaN case
srli a1, a1, 16 # Shift right by 16 bits
li a4, 0x0040 # Set the quiet NaN flag
or a0, a1, a4 # Set the final result as (u.i >> 16) | 0x40
ret
print_result:
li a7, 1 # syscall number for print integer (assuming RISC-V Linux)
ecall # make system call to print the value in a0
ret
main:
jal ra, fp32_to_bf16 # Call the fp32_to_bf16 function
jal ra, print_result # Print the result stored in a0
end:
nop
```
## 4. Result


# B. Leetcode 1290 Implementation
## 1. Background
### 1.1 Convert Binary Number in a Linked List to Integer
- **Definition**: In this problem, we deal with a linked list where each node stores a binary digit (0 or 1). The head node of the linked list represents the most significant bit of the binary number, while the tail node represents the least significant bit. Our task is to convert this binary representation into the corresponding decimal integer.
- **Example**: Suppose the linked list is [1, 0, 1], which represents the binary number 101. Converting this to a decimal number results in 5.
- **Task Description**: Given a linked list, we need to read its contents step by step and convert the binary number into a decimal integer by shifting each binary digit to the left and accumulating the result. This process requires an understanding of how to operate on a linked list structure and how to efficiently perform bitwise operations.
### 1.2 Motivation
- **Method Used**: This problem involves converting a binary number to a decimal, which can typically be solved efficiently through bitwise operations. In the C code, we can use the left shift operation (`<<`) to move each encountered binary digit to the correct position and accumulate the result. Similarly, we aim to optimize the process by writing manual RISC-V assembly code, leveraging the simplicity of the RISC-V instruction set to produce faster and smaller code.
- **Performance Optimization**: The purpose of manually optimizing the code using RISC-V assembly is to reduce runtime overhead by utilizing simple bitwise operations, thereby avoiding unnecessary logic computations and function calls. This reduces the code size and improves efficiency, especially in contexts where floating-point operations or complex computations are not needed. Handwriting assembly allows for more precise control of the operations.
## 2. Implementation
You can find the source code [here](https://github.com/popo8712/Computer-Architecture/blob/main/convert.c).
[leetcode 1290](https://leetcode.com/problems/convert-binary-number-in-a-linked-list-to-integer/description/)
### 2.1 C program
```c=
/**
* Definition for singly-linked list.
* struct ListNode {
* int val;
* struct ListNode *next;
* };
*/
int getDecimalValue(struct ListNode* head){
int res = 0;
while( head ){
res <<= 1;
res |= head->val;
head = head->next;
}
return res;
}
```
### 2.2 Assembly Code
The complete source code can be found [here](https://github.com/popo8712/Computer-Architecture/blob/main/assembly%20code). The following shows only the part of main function.
```assembly=
# Common function to process linked list and print result
process_list:
li a1, 0 # Initialize result (res) to 0
loop:
lw t0, 0(a0) # Load the current node value (head->val) into t0
blt t0, zero, end # If the value is -1 (end marker), jump to end
slli a1, a1, 1 # Left shift res by 1 (res <<= 1)
or a1, a1, t0 # OR the current value with res (res |= head->val)
addi a0, a0, 4 # Move to the next node (head = head->next)
j loop # Repeat the loop
end:
# End of loop, result is stored in a1
mv a0, a1 # Move the result into a0 for printing
li a7, 1 # Syscall code 1 (print integer)
ecall # Make syscall to print the result
ret # Return to the main function
```
> I found that there are currently too many lost packets, so I need to revise my assembly language.
> 
### 2.4 Revise
#### 2.4.1 Assembly Code
```assembly=
```
### 2.5 Result
Input: head = [1,0,1]
Output: 5
Input: head = [0]
Output: 0
#### C program output

#### RISC-V Assembly output

## 3. Analysis
I test the code using [Ripes](https://github.com/mortbopet/Ripes) simulator.
### 3.1 Disassembled Executable Code
The complete translated code can be found [here](https://github.com/popo8712/Computer-Architecture/blob/main/disassemble.s). The following shows only the part of main function.
```Disassembled=
00000000 <main>:
0: 10000517 auipc x10 0x10000
4: 00050513 addi x10 x10 0
8: 00400893 addi x17 x0 4
c: 00000073 ecall
10: 10000517 auipc x10 0x10000
14: 01c50513 addi x10 x10 28
18: 048000ef jal x1 72 <process_list>
1c: 10000517 auipc x10 0x10000
20: ff950513 addi x10 x10 -7
24: 00400893 addi x17 x0 4
28: 00000073 ecall
2c: 10000517 auipc x10 0x10000
30: feb50513 addi x10 x10 -21
34: 00400893 addi x17 x0 4
38: 00000073 ecall
3c: 10000517 auipc x10 0x10000
40: 00050513 addi x10 x10 0
44: 01c000ef jal x1 28 <process_list>
48: 10000517 auipc x10 0x10000
4c: fcd50513 addi x10 x10 -51
50: 00400893 addi x17 x0 4
54: 00000073 ecall
58: 00a00893 addi x17 x0 10
5c: 00000073 ecall
```
### 3.2 5-stage pipelined processor
Ripes provide different processor including single-cycle processor, 5-stage processor, 6-stage dual-issuer processor. 5-stage pipelined processor is chosen to run the program. The block diagram is shown below:

The 5 stages is designed to deal with the instructions in pipeline approach to enhance the execution efficiency. The 5 stages is shown as below:
1. **IF (Instruction Fetch)**
- fetch the instruction from instruction memory
2. **ID (Instruction Decode)**
- decode the instruction and read the value from registers
3. **EX (Execute)**
- use ALU to execute calculations or compute memory addresses
4. **MEM (Memory)**
- access the operands from memory
5. **WB (Write Back)**
- write the result back to the registers
### 3.3 I-type format
I will choose the following section as the object of analysis:
```assembly=
li a7, 1 # Syscall code 1 (print integer)
ecall # Make syscall to print the result
```
According to [RISC-V Instruction Set Manual (p.14)](https://riscv.org//wp-content/uploads/2017/05/riscv-spec-v2.2.pdf), the I-type instruction will be accessed in the format below:
> Shifts by a constant are encoded as a specialization of the I-type format. The operand to be shifted is in rs1, and the shift amount is encoded in the lower 5 bits of the I-immediate field. The right shift type is encoded in a high bit of the I-immediate.

1. **IF (Instruction Fetch)**

- In this stage, the processor reads the current instruction's address from the Program Counter (PC) and fetches the instruction from the instruction memory. In the diagram, we see the instruction `addi x17, x0, 1` is in the IF stage, with the PC value being `0x00000080`. The processor fetches `0x00100893`, which is the machine code for `addi x17, x0, 1`, and prepares for decoding in the next stage.
2. **ID (Instruction Decode)**

- In the ID stage, the fetched machine code is decoded into a specific instruction. At this point, the processor identifies the target register and operands. For the instruction `addi x10, x11, 0`, the machine code `0x00058513` is decoded, with register `x11` identified as the source register, and both the immediate value `0` and the target register `x10` ready. The instruction is now prepared to move to the EX stage for the addition operation.
3. **EX (Execute)**

- In the EX stage, arithmetic or logic operations are performed in the Arithmetic Logic Unit (ALU). For the instruction `addi x10, x10, 4`, the ALU is adding the value of `x10` to the immediate value `4`. We can see the result of the ALU operation is `x10 + 4`, and this result will be passed to the next stage. In the diagram, the ALU’s inputs are the current value of `x10` (`0x00000078`) and the immediate value `4`, resulting in `0x0000007C`.
4. **MEM (Memory)**

- For instructions that don’t involve memory access, such as the logic instruction `or x11, x11, x5`, this stage is typically skipped. However, for memory access instructions like `lw` (load word) or `sw` (store word), this stage would involve accessing data memory. In this diagram, there are no instructions currently accessing memory, so the MEM operation is not applicable here.
5. **WB (Write Back)**

- In the WB stage, the final result of the instruction is written back to the target register. In the diagram, the instruction `or x11, x11, x5` is in the write-back stage. The result will be written back to register `x11`, which has the value `0x00000001`. The write-back process indicates that the execution of this instruction is complete and the result has been successfully stored in the target register.
### 3.4 memory update

This diagram shows the status of **32 General Purpose Registers (GPR)** in the processor and their values during the current stage of program execution. Each register has its name, alias, and stored value. These registers are used to store data and intermediate computation results while the program is running.
#### Explanation of the columns in the diagram:
1. **Name**: The names of the registers in the RISC-V processor, ranging from `x0` to `x31`, which follow the RISC-V instruction set standard.
2. **Alias**: Some common registers have aliases. For example, `x0` is called `zero`, `x1` is called `ra` (return address), `x2` is `sp` (stack pointer), and so on.
3. **Value**: The current value stored in the register, displayed in hexadecimal format. These values represent the status of each register while the processor is executing the program.
#### Key register analysis:
- **x0 (zero)**: This register always holds the value `0`, no matter the situation. This is a specification in the RISC-V design.
- **x1 (ra, return address)**: `x1` holds the return address. In the diagram, its value is `0x0000001c`, meaning that when the current function finishes, the program will jump to address `0x0000001c`.
- **x2 (sp, stack pointer)**: `x2` is the stack pointer, indicating the current top of the stack. In the diagram, the stack pointer's value is `0x7ffffff0`, which represents the memory address of the stack.
- **x5 (t0, temporary register)**: `x5` is a temporary register used for storing intermediate data. Here, it holds the value `0x00000001`, indicating that `t0` currently stores the value `1`. This might be the result of an `addi` instruction or a similar operation.
#### Other register statuses:
- **x10 (a0)** and **x11 (a1)**: These registers are typically used as argument registers for function calls. In the diagram, `x10` holds the value `0x1000002c`, which may represent a parameter value, while `x11` is `0x00000000`, meaning it currently holds no value.
- **x31 (t6)**: Although not all registers are shown in the diagram, the `t` series registers are typically used for temporary values during computation. These registers are frequently used during operations but do not hold important global state.
#### Analysis summary:
1. **Register `x5`** (alias `t0`) holds the value `0x00000001`, indicating that during the current execution, `t0` stores the value `1`. This may be related to a recently executed `addi` instruction or shift operation.
2. **Stack pointer `x2`** points to address `0x7ffffff0`, which is the current top of the stack. The stack pointer often changes during function calls or returns.
3. **Return address register `x1`** holds the value `0x0000001c`, which is the address where the processor will jump to execute the next instruction after the current function returns.