# Assignment1: RISC-V Assembly and Instruction Pipeline contributed by <[`SmallliDinosaur(GitHub)`](https://github.com/SmallliDinosaur)> --- ### Problem B in Quiz1 #### C Program ``` static inline bf16_t fp32_to_bf16(float s) { union { float f; uint32_t i; } u = {.f = s}; // Use union to handle both float and integer representations // Check for NaN case if ((u.i & 0x7FFFFFFF) > 0x7F800000) { // NaN check bf16_t h; h.bits = (u.i >> 16) | 0x0040; // Force it to be quiet NaN return h; } // Normal case bf16_t h; h.bits = (u.i >> 16) + ((u.i >> 15) & 1); // Perform rounding return h; } ``` #### Assembly ``` .data argument: .word 0x3E800000 .text _start: j main fp32_to_bf16: la a0, argument lw a1, 0(a0) li a2, 0x7F800000 li a3, 0x7FFFFFFF and a4, a1, a3 bgtu a4, a2, non_case li a2, 0x7FFF srli a3, a1, 16 li t0, 0x8000 and a3, a1, t0 add a3, a3, a2 add a3, a3, a1 srli a0, a3, 16 ret non_case: srli a1, a1, 16 li a4, 0x0040 or a0, a1, a4 ret print_result: li a7, 1 ecall ret main: jal ra, fp32_to_bf16 jal ra, print_result end: nop ``` ### Minimum Changes To Make Alternating Binary String([LeetCode 1758](https://leetcode.com/problems/minimum-changes-to-make-alternating-binary-string/description/)) >You are given a string s consisting only of the characters '0' and '1'. In one operation, you can change any '0' to '1' or vice versa. > >The string is called alternating if no two adjacent characters are equal. For example, the string "010" is alternating, while the string "0100" is not. > >Return the minimum number of operations needed to make s alternating. #### Idea for problem solving The problem asks for the minimum operations needed to convert a binary string into an alternating pattern, where no two adjacent characters are the same. There are two possible alternating patterns: one starting with '0' and the other starting with '1'. We calculate the number of changes required to match each pattern and return the smaller value. The space complexity is O(1) since we only use a few variables to track counts, while the time complexity is O(n), where n is the length of the string, as we need to iterate through the string once. #### C Program ``` C int minOperations(char* s) { int c1 = 0, c2 = 0; for (int i = 0; i < strlen(s); ++i) { c1 += (s[i] - '0' == i % 2); c2 += (s[i] - '0' != i % 2); } return c1 < c2 ? c1 : c2; } ``` #### Assembly ``` Assembly .data string1: .asciz "0100" # Test case 1 string2: .asciz "10" # Test case 2 string3: .asciz "1111" # Test case 3 newline: .asciz "\n" # Newline character .text .globl main main: la a0, string1 # Load the address of string1 jal process_string # Process string1 jal print_result # Output the result of string1 la a0, string2 # Load the address of string2 jal process_string # Process string2 jal print_result # Output the result of string2 la a0, string3 # Load the address of string3 jal process_string # Process string3 jal print_result # Output the result of string3 li a0, 1 # Set the exit code to 1 li a7, 93 # ecall number 93 is for exit ecall process_string: li t0, 0 # i = 0 li t1, 0 # c1 = 0 li t2, 0 # c2 = 0 process_loop: lb t3, 0(a0) # Load the current character from the string beqz t3, end_process # Exit if the current character is '\0' # Check if s[i] == i % 2 li t4, 48 # Load the ASCII value of character '0' sub t5, t3, t4 # Calculate s[i] - '0' (convert ASCII to number) andi t6, t0, 1 # Use bitwise operation for i % 2, get the lowest bit of i beq t5, t6, inc_c1 # If equal, increment c1++ j inc_c2 # If not equal, increment c2++ inc_c1: addi t1, t1, 1 # c1++ j next_char # Jump to the next character inc_c2: addi t2, t2, 1 # c2++ next_char: addi t0, t0, 1 # i++ addi a0, a0, 1 # Move the string pointer to the next character j process_loop # Continue processing the next character end_process: # Compare c1 and c2, take the minimum value bge t1, t2, output_c2 mv a0, t1 # a0 = c1 j end_process_func output_c2: mv a0, t2 # a0 = c2 end_process_func: jr ra # Return to the main program print_result: li a7, 1 # ecall number 1 is for printing an integer ecall # System call to print the content of a0 la a0, newline # Load the address of the newline character li a7, 4 # ecall number 4 is for printing a string ecall # System call to print a newline jr ra # Return to the main program ``` - [Ripes](https://ripes.me/) #### Analysis Input: s = "0100" Output: 1 Input: s = "10" Output: 0 Input: s = "1111" Output: 2 ![image](https://hackmd.io/_uploads/Hkb4Xh5kyl.png) Based on the results, the execution statistics show that the program took 48 cycles to complete, with 28 instructions retired. The average cycles per instruction (CPI) is 1.71, and the instructions per cycle (IPC) is 0.583. Additionally, the clock rate is recorded at 17.54 Hz, suggesting the program might have been executed in a simulated or testing environment. The relatively high CPI and low IPC indicate potential areas for performance optimization. ![image](https://hackmd.io/_uploads/HkoHXh9Jyg.png) This results shows the general-purpose register (GPR) values from a RISC-V architecture at a particular point during program execution. The snapshot reveals that the program is likely in the middle of a function or system call, with the return address stored in the `x1 (ra)` register, the stack pointer (`x2`) pointing to the stack, and arguments being passed through registers like `x10`. The `x17 (a7)` register holds the value `0x0000005d`, which may indicate a specific system call or function identifier. Temporary registers are also in use for intermediate calculations, and the stack is appropriately positioned. This register state provides insight into the current execution flow, showing that the program is in the midst of handling some operational task. ![image](https://hackmd.io/_uploads/Skx3DE2ck1l.png)