Assignment2: GNU Toolchain

Due: Oct 31, 2023

Requirements

Following the instructions in Lab2: RISC-V RV32I[MACF] emulator with ELF support, select one assembly program from Assignment1: RISC-V Assembly and Instruction Pipeline, and adapt it into both RISC-V assembly and C implementations that can be executed flawlessly with rv32emu.
- You must NOT select programs that you have previously submitted.
- You should provide a brief description of your motivations for your selection.
- You may choose to study the same subject as other students, but you must make your own discoveries.
- Only RV32I instructions may be used, so you MUST build C programs with the -march=rv32i -mabi=ilp32 flags.
  - RV32M (multiply and divide) and RV32F (single-precision floating point) are not permitted.
- rv32emu and Ripes may not work together, so please be aware of the potential incompatibility. Please check docs/syscall.md and src/syscall.c in advance.
- Do not duplicate workspaces or the entire repository from rv32emu. As a starting point, copy the asm-hello directory instead. You shall modify the Makefile and the linker script accordingly.
- kdnvt produced some excellent work that can be used as a benchmark for program analysis and future optimizations. Please read his report carefully and pay attention to certain suggestions and observations.
- (Optional) You may pick the programs pi.c and nqueens.c and write RISC-V assembly that is superior to the code produced by the GNU Toolchain. That is, your handwritten RISC-V assembly program should run faster and occupy less space in the ELF image.
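Because RV32M is not permitted, there is no hardware `mul` instruction to rely on. When building with -march=rv32i, GCC typically lowers a non-constant `a * b` into a call to the libgcc helper `__mulsi3`, so a handwritten multiply is a natural candidate for your own analysis. The following is a minimal sketch of shift-and-add multiplication, not a definitive implementation; the function name `mul_u32` is hypothetical and chosen only for illustration.

```c
#include <stdint.h>

/* Shift-and-add multiplication: uses only shifts, adds, and branches,
 * so it never needs the RV32M `mul` instruction.
 * mul_u32 is a hypothetical name used for this sketch. */
uint32_t mul_u32(uint32_t a, uint32_t b)
{
    uint32_t result = 0;
    while (b != 0) {
        if (b & 1u)      /* lowest bit of b is set: add the current a */
            result += a;
        a <<= 1;         /* a doubles for the next bit position */
        b >>= 1;         /* move on to the next bit of b */
    }
    return result;       /* product modulo 2^32 */
}
```

The loop runs once per significant bit of `b`, so passing the smaller operand as `b` reduces the iteration count. The same structure maps directly onto `andi`, `add`, `slli`, `srli`, and `bnez` in handwritten RV32I assembly.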
Disassemble the ELF files produced by the C compiler and contrast the handwritten and compiler-optimized assembly listings.
- You can change the compilation options to experiment, e.g., change -Ofast (optimized for speed) to -Os (optimized for size).
- Describe your observations and explain them.
Check ticks.c and perfcounter for statistics on your program's execution. Then try to optimize the handwritten/generated assembly. You shall read the RISC-V Assembly Programmer's Manual carefully.
- We care about CSR cycles at the moment.
- Can you improve the assembly code generated by gcc with optimizations? Or, can you write even faster/smaller programs in RISC-V assembly?
- You may drop some function calls and apply techniques such as loop unrolling and peephole optimization.
  
  Quote from RISC-V Instruction Set Manual: The RDCYCLE pseudo-instruction reads the low XLEN bits of the cycle CSR which holds a count of the number of clock cycles executed by the processor on which the hardware thread is running from an arbitrary start time in the past. RDCYCLEH is an RV32I-only instruction that reads bits 63–32 of the same cycle counter. The underlying 64-bit counter should never overflow in practice.
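Since the 64-bit counter is split across RDCYCLE and RDCYCLEH on RV32I, a common way to read it consistently is to read the high half, then the low half, then the high half again, and retry if the two high reads differ. The sketch below illustrates that idea together with a loop unrolled by four as a subject to measure. It is not taken from ticks.c; the names `read_cycles`, `sum_rolled`, and `sum_unrolled4` are hypothetical, and the host fallback using `clock()` is an assumption added only so the sketch also builds off-target. Some newer toolchains may require `_zicsr` in the -march string for the CSR instructions, so check the Makefile used in the lab.

```c
#include <stdint.h>
#include <time.h>

/* Read the 64-bit cycle counter. On RV32 the counter is split across two
 * CSR reads, so the high half is read twice to detect a carry in between.
 * read_cycles is a hypothetical helper name for this sketch. */
static inline uint64_t read_cycles(void)
{
#if defined(__riscv) && (__riscv_xlen == 32)
    uint32_t hi, lo, hi2;
    do {
        __asm__ volatile("rdcycleh %0" : "=r"(hi));
        __asm__ volatile("rdcycle  %0" : "=r"(lo));
        __asm__ volatile("rdcycleh %0" : "=r"(hi2));
    } while (hi != hi2); /* high half changed between reads: try again */
    return ((uint64_t) hi << 32) | lo;
#else
    /* Host fallback (assumption): not a cycle count, just a timestamp. */
    return (uint64_t) clock();
#endif
}

/* Straightforward loop: one add plus loop overhead per element. */
uint32_t sum_rolled(const uint32_t *v, uint32_t n)
{
    uint32_t s = 0;
    for (uint32_t i = 0; i < n; i++)
        s += v[i];
    return s;
}

/* Unrolled by four: the counter update and branch are paid once per
 * four elements instead of once per element. */
uint32_t sum_unrolled4(const uint32_t *v, uint32_t n)
{
    uint32_t s = 0, i = 0;
    for (; n - i >= 4; i += 4)
        s += v[i] + v[i + 1] + v[i + 2] + v[i + 3];
    for (; i < n; i++) /* leftover elements */
        s += v[i];
    return s;
}
```

To compare the two variants, call `read_cycles()` immediately before and after each function and subtract the results. Whether unrolling actually helps depends on the optimization level and on code size, so it is worth checking the measurement against the disassembly.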
Write down your thoughts and progress in HackMD notes.
- Example page
  Do not modify this note.
- Insert your HackMD notes and programs in the following table.
- Of course, you MUST write in English.
BONUS: Find the bugs inside rv32emu and send pull requests to improve it!

Fill in the table for your homework

Be aware of spaces. Separate each item with |

Formal given name	Descriptions	HackMD note
Sample0	Monotonic Array	Homework2
Sample1	Move zeroes	Homework2
Sample2	Sort Colors	Homework2
Sample3	Length of Last Word	Homework2
Sample4	Single number	Homework2
黃柏叡	Palindrome detection with counting leading zeros	Homework2
施宇庭	Logarithm of bfloat16 numbers	Homework2
洪胤勛	Reducing memory usage with bfloat and bfloat multiplication	Homework2
陳彥佑	Implement log base power of 2 with CLZ	Homework2
戴鈞彥	Find Leftmost 0-byte using CLZ	Homework2
廖泓博	Calculate the Hamming Distance using Counting Leading Zeros	Homework2
陳川曜	Find First String of 1-bits of a Given Length by CLZ	Homework2
吳堉銨	Find First String of 1-bits of a Given Length by CLZ	Homework2
林柏全	Data encryption using CLZ	Homework2
張澤家	Get sine value without floating point multiplication support	Homework2
鄭吉廷	A Novel Approach to IP Routing: CLZ for Prefix Match	Homework2
林子勝	Shell sort with FP32 in BF16 format	Homework2
盧俊銘	Implementation of multiplication overflow prediction for unsigned integers using CLZ	Homework2
張偉治	Matrix multiplication with floating point addition and multiplication	Homework2
侯廷翰	Find the position of MSB by CLZ	Homework2
林以薰	Calculate the Hamming Distance using Counting Leading Zeros	Homework2
洪佑杭	Find Leftmost 0-byte using CLZ	Homework2
邱德昌	Improve conversion of FP32 to BF16 and count the number of ones in binary representation	Homework2
李冠澄	Matrix multiplication with floating point addition and multiplication and improvement	Homework2
陳金諄	Implement priority encoder using CLZ	Homework2
鄭朝駿	Matrix multiplication using bfloat16	Homework2
魏彥庭	Implement priority encoder using CLZ	Homework2
謝維倫	Multiplication overflow prediction for unsigned int using CLZ	Homework2
林允顥	Implement unsigned int mul by count leading zero	Homework2
高紹捷	Reducing memory usage with bfloat and bfloat multiplication	Homework2
陳浩文	Find Leftmost 0-byte using CLZ	Homework2
簡志耀	Calculate the Hamming Distance using Counting Leading Zeros	Homework2
王豊惟	Implement palindrome detection and using CLZ	Homework2
周育晨	Implement palindrome detection and using CLZ	Homework2
劉智恩	Implement transformation from integer to float by clz	Homework2
蔡忠翰	Implement function to find maximum absolute value in bfloat16 array for quantization	Homework2
鄭博文	Multiplication overflow prediction for unsigned int using CLZ	Homework2
許唯萱	Using Hamming Codes to implement number of "1" bits	Homework2
施柏安	Implement and analyze BF16 multiplication	Homework2
黃定山	Find the square root of an integer number	Homework2
洪碩星	Image scaling with Bilinear interpolation	Homework2
楊宇翔	Find first set using CLZ	Homework2
林勁羽	Calculate the Hamming Distance using CLZ	Homework2
魏泳禎	Implementation of multiplication overflow prediction for unsigned integers using CLZ	Homework2
彭煜博	Bitwise AND of Numbers Range	Homework2
陳燦仁	Implement unsigned int mul by count leading zero	Homework2
丁竟烽	Generate bitmask by CLZ	Homework2
李亮穎	Implementation of Bit-Plane Slicing with CLZ	Homework2
顏伯丞	Implement Binarization by count leading zero	Homework2
李承泰	Sum of Leading Zeros in Linked List by CLZ	Homework2
陸品潔	Calculate the Hamming Distance using Counting Leading Zeros	Homework2
江冠霆	Data encryption using CLZ	Homework2
李晨瑞	Implement Binarization by count leading zero	Homework2
蕭明祥	Implement Binarization by count leading zero	Homework2
鍾沅熹	Implement Variable Byte Compression By Counting Leading Zeros	Homework2
范紘維	Indexing of hierarchical data structures by CLZ	Homework2
李熙堃	Bits Compression Using CLZ	Homework2
曾鼎棊	Output different number with same binary format by using clz	Homework2
張正德	Image scaling with Bilinear interpolation by float32 multiplication	Homework2
林昊霆	Implementing FP32 Operations by Applying FP32 to Bfloat16 Conversion Algorithm	Homework2
林晉宇	Convert RGB image into grayscale by using RV32I ISA	Homework2
許卜元	Implement Binarization by count leading zero	?
倪英智	Multiplication Overflow Prediction	Homework2
陳冠元	Approximating a bfloat number using binary search	Homework2
黃于睿	Shell sort with FP32 in BF16 format	Homework2
唐飴苹	Implementation of Bit-Plane Slicing with CLZ	Homework2
劉庭聿	Calculate the Hamming Distance using Counting Leading Zeros	?
