---
tags: computer-arch
---
# Assignment2: GNU Toolchain
> Due: ==Oct 31, 2023==
## Requirements
1. Following the instructions in [Lab2: RISC-V RV32I[MACF] emulator with ELF support](https://hackmd.io/@sysprog/SJAR5XMmi), select one assembly program from [Assignment1: RISC-V Assembly and Instruction Pipeline](https://hackmd.io/@sysprog/2023-arch-homework1), and adapt it into both RISC-V assembly and C implementations that can be executed flawlessly with [rv32emu](https://github.com/sysprog21/rv32emu).
* You must NOT select programs that you have previously submitted.
* You should provide a brief description of your motivations for your selection.
* You may choose to study the same subject as other students, but you must make your own discoveries.
* Only **RV32I** instructions may be used. This means that you MUST build C programs with the `-march=rv32i -mabi=ilp32` flags.
* RV32M (multiply and divide) and RV32F (single-precision floating point) are not permitted.
* :warning: [rv32emu](https://github.com/sysprog21/rv32emu) and [Ripes](https://github.com/mortbopet/Ripes) may not be compatible with each other, so do not assume that a program written for one runs unchanged on the other. Please check [docs/syscall.md](https://github.com/sysprog21/rv32emu/blob/master/docs/syscall.md) and [src/syscall.c](https://github.com/sysprog21/rv32emu/blob/master/src/syscall.c) in advance.
* Do not duplicate workspaces or the entire repository from [rv32emu](https://github.com/sysprog21/rv32emu). As a starting point, copy the [`asm-hello`](https://github.com/sysprog21/rv32emu/tree/master/tests/asm-hello) directory instead. You shall modify `Makefile` and the linker script accordingly.
* [kdnvt](https://github.com/kdnvt) produced [some excellent work](https://hackmd.io/@kdnvt/2022-arch-hw1) that can be used as a benchmark for program analysis and future optimizations. Please read his report carefully and pay attention to certain suggestions and observations.
* (**Optional**) You may take the programs [pi.c](https://github.com/sysprog21/rv32emu/blob/master/tests/pi.c) and [nqueens.c](https://github.com/sysprog21/rv32emu/blob/master/tests/nqueens.c) and write RISC-V assembly that is superior to that produced by the GNU Toolchain. That is, your handwritten RISC-V assembly program should run faster and occupy less space in the ELF image.
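
Because RV32M is not permitted, any multiplication in your program has to be expressed with base RV32I operations. When a C program built with `-march=rv32i` uses `*`, GCC typically emits a call to a library helper (`__mulsi3`) instead of a `mul` instruction. The following is a minimal sketch of the classic shift-and-add technique, which is one way to write such a routine by hand. The function name `mul32_shift_add` is hypothetical and is not part of rv32emu or the toolchain.

```c
#include <stdint.h>

/* Shift-and-add multiplication (illustrative sketch).
 * Uses only add, shift, bitwise AND, and branches, all of which
 * exist in base RV32I. Returns the low 32 bits of a * b. */
static uint32_t mul32_shift_add(uint32_t a, uint32_t b)
{
    uint32_t result = 0;

    while (b != 0) {
        if (b & 1u)      /* lowest bit of the multiplier is set */
            result += a; /* add the current shifted multiplicand */
        a <<= 1;         /* next bit is worth twice as much */
        b >>= 1;         /* move on to the next multiplier bit */
    }
    return result;
}
```

The loop runs once per significant bit of `b`, so passing the smaller operand as `b` shortens execution. This is the kind of observation worth recording when you count cycles later.
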
2. Disassemble the ELF files produced by the C compiler and contrast the handwritten and compiler-optimized assembly listings.
* You can change the compilation options to experiment, e.g., change `-Ofast` (optimized for speed) to `-Os` (optimized for size).
* Describe your observations and explain them.
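
As a starting point for this comparison, a small self-contained function is easier to read in a disassembly listing than a whole program. The sketch below is a branch-based count-leading-zeros routine. The name `clz32` and the file name `clz.c` are hypothetical. The commands in the comment assume a toolchain prefix of `riscv-none-elf-`, which may differ on your installation.

```c
#include <stdint.h>

/* Illustrative sketch: count leading zeros by binary search.
 *
 * To compare compiler output (toolchain prefix is an assumption):
 *   riscv-none-elf-gcc -march=rv32i -mabi=ilp32 -Ofast -c clz.c -o clz_fast.o
 *   riscv-none-elf-gcc -march=rv32i -mabi=ilp32 -Os    -c clz.c -o clz_size.o
 *   riscv-none-elf-objdump -d clz_fast.o
 *   riscv-none-elf-objdump -d clz_size.o
 */
unsigned clz32(uint32_t x)
{
    unsigned n = 0;

    if (x == 0)
        return 32; /* all 32 bits are zero */

    /* Each step checks whether the upper part is empty; if so,
     * it records the zeros and shifts the rest into position. */
    if ((x & 0xFFFF0000u) == 0) { n += 16; x <<= 16; }
    if ((x & 0xFF000000u) == 0) { n += 8;  x <<= 8;  }
    if ((x & 0xF0000000u) == 0) { n += 4;  x <<= 4;  }
    if ((x & 0xC0000000u) == 0) { n += 2;  x <<= 2;  }
    if ((x & 0x80000000u) == 0) { n += 1; }

    return n;
}
```

When reading the two listings, compare the instruction count and the number of branches against your handwritten version of the same routine.
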
3. Refer to [ticks.c](https://github.com/sysprog21/rv32emu/blob/master/tests/ticks.c) and [perfcounter](https://github.com/sysprog21/rv32emu/tree/master/tests/perfcounter) to collect statistics on your program's execution. Then, try to optimize the handwritten/generated assembly. You shall read the [RISC-V Assembly Programmer's Manual](https://github.com/riscv/riscv-asm-manual/blob/master/riscv-asm.md) carefully.
* :warning: For now, the metric we care about is the cycle count read from the `cycle` CSR.
* Can you improve the assembly code generated by gcc with optimizations? Or, can you write even faster/smaller programs in RISC-V assembly?
* You may drop some function calls and apply techniques such as [loop unrolling](https://en.wikipedia.org/wiki/Loop_unrolling) and [peephole optimization](http://homepage.cs.uiowa.edu/~dwjones/compiler/notes/38.shtml).
> Quote from [RISC-V Instruction Set Manual](https://github.com/riscv/riscv-isa-manual): The RDCYCLE pseudo-instruction reads the low XLEN bits of the cycle CSR which holds a count of the number of clock cycles executed by the processor on which the hardware thread is running from an arbitrary start time in the past. RDCYCLEH is an RV32I-only instruction that reads bits 63–32 of the same cycle counter. The underlying 64-bit counter should never overflow in practice.
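
The quoted passage implies that on RV32 the 64-bit counter has to be read in two 32-bit halves, and the low half may wrap between the two reads. A common way to handle this is to read high, low, then high again, and retry if the high half changed. The following is a minimal sketch of that pattern, not a copy of `ticks.c`. The names `read_cycle_lo`, `read_cycle_hi`, and `read_cycles` are hypothetical. The non-RISC-V branch is a fake counter added only so the sketch can be compiled and tried on a host machine. Depending on your toolchain version, the CSR instructions may require `-march=rv32i_zicsr` instead of `-march=rv32i`.

```c
#include <stdint.h>

#if defined(__riscv) && (__riscv_xlen == 32)
/* On the target: read the two halves of the cycle CSR. */
static inline uint32_t read_cycle_lo(void)
{
    uint32_t v;
    __asm__ volatile("rdcycle %0" : "=r"(v));
    return v;
}

static inline uint32_t read_cycle_hi(void)
{
    uint32_t v;
    __asm__ volatile("rdcycleh %0" : "=r"(v));
    return v;
}
#else
/* Host-only stand-in (an assumption for illustration): a fake
 * counter that starts just below the 32-bit wrap and advances
 * by 8 on every low-half read, so the retry path is exercised. */
static uint64_t fake_counter = 0xFFFFFFF0u;

static inline uint32_t read_cycle_lo(void)
{
    fake_counter += 8;
    return (uint32_t) fake_counter;
}

static inline uint32_t read_cycle_hi(void)
{
    return (uint32_t) (fake_counter >> 32);
}
#endif

/* Read a consistent 64-bit cycle count from two 32-bit halves. */
static uint64_t read_cycles(void)
{
    uint32_t hi, lo, hi_again;

    do {
        hi = read_cycle_hi();
        lo = read_cycle_lo();
        hi_again = read_cycle_hi();
    } while (hi != hi_again); /* low half wrapped in between: retry */

    return ((uint64_t) hi << 32) | lo;
}
```

To measure a region, call `read_cycles()` before and after it and subtract the two values. Running the measurement more than once helps you judge how stable the numbers are.
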
4. Write down your thoughts and progress in [HackMD notes](https://hackmd.io/s/features).
* [Example page](https://hackmd.io/@wIVnCcUaTouAktrkMVLEMA/SJEP_amvK)
> :warning: Do not modify this note.
* Insert your HackMD notes and programs in the following table.
* Of course, you MUST write in English.
5. BONUS: Find the bugs inside [rv32emu](https://github.com/sysprog21/rv32emu) and send pull requests to improve it!
## Fill in the table for your homework
> :warning: Be aware of spaces. Separate each item with ==` | `==
| Formal given name | Descriptions | HackMD note |
| ----------------- | ------------ | ------------ |
| Sample0 | Monotonic Array | [Homework2](https://hackmd.io/@kdnvt/arch-2022-hw2) |
| Sample1 | Move zeroes | [Homework2](https://hackmd.io/@wanghanchi/S1q0aBHQj) |
| Sample2 | Sort Colors | [Homework2](https://hackmd.io/@wIVnCcUaTouAktrkMVLEMA/SJEP_amvK) |
| Sample3 | Length of Last Word | [Homework2](https://hackmd.io/@Wc9buHGHRaudUz3D5Dp_FA/By0vP2C8t) |
| Sample4 | Single number | [Homework2](https://hackmd.io/@WeiCheng14159/S1H6X_VOw) |
| 黃柏叡 | Palindrome detection with counting leading zeros | [Homework2](https://hackmd.io/@coding-ray/2023-ca-hw-2) |
| 施宇庭 | Logarithm of bfloat16 numbers | [Homework2](https://hackmd.io/@yutingshih/arch2023-homework2) |
| 洪胤勛 | Reducing memory usage with bfloat and bfloat multiplication | [Homework2](https://hackmd.io/@KXkA4u0LQuyNTwOorDw2RA/SkLfc-fGT) |
| 陳彥佑 | Implement log base power of 2 with CLZ | [Homework2](https://hackmd.io/@yAB_kQtlST6mqZV2uCHm3Q/rkhWQsiZp) |
| 戴鈞彥 | Find Leftmost 0-byte using CLZ | [Homework2](https://hackmd.io/@ranvd/computer-arch-hw2) |
| 廖泓博 | Calculate the Hamming Distance using Counting Leading Zeros | [Homework2](https://hackmd.io/@kc71486/computer_architecture_hw2) |
| 陳川曜 | Find First String of 1-bits of a Given Length by CLZ | [Homework2](https://hackmd.io/@cychen/computer_architecture_hw2) |
| 吳堉銨 | Find First String of 1-bits of a Given Length by CLZ | [Homework2](https://hackmd.io/@c3WNnG7RRK2J17ifSiezZA/BycALdrfp) |
| 林柏全 | Data encryption using CLZ | [Homework2](https://hackmd.io/@c3qLIGuDRtWykAmg5L50Ww/HyTcqcCWa) |
| 張澤家 | Get sine value without floating point multiplication support | [Homework2](https://hackmd.io/@NeedSleep/Hy0OdyWz6) |
| 鄭吉廷 | A Novel Approach to IP Routing: CLZ for Prefix Match | [Homework2](https://hackmd.io/@2ytPc3wiSXuOhkj2eWyplA/rydxljGMa) |
| 林子勝 | Shell sort with FP32 in BF16 format | [Homework2](https://hackmd.io/@r9mdQxj3TwqEuQFZ41LNsw/BJTdUszfT) |
| 盧俊銘 | Implementation of multiplication overflow prediction for unsigned integers using CLZ | [Homework2](https://hackmd.io/@jimmylu0303/assignment2) |
| 張偉治 | Matrix multiplication with floating point addition and multiplication | [Homework2](https://hackmd.io/@nfUUgsYRTGy81y5d9AYOyg/Sy1UeY7zp) |
| 侯廷翰 | Find the position of MSB by CLZ | [Homework2](https://hackmd.io/@M1Il4baLQwe1hoqHMQez_g/B1LkhCVGa) |
| 林以薰 | Calculate the Hamming Distance using Counting Leading Zeros | [Homework2](https://hackmd.io/@scones525/SJ2gJgpW6) |
| 洪佑杭 | Find Leftmost 0-byte using CLZ | [Homework2](https://hackmd.io/@hungyuhang/risc-v-hw2) |
| 邱德昌 | Improve conversion of FP32 to BF16 and count the number of ones in binary representation | [Homework2](https://hackmd.io/cawa3-fBTR6NiBz7ykTUbQ?view) |
| 李冠澄 | Matrix multiplication with floating point addition and multiplication and improvement | [Homework2](https://hackmd.io/@j2MyJwXxSyCtQi-KVu4CKQ/B1IH_A2Za) |
| 陳金諄 | Implement priority encoder using CLZ | [Homework2](https://hackmd.io/@david96514/rJgCn1NGT) |
| 鄭朝駿 | Matrix multiplication using bfloat16 | [Homework2](https://hackmd.io/@J7GZnFx3Qe-HPvOgHWEQsg/Hym2DrGM6) |
| 魏彥庭 | Implement priority encoder using CLZ | [Homework2](https://hackmd.io/@Terry7Wei7/homework2/edit) |
| 謝維倫 | Multiplication overflow prediction for unsigned int using CLZ | [Homework2](https://hackmd.io/@VCNgJgo3RCyrEhvI9NKLUQ/rkTiCzEGp) |
| 林允顥 | Implement unsigned int mul by count leading zero | [Homework2](https://hackmd.io/@fewletter/riscvtoolchain) |
| 高紹捷 | Reducing memory usage with bfloat and bfloat multiplication | [Homework2](https://hackmd.io/@F3cNkb4bSWKg00J7O-_y8w/B16yJCif6) |
| 陳浩文 | Find Leftmost 0-byte using CLZ | [Homework2](https://hackmd.io/@K7_Ko_hwTnC73N7fg9U_xQ/H1zux4ufT) |
| 簡志耀 | Calculate the Hamming Distance using Counting Leading Zeros | [Homework2](https://hackmd.io/@RayqVUcSSOiX_pVJKh0DLg/r1Ev8NEMp) |
| 王豊惟 | Implement palindrome detection and using CLZ | [Homework2](https://hackmd.io/@mlFpoYoxSjevnbdVLdrijw/ByRlqt7f6) |
| 周育晨 | Implement palindrome detection and using CLZ | [Homework2](https://hackmd.io/@8G9q08Y6Tnq9OJMgzCV1TA/SJFksBEfa) |
| 劉智恩 | Implement transformation from integer to float by clz | [Homework2](https://hackmd.io/@chihenliu/ByLOiP9zp) |
| 蔡忠翰 | Implement function to find maximum absolute value in bfloat16 array for quantization | [Homework2](https://hackmd.io/@n-g2ouCxQbmy_er1MvIKhQ/rySv1tDGa) |
| 鄭博文 | Multiplication overflow prediction for unsigned int using CLZ | [Homework2](https://hackmd.io/@PWCheng/CAHW02) |
| 許唯萱 | Using Hamming Codes to implement number of "1" bits | [Homework2](https://hackmd.io/@weishiuan/rynG9KHMT) |
| 施柏安 | Implement and analyze BF16 multiplication | [Homework2](https://hackmd.io/@brianPA/r1C-9UIbT) |
| 黃定山 | Find the square root of an integer number | [Homework2](https://hackmd.io/@ShanHuang/H1oTu7Yza) |
| 洪碩星 | Image scaling with Bilinear interpolation | [Homework2](https://hackmd.io/@shhung/HkflviQGp) |
| 楊宇翔 | Find first set using CLZ | [Homework2](https://hackmd.io/@5HlvW0J1QTmqg6azf_URYQ/SyyC5iiz6) |
| 林勁羽 | Calculate the Hamming Distance using CLZ | [Homework2](https://hackmd.io/@edenlin/GNU_Toolchain_HammingDistanceByCLZ) |
| 魏泳禎 | Implementation of multiplication overflow prediction for unsigned integers using CLZ | [Homework2](https://hackmd.io/@qOvjgDvTQrGZGAlv5oHqsA/SJ5zn9zGa) |
| 彭煜博 | Bitwise AND of Numbers Range | [Homework2](https://hackmd.io/@normal/riscv-toolchain) |
| 陳燦仁 | Implement unsigned int mul by count leading zero | [Homework2](https://hackmd.io/@TRChen/ryy_JcRW6) |
| 丁竟烽 | Generate bitmask by CLZ | [Homework2](https://hackmd.io/@Paintako/CA_HW02) |
| 李亮穎 | Implementation of Bit-Plane Slicing with CLZ | [Homework2](https://hackmd.io/@LLL00/Computer_Architecture_HW2) |
| 顏伯丞 | Implement Binarization by count leading zero | [Homework2](https://hackmd.io/@QtzWvn_wQCicLQ65E_OREQ/risc-v-hw2) |
| 李承泰 | Sum of Leading Zeros in Linked List by CLZ | [Homework2](https://hackmd.io/y5Fq1G-ES228t76RJx-3Jw?view) |
| 陸品潔 | Calculate the Hamming Distance using Counting Leading Zeros | [Homework2](https://hackmd.io/@GliAmanti/H1XlUJE-p) |
| 江冠霆 | Data encryption using CLZ | [Homework2](https://hackmd.io/@VBHMCAcXSo2j5UzcTBAQZQ/Hy83EpqM6) |
| 李晨瑞 | Implement Binarization by count leading zero | [Homework2](https://hackmd.io/@terry23304/CA_HW02) |
| 蕭明祥 | Implement Binarization by count leading zero | [Homework2](https://hackmd.io/@TXKjP2SFR8O-nS9pbDQAjw/SydvLYwGp) |
| 鍾沅熹 | Implement Variable Byte Compression By Counting Leading Zeros | [Homework2](https://hackmd.io/@freshLiver/2023-arch-hw2) |
| 范紘維 | Indexing of hierarchical data structures by CLZ | [Homework2](https://hackmd.io/@gV8IONkMS_a6aHt20QNuAg/r1u4OqLM6) |
| 李熙堃 | Bits Compression Using CLZ | [Homework2](https://hackmd.io/@JoshuaLee0321/Assignment2-GNU-Toolchain) |
| 曾鼎棊 | Output different number with same binary format by using clz | [Homework2](https://hackmd.io/@NdB0NahsRSSnwZagYYcL6g/ByUyEA2G6) |
| 張正德 | Image scaling with Bilinear interpolation by float32 multiplication | [Homework2](https://hackmd.io/@gofzKoaiTI6mFzp4FTuenw/H1MYtY-MT) |
| 林昊霆 | Implementing FP32 Operations by Applying FP32 to Bfloat16 Conversion Algorithm | [Homework2](https://hackmd.io/@TBL/B1b69Zaz6) |
| 林晉宇 | Convert RGB image into grayscale by using RV32I ISA | [Homework2](https://hackmd.io/@linyu0425/S1_d8FWzT) |
| 許卜元 | Implement Binarization by count leading zero | ? |
| 倪英智 | Multiplication Overflow Prediction | [Homework2](https://hackmd.io/@NIYINGCHIH/ByKY55cba) |
| 陳冠元 | Approximating a bfloat number using binary search | [Homework2](https://hackmd.io/@K1NCVjKnTCmNaikFb4gt-A/ry-p1dAzT) |
| 黃于睿 | Shell sort with FP32 in BF16 format | [Homework2](https://hackmd.io/@DarrenHuang0411/SyfITn0b6) |
| 唐飴苹 | Implementation of Bit-Plane Slicing with CLZ | [Homework2](https://hackmd.io/@O6C2C3zQQBanDM55QRZ7DQ/Lab2_GNU_Toolchain) |
| 劉庭聿 | Calculate the Hamming Distance using Counting Leading Zeros | ? |