# PART A: Mathematical Functions

In this section, I focused on implementing fundamental matrix operations commonly used in neural networks. Specifically, I developed functions for dot product, matrix multiplication, element-wise ReLU, and ArgMax.

All matrices were represented as 1D vectors in row-major order. This representation required careful attention to memory access patterns, particularly when using strides to access elements that are non-contiguous in memory.

---

## Challenges and Solutions

### 1. Input Handling: Flattening the 2D Matrix

**Challenge:** Transforming a 2D matrix into a 1D vector was the first hurdle. Working with MNIST-like data, I had to ensure that the flattened representation accurately followed row-major order.

**Solution:** Through precise index calculations, I ensured that each row of the 2D matrix was concatenated correctly. This laid a solid foundation for the subsequent matrix operations.

---

### 2. ReLU Implementation

**Implementation:** ReLU was implemented by looping through the input array, using the index `i` to address each value individually. Negative values were replaced with zero.

**Challenge:** The VENUS system call convention uses `a0` as the control register. During static-data testing, using `a0` to hold the output caused conflicts when performing system calls.

**Solution:** To resolve this, I temporarily stored the output in `a1` before making system calls, ensuring that `a0` remained free to control the call.

---

### 3. ArgMax Implementation

**Implementation:** ArgMax was built around a "max register" pattern. The algorithm iterated through the array, comparing each element and updating the "reg_kingdom" (max-value register) whenever a new maximum was found. Simultaneously, it tracked the index of that maximum value.

---

### 4. Dot Product Implementation

**Thought Process:**

- **Version 1:** The initial implementation focused on functionality, assuming both arrays had the same stride.
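To make the Version 1 behavior concrete, here is a Python sketch of that first design (the function and parameter names are illustrative, not taken from the assembly source); both arrays share a single stride:

```python
def dot_v1(v1, v2, size, stride):
    """Version 1 sketch: one stride applied to both arrays.

    Mirrors the assembly loop: walk `size` elements, stepping by
    `stride` through each row-major, 1D-flattened array.
    """
    acc = 0
    for i in range(size):
        acc += v1[i * stride] * v2[i * stride]
    return acc
```

For example, `dot_v1([1, 2, 3], [4, 5, 6], 3, 1)` computes `1*4 + 2*5 + 3*6 = 32`.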
  In Version 1, the sizes `size1` and `size2` were independently configurable.

  ⚠️ **Key Issues:**
  - The dot product's role in `matmul` assumes `stride1` = 1 and `stride2` = `matrixB_col`.
  - The sizes of both arrays must match by design.

- **Version 2 (Improvements):** Added the ability to handle different strides (`stride1` and `stride2`).

  🔑 **Key Insights:**
  - For `matmul`, `stride1` is always 1, while `stride2` typically exceeds 1 (e.g., the number of columns in matrix B).

  Additionally, I reduced register pressure by combining `size1` and `size2` into a single size register, since the two sizes must match anyway.

**Future Improvements:** Develop a version where `stride1` and `stride2` are both fully independently configurable, enhancing the function's general utility.

---

### 5. Matrix Multiplication (MatMul) Implementation

**Implementation:** Matrix multiplication was achieved by computing the dot product between each row of `M1` and each column of `M2`. This process repeats for every row/column pair, so the total number of dot products equals `M1_row_number` × `M2_col_number`, one per entry of the result matrix.

---

## Key Takeaways from PART A

### 1. Static Memory Allocation for Testing

Throughout PART A, I learned to define static memory in the data segment for testing each function. This approach allowed easy verification of function outputs without worrying about dynamic-memory issues during initial development.

---

### 2. Debugging with VENUS Web Simulator

Debugging assembly can be daunting, but the **VENUS Web Simulator** proved invaluable. By stepping through each instruction and observing register values, I could pinpoint errors. Setting breakpoints with `ebreak` let me halt execution at critical points, making the debugging process more manageable and systematic.

---

### 3. Mastery of Function Calling Conventions

Implementing these functions solidified my understanding of **RISC-V calling conventions**. Specifically:

- **Caller-saved registers** (the `a0`–`a7` and `t0`–`t6` families) were used for temporary values.
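Circling back to the `matmul` section above, the row-times-column pattern built on the Version 2 strided dot product can be sketched in Python (names and signatures are illustrative, not from the assembly source):

```python
def dot_v2(v1, start1, v2, start2, size, stride1, stride2):
    # Version 2 sketch: independent strides. In matmul, stride1 = 1
    # (walk along a row of M1) and stride2 = M2's column count
    # (walk down a column of M2). Both arrays are row-major 1D.
    acc = 0
    for i in range(size):
        acc += v1[start1 + i * stride1] * v2[start2 + i * stride2]
    return acc

def matmul(m1, rows1, cols1, m2, rows2, cols2):
    # One dot product per output entry: rows1 * cols2 in total.
    assert cols1 == rows2, "inner dimensions must match"
    out = [0] * (rows1 * cols2)
    for r in range(rows1):
        for c in range(cols2):
            out[r * cols2 + c] = dot_v2(m1, r * cols1, m2, c,
                                        cols1, 1, cols2)
    return out
```

For a 2×3 matrix times a 3×2 matrix, `matmul([1,2,3,4,5,6], 2, 3, [7,8,9,10,11,12], 3, 2)` yields the flattened 2×2 result `[58, 64, 139, 154]`.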
- **Callee-saved registers** (the `s0`–`s11` family, plus `ra`, which any non-leaf function must preserve) ensured that critical values survived across function calls.

Through practice, I became adept at crafting robust **prologue** and **epilogue** sections for each function, ensuring that register state was properly saved and restored.

---

## Final Reflection on PART A

This section was a deep dive into low-level programming and assembly concepts. It challenged my understanding of memory management, register handling, and function calling conventions. By the end, I felt more confident writing efficient, bug-free assembly code, a foundational skill for building more complex systems in the future.
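As a closing recap of the element-wise kernels from this part, the ReLU and ArgMax loops described earlier translate to Python roughly as follows (an in-place ReLU, mirroring the assembly's overwrite of the input array; names are illustrative):

```python
def relu(arr):
    # In-place ReLU: replace each negative element with zero,
    # mirroring the assembly loop over the flattened array.
    for i in range(len(arr)):
        if arr[i] < 0:
            arr[i] = 0
    return arr

def argmax(arr):
    # The "reg_kingdom" pattern: track the running maximum and its
    # index; ties keep the earliest index seen.
    best_idx = 0
    best_val = arr[0]
    for i in range(1, len(arr)):
        if arr[i] > best_val:
            best_val = arr[i]
            best_idx = i
    return best_idx
```

For example, `relu([-1, 2, -3, 4])` yields `[0, 2, 0, 4]`, and `argmax([3, 7, 7, 1])` yields `1` (the first occurrence of the maximum).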