Quiz2 of Computer Architecture (2025 Fall)

--- tags: computer-arch --- # Quiz2 of Computer Architecture (2025 Fall) > Solutions ## Problem `A` You are tasked with designing a RISC-V assembly program that solves the classic Tower of Hanoi puzzle using an iterative method. The key insight lies in Gray codes, which trace a Hamiltonian path on an $n$-dimensional cube by interpreting each bit as a coordinate. The reflected Gray code is the most common variant, constructed as follows: * 1-bit Gray code: `0`, `1`. * 2-bit Gray code: Mirror the 1-bit sequence and prefix with `0` and `1`, producing `00, 01, 11, 10`. * n-bit Gray code: Recursively mirror the $(n-1)$-bit sequence, prefix with `0` and `1`, then concatenate. A key property of Gray codes is that consecutive entries differ by exactly one bit flip. Determining the flipped bit * Even number of `1`s: flip the last bit. * Odd number of `1`s: flip the bit immediately to the left of the rightmost `1`. Gray code generation and the Tower of Hanoi share the same recursive nature: 1. Recursive structure * Tower of Hanoi: solve for $n-1$ disks, move the largest disk, then solve again for $n-1$ disks. * Gray code: mirror the $(n-1)$-bit sequence, prefix with `0` and `1`, then concatenate. 2. Pattern formation Both processes generate a self-similar sequence, often described by the recursive “ABACABADABACABA” pattern. 3. Iterative move generation via Gray code * Disk labeling: assign letters to disks, with the smallest as `A`, the next as `B`, and so on. * Move determination: each bit flip in the Gray code indicates moving the corresponding disk. * Special handling for disk `A`: always move the smallest disk in a fixed cyclic direction (e.g., Peg 1 → Peg 2 → Peg 3 → Peg 1 → …). This scheme ensures that exactly one disk moves per step, and that no larger disk is ever placed on a smaller one, preserving the rules of the Tower of Hanoi. Your task is to complete the iterative RV32I implementation so that it executes correctly in the Ripes simulator. ```c .text .globl main main: addi x2, x2, -32 sw x8, 0(x2) sw x9, 4(x2) sw x18, 8(x2) sw x19, 12(x2) sw x20, 16(x2) li x5, 0x15 sw x5, 20(x2) sw x5, 24(x2) sw x5, 28(x2) # Fix disk positions (BLANK 1-3: neutralize x5 effect) # BLANK 1: Fix position at x2+20 sw x0, 20(x2) # BLANK 2: Fix position at x2+24 sw x0, 24(x2) # BLANK 3: Fix position at x2+28 sw x0, 28(x2) addi x8, x0, 1 game_loop: # BLANK 4: Check loop termination (2^3 moves) addi x5, x0, 8 beq x8, x5, finish_game # Gray code formula: gray(n) = n XOR (n >> k) # BLANK 5: What is k for Gray code? srli x5, x8, 1 # BLANK 6: Complete Gray(n) calculation xor x6, x8, x5 # BLANK 7-8: Calculate previous value and its shift addi x7, x8, -1 srli x28, x7, 1 # BLANK 9: Generate Gray(n-1) xor x7, x7, x28 # BLANK 10: Which bits changed? xor x5, x6, x7 # Initialize disk number addi x9, x0, 0 # BLANK 11: Mask for testing LSB andi x6, x5, 1 # BLANK 12: Branch if disk 0 moves bne x6, x0, disk_found # BLANK 13: Set disk 1 addi x9, x0, 1 # BLANK 14: Test second bit with proper mask andi x6, x5, 2 bne x6, x0, disk_found # BLANK 15: Last disk number addi x9, x0, 2 disk_found: # BLANK 16: Check impossible pattern (multiple bits) andi x30, x5, 5 addi x31, x0, 5 beq x30, x31, pattern_match jal x0, continue_move pattern_match: continue_move: # BLANK 17: Word-align disk index (multiply by what?) slli x5, x9, 2 # BLANK 18: Base offset for disk array addi x5, x5, 20 add x5, x2, x5 lw x18, 0(x5) bne x9, x0, handle_large # BLANK 19: Small disk moves by how many positions? addi x19, x18, 2 # BLANK 20: Number of pegs addi x6, x0, 3 blt x19, x6, display_move sub x19, x19, x6 jal x0, display_move handle_large: # BLANK 21: Load reference disk position lw x6, 20(x2) # BLANK 22: Sum of all peg indices (0+1+2) addi x19, x0, 3 sub x19, x19, x18 sub x19, x19, x6 display_move: la x20, obdata add x5, x20, x18 # BLANK 23: Load byte from obfuscated data lbu x11, 0(x5) # BLANK 24: Decode constant (0x6F) li x6, 0x6F xor x11, x11, x6 # BLANK 25: Final offset adjustment addi x11, x11, -0x12 add x7, x20, x19 lbu x12, 0(x7) xor x12, x12, x6 addi x12, x12, -0x12 la x10, str1 addi x17, x0, 4 ecall addi x10, x9, 1 addi x17, x0, 1 ecall la x10, str2 addi x17, x0, 4 ecall addi x10, x11, 0 addi x17, x0, 11 ecall la x10, str3 addi x17, x0, 4 ecall addi x10, x12, 0 addi x17, x0, 11 ecall addi x10, x0, 10 addi x17, x0, 11 ecall # BLANK 26: Calculate storage offset slli x5, x9, 2 addi x5, x5, 20 add x5, x2, x5 # BLANK 27: Update disk position sw x19, 0(x5) # BLANK 28-29: Increment counter and loop addi x8, x8, 1 jal x0, game_loop finish_game: lw x8, 0(x2) lw x9, 4(x2) lw x18, 8(x2) lw x19, 12(x2) lw x20, 16(x2) addi x2, x2, 32 addi x17, x0, 10 ecall .data obdata: .byte 0x3c, 0x3b, 0x3a str1: .asciz "Move Disk " str2: .asciz " from " str3: .asciz " to " ``` A01: `sw x0, 20(x2)` A02: `sw x0, 24(x2)` A03: `sw x0, 28(x2)` A04: `8` A05: `x8` A06: `x5` A07: `1` A08: `xor` A09: `x8` A10: `x5` A11: `xor` A12: `-1` A13: `x7` A14: `x7` A15: `x28` A16: `x6` A17: `x7` A18: `1` A19: `bne` A20: `x0` A21: `addi` A22: `2` A23: `2` A24: `5` A25: `2` A26: `20` A27: `2` A28: `3` A29: `20(x2)` A30: `3` A31: `lbu` A32: `0(x5)` A33: `0x6F` A34: `-0x12` A35: `2` A36: `sw` A37: `0(x5)` A38: `addi` A39: `1` A40: `jal` A41: `game_loop` A42: (意思相近就給分) The program needs to display three peg names: - Peg A: ASCII code 65 (0x41) - Peg B: ASCII code 66 (0x42) - Peg C: ASCII code 67 (0x43) Encoding Process ``` Step 1: Choose XOR key = 0x6F Step 2: Add offset = 0x12 Step 3: Apply encoding 'A' encoding: Original: 0x41 (65) Add offset: 0x41 + 0x12 = 0x53 XOR: 0x53 XOR 0x6F = 0x3C 'B' encoding: Original: 0x42 (66) Add offset: 0x42 + 0x12 = 0x54 XOR: 0x54 XOR 0x6F = 0x3B 'C' encoding: Original: 0x43 (67) Add offset: 0x43 + 0x12 = 0x55 XOR: 0x55 XOR 0x6F = 0x3A ``` Therefore obdata stores: `.byte 0x3c, 0x3b, 0x3a` The decoding steps in the program: ```asm # 1. Load encoded data la x20, obdata # x20 points to encoded data add x5, x20, x18 # x18 is position index (0/1/2) lbu x11, 0(x5) # Load encoded byte # 2. XOR decoding li x6, 0x6F # Load XOR key xor x11, x11, x6 # Apply XOR # 3. Subtract offset addi x11, x11, -0x12 # Restore ASCII value ``` --- ## Problem `B` Consider a 16-bit fixed-point format with 1 sign bit, 5 integer bits, and 10 fraction bits. The fraction field encodes ($2^{-1},2^{-2},\dots,2^{-10}$). For example, `0b0 10010 1010000000` encodes (18+0.625=18.625). Also consider a 16-bit floating-point format that follows IEEE-754 half precision conventions (including subnormals, NaNs, etc.). It has 1 sign bit, a 5-bit exponent with **bias (=15)**, and a 10-bit fraction. Part 1 * Write (-18.625) in hexadecimal using the 16-bit **fixed-point** format. **B01** * Write (-18.625) in hexadecimal using the 16-bit **floating-point** format. **B02** * How many numbers in the range (`[8,16)`) (including 8, excluding 16) are representable by the fixed-point format? **B03** --- * B01: `0xCA00` Reasoning: sign=1, integer=18=`10010`, fraction=0.625=`.101` → `1010000000` (10 bits). Bit pattern: `1 10010 1010000000` → nibbles: `1100 1010 1000 0000`? → `0xCA80`. ==FIXME== * B02: `0xCCA8` Reasoning: (18.625 = 1.0010101_2 \times 2^{4}). sign=1, exponent=(4+15=19=0b10011), fraction (drop the leading 1, pad to 10 bits)=`0010101000`. Bit pattern: `1 10011 0010101000` → hex `0xCCA8`. * B03: $2^{13}$ or 8192 Step size is (2^{-10}). The width of ([8,16)) is ($8=2^{3}$). The count is $$ \frac{16-8}{2^{-10}} = 8 \times 2^{10} = 2^{3} \times 2^{10} = 2^{13}. $$ Complete the following RV32I program. Using only integer instructions, it should pack the two encodings into 16-bit values and store them in memory at the labels `fx_out` and `fp_out`. The expected output in the Ripes simulator is: ``` FX1: 0xca80 FP1: 0xc92a FX2: 0x4aa0 FP2: 0x4ca8 --- Bit fields for FX1 --- Sign = 1, Int = 268435522, Frac = 640 ``` ```c # Fixed-Point and Floating-Point Encoding - Answer Key # Fixed-point: 1 sign, 5 integer, 10 fraction bits # Half-precision: 1 sign, 5 exponent (bias=15), 10 mantissa bits # # IMPORTANT: Using only x-register names, no ABI names # Only allowed pseudo-instructions: li, la .text .globl _start _start: # Part 1: Encode -10.25 in fixed-point # -10.25 = -(10 + 0.25) # Binary: 1 01010 0100000000 = 0xA900 lui x1, 0xA9000 srli x1, x1, 16 # Part 2: Encode -10.328125 in half-precision # -10.328125 = -1.291015625 × 2^3 # Binary: 1 10010 0100101010 = 0xC92A lui x4, 0xC92A0 srli x4, x4, 16 # Part 3: Encode 18.625 in both formats # Fixed-point: 0 10010 1010000000 = 0x4A80 lui x6, 0x4A800 srli x6, x6, 16 # Half-precision: 0 10011 0010101000 = 0x4CA8 lui x7, 0x4CA80 srli x7, x7, 16 # Part 4: Store results for verification la x15, results sh x1, 0(x15) sh x4, 2(x15) sh x6, 4(x15) sh x7, 6(x15) # Part 5: Bit field extraction # Extract sign bit from x1 (fixed-point -10.25) srli x9, x1, 15 andi x9, x9, 1 # Extract integer part (bits [14:10]) srli x10, x1, 10 andi x10, x10, 0x1F # Extract fraction (bits [9:0]) andi x11, x1, 0x3FF # Part 6: Set return values for verification # Place key results in specific registers for checking # x1 = 0xA900, x4 = 0xC92A, x6 = 0x4A80, x7 = 0x4CA8 # x9 = 1 (sign), x10 = 10 (integer), x11 = 256 (fraction) # Set exit code to indicate success li x10, 0 # Exit code 0 = success # Exit properly li x17, 10 # exit syscall ecall .data .align 4 results: .half 0, 0, 0, 0 # Four 16-bit results # Validation checklist: # After running in Ripes, check: # 1. Register x1 = 0x0000A900 (fixed-point -10.25) # 2. Register x4 = 0x0000C92A (half-precision -10.328125) # 3. Register x6 = 0x00004A80 (fixed-point 18.625) # 4. Register x7 = 0x00004CA8 (half-precision 18.625) # 5. Register x9 = 0x00000001 (sign bit) # 6. Register x10 = 0x00000000 (exit code, was 10 for integer) # 7. Register x11 = 0x00000100 (fraction = 256) # 8. Memory at 'results': A900, C92A, 4A80, 4CA8 ``` B04: `0xA9000` or equivalent B05: `16` B06: `0xC92A0` or equivalent B07: `srli` B08: `0x4A800` or equivalent B09: `x6` B10: `16` B11: `0x4CA80` or equivalent B12: `x1` B13: `2` B14: `4` B15: `x7` B16: `srli` B17: `15` B18: `10` B19: `0x1F` B20: `0x3FF` For the fixed-point format used above, compute the absolute error and relative error compared to the true value −18.625. Briefly explain why such an error arises in fixed-point representation. B21: `0` B22: `0` B023: 意思相近就給分 Verify what -18.625 should encode to: - Integer part: 18 = 10010 (5 bits) ✓ - Fractional part: 0.625 = 5/8 Converting 0.625 to the 10-bit fraction field: - 0.625 × 1024 = 640 - 640 in binary = 1010000000 (10 bits) Full encoding: - Sign: 1 (negative) - Integer: 10010 - Fraction: 1010000000 - Combined: 1 10010 1010000000 = 0xCA80 Verification of represented value: - Integer: 18 - Fraction: 640/1024 = 0.625 - Total: -(18 + 0.625) = -18.625 True value: -18.625Represented value: -18.625 Absolute Error: |Represented value - True value| = |-18.625 - (-18.625)| = 0 Relative Error: |Represented value - True value| / |True value| = 0 / 18.625 = 0 Although -18.625 has no representation error, fixed-point formats generally introduce errors due to: 1. Quantization Error: Values must be rounded to the nearest multiple of 2^(-10). For example, -18.627 would need to be rounded to either $-18.626953125$ (642/1024) or $-18.62793125$ (643/1024). 2. Finite Precision: Only values that are exact multiples of 1/1024 in the fractional part can be represented precisely. Most decimal values (like 0.1, 0.2, 0.3) cannot be represented exactly. 3. Rounding Effects: When the true value falls between two representable values, rounding to the nearest introduces error. Example with error: If we wanted to represent -18.7: - True value: -18.7 - Fraction needed: 0.7 × 1024 = 716.8 - Must round to 717/1024 ≈ 0.70019531 - Represented value: -18.70019531 - Absolute error: 0.00019531 - Relative error: 0.00019531/18.7 ≈ 0.001%