Quiz3 of Computer Architecture (2022 Fall)

--- tags: computer-arch --- # Quiz3 of Computer Architecture (2022 Fall) :::info :information_source: General Information * You are allowed to read **[lecture materials](http://wiki.csie.ncku.edu.tw/arch/schedule)**. * That is, an open book exam. * We are using the honor system during this quiz, and would like you to accept the following: 1. You will not share the quiz with anyone. 2. You will not talk to anyone about the information on the quiz until the answers are made public. * Each answer has 4 points. * You must answer in valid numeric/C99/assembly representation and/or English alphabet except your formal name. * Message **[the instructor via Facebook](https://www.facebook.com/JservFans)** if you found something wrong. * :timer_clock: 09:10 ~ 10:15AM on Oct 25, 2022 ::: ## Problem `A` You may use the [strlen](https://man7.org/linux/man-pages/man3/strlen.3.html) function to count the length of a string in the C programming language. For this function to be implemented in RISC-V, please fill out the following. Pseduoinstructions MUST NOT be used. ```c addi s1, x0, 1 jal strlen add a0, x0, s1 ecall strlen: A01 # Fill your answer sw s1, 0(sp) mv a1, x0 la s1, str loop: A02 # Fill your answer A03 # Fill your answer addi a1, a1, 1 addi s1, s1, 1 j loop epilogue: lw s1, 0(sp) A04 # Fill your answer jr ra ``` > * A01 = ? > * A02 = ? > * A03 = ? > * A04 = ? --- ## Problem `B` Write a RISC-V function that returns `0` if the 32-bit float input is an infinite value and a non-zero value otherwise. As usual, `a0` will be used to hold the input and output. Pseduoinstructions MUST NOT be used. ```c is_not_inf: B01 # Fill your answer addi a1, a1, -1 and a0, a0, a1 B02 # Fill your answer B03 # Fill your answer ret # Return instruction ``` > * B01 = ? > * B02 = ? > * B03 = ? --- ## Problem `C` The following circuit has a delay of 3 ns for NOT gates, 7 ns for AND and OR gates, and 8 ns for the "Black Box" logic component. The registers have setup times of 5 ns and a clk-to-q latency of 6ns. ![](https://hackmd.io/_uploads/SkNkEDV4j.png) 1. What is the maximum allowable hold time of the registers? __ C01 __ ns > * C01 = ? 2. What is the minimum acceptable clock period for this circuit? __ C02 __ ns > * C02 = ? --- ## Problem `D` The circuit seen below can be made simpler. Please write the circuit's origin boolean expression and then simplify it using the fewest possible two-input logic gates. __ D01 __ ![](https://hackmd.io/_uploads/S1M7BPN4s.png) > * D01 = ? --- ## Problem `E` All logic gates in the circuit below have a delay of 5 ns, RegC has a setup time of 6 ns, and RegA and RegB have setup, hold, and clk-to-q durations of 4 ns. ![](https://hackmd.io/_uploads/rkHtnvV4i.png) 1. What is the maximum allowable hold time for RegC? __ E01 __ ns > * E01 = ? 2. What is the minimum acceptable clock cycle time for this circuit (E02 ns), and clock frequency does it correspond to (E03 MHz)? > * E02 = ? > * E03 = ? --- ## Problem `F` Look at the RISC-V function below, whose start address is in `a0` and length is in a1, which sorts a word array in-place. ```c sort: # prologue mv s0, a0 0x00259493 # p1 addi s1, s1, -4 add s1, s1, s0 mv t0, s0 outer_loop: mv t1, t0 0x00032383 # p2 mv t3, t1 inner_loop: addi t1, t1, 4 lw t4, 0(t1) ble t4, t2, mystery_label mv t2, t4 mv t3, t1 mystery_label: blt t1, s1, inner_loop lw t5, 0(t0) sw t2, 0(t0) sw t5, 0(t3) addi t0, t0, 4 blt t0, s1, outer_loop # epilogue ret ``` 1. Please disassemble machine code marked `#p1` and `#p2` above to RISC-V instructions. (Please use register names, e.g., `s0`, `s1`, etc., NOT `x8`, `x9`, etc.) * `0x00259493 # p1`: __ F01 __ > * F01 = ? * `0x00032383 # p2`: __ F02 __ > * F02 = ? 2. `li x1, 0xDCBAABCD` is also a pseudo instruction. Below it is expanded to 2 normal instructions. Please fill in the blanks and translate each of them to machine code. | Assembly | Machine code | | -------- | -------- | | lui x1, 0x __ F03 __ | 0x __ F04 __ | | addi x1, __ F05 __ , __ F06 __ | 0x __ F07 __ | > * F03 = ? > * F04 = ? > * F05 = ? > * F06 = ? > * F07 = ? --- ## Problem `G` Assume that all input originates from a register and that no hold time laws have been broken. What is the highest frequency at which you can run this circuit's clock such that it operates properly? __ G01 __ Using these variables, formulate your response as a mathematical expression (you may also use `min()`, `max()`, `abs()`, and other simple operations if necessary): * X = XOR delay * N = NOT delay * C = t~clk−to−Q~ * S = t~setup~ * H = t~hold~ ![](https://hackmd.io/_uploads/Skq1oO4Vi.png) > * G01 = ? --- ## Problem `H` The `fabs` instruction in the x86 architecture returns a floating-point number's absolute value. Compared to utilizing branches, this command is quicker. 1. Complete the following C code that mimics this instruction without ternary operators or `if-else` statements. Assume that int and float are of equal size. (The x86 architecture makes use of the IEEE 754). ```c float fabs(float x) { int tmp = *(int *) &x; tmp &= H01 /* Your code here */; return H02 /* Your code here */; } ``` > * H01 = ? > * H02 = ? 2. Can we use the same method to get the absolute value of a single-precision 32-bit [HFP number](https://en.wikipedia.org/wiki/IBM_hexadecimal_floating-point)? __ H03 __ (Yes or No) Why? __ H04 __ > * H03 = ? > * H04 = ? --- ## Problem `J` We wish to develop a new ISA while we work on a new processor for an embedded application. We choose to include only one, the X-type instruction, because we are tired of the several RISC-V instruction types. Let's say we want to include the instructions below: ```c add rd1, rs1, rs2 and rd1, rs1, rs2 lw rd1, offset1 (rs1) sw rs2, offset1 (rs1) addi rd1, rs1, imm1 beq rs1, rs2, offset1 lui rd1, offset1 jal rd1, imm stw rs3, offset1, offset2 (rs1) ``` The new stw instruction stores the contents of rs3 into both `rs1 + offset1` and `rs1 + offset2`. 1. We want to do away with the `funct3` and `funct7` fields and only use an opcode. If we only wish to support the instructions listed above, what is the minimum number of bits the opcode field can be? __ J01 __ bit(s) > * J01 = ? 2. We want to be able to jump up to 64 KiB in either direction with a single instruction. How many bits are necessary to encode an immediate that would allow us to do this? __ J02 __ bit(s) Assume that, just like RV32, the least significant bit is an implicit 0 and is not stored in the instruction. > * J02 = ? 3. Then, we switch to a 32-bit machine, and finalize our instruction format to have 4 bits for each of the immediate fields, 4 bits for each register, and 4 bits for the opcode. Convert the instruction `stw x8, 0, 4 (x5)` into machine code. Leave your answer in binary (don’t forget the prefix!). If a field is not used, fill in the field with `x`s. __ J03 __ > * J03 = ? --- ## Problem `K` A [bloom filter](https://en.wikipedia.org/wiki/Bloom_filter) is a very clever data structure for effectively and probabilistically storing a set. It does two functions: **insert** and **check**. The fundamental principle of checking is to hash your search term many times. Each hash identifies a specific bit that has to be set or checked. So, to verify, you check to see if the bit is set. You do this repeatedly, with each iteration's count of repetitions being included in the hash (so each hash is different). The element does not exist in the bloom filter if not all bits are set. The element PROBABLY occurs in the bloom filter if all bits are set. Similar to that, you just set an element as present when using a bloom filter. Similar to that, you just set all those bits to 1 to indicate that an element is present in a bloom filter. A flexible and portable bloom filter design is what we are aiming for. Therefore, we create the structure below. ```c struct BloomFilter { uint32_t size; /* Size is # of bits, NOT BYTES, in the bloom filter */ uint16_t itercount; uint64_t (*)(void *data, uint16t iter) hash; uint8_t *data; }; ``` 1. On a 32-bit architecture that requires word alignment for 32-bit integers and pointers, what is `sizeof(struct BloomFilter)`? __ K01 __ bytes > * K01 = ? 2. And now we have the `insert` function. For this we need to set the appropriate bit for each iteration. ```c void insert(struct BloomFilter *b, void *element) { uint64_t bitnum; /* which bit we need to set */ for (int i = 0; i < (K02 /* Your code here */); ++i) { bitnum = b->hash(element, (uint16 t) i) % b->size; b->data[bitnum >> 3] |= 1 << (K03 /* Your code here */); } } ``` > * K02 = ? > * K03 = ?

Read more

案例探討: Ring buffer

並行程式設計: Atomics 操作

並行程式設計: Lock-Free Programming

2024 年 Linux 核心設計/實作課程作業 —— ktcp