---
title: "Lab1: RV32I Simulator"
---

Lab1: RV32I Simulator
===
###### tags: `RISC-V`, `Computer Architecture`

[TOC]

## Definition of `printf(const char* fmt, ...)`
:::success
Writes the C string pointed to by format to the standard output (stdout). If format includes format specifiers (subsequences beginning with %), the additional arguments following format are formatted and inserted in the resulting string, replacing their respective specifiers.
:::

## Limitations and Compromises
1. The simulation platform is RV32I without the multiplication/division extension, so there is no divide or remainder instruction and extracting decimal digits becomes too complex. Thus, I didn't implement `%d` or other decimal or floating-point formats such as `%f`, `%e`, `%a` and so on. As for `%p`, because its output would not differ from `%x` here, I skipped its implementation.
2. The conventional `printf` doesn't support binary printing. Thus, I defined a non-standard `%b` specifier myself.

:::info
In the end, I implemented `%s` (string/char*), `%c` (char), `%b` (binary), `%o` (octal), and `%x/%X` (hexadecimal).

| specifier | description                                                | example  |
|:---------:| ---------------------------------------------------------- | -------- |
|    %s     | String of characters ending with a null character.         | asdfghjk |
|    %c     | Single character.                                          | d        |
|    %b     | Unsigned integer in binary representation.                 | 11010    |
|    %o     | Unsigned integer in octal representation.                  | 32       |
|    %x     | Unsigned integer in hexadecimal representation, lower case. | 1a       |
|    %X     | Unsigned integer in hexadecimal representation, upper case. | 1A       |
|    %%     | A % followed by another % character represents a single %. | %        |
|    %a     | **Example of an unsupported specifier.**                   | a        |

(Table 1) Supported Specifiers
:::

## C Code Preliminary Implementation
[source code]()

The following is the execution flow of this subroutine.
![](https://i.imgur.com/b5qum8f.jpg)
(Fig. 1) Flowchart of C code logic

### Variadic Function
In the function declaration, I use `void self_print(const char* format, ...)` to let the compiler know that this function takes a variable number of arguments.
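
To make a declaration of this shape concrete, here is a minimal sketch of a variadic function in standard C. The function `sum_ints` and its `count` parameter are hypothetical and only illustrate the `...` syntax together with the `<stdarg.h>` macros; `self_print` instead relies on the specifiers in `format` to decide how many arguments to read.

```c
#include <stdarg.h>

/* Hypothetical helper: adds up `count` int arguments passed after `count`. */
int sum_ints(int count, ...)
{
    va_list ap;
    int total = 0;

    va_start(ap, count);            /* ap now refers to the first unnamed argument */
    for (int i = 0; i < count; i++)
        total += va_arg(ap, int);   /* read one int and advance to the next argument */
    va_end(ap);                     /* required before the function returns */
    return total;
}
```

With this sketch, a call such as `sum_ints(3, 1, 2, 3)` returns 6.
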
In the body of the function, I use `va_list ap` as the argument parser and initialize it with `va_start(ap, format)`, so that `ap` points to the argument after `format`.

Because the implementation varies between compilers, and I couldn't find any documentation explaining `va_list` and its related operations, I explain `va_list` according to [reference 1](https://www.itread01.com/content/1541061843.html), which reportedly describes the VC6.0 compiler.
- In that implementation, `va_list` is actually a `char *` type that points to the arguments passed into the function. Three related macros handle `va_list` in this function: `va_start`, `va_arg` and `va_end`. Below is how it's used in C code.
```c=
va_list ap;
```
- `va_start(ap, v)` sets `ap` to point to the argument after the last named argument `v`. Below is how it's used in C code.
```c=
va_start(ap, format);
```
- `va_arg(ap, type)` reads the argument currently pointed to as `type` and moves `ap` to the next argument. Note that arguments on the stack are aligned to the size of an integer due to [type promotion](https://en.cppreference.com/w/c/language/conversion). Thus, even when taking a `char` or `short`, we still need to pass `int` as `type` and cast the result to `(char)` or `(short)` afterward. Below is how it's used in C code.
```c=
char chrBuff = (char)va_arg(ap, int);
```
- `va_end(ap)` simply assigns a null pointer to `ap` in that implementation, so that the stale pointer cannot be used by accident afterward. It should be called before the function returns.
```c=
va_end(ap);
```

For each qualified `%` specifier, I use `va_arg(ap, type)` to get the argument and run the corresponding procedure to process it.

### Specifier Recognition and Handling
To recognize a specifier in `format`, I first check whether the current character buffer is `%`. If so, the buffer is updated with the next character; otherwise the buffer is simply printed.

If the character following `%` is `s`, `char* pChar` receives the result of `va_arg(ap, char*)`, which gets the argument as a `char*`.
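
The recognition step described in this section can be sketched in C as below. This is an illustration under assumptions, not the original source: `mini_format` is a hypothetical, reduced stand-in for `self_print` that handles only `%s`, `%c` and `%%`, and it writes into a caller-supplied buffer instead of calling `putchar` so that the result is easy to check.

```c
#include <stdarg.h>
#include <stddef.h>

/* Hypothetical reduced formatter: supports %s, %c and %%. Any other
 * character after '%' is copied as-is, like the default case.
 * Writes at most cap-1 characters plus a terminating null into out
 * and returns the number of characters written. */
size_t mini_format(char *out, size_t cap, const char *format, ...)
{
    va_list ap;
    size_t n = 0;

    if (cap == 0)
        return 0;
    va_start(ap, format);
    while (*format != '\0') {
        char chr = *format++;
        if (chr != '%') {                     /* ordinary character */
            if (n + 1 < cap) out[n++] = chr;
            continue;
        }
        chr = *format;                        /* character after '%' */
        if (chr == '\0')
            break;                            /* lone '%' at the end of format */
        format++;
        switch (chr) {
        case 's': {
            const char *pChar = va_arg(ap, char *);
            while (*pChar != '\0') {          /* copy until the null terminator */
                if (n + 1 < cap) out[n++] = *pChar;
                pChar++;
            }
            break;
        }
        case 'c': {
            char c = (char)va_arg(ap, int);   /* char arrives promoted to int */
            if (n + 1 < cap) out[n++] = c;
            break;
        }
        default:                              /* "%%" and unsupported specifiers */
            if (n + 1 < cap) out[n++] = chr;
            break;
        }
    }
    va_end(ap);
    out[n] = '\0';
    return n;
}
```

For example, formatting `"%s-%c-%%-%a"` with the arguments `"abc"` and `'d'` yields `abc-d-%-a`, which matches the `%a` row of Table 1.
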
`pChar` then iterates over every byte of this `char*` argument and prints each character until it reaches the terminating null of the string.

If the character following `%` is `c`, `putchar((char)va_arg(ap, int))` is called immediately to print the character in the argument. The reason for using `int` as the type is the type promotion noted in the description of `va_arg`.

### Test Case
To test this function thoroughly, I set up the following variables to run through the `self_print` function.

| name     | type         |               value                | tested specifier |
| -------- | ------------ |:----------------------------------:| ---------------- |
| pChr     | char*        |           "StringInArg"            | %s               |
| chr      | char         |                'c'                 | %c               |
| testHex0 | unsigned int |              0x102345              | %x/%X            |
| testHex1 | unsigned int |              0x6789AB              | %x/%X            |
| testHex2 | unsigned int |               0xCDEF               | %x/%X            |
| testOct  | unsigned int |             012345670              | %o               |
| testBin  | unsigned int | 0b00111011000101001111010100001111 | %b               |

(Table 2) Test Cases

```c=
int main(int argc, char *argv[])
{
    char *pChr = "StringInArg";
    char chr = 'c';
    /* The 0b binary literal is a GCC extension before C23. */
    unsigned int testHex0 = 0x102345, testHex1 = 0x6789AB, testHex2 = 0xCDEF,
                 testOct = 012345670,
                 testBin = 0b00111011000101001111010100001111;

    /* %e, %r and %d are unsupported here and consume no argument. */
    self_print("%s\nnormalString\nCharInArg: %c\nPercentSign: %%\nNot supporting: %e%e%r%d\nHex0: %x %X\nHex1: %x %X\nHex2: %x %X\nOct: %o\nBin: %b",
               pChr, chr, testHex0, testHex0, testHex1, testHex1,
               testHex2, testHex2, testOct, testBin);
    return 0;
}
```
After being compiled with gcc 9.2.0 and executed on Windows 10 version 1909, the result is as below.
![](https://i.imgur.com/EkTjDkE.jpg)
(Fig.
2) Execution Result of C Code

## Assembly Code Implementation
```clike=
.data
pChr:   .string "StringInArg"
str1:   .string "%s\nnormalString\nCharInArg: %c\nPercentSign: %%\nNot supporting: %e%e%r%d\nHex0: %x %X\nHex1: %x %X\nHex2: %x %X\nOct: %o\nBin: %b"

.text
main:
    la   s0, pChr               # pChr = "StringInArg"
    addi s1, zero, 99           # chr = 'c'
    lui  s2, 0x102
    addi s2, s2, 0x345          # testHex0 = 0x102345
    lui  s3, 0x679
    addi s3, s3, -0x655         # testHex1 = 0x6789AB
    lui  s4, 0xD
    addi s4, s4, -0x211         # testHex2 = 0xCDEF
    lui  s5, 0x29d
    addi s5, s5, -0x448         # testOct = 0o12345670
    lui  s6, 0x3b14f
    addi s6, s6, 0x50f          # testBin = 0b00111011000101001111010100001111 = 0x3b14f50f
    la   t0, str1               # format string (str1)
    addi sp, sp, -44            # 11 words: format + 10 arguments
    sw   t0, 0(sp)
    sw   s0, 4(sp)
    sw   s1, 8(sp)
    sw   s2, 12(sp)
    sw   s2, 16(sp)
    sw   s3, 20(sp)
    sw   s3, 24(sp)
    sw   s4, 28(sp)
    sw   s4, 32(sp)
    sw   s5, 36(sp)
    sw   s6, 40(sp)
    jal  self_print
    addi sp, sp, 44
    addi a0, zero, 0            # return 0
    addi a7, zero, 93
    ecall

self_print:
    lw   t0, 0(sp)              # const char* format
    addi t1, sp, 4              # va_list ap; va_start(ap, format);
    addi sp, sp, -40            # frame, high to low: s1 | s0 | char digits[32]
    sw   s0, 32(sp)
    sw   s1, 36(sp)
while_begin:
    lb   t2, 0(t0)              # chr = *format
    beq  t2, zero, while_end
    addi t0, t0, 1              # format++
    addi t3, zero, 37           # '%'
    bne  t2, t3, else           # if(chr == '%')
    lb   t2, 0(t0)              # chr = *format
    addi t0, t0, 1              # format++
    addi t3, zero, 115
    bne  t2, t3, case_c         # case 's'
    lw   t3, 0(t1)
    addi t1, t1, 4              # pChr = va_arg(ap, char*)
    lb   t4, 0(t3)
case_s:                         # for loop
    beq  t4, zero, while_begin  # *pChr != 0
    addi a0, t4, 0
    addi a7, zero, 11
    ecall                       # putchar(*pChr)
    addi t3, t3, 1
    lb   t4, 0(t3)
    jal  zero, case_s
case_c:
    addi t3, zero, 99
    bne  t2, t3, case_x         # case 'c'
    lw   a0, 0(t1)
    addi t1, t1, 4              # (char)va_arg(ap, int)
    addi a7, zero, 11
    ecall                       # putchar
    jal  zero, while_begin
case_x:
    addi t3, zero, 120
    bne  t2, t3, case_x_        # case 'x'
    addi t2, zero, 0            # capFlag = 0
    jal  zero, case_xx
case_x_:
    addi t3, zero, 88
    bne  t2, t3, case_o         # case 'X'
    addi t2, zero, 1            # capFlag = 1
case_xx:
    addi t3, zero, 4            # shiftAmt = 4
    addi t4, zero, 15           # remain = 15
    jal  zero, switch_end
case_o:
    addi t3, zero, 111
    bne  t2, t3, case_b         # case 'o'
    addi t3, zero, 3            # shiftAmt = 3
    addi t4, zero, 7            # remain = 7
    jal  zero, switch_end
case_b:
    addi t3, zero, 98
    bne  t2, t3, case_%         # case 'b'
    addi t3, zero, 1            # shiftAmt = 1
    addi t4, zero, 1            # remain = 1
    jal  zero, switch_end
case_%:
    addi t3, zero, 37
    bne  t2, t3, default        # case '%'
    addi a0, zero, 37
    addi a7, zero, 11
    ecall                       # putchar('%')
    jal  zero, while_begin
default:
    addi a0, t2, 0
    addi a7, zero, 11
    ecall                       # putchar(chr)
    jal  zero, while_begin
switch_end:
    lw   t5, 0(t1)
    addi t1, t1, 4              # value = va_arg(ap, unsigned int);
    addi t6, zero, 0            # digitIdx = 0
parse:
    and  s0, t5, t4             # digit = (char)(value & remain);
    srl  t5, t5, t3             # value >>= shiftAmt;
    addi s0, s0, -10
    blt  s0, zero, not_Hex      # if(digit > 9)
    slli s1, t2, 5              # 0x20 if capFlag == 1
    xori s1, s1, 0x20           # 0x20 if capFlag == 0
    slti t2, s1, 0x1            # t2 = (s1 == 0), i.e. capFlag again
    addi s1, s1, 7              # (capFlag)?0x7:0x27
    add  s0, s0, s1             # digit += (capFlag)?0x7:0x27;
not_Hex:
    addi s0, s0, 58             # digit += '0' (58 = '0' + 10 undoes the -10 above)
    add  t6, t6, sp             # &digits[digitIdx]
    sb   s0, 0(t6)              # digits[digitIdx] = digit (byte store: digits is a char array)
    sub  t6, t6, sp             # restore digitIdx
    addi t6, t6, 1              # digitIdx++
    ble  t5, zero, dump
    addi s1, zero, 32
    blt  t6, s1, parse          # while(value && digitIdx < sizeof(digits));
dump:
    addi t6, t6, -1             # --digitIdx
    add  t6, t6, sp             # &digits[--digitIdx]
    lb   a0, 0(t6)
    addi a7, zero, 11
    ecall                       # putchar(digits[--digitIdx]);
    sub  t6, t6, sp             # restore digitIdx
    bgt  t6, zero, dump         # while(digitIdx)
    jal  zero, while_begin
else:                           # else
    addi a0, t2, 0
    addi a7, zero, 11
    ecall                       # putchar(chr);
    jal  zero, while_begin
while_end:
    lw   s1, 36(sp)
    lw   s0, 32(sp)
    addi sp, sp, 40
    jalr zero, ra, 0
```
![](https://i.imgur.com/nvXXULy.jpg)
(Fig. 3) Execution Result in Ripes
![](https://i.imgur.com/I76fbs2.jpg)
(Fig.
3-1) Console after Execution on Ripes

### Variadic Implementation with Stack
My implementation of the variadic function follows reference 1: all arguments to pass are put on the stack, in order from the last argument to the first, and each argument is aligned to word size.
Inside the variadic function, the stack pointer register gives access to the first argument. By adding 4 each time, the variadic function can reach the subsequent arguments.
Below is a demonstration of the concept.
```
=== Bottom of stack
arg_n-1
arg_n-2
.
.
.
arg_0  ← Top of stack
```

### Printing method in Ripes
To implement the `putchar(char)` function, I used only environment call number 11 in Ripes.
```
add  a0, zero, register_contains_character
addi a7, zero, 11
ecall
```

## Ripes Simulator Result
1. Calling self_print

![](https://i.imgur.com/0rjOwIM.jpg)
(Fig. 4) Break Point on Jumping to self_print
![](https://i.imgur.com/h3bDmJr.jpg)
(Fig. 5) Corresponding Stack Memory Data
![](https://i.imgur.com/PMyINhx.jpg)
(Fig. 6) Corresponding Data Segment

At the bottom of the stack, we can see the value of `testBin`, `0x3b14f50f`, followed by `testOct`, and so on. The last one, at the top of the stack, is `format`, which points to a string stored in the data segment. It's easy to recognize "%s" and "normalString" there. As for the '...' character, it's 0xa, which is the newline character.

2. Initialization of self_print

![](https://i.imgur.com/pTewgpp.jpg)
(Fig. 7) Initialization Step of self_print
![](https://i.imgur.com/eiZSQBy.jpg)
(Fig. 8) RegFile after Initialization

First of all, for later usage, the routine loads `format` into `t0` and performs the counterpart of `va_list ap; va_start(ap, format);` by storing the result in `t1`. Next, to allocate space for `s0`, `s1`, and `digits[32]`, the stack grows by 40 bytes. At the beginning of `while_begin:`, it loads `chr` with `*format` into `t2`, in order to compare it with 37 ('%'). After the first pass through this loop, `t0` is `format + 1`, `t1` points to the argument after `format`, and `t2` is `0x25`.
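
The argument walk that `t1` performs over the stack can be modelled in C roughly as follows. This is only a sketch: `word_va_list`, `word_va_start` and `word_va_arg` are hypothetical names, and `uintptr_t` stands in for the 32-bit stack word so that the model also runs on a 64-bit host.

```c
#include <stdint.h>

/* Hypothetical model of the argument area: one word per argument,
 * with format in the word at the top of the stack (index 0). */
typedef struct {
    const uintptr_t *ap;        /* plays the role of t1 */
} word_va_list;

/* Counterpart of "addi t1, sp, 4": skip the word that holds format. */
void word_va_start(word_va_list *va, const uintptr_t *sp)
{
    va->ap = sp + 1;
}

/* Counterpart of "lw tX, 0(t1)" followed by "addi t1, t1, 4". */
uintptr_t word_va_arg(word_va_list *va)
{
    uintptr_t value = *va->ap;
    va->ap++;
    return value;
}
```

In this model, an array holding `format`, `'c'` and `0x102345` in that order returns `'c'` from the first `word_va_arg` call and `0x102345` from the second.
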
At this point in the simulation, everything proceeds as expected.

3. Different Specifiers Handling

![](https://i.imgur.com/H75gQAO.jpg)
(Fig. 9) Case Handling of 'x' and 'X'
![](https://i.imgur.com/aIX83bj.jpg)
(Fig. 10) CapFlag(`t2`) after case 'x'
![](https://i.imgur.com/6c91L7y.jpg)
(Fig. 11) shiftAmt(`t3`) and remain(`t4`) after case 'x'

There are several kinds of specifiers, and I take `%x` as an example to explain how they work. After reaching the `case_x` label, the program sets `t2` to 0 and jumps to `case_xx`, which is the common part of `%x` and `%X`. At `case_xx`, `t3` and `t4` are set to 4 and 15 respectively, and the program jumps to `switch_end` for digit parsing.
In the other cases, such as `%c`, `%s`, `%%`, and `default`, no further processing is required, so the program jumps back to `while_begin` unconditionally.

4. Digit Parsing

![](https://i.imgur.com/CcPJYrD.jpg)
(Fig. 13) Digit Parsing of `%x`
![](https://i.imgur.com/EmLOFmv.jpg)
(Fig. 14) Stack after Parsing

In the beginning, the word at the address in `t1` is loaded into `t5` as `value`, and `t1` is increased by 4 to move to the next argument. `t6` plays the role of `digitIdx` and is thus set to 0.
The `parse` block does the following:
```
digit = value & remain;
value >>= shiftAmt;
if(digit > 9)
    digit += (capFlag)? 0x7: 0x27;
```
`s0`(`x8`) plays the role of `digit`, and `s1`(`x9`) plays the role of a temporary variable that holds 0x7/0x27.
The `not_Hex` block does the part that is common to every digit.
```
digit += '0';
digits[digitIdx] = digit;
if(value == 0 || digitIdx >= 32)
    goto dump;
```
For convenience in using `digits`, I placed it at the top of the stack. Thus, the `not_Hex` block contains many operations involving `x2`(`sp`), which compute the address of the current `digits[digitIdx]`. In (Fig. 14) it's easy to see from the top of the stack that the ASCII content is "102345".

5. Digit Dumping

![](https://i.imgur.com/N44OVCj.jpg)
(Fig. 15) Digit Dumping for `%x`
![](https://i.imgur.com/G9KNH4k.jpg)
(Fig.
16) Console after Execution

The counterpart of
```
do{putchar(digits[--digitIdx]);}while(digitIdx);
```
is not very complicated. The point to note is that loading the character into `a0`, writing 11 into `a7`, and making an environment call makes Ripes put the character in `a0` on the console. (Fig. 16) shows what the console looks like after reaching the end of the loop.

## Reference
1. [va_list原理及用法 (Principles and Usage of va_list)](https://www.itread01.com/content/1541061843.html)