黃書堯
This sqrtf
RISC-V code originates from Quiz3 - Problem C.
Below is an explanation of the code logic.
We have a floating point number
A floating-point number can be represented as :
where
Since
The square root of the exponent part can be easily calculated by right-shifting the exponent by 1 bit.
We aim to compute
Instead, we can achieve this using multiplication operations only with:
From the above, we realize that we only need to calculate
The formula of the Newton-Raphson method is as follows:
Assuming the function is:
The iterative form becomes:
The B extension part shows as follows:
bseti a1, a1, 31
This line sets the hidden leading bit of the mantissa. To remove the B extension dependency, we replace it with lui
and or
.
-bseti a1, a1, 31
+lui t0, 0x80000
+or a1, a1, t0
To emulate the code, follow these steps:
$ git clone https://github.com/sysprog21/rv32emu
$ cd rv32emu
$ make
$ make check
Initially, an error might occur:
$ make check
Running hello.elf ... [OK]
Fetching SHA-1 of prebuilt binaries ... [OK]
Checking SHA-1 of prebuilt binaries ... [OK]
rv32emu: src/riscv.c:415: rv_create: Assertion `elf && elf_open(elf, attr->data.
user.elf_program)' failed.
Aborted (core dumped)
rv32emu: src/riscv.c:415: rv_create: Assertion `elf && elf_open(elf, attr->data.
user.elf_program)' failed.
Aborted (core dumped)
rv32emu: src/riscv.c:415: rv_create: Assertion `elf && elf_open(elf, attr->data.
user.elf_program)' failed.
Aborted (core dumped)
Running puzzle ... Failed.
make: *** [Makefile:354: check] 錯誤 1
To resolve this, change the system language to English (United States) via:
Select Setting
-> Region & Language
-> Language English(United States)
.
After changing the language, rerun:
$ make check
Check the file build/.config for configured items.
Running hello.elf ... [OK]
Fetching SHA-1 of prebuilt binaries ... [OK]
Checking SHA-1 of prebuilt binaries ...
SHA-1 verification fails! Re-fetching prebuilt binaries from "rv32emu-prebuilt" ...
- 100%[===================>] 4.49M 14.7MB/s in 0.3s
Running puzzle ... [OK]
Running fcalc ... [OK]
Running pi ... [OK]
This should pass all tests.
To evaluate the performance of the sqrt.S
implementation, we can utilize the performance counter example available in rv32emu/tests/perfcounter
.
$ cp -r tests/perfcounter .
Makefile
:.PHONY: clean
include ../mk/toolchain.mk
-CFLAGS = -march=rv32i_zicsr_zifencei -mabi=ilp32 -O2 -Wall
+CFLAGS = -march=rv32im_zicsr_zifencei -mabi=ilp32 -O2 -Wall
OBJS = \
getcycles.o \
getinstret.o \
- sparkle.o \
+ sqrtf.o \
main.o \
BIN = perfcount.elf
%.o: %.S
$(CROSS_COMPILE)gcc $(CFLAGS) -c -o $@ $<
%.o: %.c
$(CROSS_COMPILE)gcc $(CFLAGS) -c -o $@ $<
all: $(BIN)
$(BIN): $(OBJS)
$(CROSS_COMPILE)gcc -o $@ $^
clean:
$(RM) $(BIN) $(OBJS)
main.c
:#include <stdint.h>
#include <stdio.h>
#include <string.h>
extern uint64_t get_cycles();
extern uint64_t get_instret();
/*
* Taken from the Sparkle-suite which is a collection of lightweight symmetric
* cryptographic algorithms currently in the final round of the NIST
* standardization effort.
* See https://sparkle-lwc.github.io/
*/
-extern void sparkle_asm(unsigned int *state, unsigned int ns);
+extern float my_sqrtf(float input);
-#define WORDS 12
-#define ROUNDS 7
int main(void)
{
- unsigned int state[WORDS] = {0};
+ float input = 12.4;
/* measure cycles */
uint64_t instret = get_instret();
uint64_t oldcount = get_cycles();
- sparkle_asm(state, ROUNDS);
+ float result = sqrt(x);
uint64_t cyclecount = get_cycles() - oldcount;
+ printf("square root of %f is %f\n", input, result);
printf("cycle count: %u\n", (unsigned int) cyclecount);
printf("instret: %x\n", (unsigned) (instret & 0xffffffff));
- memset(state, 0, WORDS * sizeof(uint32_t));
- sparkle_asm(state, ROUNDS);
- printf("Sparkle state:\n");
- for (int i = 0; i < WORDS; i += 2)
- printf("%X %X\n", state[i], state[i + 1]);
return 0;
}
$ cd perfcounter
$ make
$ ../build/rv32emu perfcount.elf
Initially, the compilation might result in an error:
(leetcode) (base) phlab@phlab-System-Product-Name:~/rv32emu/perfcounter$ make
riscv-none-elf-gcc -march=rv32im_zicsr_zifencei -mabi=ilp32 -O2 -Wall -c -o getcycles.o getcycles.S
riscv-none-elf-gcc -march=rv32im_zicsr_zifencei -mabi=ilp32 -O2 -Wall -c -o getinstret.o getinstret.S
riscv-none-elf-gcc -march=rv32im_zicsr_zifencei -mabi=ilp32 -O2 -Wall -c -o sparkle.o sparkle.S
riscv-none-elf-gcc -march=rv32im_zicsr_zifencei -mabi=ilp32 -O2 -Wall -c -o main.o main.c
main.c:7:13: warning: conflicting types for built-in function 'sqrtf'; expected 'float(float)' [-Wbuiltin-declaration-mismatch]
7 | extern void sqrtf();
| ^~~~~
main.c:4:1: note: 'sqrtf' is declared in header '<math.h>'
3 | #include <string.h>
+++ |+#include <math.h>
4 |
riscv-none-elf-gcc -march=rv32im_zicsr_zifencei -mabi=ilp32 -O2 -Wall -c -o sqrtf.o sqrtf.S
sqrtf.S: Assembler messages:
sqrtf.S:24: Error: unrecognized opcode `bseti a1,a1,31', extension `zbs' required
make: *** [Makefile:16: sqrtf.o] Error 1
To resolve this, rename the function and file from sqrtf
to my_sqrt
.
After removing the B extension, the test output is:
../build/rv32emu perfcount.elf
square root of 16.900000 is 4.110961
cycle count: 67
instret: 2d6
inferior exit code 0