2022 Homework2: RISC-V Toolchain
tags: RISC-V, jserv
Before Start
We need to install the RISC-V toolchain on our virtual machine or computer and add it to the system `PATH` variable. The instructions in lab2 do not make this change permanent, so we would have to source `riscv-none-elf-gcc/setenv` every time we log in. I find that annoying, so I modified `~/.profile` to add the toolchain to the user path automatically at each login.
If you follow the lab instructions, `riscv-none-elf-gcc` should be under the `/home/YourUserName` directory. The snippet added to `~/.profile` checks whether the toolchain directory exists and, if it does, adds that directory to the user path.
Then restart the terminal or run `source ~/.profile` to update the `PATH` variable.
Check `$PATH`
The toolchain directory should now appear in your `PATH` variable.
The following is my result:

Rewrite Concatenation of Array
I chose Concatenation of Array from tonych1997.
Motivation: My first homework was an exercise in reducing array elements, and this time I want to practice increasing the size of an array in assembly code. Also, tonych1997 wrote enough comments that the assembly code is easy to understand.
Before I started rewriting the assembly program, I ran into a problem. In the implementation of the system calls, the `syscall_write` function always prints its input data byte by byte, so the emulator cannot print a 32-bit integer unless we first convert it into a string.
In rv32emu/src/state.h we can see that the files opened by default in this emulator are stdin, stdout, and stderr, in that order, so the sample code in rv32emu/tests/asm-hello/hello.S sets a0 to 1 before the ecall to get access to standard output.
There is no spare argument that could select between printing an integer and printing a string unless some structures in emulator.c are rewritten, and I think that is too complicated to modify. So I decided to implement integer output in a very naive way.
I modified syscall.c as follows:
If `0xfff` is passed in the a0 register, the `syscall_write` function switches to integer mode, which uses fprintf rather than fwrite to write the data to stdout. However, it cannot print an integer to a specified file.
After modifying all of the system calls to match rv32emu's specification, I ran make and got these errors:

The reason is that rv32emu and Ripes are not fully compatible: the former does not support some of the syntactic sugar for instructions, so the code can be fixed simply by adding commas between operands.
The system calls were changed to fit rv32emu, and I also changed some single-character strings into chars to reduce memory usage. Here is my modification:
Compile:
Execution and results:
The assembly provided here produces the following output on Ripes:
Analysis
The CSR cycle count is 225 in the picture, and this implementation has 66 lines of code.
There are some problems in this program:
tonych1997 used two functions (loopCon1, loopCon2) to implement what should be an inline for loop in the main function, which costs a lot of time. They stated that the result might be wrong if the two loops were combined, but after some study I realized that this is because they did not initialize the base address of array 2.
Because the register value was never initialized, array 2 is stored starting at address 0x00, which is where the code section is! The following shows the instruction memory.
before execution
after execution
The former shows the memory contents before execution and the latter shows them after execution. The program even overwrites instructions that have not been executed yet. I think that unless an instruction cache holds the unmodified instructions, the modified code will be executed and lead to unpredictable behavior.
On a modern system this situation is unlikely to happen, because the operating system monitors memory usage and denies invalid memory accesses, causing a segmentation fault.
Optimization
To solve the problems mentioned above, I merged loopCon1 and loopCon2 and inlined the result. I also extend the stack in the `_start` function to store the new array.
After this optimization, the CSR cycle count dropped to 142 and the LOC dropped to 41.
Observation: My implementation of printInt pushes to and pops from the stack each time it is called, which is unnecessary because no other function is called from it. Each iteration performs one push and one pop, although only two are actually necessary. I also want to modify it to avoid the function call overhead of storing the return address.
The cycle count becomes 107 and the LOC becomes 32. I then realized that I can move `li a7, SYSWRITE` out of the for loop, because SYSWRITE is the only system call used there and it does not need to be set again in each iteration. There is also no need to push the value onto the stack, because we already have the address of the value we want to show on screen. After this optimization, the function call, the function return, the stack operations, and the a7 setup are all eliminated. The instruction count is reduced by 10n - 1, where n is twice the length of the input array.
Compile C Code
To build an executable from the C code, simply type:
Read the header:
I will explain some of the lines I am interested in.
The first line is the magic number. Wikipedia describes a magic number as "a constant numerical or text value used to identify a file format or protocol". The first byte of this line is 0x7f, which is a form of leet; the reason is described in this Stack Overflow page. We can see it in the first line of the file's hexadecimal dump, produced with the `hexdump` command:
The entry point is the address of `_start` rather than that of our `main` function. The toolchain adds a `_start` function to our program to avoid invalid access to computer resources. When execution begins, `_start` performs some necessary initialization and then invokes the `main` function.
For more information, there is a book named ELF format that details each part of an ELF file.
Read the disassembly file:
The disassembly has 17092 lines, which is too large to show, so I simply picked the two functions defined in our C implementation.
As shown, the compiler automatically uses the compressed extension of RV32 to compile our C code if the flags are left unspecified, but the result is still very large.
As the object dump shows, the compiler automatically adds a `_start` function that invokes our main function. The OS first loads `_start` into physical memory, and the position of `_start` is the same as the entry point address reported by the readelf command; in this example it is `000100c4`. `c4` is the offset of the `_start` function, which we can confirm by printing it out. We can also see that RISC-V is little endian.
`10000` is the default virtual memory address at which the program is loaded into RAM, but I could not find the reason why this address is used rather than 0x00000000.
Comparing Optimization Levels
`-march=rv32i`, `-mabi=ilp32`: target the base integer instruction set with 4-byte instructions, using the ILP32 ABI (32-bit int, long, and pointers).
`-fdata-sections`, `-ffunction-sections`: place each function and data item in its own section so that unused ones can be separated out.
`-Wl,--gc-sections`: `-Wl` tells gcc to pass the arguments after the comma to the linker, and `--gc-sections` tells the linker not to link unused sections. (References: what is -Wl, how to link used functions only)
The pictures show that after unused data and functions are removed from the program, the code section shrinks from 74740 to 60120, the data section from 2816 to 2776, and the bss section from 812 to 104.
-O0
-O1
-O2
-O3
-Os
-Ofast
Analysis
We can see that the first address of main is `0x10184` when `O0` and `O1` are specified. This means the entry point address (`100c4`) does not point to self-written code, and a start routine provided by the library is needed. As I mentioned before, a `_start` function is usually added to our ELF file, but it might be excluded for the sake of optimization.
`O2`, `O3`, `Os` and `Ofast` execute the main function directly without the help of `_start`, and the main function seems comparatively large in all of them except `Os`. I think this is because, although efficiency may increase when `_start` is eliminated, the main function then has to handle by itself the conditions that could endanger the system. I tried to read the assembly code of these main functions, but it is too hard to understand without the help of the `O1` and `Os` versions.
There is a problem with this reasoning: the entry point might differ from one ELF file to another. After reviewing my homework again, I realized that the `_start` function exists in every ELF file.
We can also see that from optimization level `O2` onward, the text section size and the cycle count stay the same, so compiler optimization has its limits. The code size is also significantly larger than that of the hand-written assembly because of the heavy work done by `printf`.