contributed by < peishan3 >
computer architure 2021
LeetCode58 : Length of Last Word
Given a string s consisting of some words separated by some number of spaces, return the length of the last word in the string.
A word is a maximal substring consisting of non-space characters only.
Example:
Input: s = "Hello World"
Output: 5
Explanation: The last word is "World" with length 5.
#include <stdio.h>
int lengthOfLastWord(char * s)
{
//finding the end of string
while(*s != 0)
{
s++;
}
int position = 0;
//Because last word in the string will be '/0', go back for one char
s--;
//if there are some space in the end, keep going back for word
while(*s == 32)
{
s--;
}
//After finding the word, we can count for how long is the word here
while(*s != 32)
{
position++;
s--;
}
printf("Length of the last word is %d", position);//print result
}
int main()
{
char *string = "Hello World";
lengthOfLastWord(string);
return 0;
}
.data
str1: .string "Hello World"
str2: .string "Length of the last word is "
space: .string " "
a: .string "a"
.text
# s1 = str1 address
# s2 = space
# s3 = counter
# t1 = str1[i]
main:
#function:lengthOfLastWord(char * s)
la s1, str1
lb s2, space
lb s4, 5(s1)
add s3, x0, x0
jal ra, loop1
#if the last letter is space, find the position of the last letter is not space
jal ra, loop2
jal ra, loop3
#print result
la a0, str2
addi a7, x0, 4
ecall
add a0, s3, x0
li a7, 1
ecall
li a7, 10
ecall
loop1:
addi s1, s1, 1 #find the last position of the string
lb t0, 0(s1)
bne t0, x0, loop1
ret
loop2:
addi s1, s1, -1 #s--
lb t0, 0(s1) #t0=char[i](s), i=length(s)-1
beq t0, s2, loop2 #while(char[i]==" ")
ret
loop3:
addi s1, s1, -1 #s--
lb t0, 0(s1) #t1=char[i](s), i=last word
addi s3, s3, 1 #length++
bne t0, s2, loop3 #while(char[i]!=" ")
ret
Can you use fewer instructions?
I have reduced some unnecessary instruction, like
reduce jal
to print fuction, and lb t0
originally located before loop2
If there is anything I didn't notice, please let me know.
Thank you so much QwQ
Here are 3 sections in this code: 3 main loops for giving the result, 1 for print the text, 1 section is main.
Loop1
: Finding the end of string. Because the memory in RISC-V for char is 1 byte, here we use addi s1, s1, 1
. s1
is the address of first char in str1
.Loop2
: If the last char is space, we have to find the memory address of the last letter in last word.Loop3
: After finding the address of last letter, this loop is for counting the length of last word.a0
argument register.s1
, s2
, s3
.Test our assembly code on Ripes simulator. We can find that Ripes will translate our code to pseudo instruction.
After translation, our code will be look like:
0000000a <main>:
a: 10000497 auipc x9 0x10000
e: ff648493 addi x9 x9 -10
12: 10000917 auipc x18 0x10000
16: 01690903 lb x18 22 x18
1a: 00548a03 lb x20 5 x9
1e: 000009b3 add x19 x0 x0
22: 01c000ef jal x1 28 <loop1>
26: 00048283 lb x5 0 x9
2a: 024000ef jal x1 36 <loop2>
2e: 030000ef jal x1 48 <loop3>
32: 040000ef jal x1 64 <print>
36: 00a00893 addi x17 x0 10
3a: 00000073 ecall
0000003e <loop1>:
3e: 00148493 addi x9 x9 1
42: 00048283 lb x5 0 x9
46: fe029ce3 bne x5 x0 -8 <loop1>
4a: 00008067 jalr x0 x1 0
0000004e <loop2>:
4e: fff48493 addi x9 x9 -1
52: 00048283 lb x5 0 x9
56: ff228ce3 beq x5 x18 -8 <loop2>
5a: 00008067 jalr x0 x1 0
0000005e <loop3>:
5e: fff48493 addi x9 x9 -1
62: 00048283 lb x5 0 x9
66: 00198993 addi x19 x19 1
6a: ff229ae3 bne x5 x18 -12 <loop3>
6e: 00008067 jalr x0 x1 0
00000072 <print>:
72: 10000517 auipc x10 0x10000
76: f9a50513 addi x10 x10 -102
7a: 00400893 addi x17 x0 4
7e: 00000073 ecall
82: 00098533 add x10 x19 x0
86: 00100893 addi x17 x0 1
8a: 00000073 ecall
8e: 00008067 jalr x0 x1 0
5-stage Pipeline means that there are five stage of the instruction excution.
Picture below is block diogram of processor provided on the simulator.
This type has to use 20 bits for immediate part. There are ususally 2 instruction be this format, LUI
and AUIPC
.
The first instruction in this program auipc x9 0x10000
is U-type format. Here is the explanation of AUIPC from The RISC-V Instruction Set Manual:
AUIPC (add upper immediate to pc) is used to build pc-relative addresses and uses the U-type format. AUIPC forms a 32-bit offset from the 20-bit U-immediate, filling in the lowest 12 bits with zeros, adds this offset to the pc, then places the result in register rd.
Let's see how it works in the processor.
0000000a <main>:
, so the addr is equal to 0x0000000a
.10000497 auipc x9 0x10000
, instr is equal to 0x10000497
.0x0000000e
auipc
0x09
0x10000000
(0x10000 in upper 20 bits, filling in the lowest 12 bits with zeros)0x0000000a
) + either a register or the immediate value(0x10000000
)0x0000000e
) + immediate0x1000000a
). Memory read data at address 0x1000000a
, and Read out is equal to 0x654c0064
.0x1000000a
) is wrote to x9
register.The NOP instruction does not change any user-visible state, except for advancing the pc.
Pipeline of R-type instructions are basiclly similar to U-type instruction. So I just show the most different part (ID) below.
ID (Instruction Decode)
Take second line of machine code: e: ff648493 addi x9 x9 -10
for example.
0xff648493
ADDI
0x09
0xfffffff6
(0xff6 in upper 12 bits, filling in the highest 20 bits with one)R-type instructions would read the rs1 and rs2 registers as source operands and write the result into register rd.
ID (Instruction Decode)
Take second line of machine code: 1e: 000009b3 add x19 x0 x0
for example.
add s3, x0, x0
, so rs1 = 0x00
.0x000009b3
ADD
0x00
0x13
0xfffffff6
(0xff6
in upper 12 bits, filling in the highest 20 bits with one)Load instructions transfer a value between the registers and memory . It also copy a value from memory to register rd.
LB and LBU are defined analogously for 8-bit values.
ID (Instruction Decode)
Take second line of machine code: 16: 01690903 lb x18 22 x18
for example.
lb s2, space
, so rs1 = 0x12
.0x01690903
LB
0x12
0x00000016
(0x016
in upper 12 bits, filling in the highest 20 bits with zeros