# Problem B
Assignment 1: RISC-V Assembly and Instruction Pipeline
Problem: refer to [Quiz1 of Computer Architecture (2025 Fall) Problem B](https://hackmd.io/@sysprog/arch2025-quiz1-sol#Problem-B)
## Explanation
This assignment implements a logarithmic 8-bit codec (uf8). It maps 20-bit unsigned integers in the range $[0, 1{,}015{,}792]$ to 8-bit symbols via logarithmic quantization, delivering 2.5:1 compression with a relative error of at most 6.25%.
We use `encode` and `decode` to convert between the two representations, and we check that every decoded value encodes back to the symbol it came from.
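## Decode

Here $e = \lfloor b/16 \rfloor$ is the exponent and $m = b \bmod 16$ is the mantissa of the 8-bit symbol $b$. The decoded value is $D(b) = m \cdot 2^e + (2^e - 1) \cdot 16$, which is what the `decode` routine below computes.
We use this format to decode an 8-bit symbol into a 20-bit unsigned integer.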
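
As a rough illustration of the decoding rule, here is a minimal C sketch. It assumes the formula $D(b) = m \cdot 2^e + (2^e - 1) \cdot 16$, which matches what the assembly `decode` routine in this note computes; the function name `uf8_decode` is only illustrative.

```c
#include <stdint.h>

/* Decode one uf8 symbol: e = b / 16 (exponent), m = b % 16 (mantissa). */
uint32_t uf8_decode(uint8_t b)
{
    uint32_t m = b & 0x0fu;
    uint32_t e = b >> 4;
    uint32_t offset = ((1u << e) - 1u) << 4; /* (2^e - 1) * 16 */
    return (m << e) + offset;
}
```

For example, `uf8_decode(0x10)` is 16 and `uf8_decode(0xff)` is 1015792, the upper end of the range quoted above.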
## Encode

We use this format to encode a 20-bit unsigned integer into an 8-bit symbol. The test program then checks the whole loop over all 256 symbols for errors:
when the output is 1, every round trip is correct.
## Solution
### AI tool
I used ChatGPT for help with bugs I could not solve, for grammar and translation, for deciding when values need to be kept in s0-s11, and for explaining the C code in [Quiz1 of Computer Architecture (2025 Fall) Problem B](https://hackmd.io/@sysprog/arch2025-quiz1-sol#Problem-B). More than 90% of the code was written by hand and thought out by myself. The remaining part (less than 10%) is the bug mentioned above that I could not solve alone: it turned out to be a wrong branch condition that jumped to the wrong label, a logic mistake I had not spotted.
## clz
`clz` counts leading zeros, which locates the most significant bit (MSB) that `encode` needs.
```c
clz:
    addi sp, sp, -16
    sw   ra, 12(sp)
    li   t0, 16             # c = 16 (current shift distance)
    li   t1, 32             # n = 32
loop:
    beq  t0, zero, end      # leave the loop when c == 0
    srl  a1, a0, t0         # y = x >> c
    beq  a1, zero, decision # if y == 0, only halve c
    sub  t1, t1, t0         # n -= c
    mv   a0, a1             # x = y
decision:
    srli t0, t0, 1          # c >>= 1
    j    loop
end:
    sub  a0, t1, a0         # return n - x
    lw   ra, 12(sp)
    addi sp, sp, 16
    ret                     # return to the caller
```
With the MSB position known, we can go on to solve `encode` and `decode`.
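
For comparison, the same halving search can be sketched in C. This is a reading of the assembly above rather than the quiz's exact source, and the name `clz32` is only illustrative.

```c
#include <stdint.h>

/* Count leading zeros by halving the shift distance: 16, 8, 4, 2, 1. */
unsigned clz32(uint32_t x)
{
    unsigned n = 32; /* upper bound on the answer */
    unsigned c = 16; /* current shift distance */
    while (c != 0) {
        uint32_t y = x >> c;
        if (y != 0) { /* the MSB lies in the upper part */
            n -= c;
            x = y;
        }
        c >>= 1;
    }
    return n - x; /* x is 0 or 1 at this point */
}
```

The MSB index used by `encode` is then `31 - clz32(value)`; for the largest codec value 1015792 this gives 19.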
## decode
```c
decode:
    addi sp, sp, -16
    sw   ra, 12(sp)
    andi t0, a0, 0x0f       # m = b mod 16 (mantissa)
    srli t1, a0, 4          # e = b / 16 (exponent)
    addi t2, t1, -15
    sub  t2, zero, t2       # t2 = 15 - e
    li   t3, 0x7fff         # 2^15 - 1, the value for the largest exponent
    srl  t3, t3, t2         # (2^15 - 1) >> (15 - e) = 2^e - 1
    slli t3, t3, 4          # offset = (2^e - 1) * 16
    sll  t0, t0, t1         # m << e
    add  a0, t0, t3         # decoded value = (m << e) + offset
    lw   ra, 12(sp)
    addi sp, sp, 16
    ret
```
This part decodes a uf8 symbol into a 20-bit unsigned integer; it is used in the last part of the program.
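
The routine builds $2^e - 1$ by shifting `0x7fff` right by $15 - e$ instead of shifting 1 left. The short C check below is a sketch of why both forms agree for every exponent from 0 to 15; the two function names are made up for this comparison.

```c
#include <assert.h>
#include <stdint.h>

/* Offset as the assembly computes it: (0x7fff >> (15 - e)) * 16. */
uint32_t offset_by_shift(unsigned e)
{
    assert(e <= 15);
    return (0x7fffu >> (15u - e)) << 4;
}

/* Offset from the closed form (2^e - 1) * 16. */
uint32_t offset_closed_form(unsigned e)
{
    assert(e <= 15);
    return ((1u << e) - 1u) << 4;
}
```

Shifting a run of fifteen 1-bits right by $15 - e$ leaves exactly $e$ one-bits, which is $2^e - 1$.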
## encode
```c
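encode:
    addi sp, sp, -16
    sw   ra, 12(sp)
    mv   t5, a0             # t5 = value
    li   t1, 16
    bltu t5, t1, small      # value < 16: the symbol is the value itself
fex:
    mv   a0, t5
    jal  ra, clz
    add  t0, a0, zero
    addi t0, t0, -31
    sub  t0, zero, t0       # msb = 31 - clz(value)
    addi t1, t0, -4         # first guess: exp = msb - 4
    blt  t1, x0, tu0        # exp < 0: start from exp = 0
    li   t0, 15
    blt  t0, t1, tu15       # exp > 15: clamp to 15
    j    nextf
tu0:
    li   t1, 0
    j    nextf
tu15:
    li   t1, 15
nextf:
    li   t2, 1
    sll  t2, t2, t1
    addi t2, t2, -1         # 2^exp - 1 (t1 is exp)
    slli t2, t2, 4          # overflow = (2^exp - 1) * 16
adj:
    sltu t0, zero, t1       # t0 = (exp > 0)
    mv   t4, t5
    sltu t3, t4, t2         # t3 = (value < overflow)
    and  t0, t0, t3         # t0 = 1 only when both conditions hold
    addi t3, zero, 1
    bne  t0, t3, fae        # guess is not too high: refine upwards
    addi t2, t2, -16
    srli t2, t2, 1          # overflow = (overflow - 16) >> 1
    addi t1, t1, -1         # exp--
    j    adj
fae:
    addi t0, zero, 15
    bgeu t1, t0, fend       # stop at exp == 15
    slli t2, t2, 1
    addi t2, t2, 16         # t2 = next_overflow = (overflow << 1) + 16
    bltu t5, t2, fend       # value < next_overflow: exp is exact
    addi t1, t1, 1          # exp++
    j    fae
fend:
    sub  t0, t5, t2         # t2 may already hold next_overflow here; the
    srl  t0, t0, t1         # extra 16 << exp only changes bits above the
    slli t1, t1, 4          # 4-bit mantissa, which the mask removes
    andi t0, t0, 0xf        # mantissa
    add  t3, t1, t0         # symbol = (exp << 4) | mantissa
fin:
    andi a0, t3, 0xff
    lw   ra, 12(sp)
    addi sp, sp, 16
    ret
small:
    mv   a0, t5
    lw   ra, 12(sp)
    addi sp, sp, 16
    ret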
```
This is the most complicated part of the homework: with so many branches it is easy to jump to the wrong label. It encodes a `uint32_t` value into the corresponding uf8 symbol by searching for the exponent (and therefore the step size) whose range contains the value.
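
The search for the exponent can be sketched in C as below. This is a simplified version that always starts from exponent 0 instead of using the `clz` estimate, and it assumes the input does not exceed 1015792; the name `uf8_encode` is illustrative only.

```c
#include <assert.h>
#include <stdint.h>

/* Encode a value in [0, 1015792] into a uf8 symbol. */
uint8_t uf8_encode(uint32_t value)
{
    assert(value <= 1015792u); /* assumed upper bound of the codec */
    if (value < 16)
        return (uint8_t)value; /* exponent 0: the symbol is the value */

    uint32_t exponent = 0;
    uint32_t overflow = 0; /* smallest value of the current exponent */
    while (exponent < 15) {
        uint32_t next_overflow = (overflow << 1) + 16;
        if (value < next_overflow)
            break; /* value belongs to the current exponent */
        overflow = next_overflow;
        exponent++;
    }
    uint32_t mantissa = (value - overflow) >> exponent;
    return (uint8_t)((exponent << 4) | mantissa);
}
```

Starting from the `clz` estimate, as the assembly does, only shortens the search; the loop condition and the final mantissa step stay the same.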
## verify
```c
bool:
    addi sp, sp, -16
    sw   ra, 8(sp)
    li   s0, -1             # s0 = previous_value
    li   s1, 1              # s1 = passed (true)
    li   s2, 0              # s2 = fl, the symbol under test
    li   s3, 256            # loop while fl < 256
confirm:
    andi a0, s2, 0xff       # a0 = fl
    jal  ra, decode
    mv   t5, a0             # t5 = value
    mv   a0, t5
    jal  ra, encode
    mv   t6, a0             # t6 = fl2
    andi t4, s2, 0xff
    andi t6, t6, 0xff
    beq  t4, t6, ch1        # fl == fl2: round trip is fine
    mv   a0, s2             # report: fl
    li   a7, 1
    ecall
    la   a0, str1
    li   a7, 4
    ecall
    mv   a0, t5             # value
    li   a7, 1
    ecall
    la   a0, str2
    li   a7, 4
    ecall
    mv   a0, t6             # fl2
    li   a7, 1
    ecall
    la   a0, n
    li   a7, 4
    ecall
    li   s1, 0              # passed = false
ch1:
    blt  s0, t5, ch2        # previous_value < value: order is fine
    mv   a0, s2             # report: fl
    li   a7, 1
    ecall
    la   a0, str3
    li   a7, 4
    ecall
    mv   a0, t5             # value
    li   a7, 1
    ecall
    la   a0, str4
    li   a7, 4
    ecall
    mv   a0, s0             # previous_value
    li   a7, 1
    ecall
    la   a0, n
    li   a7, 4
    ecall
    li   s1, 0              # passed = false
ch2:
    mv   s0, t5             # previous_value = value
    addi s2, s2, 1          # fl++
    bltu s2, s3, confirm
    mv   a0, s1             # return passed
    lw   ra, 8(sp)
    addi sp, sp, 16
    ret
```
The last function calls `decode` and then `encode` to verify the codec:
is the re-encoded symbol equal to the original one?
It runs through all 256 uf8 symbols, decodes each one, and also checks that the decoded values keep increasing.
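
The same check can be sketched in C. The two `static` helpers are compact stand-ins for the codec (written for this example, not copied from the quiz), and `uf8_round_trip_ok` is a hypothetical name for the test routine.

```c
#include <stdbool.h>
#include <stdint.h>

/* Compact helper: decoded value = (m << e) + (2^e - 1) * 16. */
static uint32_t decode(uint8_t b)
{
    uint32_t e = b >> 4;
    return ((b & 0x0fu) << e) + (((1u << e) - 1u) << 4);
}

/* Compact helper: find the exponent whose range contains v. */
static uint8_t encode(uint32_t v)
{
    uint32_t e = 0, overflow = 0;
    while (e < 15 && v >= (overflow << 1) + 16) {
        overflow = (overflow << 1) + 16;
        e++;
    }
    return (uint8_t)((e << 4) | ((v - overflow) >> e));
}

/* Returns true when all 256 symbols round-trip and decode monotonically. */
bool uf8_round_trip_ok(void)
{
    int64_t previous = -1;
    for (unsigned fl = 0; fl < 256; fl++) {
        uint32_t value = decode((uint8_t)fl);
        if (encode(value) != fl)
            return false; /* symbol does not encode back to itself */
        if ((int64_t)value <= previous)
            return false; /* decoded values must strictly increase */
        previous = value;
    }
    return true;
}
```

A return value of `true` corresponds to the assembly program printing 1.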
## FULL CODE
```c
.data
str1: .string ": produces value "
str2: .string " but encodes back to "
n:    .asciz "\n"
str3: .string ": value "
str4: .string " <= previous_value "
msg:  .asciz "\n"
.text
.globl main
main:
    jal  ra, bool
    li   a7, 1              # print the pass flag returned in a0
    ecall
    li   a7, 10             # exit
    ecall
clz:
    addi sp, sp, -16
    sw   ra, 12(sp)
    li   t0, 16             # c = 16 (current shift distance)
    li   t1, 32             # n = 32
loop:
    beq  t0, zero, end      # leave the loop when c == 0
    srl  a1, a0, t0         # y = x >> c
    beq  a1, zero, decision # if y == 0, only halve c
    sub  t1, t1, t0         # n -= c
    mv   a0, a1             # x = y
decision:
    srli t0, t0, 1          # c >>= 1
    j    loop
end:
    sub  a0, t1, a0         # return n - x
    lw   ra, 12(sp)
    addi sp, sp, 16
    ret                     # return to the caller
decode:
    addi sp, sp, -16
    sw   ra, 12(sp)
    andi t0, a0, 0x0f       # m = b mod 16 (mantissa)
    srli t1, a0, 4          # e = b / 16 (exponent)
    addi t2, t1, -15
    sub  t2, zero, t2       # t2 = 15 - e
    li   t3, 0x7fff         # 2^15 - 1, the value for the largest exponent
    srl  t3, t3, t2         # (2^15 - 1) >> (15 - e) = 2^e - 1
    slli t3, t3, 4          # offset = (2^e - 1) * 16
    sll  t0, t0, t1         # m << e
    add  a0, t0, t3         # decoded value = (m << e) + offset
    lw   ra, 12(sp)
    addi sp, sp, 16
    ret
encode:
    addi sp, sp, -16
    sw   ra, 12(sp)
    mv   t5, a0             # t5 = value
    li   t1, 16
    bltu t5, t1, small      # value < 16: the symbol is the value itself
fex:
    mv   a0, t5
    jal  ra, clz
    add  t0, a0, zero
    addi t0, t0, -31
    sub  t0, zero, t0       # msb = 31 - clz(value)
    addi t1, t0, -4         # first guess: exp = msb - 4
    blt  t1, x0, tu0        # exp < 0: start from exp = 0
    li   t0, 15
    blt  t0, t1, tu15       # exp > 15: clamp to 15
    j    nextf
tu0:
    li   t1, 0
    j    nextf
tu15:
    li   t1, 15
nextf:
    li   t2, 1
    sll  t2, t2, t1
    addi t2, t2, -1         # 2^exp - 1 (t1 is exp)
    slli t2, t2, 4          # overflow = (2^exp - 1) * 16
adj:
    sltu t0, zero, t1       # t0 = (exp > 0)
    mv   t4, t5
    sltu t3, t4, t2         # t3 = (value < overflow)
    and  t0, t0, t3         # t0 = 1 only when both conditions hold
    addi t3, zero, 1
    bne  t0, t3, fae        # guess is not too high: refine upwards
    addi t2, t2, -16
    srli t2, t2, 1          # overflow = (overflow - 16) >> 1
    addi t1, t1, -1         # exp--
    j    adj
fae:
    addi t0, zero, 15
    bgeu t1, t0, fend       # stop at exp == 15
    slli t2, t2, 1
    addi t2, t2, 16         # t2 = next_overflow = (overflow << 1) + 16
    bltu t5, t2, fend       # value < next_overflow: exp is exact
    addi t1, t1, 1          # exp++
    j    fae
fend:
    sub  t0, t5, t2         # t2 may already hold next_overflow here; the
    srl  t0, t0, t1         # extra 16 << exp only changes bits above the
    slli t1, t1, 4          # 4-bit mantissa, which the mask removes
    andi t0, t0, 0xf        # mantissa
    add  t3, t1, t0         # symbol = (exp << 4) | mantissa
fin:
    andi a0, t3, 0xff
    lw   ra, 12(sp)
    addi sp, sp, 16
    ret
small:
    mv   a0, t5
    lw   ra, 12(sp)
    addi sp, sp, 16
    ret
bool:
    addi sp, sp, -16
    sw   ra, 8(sp)
    li   s0, -1             # s0 = previous_value
    li   s1, 1              # s1 = passed (true)
    li   s2, 0              # s2 = fl, the symbol under test
    li   s3, 256            # loop while fl < 256
confirm:
    andi a0, s2, 0xff       # a0 = fl
    jal  ra, decode
    mv   t5, a0             # t5 = value
    mv   a0, t5
    jal  ra, encode
    mv   t6, a0             # t6 = fl2
    andi t4, s2, 0xff
    andi t6, t6, 0xff
    beq  t4, t6, ch1        # fl == fl2: round trip is fine
    mv   a0, s2             # report: fl
    li   a7, 1
    ecall
    la   a0, str1
    li   a7, 4
    ecall
    mv   a0, t5             # value
    li   a7, 1
    ecall
    la   a0, str2
    li   a7, 4
    ecall
    mv   a0, t6             # fl2
    li   a7, 1
    ecall
    la   a0, n
    li   a7, 4
    ecall
    li   s1, 0              # passed = false
ch1:
    blt  s0, t5, ch2        # previous_value < value: order is fine
    mv   a0, s2             # report: fl
    li   a7, 1
    ecall
    la   a0, str3
    li   a7, 4
    ecall
    mv   a0, t5             # value
    li   a7, 1
    ecall
    la   a0, str4
    li   a7, 4
    ecall
    mv   a0, s0             # previous_value
    li   a7, 1
    ecall
    la   a0, n
    li   a7, 4
    ecall
    li   s1, 0              # passed = false
ch2:
    mv   s0, t5             # previous_value = value
    addi s2, s2, 1          # fl++
    bltu s2, s3, confirm
    mv   a0, s1             # return passed
    lw   ra, 8(sp)
    addi sp, sp, 16
    ret
```
## Result

The result shows that the whole loop is correct.
The execution info on the single-cycle processor is below.

This is the execution info on the 5-stage processor.

Although the 5-stage CPU needs more cycles, it is more efficient, since pipelining allows a shorter clock period.
## Pipeline
### Stage 1: Instruction Fetch (IF)

Fetch the next instruction from the Instruction Memory and update the Program Counter (PC) to point to the next instruction.
### Stage 2: Instruction Decode and Register Fetch (ID)

Decode the instruction (determine its operation type such as add, lw, jal, etc.) and read operands from the Register File.
### Stage 3: Execute (EX)

Perform ALU operations (addition, subtraction, multiplication, shift, or logic). For memory instructions, calculate the effective address. Branch decisions are also made in this stage.
### Stage 4: Memory Access (MEM)

Access Data Memory if the instruction is lw or sw. For other instructions, this stage simply passes the result forward.
### Stage 5: Write Back (WB)

Write the result of computation or memory access back to the Register File, e.g., writing the ALU result to the destination register rd.

### Observation
Looking at this part of the code, we can see a hazard: while **blt** is in the EX stage, the following **mv** is already in the ID stage, so when the branch is taken we need to flush **mv** and **jal** so that execution continues correctly from the branch target.
### Hazard
1. Data hazard: one instruction depends on data produced by a previous one.
2. Control hazard (branch): the branch decision (jump target) is not known yet.

My code runs into both hazards. When **jal** jumps to its target, the preceding **mv a0,t5** has not yet written back a0; this is a data hazard. When **blt** is taken, the **mv** after it has already been fetched; this is a control hazard. Flushing or inserting a bubble resolves these hazards.
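
To get a feeling for what these flushes cost, here is a small back-of-the-envelope model in C. It is only a sketch under stated assumptions: a classic 5-stage pipeline with forwarding, branches and jumps resolved in EX (so each taken one flushes 2 instructions), and one stall cycle per load-use dependency. The function name and parameters are hypothetical, and the simulator's real counts may differ.

```c
#include <stdint.h>

/*
 * Estimated cycle count for an idealised 5-stage pipeline.
 * Assumptions (not measured): 4 cycles to fill the pipeline,
 * 2 flushed slots per taken branch or jump, 1 stall per load-use pair.
 */
uint64_t pipeline_cycles(uint64_t instructions,
                         uint64_t taken_branches,
                         uint64_t load_use_stalls)
{
    if (instructions == 0)
        return 0;
    return instructions + 4 + 2 * taken_branches + load_use_stalls;
}
```

Under this model, 10 instructions with 2 taken branches and 1 load-use stall take 19 cycles, while the same 10 instructions on a single-cycle design take 10 (much longer) cycles.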
# Problem C
Assignment 1: RISC-V Assembly and Instruction Pipeline. Problem: refer to [Quiz1 of Computer Architecture (2025 Fall) Problem C](https://hackmd.io/@sysprog/arch2025-quiz1-sol)
## Explanation

## solution
### Test data
#### Test 1

Input: 0xff23, 0x0345
sqrt input: 0x3f80
Output in bfloat16:
add: 0xff23, sub:, mul: 0xc2fa, div: 0xff80, sqrt: 0x3f80

#### Test 2
Input: 0x5438, 3842
sqrt input: 0x3f80

Output in bfloat16:
add: 0x5438, sub: 0x5438, mul: 0x4d0b, div: 0x5bff, sqrt: 0x56df
#### Test 3
Input: 0x3f80, 0x3f00
sqrt input: 0x3f80


Output in bfloat16:
add: 0x3fc0, sub: 0x3f00, mul: 0x3f00, div: 0x407f, sqrt: 0x3f80
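
To read these hex patterns as numbers, a bfloat16 value can be widened to a 32-bit float by placing it in the upper 16 bits. The helper below is a small sketch for checking test values by hand; `bf16_to_float` is an illustrative name, and it assumes `float` is an IEEE 754 binary32.

```c
#include <stdint.h>
#include <string.h>

/* Widen a bfloat16 bit pattern to float (assumes IEEE 754 binary32). */
float bf16_to_float(uint16_t bits)
{
    uint32_t wide = (uint32_t)bits << 16; /* bf16 is the top half of f32 */
    float f;
    memcpy(&f, &wide, sizeof f);          /* reinterpret without aliasing issues */
    return f;
}
```

With this helper, the Test 3 inputs `0x3f80` and `0x3f00` read as 1.0 and 0.5, and `0x3fc0` reads as 1.5. The recorded `div` result `0x407f` reads as 3.984375, whereas 1.0 / 0.5 is exactly 2.0 (`0x4000`), so that case may be worth rechecking.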
### ASSEMBLY CODE
```c
# Problem C
.data
smask:.half 0x8000
emask:.half 0x7f80
mmask:.half 0x007f #all unsigned
ebias:.word 127
nan:.half 0x7fc0
bzero:.half 0x0000
.text
.globl main
main:
li a0,0x3f80
li a1,0x3f00
jal add
li t6,0xffff
and a0,a0,t6
slli a0,a0,16
li a7,34
ecall
li a0,0x3f80
li a1,0x3f00
jal sub
li t6,0xffff
and a0,a0,t6
slli a0,a0,16
li a7,34
ecall
li a0 ,0x3f80
li a1,0x3f00
jal mul
li t6,0xffff
and a0,a0,t6
slli a0,a0,16
li a7,34
ecall
li a0 ,0x3f80
li a1,0x3f00
jal div
li t6,0xffff
and a0,a0,t6
slli a0,a0,16
li a7,34
ecall
li a0,0x3f80
jal sqrt0
li t6,0xffff
and a0,a0,t6
slli a0,a0,16
li a7,34
ecall
li a7,10
ecall
isnan:
addi sp,sp,-16
sw ra ,12(sp)
la t1,emask
lhu t0,0(t1)
and t1,a0,t0
srli t1,t1,7
sltiu t2,t1,0xff #t2 = 1 if the shifted exponent is less than 0xff
xori t2,t2,0x01 #invert: t2 = 1 only when the exponent is 0xff
la t1,mmask
lhu t0,0(t1)
and t3,a0,t0 #mantissa produce
sltu t3,zero,t3 #mantissa!=0 1
and a0,t2,t3
lw ra ,12(sp)
addi sp,sp,16
ret
isinf:
addi sp,sp,-16
sw ra ,12(sp)
la t1,emask
lhu t0,0(t1)
and t1,a0,t0
srli t1,t1,7
sltiu t2,t1,0xff #t2 = 1 if the shifted exponent is less than 0xff
xori t2,t2,0x01 #invert: t2 = 1 only when the exponent is 0xff
la t1,mmask
lhu t0,0(t1)
and t3,a0,t0 #mantissa produce
sltu t3,zero,t3 #mantissa!=0 1
xori t3,t3,1
and a0,t2,t3
lw ra ,12(sp)
addi sp,sp,16
ret
iszero:
addi sp,sp,-16
sw ra ,12(sp)
li t0,0x7fff
and t1,a0,t0
sltu t1,zero,t1
xori a0,t1,0x01
lw ra,12(sp)
addi sp,sp,16
ret
f322bf16:
addi sp,sp,-16
sw ra ,12(sp)
mv t0,a0
srli t2,a0,23 #extract the 8 exponent bits; if they are 0xff, just shift the original value right by 16 (logical shift)
andi t2,t2,0xff
li t1,0xff
beq t2,t1,q
srli t0,a0,16
andi t0,t0,1
li t1,0x7fff
add t0,t0,t1
add t0,t0,a0
q:
srli a0,t0,16
lw ra,12(sp)
addi sp,sp,16
ret
b162f32:
addi sp,sp,-16
sw ra ,12(sp)
slli a0,a0,16
lw ra ,12(sp)
addi sp,sp,16
ret
unpackbf16:
srli t0,a0,15 #sign a = t0
andi t0,t0,1
srli t1,a1,15 #sign b = t1
andi t1,t1,1
srli t2,a0,7 #exp a= t2
andi t2,t2,0xff
srli t3,a1,7 #exp b = t3
andi t3,t3,0xff
andi t4,a0,0x7f #mantissa a=t4
andi t5,a1,0x7f #mantissa b=t5
mv s0,t0
mv s1,t1
mv s2,t2
mv s3,t3
mv s4,t4
mv s5,t5
ret
add:
addi sp,sp,-48
sw ra ,12(sp)
sw s0,0(sp)
sw s1,4(sp)
sw s2,8(sp)
sw s3,16(sp)
sw s4,20(sp)
sw s5,24(sp)
sw a0,28(sp)
sw a1,32(sp)
jal ra,unpackbf16
li t0,0xff
bne s2,t0,other
bne s4,zero,reta
bne s3,t0,reta
sltu t1,x0,s5
xor t2,s0,s1
sltiu t2,t2,1
or t3,t1,t2
bne t3,x0,retb
li a0,0x7fc0
j last
other:
li t0,255
bne s3,t0,add1 #exp_b != 0xff: continue; otherwise b is Inf/NaN, return b
j retb
add1:
bne s2,x0,add2
beq s4,x0,retb #a is zero (exp_a == 0 and mant_a == 0): return b
add2:
bne s3,x0,x80
beq s5,x0,reta #b is zero (exp_b == 0 and mant_b == 0): return a
x80:
beq s2,x0,x802
ori s4,s4,128
x802:
beq s3,x0,doother
ori s5,s5,128
doother:
sub t0,s2,s3 #t0 is exp diff
mv t6,t0
bge x0,t0,expdif1
mv t1,s2 #t1 is result exp
li t2,8
blt t2,t0,reta
srl s5,s5,t0
j sign
expdif1:
beq t6,x0,expdif2
mv t1,s3
li t2,-8
blt t0,t2,retb
sub t0,x0,t0
srl s4,s4,t0
j sign
expdif2:
mv t1,s2
j sign
sign:
bne s0,s1,sign2
mv t2,s0 #result sign =signa t2,result sign
add t3,s4,s5 #result mant = t3
andi t4,t3,0x100
beq t4,x0,psign
srli t3,t3,1
addi t1,t1,1
li t5,0xff
blt t1,t5,psign
slli t4,t2,15
li t5,0x7f80
or a0,t4,t5
j last
psign:
slli t4,t2,15
andi t5,t1,0xff
slli t5,t5,7
andi t6,t3,0x7f
or t4,t4,t5
or a0,t4,t6
j last
sign2:
blt s4,s5,bch2
mv t2,s0
sub t3,s4,s5
j almost
bch2:
mv t2,s1
sub t3,s5,s4
almost:
beq t3,x0,ret0
j slt
slt:
loop:
andi t4,t3,0x80
bne t4,x0,gg
slli t3,t3,1
addi t1,t1,-1
bge zero,t1,ret0
j loop
gg:
slli t4,t2,15
andi t5,t1,0xff
slli t5,t5,7
andi t6,t3,0x7f
or t4,t4,t5
or a0,t4,t6
j last
reta:
lw a0,28(sp)
j last
retb:
lw a0,32(sp)
j last
ret0:
li a0,0
j last
last:
lw ra ,12(sp)
lw s0,0(sp)
lw s1,4(sp)
lw s2,8(sp)
lw s3,16(sp)
lw s4,20(sp)
lw s5,24(sp)
addi sp,sp,48
ret
sub:
addi sp,sp,-16
sw ra,12(sp)
la t1,smask
lhu t1,0(t1)
xor a1,a1,t1
jal ra ,add
lw ra,12(sp)
addi sp,sp,16
ret
mul:
addi sp,sp,-48
sw ra ,12(sp)
sw s0,0(sp)
sw s1,4(sp)
sw s2,8(sp)
sw s3,16(sp)
sw s4,20(sp)
sw s5,24(sp)
sw a0, 28(sp)
sw a1,32(sp)
jal ra,unpackbf16
xor t0,s0,s1 #t0 is result sign
li t1,0xff
bne s2,t1,mult2
bne s4,x0,mreta
bne s3,x0,mult1
beq s5,x0,retn
mult1:
slli t2,t0,15
li t1,0x7f80
or a0,t1,t2
j multend
mult2:
bne s3,t1,mult3
bne s5,x0,mretb
bne s2,x0,mult1
beq s4,x0,retn
mult3:
or t1,s2,s4
beq t1,x0,mret0
or t1,s3,s5
beq t1,x0,mret0
mult4:
mv t2,x0 #expadjust=0
beq s2,x0,mloop
j else
mloop:
andi t1,s4,0x80
bne t1,x0,expa1
slli s4,s4,1
addi t2,t2,-1
j mloop
expa1:
li t1,1
mv s2,t1
j else
else:
ori s4,s4,0x80
mult5:
beq s3,x0,mloop1
j else1
mloop1:
andi t1,s5,0x80
bne t1,x0,expb1
slli s5,s5,1
addi t2,t2,-1
j mloop1
expb1:
li t1,1
mv s3,t1
j else1
else1:
ori s5,s5,0x80
newm:
mv t5,s4
mv t1,s5
mv t6,x0
muloop:
andi t3,t1,1
beq t3,x0,nadd
add t6,t6,t5 #t6 is result mant
nadd:
srli t1,t1,1
slli t5,t5,1
bne t1,x0,muloop
reltexp:
add t5,s2,s3
addi t5,t5,-127
add t5,t5,t2 #t5 is result exp
findmant:
li t1,0x8000
and t1,t6,t1
beq t1,x0,felse
srli t6,t6,8
andi t6,t6,0x7f
addi t5,t5,1
j findexp
felse:
srli t6,t6,7
andi t6,t6,0x7f
findexp:
li t1,0xff
blt t5,t1,elsep
slli t0,t0,15
li t1,0x7f80
or a0,t0,t1
j multend
elsep:
blt x0,t5,mlast
li t1,-6
blt t5,t1,mnext
li t1,1
sub t1,t1,t5
srl t6,t6,t1
mv t5,x0
j mlast
mnext:
slli a0,t0,15
j multend
mlast:
slli t0,t0,15
andi t6,t6,0x7f
andi t5,t5,0xff
slli t5,t5,7
or t0,t0,t5
or a0,t0,t6
j multend
mreta:
lw a0, 28(sp)
j multend
mretb:
lw a0, 32(sp)
j multend
retn:
li a0,0x7fc0
j multend
mret0:
slli a0,t0,15
j multend
multend:
lw ra ,12(sp)
lw s0,0(sp)
lw s1,4(sp)
lw s2,8(sp)
lw s3,16(sp)
lw s4,20(sp)
lw s5,24(sp)
addi sp,sp,48
ret
div:
addi sp,sp,-48
sw ra ,12(sp)
sw s0,0(sp)
sw s1,4(sp)
sw s2,8(sp)
sw s3,16(sp)
sw s4,20(sp)
sw s5,24(sp)
sw a0,28(sp)
sw a1,32(sp)
jal ra,unpackbf16
xor t0,s0,s1 #t0 is result sign
li t1,0xff
bne s3,t1,div2
bne s5,x0,dretb
bne s2,t1,nextd
beq s4,x0,dret0
j nextd
nextd:
slli a0,t0,15
j divend
div2:
bne s3,x0,div3
bne s5,x0,div3
bne s2,x0,ret1nf
beq s4,x0,dret0
j ret1nf
div3:
bne s2,t1,div4
bne s4,x0,dreta
slli t0,t0,15
li t1,0x7f80
or a0,t0,t1
j divend
div4:
bne s2,x0,div5
bne s4,x0,div5
j nextd
div5:
beq x0,s2,div6
ori s4,s4,0x80
div6:
beq x0,s3,dpset
ori s5,s5,0x80
dpset:
slli t3,s4,15 #t3 is dividend
mv t4,s5 #t4 is divisor
mv t5,x0 #t5 is quotient
mv t1,x0
li t2,16
dloop:
slli t5,t5,1
sub t6,x0,t1
addi t6,t6,15
srl t6,t4,t6
blt t3,t6,divjump
sub t6,x0,t1
addi t6,t6,15
srl t6,t4,t6
sub t3,t3,t6
ori t5,t5,1
j divjump
divjump:
addi t1,t1,1
blt t1,t2,dloop
chexp:
sub t1,s2,s3 #t1 is result exp
addi t1,t1,127
bne s2,x0,chexp2
addi t1,t1,-1
j chexp2
chexp2:
bne s3,x0,qucheck
addi t1,t1,1
j qucheck
qucheck:
li t2,0x8000
and t2,t5,t2
bne t2,x0,rl8
qucheck2:
li t3,1
bge t3,t1,rl8
slli t5,t5,1
addi t1,t1,-1
li t2,0x8000
and t2,t5,t2
beq t2,x0,qucheck2
rl8:
srli t5,t5,8
andq:
andi t5,t5,0x7f
lastcheck:
li t3,0xff
bge t1,t3,ret1nf
bge x0,t1,nextd
slli t0,t0,15
andi t1,t1,0xff
slli t1,t1,7
andi t5,t5,0x7f
or t0,t0,t1
or a0,t0,t5
j divend
dreta:
lw a0,28(sp)
j divend
dretb:
lw a0,32(sp)
j divend
dret0:
la t1,nan
lhu a0,0(t1)
j divend
ret1nf:
slli t0,t0,15
li t1,0x7f80
or a0,t0,t1
j divend
divend:
lw ra ,12(sp)
lw s0,0(sp)
lw s1,4(sp)
lw s2,8(sp)
lw s3,16(sp)
lw s4,20(sp)
lw s5,24(sp)
addi sp,sp,48
ret
sqrt0:
addi sp,sp,-16
sw ra ,12(sp)
srli t0,a0,15
andi t0,t0,1 # t0 is sign
srli t1,a0,7
andi t1,t1,0xff #t1 is exp
andi t2,a0,0x7f # t2 is mant
hs:
li t3,0xff
bne t3,t1,sqr
bne t2,x0,sreta
j ches
ches:
bne t0,x0,sretn
j sreta
sqr:
bne t1,x0,sqrn
beq t2,x0,sret0
j sqrn
sqrn:
bne t0,x0,sretn
j denorm
denorm:
beq t1,x0,sret0
j algo
algo:
addi t3,t1,-127 #t3 is e
ori t4,t2,0x80 #t4 is m
andi t5,t3,1
bne t5,x0,selse
srai t5,t3,1
addi t6,t5,127 #t6 is new exp
j binarysearch
selse:
slli t4,t4,1
addi t5,t3,-1
srai t5,t5,1
addi t6,t5,127
binarysearch:
li t0 ,90
li t1,256
li t2,128
sloop:
blt t1,t0,normlize
add t5,t0,t1
srli t5,t5,1
mul t3,t5,t5
srli t3,t3,7
blt t4,t3,selse2
mv t2,t5
addi t0,t5,1
j sloop
selse2:
addi t1,t5,-1
j sloop
normlize:
li t0,256
blt t2,t0,elseif
srli t2,t2,1
addi t6,t6,1
j sfmant
elseif:
li t0,128
bge t2,t0,sfmant
li t0,1
loopnew:
li t1,128
bge t2,t1,sfmant
bge t0,t6,sfmant
slli t2,t2,1
addi t6,t6,-1
j loopnew
sfmant:
li t0,0xff
bge t6,t0,sret07f80
bge x0,t6,sret0
j retnew
sreta:
mv a0,a0
j sqrtend
sretn:
la t3,nan
lhu a0,0(t3)
j sqrtend
sret0:
li a0,0
j sqrtend
sret07f80:
li a0,0x7f80
j sqrtend
retnew:
andi t1,t6,0xff
slli t1,t1,7
andi t2,t2,0x7f
or a0,t1,t2
j sqrtend
sqrtend:
lw ra ,12(sp)
addi sp,sp,16
ret
```
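
The `f322bf16` routine above keeps the upper 16 bits of a float after adding a rounding bias. A C sketch of the same steps follows; it assumes IEEE 754 binary32 floats and round-to-nearest-even on the discarded bits, and the name `f32_to_bf16` is illustrative.

```c
#include <stdint.h>
#include <string.h>

/* Convert float to bfloat16 bits with round-to-nearest-even. */
uint16_t f32_to_bf16(float value)
{
    uint32_t bits;
    memcpy(&bits, &value, sizeof bits);
    if (((bits >> 23) & 0xffu) == 0xffu)
        return (uint16_t)(bits >> 16);  /* Inf/NaN: truncate, no rounding */
    bits += ((bits >> 16) & 1u) + 0x7fffu; /* bias depends on the kept LSB */
    return (uint16_t)(bits >> 16);
}
```

A value exactly halfway between two bfloat16 neighbours goes to the one with an even last bit: `1.00390625f` (1 + 2^-8) becomes `0x3f80`, while `1.01171875f` (1 + 3 * 2^-8) becomes `0x3f82`.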
## Result

This is the execution info.
# [leetcode 190. Reverse Bits](https://leetcode.com/problems/reverse-bits/)
## Explanation
Reverse the bits of a given 32-bit signed integer.

Example 1:
Input: n = 43261596
Output: 964176192
Explanation:

| Integer | Binary |
| --- | --- |
| 43261596 | 00000010100101000001111010011100 |
| 964176192 | 00111001011110000010100101000000 |

Example 2:
Input: n = 2147483644
Output: 1073741822
Explanation:

| Integer | Binary |
| --- | --- |
| 2147483644 | 01111111111111111111111111111100 |
| 1073741822 | 00111111111111111111111111111110 |

Constraints:
- `0 <= n <= 2^31 - 2`
- n is even.
## solution
I have three versions of C code for this LeetCode problem.
**version 1**:
```c
int reverseBits(int n) {
    /* Split n into four bytes: x is the highest byte, u the lowest. */
    int x = n >> 24;
    int y = n >> 16;
    int z = n >> 8;
    int u = n;
    x = x & 0xff;
    y = y & 0xff;
    z = z & 0xff;
    u = u & 0xff;
    int t = 0;
    int s = 0;
    int p = 0;
    int q = 0;
    /* Reverse the 8 bits of each byte separately. */
    for (int i = 0; i < 8; i++) {
        int bit;
        int v = 1;
        bit = x & v;
        x >>= 1;
        bit <<= (7 - i);
        t |= bit;
    }
    x = t;
    for (int i = 0; i < 8; i++) {
        int bit;
        int v = 1;
        bit = y & v;
        y >>= 1;
        bit <<= (7 - i);
        s |= bit;
    }
    y = s;
    for (int i = 0; i < 8; i++) {
        int bit;
        int v = 1;
        bit = z & v;
        z >>= 1;
        bit <<= (7 - i);
        p |= bit;
    }
    z = p;
    for (int i = 0; i < 8; i++) {
        int bit;
        int v = 1;
        bit = u & v;
        u >>= 1;
        bit <<= (7 - i);
        q |= bit;
    }
    u = q;
    /* Swap the byte order: the lowest byte becomes the highest. */
    u <<= 24;
    z <<= 16;
    y <<= 8;
    int answer = x + y + z + u;
    return answer;
}
```
My first idea was to divide the 32-bit integer into four bytes, reverse the bits inside each byte, and then swap the byte order so that the first byte ends up last, and so on. That gives the first version.
**version 2**:
```c
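int reverseBits(int n) {
    unsigned int t = 0;
    unsigned int u = (unsigned int)n;
    for (int i = 0; i < 32; i++) {
        unsigned int bit;
        bit = u & 1u;       /* take the lowest bit */
        u >>= 1;
        bit <<= (31 - i);   /* move it to the mirrored position */
        t |= bit;
    }
    return (int)t;
}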
```
This is my second version. Version 1 felt too complicated, so I decided to use a single loop over the whole 32 bits: in each of the 32 iterations, the lowest bit is taken and moved to bit position 31 - i, which gives the final answer.
**version 3**:
```c
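int reverseBits(int n) {
    unsigned int u = (unsigned int)n;   /* unsigned avoids signed-shift issues */
    u = ((u & 0x55555555u) << 1) | ((u >> 1) & 0x55555555u);
    u = ((u & 0x33333333u) << 2) | ((u >> 2) & 0x33333333u);
    u = ((u & 0x0f0f0f0fu) << 4) | ((u >> 4) & 0x0f0f0f0fu);
    u = ((u & 0x00ff00ffu) << 8) | ((u >> 8) & 0x00ff00ffu);
    u = ((u & 0x0000ffffu) << 16) | ((u >> 16) & 0x0000ffffu);
    return (int)u;
}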
```
The last version runs in a fixed number of steps, O(1), while versions 1 and 2 loop over the bits and are O(n) in the number of bits. This version uses only bitwise operations and no longer needs a loop.
There are no branches inside, so it avoids control hazards.
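
As a quick check of the mask-and-swap idea, here is a self-contained C sketch on `uint32_t`. The name `reverse_bits32` is illustrative (LeetCode's own signature is `reverseBits`).

```c
#include <stdint.h>

/* Reverse 32 bits by swapping neighbours of width 1, 2, 4, 8, then 16. */
uint32_t reverse_bits32(uint32_t n)
{
    n = ((n & 0x55555555u) << 1) | ((n >> 1) & 0x55555555u);
    n = ((n & 0x33333333u) << 2) | ((n >> 2) & 0x33333333u);
    n = ((n & 0x0f0f0f0fu) << 4) | ((n >> 4) & 0x0f0f0f0fu);
    n = ((n & 0x00ff00ffu) << 8) | ((n >> 8) & 0x00ff00ffu);
    return (n << 16) | (n >> 16);
}
```

Each line swaps blocks that are twice as wide as in the previous line, so five steps cover all 32 bits. Applying the function twice returns the original value.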
### assembly code
**TEST 1**:
INPUT:2147483644

OUTPUT:1073741822
**TEST 2**:
INPUT:43261596

OUTPUT:964176192
**TEST 3**:
INPUT:1084732

OUTPUT:1018234880
```c
.data
oddevenmask:.word 0x55555555
twobitsmask:.word 0x33333333
fourbitsmask:.word 0x0f0f0f0f
eightbitsmask:.word 0x00ff00ff
.text
.globl main
main:
li a0,1084732
la t0,oddevenmask
lw t0,0(t0)
la t1,twobitsmask
lw t1,0(t1)
la t2,fourbitsmask
lw t2,0(t2)
la t3,eightbitsmask
lw t3,0(t3)
srli t4,a0,1
and t4,t4,t0
and t5,a0,t0
slli t5,t5,1
or a0,t5,t4
srli t4,a0,2
and t4,t4,t1
and t5,a0,t1
slli t5,t5,2
or a0,t5,t4
srli t4,a0,4
and t4,t4,t2
and t5,a0,t2
slli t5,t5,4
or a0,t5,t4
srli t4,a0,8
and t4,t4,t3
and t5,a0,t3
slli t5,t5,8
or a0,t5,t4
srli t4,a0,16
slli t5,a0,16
or a0,t4,t5
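li a7,34 # print the reversed value in hexadecimal
ecall
li a7,10 # exit so execution does not run past the end of the program
ecall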
```