# A-Gui Midterm
{%hackmd BJrTq20hE %}
## True/False Questions
### 2018
- [ ] (a.) Performance constraints are not a characteristic of embedded systems.
- [ ] (b.) Compiler enhancements can lower instruction count and CPI.
- [ ] (c.) Static energy consumption occurs because of leakage current that flows even when a transistor is off.
- [ ] (d.) For branches, we assume that we won't want to branch too far, so we can specify a change in PC.
- [ ] (e.) An ISA (Instruction Set Architecture) with a register-to-register model is slower than one with a register-to-memory model.
- [ ] (f.) In multiplication, we can combine the Multiplier register and the Product register, and then shift the product left instead of shifting the multiplicand right.
- [ ] (g.) By register convention, each register also has a name to make it easier to code.
- [ ] (h.) High MIPS implies high MFLOPS.
- [ ] (i.) In all RISC processors, R0 (register 0) is always zero.
- [ ] (j.) For a chip design, with high volumes, the manufacturing process can be tuned to a particular design, increasing the yield.
### 2019
- [ ] (a.) Embedded systems are designed to run one application or one set of related applications.
- [ ] (b.) Throughput is not one of the characteristics of servers.
- [ ] (c.) "yield" is the percentage of good dies from the total number of dies in a wafer.
- [ ] (d.) Compiler can determine the number of computer instructions for each source-level statement.
- [ ] (e.) The programming language can affect CPI.
- [ ] (f.) An assembler is responsible for translating assembly instructions into binary code.
- [ ] (g.) MIPS RISC processors are little endian architectures.
- [ ] (h.) An executable file contains no debugging information.
- [ ] (i.) One of the tasks of the loader is to copy the parameters to the main program onto the stack.
- [ ] (j.) Improvement in processor organizations can lower the CPI.
---
> 1-5


---
1. Two different compilers are being tested for a 4 GHz machine with three different classes of instructions: Class A, Class B, and Class C, which require one, two, and three cycles (respectively). Both compilers are used to produce code for a large piece of software.
==MIPS = Instruction count / (Execution time × 10^6^)==
The first compiler's code uses 5 million Class A instructions, 1 million Class B instructions, and 1 million Class C instructions.
The second compiler's code uses 10 million Class A instructions, 1 million Class B instructions, and 1 million Class C instructions.
> 
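
The figures above can be plugged into the highlighted MIPS formula directly. The following is a minimal sketch, not an official solution; `evaluate` and the constant names are hypothetical helpers introduced for illustration. It suggests that the second compiler gets the higher MIPS rating (3200 vs. 2800) even though its code runs longer (3.75 ms vs. 2.5 ms).

```python
CLOCK_HZ = 4e9                                # 4 GHz machine
CYCLES_PER_CLASS = {"A": 1, "B": 2, "C": 3}   # cycles per instruction class

def evaluate(counts):
    """Return (execution time in seconds, MIPS rating) for one instruction mix."""
    instructions = sum(counts.values())
    cycles = sum(CYCLES_PER_CLASS[cls] * n for cls, n in counts.items())
    seconds = cycles / CLOCK_HZ
    mips = instructions / (seconds * 1e6)     # MIPS = IC / (time x 10^6)
    return seconds, mips

compiler1 = evaluate({"A": 5e6, "B": 1e6, "C": 1e6})
compiler2 = evaluate({"A": 10e6, "B": 1e6, "C": 1e6})

for name, (seconds, mips) in (("compiler 1", compiler1), ("compiler 2", compiler2)):
    print(f"{name}: {seconds * 1e3:.2f} ms, {mips:.0f} MIPS")
```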
---
2. Compilers can have a profound impact on the performance of an application. Assume that for a program, compiler A results in a dynamic instruction count of 1.0E9 and has an execution time of 1.1 s, while compiler B results in a dynamic instruction count of 1.2E9 and an execution time of 1.5 s.
> 
> 
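
The sub-questions for this problem are not recorded here, so the sketch below only shows how an average CPI could be derived from the two given quantities. The 1 ns clock cycle time is an assumption made purely for illustration (it is not stated above), and `average_cpi` is a hypothetical helper.

```python
CYCLE_TIME_S = 1e-9   # ASSUMPTION: 1 ns clock cycle, not given in the question as recorded

def average_cpi(instruction_count, execution_time_s, cycle_time_s=CYCLE_TIME_S):
    """CPI = execution time / (instruction count x clock cycle time)."""
    return execution_time_s / (instruction_count * cycle_time_s)

cpi_a = average_cpi(1.0e9, 1.1)   # compiler A
cpi_b = average_cpi(1.2e9, 1.5)   # compiler B

print(f"CPI with compiler A: {cpi_a:.2f}")
print(f"CPI with compiler B: {cpi_b:.2f}")
```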
---
3. Consider two different implementations of the same instruction set architecture.
The instructions can be divided into four classes according to their CPI (class A, B, C, and D). P1 has a clock rate of 2.5 GHz and CPIs of 1, 2, 3, and 3, and P2 has a clock rate of 3 GHz and CPIs of 2, 2, 2, and 2. Given a program with a dynamic instruction count of 1.0E6 instructions divided into classes as follows: 10% class A, 20% class B, 50% class C, and 20% class D, which implementation is faster?
a. What is the global CPI for each implementation?
b. Find the clock cycles required in both cases.
> 
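
Parts (a) and (b) reduce to a weighted average followed by two multiplications. A minimal sketch of that arithmetic, with `analyze` and the constant names as hypothetical helpers:

```python
MIX = [0.10, 0.20, 0.50, 0.20]   # fraction of instructions in class A, B, C, D
INSTRUCTIONS = 1.0e6             # dynamic instruction count

def analyze(clock_hz, cpis):
    """Return (global CPI, clock cycles, execution time in seconds)."""
    global_cpi = sum(frac * cpi for frac, cpi in zip(MIX, cpis))
    cycles = global_cpi * INSTRUCTIONS
    return global_cpi, cycles, cycles / clock_hz

p1 = analyze(2.5e9, [1, 2, 3, 3])
p2 = analyze(3.0e9, [2, 2, 2, 2])

for name, (cpi, cycles, seconds) in (("P1", p1), ("P2", p2)):
    print(f"{name}: CPI = {cpi:.2f}, cycles = {cycles:.3g}, time = {seconds * 1e3:.3f} ms")
```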
---
4. Assume for arithmetic, load/store, and branch instructions, a processor has CPIs of 1, 12, and 5, respectively. Also assume that on a single processor a program requires the execution of 2.56E9 arithmetic instructions, 1.28E9 load/store instructions, and 256 million branch instructions.
Assume that each processor has a 2 GHz clock frequency. Assume that, as the program is parallelized to run over multiple cores, the number of arithmetic and load/store instructions per processor is divided by 0.7 x p (where p is the number of processors) but the number of branch instructions per processor remains the same.
https://quizlet.com/explanations/questions/assume-for-arithmetic-loadstore-and-branch-instructions-a-processor-has-cpis-of-1-12-and-5-respectiv-e7c4b145-0df9-4861-ad0b-59c90e62f76e
(a.) Find the total execution time for this program on 1, 2, 4, and 8 processors, and show the relative speedup of the 2, 4, and 8 processor result relative to the single processor result.

(b.) If the CPI of the arithmetic instructions was doubled, what would the impact be on the execution time of the program on 1, 2, 4, or 8 processors?

(c.) To what should the CPI of load/store instructions be reduced in order for a single processor to match the performance of four processors using the original CPI values?
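
All three parts use the same cycle-count expression, so they can be sketched together. This is one reading of the problem rather than a definitive solution: it assumes the 0.7 x p divisor applies only when p > 1 (the single-processor run uses the stated instruction counts unchanged), and `exec_time` is a hypothetical helper.

```python
CLOCK_HZ = 2e9
COUNTS = {"arith": 2.56e9, "ldst": 1.28e9, "branch": 256e6}
BASE_CPI = {"arith": 1, "ldst": 12, "branch": 5}
PROCESSORS = (1, 2, 4, 8)

def exec_time(p, cpi):
    """Execution time in seconds on p processors."""
    # ASSUMPTION: with p == 1 the original counts are used (no 0.7 factor).
    divisor = 1.0 if p == 1 else 0.7 * p
    parallel = (COUNTS["arith"] * cpi["arith"] + COUNTS["ldst"] * cpi["ldst"]) / divisor
    serial = COUNTS["branch"] * cpi["branch"]   # branch count per processor is unchanged
    return (parallel + serial) / CLOCK_HZ

# (a) execution time and speedup relative to one processor
times = {p: exec_time(p, BASE_CPI) for p in PROCESSORS}
speedups = {p: times[1] / times[p] for p in PROCESSORS if p > 1}

# (b) same runs with the arithmetic CPI doubled
doubled = {p: exec_time(p, {**BASE_CPI, "arith": 2}) for p in PROCESSORS}

# (c) load/store CPI that lets one processor match the four-processor time
target_cycles = times[4] * CLOCK_HZ
ldst_cpi_needed = (target_cycles
                   - COUNTS["arith"] * BASE_CPI["arith"]
                   - COUNTS["branch"] * BASE_CPI["branch"]) / COUNTS["ldst"]

for p in PROCESSORS:
    print(f"p={p}: {times[p]:.2f} s, with doubled arithmetic CPI {doubled[p]:.2f} s")
print({p: round(s, 2) for p, s in speedups.items()})
print(f"load/store CPI needed: {ldst_cpi_needed:.2f}")
```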

---
5. The following instruction is not included in the MIPS instruction set:
> `rpt $t2, loop # if (R[rs] > 0) R[rs] = R[rs] - 1, PC = PC + 4 + BranchAddr`
(a.) If this instruction were to be implemented in the MIPS instruction set, what is the most appropriate instruction format?
> 
(b.) What is the sequence of MIPS instructions that performs the same operation?
> 
---
6. Implement the following C code in MIPS assembly. What is the total number of MIPS instructions needed to execute the function?
```
int fib(int n) {
    if (n == 0)
        return 0;
    else if (n == 1)
        return 1;
    else
        return fib(n - 1) + fib(n - 2);
}
```
> We assume $a0 is non-negative.
> 
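
The total number of executed instructions depends on the particular MIPS implementation chosen, but in any straightforward translation it grows with the number of times `fib` is invoked. The sketch below counts only those invocations; `fib_calls` is a hypothetical helper, and the loop checks the count against the closed form 2·fib(n+1) − 1 for small n.

```python
def fib(n):
    """Direct Python transcription of the C function above."""
    if n == 0:
        return 0
    if n == 1:
        return 1
    return fib(n - 1) + fib(n - 2)

def fib_calls(n):
    """Number of fib() invocations (including the first) needed to compute fib(n)."""
    if n < 2:
        return 1
    return 1 + fib_calls(n - 1) + fib_calls(n - 2)

for n in range(10):
    # Each call does a roughly constant amount of work, so the dynamic
    # instruction count is about (instructions per call) x fib_calls(n).
    assert fib_calls(n) == 2 * fib(n + 1) - 1
    print(n, fib(n), fib_calls(n))
```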
---
7. IEEE 754-2008 contains a half precision that is only 16 bits wide. The leftmost bit is still the sign bit, the exponent is 5 bits wide and has a bias of 15, and the mantissa is 10 bits long. A hidden 1 is assumed.
Write down the bit pattern to represent −1.5625 × 10^−1^ assuming a version of this format, which uses an excess-16 format to store the exponent.
Comment on how the range and accuracy of this 16-bit floating point format compares to the single precision IEEE 754 standard.
https://quizlet.com/explanations/questions/ieee-754-2008-contains-a-half-precision-that-is-only-16-bits-wide-the-left-most-bit-is-still-the-sig-72a04f26-120e-431f-9aa4-4791f290a30a
> 
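
The encoding asked for above can be checked mechanically. The following is a minimal sketch, not a full IEEE 754 encoder: `encode_half` is a hypothetical helper that handles only normalized, nonzero values and takes the exponent bias as a parameter, so the excess-16 variant from the question can be compared with the standard bias of 15. It assumes exponent field values 0 and 31 are reserved, as in IEEE 754. Python's `struct` format `"e"` (IEEE binary16) is used only as a cross-check for the bias-15 case.

```python
import math
import struct

def encode_half(value, bias):
    """Encode a normalized value as sign (1 bit) | exponent (5 bits) | mantissa (10 bits)."""
    if value == 0:
        raise ValueError("zero is not a normalized number")
    sign = 1 if value < 0 else 0
    frac, exp = math.frexp(abs(value))        # abs(value) == frac * 2**exp, 0.5 <= frac < 1
    exponent = exp - 1                        # rescale so the significand reads 1.xxx
    mantissa = round((frac * 2 - 1) * 1024)   # 10 fraction bits after the hidden 1
    if mantissa == 1024:                      # rounding carried into the hidden bit
        mantissa, exponent = 0, exponent + 1
    biased = exponent + bias
    if not 0 < biased < 31:                   # ASSUMPTION: 0 and 31 are reserved
        raise ValueError("value is not representable as a normalized number")
    return (sign << 15) | (biased << 10) | mantissa

value = -1.5625e-1                            # equals -1.01 (binary) x 2^-3
excess16 = encode_half(value, bias=16)
bias15 = encode_half(value, bias=15)
reference = struct.unpack(">H", struct.pack(">e", value))[0]

print(f"excess-16: {excess16:016b}")
print(f"bias 15:   {bias15:016b}")
print(f"struct 'e': {reference:016b}")
```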
---
8. Using the IEEE 754 floating point format, write down the bit pattern that would represent -1/4. Can you represent -1/4 exactly?
> 
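
Since −1/4 = −1.0 × 2^−2^, it fits the single precision format with an all-zero fraction. A short sketch that lets Python's `struct` module produce the bit pattern and confirms that the stored value is exact (`float32_bits` is a hypothetical helper name):

```python
import struct
from fractions import Fraction

def float32_bits(x):
    """Return the IEEE 754 single precision encoding of x as a 32-bit integer."""
    return struct.unpack(">I", struct.pack(">f", x))[0]

bits = float32_bits(-0.25)
pattern = f"{bits:032b}"
sign, exponent, fraction = pattern[0], pattern[1:9], pattern[9:]

# Decode the stored value again and compare it with the exact rational -1/4.
stored = struct.unpack(">f", struct.pack(">f", -0.25))[0]
is_exact = Fraction(stored) == Fraction(-1, 4)

print(sign, exponent, fraction)   # sign bit, biased exponent (-2 + 127), fraction bits
print(hex(bits), is_exact)
```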
---
9. Assume $t0 holds the value 0x00101000. What is the value of $t2 after the following instructions?
```
      slt  $t2, $0, $t0
      bne  $t2, $0, ELSE
      j    DONE
ELSE: addi $t2, $t2, 2
DONE:
```
> 
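
The snippet can be traced by hand, or with a small model of the three instructions involved. This is only a sketch of the control flow as written above, not a MIPS simulator; `to_signed32` and `run` are hypothetical helpers.

```python
def to_signed32(x):
    """Interpret a 32-bit pattern as a two's complement signed integer."""
    x &= 0xFFFFFFFF
    return x - (1 << 32) if x & 0x80000000 else x

def run(t0):
    """Return the final value of $t2 for a given initial $t0."""
    t2 = 1 if 0 < to_signed32(t0) else 0   # slt $t2, $0, $t0  (signed compare)
    if t2 != 0:                            # bne $t2, $0, ELSE (taken when $t2 != 0)
        t2 = t2 + 2                        # ELSE: addi $t2, $t2, 2
    # otherwise "j DONE" skips the addi and $t2 stays 0
    return t2

result = run(0x00101000)
print(result)
```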
---
10.