# 阿貴 Midterm
{%hackmd BJrTq20hE %}

## True/False Questions

### 2018
- [ ] (a.) Performance constraint is not one of the characteristics of embedded systems.
- [ ] (b.) Compiler enhancements can lower instruction count and CPI.
- [ ] (c.) Static energy consumption occurs because of leakage current that flows even when a transistor is off.
- [ ] (d.) For branches, we assume that we won't want to branch too far, so we can specify the change in PC.
- [ ] (e.) An ISA (Instruction Set Architecture) with a register-to-register model is slower than one with a register-to-memory model.
- [ ] (f.) In multiplication, we can combine the Multiplier register and the Product register, and then shift the product to the left instead of shifting the multiplicand to the right.
- [ ] (g.) By register convention, each register also has a name to make it easier to code.
- [ ] (h.) High MIPS implies high MFLOPS.
- [ ] (i.) In all RISC processors, R0 (register 0) is always zero.
- [ ] (j.) For a chip design with high volumes, the manufacturing process can be tuned to a particular design, increasing the yield.

### 2019
- [ ] (a.) Embedded systems are designed to run one application or one set of related applications.
- [ ] (b.) Throughput is not one of the characteristics of servers.
- [ ] (c.) "Yield" is the percentage of good dies out of the total number of dies on a wafer.
- [ ] (d.) The compiler can determine the number of computer instructions for each source-level statement.
- [ ] (e.) The programming language can affect CPI.
- [ ] (f.) An assembler is responsible for translating assembly instructions into binary code.
- [ ] (g.) MIPS RISC processors are little-endian architectures.
- [ ] (h.) An executable file contains no debugging information.
- [ ] (i.) One of the tasks of the loader is to copy the parameters to the main program onto the stack.
- [ ] (j.) Improvements in processor organization can lower the CPI.

---

> 1-5
> ![check Yourself](https://i.imgur.com/wbvq6ku.png)
> ![MFLOPS](https://i.imgur.com/G2XynCF.png)

---

1.
Two different compilers are being tested for a 4 GHz machine with three different classes of instructions: Class A, Class B, and Class C, which require one, two, and three cycles, respectively. Both compilers are used to produce code for a large piece of software.

==MIPS = Instruction count / (Execution time × 10^6^)==

The first compiler's code uses 5 million Class A instructions, 1 million Class B instructions, and 1 million Class C instructions.
The second compiler's code uses 10 million Class A instructions, 1 million Class B instructions, and 1 million Class C instructions.

> ![2021. hw1 ans](https://i.imgur.com/KLSeJnY.png)

---

2. Compilers can have a profound impact on the performance of an application. Assume that for a program, compiler A results in a dynamic instruction count of 1.0E9 and has an execution time of 1.1 s, while compiler B results in a dynamic instruction count of 1.2E9 and an execution time of 1.5 s.

> ![2021. hw1 2.a](https://i.imgur.com/sfqDUB9.png)
> ![2021. hw1 2.b.c](https://i.imgur.com/b91w5hz.png)

---

3. Consider two different implementations of the same instruction set architecture. The instructions can be divided into four classes according to their CPI (class A, B, C, and D). P1 has a clock rate of 2.5 GHz and CPIs of 1, 2, 3, and 3, and P2 has a clock rate of 3 GHz and CPIs of 2, 2, 2, and 2. Given a program with a dynamic instruction count of 1.0E6 instructions divided into classes as follows: 10% class A, 20% class B, 50% class C, and 20% class D, which implementation is faster?
   a. What is the global CPI for each implementation?
   b. Find the clock cycles required in both cases.

> ![](https://i.imgur.com/7ap0LYS.png)

---

4. Assume for arithmetic, load/store, and branch instructions, a processor has CPIs of 1, 12, and 5, respectively. Also assume that on a single processor a program requires the execution of 2.56E9 arithmetic instructions, 1.28E9 load/store instructions, and 256 million branch instructions.
Assume that each processor has a 2 GHz clock frequency. Assume that, as the program is parallelized to run over multiple cores, the number of arithmetic and load/store instructions per processor is divided by 0.7 × p (where p is the number of processors), but the number of branch instructions per processor remains the same.
https://quizlet.com/explanations/questions/assume-for-arithmetic-loadstore-and-branch-instructions-a-processor-has-cpis-of-1-12-and-5-respectiv-e7c4b145-0df9-4861-ad0b-59c90e62f76e

(a.) Find the total execution time for this program on 1, 2, 4, and 8 processors, and show the relative speedup of the 2, 4, and 8 processor results relative to the single processor result.

![](https://i.imgur.com/sDNpIJl.png)

(b.) If the CPI of the arithmetic instructions were doubled, what would the impact be on the execution time of the program on 1, 2, 4, or 8 processors?

![](https://i.imgur.com/m1uZk4V.png)

(c.) To what should the CPI of load/store instructions be reduced in order for a single processor to match the performance of four processors using the original CPI values?

![](https://i.imgur.com/XW8A8vi.png)

---

5. The following instruction is not included in the MIPS instruction set:

> rpt $t2, loop # if(R[rs]>0) R[rs]=R[rs]-1, PC=PC+4+BranchAddr

(a.) If this instruction were to be implemented in the MIPS instruction set, what is the most appropriate instruction format?

> ![](https://i.imgur.com/mWFnBKT.png)

(b.) What is the sequence of MIPS instructions that performs the same operation?

> ![](https://i.imgur.com/2Ej2foP.png)

---

6. Implement the following C code in MIPS assembly. What is the total number of MIPS instructions needed to execute the function?

```
int fib(int n){
    if (n==0)
        return 0;
    else if (n==1)
        return 1;
    else
        return fib(n-1) + fib(n-2);
}
```

> We assume $a0 is non-negative.
> ![](https://i.imgur.com/8jtoVOg.png)

---

7. IEEE 754-2008 contains a half precision that is only 16 bits wide.
The leftmost bit is still the sign bit, the exponent is 5 bits wide and has a bias of 15, and the mantissa is 10 bits long. A hidden 1 is assumed. Write down the bit pattern to represent −1.5625 × 10^−1^ assuming a version of this format, which uses an excess-16 format to store the exponent. Comment on how the range and accuracy of this 16-bit floating point format compare to the single precision IEEE 754 standard.
https://quizlet.com/explanations/questions/ieee-754-2008-contains-a-half-precision-that-is-only-16-bits-wide-the-left-most-bit-is-still-the-sig-72a04f26-120e-431f-9aa4-4791f290a30a

> ![](https://i.imgur.com/W3SjtKV.png)

---

8. Using the IEEE 754 floating point format, write down the bit pattern that would represent -1/4. Can you represent -1/4 exactly?

> ![](https://i.imgur.com/ZjXNuBm.png)

---

9. Assume $t0 holds the value 0x00101000. What is the value of $t2 after the following instructions?

```
slt $t2, $0, $t0
bne $t2, $0, ELSE
j DONE
ELSE: addi $t2, $t2, 2
DONE:
```

> ![](https://i.imgur.com/huYZGGk.png)

---

10.