Chapter 1 Computer Abstractions and Technology
1.1 Intro
Algorithm
- Determines both the number of source-level statements and the number of I/O operations executed
- Not in this class
Programming language, compiler, and architecture
- Determines the number of computer instructions for each source-level statement
- Chapters 2, 3
Processor and memory system
- Determines how fast instructions can be executed
- Chapters 4, 5, 6
I/O system (hardware and operating system)
- Determines how fast I/O operations may be executed
- Chapters 4, 5, 6
1.2 Eight great ideas
- Design for Moore’s Law
- integrated circuit resources double every 18–24 months
- Use abstraction to simplify design
- lower-level details are hidden to offer a simpler model at higher levels
- Make the common case fast
- Making the common case fast will tend to enhance performance better than optimizing the rare case
- Performance via parallelism
- Performance via pipelining
- Performance via prediction
- it can be faster on average to guess and start working rather than wait until you know for sure
- Hierarchy of memories
- cache, main memory, secondary memory
- Dependability via redundancy
- we make systems dependable by including redundant components that can take over when a failure occurs and that help detect failures
1.3 Below your program
- Application software
- Written in high-level language.
- System software
- OS: Service code
- Handle I/O
- Manage memory and storage
- schedule tasks and share resources
- Compiler: Translates high-level language to assembly code.
- Hardware
- Processor, memory, I/O controllers
1.4 Under the Covers
1.5 Technologies for Building Processors and Memory
- Electronics technology continues to evolve.
Integrated circuit (IC) cost
\(cost\ per\ die = \frac{cost\ per\ wafer}{dies\ per\ wafer \times yield}\)
\(dies\ per\ wafer \approx \frac{wafer\ area}{die\ area}\)
\(yield = \frac{1}{(1+defects\ per\ area \times die\ area/2)^2}\)
- Nonlinear relation to area and defect rate
- Wafer cost and area are fixed.
- Defect rate determined by manufacturing process
- Die area determined by architecture and circuit design
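The three cost formulas above can be chained together to see the nonlinear effect of die area. The following is a minimal sketch; the function names and all input numbers (wafer cost, wafer area, defect density) are hypothetical values chosen for illustration, not figures from the notes.

```python
def dies_per_wafer(wafer_area: float, die_area: float) -> float:
    """Approximate die count: wafer area / die area (ignores edge losses)."""
    return wafer_area / die_area


def die_yield(defects_per_area: float, die_area: float) -> float:
    """Fraction of good dies: 1 / (1 + defects per area * die area / 2)^2."""
    return 1.0 / (1.0 + defects_per_area * die_area / 2.0) ** 2


def cost_per_die(wafer_cost: float, wafer_area: float,
                 die_area: float, defects_per_area: float) -> float:
    """Cost per die = cost per wafer / (dies per wafer * yield)."""
    good_dies = (dies_per_wafer(wafer_area, die_area)
                 * die_yield(defects_per_area, die_area))
    return wafer_cost / good_dies


# Hypothetical inputs (assumptions for illustration only).
WAFER_COST = 1000.0       # cost per wafer, arbitrary currency units
WAFER_AREA = 700.0        # cm^2
DEFECTS_PER_AREA = 0.5    # defects per cm^2

small = cost_per_die(WAFER_COST, WAFER_AREA, 1.0, DEFECTS_PER_AREA)
large = cost_per_die(WAFER_COST, WAFER_AREA, 2.0, DEFECTS_PER_AREA)

print(f"cost per die at 1.0 cm^2: {small:.2f}")
print(f"cost per die at 2.0 cm^2: {large:.2f}")
print(f"cost ratio: {large / small:.2f}")
```

With these assumed inputs, doubling the die area raises the cost per die by a factor of about 2.88 rather than 2, because a larger die both reduces the die count and lowers the yield.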
1.6 Performance
- Response time (aka execution time): the time between the start and completion of a task
- Throughput (aka bandwidth): the total amount of work done in a given time
\(performance = \frac{1}{execution\ time}\)
- Computer X is \(n\) times faster than Y.
\(\frac{performance_X}{performance_Y}=\frac{execution\ time_Y}{execution\ time_X} = n\)
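As a small worked illustration of the relative-performance formula, the sketch below uses two hypothetical execution times (10 s on X, 15 s on Y); the function names are assumptions introduced for this example.

```python
def performance(execution_time_s: float) -> float:
    """Performance is defined as the reciprocal of execution time."""
    return 1.0 / execution_time_s


def times_faster(time_x_s: float, time_y_s: float) -> float:
    """Return n such that X is n times faster than Y."""
    return time_y_s / time_x_s


# Hypothetical measurements: the same program on two computers.
time_x = 10.0  # seconds on computer X
time_y = 15.0  # seconds on computer Y

n = times_faster(time_x, time_y)
ratio = performance(time_x) / performance(time_y)

print(f"X is {n:.1f} times faster than Y")
print(f"performance ratio: {ratio:.1f}")
```

Both expressions give the same n, which is the point of the identity: a ratio of performances is the inverse ratio of execution times.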
- Response time, elapsed time or wall clock time
- Total time to complete a task
- System performance
- CPU (execution) time
- The actual time the CPU spends computing a task
- Comprises user CPU time and system CPU time.
- CPU performance
- Operation of digital hardware governed by a constant-rate CPU clock
- Clock period: duration of a clock cycle
- Clock frequency (clock rate): cycles per second
- Hz
\(CPU\ time = CPU\ clock\ cycles \times clock\ cycle\ time = \frac{CPU\ clock\ cycles}{clock\ rate}\)
- Performance improved by
- Reducing the number of clock cycles
- Increasing the clock rate
- Hardware designers must often trade off clock rate against cycle count
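The trade-off between clock rate and cycle count can be made concrete with a short calculation. This is a sketch under assumed numbers: a program that takes 10 s on a 2 GHz computer A, and a hypothetical redesign B that must finish in 6 s but needs 1.2 times as many clock cycles.

```python
def cpu_time(clock_cycles: float, clock_rate_hz: float) -> float:
    """CPU time = CPU clock cycles / clock rate."""
    return clock_cycles / clock_rate_hz


# Assumed figures for computer A.
time_a_s = 10.0
rate_a_hz = 2e9
cycles_a = time_a_s * rate_a_hz          # cycles = CPU time * clock rate

# Hypothetical design B: faster clock, but 1.2x the cycle count.
target_time_b_s = 6.0
cycles_b = 1.2 * cycles_a
required_rate_b_hz = cycles_b / target_time_b_s

print(f"cycles on A: {cycles_a:.3g}")
print(f"cycles on B: {cycles_b:.3g}")
print(f"required clock rate for B: {required_rate_b_hz / 1e9:.1f} GHz")
```

Under these assumptions B needs a 4 GHz clock: the clock rate must double to cut the time from 10 s to 6 s, because part of the gain is lost to the extra cycles.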
\(clock\ cycles = instruction\ count \times CPI\)
\(CPU\ time = instruction\ count \times CPI \times clock\ cycle\ time = \frac{instruction\ count \times CPI}{clock\ rate}\)
- Instruction count for a program
- Determined by program, ISA and compiler
- Clock cycles per instruction (CPI)
- Determined by CPU hardware
- If different instructions have different CPI, average CPI is affected by instruction mix.
CPI in More Detail
- If different instruction classes take different numbers of cycles
\(clock\ cycles = \mathop{\sum_{i=1}^{n}}(CPI_i \times instruction\ count_i)\)
- Weighted average CPI
\(CPI=\frac{clock\ cycles}{instruction\ count}= \mathop{\sum_{i=1}^{n}}(CPI_i \times \frac{instruction\ count_i}{instruction\ count})\)
\(CPU\ time =\frac{instructions}{program} \times \frac{clock\ cycles}{instructions} \times \frac{seconds}{clock\ cycle}\)
- Performance depends on
- Algorithm: affects instruction count (IC), possibly CPI
- Programming language: affects IC, CPI
- Compiler: affects IC, CPI
- Instruction set architecture: affects IC, CPI, clock rate
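The weighted-average CPI and the three-factor CPU time equation can be sketched as follows. The instruction classes, their CPI values, the two instruction mixes, and the 1 GHz clock are hypothetical inputs chosen for illustration.

```python
def total_cycles(cpi_by_class: dict, count_by_class: dict) -> int:
    """Clock cycles = sum over classes of CPI_i * instruction count_i."""
    return sum(cpi_by_class[c] * n for c, n in count_by_class.items())


def average_cpi(cpi_by_class: dict, count_by_class: dict) -> float:
    """Weighted average CPI = clock cycles / instruction count."""
    instruction_count = sum(count_by_class.values())
    return total_cycles(cpi_by_class, count_by_class) / instruction_count


def cpu_time(cpi_by_class: dict, count_by_class: dict,
             clock_rate_hz: float) -> float:
    """CPU time = instruction count * CPI / clock rate."""
    return total_cycles(cpi_by_class, count_by_class) / clock_rate_hz


# Hypothetical CPI per instruction class.
cpi = {"A": 1, "B": 2, "C": 3}

# Two hypothetical code sequences for the same task (instruction counts).
seq1 = {"A": 2, "B": 1, "C": 2}   # 5 instructions
seq2 = {"A": 4, "B": 1, "C": 1}   # 6 instructions

for name, seq in (("seq1", seq1), ("seq2", seq2)):
    print(name,
          "IC =", sum(seq.values()),
          "cycles =", total_cycles(cpi, seq),
          "CPI =", average_cpi(cpi, seq),
          "time @1GHz =", cpu_time(cpi, seq, 1e9), "s")
```

In this example the second sequence executes more instructions (6 versus 5) but needs fewer cycles (9 versus 10), so it is faster. Instruction count alone does not determine performance; the instruction mix changes the average CPI.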
1.7 The Power Wall
1.8 The Sea Change: The Switch from Uniprocessors to Multiprocessors
- More than one processor per chip
- Requires explicitly parallel programming
- Compare with instruction-level parallelism
- Hardware executes multiple instructions at once.
- Hidden from the programmer
- Hard to do
- Programming for performance
- Load balancing
- Optimizing communication and synchronization
1.9 Real Stuff: Benchmarking the Intel Core i7
1.10 Fallacies and Pitfalls
- Amdahl's Law
\(T_{improved} = \frac{T_{affected}}{improvement\ factor}+ T_{unaffected}\)
- Corollary: make the common case fast
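Amdahl's Law can be checked numerically with a short sketch. The workload below is hypothetical: a 100 s program in which 80 s is spent in an operation that can be sped up and 20 s is not affected.

```python
def improved_time(t_affected: float, t_unaffected: float,
                  improvement_factor: float) -> float:
    """T_improved = T_affected / improvement factor + T_unaffected."""
    return t_affected / improvement_factor + t_unaffected


# Hypothetical workload: 100 s total, 80 s of it can be improved.
T_AFFECTED = 80.0
T_UNAFFECTED = 20.0
T_ORIGINAL = T_AFFECTED + T_UNAFFECTED

for factor in (2, 4, 16, 1e9):
    t_new = improved_time(T_AFFECTED, T_UNAFFECTED, factor)
    print(f"improvement factor {factor:g}: "
          f"time = {t_new:.2f} s, overall speedup = {T_ORIGINAL / t_new:.2f}")
```

However large the improvement factor, the time never drops below the 20 s unaffected part, so the overall speedup stays below 5 for this workload. This is why improving the part that dominates execution time (the common case) pays off most.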
- MIPS (millions of instructions per second) does not account for
- Differences in ISAs between computers
- Differences in complexity between instructions
\(MIPS = \frac{instruction\ count}{execution\ time \times 10^6}\\ =\frac{instruction\ count}{\frac{instruction\ count \times CPI}{clock\ rate}\times 10^6} = \frac{clock\ rate}{CPI \times 10^6}\)
- CPI varies between programs on a given CPU.
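The MIPS pitfall can be illustrated with a short sketch comparing two hypothetical computers with different ISAs that run the same task at the same clock rate. All instruction counts, CPI values, and the clock rate are assumptions made up for this example.

```python
def execution_time(instruction_count: float, cpi: float,
                   clock_rate_hz: float) -> float:
    """Execution time = instruction count * CPI / clock rate."""
    return instruction_count * cpi / clock_rate_hz


def mips(instruction_count: float, execution_time_s: float) -> float:
    """MIPS = instruction count / (execution time * 10^6)."""
    return instruction_count / (execution_time_s * 1e6)


CLOCK_RATE_HZ = 4e9  # assumed identical for both computers

# Hypothetical computer A: more, simpler instructions.
ic_a, cpi_a = 10e9, 1.0
# Hypothetical computer B: fewer, more complex instructions.
ic_b, cpi_b = 8e9, 1.1

time_a = execution_time(ic_a, cpi_a, CLOCK_RATE_HZ)
time_b = execution_time(ic_b, cpi_b, CLOCK_RATE_HZ)
mips_a = mips(ic_a, time_a)
mips_b = mips(ic_b, time_b)

print(f"A: time = {time_a:.2f} s, MIPS = {mips_a:.0f}")
print(f"B: time = {time_b:.2f} s, MIPS = {mips_b:.0f}")
```

Under these assumptions A has the higher MIPS rating, yet B finishes the task sooner, because MIPS ignores how much work each instruction does.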
- Cost/performance is improving.
- Due to underlying technology development
- Hierarchical layers of abstraction
- In both hardware and software
- Instruction set architecture
- The hardware/software interface
- Execution time: the best performance measurement
- Power is a limiting factor (skipped)
- Use parallelism to improve performance.
QE
1.3


1.5


1.6


1.7


1.9




1.12


1.13

1.14


1.15

