History-Performance
Introduction to Computer Architecture
SCHOOL OF ELECTRICAL AND COMPUTER ENGINEERING
SUMMER 2015
RAMYAR SAEEDI
Course Outline
Computers!
Computer Architecture’s Components (Big Picture)
Performance Measurements
Digital System Design Background
Number Representation and Operation
MIPS Microprocessor and Assembly Language
Processor Design Components
Advanced Concepts (Pipeline, Control, Datapath)
Memory (Types, Cache…)
Input/output
….
Course Structure
Homework Assignments (4-5 sets) (30-35%)
One or two midterms (30-40%)
Final exam (40-45%)
Bonus project (10-15%)
Resources
Main: Computer Organization and Design: The Hardware/Software Interface, 5th Edition, David Patterson and John Hennessy
Digital Design and Computer Architecture: ARM Edition, Sarah Harris and David Harris
Course Structure(Cont.)
We will have two review sessions
Two workshops on assembly programming/digital design
3rd and 5th week of class
Office Hours: Tuesday-Thursday 2pm-3pm; Sloan 314
Programming background (e.g., C++, Java)
Contact: [email protected]
Homepage of Course: http://epsl.eecs.wsu.edu/introduction-tocomputer-architecture/
Homework and announcements will be posted on the homepage.
What You Will Learn
How programs are translated into the machine language
And how the hardware executes them
The hardware/software interface
What determines program performance
And how it can be improved
How hardware designers improve performance
What parallel processing is
Components of a Computer
Same components for all kinds of computer: desktop, server, embedded
Input/output includes
User-interface devices: display, keyboard, mouse
Storage devices: hard disk, CD/DVD, flash
Network adapters: for communicating with other computers
The BIG Picture
Programming Levels
High-level language
Level of abstraction closer to problem domain
Provides for productivity and portability
Assembly language
Textual representation of instructions
Hardware representation
Binary digits (bits)
Encoded instructions and data
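To make the three levels concrete, the sketch below encodes one assembly instruction into its binary form. It assumes the MIPS R-type format (MIPS appears in the course outline, but this slide does not name an ISA). The helper name `encode_r_type` and the small register table are illustrative and cover only the registers used here.

```python
# Register numbers for the few MIPS registers used in this example.
REGISTERS = {"$t0": 8, "$s1": 17, "$s2": 18}


def encode_r_type(rd: str, rs: str, rt: str, funct: int, shamt: int = 0) -> int:
    """Pack a MIPS R-type instruction into a 32-bit word.

    Field layout (most significant first):
    opcode(6) rs(5) rt(5) rd(5) shamt(5) funct(6)
    """
    opcode = 0  # R-type instructions use opcode 0
    return (
        (opcode << 26)
        | (REGISTERS[rs] << 21)
        | (REGISTERS[rt] << 16)
        | (REGISTERS[rd] << 11)
        | (shamt << 6)
        | funct
    )


# Assembly level: add $t0, $s1, $s2  (the funct code for add is 0x20)
word = encode_r_type(rd="$t0", rs="$s1", rt="$s2", funct=0x20)

print(f"{word:032b}")   # hardware level: the 32 bits of the instruction
print(f"0x{word:08x}")  # prints 0x02324020
```

A high-level statement such as `a = b + c` would typically be compiled into one or more instructions of this kind, which the assembler then turns into words like the one printed above.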
Inside the Processor
Apple A5
Different Components?
What do you think?
Memory?
Cores?
Response Time and Throughput
Response time
How long it takes to do a task
Throughput
Total work done per unit time
e.g., tasks/transactions/… per hour
How are response time and throughput affected by
Replacing the processor with a faster version?
Adding more processors?
We’ll focus on response time for now…
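The two questions above can be explored with a deliberately simple model: every task takes the same time, tasks are independent, and there is no queueing or communication overhead. These assumptions, and the function names `response_time` and `throughput`, are made up for illustration and are not from the slide. Real systems behave less cleanly.

```python
def response_time(task_seconds: float, speedup: float = 1.0) -> float:
    """Time to finish one task on one processor that is `speedup` times faster."""
    return task_seconds / speedup


def throughput(task_seconds: float, processors: int = 1, speedup: float = 1.0) -> float:
    """Tasks completed per second when each processor works on its own task."""
    return processors / response_time(task_seconds, speedup)


TASK_SECONDS = 2.0  # hypothetical task length

baseline = (response_time(TASK_SECONDS), throughput(TASK_SECONDS))
faster_cpu = (
    response_time(TASK_SECONDS, speedup=2.0),
    throughput(TASK_SECONDS, speedup=2.0),
)
more_cpus = (
    response_time(TASK_SECONDS),
    throughput(TASK_SECONDS, processors=2),
)

for label, (rt, tp) in [
    ("baseline", baseline),
    ("faster processor", faster_cpu),
    ("two processors", more_cpus),
]:
    print(f"{label}: response time {rt} s, throughput {tp} tasks/s")
```

In this model a faster processor improves both metrics, while adding processors improves only throughput. Once tasks have to wait in a queue, extra processors can also shorten the response time a user observes.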
Relative Performance
Define Performance = 1/Execution Time
“X is n times faster than Y”
\[ \frac{\text{Performance}_X}{\text{Performance}_Y} = \frac{\text{Execution time}_Y}{\text{Execution time}_X} = n \]
Example: time taken to run a program
10s on A, 15s on B
\[ \frac{\text{Execution time}_B}{\text{Execution time}_A} = \frac{15\,\text{s}}{10\,\text{s}} = 1.5 \]
So A is 1.5 times faster than B
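The same calculation as a minimal Python sketch. The function name `relative_performance` is illustrative. It relies only on the definition Performance = 1/Execution Time given above.

```python
def relative_performance(time_x: float, time_y: float) -> float:
    """Return n such that X is n times faster than Y.

    Since performance = 1 / execution time,
    performance_x / performance_y = time_y / time_x.
    """
    if time_x <= 0 or time_y <= 0:
        raise ValueError("execution times must be positive")
    return time_y / time_x


# The example from the slide: 10 s on A, 15 s on B.
n = relative_performance(time_x=10.0, time_y=15.0)
print(f"A is {n} times faster than B")  # prints: A is 1.5 times faster than B
```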
CPU Clocking
Operation of digital hardware governed by a constant-rate clock
[Figure: clock timing diagram showing the clock period, clock cycles, data transfer and computation, and state update]
Clock period: duration of a clock cycle
e.g., 250 ps = 0.25 ns = $250 \times 10^{-12}$ s
Clock frequency (rate): cycles per second
e.g., 4.0 GHz = 4000 MHz = $4.0 \times 10^{9}$ Hz
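Clock period and clock rate are reciprocals, so the two examples above describe the same clock. A small sketch of the conversion follows. The function names are illustrative.

```python
def period_to_rate(period_seconds: float) -> float:
    """Clock rate in Hz for a given clock period in seconds."""
    return 1.0 / period_seconds


def rate_to_period(rate_hz: float) -> float:
    """Clock period in seconds for a given clock rate in Hz."""
    return 1.0 / rate_hz


rate = period_to_rate(250e-12)   # 250 ps period
period = rate_to_period(4.0e9)   # 4.0 GHz clock

print(f"{rate / 1e9:.1f} GHz")    # prints: 4.0 GHz
print(f"{period * 1e12:.0f} ps")  # prints: 250 ps
```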
Performance Improvement
How?
Reducing number of clock cycles
Increasing clock rate
Hardware designer must often trade off clock rate against cycle count
\[ \text{CPU Time} = \text{CPU Clock Cycles} \times \text{Clock Cycle Time} = \frac{\text{CPU Clock Cycles}}{\text{Clock Rate}} \]
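The trade-off can be illustrated with a short sketch. The cycle counts and clock rates below are hypothetical numbers chosen for round results and are not from the slides: a redesign raises the clock rate but needs 1.2 times as many cycles for the same program.

```python
def cpu_time(clock_cycles: int, clock_rate_hz: float) -> float:
    """CPU time in seconds = clock cycles / clock rate."""
    return clock_cycles / clock_rate_hz


# Hypothetical baseline: 20 billion cycles at 2 GHz.
baseline = cpu_time(20_000_000_000, 2e9)

# Hypothetical redesign: clock rate doubled to 4 GHz,
# but the program now needs 1.2x as many cycles.
redesign = cpu_time(24_000_000_000, 4e9)

print(f"baseline: {baseline} s, redesign: {redesign} s")
print(f"speedup: {baseline / redesign:.2f}x")  # less than the 2x clock-rate gain
```

With these numbers the redesign takes 6 s instead of 10 s: the doubled clock rate yields a speedup of about 1.67 rather than 2, because part of the gain is spent on extra cycles.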
Instruction Count and CPI
Instruction Count for a program
Determined by program, ISA and compiler
Average cycles per instruction
Determined by CPU hardware
If different instructions have different CPI
Average CPI affected by instruction mix
\[ \text{Clock Cycles} = \text{Instruction Count} \times \text{Cycles per Instruction} \]
\[ \text{CPU Time} = \text{Instruction Count} \times \text{CPI} \times \text{Clock Cycle Time} = \frac{\text{Instruction Count} \times \text{CPI}}{\text{Clock Rate}} \]
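Both forms of the CPU time equation can be written as one-line functions. In this sketch the instruction count of one billion is a hypothetical value. The CPI of 2.0 and the 4.0 GHz / 250 ps clock reuse figures that appear elsewhere in these slides.

```python
def cpu_time_from_rate(instruction_count: int, cpi: float, clock_rate_hz: float) -> float:
    """CPU time = instruction count x CPI / clock rate."""
    return instruction_count * cpi / clock_rate_hz


def cpu_time_from_cycle_time(instruction_count: int, cpi: float, cycle_time_s: float) -> float:
    """CPU time = instruction count x CPI x clock cycle time."""
    return instruction_count * cpi * cycle_time_s


INSTRUCTIONS = 1_000_000_000  # hypothetical program size

t_rate = cpu_time_from_rate(INSTRUCTIONS, cpi=2.0, clock_rate_hz=4.0e9)
t_cycle = cpu_time_from_cycle_time(INSTRUCTIONS, cpi=2.0, cycle_time_s=250e-12)

# A 4.0 GHz clock has a 250 ps period, so both forms agree (about 0.5 s).
print(f"{t_rate:.3f} s, {t_cycle:.3f} s")
```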
CPI Example
Computer A: Cycle Time = 250ps, CPI = 2.0
Computer B: Cycle Time = 500ps, CPI = 1.2
Same ISA
Which is faster, and by how much?
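\[ \text{CPU Time}_A = \text{Instruction Count} \times \text{CPI}_A \times \text{Cycle Time}_A = I \times 2.0 \times 250\,\text{ps} = I \times 500\,\text{ps} \]
A is faster…
\[ \text{CPU Time}_B = \text{Instruction Count} \times \text{CPI}_B \times \text{Cycle Time}_B = I \times 1.2 \times 500\,\text{ps} = I \times 600\,\text{ps} \]
\[ \frac{\text{CPU Time}_B}{\text{CPU Time}_A} = \frac{I \times 600\,\text{ps}}{I \times 500\,\text{ps}} = 1.2 \]
…by this much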
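A quick numerical check of this example. Because both computers run the same ISA, the instruction count I is the same for both and cancels in the ratio. The value used for it below is an arbitrary placeholder.

```python
def cpu_time_ps(instruction_count: int, cpi: float, cycle_time_ps: float) -> float:
    """CPU time in picoseconds = instruction count x CPI x cycle time."""
    return instruction_count * cpi * cycle_time_ps


I = 1_000_000  # arbitrary placeholder; cancels when taking the ratio

time_a = cpu_time_ps(I, cpi=2.0, cycle_time_ps=250)  # I x 500 ps
time_b = cpu_time_ps(I, cpi=1.2, cycle_time_ps=500)  # I x 600 ps

ratio = time_b / time_a
print(f"A is {ratio:.1f} times faster than B")  # prints: A is 1.2 times faster than B
```

Note that A wins despite its higher CPI, because its shorter cycle time more than compensates.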
CPI in More Detail
If different instruction classes take different numbers of cycles
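\[ \text{Clock Cycles} = \sum_{i=1}^{n} \left( \text{CPI}_i \times \text{Instruction Count}_i \right) \]
Weighted average CPI
\[ \text{CPI} = \frac{\text{Clock Cycles}}{\text{Instruction Count}} = \sum_{i=1}^{n} \left( \text{CPI}_i \times \frac{\text{Instruction Count}_i}{\text{Instruction Count}} \right) \]
where $\text{Instruction Count}_i / \text{Instruction Count}$ is the relative frequency of class $i$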
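The weighted-average form can be sketched directly from relative frequencies. The instruction classes, their CPIs, and the mix below are hypothetical values for illustration only. The function name `weighted_cpi` is likewise illustrative.

```python
def weighted_cpi(mix: dict[str, tuple[float, float]]) -> float:
    """Average CPI from a mapping of class name -> (CPI, relative frequency)."""
    total_frequency = sum(freq for _, freq in mix.values())
    if abs(total_frequency - 1.0) > 1e-9:
        raise ValueError("relative frequencies must sum to 1")
    return sum(cpi * freq for cpi, freq in mix.values())


# Hypothetical instruction mix: (CPI of the class, fraction of all instructions).
mix = {
    "ALU": (1, 0.5),
    "load/store": (2, 0.3),
    "branch": (3, 0.2),
}

average = weighted_cpi(mix)
print(f"average CPI = {average:.2f}")  # prints: average CPI = 1.70
```

Shifting the mix toward the cheaper classes lowers the average CPI even though no individual class gets faster.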
CPI Example
Alternative compiled code sequences using instructions in classes A, B, C
Class | A | B | C
CPI for class | 1 | 2 | 3
IC in sequence 1 | 2 | 1 | 2
IC in sequence 2 | 4 | 1 | 1
Sequence 1: IC = 5
Clock Cycles = 2×1 + 1×2 + 2×3 = 10
Avg. CPI = 10/5 = 2.0
Sequence 2: IC = 6
Clock Cycles = 4×1 + 1×2 + 1×3 = 9
Avg. CPI = 9/6 = 1.5
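The two sequences can be checked with a few lines of Python. The data comes from the example above. The function and variable names are illustrative.

```python
# CPI for each instruction class, from the example.
CPI_BY_CLASS = {"A": 1, "B": 2, "C": 3}


def clock_cycles(counts: dict[str, int]) -> int:
    """Total cycles = sum over classes of CPI_i x instruction count_i."""
    return sum(CPI_BY_CLASS[cls] * n for cls, n in counts.items())


def average_cpi(counts: dict[str, int]) -> float:
    """Average CPI = total cycles / total instruction count."""
    return clock_cycles(counts) / sum(counts.values())


sequence_1 = {"A": 2, "B": 1, "C": 2}
sequence_2 = {"A": 4, "B": 1, "C": 1}

for name, seq in [("Sequence 1", sequence_1), ("Sequence 2", sequence_2)]:
    print(f"{name}: IC = {sum(seq.values())}, "
          f"cycles = {clock_cycles(seq)}, avg CPI = {average_cpi(seq)}")
```

Sequence 2 executes more instructions (6 versus 5) yet needs fewer clock cycles (9 versus 10), so instruction count alone does not predict which sequence is faster.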
Performance Summary
Performance depends on
Algorithm: affects IC, possibly CPI
Programming language: affects IC, CPI
Compiler: affects IC, CPI
Instruction set architecture: affects IC, CPI, Tc (clock cycle time)
\[ \text{CPU Time} = \frac{\text{Instructions}}{\text{Program}} \times \frac{\text{Clock cycles}}{\text{Instruction}} \times \frac{\text{Seconds}}{\text{Clock cycle}} \]
Amdahl’s Law
Pitfall: improving one aspect of a computer and expecting a proportional improvement in overall performance
\[ T_{\text{improved}} = \frac{T_{\text{affected}}}{\text{improvement factor}} + T_{\text{unaffected}} \]
Example: multiply accounts for 80s/100s
How much improvement in multiply performance to get 5× overall?
\[ 20 = \frac{80}{n} + 20 \]
Can’t be done!
Corollary: make the common case fast
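The multiply example can be reproduced with a short sketch of Amdahl's Law. The function names `improved_time` and `required_factor` are illustrative. `required_factor` returns `None` when no finite improvement factor can reach the target.

```python
from typing import Optional


def improved_time(t_affected: float, t_unaffected: float, factor: float) -> float:
    """Amdahl's Law: T_improved = T_affected / improvement factor + T_unaffected."""
    return t_affected / factor + t_unaffected


def required_factor(t_affected: float, t_unaffected: float,
                    target_time: float) -> Optional[float]:
    """Improvement factor needed to reach target_time, or None if unreachable."""
    remaining = target_time - t_unaffected  # time budget left for the affected part
    if remaining <= 0:
        return None  # the unaffected part alone already uses up the target
    return t_affected / remaining


# Multiply accounts for 80 s of a 100 s run; a 5x overall speedup means 20 s total.
total, multiply = 100.0, 80.0
target = total / 5

print(required_factor(multiply, total - multiply, target))  # prints: None

# Even a 1000x faster multiply leaves the run just above 20 s.
print(improved_time(multiply, total - multiply, factor=1000.0))
```

The 20 s that multiply does not touch sets a floor on the run time, which is why the common case is the part worth speeding up.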