History-Performance


Introduction to Computer Architecture
SCHOOL OF ELECTRICAL AND COMPUTER ENGINEERING
SUMMER 2015
RAMYAR SAEEDI
Course Outline

Computers!
Computer Architecture’s Components (Big Picture)
Performance Measurements
Digital System Design Background
Number Representation and Operation
MIPS Microprocessor and Assembly Language
Processor Design Components
Advanced Concepts (Pipeline, Control, Datapath)
Memory (Types, Cache, …)
Input/Output
…
Course Structure

Homework assignments (4-5 sets) (30-35%)

One or two midterms (30-40%)

Final exam (40-45%)

Bonus project (10-15%)

Resources

Main: Computer Organization and Design: The Hardware/Software Interface, 5th Edition, David Patterson and John Hennessy

Digital Design and Computer Architecture: ARM Edition, Sarah Harris and David Harris
Course Structure (Cont.)

We will have two review sessions

Two workshops on assembly programming/digital design

3rd and 5th week of class

Office Hours: Tuesday-Thursday 2pm-3pm; Sloan 314

Programming background (e.g., C++, Java)

Contact: [email protected]

Homepage of Course: http://epsl.eecs.wsu.edu/introduction-tocomputer-architecture/

Homework and announcements will be posted on the homepage!
What You Will Learn

How programs are translated into the machine language

And how the hardware executes them

The hardware/software interface

What determines program performance

And how it can be improved

How hardware designers improve performance

What parallel processing is
Components of a Computer

Same components for all kinds of computer
  Desktop, server, embedded

Input/output includes
  User-interface devices
    Display, keyboard, mouse
  Storage devices
    Hard disk, CD/DVD, flash
  Network adapters
    For communicating with other computers
The BIG Picture
Programming Levels

High-level language
  Level of abstraction closer to problem domain
  Provides for productivity and portability

Assembly language
  Textual representation of instructions

Hardware representation
  Binary digits (bits)
  Encoded instructions and data
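To make the three levels concrete, here is a minimal sketch that follows one statement from high-level code to MIPS assembly to its 32-bit encoding. The statement, the register assignment (a in $t0, b in $s1, c in $s2), and the helper name `encode_r_type` are assumptions chosen for illustration and are not taken from the slides.

```python
# Hypothetical example: one statement at three programming levels.
HIGH_LEVEL = "a = b + c"            # high-level language (C-like)
ASSEMBLY = "add $t0, $s1, $s2"      # MIPS assembly, assuming a=$t0, b=$s1, c=$s2

# MIPS register numbers for the registers used above.
REGISTERS = {"$t0": 8, "$s1": 17, "$s2": 18}
ADD_FUNCT = 0x20                    # funct field that selects `add`


def encode_r_type(rd, rs, rt, funct, shamt=0, opcode=0):
    """Pack the six MIPS R-type fields into one 32-bit word."""
    return (opcode << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct


word = encode_r_type(REGISTERS["$t0"], REGISTERS["$s1"], REGISTERS["$s2"], ADD_FUNCT)
machine_code = format(word, "032b")  # hardware representation as bits

print(HIGH_LEVEL)
print(ASSEMBLY)
print(machine_code)                  # 00000010001100100100000000100000
```

The assembly line is only a textual form of the same instruction, so the assembler's job here amounts to looking up register numbers and packing fields.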
Inside the Processor

Apple A5

Different Components?

What do you think?

Memory?

Cores?
Response Time and Throughput

Response time


How long it takes to do a task
Throughput

Total work done per unit time



e.g., tasks/transactions/… per hour
How are response time and throughput affected by

Replacing the processor with a faster version?

Adding more processors?
We’ll focus on response time for now…
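The two questions above can be explored with a small idealized model. This is a sketch under stated assumptions: tasks are independent, there is no queueing, and the 10 s task time is a made-up number. In a real system with queued tasks, adding processors can also shorten response time.

```python
def response_time(task_seconds):
    """Time for a single task to finish on one processor (no queueing)."""
    return task_seconds


def throughput_per_hour(task_seconds, processors=1):
    """Tasks finished per hour when independent tasks run in parallel."""
    return processors * 3600 / task_seconds


# Hypothetical baseline: one processor, 10 s per task.
baseline = (response_time(10), throughput_per_hour(10))
# Faster processor: each task now takes 5 s.
faster_cpu = (response_time(5), throughput_per_hour(5))
# Second processor of the original speed.
more_cpus = (response_time(10), throughput_per_hour(10, processors=2))

print(baseline)    # (10, 360.0)
print(faster_cpu)  # (5, 720.0)
print(more_cpus)   # (10, 720.0)
```

In this model the faster processor improves both measures, while the extra processor improves only throughput.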
Relative Performance

Define Performance = 1/Execution Time

“X is n times faster than Y”
$$\frac{\text{Performance}_X}{\text{Performance}_Y} = \frac{\text{Execution time}_Y}{\text{Execution time}_X} = n$$

Example: time taken to run a program



10s on A, 15s on B
$$\frac{\text{Execution Time}_B}{\text{Execution Time}_A} = \frac{15\,\text{s}}{10\,\text{s}} = 1.5$$
So A is 1.5 times faster than B
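The definition and the example can be checked with a few lines of code. The function names `performance` and `times_faster` are hypothetical, introduced only for this sketch.

```python
def performance(execution_time):
    """Performance is defined as the reciprocal of execution time."""
    return 1 / execution_time


def times_faster(time_x, time_y):
    """Return n such that X is n times faster than Y."""
    return performance(time_x) / performance(time_y)


n = times_faster(time_x=10, time_y=15)  # A takes 10 s, B takes 15 s
print(f"A is {n:.1f} times faster than B")  # A is 1.5 times faster than B
```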
CPU Clocking

Operation of digital hardware governed by a constant-rate clock
[Figure: clock timing diagram showing the clock period, clock cycles, data transfer and computation, and state update]

Clock period: duration of a clock cycle

e.g., 250 ps = 0.25 ns = $250 \times 10^{-12}$ s

Clock frequency (rate): cycles per second

e.g., 4.0 GHz = 4000 MHz = $4.0 \times 10^{9}$ Hz
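The two examples describe the same clock, since period and frequency are reciprocals of each other. A minimal sketch of the conversion, with hypothetical helper names:

```python
def clock_rate_hz(period_seconds):
    """Clock frequency is the reciprocal of the clock period."""
    return 1.0 / period_seconds


def clock_period_s(rate_hz):
    """Clock period is the reciprocal of the clock frequency."""
    return 1.0 / rate_hz


rate = clock_rate_hz(250e-12)   # 250 ps period
period = clock_period_s(4.0e9)  # 4.0 GHz clock

print(f"{rate / 1e9:.1f} GHz")    # 4.0 GHz
print(f"{period * 1e12:.0f} ps")  # 250 ps
```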
Performance Improvement

How?

Reducing number of clock cycles

Increasing clock rate

Hardware designer must often trade off clock rate against cycle count
$$\text{CPU Time} = \text{CPU Clock Cycles} \times \text{Clock Cycle Time} = \frac{\text{CPU Clock Cycles}}{\text{Clock Rate}}$$
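The trade-off can be illustrated with a short calculation. All numbers here are assumptions for the sketch, not from the slides: computer A runs a program in 10 s at 2 GHz, and a redesigned computer B needs 1.2 times as many clock cycles but must finish in 6 s.

```python
def cpu_time(clock_cycles, clock_rate_hz):
    """CPU Time = CPU Clock Cycles / Clock Rate."""
    return clock_cycles / clock_rate_hz


def required_clock_rate(clock_cycles, target_seconds):
    """Clock rate needed to finish the given cycles in the target time."""
    return clock_cycles / target_seconds


# Hypothetical computer A: 10 s of CPU time at 2 GHz.
cycles_a = 10 * 2e9
# Hypothetical computer B: 1.2x the cycles, target CPU time of 6 s.
cycles_b = 1.2 * cycles_a
rate_b = required_clock_rate(cycles_b, target_seconds=6)

print(f"{rate_b / 1e9:.1f} GHz")  # 4.0 GHz
```

Under these assumptions B has to double the clock rate to gain a 1.67× reduction in CPU time, because part of the gain is spent on the extra cycles.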
Instruction Count and CPI

Instruction Count for a program
  Determined by program, ISA and compiler

Average cycles per instruction
  Determined by CPU hardware
  If different instructions have different CPI
    Average CPI affected by instruction mix

$$\text{Clock Cycles} = \text{Instruction Count} \times \text{Cycles per Instruction}$$
$$\text{CPU Time} = \text{Instruction Count} \times \text{CPI} \times \text{Clock Cycle Time} = \frac{\text{Instruction Count} \times \text{CPI}}{\text{Clock Rate}}$$
CPI Example
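Computer A: Cycle Time = 250ps, CPI = 2.0
Computer B: Cycle Time = 500ps, CPI = 1.2
Same ISA
Which is faster, and by how much?

$$\text{CPU Time}_A = \text{Instruction Count} \times \text{CPI}_A \times \text{Cycle Time}_A = I \times 2.0 \times 250\,\text{ps} = I \times 500\,\text{ps}$$
$$\text{CPU Time}_B = \text{Instruction Count} \times \text{CPI}_B \times \text{Cycle Time}_B = I \times 1.2 \times 500\,\text{ps} = I \times 600\,\text{ps}$$
A is faster…
$$\frac{\text{CPU Time}_B}{\text{CPU Time}_A} = \frac{I \times 600\,\text{ps}}{I \times 500\,\text{ps}} = 1.2$$
…by this much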
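The same comparison can be run numerically. Because both computers use the same ISA, the instruction count cancels in the ratio, so the value of `INSTRUCTION_COUNT` below is an arbitrary assumption.

```python
def cpu_time_ps(instruction_count, cpi, cycle_time_ps):
    """CPU Time = Instruction Count x CPI x Clock Cycle Time (picoseconds)."""
    return instruction_count * cpi * cycle_time_ps


INSTRUCTION_COUNT = 1_000_000  # hypothetical; identical for A and B (same ISA)

time_a = cpu_time_ps(INSTRUCTION_COUNT, cpi=2.0, cycle_time_ps=250)
time_b = cpu_time_ps(INSTRUCTION_COUNT, cpi=1.2, cycle_time_ps=500)
ratio = time_b / time_a

print(f"A is {ratio:.1f} times faster than B")  # A is 1.2 times faster than B
```

A has the worse CPI but still wins, since its cycle time is half of B's.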
CPI in More Detail

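If different instruction classes take different numbers of cycles
$$\text{Clock Cycles} = \sum_{i=1}^{n} \left( \text{CPI}_i \times \text{Instruction Count}_i \right)$$

Weighted average CPI
$$\text{CPI} = \frac{\text{Clock Cycles}}{\text{Instruction Count}} = \sum_{i=1}^{n} \left( \text{CPI}_i \times \frac{\text{Instruction Count}_i}{\text{Instruction Count}} \right)$$
where $\text{Instruction Count}_i / \text{Instruction Count}$ is the relative frequency of class $i$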
CPI Example


Alternative compiled code sequences using instructions in classes A, B, C

Class               A   B   C
CPI for class       1   2   3
IC in sequence 1    2   1   2
IC in sequence 2    4   1   1

Sequence 1: IC = 5
  Clock Cycles = 2×1 + 1×2 + 2×3 = 10
  Avg. CPI = 10/5 = 2.0

Sequence 2: IC = 6
  Clock Cycles = 4×1 + 1×2 + 1×3 = 9
  Avg. CPI = 9/6 = 1.5
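The weighted-average calculation for both sequences can be sketched as follows, using the CPI and instruction counts from the example. The dictionary layout and function names are choices made for this sketch.

```python
# CPI for each instruction class, from the example.
CPI_BY_CLASS = {"A": 1, "B": 2, "C": 3}


def clock_cycles(counts):
    """Sum of CPI_i x Instruction Count_i over all classes."""
    return sum(CPI_BY_CLASS[cls] * n for cls, n in counts.items())


def average_cpi(counts):
    """Total clock cycles divided by total instruction count."""
    return clock_cycles(counts) / sum(counts.values())


sequence_1 = {"A": 2, "B": 1, "C": 2}
sequence_2 = {"A": 4, "B": 1, "C": 1}

print(clock_cycles(sequence_1), average_cpi(sequence_1))  # 10 2.0
print(clock_cycles(sequence_2), average_cpi(sequence_2))  # 9 1.5
```

Sequence 2 executes more instructions yet needs fewer clock cycles, so instruction count alone does not predict which sequence is faster.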
Performance Summary

Performance depends on

Algorithm: affects IC, possibly CPI

Programming language: affects IC, CPI

Compiler: affects IC, CPI

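Instruction set architecture: affects IC, CPI, $T_c$ (clock cycle time)
$$\text{CPU Time} = \frac{\text{Instructions}}{\text{Program}} \times \frac{\text{Clock cycles}}{\text{Instruction}} \times \frac{\text{Seconds}}{\text{Clock cycle}}$$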
Amdahl’s Law

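Pitfall: improving an aspect of a computer and expecting a proportional improvement in overall performance
$$T_{\text{improved}} = \frac{T_{\text{affected}}}{\text{improvement factor}} + T_{\text{unaffected}}$$

Example: multiply accounts for 80s/100s

How much improvement in multiply performance to get 5× overall?
$$20 = \frac{80}{n} + 20$$
Can’t be done! A 5× overall speedup needs a total time of 100s / 5 = 20s, but the unaffected 20s already uses all of it, so 80/n would have to be 0.

Corollary: make the common case fast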
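A short numerical sketch of the multiply example shows the overall speedup approaching 5× without reaching it. The function names and the sampled improvement factors are assumptions for illustration.

```python
def improved_time(t_affected, t_unaffected, factor):
    """Amdahl's Law: T_improved = T_affected / factor + T_unaffected."""
    return t_affected / factor + t_unaffected


def overall_speedup(t_affected, t_unaffected, factor):
    """Original total time divided by the improved total time."""
    total = t_affected + t_unaffected
    return total / improved_time(t_affected, t_unaffected, factor)


T_AFFECTED = 80.0    # seconds spent in multiply
T_UNAFFECTED = 20.0  # seconds spent in everything else

# Upper bound reached only if multiply took no time at all.
speedup_limit = (T_AFFECTED + T_UNAFFECTED) / T_UNAFFECTED

for n in (2, 10, 100, 1000):  # hypothetical improvement factors
    print(n, round(overall_speedup(T_AFFECTED, T_UNAFFECTED, n), 3))
print("limit:", speedup_limit)
```

Every finite improvement factor leaves the overall speedup below the 5× limit set by the unaffected 20 s.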