Lecture 1: Course Introduction and Review


EENG 449bG/CPSC 439bG
Computer Systems
Lecture 1
Introduction
January 13, 2004
Prof. Andreas Savvides
Spring 2004
1/13/04
EENG449/Savvides
Lec 1.1
Outline
• Why take this class?
• Trends in Computer Architecture
• Course Structure
• Administrative Issues
• Performance Evaluation
• Summary
Why take this class?
• To design the next great instruction set?...well...
– instruction set architecture has largely converged
– especially in the desktop / server / laptop space
– dictated by powerful market forces
• Tremendous organizational innovation relative to
established ISA abstractions
• Many new instruction sets or equivalent
– embedded space, controllers, specialized devices, ...
• Design, analysis, implementation concepts vital to all
aspects of EE & CS
– systems, PL, theory, circuit design, VLSI, comm.
• Equip you with an intellectual toolbox for dealing with
a host of systems design challenges
What is the focus of this class?
• Cover the basic underlying principles of
computer architecture
– What’s inside a microprocessor
– How memories and storage systems work
• Develop a broader view of computer systems
design, not only based on CPU design, but
also
– How software and hardware come together in tiny
networked devices
– Explored with a selection of research papers and class
projects
Example Hot Developments ca. 2002
• Manipulating the instruction set abstraction
– Itanium: translate ISA64 -> micro-op sequences
– Transmeta: continuous dynamic translation of IA32
– Tensilica: synthesize the ISA from the application
– reconfigurable HW
• Virtualization
– VMware: emulate full virtual machine
– JIT: compile to abstract virtual machine, dynamically compile
to host
• Parallelism
– wide issue, dynamic instruction scheduling, EPIC
– multithreading (SMT)
– chip multiprocessors
• Communication
– network processors, network interfaces
– Sensor nodes and sensor networks
• Exotic explorations
– nanotechnology, quantum computing
Forces on Computer Architecture
[Figure: Computer Architecture at the center, shaped by five forces: Technology, Programming Languages, Applications, Operating Systems, and History]
Technology Trends
• Clock Rate: ~30% per year
• Transistor Density: ~35%
• Chip Area: ~15%
• Transistors per chip: ~55%
• Total Performance Capability: ~100%
• by the time you graduate...
– 3x clock rate (3-4 GHz)
– 10x transistor count (1 Billion transistors)
– 30x raw capability
• plus 16x DRAM density, 32x disk density
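As a rough check on how the annual rates above compound, the sketch below raises each growth factor to a power. The four-year horizon is an assumption (the slide does not say how long "by the time you graduate" is), and the names `ANNUAL_GROWTH` and `compound` are illustrative only.

```python
# Annual growth rates quoted on the slide, as fractions per year.
ANNUAL_GROWTH = {
    "clock rate": 0.30,
    "transistor density": 0.35,
    "chip area": 0.15,
    "transistors per chip": 0.55,
    "total performance capability": 1.00,
}


def compound(rate: float, years: int) -> float:
    """Return the growth factor after `years` of compounding at `rate` per year."""
    return (1.0 + rate) ** years


if __name__ == "__main__":
    YEARS = 4  # assumed horizon; not stated on the slide
    for name, rate in ANNUAL_GROWTH.items():
        print(f"{name}: {compound(rate, YEARS):.1f}x after {YEARS} years")
```

Under this assumption the clock-rate factor comes out close to the 3x quoted; the 10x and 30x figures correspond to a somewhat longer horizon, roughly five years. Note also that density growth times area growth (1.35 x 1.15) gives about 1.55, matching the ~55% quoted for transistors per chip.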
Price of DRAM over 6 Generations
Moore’s Law
(Obtained from Intel)
Gordon Moore (1965): The number of transistors per
integrated circuit doubles every couple of years
Intel: This trend will continue at least until the end of
this decade…
A take on Moore’s Law
[Figure: transistors per chip (log scale, 1,000 to 100,000,000) versus year (1970 to 2005), with data points for the i4004, i8008, i8080, i8086, i80286, i80386, R2000, R3000, Pentium, and R10000; successive eras are labeled bit-level parallelism, instruction-level parallelism, and thread-level parallelism (?)]
Architecture in the New Millennium
• Today computer architecture is becoming important in
new ways
• Desktops, notebooks and PDAs are everywhere
• Micro-controllers are growing at the fastest rate
– Embedded processors in many new devices
» PDAs, remote controls, wireless devices, wireless sensors
– A new set of trends for embedded systems
» Cost, power and networking are becoming key drivers
» Peripherals are becoming equally important to the core
» Memories (FLASH and RAM) in the same package next to
processor core
Measurement and Evaluation
Architecture is an iterative process
-- searching the space of possible designs
-- at all levels of computer systems
[Figure: a Design/Analysis cycle in which creativity generates ideas and cost/performance analysis sorts them into good ideas, mediocre ideas, and bad ideas]
What is “Computer Architecture”?
[Figure: layers of abstraction, top to bottom: Application; Operating System; Compiler; Firmware; Instruction Set Architecture (Instr. Set Proc., I/O system); Datapath & Control; Digital Design; Circuit Design; Layout]
• Coordination of many levels of abstraction
• Under a rapidly changing set of forces
• Design, Measurement, and Evaluation
Coping with this class
• Students with widely varied backgrounds?
– Undergraduate students take this as a first course in
architecture (need to have EENG 348a, CPSC 323a, and
programming experience in a high-level language)
– Graduate students – use this class to transition to grad
school and to start your research
– Complete the class questionnaire
• Week 1 Introduction and Performance
– Chapter 1 of textbook available online from MKP website
http://books.elsevier.com/companions/1558605967
Look under sample chapters section
• Required Text: Computer Architecture: A
Quantitative Approach by John Hennessy and
David Patterson
– You can order this online from Amazon, Barnes and Noble, or
other online bookstores
Course Overview
• Introduction and Performance Evaluation (Chapter 1)
• Memory Hierarchy (Chapter 5)
• Instruction Sets (Chapter 2)
• Instruction Level Parallelism (ILP) (Appendix A &
Chapter 3)
• Exploiting ILP in Software (Chapter 5)
• In addition to the traditional microprocessor
architecture topics this class will also examine other
aspects affecting computer architecture such as
networks of embedded devices and low power design
issues
– Handouts will be distributed in class
• Some of the chapters will be partially covered and
supplementary material for other topics will be
distributed in class.
EENG449/CPSC439 Administrivia
• TA: Sobeeh Almukhaizim
([email protected])
• Assignment information, lectures via WWW page:
http://www.eng.yale.edu/enalab/eeng449bG
• 2 Quizzes: Dates to be determined
• Projects:
– Your success largely depends on your own initiative
– A list of projects will be discussed in class next week
– You are encouraged to pick your own project
– Work in groups of 2 (groups of 1 require my approval)
Grading
• 10% Homeworks
• 40% Examinations (2 Midterms)
• 40% Project
• Draft of Conference Quality Paper
– pick topic
– meet 3 times with faculty/TA to see progress
– give oral presentation in class during last week of classes
– written report like conference paper
– 3 weeks work full time for 2 people (over more weeks)
– Opportunity to do “research in the small” to help make transition
from good student to research colleague
• 10% Class Participation
• 1 Page project description due at the end of week 3
• Projects will be discussed in class next week
Review of Performance
Which is faster?
Plane              DC to Paris   Speed      Passengers   Throughput (pmph)
Boeing 747         6.5 hours     610 mph    470          286,700
BAC/Sud Concorde   3 hours       1350 mph   132          178,200
• Time to run the task (ExTime)
– Execution time, response time, latency
• Tasks per day, hour, week, sec, ns …
(Performance)
– Throughput (total work done in a given time), bandwidth
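The two views of "faster" in the table above can be checked with a few lines of arithmetic. This is a minimal sketch: the `Plane` class and its field names are illustrative, and throughput is taken to be passengers times speed, which reproduces the pmph column.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plane:
    name: str
    trip_hours: float  # DC to Paris
    speed_mph: int
    passengers: int

    @property
    def throughput_pmph(self) -> int:
        """Passenger-miles per hour: passengers carried times speed."""
        return self.passengers * self.speed_mph


boeing = Plane("Boeing 747", 6.5, 610, 470)
concorde = Plane("Concorde", 3.0, 1350, 132)

# Latency view: ratio of trip times (a smaller time is better).
latency_ratio = boeing.trip_hours / concorde.trip_hours
# Throughput view: ratio of passenger-miles per hour (bigger is better).
throughput_ratio = boeing.throughput_pmph / concorde.throughput_pmph

if __name__ == "__main__":
    print(f"Concorde is {latency_ratio:.2f}x faster by trip time")
    print(f"747 delivers {throughput_ratio:.2f}x the throughput")
```

By trip time the Concorde wins by a factor of about 2.2; by passenger-miles per hour the 747 wins by a factor of about 1.6, so the answer depends on which metric the task cares about.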
Definitions
• Performance is in units of things per sec
– bigger is better
• If we are primarily concerned with response time
– $\text{performance}(x) = \dfrac{1}{\text{execution\_time}(x)}$
"X is n times faster than Y" means
$$n = \frac{\text{Performance}(X)}{\text{Performance}(Y)} = \frac{\text{Execution\_time}(Y)}{\text{Execution\_time}(X)}$$
Computer Performance
$$\text{CPU time} = \frac{\text{Seconds}}{\text{Program}} = \frac{\text{Instructions}}{\text{Program}} \times \frac{\text{Cycles}}{\text{Instruction}} \times \frac{\text{Seconds}}{\text{Cycle}}$$
What affects each factor:
               Inst Count   CPI    Clock Rate
Program            X
Compiler           X        (X)
Inst. Set.         X         X
Organization                 X         X
Technology                             X
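The CPU time equation above translates directly into a one-line function. This is a minimal sketch; the function name, parameter names, and the sample workload are hypothetical and chosen only for illustration.

```python
def cpu_time(instruction_count: float, cpi: float, clock_rate_hz: float) -> float:
    """CPU time in seconds = instructions x cycles/instruction x seconds/cycle."""
    cycle_time_s = 1.0 / clock_rate_hz  # seconds per cycle
    return instruction_count * cpi * cycle_time_s


if __name__ == "__main__":
    # Hypothetical workload: 2 billion instructions, CPI of 1.5, 1 GHz clock.
    print(f"{cpu_time(2e9, 1.5, 1e9):.2f} s")
```

Halving any one of the three factors while the other two stay fixed halves CPU time, which is why the table tracks which design level influences which factor.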
Cycles Per Instruction (Throughput)
“Average Cycles per Instruction”
CPI = (CPU Time * Clock Rate) / Instruction Count
= Cycles / Instruction Count
(e.g. CPU Time = 5 secs, Clock Rate = 1 GHz)
$$\text{CPU time} = \text{Cycle Time} \times \sum_{j=1}^{n} \text{CPI}_j \times \text{IC}_j$$
where Cycle Time is the time for 1 clock tick and the sum is the total number of CPU clock cycles.
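Both forms of the relation on this slide can be sketched as follows. The 5 s and 1 GHz figures come from the slide's annotations; the instruction count of 2.5 billion and the two-class mix are assumptions added only to make the arithmetic concrete, and the function names are illustrative.

```python
def average_cpi(cpu_time_s: float, clock_rate_hz: float, instruction_count: float) -> float:
    """CPI = (CPU time * clock rate) / instruction count = cycles / instruction count."""
    total_cycles = cpu_time_s * clock_rate_hz
    return total_cycles / instruction_count


def cpu_time_from_classes(cycle_time_s: float, classes) -> float:
    """CPU time = cycle time * sum over instruction classes j of CPI_j * IC_j.

    `classes` is an iterable of (cpi_j, ic_j) pairs.
    """
    total_cycles = sum(cpi_j * ic_j for cpi_j, ic_j in classes)
    return cycle_time_s * total_cycles


if __name__ == "__main__":
    # 5 s at 1 GHz is 5e9 cycles; with an assumed 2.5e9 instructions, CPI is 2.
    print(average_cpi(5.0, 1e9, 2.5e9))
    # Assumed mix: 1e9 instructions at CPI 1 and 1e9 instructions at CPI 2, 1 ns cycle.
    print(cpu_time_from_classes(1e-9, [(1.0, 1e9), (2.0, 1e9)]))
```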
Calculating the Overall CPI
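$$\text{CPI} = \frac{\sum_{j=1}^{n} \text{CPI}_j \times \text{IC}_j}{\text{Instruction count}} = \sum_{j=1}^{n} \text{CPI}_j \times \frac{\text{IC}_j}{\text{Instruction count}}$$
Example: A program has
25% FP instructions with average CPI = 4.0
Average CPI of other instructions = 1.33
$$\text{CPI} = (4 \times 25\%) + (1.33 \times 75\%) \approx 2.0$$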
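The weighted sum in this example can be reproduced in a few lines. This is a sketch only; `overall_cpi` and its argument layout are illustrative names, not part of the course material.

```python
def overall_cpi(mix) -> float:
    """Weighted average CPI.

    `mix` is an iterable of (fraction_of_instructions, cpi) pairs whose
    fractions must sum to 1.
    """
    mix = list(mix)
    total_fraction = sum(fraction for fraction, _ in mix)
    if abs(total_fraction - 1.0) > 1e-9:
        raise ValueError("instruction fractions must sum to 1")
    return sum(fraction * cpi for fraction, cpi in mix)


if __name__ == "__main__":
    # 25% FP instructions at CPI 4.0, the remaining 75% at CPI 1.33.
    print(f"{overall_cpi([(0.25, 4.0), (0.75, 1.33)]):.4f}")
```

The exact value is 1.9975, which the slide rounds to 2.0.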
Example: Calculating CPI bottom up
Base Machine (Reg / Reg)
Op       Freq   Cycles   CPI(i)   (% Time)
ALU      50%    1        .5       (33%)
Load     20%    2        .4       (27%)
Store    10%    2        .2       (13%)
Branch   20%    2        .4       (27%)
Total                    1.5
The Freq column is the typical mix of instruction types in a program.
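The CPI(i) and % Time columns follow mechanically from the Freq and Cycles columns, as the sketch below shows. The names `BASE_MACHINE` and `cpi_breakdown` are illustrative.

```python
# (operation, frequency in the instruction mix, cycles per instruction)
BASE_MACHINE = [
    ("ALU", 0.50, 1),
    ("Load", 0.20, 2),
    ("Store", 0.10, 2),
    ("Branch", 0.20, 2),
]


def cpi_breakdown(mix):
    """Return (total CPI, per-op CPI contribution, per-op share of execution time)."""
    contributions = {op: freq * cycles for op, freq, cycles in mix}
    total_cpi = sum(contributions.values())
    time_share = {op: c / total_cpi for op, c in contributions.items()}
    return total_cpi, contributions, time_share


if __name__ == "__main__":
    total, contributions, share = cpi_breakdown(BASE_MACHINE)
    for op in contributions:
        print(f"{op:7s} CPI(i)={contributions[op]:.1f}  time={share[op]:.0%}")
    print(f"Total CPI = {total:.1f}")
```

Although ALU operations are half of all instructions, they account for only a third of execution time, because every other class takes two cycles.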
Amdahl’s Law
• Defines the speedup that can be gained by
an improvement in a particular feature
$$\text{Speedup} = \frac{\text{Performance of entire task using the enhancement when possible}}{\text{Performance of the entire task without using the enhancement}}$$
• Speedup computed based on 2 factors
– Fraction of computation time that can leverage the
enhancement
– Improvement on the overall task by this enhancement
Amdahl’s Law (cont.)
$$\text{Execution time}_{new} = \text{Execution time}_{old} \times \left( (1 - \text{Fraction}_{enhanced}) + \frac{\text{Fraction}_{enhanced}}{\text{Speedup}_{enhanced}} \right)$$
$$\text{Speedup}_{overall} = \frac{\text{Execution time}_{old}}{\text{Execution time}_{new}} = \frac{1}{(1 - \text{Fraction}_{enhanced}) + \dfrac{\text{Fraction}_{enhanced}}{\text{Speedup}_{enhanced}}}$$
• Fraction_enhanced: fraction of the task that can use the enhancement
• Speedup_enhanced: improvement offered by the enhancement
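The overall-speedup formula above is short enough to evaluate directly. The sketch below is one way to write it; the function name and the sample numbers (40% of the task sped up 10x) are hypothetical and not taken from the lecture.

```python
def amdahl_speedup(fraction_enhanced: float, speedup_enhanced: float) -> float:
    """Overall speedup when `fraction_enhanced` of the original execution
    time runs `speedup_enhanced` times faster."""
    if not 0.0 <= fraction_enhanced <= 1.0:
        raise ValueError("fraction_enhanced must be between 0 and 1")
    if speedup_enhanced <= 0.0:
        raise ValueError("speedup_enhanced must be positive")
    new_time_ratio = (1.0 - fraction_enhanced) + fraction_enhanced / speedup_enhanced
    return 1.0 / new_time_ratio


if __name__ == "__main__":
    # Hypothetical: 40% of the task can use an enhancement that is 10x faster.
    print(f"{amdahl_speedup(0.4, 10.0):.4f}")
    # Even a near-infinite enhancement of half the task stays below 2x overall.
    print(f"{amdahl_speedup(0.5, 1e12):.4f}")
```

The second call illustrates the usual reading of the law: the unenhanced fraction bounds the overall speedup at 1 / (1 - Fraction_enhanced), however large Speedup_enhanced becomes.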
Leveraging Parallelism
Parallelism is a recurring theme for improving
performance
• At the machine level
– multiple processors, multiple disks
• At the processor level
– Pipelining – overlapping instruction execution
• During digital design
– Set associative caches with multiple banks
– ALUs use carry lookahead
» Compute 2 outcomes followed by late selection
Price-Performance Metrics
Workstations vs. Embedded Processors
• In desktops performance is typically measured with a
set of benchmark programs (SPEC CINT2000 and
SPEC CFP2000)
– Influenced by a variety of different factors
» Memories, peripherals, operating systems etc.
• In embedded systems processors are harder to
compare
– Embedded processors are application specific
» Features: on-chip peripherals, on-chip memories
– Price and power consumption are also decisive factors
» Many electronic devices need to operate on batteries, be
small and low cost
» Processor performance may be of secondary importance in
many applications (e.g. your MP3 player does not require
Pentium-type performance, but you want it to be cheap
and to run for long hours)
Summary
• Modern Computer Architecture is about managing and
optimizing across several levels of abstraction with respect to
dramatically changing technology and application load
• Key Abstractions
– instruction set architecture
– memory
– bus
• Key concepts
– HW/SW boundary
– Compile Time / Run Time
– Pipelining
– Caching
• Performance Iron Triangle relates combined effects
– Total Time = Inst. Count x CPI x Cycle Time
Reading for Week 1
• Chapter 1 of textbook
– Available at
http://books.elsevier.com/companions/1558605967
• Project Proposals due end of week 3
• For course information and updates visit the
class website at
http://www.eng.yale.edu/enalab/courses/eeng449bG