Slides1 - TAMU Computer Science Faculty Pages

Computer Architecture
CPSC 350
E. J. Kim
Course Contents
• Organization of a computer
• Assembly language
• Design of a computer
• Verilog
• Future architectures
How does the course fit into the curriculum?
[Figure: course sequence diagram with the following courses]
CPSC 483 Computer Sys Design
CPSC 4xx compiler OS
CPSC 350 Computer Architecture
ELEN 220 Intro to Digital Design
ELEN 248 Intro to DGTL Sys Design
Syllabus
• Two exams 50%
• Assignments and quizzes 20%
• Projects 30%
Course Information
• Labs
• MIPS (Assembly Programming), Verilog (HDL)
• Projects
• Project 1: MIPS
• Projects 2 & 3: Verilog (Datapath Design)
• TA
• Lei Wang
• Peer Teacher: Molly Hicks
Course Information [contd…]
• Course Webpage
• http://faculty.cs.tamu.edu/ejkim/Courses/cpsc350
• CS Accounts
• Use your CS accounts for turnin and to check any email regarding the course
Computer Architecture
Some people define computer architecture as the combination of
• the instruction set architecture (what the executable can see of the hardware, the functional interface)
• and the machine organization (how the hardware implements the instruction set architecture)
What is Computer Architecture?
Many levels of abstraction, from top to bottom:
• Applications
• Operating System
• Compiler, Firmware
• Instr. Set Proc., I/O system
• Datapath & Control
• Digital Design
• Circuit Design
• Layout
[Figure labels marking ranges of these levels: Instruction set architecture, Machine organization]
Factors affecting ISA ???
[Figure: Computer Architecture and the factors that shape it: Technology, Programming Languages, Applications, Operating Systems, History, Cleverness]
ISA: Critical Interface
software
instruction set
hardware
Examples: 80x86 50,000,000 vs. MIPS 5,500,000 ???
The Big Picture
[Figure: Processor (containing Control and Datapath), Memory, Input, Output]
Since 1946 all computers have had 5 components!!!
Technology Trends
• Processor
• logic capacity: about 30% per year
• clock rate: about 20% per year
• Memory
• DRAM capacity: about 60% per year (4x every 3 years)
• Memory speed: about 10% per year
• Cost per bit: improves about 25% per year
• Disk
• capacity: about 60% per year
• Total use of data: 100% per 9 months!
• Network Bandwidth
• Bandwidth increasing more than 100% per year!
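The annual rates above compound from year to year, which is how "about 60% per year" turns into roughly "4x every 3 years". A minimal sketch of that arithmetic, where the helper name `growth_factor` is made up for illustration and is not part of the course material:

```python
def growth_factor(annual_rate: float, years: float) -> float:
    """Total growth after compounding `annual_rate` (0.60 means 60%) for `years` years."""
    return (1.0 + annual_rate) ** years

# DRAM capacity: about 60% per year, compounded over 3 years
dram_3yr = growth_factor(0.60, 3)
# Memory speed: about 10% per year, over the same 3 years
speed_3yr = growth_factor(0.10, 3)

print(f"DRAM capacity after 3 years: {dram_3yr:.2f}x")   # close to the slide's 4x
print(f"Memory speed after 3 years:  {speed_3yr:.2f}x")
```

Comparing the two factors shows how quickly the gap between capacity and speed widens under these rates.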
Technology Trends
[Figure: Microprocessor Logic Density. Transistors per chip (log scale, 1000 to 100000000) vs. year (1965 to 2005). Labels in the plot: i4004, i8086, i80286, i80386, i80486, Pentium, M68K, SU MIPS, R3010, R4400, R10000, Alpha; legend: i80x86, M68K, MIPS, Alpha]
DRAM chip capacity
Year    DRAM Size
1980    64 Kb
1983    256 Kb
1986    1 Mb
1989    4 Mb
1992    16 Mb
1996    64 Mb
1999    256 Mb
2002    1 Gb
• In ~1985 the single-chip processor (32-bit) and the single-board computer emerged
• In the 2002+ timeframe, these may well look like mainframes compared to a single-chip computer (maybe 2 chips)
Technology Trends
Smaller feature sizes – higher speed, density
ECE/CS 752; copyright J. E. Smith, 2002 (Univ. of Wisconsin)
Technology Trends
Number of transistors doubles every 18 months
(amended to 24 months)
ECE/CS 752; copyright J. E. Smith, 2002 (Univ. of Wisconsin)
Levels of Representation
High Level Language Program
    temp = v[k];
    v[k] = v[k+1];
    v[k+1] = temp;
Compiler
Assembly Language Program
    lw $15, 0($2)
    lw $16, 4($2)
    sw $16, 0($2)
    sw $15, 4($2)
Assembler
Machine Language Program
    0000 1001 1100 0110 1010 1111 0101 1000
    1010 1111 0101 1000 0000 1001 1100 0110
    1100 0110 1010 1111 0101 1000 0000 1001
    0101 1000 0000 1001 1100 0110 1010 1111
Machine Interpretation
Control Signal Specification
    ALUOP[0:3] <= InstReg[9:11] & MASK
Execution Cycle
• Instruction Fetch: Obtain instruction from program storage
• Instruction Decode: Determine required actions and instruction size
• Operand Fetch: Locate and obtain operand data
• Execute: Compute result value or status
• Result Store: Deposit results in storage for later use
• Next Instruction: Determine successor instruction
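The cycle can be sketched as a loop in software. The machine below is a hypothetical toy accumulator machine invented for illustration (it is not MIPS, and the opcodes `LOAD`, `ADD`, `STORE`, `HALT` are assumptions), but each comment maps a line to one step of the execution cycle:

```python
def run(program, memory):
    """Run a toy program; returns the accumulator when HALT is reached."""
    pc = 0    # program counter: index of the current instruction
    acc = 0   # single accumulator register
    while True:
        instr = program[pc]              # instruction fetch
        opcode, addr = instr             # instruction decode
        if opcode == "HALT":
            return acc
        if opcode == "LOAD":
            acc = memory[addr]           # operand fetch, result stored in acc
        elif opcode == "ADD":
            operand = memory[addr]       # operand fetch
            acc = acc + operand          # execute, result stored in acc
        elif opcode == "STORE":
            memory[addr] = acc           # result store to memory
        else:
            raise ValueError(f"unknown opcode: {opcode}")
        pc += 1                          # next instruction (this toy has no branches)

memory = [3, 4, 0]
program = [("LOAD", 0), ("ADD", 1), ("STORE", 2), ("HALT", 0)]
result = run(program, memory)
print(result, memory)   # prints: 7 [3, 4, 7]
```

A real processor would also compute branch targets in the "next instruction" step; the toy always falls through to `pc + 1`.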
What next?
• How does better technology help to improve performance?
• How can we quantify the gain?
• The MIPS architecture
• MIPS instructions
• First contact with assembly language programming
The Role of Performance
Performance
• Response time: time between start and finish of the task (aka execution time)
• Throughput: total amount of work done in a given time
Question
• Suppose that we replace the processor in a computer by a faster model
• Does this improve the response time?
• How about the throughput?
Question
• Suppose we add an additional processor to a system that uses separate processors for separate tasks.
• Does this improve the response time?
• Does this improve the throughput?
CPU Performance
• CPU Execution time for a program
= CPU Clock Cycles for a program X Clock Cycle time
= CPU Clock Cycles for a program / Clock rate
CPU Performance Example
• A program runs in 10 secs on computer A with a 4 GHz clock. We want to build computer B so that it runs this program in 6 secs, where computer B requires 1.2 times as many clock cycles as computer A for this program. What clock rate should we target?
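One way to work this example is to apply "CPU time = clock cycles / clock rate" twice: once to get A's cycle count, and once to solve for B's clock rate. A minimal sketch, with a function name chosen here for illustration:

```python
def required_clock_rate_hz(time_a_s, clock_a_hz, cycle_ratio, target_time_b_s):
    """Clock rate B needs, given A's run time and clock and B's relative cycle count."""
    cycles_a = time_a_s * clock_a_hz       # cycles = time x clock rate
    cycles_b = cycle_ratio * cycles_a      # B needs 1.2x as many cycles
    return cycles_b / target_time_b_s      # clock rate = cycles / time

rate_b = required_clock_rate_hz(10, 4e9, 1.2, 6)
print(f"Target clock rate for B: {rate_b / 1e9:.1f} GHz")   # 8.0 GHz
```

Computer A executes 10 x 4 x 10^9 = 40 x 10^9 cycles, so B needs 48 x 10^9 cycles in 6 secs, which is 8 GHz.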
CPU Performance
• CPU clock cycles = instruction count X CPI
• CPU time = instruction count X CPI X Clock cycle time
[Figure: the three factors of CPU time: inst count, CPI, Cycle time]
Cycles Per Instruction (Throughput)
“Average Cycles per Instruction”
CPI = (CPU Time * Clock Rate) / Instruction Count
= Cycles / Instruction Count
$$\text{CPU time} = \text{Cycle Time} \times \sum_{j=1}^{n} \text{CPI}_j \times I_j$$
$$\text{CPI} = \sum_{j=1}^{n} \text{CPI}_j \times F_j \quad \text{where } F_j = \frac{I_j}{\text{Instruction Count}}$$
“Instruction Frequency”
Example: Calculating CPI
Base Machine (Reg / Reg)
Freq = typical mix of instruction types in program

Op       Freq   Cycles   CPI(i)   (% Time)
ALU      50%    1        .5       (33%)
Load     20%    2        .4       (27%)
Store    10%    2        .2       (13%)
Branch   20%    2        .4       (27%)
Total                    1.5
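The numbers in this example follow from CPI = sum of CPI_j x F_j, with each class's share of time being its contribution divided by the total. A short sketch that recomputes them (the dictionary layout and function names are illustrative choices, not from the slides):

```python
# Instruction mix from the example: op -> (frequency, cycles per instruction)
mix = {
    "ALU":    (0.50, 1),
    "Load":   (0.20, 2),
    "Store":  (0.10, 2),
    "Branch": (0.20, 2),
}

def average_cpi(mix):
    """Weighted average: sum over classes of frequency x cycles."""
    return sum(freq * cycles for freq, cycles in mix.values())

def time_share(mix):
    """Fraction of execution time spent in each instruction class."""
    cpi = average_cpi(mix)
    return {op: freq * cycles / cpi for op, (freq, cycles) in mix.items()}

cpi = average_cpi(mix)
shares = time_share(mix)
print(f"CPI = {cpi:.2f}")
for op, share in shares.items():
    print(f"{op:<6} {share:.0%}")
```

Note that ALU instructions are half of the mix but only a third of the time, because every other class costs two cycles.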
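Performance
(Absolute) Performance = 1 / Execution time
Relative Performance
"X is n times faster than Y" means
$$n = \frac{\text{Performance}_X}{\text{Performance}_Y} = \frac{\text{Execution time}_Y}{\text{Execution time}_X}$$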
MIPS as a performance measure
• MIPS (million instructions per second) = Instruction count / (Execution time X 10^6)
Ex P. 269
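Because execution time = instruction count X CPI / clock rate, the MIPS rating also equals clock rate / (CPI X 10^6). The sketch below checks this with assumed inputs: the instruction count is hypothetical, and pairing the CPI of 1.5 from the CPI example with the 4 GHz clock from the earlier example is done only for illustration.

```python
def mips_rating(instruction_count, execution_time_s):
    """Millions of instructions executed per second."""
    return instruction_count / (execution_time_s * 1e6)

instruction_count = 1_000_000_000   # hypothetical program size
cpi = 1.5                           # from the CPI example
clock_rate_hz = 4e9                 # 4 GHz, as in the clock rate example

execution_time_s = instruction_count * cpi / clock_rate_hz
rating = mips_rating(instruction_count, execution_time_s)

print(f"Execution time: {execution_time_s:.3f} s")
print(f"MIPS rating:    {rating:.0f}")
# Same value via the equivalent form: clock rate / (CPI x 10^6)
print(f"Check:          {clock_rate_hz / (cpi * 1e6):.0f}")
```

The instruction count cancels out of the rating, which is one reason MIPS says nothing about how much work each instruction does.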
Amdahl’s Law
The execution time after making an
improvement to the system is given by
Exec time after improvement = I/A + E
I = execution time affected by improvement
A = amount of improvement
E = execution time unaffected
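The formula is easy to tabulate. The sketch below uses made-up numbers (60 seconds affected, 40 seconds unaffected) purely to show that the unaffected part E sets a floor on the result no matter how large A gets:

```python
def time_after_improvement(affected_s, improvement, unaffected_s):
    """Amdahl's Law: exec time after improvement = I / A + E."""
    if improvement <= 0:
        raise ValueError("improvement factor must be positive")
    return affected_s / improvement + unaffected_s

# Hypothetical program: I = 60 s affected, E = 40 s unaffected
for a in (1, 2, 6, 1000):
    t = time_after_improvement(60, a, 40)
    print(f"A = {a:>4}: {t:.2f} s")
```

Even with A = 1000 the time stays just above 40 seconds.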
Amdahl’s Law
Suppose that a program runs in 100 seconds on a machine and multiplication instructions take 80% of the total time. How much do I have to improve the speed of multiplication if I want my program to run 5 times faster?
20 seconds = 80 seconds/n + 20 seconds
=> 80 seconds/n = 0, which no finite n satisfies
=> it is impossible!
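The same equation can be solved for n in general. This sketch returns `None` when the target cannot be reached; the function name and the extra "4 times faster" case are additions for illustration, not part of the original example:

```python
def required_improvement(total_s, affected_s, target_speedup):
    """Solve affected_s / n + unaffected_s = total_s / target_speedup for n.

    Returns None when no finite n reaches the target.
    """
    unaffected_s = total_s - affected_s
    target_s = total_s / target_speedup
    remaining_s = target_s - unaffected_s   # time budget left for the improved part
    if remaining_s <= 0:
        return None
    return affected_s / remaining_s

print(required_improvement(100, 80, 5))   # None: 5x is out of reach
print(required_improvement(100, 80, 4))   # 16.0: multiplication must get 16x faster
```

With 20 seconds unaffected, the overall speedup can approach 100/20 = 5 but never reach it.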