Computer Architecture
COSC 3430
Lecture 1: Introduction
Course administration
• Instructor: R. N. Lea
[email protected]
531 PGH Building
Office Hrs: 10:30-11:30 am MW and 2:30-3 pm MW
• TA:
Office Hrs:
• Labs:
• URL: www.cs.uh.edu/~blea
• Text:
Required: Computer Organization and Design, 3rd Edition, revised printing, Patterson and
Hennessy ©2007. If you have to buy a book, the 4th edition is satisfactory.
Grading information
• Slides: ppt on the course web page after lecture
• Grade determinants
Midterm Exam: 25%
End of course exam: 25%
Final Exam: 25%
Homework: 25%
Course content
• Content
Principles of computer architecture: CPU datapath and
control unit design (single-issue pipelined, superscalar,
VLIW), memory hierarchies and design, I/O organization
and design, advanced processor design (multiprocessors
and SMT)
• Course goals
To learn the organizational paradigms that determine the
capabilities and performance of computer systems. To
understand the interactions between the computer’s
architecture and its software so that future software
designers (compiler writers, operating system designers,
database programmers, …) can achieve the best cost-performance
trade-offs and so that future architects
understand the effects of their design choices on software
applications.
• Course prerequisites
COSC 2410. Computer Organization and Design
MATH 3336. Discrete Mathematics
What You Should Know – COSC 2410 and Math 3336
• Basic logic design & machine organization
A little about component design
(appendix B)
processor, memory, I/O
• Create, assemble, run, debug programs in
an assembly language
MIPS is preferred and is covered in COSC 2410
• Create, compile, and run C (C++, Java)
programs
• Create, organize, and edit files and run
programs on the MIPS simulator SPIM
Course Structure
• Design-focused class
Various homework assignments throughout the
semester
Simulation of architecture alternatives using
Design Works
• Topics to be covered as time permits:
Review of the MIPS ISA and basic architecture
Review of arithmetic for computers
Assessing Performance
Datapath and control design issues
Enhancing Performance with Pipelining
Memory hierarchies and memory design issues
I/O design issues
Multiprocessor design issues
Course structure continued
• Introduce computer hardware design and
implementation
• Introduce major design advances in the last
decade: hierarchical memory and parallel
processing
• Focus on prototypical modern architecture
Not on specific computer brands (range)
Not on electrical circuits (blueprint)
WHAT YOU SHOULD READ! : PH Chapter 1
• What is Computer Architecture?
The interface between the HW and the
lowest-level software. It is referred to as
the instruction set architecture (ISA) of a
machine.
ISA includes all information needed to
write a machine language (ML) program that will run correctly,
including instructions, registers, memory
access, I/O, etc.
Computer architecture
• Architecture describes the internal organization of a
computer in an abstract way; that is, it defines the
capabilities of the computer and its programming
model. You can have two computers that have
been constructed in different ways with different
technologies but with the same architecture.
• A computer's organization describes how the
architecture is implemented; i.e., it defines the
hardware used to implement the architecture.
• One can have numerous computers that implement
the same architecture and as a result all programs
written for one of the computers will run on all of
them. However, the performances of the
computers will in general be very different.
Computer architecture continued
By the architecture of a system, I mean the
complete and detailed specification of the
user interface. … As Blaauw has said,
“Where architecture tells what happens,
implementation tells how it is made to
happen.”
The Mythical Man-Month, Brooks, pg 45
What will you learn?
• How are high-level programs (C, Java) understood by
hardware?
• How are they executed by hardware?
• What is the interface between the computer’s software and
hardware?
• How does software tell hardware to perform tasks?
• How can a programmer improve program
performance?
• How can a hardware designer improve hardware
performance?
1.1 Computing Applications: Broadly Speaking
• “Computers” are everywhere today
• Innumerable individual computing
applications, e.g.:
ATMs
Computers in vehicles
(steering/braking/motion sensors)
PDAs, cellphones, BlackBerries, iPods, …
Kids' toys, kitchen appliances, …
Classes of Computing Applications
• Desktop Computers (best-known)
Single user, access via keyboard and screen
Execute few tasks at a time
Emphasis: speedy performance at low cost
• Servers (widest range cost/capability)
Multiple users access via network
Execute many small tasks at once (web); execute one huge
job (forecast supercomputer)
Emphasis: dependability over all users
• Embedded Computers (widest range use/power)
Put in device to execute one job (hardwired or via
software): cell phone, PDA, video game, TV, plane…
Emphasis: doing 1 task as perfectly as possible
A Prototypical Computer System
• Key components:
The computer (box)
Input devices
(keyboard, mouse)
Output devices
(display, printer)
Input/Output devices (disks, networks)
Inside the Box
Inside the Computer: Motherboard
[Figures: motherboard views labeled in turn PROCESSOR, MAIN MEMORY, and INPUT/OUTPUT]
Inside the Computer: Basic Design
Inside the Computer: Data Flow
[Diagram: Computer = Processor (Control, Datapath), Main Memory, Devices (INPUT, OUTPUT)]
Data flows from input to memory to processor
Data is processed and flows back to main memory
Data flows to output devices for storage or display
Inside the Computer: Data Processing
[Diagram: Computer = Processor (Control, Datapath), Main Memory, Devices (INPUT, OUTPUT)]
Control gets program instructions from memory
Control tells memory, datapath, I/O what to do with data
Datapath gets data from memory and operates on it
Inside the Computer: Data Storage
[Diagram: Computer = Processor (Control, Datapath), Main Memory (VOLATILE), Devices (INPUT, OUTPUT; NONVOLATILE)]
Main memory: small, close, fast, expensive, volatile
Secondary (I/O) memory: big, far, slow, cheap, nonvolatile
1.2 Below your Program
The Basis for Machine Communication
Human Language
• Humans communicate via speech, text, image
English alphabet has 26 letters: a-z
Letter sequences form words: tree
Word sequences form discourse units
(sentences, paragraphs):
Some trees have yellow leaves
Machine Language
• Machines communicate via electrical signals (conveyed
through wires or wireless EM waves) (power supplied by
current, battery)
Machine alphabet has 2 letters: 0, 1
(high/low voltage, on/off, true/false)
Letter sequences form meaningful units:
Data: 0011 = 3
Instruction: 0001000110011 = 3+3
Program (swap two variable values):
000100011001111
010101000000001
000010001010101
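To make the "machine alphabet" concrete, here is a minimal C sketch of how a string of 0s and 1s is read as a number, so that 0011 comes out as 3. The helper name bits_to_uint is hypothetical (it is not from the lecture), and the sketch assumes the string is short enough to fit in an unsigned int.

```c
#include <assert.h>
#include <stddef.h>

/* Interpret a string of '0'/'1' characters as an unsigned binary number.
 * bits_to_uint is a hypothetical helper name used only for illustration.
 * Assumes the string has no more bits than an unsigned int can hold. */
unsigned bits_to_uint(const char *bits) {
    unsigned value = 0;
    assert(bits != NULL);
    for (size_t i = 0; bits[i] != '\0'; i++) {
        assert(bits[i] == '0' || bits[i] == '1');
        /* Shift what we have one place left, then add the new bit. */
        value = value * 2 + (unsigned)(bits[i] - '0');
    }
    return value;
}
```

Calling bits_to_uint("0011") yields 3, matching the data example on the slide.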
Machine Language → Assembly Language
• Machine Language (ML) is easy for computers but
time-consuming for humans
• Assembly Language (AL) was developed as a
“more natural” symbolic code for ML
ADD 3,3 → 0001000110011
• Assemblers are programs developed to
automatically translate AL to ML
High Level Programming Languages
• AL still requires thinking like a machine
1 AL instruction for every 1 ML instruction
Why not develop a higher-level code for AL?
• High-level Programming Languages (PL) were
developed as symbolic codes for AL
C, Java, Perl, etc.
• Compilers are programs developed to
automatically translate PL to AL
C Program
void swap(int v[], int k) {
    int tmp;
    tmp = v[k];
    v[k] = v[k+1];
    v[k+1] = tmp;
}
↓ C compiler
MIPS AL Program
swap:
    sll $2, $5, 2     # $2 = k * 4 (byte offset of v[k])
    add $2, $4, $2    # $2 = address of v[k]
    lw $15, 0($2)     # $15 = v[k]
    lw $16, 4($2)     # $16 = v[k+1]
    sw $16, 0($2)     # v[k] = old v[k+1]
    sw $15, 4($2)     # v[k+1] = old v[k]
    jr $31            # return to caller
↓ MIPS assembler
MIPS ML Program
00000000101000010000000000011000
00000000100011100001100000100001
10001100011000100000000000000000
011100011100011101001100010001 …
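The assembler step can be sketched for a single instruction. The C code below is an illustration, not the actual MIPS assembler: it packs the six fields of a MIPS R-type instruction (op, rs, rt, rd, shamt, funct, with widths 6/5/5/5/5/6 bits) into one 32-bit word. The function names encode_rtype and encode_add are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Pack the six fields of a MIPS R-type instruction into one 32-bit word.
 * Field widths, from the most significant end: op 6, rs 5, rt 5, rd 5,
 * shamt 5, funct 6. Names here are illustrative, not from the lecture. */
uint32_t encode_rtype(uint32_t op, uint32_t rs, uint32_t rt,
                      uint32_t rd, uint32_t shamt, uint32_t funct) {
    assert(op < 64 && rs < 32 && rt < 32 && rd < 32 && shamt < 32 && funct < 64);
    return (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct;
}

/* add rd, rs, rt uses op = 0 and funct = 0x20. */
uint32_t encode_add(uint32_t rd, uint32_t rs, uint32_t rt) {
    return encode_rtype(0, rs, rt, rd, 0, 0x20);
}
```

For an add that sums registers $4 and $2 into $2, encode_add(2, 4, 2) returns 0x00821020, which is the bit pattern 000000 00100 00010 00010 00000 100000. One assembly instruction becomes exactly one machine word.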
Major Components of a Computer
[Diagram: Processor (Control, Datapath), Memory, Devices (Input, Output)]
Machine Organization
• Capabilities and performance characteristics of the principal Functional Units
e.g., register file, ALU, multiplexors, memories, ...
• The ways those Functional Units are interconnected
e.g., buses
• Logic and means by which information flow between Functional Units is controlled
• The machine’s Instruction Set Architecture (ISA)
• Register Transfer Level (RTL) machine description
Impacts of Advancing Technology
• Processor
logic capacity: increases about 30% per year
performance: 2x every 1.5 years
• Memory
DRAM capacity: 4x every 3 years
memory speed: 1.5x every 10 years
cost per bit: decreases about 25% per year
• Disk
capacity: increases about 60% per year
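These rates compound, which is easy to underestimate. The C sketch below simply multiplies a per-period factor a whole number of times; the function name growth_after is hypothetical, and the rates are the approximate ones quoted on the slide, not measured data.

```c
#include <assert.h>

/* Total growth after `periods` repetitions of multiplying by
 * `factor_per_period`. growth_after is an illustrative name.
 * Example: "2x every 1.5 years" over 9 years is 6 periods of factor 2.0. */
double growth_after(double factor_per_period, int periods) {
    assert(periods >= 0);
    double total = 1.0;
    for (int i = 0; i < periods; i++) {
        total *= factor_per_period;
    }
    return total;
}
```

Under these assumptions, growth_after(2.0, 6) gives 64x for processor performance over 9 years and growth_after(4.0, 3) gives 64x for DRAM capacity over the same 9 years, while memory speed gains only 1.5x in 10 years. A 60% yearly increase in disk capacity, growth_after(1.6, 10), comes to roughly 110x in a decade.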
Technologies for Building Processors and Memories
• Transistor: an on/off switch controlled by electricity
• Integrated circuit (IC): combines dozens to hundreds of transistors on a single chip
• Very large scale integrated circuit (VLSI): combines millions of transistors on a single chip
Transistor
Each individual transistor functions like a miniature switch that can be operated
electrically.
When a small current flows between points B (base) and E (emitter), a large current can flow between
points C (collector) and E.
When the small current stops flowing through the control connection, the large current
also stops flowing.
Example: Growth in DRAM Chip Capacity
[Chart: Kbit capacity (log scale, 10 to 1,000,000) versus year of introduction (1980 to 2000); labeled data points: 64, 256, 1,000, 4,000, 16,000, 64,000, and 256,000 Kbit]
Below the Program
High-level language program (in C)
swap (int v[], int k)
. . .
↓ C compiler (one-to-many)
Assembly language program (for MIPS)
swap:
    sll $2, $5, 2
    add $2, $4, $2
    lw $15, 0($2)
    lw $16, 4($2)
    sw $16, 0($2)
    sw $15, 4($2)
    jr $31
↓ assembler (one-to-one)
Machine (object) code (for MIPS)
000000 00000 00101 0001000010000000
000000 00100 00010 0001000000100000
100011 00010 01111 0000000000000000
100011 00010 10000 0000000000000100
101011 00010 10000 0000000000000000
101011 00010 01111 0000000000000100
000000 11111 00000 0000000000001000
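The column grouping of the object code (6 bits, 5 bits, 5 bits, 16 bits) follows the MIPS instruction fields. As a rough sketch, the C code below splits a 32-bit word into those fields with shifts and masks. The struct and function names (mips_fields, decode, sign_extend16) are hypothetical; the low 16 bits are shown both as the R-type rd/shamt/funct fields and as the I-type immediate used by lw and sw.

```c
#include <assert.h>
#include <stdint.h>

/* Fields of a 32-bit MIPS instruction word (illustrative names).
 * R-type instructions use rd/shamt/funct; I-type instructions such as
 * lw and sw use the same low 16 bits as one immediate value. */
struct mips_fields {
    unsigned op, rs, rt;      /* bits 31-26, 25-21, 20-16 */
    unsigned rd, shamt, funct; /* bits 15-11, 10-6, 5-0 (R-type view) */
    unsigned imm;             /* bits 15-0 (I-type view) */
};

struct mips_fields decode(uint32_t word) {
    struct mips_fields f;
    f.op    = (word >> 26) & 0x3F;
    f.rs    = (word >> 21) & 0x1F;
    f.rt    = (word >> 16) & 0x1F;
    f.rd    = (word >> 11) & 0x1F;
    f.shamt = (word >> 6)  & 0x1F;
    f.funct = word & 0x3F;
    f.imm   = word & 0xFFFF;
    return f;
}

/* lw/sw offsets are signed: widen a 16-bit immediate to a signed 32-bit value. */
int32_t sign_extend16(uint32_t imm) {
    assert(imm <= 0xFFFFu);
    return (imm & 0x8000u) ? (int32_t)imm - 0x10000 : (int32_t)imm;
}
```

Decoding 0x00821020 (the second row of the object code) gives op 0, rs 4, rt 2, rd 2, funct 0x20, which is add $2, $4, $2. Decoding 0x8C500004 (the fourth row) gives op 35, rs 2, rt 16, imm 4, which is lw $16, 4($2).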
MIPS Core Instruction Set
Advantages of Higher-Level Languages
Higher-level languages
• Allow the programmer to think in a more natural language suited to the
intended use (Fortran for scientific computation, Cobol for business
programming, Lisp for symbol manipulation, …)
• Improve programmer productivity: more understandable code that is
easier to debug and validate
• Improve program maintainability
• Allow programs to be independent of the computer on which they
are developed (compilers and assemblers can translate high-level
language programs to the binary instructions of any machine)
• Emergence of optimizing compilers that produce very efficient
assembly code optimized for the target machine
As a result, very little programming is done today at the
assembler level
Input Device Inputs Object Code
[Diagram: the swap object code entering through the Input device; Processor (Control, Datapath), Memory, Devices (Input, Output)]
Object Code Stored in Memory
[Diagram: the swap object code held in Memory; Processor (Control, Datapath), Devices (Input, Output)]
Processor Fetches an Instruction
Processor fetches an instruction from memory
[Diagram: the swap object code held in Memory, one instruction moving to the Processor (Control, Datapath); Devices (Input, Output)]
Control Decodes the Instruction
Control decodes the instruction to determine what to execute
[Diagram: instruction 000000 00100 00010 0001000000100000 in Control; Processor (Control, Datapath), Memory, Devices (Input, Output)]
Datapath Executes the Instruction
Datapath executes the instruction as directed by control
[Diagram: instruction 000000 00100 00010 0001000000100000 in Control; Datapath computes contents of Reg #4 ADD contents of Reg #2, with the result put in Reg #2]
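The fetch, decode, execute sequence can be sketched as a toy simulator in C. This is an illustration under simplifying assumptions, not how real hardware or SPIM is built: memory is a small word array holding both code and data, only the five instructions used by swap are handled (sll, add, lw, sw, jr), and add does not trap on overflow. All names (machine, step, run_swap, SWAP_CODE) are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define MEM_WORDS 64

/* Toy machine state: 32 registers, a program counter, and one memory
 * holding both instructions and data. Byte addresses index mem[] / 4. */
struct machine {
    uint32_t reg[32];
    uint32_t pc;
    uint32_t mem[MEM_WORDS];
};

/* The seven instruction words of swap, taken from the object code rows. */
static const uint32_t SWAP_CODE[7] = {
    0x00051080u, /* sll $2, $5, 2  */
    0x00821020u, /* add $2, $4, $2 */
    0x8C4F0000u, /* lw  $15, 0($2) */
    0x8C500004u, /* lw  $16, 4($2) */
    0xAC500000u, /* sw  $16, 0($2) */
    0xAC4F0004u, /* sw  $15, 4($2) */
    0x03E00008u  /* jr  $31        */
};

/* Fetch, decode, and execute one instruction. Returns 0 after jr, else 1. */
int step(struct machine *m) {
    assert(m->pc % 4 == 0 && m->pc / 4 < MEM_WORDS);
    uint32_t word = m->mem[m->pc / 4];              /* fetch */
    m->pc += 4;

    uint32_t op = word >> 26;                       /* decode */
    uint32_t rs = (word >> 21) & 31, rt = (word >> 16) & 31;
    uint32_t rd = (word >> 11) & 31, shamt = (word >> 6) & 31;
    uint32_t funct = word & 63;
    uint32_t imm = word & 0xFFFF;
    if (imm & 0x8000) imm |= 0xFFFF0000u;           /* sign-extend offset */
    uint32_t addr = m->reg[rs] + imm;               /* used by lw/sw only */

    if (op == 0 && funct == 0x00) {                 /* execute: sll */
        m->reg[rd] = m->reg[rt] << shamt;
    } else if (op == 0 && funct == 0x20) {          /* add, no overflow trap */
        m->reg[rd] = m->reg[rs] + m->reg[rt];
    } else if (op == 0 && funct == 0x08) {          /* jr */
        m->pc = m->reg[rs];
        return 0;
    } else if (op == 0x23) {                        /* lw */
        assert(addr % 4 == 0 && addr / 4 < MEM_WORDS);
        m->reg[rt] = m->mem[addr / 4];
    } else if (op == 0x2B) {                        /* sw */
        assert(addr % 4 == 0 && addr / 4 < MEM_WORDS);
        m->mem[addr / 4] = m->reg[rt];
    } else {
        assert(!"unsupported instruction");
    }
    m->reg[0] = 0;                                  /* $0 always reads as zero */
    return 1;
}

/* Load swap at address 0, pass v's byte address in $4 and k in $5, and run
 * until jr. Returns the number of instructions executed. */
int run_swap(struct machine *m, uint32_t v_addr, uint32_t k) {
    for (int i = 0; i < 7; i++) m->mem[i] = SWAP_CODE[i];
    m->pc = 0;
    m->reg[4] = v_addr;
    m->reg[5] = k;
    m->reg[31] = 0x100;                             /* arbitrary return address */
    int count = 1;
    while (step(m)) count++;
    return count;
}
```

With an array stored at byte address 64 holding 10, 20, 30, calling run_swap(&m, 64, 1) executes seven instructions and leaves the array as 10, 30, 20. Keeping code and data in the same array mirrors the "Object Code Stored in Memory" slide: the processor fetches instructions from the same memory that lw and sw read and write.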