Multicore Programming


Introduction
This course is all about how computers work
But what do we mean by a computer?






Different types: desktops, servers, embedded devices
Different uses: automobiles, graphics, finance, genomics…
Different manufacturers: Intel, Apple, IBM, Microsoft, Sun…
Different underlying technologies and different costs!
Analogy: Consider a course on “automotive vehicles”



Many similarities from vehicle to vehicle (e.g., wheels)
Huge differences from vehicle to vehicle (e.g., gas vs. electric)
Best way to learn:



Focus on a specific instance and learn how it works
While learning general principles and historical perspectives
Why learn this stuff?

You want to call yourself a “computer scientist”
You want to build software people use (need performance)
You need to make a purchasing decision or offer “expert” advice

Both Hardware and Software affect performance:






Algorithm determines number of source-level statements
Language/Compiler/Architecture determine machine instructions (Chapters 2 and 3)
Processor/Memory determine how fast instructions are executed (Chapters 5, 6, and 7)
Assessing and Understanding Performance in Chapter 4
What is a computer?
Components:





Input (mouse, keyboard)
Output (display, printer)
Memory (disk drives, DRAM, SRAM, CD)
Network
Our primary focus: the processor (datapath and control)




Implemented using millions of transistors
Impossible to understand by looking at each transistor
We need to learn the logical design of each component
Number of Distinct Processors Sold
Embedded processors prevail



Embedded computer
Cell phones, car computers, digital TVs, videogame consoles, …
Designed to run dedicated applications
Annual growth rate of 40% (9% for desktops and servers)
[Figure: number of distinct processors sold per year, 1998-2002, in millions of computers; series include embedded computers, desktops, and servers]
Uniprocessor Performance
[Figure: performance relative to the VAX-11/780 (log scale, 1 to 10000) by year, 1978-2006; growth-rate annotations of 25%/year, 52%/year, and 20%/year]
Contributor 1: Technology
Processor



logic capacity: about 30% per year
clock rate: about 20% per year
Memory




DRAM capacity: about 60% per year (4x every 3 years)
Memory speed: about 10% per year
Cost per bit: improves about 25% per year
Disk


capacity: about 60% per year
Technology Improvement
Moore's law

The number of transistors per integrated circuit would double every 18 months
Gordon Moore (co-founder of Intel)
[Figure: transistors per chip (log scale, 10^3 to 10^8) by year, 1970-2005, for the i80x86, M68K, MIPS, and Alpha processors]
Contributor 2: Computer Architecture
Exploiting Parallelism (Single processor)




Pipelining
Superscalar
VLIW (Very Long Instruction Word)
Multiprocessor
Media Instructions
Cache Memory



Advanced Architectural Features

Parallelism in processing



Instruction-level parallelism (ILP)
  Superscalar
  Out-of-order execution
  Branch prediction
  VLIW (software approach)
Data-level parallelism (DLP) & Task-level parallelism (TLP)
  SIMD instructions (media processing)
  Multicore (multi-processor)
Latency and capacity in memory system


Low latency access using cache memory
Capacity increase in main memory
Superscalar

Multiple functional units


Multiple integer units
Multiple floating point units
Examples: Alpha, Pentium
How do computers work?

Need to understand abstractions such as:












Applications software
Systems software
Assembly Language
Machine Language
Architectural Issues: e.g., Caches, Virtual Memory, Pipelining
Sequential logic, finite state machines
Combinational logic, arithmetic circuits
Boolean logic, 1s and 0s
Transistors used to build logic gates (CMOS)
Semiconductors/Silicon used to build transistors
Properties of atoms, electrons, and quantum dynamics
So much to learn!
Levels of Abstraction


Delving into the depths reveals more information about machines
An abstraction omits unneeded detail, helps us cope with complexity
High-level language program (in C)
    void swap(int v[], int k)
    {
        int temp;
        temp = v[k];        /* save v[k] */
        v[k] = v[k+1];      /* copy v[k+1] into v[k] */
        v[k+1] = temp;      /* store the saved value in v[k+1] */
    }
compiler
Assembly language program (for MIPS)
    swap:
        muli $2, $5, 4      # $2 = k * 4, the byte offset of v[k]
        add  $2, $4, $2     # $2 = address of v[k]
        lw   $15, 0($2)     # $15 = v[k]
        lw   $16, 4($2)     # $16 = v[k+1]
        sw   $16, 0($2)     # v[k] = $16
        sw   $15, 4($2)     # v[k+1] = $15
        jr   $31            # return to the caller
assembler
Binary machine language program (for MIPS)
    00000000101000010000000000011000
    00000000000110000001100000100001
    10001100011000100000000000000000
    10001100111100100000000000000000
    10101100111100100000000000000000
    10101100011000100000000000000100
    00000011111000000000000000001000
Instruction Set Architecture (ISA)

A very important abstraction
  Interface between hardware and low-level software
  Standardizes instructions, machine language bit patterns, etc.
  Advantage: different implementations of the same architecture
  Disadvantage: sometimes prevents using new innovations
Design of instruction set
  How to specify data location
  Which instructions to include
  Which data formats to support
  How to encode instructions
Modern instruction set architectures:
  IA-32, PowerPC, MIPS, SPARC, ARM, and others
Historical Perspective

ENIAC built in World War II






The first general-purpose electronic computer
Used for computing artillery firing tables
80 feet long by 8.5 feet high and several feet wide
Each of the twenty 10-digit registers was 2 feet long
Used 18,000 vacuum tubes
Performed 1900 additions per second
Moore’s Law: transistor capacity doubles every 18-24 months
Before ENIAC
Stored Program Computers



Instructions and data stored as binary numbers in memory
Each instruction or data item is referenced by its address
Advent of the EDVAC, described by John von Neumann
Electronic Computers 2nd Generation

Technologies
  Processor: transistors
  Memory: magnetic cores
General-purpose computers
  IBM System/360
    Same architecture for a wide range of computers
  Digital Equipment PDP-8
Supercomputer
  Control Data 6600
Electronic Computers 3rd Generation

Technologies
  Processor: IC
  Memory: cores, SRAM and DRAM
IBM S/370
DEC PDP-11, VAX-11
CDC 7600
Cray-1
Electronic Computers 4th Generation

Technologies
  Processor: VLSI
  Memory: SRAM and DRAM
IBM 3990, 4380
DEC VAX 8400
Vector supercomputers
  Cray-2, Cray X-MP
  Fujitsu, Hitachi, NEC
Electronic Computers 5th Generation

Technologies
  VLSI, SRAM, and DRAM with design tools
    Read “Singularity is coming”
RISC processor
  MIPS
  PA-RISC
  SPARC
  Alpha
  PowerPC
CISC processor
  Intel Pentium
  AMD
Lessons from Computer History

A new technology creates a new market
  IBM S/360 triggers business applications
  High-density VLSI enables personal mobility
Architecture is resurrected
  Simple ones in the '60s because of technology limits
  Complex ones in the '80s for serving many people
  Simple ones again for mobility and low power
  Now?
Mass market calls for standardization
Niche market is profitable but vulnerable to new technology
  Cray, Apple, Sun, SGI