A top down look at computer technology

Computer Architecture
Slide Sets
WS 2010/2011
Prof. Dr. Uwe Brinkschulte
Prof. Dr. Klaus Waldschmidt
Part 4
Fundamentals
in Computer Technology
Computer Architecture – Part 4 – page 1 of 35 – Prof. Dr. Uwe Brinkschulte, Prof. Dr. Klaus Waldschmidt
Hier wird Wissen Wirklichkeit
Technology trends:
things are getting smarter
Technology trends:
networked systems of the future
Networked systems: ambient intelligence, ubiquitous computing, disappearing computer, pervasive computing
Technology trends:
heterogeneous hardware-software-systems (HW/SW)
System on Chip
Processor-Core, FPGA, RF,
Bluetooth ... + Software
Heterogeneity also in the
environment (application)
Technology trends:
cyber physical systems (CPS)
Integration of physical systems and networked computing
In classical embedded systems, the physical environment is controlled by the computer
In cyber physical systems, the physical environment and the computer(s) closely interact and cooperate
Examples: power grids, networks of autonomous vehicles, air traffic control, …
System on Chip (SoC)
Functional and technology aspects
→ Components, Interfaces + Technologies
Analog/Digital Systems
Technology Aspects: → process combinations necessary
[Figure: System-on-Chip block diagram with physical interfaces and binary interfaces. Blocks: antenna, RF, transmitter, AMP, ADC, DAC, DSP, CPU, memory, FPGA/ASIC, FPFA, sensors, audio, video, cable, data link, communication, control, bus/interface, clock (Clk), power, power management]
An analog component with a digital processor core
Embedded digital processors are mainly used in analog environments. Life science and technical applications are mainly analog. Therefore, analog-to-digital conversion and vice versa are necessary.
[Figure: a digital core (D) embedded in an analog environment (A), connected through AD/DA converters]
Microprocessor
Important component of all
modern analog and digital
applications.
It became the basic measure
for the technological progress
in VLSI.
It became the workhorse
of all modern IT
applications
Modern processor chips
Microphotographs of processor chip layout
Intel Pentium Processor
Analog Devices ADSP 21060
Intel Pentium IV
IBM Power PC 750
Altera FPGA with ARM-core
System-on-Chip
(Bluetooth SoC: Eynde et al., Alcatel, 2001)
The Moore curve
This prediction by Gordon Moore describes the progress of technology over the last decades: the complexity of integrated circuits (VLSI) doubles every 18 months.
[Figure: number of transistors per chip (CMOS, logarithmic axis from 10K to 1000M) over the years 1975 to 2010, with the data points 4004, 8080, 8086, 80286, 80386, 80486, Pentium™, P III and P IV; the curve continues until the end of the Moore era (end of this decade)]
Silicon will remain the basic material for the coming years.
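The doubling rule stated above can be turned into a small calculation. The following is a minimal sketch, assuming an idealized exponential with a fixed doubling period of 18 months; the function name `growth_factor` is illustrative and not part of the slides.

```python
def growth_factor(years, doubling_years=1.5):
    """Factor by which IC complexity grows in `years`,
    assuming one doubling every `doubling_years` (18 months = 1.5 years)."""
    return 2 ** (years / doubling_years)


if __name__ == "__main__":
    # Growth over a few time spans under the idealized doubling rule.
    for span in (1.5, 3, 6, 15):
        print(f"{span:>4} years -> factor {growth_factor(span):,.0f}")
```

Real product data does not have to follow this idealized curve exactly; the sketch only shows how quickly a fixed doubling period compounds.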
The scaling (c)
year                  2000   2001   2002   2003   2004   2005   2008   2011   2014
structure size [µm]   0.18   0.18   0.13   0.13   0.13   0.1    0.07   0.03   0.01

[Figure: structure size [µm] (axis from 0.05 to 0.2) plotted over the years 1999 to 2014]

Signal delay of active components scales with c
Signal delay of passive components will be nearly constant
Signal delays are dominated by the delay of the wires (passive components)
The area scales with c²
Wire-centered design instead of only logic-optimized design, or better a combination of both
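The scaling rules on this slide can be illustrated numerically. The sketch below assumes the simple rules exactly as stated (area with c², active delay with c, wire delay constant); the function names and the starting split between gate and wire delay are hypothetical values chosen only for illustration.

```python
def scale_factors(c):
    """Relative change of each quantity when all structure sizes are multiplied by c."""
    return {
        "area": c ** 2,        # the area scales with c^2
        "active_delay": c,     # delay of active components scales with c
        "wire_delay": 1.0,     # delay of passive components stays nearly constant
    }


def wire_share(active_delay, wire_delay, c):
    """Fraction of the total signal delay caused by wires after scaling by c."""
    f = scale_factors(c)
    active = active_delay * f["active_delay"]
    wire = wire_delay * f["wire_delay"]
    return wire / (active + wire)


if __name__ == "__main__":
    c = 0.13 / 0.18  # shrink from 0.18 µm to 0.13 µm structure size
    print("scale factors:", scale_factors(c))
    # Hypothetical starting point: gates and wires contribute equally to the delay.
    print("wire share before:", wire_share(1.0, 1.0, 1.0))
    print("wire share after: ", wire_share(1.0, 1.0, c))
```

With every shrink the wire share of the total delay grows, which is the motivation for the wire-centered design mentioned above.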
A top down look at computer
technology
Architecture Level
Microarchitecture Level
Register Transfer Level
Gate Level
Transistor Level
Charge Level
…
A top down look at computer
technology
Architecture Level
Instruction Set Architecture, Memory Sizes, Clock Frequency, …
Computer
A top down look at computer
technology
Microarchitecture Level
Computer
Microprocessor, CPU
control unit
functional unit
(datapath)
connection (bus)
memory
(program, data)
input/output
A top down look at computer
technology
Register Transfer Level
[Figure: register transfer view of the CPU. Control unit: instruction (program), state register, next state, output function, data path control. Data path: data in, data register, functional units]
A top down look at computer
technology
Gate Level
[Figure: gate-level view of a functional unit built from AND gates (&), with inputs e1, e2, e3, output a, and the 0/1 signal waveforms over time t]
A top down look at computer
technology
Transistor Level
A top down look at computer
technology
Charge Level
Delay times
Planar process technology
[Figure: cross-section of the layer stack, from top to bottom: passivation (field oxide), metal layer 3, oxide 4, metal layer 2, oxide 3, metal layer 1, oxide 2, polysilicon, oxide 1, transistor, substrate. Marked parasitics: capacitance metal to metal, capacitance metal to substrate. Oxide = SiO2]
Delay times
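Modeling of signal delays in wires for the connection of active components (gates) in chip layouts

[Figure: a wire divided into 1 µm segments and its model, an RC ladder with input voltage U_E, series resistances R_1, R_2, ..., R_n, node capacitances C_1, C_2, ..., C_n and output voltage u_n]

Voltage at node n for a given input voltage U_E

Differential equation:
$$U_n(t) = U_E - \sum_{j=1}^{n} \left( C_j \frac{du_j}{dt} \sum_{i=1}^{j} R_i \right)$$
Here $C_j \, du_j/dt$ is the current through the capacitance at node j, and $\sum_{i=1}^{j} R_i$ is the total resistance at node j.

Solution: delay t_d at node n:
$$t_d = \int_0^{\infty} \left( U_E - U_n(t) \right) dt = \sum_{j=1}^{n} \sum_{i=1}^{j} R_i C_j u_j$$
where u_j is the voltage swing at node j (u_j = 1 when all voltages are normalized to U_E).

Approximate solution of the differential equation for $R_i = R_0 \; \forall i$ and $C_i = C_0 \; \forall i$ (values per 1 µm of wire length):
$$t_d = R_0 C_0 \frac{n(n+1)}{2}$$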
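The delay sum above is straightforward to evaluate in code. The following is a minimal sketch, assuming normalized voltages (u_j = 1); the function names `elmore_delay` and `uniform_wire_delay` are illustrative choices, not names from the slides. It also checks that the general double sum agrees with the closed form for a uniform wire.

```python
def elmore_delay(resistances, capacitances):
    """Delay at the last node of an RC ladder:
    t_d = sum over j of C_j * (R_1 + ... + R_j), with normalized voltages."""
    if len(resistances) != len(capacitances):
        raise ValueError("need one capacitance per resistance")
    total_r = 0.0
    delay = 0.0
    for r, c in zip(resistances, capacitances):
        total_r += r          # total resistance between the input and this node
        delay += total_r * c  # contribution of the capacitance at this node
    return delay


def uniform_wire_delay(r0, c0, n):
    """Closed form for identical segments: t_d = R0 * C0 * n * (n + 1) / 2."""
    return r0 * c0 * n * (n + 1) / 2


if __name__ == "__main__":
    n = 1000  # e.g. a 1 mm wire modeled as 1 µm segments
    r0, c0 = 1.0, 1.0  # arbitrary units, for comparing the two formulas only
    print(elmore_delay([r0] * n, [c0] * n), uniform_wire_delay(r0, c0, n))
```

Because the delay grows with n(n+1)/2, doubling the wire length roughly quadruples the delay.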
Typical values for resistance, capacitance and the resulting delay times for integrated wires

               capacitance [fF/µm]   resistance [Ω/µm]   delay time [ns] per 1 mm (3 µm)
metal          C_MF = 0.1            R_M = 0.02          t_dM = 0.001
polysilicon    C_PF = 0.15           R_P = 17            t_dP = 1.3
n-diffusion    C_JN = 1.8            R_D = 7             t_dD = 6.3
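The delay column can be reproduced from the capacitance and resistance columns with the approximation t_d = R0 C0 n(n+1)/2. The sketch below assumes that a 1 mm wire is modeled as n = 1000 segments of 1 µm each; the dictionary `WIRES` and the function `delay_ns` are illustrative names.

```python
WIRES = {
    # material: (capacitance in fF/µm, resistance in Ω/µm), values from the table
    "metal": (0.1, 0.02),
    "polysilicon": (0.15, 17.0),
    "n-diffusion": (1.8, 7.0),
}


def delay_ns(c_ff_per_um, r_ohm_per_um, length_um):
    """Wire delay in ns using t_d = R0 * C0 * n * (n + 1) / 2 with one segment per µm."""
    n = int(length_um)
    seconds = r_ohm_per_um * (c_ff_per_um * 1e-15) * n * (n + 1) / 2
    return seconds * 1e9


if __name__ == "__main__":
    for material, (c, r) in WIRES.items():
        print(f"{material:12s} {delay_ns(c, r, 1000):.3f} ns per 1 mm")
```

The large spread between metal and the other two materials explains why long connections are routed in metal layers.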
Example of a real wire model
[Figure: RC ladder driven by the input voltage U_E, with R1 = R2 = R3 = 300 Ω and C1 = C2 = C3 = 350 fF; measure points 1 to 4 along the wire]
Plot of the node voltage at different measure points per wire length
[Figure: node voltage (0 V to 4.0 V) over time (0 ns to 10 ns) for measure points 1 to 4]
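The behaviour of the three-stage wire model (300 Ω and 350 fF per stage) can be approximated with a short simulation. This is a sketch under stated assumptions, not a reconstruction of the original plot: the input is assumed to be an ideal 4 V step (the slides do not give the input waveform), the nodes after R1, R2 and R3 are taken as the measure points behind the input, and a simple forward Euler integration is used.

```python
def elmore_delay(resistances, capacitances):
    """t_d = sum over j of C_j * (R_1 + ... + R_j)."""
    total_r, delay = 0.0, 0.0
    for r, c in zip(resistances, capacitances):
        total_r += r
        delay += total_r * c
    return delay


def simulate_rc_ladder(r_ohm, c_farad, u_in, t_end, dt):
    """Forward Euler simulation of an RC ladder driven by a voltage step u_in.
    Returns the final node voltages and the time at which each node
    first reaches 50 % of u_in."""
    n = len(r_ohm)
    v = [0.0] * n            # voltages at the nodes after R1, R2, ..., Rn
    crossing = [None] * n
    steps = int(round(t_end / dt))
    for step in range(1, steps + 1):
        new_v = []
        for k in range(n):
            left = u_in if k == 0 else v[k - 1]
            i_in = (left - v[k]) / r_ohm[k]  # current arriving from the left
            i_out = (v[k] - v[k + 1]) / r_ohm[k + 1] if k + 1 < n else 0.0
            new_v.append(v[k] + dt * (i_in - i_out) / c_farad[k])
        v = new_v
        for k in range(n):
            if crossing[k] is None and v[k] >= 0.5 * u_in:
                crossing[k] = step * dt
    return v, crossing


if __name__ == "__main__":
    R = [300.0] * 3      # values from the wire model
    C = [350e-15] * 3
    U_IN = 4.0           # assumed step amplitude in volts (hypothetical)
    final_v, t50 = simulate_rc_ladder(R, C, U_IN, t_end=10e-9, dt=1e-12)
    for k, t in enumerate(t50, start=1):
        print(f"node after R{k}: 50 % level reached after {t * 1e12:.0f} ps")
    print(f"delay sum at the last node: {elmore_delay(R, C) * 1e12:.0f} ps")
```

The nodes further along the wire reach the 50 % level later, which matches the qualitative message of the plot: the signal arrives with increasing delay at each measure point.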
The clock
The increasing size of modern chips (chip area) means that the connections between the components on the chip also become longer.
Example:
- 10 GHz clock frequency (0.1 ns clock period)
- 30 mm distance between two components means 10 clock cycles on the wire (silicon)
The clock skew in this example is 1 ns. This is too high for sequential synchronous circuits.
The clock skew has to be considered in the design phase and/or has to be avoided by architectural solutions (clock tree, wave pipelining, etc.).
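The numbers of this example can be checked with a short calculation. The sketch assumes a signal velocity of 30 mm/ns on the wire, which is the value implied by the example (30 mm in 1 ns) and not stated explicitly on the slide; the function name is illustrative.

```python
def cycles_on_wire(distance_mm, clock_hz, velocity_mm_per_ns):
    """Return (travel time in ns, number of clock cycles that pass while
    a signal travels `distance_mm`)."""
    travel_ns = distance_mm / velocity_mm_per_ns
    period_ns = 1e9 / clock_hz
    return travel_ns, travel_ns / period_ns


if __name__ == "__main__":
    # 30 mm/ns is an assumption derived from the example above.
    travel_ns, cycles = cycles_on_wire(30.0, 10e9, velocity_mm_per_ns=30.0)
    print(f"travel time {travel_ns:.1f} ns = {cycles:.0f} clock cycles")
```

The same function shows that at lower clock frequencies the same wire costs proportionally fewer cycles, so the problem grows with both chip size and clock frequency.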
Memory gap
Access time is dominated by charging and discharging of the memory capacitance:
t_a = R_transistor · C_memory
C cannot be reduced below 10 fF, to avoid data loss by alpha particles
R is limited by the area of the memory transistor
=> t_a is limited
[Figure: typical design of a DRAM cell with memory transistor, memory capacitance and wire capacitance]
Memory gap
The memory capacity scales with the square of the chip size.
As seen before, access time unfortunately does not scale this way.
Therefore a memory gap exists.
Caches are a good solution, but they can bridge the gap only insufficiently because of the limited locality of typical programs.
Increasing clock speed generates an increasing memory gap.
Because of the memory gap, an increasing clock speed does not automatically result in a higher execution speed of programs.
[Figure: CPU and memory, separated by the memory gap]
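Why a faster clock does not translate directly into faster programs can be shown with a very simple timing model. The sketch below assumes that execution time is the sum of a compute part, which scales with the clock, and a memory part with fixed access time; all workload numbers are hypothetical and chosen only for illustration.

```python
def execution_time_ns(cpu_cycles, clock_ghz, memory_accesses, memory_latency_ns):
    """Total run time: compute time shrinks with the clock, memory time does not."""
    compute_ns = cpu_cycles / clock_ghz               # 1 GHz = 1 cycle per ns
    memory_ns = memory_accesses * memory_latency_ns   # fixed access time per access
    return compute_ns + memory_ns


def speedup(clock_factor, cpu_cycles, clock_ghz, memory_accesses, memory_latency_ns):
    """Speedup obtained by multiplying the clock frequency by `clock_factor`."""
    before = execution_time_ns(cpu_cycles, clock_ghz, memory_accesses, memory_latency_ns)
    after = execution_time_ns(cpu_cycles, clock_ghz * clock_factor,
                              memory_accesses, memory_latency_ns)
    return before / after


if __name__ == "__main__":
    # Hypothetical workload: 1e6 compute cycles at 1 GHz and
    # 10,000 memory accesses that miss the cache, 50 ns each.
    print("speedup for 2x clock:", speedup(2.0, 1_000_000, 1.0, 10_000, 50.0))
```

In this model the memory part becomes a larger fraction of the run time with every clock increase, which is the growing memory gap described above.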
Unbalanced von Neumann
vN bottleneck
Power consumption
Reduction of power and energy consumption is a big issue today
On high end systems, heat dissipation has to be reduced
Modern mobile embedded systems need the reduction to increase battery
lifetime
=> Tradeoff between high performance and low power/energy
consumption
Main ways to reduce power and energy consumption
• Reduction of clock frequency
• Reduction of supply voltage
• Optimization of microarchitecture
Power consumption and
clock frequency
As seen before, CMOS only consumes power when switching
Therefore, in modern gate technologies the power consumption is mostly proportional to the clock frequency
P ~ f
Reduction of clock frequency means reduction of power consumption, but also a reduction of the system performance
The power supply voltage
P ~ U²

year   voltage High Performance [V]   voltage Low Power [V]
1999   1.8                            1.5
2000   1.8                            1.5
2001   1.5                            1.2
2002   1.5                            1.2
2003   1.5                            1.2
2004   1.2                            0.9
2005   1.2                            0.9
2008   0.9                            0.6
2011   0.6                            0.5
2014   0.6                            0.3

[Figure: voltage High Performance [V] and voltage Low Power [V] (axis from 0.5 to 1.5) plotted over the years 1999 to 2014]

The power supply voltage cannot easily be reduced below 1 to 0.5 V.
The reduction of supply voltage implies a reduction of the maximum clock frequency.
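The two proportionalities P ~ f and P ~ U² can be combined into a relative power estimate. The following sketch assumes that both relations hold at the same time (P ~ U² · f) and ignores every other effect; the function name `relative_power` is illustrative.

```python
def relative_power(voltage_ratio, frequency_ratio):
    """Power relative to a reference operating point, assuming P ~ U^2 * f."""
    return voltage_ratio ** 2 * frequency_ratio


if __name__ == "__main__":
    # Lowering the supply voltage from 1.8 V to 1.5 V at unchanged frequency.
    print("1.8 V -> 1.5 V:", round(relative_power(1.5 / 1.8, 1.0), 3))
    # Halving the clock frequency at unchanged voltage.
    print("f -> f/2:      ", relative_power(1.0, 0.5))
    # Both together.
    print("both:          ", round(relative_power(1.5 / 1.8, 0.5), 3))
```

Because the voltage enters quadratically, a moderate voltage reduction saves more power than the same relative reduction in frequency, at the price of a lower maximum clock frequency.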
Power consumption and
microarchitecture
Possible approaches:
• Reduction of external bus activities (stay local)
• Static Power Management (sleep instructions)
• Dynamic Power Management (control unit
deactivates non-used parts of the
microarchitecture)
• Increase of code density (saves memory space and
cycles)
Example: reducing memory
power consumption
The disadvantage of a good memory hierarchy is its high energy consumption.
The on-chip data and instruction caches alone need approx. 25 % of the total power of a processor chip.
[Figure: CPU with a common address space: kernel instructions are fetched from a small and fast kernel memory, other instructions through the instruction cache from off-chip memory]
A cache-oriented solution as an example of a power-aware microarchitecture.
SIA-Roadmap
year   frequency [MHz]   power consumption [W]
1999   1250              90
2000   1486              100
2001   1767              115
2002   2100              130
2003   2490              140
2004   2952              150
2005   3500              160
2008   6000              170
2011   10000             174
2014   13500             183

chip area [mm²] (seven entries, chronological): 340, 340, 372, 408, 468, 536, 615
transistors/chip [Mio] (seven entries, chronological): 23.8, 47.6, 95.2, 190, 539, 1523, 4308

[Figure: chip area [mm²], transistors/chip [Mio], frequency [MHz] and power consumption [W] plotted over the years 1999 to 2014]
The challenge (limit) of chip technology in the future
The production of chips with layout structures in dimensions under 0.1 µm has become very difficult with respect to lithography.
Therefore, only about 10% of today’s chips are produced using the latest technology.
Chips of more than 300 mm² chip area will include one or more faults on average, caused only by the technology process.
Increasing cost of chip production
Decreasing yield of chip production
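The link between chip area and yield can be made concrete with a simple model. The slide does not name a yield model; the sketch below assumes a Poisson distribution of faults, a common first-order approximation, and takes the fault density from the statement that a 300 mm² chip has one fault on average. Function names are illustrative.

```python
import math


def expected_faults(area_mm2, faults_per_mm2):
    """Average number of faults on a chip of the given area."""
    return area_mm2 * faults_per_mm2


def poisson_yield(area_mm2, faults_per_mm2):
    """Fraction of fault-free chips, assuming Poisson-distributed faults (an assumption)."""
    return math.exp(-expected_faults(area_mm2, faults_per_mm2))


if __name__ == "__main__":
    density = 1.0 / 300.0  # one fault per 300 mm² on average, from the statement above
    for area in (100, 300, 600):
        print(f"{area} mm²: yield {poisson_yield(area, density):.1%}")
```

Under this assumption the yield falls exponentially with area, so large chips become disproportionately expensive.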
Development of different parameters in
computer architecture in the past
[Figure: logarithmic plot (0.001 to 1,000,000) over the years 1975 to 2010 of the following parameters: bandwidth in Byte/s, clock in MHz, tape/disk in MByte, OS in kByte, memory in MBit, design in MTrans.]