SOC-CH8b - Custom Computing

Chapter 8
What’s Next: Challenges Ahead
Computer System Design
System-on-Chip
by M. Flynn & W. Luk
Pub. Wiley 2011 (copyright 2011)
soc 8.1
Future (1): Autonomous SOC
• SOC consists of multiple heterogeneous
processors, controllers and memory
• but it’s still only a part of a larger system:
with power source, disk, network interface,
transducers, I/O, actuators (speakers and
displays), etc
• how much of this can we put on a single die?
Perhaps all but the actuators
soc 8.2
ASOC: what it is
• a standalone (in so far as possible) SOC with
the sensors, actuators, controllers,
communications and storage
• capable of realizing complete
communications, sense, analysis, recognition
and motor actions
• must do more than simply sense and
communicate: it analyses data to create information
soc 8.3
Autonomous (untethered) chips
a simplified classification:
• simple identification of the die itself (as in RFID)
with RF response
• identification of a sensor detected “event” with
RF response (as in Smart Dust [Coo] and many
Smart Cards)
• detection of an “event” and processing
(classification, recognition, analyzing) the event
(ASOC) with RF response of the result
Ref: B.W. Cook et al, “SOC Issues for RF Smart Dust,” Proc IEEE
June 2006
soc 8.4
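Untethered systems: RFID vs ASOC
--------------------------------------------------------------------------------------------------------
System      Example            Power source        Maximum memory     RF range (m)     Compute
--------------------------------------------------------------------------------------------------------
Passive id  RFID, Smart Card   None                ROM ID (1 KB)      Passive; order   None
            (simple)                                                  of 1 m
Active id   Smart card,        Short term battery  R/W ID +           Active; 1-10 m   FSM or controller
            Active RFID                            parameters (2 KB)
RF sensor   Smart Dust;        Battery             R/W                10-20 m          FSM
            RFID + sensor
ASOC        -                  Integrated battery  Extensive (100 MB) 10+ m            More than one CPU
--------------------------------------------------------------------------------------------------------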
soc 8.5
ASOC basics
• small size
– basic die type: O(1/4 to 1 cm² × ½ mm)
– note, long term: could be printed on an ultra
thin base
• power and energy, self contained
• persistent storage
• communications with environment
(network)
• one or more types of sense and reaction
soc 8.6
Area, Time (Performance) and
Power design tradeoffs
[Figure: area, time and power tradeoffs; dynamic power only]
soc 8.7
Area
• cost of processed silicon: $1-10/cm²
• most designs target
– 1 cm² or a little less in area; the sweet spot
– gives 90%+ yield
• larger dies have
– lower yields
– fewer dies / wafer
– costs can be 10x for a doubling of die size
• small dies aren’t much cheaper
– limited by testing and packaging
soc 8.8
Silicon device density scaling
[Figure: transistor density (M transistors/cm²) and Flash memory density (MB/cm²) versus feature size (90, 65, 45, 32 nm)]
Net: a 1 cm² die holds either 1 billion transistors or 100 MB of Flash
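The two densities imply a simple linear trade-off when one die carries both logic and Flash. A minimal sketch, assuming the round figures from the slide (1 billion transistors/cm² and 100 MB/cm²); the helper name `die_budget` and the 50/50 split are illustrative assumptions, not part of the original slides.

```python
# Round densities taken from the slide; real values depend on the process.
TRANSISTORS_PER_CM2 = 1_000_000_000   # about 1 billion transistors per cm^2
FLASH_MB_PER_CM2 = 100                # about 100 MB of Flash per cm^2


def die_budget(die_area_cm2, flash_fraction):
    """Split a die between Flash and logic; return (transistors, flash_mb)."""
    if not 0.0 <= flash_fraction <= 1.0:
        raise ValueError("flash_fraction must be between 0 and 1")
    flash_area = die_area_cm2 * flash_fraction
    logic_area = die_area_cm2 - flash_area
    return logic_area * TRANSISTORS_PER_CM2, flash_area * FLASH_MB_PER_CM2


# Hypothetical split: half of a 1 cm^2 die for Flash, half for logic.
transistors, flash_mb = die_budget(1.0, 0.5)
print(f"{transistors:.2e} transistors, {flash_mb:.0f} MB Flash")
```

The model ignores pads, wiring and analog blocks, so it gives an upper bound rather than a floor plan.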
soc 8.9
Power and batteries
• eliminating the external battery is the key
technology for ASOC
– no pins, distribution problems
• must print or deposit battery on reverse side
of die
• can scavenge power
– source may be unreliable
– adds on die complexity
soc 8.10
Battery technology
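--------------------------------------------------------------------
Type                  Energy (J)   Recharge Y/N   Thickness (micron)
--------------------------------------------------------------------
Printed [1]           2 /cm²       N              20
Thin film [2]         10 /cm²      Y              100
Button (stand alone)  200          Y              500
--------------------------------------------------------------------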
[1] PowerID, Power Paper Corp. www.powerpaper.com
[2] The POWER FAB (Thin Film Lithium Ion
Cell) battery system, http://www.cymbet.com
soc 8.11
Scavenging Energy
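-------------------------------------------------------------------------------------
Source                    Charge rate                           Comment
-------------------------------------------------------------------------------------
Solar                     65 mW/cm²
Ambient light             2 mW/cm²
Strain and acoustic       2 mW/cm²; a force (sound) changes     Piezoelectric effect;
                          alignment of crystal structure,       about 2-4% efficient;
                          creating voltage                      see [SOD]
RF                        An electric field of 10 V/m yields    See [YEA]
                          16 µW/cm² of antenna
Temperature difference    40 µW at a 5 °C difference            Needs temperature
(Peltier effect)                                                differential
-------------------------------------------------------------------------------------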
[YEA] E.M. Yeatman, “Advances in power sources for wireless sensor nodes,”
Proceedings of 1st International Workshop on Body Sensor Networks, London, 2004
[SOD] Sodano et al., “Electric power harvesting using piezoelectric materials,”
10th SPIE Conference on Smart Structures and Materials, 2003
soc 8.12
Energy capacity at 1 µW usage
Net: an on-die battery will hold only about 10 Joules unless energy is scavenged.
At 1 µW and a 10% duty cycle (0.1 µW average) this lasts 10⁸ s, better than a 3 year lifetime.
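The lifetime claim can be checked with a few lines of arithmetic. A minimal sketch, reading the usage figure as 1 µW (the value that reproduces the 3 year result) and assuming the die draws nothing while idle; `lifetime_years` is a hypothetical helper name.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600


def lifetime_years(energy_joules, active_power_watts, duty_cycle):
    """Years of operation, assuming zero consumption while idle."""
    average_power = active_power_watts * duty_cycle
    return energy_joules / average_power / SECONDS_PER_YEAR


# 10 J on-die battery, 1 microwatt when active, 10% duty cycle.
years = lifetime_years(10.0, 1e-6, 0.10)
print(f"{years:.2f} years")
```

Any static (leakage) power adds directly to the average power and shortens this figure.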
soc 8.13
Power and performance
• with a power budget of 1 µW, how to provide
meaningful performance?
• if today’s processor offers 5 GHz at 100 W
– by the cubic rule, 1 µW (dynamic power) should
offer about 10.5 MHz: $(10^{-8})^{1/3} \approx 2.1 \times 10^{-3}$
• but with lots of transistors we need to use
them to recover performance
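The cubic-rule estimate above can be reproduced as follows. A minimal sketch, assuming dynamic power scales with the cube of frequency and reading the budget as 1 µW against a 100 W reference (a ratio of 10⁻⁸); `scaled_frequency` is a hypothetical helper name.

```python
def scaled_frequency(ref_freq_hz, ref_power_w, target_power_w):
    """Cubic rule: dynamic power ~ f^3, so f scales with the cube root of power."""
    return ref_freq_hz * (target_power_w / ref_power_w) ** (1.0 / 3.0)


# Reference point from the slide: 5 GHz at 100 W; target budget 1 microwatt.
f = scaled_frequency(5e9, 100.0, 1e-6)
print(f"{f / 1e6:.1f} MHz")
```

The unrounded cube root gives a little under 11 MHz; the slide's 10.5 MHz comes from rounding the factor to 2.1 × 10⁻³ first.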
soc 8.14
Power and performance
• with a power budget of 1 µW, how to provide
meaningful performance?
• static power is also a big issue
– must be O(0.1 µW)
• methods to reduce static power consumption
– sub threshold circuits
– power islands
– sleep transistors
– SOIAS: Silicon on Insulator Active Substrate
soc 8.15
Performance with low clock rate
• No clock: asynchronous logic
– no unnecessary state transitions
• Minimum or no cache system
– backed by compatible Flash
• VLIW and specialized multi processors
– to recover performance
soc 8.16
The ASOC die
soc 8.17
Networking: it’s the ensemble
that produces the action
• no individual die is expected to produce a
recognition or a definitive action
• recognitions are passed to neighbors; can be
sent to higher level for action or through
swarm computing arrived at via the network
• current software models are inadequate in
this regard
• for hardware, communication is the key
soc 8.18
Untethered inter die
communications
• Light or RF
• modulated light can be low power
– relatively easy to focus/ defocus
– free space signals are non interfering
– but, must have line of sight
• RF, components well understood
– RFID technology
– can require power; especially at high frequency
– antenna focuses power based on carrier
frequency
soc 8.19
On die lasers for
communications
[Figure: laser power (microwatts) versus pulse width (ns)]
It’s possible to achieve 100 Mbps with 1 µW without noise
But: ambient light is noise; need 10x signal over noise; also distance requires
optics to collimate the beam for low divergence
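A rough link budget makes these two statements concrete. A minimal sketch, reading the power figure as 1 µW (the plot's power axis is in microwatts); the function names and the 2 µW ambient-noise value are illustrative assumptions.

```python
import math


def energy_per_bit(power_watts, bit_rate_bps):
    """Energy spent per transmitted bit, in joules."""
    return power_watts / bit_rate_bps


def required_signal_power(noise_power_watts, snr=10.0):
    """Signal power needed to sit `snr` times above the ambient-light noise."""
    return snr * noise_power_watts


# Noise-free case from the slide: 1 microwatt at 100 Mbps.
e_bit = energy_per_bit(1e-6, 100e6)
print(f"{e_bit * 1e15:.0f} fJ per bit")

# Hypothetical ambient noise of 2 microwatts at the receiver.
print(f"{required_signal_power(2e-6) * 1e6:.0f} microwatts needed")
```

Beam divergence over distance is not modelled; it would raise the transmit power further unless optics collimate the beam.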
soc 8.20
RF, smart dust: 10¹¹ bits/Joule/meter.
65 x 30 x 25 mm prototype;
target 2 mm³
Ref: [Coo] B. W. Cook et al, “SOC Issues for RF Smart Dust,” Proc IEEE
June 2006
soc 8.21
Today’s ASOC; implantable RF
5 mW (peak), 2 meters, 1 Mbps
Ref: Zarlink’s medical implantable RF
soc 8.22
Audio sensors
• time or frequency domain
• ear uses frequency domain
• need sensitive chip mounted crystal transducers
– to provide signal (voltage) to sensor
soc 8.23
Audio sensors: Cochlear implant
[Figure: speech processor and transmitter; RF link; receiver and electrodes. Ref: Wikipedia]
soc 8.24
Audio sensors; cochlear chip; series of
frequency bandpass filters
Ref: B. Wen et al., “Active Bidirectional Coupling in a Cochlear Chip,” Advances
in Neural Information Processing Systems 17, Schölkopf Ed., MIT Press, 2006
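The frequency-domain idea behind a bank of bandpass filters can be imitated in software. This is a minimal sketch only, not the analog circuit of the cited chip: it runs independent second-order band-pass filters (standard biquad "cookbook" coefficients) in parallel and reports which channel responds most to a test tone. The sample rate, channel centres and Q value are assumptions chosen for illustration.

```python
import math


def bandpass_coefficients(center_hz, sample_rate_hz, q=4.0):
    """Second-order band-pass biquad with unity gain at the centre frequency."""
    w0 = 2.0 * math.pi * center_hz / sample_rate_hz
    alpha = math.sin(w0) / (2.0 * q)
    a0 = 1.0 + alpha
    b = (alpha / a0, 0.0, -alpha / a0)
    a = (-2.0 * math.cos(w0) / a0, (1.0 - alpha) / a0)
    return b, a


def run_filter(samples, b, a):
    """Apply the biquad using a direct-form I difference equation."""
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x0 in samples:
        y0 = b[0] * x0 + b[1] * x1 + b[2] * x2 - a[0] * y1 - a[1] * y2
        out.append(y0)
        x2, x1 = x1, x0
        y2, y1 = y1, y0
    return out


def channel_energies(samples, centers_hz, sample_rate_hz):
    """Mean output power of each band-pass channel in the bank."""
    energies = []
    for fc in centers_hz:
        b, a = bandpass_coefficients(fc, sample_rate_hz)
        y = run_filter(samples, b, a)
        energies.append(sum(v * v for v in y) / len(y))
    return energies


SAMPLE_RATE = 16000                       # assumed sample rate in Hz
CENTERS = [250, 500, 1000, 2000, 4000]    # assumed octave-spaced channels
tone = [math.sin(2 * math.pi * 1000 * n / SAMPLE_RATE) for n in range(4000)]
energies = channel_energies(tone, CENTERS, SAMPLE_RATE)
print(CENTERS[energies.index(max(energies))])
```

Each channel's output power plays the role of one frequency bin, so a 1 kHz tone excites the 1 kHz channel far more than its neighbours.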
soc 8.25
Vision and recognition: edges
64x64 pixel array (PEs) with
reconfiguration; PE chaining
and fast summation for edge
detection
Ref: T. Komuro, “SIMD Processor for Vision,” IEEE JSSC, vol. 39, no. 1,
Jan. 2004
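Edge detection on a pixel-parallel array amounts to each processing element comparing its pixel with its neighbours. The sketch below is a generic neighbour-difference detector run sequentially in Python, not the algorithm of the cited chip; the threshold and the test image are assumptions.

```python
def edge_map(image, threshold):
    """Mark pixels whose summed absolute difference to the right and lower
    neighbours exceeds the threshold (what each PE would compute locally)."""
    rows, cols = len(image), len(image[0])
    edges = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            right = abs(image[r][c] - image[r][c + 1]) if c + 1 < cols else 0
            down = abs(image[r][c] - image[r + 1][c]) if r + 1 < rows else 0
            if right + down > threshold:
                edges[r][c] = 1
    return edges


# 64x64 test image: dark left half, bright right half.
image = [[0] * 32 + [255] * 32 for _ in range(64)]
edges = edge_map(image, threshold=100)
print(sum(map(sum, edges)))   # number of edge pixels found
```

On a SIMD array every pixel's comparison happens at once, and the final count corresponds to the kind of global summation the slide mentions.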
soc 8.26
Vision and recognition: templates
Object recognition with color (RGB to HSI)
128 x 64 pixel element array
Uses SAD (sum of absolute differences) array scan to match against 32
templates (432 b each)
Achieves 30 frames per second with 1 mW
Ref: Etienne-Cummings, “A Vision Chip for Color Segmentation and Pattern
Matching,” EURASIP Journal on Applied Signal Processing 2003:7, 703–712
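Template matching by sum of absolute differences (SAD) slides each template over the image and keeps the position with the smallest difference. A minimal software sketch of that scan; the tiny image and the two 2 x 2 templates are made-up test data, and the chip's actual template format is not modelled.

```python
def sad(window, template):
    """Sum of absolute differences between two equal-sized 2-D blocks."""
    return sum(abs(w - t)
               for w_row, t_row in zip(window, template)
               for w, t in zip(w_row, t_row))


def best_match(image, templates):
    """Scan every template over the image; return (sad, template_index, row, col)."""
    best = None
    for index, template in enumerate(templates):
        t_rows, t_cols = len(template), len(template[0])
        for r in range(len(image) - t_rows + 1):
            for c in range(len(image[0]) - t_cols + 1):
                window = [row[c:c + t_cols] for row in image[r:r + t_rows]]
                score = sad(window, template)
                if best is None or score < best[0]:
                    best = (score, index, r, c)
    return best


# Hypothetical 6x8 image containing one copy of the second template.
image = [[0] * 8 for _ in range(6)]
image[2][3], image[2][4], image[3][3], image[3][4] = 9, 1, 1, 9
templates = [[[1, 9], [9, 1]], [[9, 1], [1, 9]]]
print(best_match(image, templates))
```

A SAD of zero means an exact match; in hardware the inner loops run in parallel across the pixel array.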
soc 8.27
Movement: 2D
• ASOC die is 10 x 10 x 0.6 mm and weighs
about 200 mg
• with 1 Joule of energy (10⁷ ergs);
lifting 200 mg by 1 cm requires about 200 ergs, or
20 µW-sec
• so as long as speed is slow (order of
cm / sec) or duty cycle is low, simple
motion using nano motors shouldn’t be
a problem
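The energy needed to lift the die follows from E = m g h. A minimal sketch, assuming standard gravity and ignoring motor efficiency and friction; `lift_energy_joules` is a hypothetical helper name.

```python
G = 9.8  # m/s^2, standard gravity (assumed)


def lift_energy_joules(mass_kg, height_m):
    """Potential energy gained when a mass is raised by height_m."""
    return mass_kg * G * height_m


energy = lift_energy_joules(200e-6, 0.01)   # 200 mg lifted by 1 cm
lifts_per_joule = 1.0 / energy              # ideal lifts available from 1 J
print(f"{energy * 1e7:.0f} ergs per lift, {lifts_per_joule:.0f} lifts per joule")
```

Under these assumptions one lift costs about 196 ergs (roughly 20 µJ), so a 1 Joule store covers tens of thousands of such moves before losses are counted.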
soc 8.28
Movement: Flight
Toward 30-gram Autonomous Indoor Aircraft:
Vision-based Obstacle Avoidance and Altitude Control
http://www.youtube.com/watch?v=Lv6amv0yDIg
J. Zufferey and D. Floreano, “Toward 30-gram Autonomous Indoor Aircraft:
Vision-based … Control,” Laboratory of Intelligent Systems, EPFL
soc 8.29
Movement: flight
J. Zufferey and D. Floreano, “Toward 30-gram Autonomous Indoor Aircraft:
Vision-based … Control,” Laboratory of Intelligent Systems, EPFL
soc 8.30
Movement: the fruit fly
Drosophila melanogaster
soc 8.31
Fruit fly
• length 2.5 mm; volume 2 mm³
• 0.65 milligram; 1 month lifetime
• vision: 800 units, each with 8 photoreceptors for
colors through the UV (200k neurons); 10x
better than human in temporal vision
• also olfaction, audition, learning/memory
• flight: wings beat 210 times/sec; move 10
cm/sec; rotate 90° in 50 ms
soc 8.32
Summary: ASOC
• the goal is to create a catalog of techniques,
sensors, controllers, transceivers and
processors together with an interconnection
and design methodology for application ASOC
• ASOC can be one or many die; external units as
required by system
• we’re a long way in Cost-Time-Power from a
fruit fly; but we’re making progress!
soc 8.33
Future (2): self-optimise/self-verify
• 2005 International Technology Roadmap for
Semiconductors: overall design challenges
– cost-driven design optimisation
– verification and test
– re-use
• approach to address all 3 challenges?
– key elements
– challenges
– summary
soc 8.34
Approach: key elements
• optimise and verify: hardware + software
– meet requirements efficiently and demonstrably
• self-optimising and self-verifying design (SOSV)
– preserve property in design composition
• self?
– aware of context
– capable of planning
– effective external control
• 2 stages
– pre-deployment: building design, compile time
– post-deployment: operational, run time
soc 8.35
Pre- and post-deployment
----------------------------------------------------------------------------------
                   Pre-deployment               Post-deployment
----------------------------------------------------------------------------------
focus              designer productivity        design efficiency
aim                optimise/verify initial      optimise according to
                   post-deployment design       situation
context            design tool environment,     operation environment,
                   often static                 often dynamic
acquire context    from parameters affecting    from data input
                   tool performance             e.g. sensors
planning           plan post-deployment         plan to meet
                   optimise/verify              post-deployment goals
external control   frequent                     infrequent
----------------------------------------------------------------------------------
soc 8.36
Re-use
• high-level generic design
– requirements + context: multiple designs
– optimise: options + parameters + abstraction levels
• facilitate design composition
– preserve self-optimising and self-verifying
– modularity of building blocks + interfaces
• platform-based evolution
– re-use un-verified design: risky
– automate re-verification after changes
– platform for re-use: from automatic to self
soc 8.37
Pre-deployment: overview
• available computing resources
– components + pre-context: locate and tune tools
– current context: optimise resource + error recovery
• designer
– adapt components: requirements + post-context
– choose control: automation of search strategies
– decide: re-use or re-invent
• challenges
– productive interaction: designer + tools
– avoid combinatorial explosion
– maximise re-use: incremental design
soc 8.38
Pre-deployment: example of choices
• circuit technology: e.g. ASIC or FPGA
• input/output: options
• memory: hierarchy + options
• interconnect: e.g. bus, switch, network-on-chip
• granularity: configurable unit, custom instruction
• synchronisation: e.g. clock domains, self-timed
• parallelism: processors, hardware/software
• data representation optimisation
• post-deployment optimisation/verification
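Even a handful of these choices multiplies into a sizeable design space, which is why the search needs automation. A minimal sketch that enumerates the combinations for four of the choices listed; the option lists are taken from the examples on the slide where given and are otherwise assumptions.

```python
from itertools import product

# Option lists for a few of the choices; a real flow would have many more.
CHOICES = {
    "circuit technology": ["ASIC", "FPGA"],
    "interconnect": ["bus", "switch", "network-on-chip"],
    "granularity": ["configurable unit", "custom instruction"],
    "synchronisation": ["clock domains", "self-timed"],
}


def design_points(choices):
    """Yield every combination of options as a dict keyed by choice name."""
    names = list(choices)
    for combo in product(*(choices[name] for name in names)):
        yield dict(zip(names, combo))


points = list(design_points(CHOICES))
print(len(points))   # product of the option counts
```

Adding a fifth choice with three options triples the count, so exhaustive enumeration quickly stops being practical.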
soc 8.39
Example: number of processors
From: Fidjeland & Luk 05
soc 8.40
Post-deployment: situation-specific
• optimization and verification opportunities
– design upgrade
– run-time conditions, e.g. noise, process variation
– program phase optimisation
• optimisation and verification process
– light-weight: on-site, e.g. proof-carrying code
– heavy-weight: remotely, verified by signature
• run-time system
– deals with exceptions
– error diagnosis facilities
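The heavy-weight path, where a design is verified remotely and accepted on site by checking a signature, can be illustrated with Python's standard library. This sketch uses an HMAC with a shared key as a stand-in for a real public-key signature scheme; the key, the design bytes and the function names are all hypothetical.

```python
import hashlib
import hmac


def sign_design(key, bitstream):
    """Tag computed remotely by the party that verified the design."""
    return hmac.new(key, bitstream, hashlib.sha256).digest()


def accept_upgrade(key, bitstream, tag):
    """On-site check: accept the upgrade only if the tag matches."""
    expected = hmac.new(key, bitstream, hashlib.sha256).digest()
    return hmac.compare_digest(expected, tag)


KEY = b"hypothetical-shared-key"
design = b"\x00\x01example configuration data"
tag = sign_design(KEY, design)

print(accept_upgrade(KEY, design, tag))                 # untouched design
print(accept_upgrade(KEY, design + b"tampered", tag))   # modified design
```

The on-site work is one hash computation, which is what keeps this path cheap for the deployed device even though the verification behind the tag was expensive.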
soc 8.41
Example: program phase optimisation
• program phase: working set remains constant
• reconfigure to speed up frequent branches
From: Styles and Luk 05
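If a program phase is a period in which the working set stays constant, a phase change shows up as a sudden drop in overlap between successive working sets. A minimal sketch using Jaccard similarity over fixed windows of a trace; the window size, threshold and synthetic address trace are assumptions, and this is not the method of the cited work.

```python
def jaccard(a, b):
    """Similarity of two sets: |a & b| / |a | b| (1.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def phase_boundaries(trace, window, threshold=0.5):
    """Return start indices of windows whose working set differs from the
    previous window's working set by more than the threshold allows."""
    boundaries = []
    previous = None
    for start in range(0, len(trace) - window + 1, window):
        current = set(trace[start:start + window])
        if previous is not None and jaccard(previous, current) < threshold:
            boundaries.append(start)
        previous = current
    return boundaries


# Hypothetical trace of basic-block addresses: loop A, then loop B.
trace = [0x10, 0x14, 0x18] * 8 + [0x80, 0x84, 0x88] * 8
print(phase_boundaries(trace, window=6))
```

Each detected boundary is a candidate point for reconfiguring the hardware to favour the branches that dominate the new phase.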
soc 8.42
Autonomous systems
• control strategy
– make decisions to optimise itself
– model of world: planning and action
– understand trade-offs: e.g. reconfigure or not
• event-driven just-in-time reconfiguration
– component meta-data description
– assemble + tune partially-optimised components
– hide reconfiguration latency
• other possibilities
– machine learning
– self-organising feature map
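The "reconfigure or not" trade-off can be expressed as a break-even test: reconfigure only when the expected time saved outweighs the reconfiguration latency. A minimal sketch under that assumption; the function name and the example timings are hypothetical, and a real controller would also weigh energy and prediction error.

```python
def should_reconfigure(calls_remaining, time_per_call_now,
                       time_per_call_after, reconfig_latency):
    """Reconfigure only if the total time saved exceeds the reconfiguration cost."""
    saving = calls_remaining * (time_per_call_now - time_per_call_after)
    return saving > reconfig_latency


# Hypothetical figures: 2 ms per call now, 1 ms after, 0.5 s to reconfigure.
print(should_reconfigure(1000, 2e-3, 1e-3, 0.5))   # long phase
print(should_reconfigure(100, 2e-3, 1e-3, 0.5))    # short phase
```

Hiding the reconfiguration latency, for example by overlapping it with useful work, lowers the effective cost term and moves the break-even point earlier.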
soc 8.43
Example: dynamic power optimisation
power surge while self-optimising/verifying
soc 8.44
Challenges: theory + practice for:
• productive automation: evolutionary vs disruptive
• SOSV design: specify + analyse requirements
• composable description: design + context
• multi-level capture: domain-specific constraints
• open standard: design, optimise/verify programs
soc 8.45
Summary
• self * (optimising+verifying) = trusted re-use
– unify: autonomic, self-test, dynamic optim., RTR
– better design + more productive
• self-optimising self-verifying design platform
– FPGA-based systems: large + small
– autonomous system-on-chip + network of ASOCs
– applications: ubiquitous, dependable, secure, robust
• new generation of designers
– building blocks + tools: made smarter
– specify, analyse, adapt: requirements + search
soc 8.46