Computer Classes: Why they form, and what's new "this" time


Options for embedded systems.
Constraints, challenges, and approaches
HPEC 2001
Lincoln Laboratory
25 September 2001
Gordon Bell
Bay Area Research Center
Microsoft Corporation
More architecture options:
Applications,
COTS (clusters, computers… chips),
Custom Chips…
The architecture challenge:
“One person’s system is another’s
component.” - Alan Perlis



Kurzweil: predicted hardware will be compiled
and be as easy to change as software by 2010
COTS: streaming, Beowulf, and www relevance?
Architecture Hierarchy:
– Application
– Scalable components forming the system
– Design and test
– Chips: the raw materials
Scalability: fewest, replicatable components
Modularity: finding reusable components
The architecture levels & options

The apps
– Data-types: “signals”, “packets”, video, voice, RF, etc.
– Environment: parallelism, power, power, power, speed, … cost
The material: clock, transistors…
Performance… it’s about parallelism
– Program & programming environment
– Network e.g. WWW and Grid
– Clusters
– Storage, cluster, and network interconnect
– Multiprocessors
– Processor and special processing
– Multi-threading and multiple processor per chip
– Instruction Level Parallelism vs
– Vector processors
Sony Playstation export limits
A problem X-Box
would like to have,
… but have solved.
Will the PC prevail for the next
decade as a/the dominant platform?
…
or 2nd to smart, mobile devices?



Moore’s Law: increases performance;
Bell’s Corollary reduces prices for new classes
PC server clusters aka Beowulf with low cost OS
kills proprietary switches, smPs, and DSMs
Home entertainment & control …
– Very large disks (1TB by 2005) to “store everything”
– Screens to enhance use
Mobile devices, etc. dominate WWW >2003!
Voice and video become the important apps!
C = Commercial; C’ = Consumer
Where’s the action? Problems?


Constraints from the application: Speech, video,
mobility, RF, GPS, security…
Moore’s Law, networking, Interconnects
Scalability and high performance processing
– Building them: Clusters vs DSM
– Structure: where’s the processing, memory, and switches (disk and ip/tcp processing)
– Micros: getting the most from the nodes
Not ISAs: Change can delay Moore’s Law effect … and wipe out software investment!
Please, please, just interpret my object code!
System (on a chip) alternatives… apps drivers
– Data-types (e.g. video, voice, RF) performance, portability/power, and cost
COTS: Anything at the system
structure level to use?



How are the system components e.g. computers, etc. going to be interconnected?
What are the components? Linux
What is the programming model?
– Is a plane, CCC, tank, fleet, ship, etc. an Internet?
– Beowulfs… the next COTS
– What happened to Ada? Visual Basic? Java?
Computing SNAP built entirely from PCs
[Figure: a space, time (bandwidth), & generation scalable environment. Labeled elements: portables and mobile nets; a wide-area global network; wide & local area networks for terminal, PC, workstation, & servers; person servers (PCs); TC=TV+PC in the home (CATV or ATM or satellite); legacy mainframe & minicomputer servers & terminals; centralized & departmental uni- & mP servers (UNIX & NT); scalable computers built from PCs.]
Five Scalabilities
Size scalable -- designed from a few components,
with no bottlenecks
Generation scaling -- no rewrite/recompile or user
effort to run across generations of an architecture
Reliability scaling… choose any level
Geographic scaling -- compute anywhere (e.g.
multiple sites or in situ workstation sites)
Problem x machine scalability -- ability of an
algorithm or program to exist at a range of sizes
that run efficiently on a given, scalable computer.
Problem x machine space => run time: problem scale,
machine scale (#p), and run time together imply speedup
and efficiency.
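A minimal sketch of the speedup and efficiency that the problem x machine space implies. The run times are hypothetical illustrations, not measurements from the talk, and the function names are chosen only for this example.

```python
def speedup(t_one: float, t_p: float) -> float:
    """Speedup of a run on p processors relative to the one-processor run."""
    return t_one / t_p


def efficiency(t_one: float, t_p: float, p: int) -> float:
    """Fraction of ideal (linear) speedup achieved on p processors."""
    return speedup(t_one, t_p) / p


# Hypothetical run times (seconds) for one fixed problem size at several machine scales.
run_times = {1: 100.0, 4: 28.0, 16: 9.0}

for p, t in run_times.items():
    s = speedup(run_times[1], t)
    e = efficiency(run_times[1], t, p)
    print(f"#p={p:3d}  time={t:6.1f}s  speedup={s:5.2f}  efficiency={e:4.2f}")
```

Repeating the table for several problem sizes gives the full problem x machine picture: a program is scalable on a machine when efficiency stays high across the range.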
Why I gave up on large smPs & DSMs






Economics: Perf/Cost is lower…unless a commodity
Economics: Longer design time & life. Complex.
=> Poorer tech tracking & end of life performance.
Economics: Higher, uncompetitive costs for processor &
switching. Sole sourcing of the complete system.
DSMs … NUMA! Latency matters.
Compiler, run-time, O/S locate the programs anyway.
Aren’t scalable. Reliability requires clusters. Start there.
They aren’t needed for most apps… hence, a small
market unless one can find a way to lock in a user base.
Important as in the case of IBM Token Rings vs Ethernet.
What is the basic structure of
these scalable systems?
Overall
– Disk connection, especially with respect to Fibre Channel
– SAN, especially with fast WANs & LANs
GB plumbing from the baroque:
evolving from 2 dance-hall SMP &
Storage model
Mp — S — Pc
:
|
:
|—————— S.fc — Ms
|
:
|— S.Cluster
|— S.WAN —
vs.
MpPcMs — S.Lan/Cluster/Wan —
:
SNAP Architecture
ISTORE Hardware Vision


System-on-a-chip enables computer, memory, without significantly increasing size of disk
5-7 year target:
MicroDrive: 1.7” x 1.4” x 0.2”
– 1999: 340 MB, 5400 RPM, 5 MB/s, 15 ms seek
– 2006: 9 GB, 50 MB/s ? (1.6X/yr capacity, 1.4X/yr BW)
Integrated IRAM processor
– 2x height
– 16 Mbytes; 1.6 Gflops; 6.4 Gops
Connected via crossbar switch growing like Moore’s law
10,000+ nodes in one rack!
100/board = 1 TB; 0.16 Tflops
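The 2006 figures follow from compounding the 1999 MicroDrive numbers at the quoted annual rates. A small sketch of that arithmetic, using only values from the slide; the helper name `project` is made up for the example.

```python
def project(start_value: float, annual_factor: float, years: int) -> float:
    """Compound a starting value by a fixed factor per year."""
    return start_value * annual_factor ** years


YEARS = 2006 - 1999  # 7 years of growth

capacity_mb = project(340, 1.6, YEARS)   # 340 MB growing 1.6X/yr
bandwidth_mb_s = project(5, 1.4, YEARS)  # 5 MB/s growing 1.4X/yr

# Board-level totals: 100 nodes, each with a ~9 GB disk and a 1.6 Gflops processor.
NODES_PER_BOARD = 100
board_capacity_gb = NODES_PER_BOARD * 9
board_tflops = NODES_PER_BOARD * 1.6 / 1000

print(f"2006 capacity  ~ {capacity_mb / 1000:.1f} GB")
print(f"2006 bandwidth ~ {bandwidth_mb_s:.1f} MB/s")
print(f"board: {board_capacity_gb} GB (the slide rounds to 1 TB), {board_tflops:.2f} Tflops")
```

The projection lands close to the slide's 9 GB and 50 MB/s, so the two growth rates and the 2006 targets are mutually consistent.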
The Disk Farm? or
a System On a Card?
14"
The 500GB disc card
An array of discs
Can be used as
100 discs
1 striped disc
50 FT discs
....etc
LOTS of accesses/second
of bandwidth
A few disks are replaced by 10s of Gbytes
of RAM and a processor to run Apps!!
The Promise of SAN/VIA/Infiniband
http://www.ViArch.org/

Yesterday:
– 10 MBps (100 Mbps Ethernet)
– ~20 MBps tcp/ip saturates 2 cpus
– round-trip latency ~250 µs
Now
– Wires are 10x faster: Myrinet, Gbps Ethernet, ServerNet,…
– Fast user-level communication
  - tcp/ip ~ 100 MBps 10% cpu
  - round-trip latency is 15 µs
1.6 Gbps demoed on a WAN
[Chart: time (µs) to send 1 KB, split into transmit, receiver cpu, and sender cpu, for 100 Mbps, Gbps, and SAN; vertical axis 0 to 250 µs]
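A rough sketch of the wire (transmit) part of the time to send 1 KB. It assumes 1 KB means 1024 bytes and ignores framing overhead. The sender and receiver CPU costs that the chart plots separately are not modeled here.

```python
def wire_time_us(payload_bytes: int, link_bits_per_sec: float) -> float:
    """Microseconds needed just to clock the payload onto the wire."""
    return payload_bytes * 8 / link_bits_per_sec * 1e6


PAYLOAD_BYTES = 1024  # assumed meaning of "1 KB"

for name, rate in [("100 Mbps Ethernet", 100e6), ("Gbps Ethernet", 1e9)]:
    print(f"{name:18s} {wire_time_us(PAYLOAD_BYTES, rate):6.2f} us on the wire")
```

At 100 Mbps the wire accounts for roughly a third of the ~250 µs round-trip latency quoted above, which is why faster wires help only if the per-message CPU cost is cut as well.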
Top500 taxonomy… everything is
a cluster aka multicomputer

Clusters are the ONLY scalable structure
– Cluster: n, inter-connected computer nodes operating as one system. Nodes: uni- or SMP. Processor types: scalar or vector.
MPP = miscellaneous, not massive (>1000), SIMD or something we couldn’t name
Cluster types. Implied message passing.
– Constellations = clusters of >=16 P, SMP
– Commodity clusters of uni or <=4 Ps, SMP
– DSM: NUMA (and COMA) SMPs and constellations
– DMA clusters (direct memory access) vs msg. pass
– Uni- and SMP vector clusters: Vector Clusters and Vector Constellations
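The two node-size thresholds in the taxonomy can be written down directly. This is only a sketch of the rule as stated on the slide; the function name and return strings are invented for illustration, and the slide does not say how nodes with 5 to 15 processors are classed.

```python
def classify_cluster(processors_per_node: int) -> str:
    """Apply the slide's node-size thresholds to a cluster of identical nodes."""
    if processors_per_node < 1:
        raise ValueError("a node needs at least one processor")
    if processors_per_node >= 16:
        return "constellation"          # clusters of >=16 P SMP nodes
    if processors_per_node <= 4:
        return "commodity cluster"      # uni or <=4 P SMP nodes
    return "not covered by the slide"   # 5-15 P nodes are left unspecified


for p in (1, 4, 8, 16, 64):
    print(f"{p:3d} processors/node -> {classify_cluster(p)}")
```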
Courtesy of Dr. Thomas Sterling, Caltech
The Virtuous Economic Cycle
drives the PC industry… & Beowulf
Attracts
suppliers
Greater
availability
@ lower cost
Standards
Attracts users
Creates apps,
tools, training,
BEOWULF-CLASS SYSTEMS
Cluster of PCs
– Intel x86
– DEC Alpha
– Mac Power PC
Pure M2COTS
Unix-like O/S with source
– Linux, BSD, Solaris
Message passing programming model
– PVM, MPI, BSP, homebrew remedies
Single user environments
Large science and engineering applications
Lessons from Beowulf










An experiment in parallel computing systems
Established vision- low cost high end computing
Demonstrated effectiveness of PC clusters for
some (not all) classes of applications
Provided networking software
Provided cluster management tools
Conveyed findings to broad community
Tutorials and the book
Provided design standard to rally community!
Standards beget: books, trained people, software
… virtuous cycle that allowed apps to form
Industry begins to form beyond a research project
Courtesy, Thomas Sterling, Caltech.
Designs at chip level…
any COTS options?



Substantially more programmability
versus factory compilation
As systems move onto chips and
chip sets become part of larger
systems, Electronic Design must
move from RTL to algorithms.
Verification and design of “GigaScale
systems” will be the challenge.
The Productivity Gap
[Chart: logic transistors per chip (complexity growth rate, 58%/yr compound) versus transistors per staff-month (productivity growth rate, 21%/yr compound), both on log scales, with process generations 2.5µ, .35µ, and .10µ marked]
Source: SEMATECH
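A quick sketch of how fast the gap opens when the two compound rates from the chart are run forward. The starting point is normalized to 1 and the horizon is arbitrary; only the 58% and 21% rates come from the slide.

```python
COMPLEXITY_GROWTH = 0.58    # logic transistors per chip, per year
PRODUCTIVITY_GROWTH = 0.21  # transistors per staff-month, per year


def gap_factor(years: int) -> float:
    """Growth in staff-months needed per chip after the given number of years."""
    return ((1 + COMPLEXITY_GROWTH) / (1 + PRODUCTIVITY_GROWTH)) ** years


for years in (1, 5, 10):
    print(f"after {years:2d} years, design effort per chip grows about {gap_factor(years):.1f}x")
```

Under these rates the effort per chip grows by roughly 30% a year, which is the argument for moving design to a higher level of abstraction.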
What Is GigaScale?

Extremely large gate counts
– Chips & chip sets
– Systems & multiple-systems
High complexity
– Complex data manipulation
– Complex dataflow
Intense pressure for correct, 1st time
– TTM, cost of failure, etc. impacts ability to have a silicon startup
Multiple languages and abstraction levels
– Design, verification, and software
EDA Evolution: chips to systems
[Figure: EDA generations, courtesy of Forte Design Systems. Eras: 1975 (Calma & CV), 1985 (Daisy, Mentor), 1995 (Synopsys & Cadence), 2005 (e.g. Forte). Designer roles: IC Designer, Chip Architect, ASIC Designer, System Architect, SOC Designer, GigaScale Architect. Abstraction levels: physical design, gates (10K gates), RTL (1M gates), GigaScale. Verification: simulation; plus testbench automation, emulation, formal verification; plus hierarchical verification.]
If system-on-a-chip is the
answer, what is the problem?

Small, high volume products
– Phones, PDAs,
– Toys & games (to sell batteries)
– Cars
– Home appliances
– TV & video
Communication infrastructure
Plain old computers… and portables
Embeddable computers of all types where performance and/or power are the major constraints.
SOC Alternatives… not
including C/C++ CAD Tools
– The blank sheet of paper: FPGA
– Auto design of a processor: Tensilica
– Standardized, committee designed components*, cells, and custom IP
– Standard components including more application specific processors*, IP add-ons plus custom
– One chip does it all: SMOP
*Processors, Memory, Communication & Memory Links,

Tradeoffs and Reuse Model
[Figure: design-style tradeoff chart. Measures: MOPS/mW, cost, programmability, and time to develop/iterate a new application, each running low to high. Design styles: structured custom, RTL flow, FPGA, FPGA & ASIP, DSP, GPP. Reuse levels: system application, application implementation, architecture, microarchitecture, platform exportation, silicon process. Interface labels: IUnknown, IOleObject, IDataObject, IPersistentStorage, IOleDocument, IFoo, IBar, IPGood, IOleBad.]
System-on-a-chip alternatives
FPGA
Compile
a system
Systolic |
array
Pc + ??
Sea of uncommitted
gate arrays
Unique processor for
every app
Many pipelined or parallel
processors + custom
Dynamic reconfiguration
of the entire chip…
Spec. purpose processors
cores + custom
Gen. Purpose cores.
Specialized by I/O, etc.
Pc+DSP |
VLIW
Pc & Mp.
ASICS
Universal Multiprocessor array,
Micro
programmable I/O
Xilinx, Altera
Tensilica
TI
IBM, Intel,
Lucent
Cradle, Intel
IXP 1200
Xilinx 10Mg, 500Mt, .12 mic
Tensilica Approach: Compiled
Processor Plus Development Tools
ALU
Pipe
I/O
Cache Timer
Register File
MMU
Tailored,
HDL uP
core
Describe the
processor
attributes from
a browser-like
interface
Courtesy of Tensilica, Inc.
http://www.tensilica.com
Using the
processor
generator,
create...
Customized
Compiler,
Assembler,
Linker,
Debugger,
Simulator
Standard
cell library
targeted to
the silicon
process
Richard Newton, UC/Berkeley
EEMBC Networking Benchmark
• Benchmarks: OSPF, Route Lookup, Packet Flow
• Xtensa with no optimization comparable to 64b RISCs
• Xtensa with optimization comparable to high-end desktop CPUs
• Xtensa has outstanding efficiency (performance per cycle, per watt, per mm2)
• Xtensa optimizations: custom instructions for route lookup and packet flow
[Chart: Netmark performance relative to IDT 32334/100 (MIPS32), and Netmark performance/MHz. Processors compared: IDT 32334/100, IDT79RC32364/100, NEC V832-143, IDT79RC32V334-150, Toshiba TMPR3927F-GHM2000/133, Toshiba TMPR3927F-GH189/133, NEC VR5432-167, Xtensa/200, IDT79RC64575IDtc/250, IDT79RC64575Algor/250, NEC VR5000, AMD ElanSC520/133, AMD K6-2/450, AMD K6-2E/400, AMD K6-2E+/500, AMD K6-IIIE+/550, Xtensa Optimized/200]
Colors: Blue-Xtensa, Green-Desktop x86s, Maroon-64b RISCs, Orange-32b RISCs
EEMBC Consumer Benchmark
• Benchmarks: JPEG, Grey-scale filter, Color-space conversion
• Xtensa with no optimization comparable to 64b RISCs
• Xtensa with optimization beats all processors by 6x (no JPEG optimization)
• Xtensa has exceptional efficiency (performance per cycle, per watt, per mm2)
• Xtensa optimizations:custom instructions for filters, RGB-YIQ, RGB-CMYK
[Chart: Consumermark performance relative to ST20C2/50, and Consumermark performance/MHz. Processors compared: ST20C2/50, National Geode GX1/200, NEC V832/143, AMD ElanSC520/133, NEC VR5432/167, Xtensa/200, NEC VR5000/250, AMD K6-2E/400, AMD K6-2E+/500, AMD K6-III+/550, Xtensa Optimized/200]
Colors: Blue-Xtensa, Green-Desktop x86s, Maroon-64b RISCs, Orange-32b RISCs
Free 32 bit processor core
Complex SOC architecture
Synopsys via Richard Newton,
UC/B
UMS Architecture
[Figure: block diagram of four processor groups, each a 4-wide grid of blocks labeled M, S, and P, with MEMORY blocks, DRAM CONTROL, NVMEM, CLOCKS/DEBUG, and a ring of PROG I/O blocks]
Memory bandwidth scales with processing
Scalable processing, software, I/O
Each app runs on its own pool of processors
Enables durable, portable intellectual property
Cradle UMS Design Goals
• Minimize design time for applications
• Efficient programming model
• High reusability accelerates derivative development
• Cost/Performance
• Replace ASICs, FPGAs, ASSPs, and DSPs
• Low power for battery powered appliances
• Flexibility
• Cost effective solution to address fragmenting markets
• Faster return on R&D investments
Universal Microsystem (UMS)
[Figure: Quad 1, Quad 2, Quad 3, … Quad “n” on a Global Bus, with I/O Quads, SDRAM CONTROL, and a PLA Ring]
Each Quad has 4 RISCs, 8 DSPs, and Memory
Unique I/O subsystem keeps interfaces soft
The Universal Micro System (UMS)
An off the shelf “Platform” for Product Line Solutions
[Figure: Universal Micro System block diagram. Groups of M, S, and P blocks with MEMORY, joined by a Global Bus and an I/O Bus to DRAM CONTROL, CLOCKS, NVMEM, and PROG I/O blocks. Inset labeled Multi Stream Processor: PE, DSE, DSE, MEM, MEM, shared program memory, shared data memory, shared DMA, DRAM.]
Callouts:
– Intelligent I/O Subsystem (change interfaces without changing chips)
– 750 MIPS/GFLOPS
– Superior digital signal processing (single clock FP-MAC)
– Local memory that scales with additional processors
– Scalable real time functions in software using small fast processors (QUAD)
VPN Enterprise Gateway
[Figure: two gateway configurations, each with 10/100 E-MAC and PHY ports and a T1/E1/J1 interface]
Single-quad configuration
– Quad 1: TCP/IP, IP Layer 3, IKE, 3DES IPSec
•Single quad; Two 10/100 Ethernet ports at wire speed; one T1/E1/J1 interface
•Handles 250 end users and 100 routes
•Does key handling for IPSec
•Delivers 50Mbps of 3DES
Five-quad configuration
– Quad 1: Firewall/Tunneling, Layer-2 switching, IP stack
– Quad 2: 3DES IPSec, IP Layer 3 Routing, Operating System
– Quad 3: 3DES IPSec, VoIP, LAN Telephony
– Quads 4 & 5: VoIP, LAN Telephony
•Five quads; Two 10/100 Ethernet ports at wire speed; one T1/E1/J1 interface
•Handles 250 end users and 100 routes
•Does key handling for IPSec
•Delivers 100Mbps of 3DES
•Firewall
•IP Telephony
•O/S for user interactions
UMS Application Performance
Application                        MSPs    Comments
MPEG Video Decode                  4       720x480, 9Mbits/sec
MPEG Video Encode                  6       720x480, 15Mbits/sec
                                   10-16   32²/128² Search Area
AC3 Audio Decode                   1
Modems                             0.5     V90
                                   3       G.Lite
                                   4       ADSL
Ethernet Router (Level 3 + QOS)    0.5     Per 100Mb channel
                                   4       Per Gigabit channel
Encryption                         1       3DES 15Mb/s
                                   1       MD5 425Mb/s
3D geom, lite, render              4       1.6M Polygons/sec
DV Encode/Decode                   8       Camcorder
• Architecture permits scalable software
• Supports two Gigabit Ethernets at wire speed; four fast
Ethernets; four T-1s, USB, PCI, 1394, etc.
• MSP is a logical unit of one PE and two DSEs
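A sketch of how MSP requirements turn into a quad count. It assumes that the "4 RISCs, 8 DSPs" in each quad are the PEs and DSEs that make up an MSP, which gives four MSPs per quad. The slides do not state that mapping outright, and the application mix below is purely illustrative.

```python
import math

RISCS_PER_QUAD = 4  # assumed to be the PEs
DSPS_PER_QUAD = 8   # assumed to be the DSEs

# One MSP = one PE + two DSEs, so a quad holds this many complete MSPs.
MSPS_PER_QUAD = min(RISCS_PER_QUAD // 1, DSPS_PER_QUAD // 2)


def quads_needed(msp_demands: list[float]) -> int:
    """Smallest number of quads whose MSPs cover the summed demand."""
    return math.ceil(sum(msp_demands) / MSPS_PER_QUAD)


# Illustrative mix: a 4-MSP video task, a 1-MSP audio task, a 0.5-MSP network task.
mix = [4, 1, 0.5]
print(f"{sum(mix)} MSPs -> {quads_needed(mix)} quads ({MSPS_PER_QUAD} MSPs per quad)")
```

Fractional MSP figures suggest that small tasks can share a processor, so the sum is taken before rounding up.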
Cradle: Universal Microsystem
trading Verilog & hardware for C/C++
UMS : VLSI = microprocessor : special systems
Software : Hardware
– Single part for all apps
– App spec’d @ run time using FPGA & ROM
– 5 quad mPs at 3 Gflops/quad = 15 Gflops
– Single shared memory space, caches
– Programmable periphery including: 1 GB/s; 2.5 Gips; PCI, 100 baseT, firewire
– $4 per flops; 150 mW/Gflops
Silicon Landscape 200x

Increasing cost of fabrication and mask
– $7M for high-end ASSP chip design
– Over $650K for masks alone and rising
– SOC/ASIC companies require $7-10M business guarantee
Physical effects (parasitics, reliability issues, power management) are more significant design issues
– These must now be considered explicitly at the circuit level
Design complexity and “context complexity” is sufficiently high that design verification is a major limitation on time-to-market
Fewer design starts, higher-design volume… implies more programmable platforms
Richard Newton, UC/Berkeley
The End
[Figure: three design styles side by side. General-Purpose Computing: application(s) over an Instruction Set Architecture (360 SPARC 3000) over physical implementation. Platform-Based Design: application(s) over a Platform (microarchitecture & software) over “physical implementation”. Synthesizable RTL: application(s) in Verilog, VHDL, … over ASIC or FPGA physical implementation.]
The Energy-Flexibility Gap
[Chart: energy efficiency in MOPS/mW (or MIPS/mW), log scale from 0.1 to 1000, versus flexibility (coverage). Dedicated HW (MUD): 100-200 MOPS/mW. Reconfigurable processor/logic (Pleiades): 10-50 MOPS/mW. ASIPs and DSPs (1 V DSP): 3 MOPS/mW. Embedded µProcessors (LPArm): 0.5-2 MIPS/mW.]
Source: Prof. Jan Rabaey, UC Berkeley
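MOPS/mW is easier to compare once it is turned into energy per operation. A small conversion sketch: the efficiency figures are the ones quoted for the chart (range midpoints are not taken; the low end of each range is used), and the function name is invented for the example.

```python
def picojoules_per_op(mops_per_mw: float) -> float:
    """Energy per operation in pJ.

    1 MOPS/mW = 1e6 ops/s per 1e-3 W = 1e9 ops/J, i.e. 1000 pJ per op.
    """
    return 1000.0 / mops_per_mw


# Low end of each range quoted on the chart.
points = {
    "Dedicated HW": 100.0,
    "Reconfigurable processor/logic": 10.0,
    "1 V DSP": 3.0,
    "Embedded processor": 0.5,
}

for name, efficiency in points.items():
    print(f"{name:32s} {efficiency:6.1f} MOPS/mW -> {picojoules_per_op(efficiency):7.1f} pJ/op")
```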
Approaches to Reuse

SOC as the Assembly of Components?
– Alberto Sangiovanni-Vincentelli
SOC as a Programmable Platform?
– Kurt Keutzer
Component-Based Programmable
Platform Approach



Application-Specific Programmable Platforms (ASPP)
These platforms will be highly-programmable
They will implement highly-concurrent functionality
– Intermediate language that exposes programmability of all aspects of the microarchitecture
– Integrate using programmable approach to on-chip communication
– Assemble components from parameterized library
Assembly language for Processor
Richard Newton, UC/Berkeley
Compact Synthesized Processor, Including
Software Development Environment



Use virtually any standard cell library with commercial memory generators
Base implementation is less than 25K gates (~1.0 mm² in 0.25µ CMOS)
Power dissipation in 0.25µ standard cell is less than 0.5 mW/MHz
[Figure: the core drawn to scale on a typical $10 IC (3-6% of 60 mm²)]
Courtesy of Tensilica, Inc.
http://www.tensilica.com
Challenges of Programmability
for Consumer Applications




Power, Power, Power….
Performance, Performance, Performance…
Cost
Can we develop approaches to programming
silicon and its integration, along with the tools
and methodologies to support them, that will
allow us to approach the power and
performance of a dedicated solution sufficiently
closely (~2-4x?) that a programmable platform
is the preferred choice?
Richard Newton, UC/Berkeley
Bottom Line: Programmable Platforms

The challenge is finding the right programmer’s model and associated family of micro-architectures
– Address a wide-enough range of applications efficiently (performance, power, etc.)
Successful platform developers must “own” the software development environment and associated kernel-level run-time environment
– “It’s all about concurrency”
If you could develop a very efficient and reliable re-programmable logic technology (comparable to ASIC densities), you would eventually own the silicon industry!
Richard Newton, UC/Berkeley
Richard Newton, UC/Berkeley
A Component-Based Approach…

Simple Universal Protocol (SUP)
– Unix pipes (character streams only)
– TCP/IP (only one type of packet; limited options)
– RS232, PCI
– Streaming…
Single-Owner Protocol (SOP)
– Visual Basic
– Unibus, Massbus, Sbus,
Simple Interfaces, Complex Application (SIC)
– When “the spec is much simpler than the code*” you aren’t tempted to rewrite it
– SQL, SAP, etc.
Implies “natural” boundaries to partition IP and successful components will be aligned with those boundaries.
(*suggested by Butler Lampson)
The Key Elements of the SOC
Applications
RF MEMS optical ASIP
What is the
Platform aka
Programmer
model?
Richard Newton, UC/Berkeley
Power as the Driver
(Power is still, almost always, the driver!)
[Chart: MIPS/mW on a log scale from 0.001 to 1000 for Pentium (0.35µ), StrongARM (0.35µ), TI DSP (0.25µ), and dedicated hardware (1µ); the spread is four orders of magnitude]
Source: R. Brodersen, UC Berkeley
Back end
Computer ops/sec x word length / $
[Chart: computer ops/sec x word length / $ on a log scale from 1.E-06 to 1.E+09, versus year from 1880 to 2000. Three trend segments are labeled: doubles every 7.5, doubles every 2.3, doubles every 1.0. Fitted curves as printed: 1.565^(t-1959.4) and y = 1E-248 e^(0.2918x).]
Microprocessor performance
[Chart: microprocessor performance on a log scale from Kilo to 100 G, 1970 to 2010, showing Peak Advertised Performance (PAP), Real Applied Performance (RAP) at 41% growth, and Moore’s Law]
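Growth rates and doubling times on these charts are two views of the same number. A short conversion sketch, assuming the rates are annual and compound; the function names are chosen for the example.

```python
import math


def doubling_time_years(annual_growth: float) -> float:
    """Years to double at a compound annual growth rate (0.41 means 41%)."""
    return math.log(2) / math.log(1 + annual_growth)


def annual_growth(doubling_years: float) -> float:
    """Compound annual growth rate that doubles in the given number of years."""
    return 2 ** (1 / doubling_years) - 1


print(f"41% growth doubles about every {doubling_time_years(0.41):.1f} years")
for years in (7.5, 2.3, 1.0):
    print(f"doubling every {years} years is about {annual_growth(years) * 100:.0f}% per year")
```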
GigaScale Evolution




In 1999, less than 3% of engineers were
doing designs with more than 10M
transistors per chip. (Dataquest)
By early 2002, 0.1 micron will allow
600M transistors per chip.
(Dataquest)
In 2001, 49% of engineers are @ .18
micron, 5% @ .10 micron. (EE Times)
54% plan to be @ .10 micron in
2003. (EET)
Challenges of GigaScale

GigaScale systems are too big to simulate
– Hierarchical verification
– Distributed verification
Requires a higher level of abstraction
– Higher abstraction needed for verification
  - High level modeling
  - Transaction-based verification
– Higher abstraction needed for design
  - High-level synthesis required for productivity breakthrough