What is Many Core SoC? - Embedded Systems and CoDesign Lab


HOW DO WE HANDLE MANY CORE SYSTEM ON CHIP?
Rabi Mahapatra
Department of Computer Science & Engineering
Texas A&M University
What is Many Core SoC?
Lots of Cores on a Board
A Single Chip
A Many Core SoC has hundreds of IP cores on a single chip.
A Multi Core SoC has a handful of IP cores on a single chip.
Embedded Systems and Codesign Laboratory
Why do we need many core SoC?
• Performance demand is ever increasing
• Frequency is not increasing
• More transistors available on a single die
– 50 billion transistors soon!
[Figure: Core Frequency (GHz), 0 to 4, by year, 1990 to 2010]
[Figure: Transistors on Chip (Millions), 0.001 to 100000, by year, 1960 to 2020]
Issues in Many Core SoC
• Power and Thermal Management
• Testing
• Operating System Design
• Modeling and Benchmarks
• Programming Model
• Memory Bandwidth
• Fault Tolerance
• Virtualization Support
Key Research Challenges
• Power & Energy
– Low Power Core Design
– Low Power Communication
– System Level Power Management
– Thermal Management
• Testing & Reliability
– Test Infrastructure
– Reliability Aware Design
• Development
– Simulation Platform
– Test and Debug
– Software Support
– Benchmarks
• Security
– Secure SoC Infrastructure
– HW Security Protocols
Challenge #1
Power & Energy
Power Issue in many core SoC
Various Levels of Power Management
• Application
• OS Level
• System Level
• Circuit Level
[Figure: power breakdown with the labels Core Power, Communication, Power Supply, and Clock; one share is marked 35%]
Communication accounts for significant power
Many Core SoC with Networks on Chip (NoC)
• As the number of cores per chip scales, on-chip buses will no longer meet performance needs
• Route packets, not wires
• NoC Components
– routers
– core-network interfaces (CNI)
– links
[Figure: An Example NoC with Mesh Topology, showing IP cores, core-network interfaces (CNI), and routers (R)]
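To make "route packets, not wires" concrete, here is a minimal sketch of how a packet could hop between routers in a mesh. The slides do not name a routing algorithm; dimension-ordered (XY) routing is assumed here only because it is a common, simple choice for meshes. The names `xy_route` and `hop_count` are illustrative.

```python
from typing import List, Tuple

Coord = Tuple[int, int]  # (x, y) position of a router in the mesh


def xy_route(src: Coord, dst: Coord) -> List[Coord]:
    """Return every router a packet visits from src to dst, inclusive.

    Assumed policy: dimension-ordered (XY) routing, which moves along
    the X dimension first and along the Y dimension second.
    """
    x, y = src
    path = [(x, y)]
    step_x = 1 if dst[0] > x else -1
    while x != dst[0]:          # travel along X until the column matches
        x += step_x
        path.append((x, y))
    step_y = 1 if dst[1] > y else -1
    while y != dst[1]:          # then travel along Y until the row matches
        y += step_y
        path.append((x, y))
    return path


def hop_count(src: Coord, dst: Coord) -> int:
    """Number of router-to-router links crossed on an XY route."""
    return abs(src[0] - dst[0]) + abs(src[1] - dst[1])
```

A packet from router (0, 0) to router (2, 1) crosses three links, and the route length always equals the Manhattan distance between the two routers.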
Network on Chip for Low Power Communication
What Does NoC Have to Offer?
• High Bandwidth
• Less Sensitive to Wire Delay
• Versatile Infrastructure
• Scalable Communication
Challenges
• Router Architecture?
• Power Management?
• Routing Algorithm
• Quality of Service
Low Power Router Architecture
• Routers are the primary component in a NoC
• Buffers consume the most power in a router
Efficient management of buffers can save power
[Figure: routers (R) with CPU 1, CPU 2, MEM 1, and MEM 2]
Up to 79% Power Consumption
How to Reduce Buffer Power?
• Reduce Active Buffers: Dynamic Buffer Management
– Buffers are organized in blocks
– Flows are monitored
– Excess blocks are powered down based on traffic flow
• Use Energy Efficient Storage Encoding
– SRAMs can be an efficient buffer solution
– Storing 0 ≠ Storing 1
– Encode the bits in a packet to save energy
20% energy savings can be achieved!
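The two ideas above can be sketched in a few lines. This is an illustration under stated assumptions, not the router design from the slides: the block sizing rule, the one-block spare margin, and the choice of an inversion code with a one-bit flag are all assumptions, as are the names `active_blocks_needed`, `encode_word`, and `decode_word`. Which bit value is cheaper to store depends on the SRAM cell, so it is a parameter here.

```python
import math
from typing import Tuple


def active_blocks_needed(occupied_slots: int, slots_per_block: int,
                         total_blocks: int, spare_blocks: int = 1) -> int:
    """Blocks to keep powered for the monitored traffic; the rest can sleep.

    Assumed rule: enough blocks for the occupied slots plus a small spare
    margin, never fewer than one block and never more than exist.
    """
    needed = math.ceil(occupied_slots / slots_per_block) + spare_blocks
    return max(1, min(total_blocks, needed))


def encode_word(word: int, width: int, cheap_bit: int = 0) -> Tuple[int, int]:
    """Return (stored_word, invert_flag) so most stored bits equal cheap_bit.

    If fewer than half of the bits already have the cheap value, the word
    is stored inverted and the flag records that.
    """
    mask = (1 << width) - 1
    word &= mask
    ones = bin(word).count("1")
    cheap_count = ones if cheap_bit == 1 else width - ones
    if cheap_count < width - cheap_count:
        return (~word) & mask, 1
    return word, 0


def decode_word(stored: int, invert_flag: int, width: int) -> int:
    """Undo encode_word using the stored flag."""
    mask = (1 << width) - 1
    return (~stored) & mask if invert_flag else stored
```

The flag costs one extra stored bit per word, so the encoding only pays off when the two bit values differ enough in storage energy.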
System Level Dynamic Power Management
Dynamic Peak Power Budget Satisfaction
• Local power consumption is computed
• Neighbors' power is shared
• A non-deterministic algorithm is used to calculate the available budget
25% Performance Improvement
[Figure: steps Calculate Local Power, Exchange Information, Estimate Share, and Update Power Budget]
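The steps listed above (compute local power, exchange it with neighbors, estimate a share, update the budget) can be sketched as follows. The slides mention a non-deterministic algorithm without giving it; this sketch swaps in a simple deterministic estimate so the data flow is visible. The function name `update_budgets`, the averaging rule, and the even split of headroom are all assumptions.

```python
from typing import Dict, List


def update_budgets(power: Dict[str, float],
                   neighbors: Dict[str, List[str]],
                   peak_budget: float) -> Dict[str, float]:
    """One round of local budget updates for every tile.

    Each tile sees only its own power and its neighbors' power. It
    extrapolates chip power from that neighborhood average, then takes
    an equal share of the estimated headroom under the peak budget.
    """
    n_tiles = len(power)
    budgets = {}
    for tile, local in power.items():
        # Exchange information: gather the neighbors' reported power.
        sample = [local] + [power[nb] for nb in neighbors[tile]]
        # Estimate share: extrapolate the neighborhood to the whole chip.
        estimated_total = n_tiles * sum(sample) / len(sample)
        headroom = peak_budget - estimated_total
        # Update power budget: current draw plus a share of the headroom.
        budgets[tile] = max(0.0, local + headroom / n_tiles)
    return budgets
```

Because every tile works from a local estimate, this sketch does not by itself guarantee that the budgets sum to at most the peak when power is unevenly distributed; a real scheme needs an extra safeguard for that.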
System Level Dynamic Power Management
Intelligent Power Budget Distribution
• Ant System inspired power budget distribution approach
• Power ants are sent from the surplus region
• Beggar ants are sent from the starving region
• Power is shared from the surplus region to the starving region
[Figure: sharing path from Source to Sink]
20% Improvement in Power Budget Utilization
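A rough sketch of the sharing idea: starving tiles look outward for surplus tiles and pull budget from them. An ant system would guide this search with probabilistic, pheromone-weighted moves; here a plain breadth-first search stands in for the beggar ants' walk, which is a simplification and not the method from the slides. The name `share_power` and the integer budget units are assumptions.

```python
from collections import deque
from typing import Dict, List, Tuple


def share_power(slack: Dict[str, int], neighbors: Dict[str, List[str]]
                ) -> Tuple[Dict[str, int], List[Tuple[str, str, int]]]:
    """Move budget from surplus tiles (slack > 0) to starving tiles (slack < 0).

    Returns the updated slack and a list of (source, sink, amount) transfers.
    """
    slack = dict(slack)
    transfers = []
    for beggar in sorted(t for t, s in slack.items() if s < 0):
        seen = {beggar}
        queue = deque([beggar])
        # Search outward, nearest tiles first, until the deficit is covered.
        while queue and slack[beggar] < 0:
            tile = queue.popleft()
            if slack[tile] > 0:
                amount = min(slack[tile], -slack[beggar])
                slack[tile] -= amount
                slack[beggar] += amount
                transfers.append((tile, beggar, amount))
            for nb in neighbors[tile]:
                if nb not in seen:
                    seen.add(nb)
                    queue.append(nb)
    return slack, transfers
```

On a four-tile line A-B-C-D with slack -3, 1, 0, 5, tile A first takes 1 unit from B and then 2 units from D, leaving D with a surplus of 3.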
Reliability Aware Low Power Scheduling
• Pfair is an optimal scheduling algorithm for multiprocessor task scheduling
• Integrated DVFS (dynamic voltage and frequency scaling) into the Pfair scheduling algorithm
• Feedback controller based allocation of additional job copies to manage reliability
[Figure: Task set, DVS enabled Pfair Scheduler, Feedback control, Observed Reliability, Additional job copies]
• Reduced failure rate compared to Pfair
• Up to 50% savings in energy
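The feedback step can be pictured as a small control rule: compare observed reliability against a target and adjust the number of additional job copies. The slides do not describe the controller, so the step-by-one rule below, the function name `adjust_copies`, and the `max_copies` and `margin` parameters are all assumptions made for illustration.

```python
def adjust_copies(copies: int, observed: float, target: float,
                  max_copies: int = 3, margin: float = 0.01) -> int:
    """Return the number of additional job copies for the next interval.

    Assumed rule: add one copy when observed reliability is below the
    target, drop one when it is above the target by more than `margin`
    (extra copies cost energy), and otherwise keep the current count.
    """
    if observed < target:
        return min(max_copies, copies + 1)
    if observed > target + margin:
        return max(0, copies - 1)
    return copies
```

The margin keeps the controller from oscillating between two copy counts when the observed reliability sits close to the target.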
Low Power Real Time Scheduling in Hardware
• Implemented the Pfair scheduling algorithm (for MPSoC) in hardware
• Transformed floating point computations to the integer domain
[Figure: context switch steps (Interrupt, Save Context, Load Task, Transfer Control, Task Running) and a hardware Pfair scheduler with processors P1 to P8]
Exponential savings in energy consumption compared to a software based scheduler
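As an illustration of moving Pfair arithmetic into the integer domain: Pfair divides a task of weight e/p into unit subtasks, and subtask i has pseudo-release floor((i-1)/(e/p)) and pseudo-deadline ceil(i/(e/p)). Both can be computed with integer multiplication and division only, which suits hardware. The slides do not show the actual transformation used, so this is one possible form; the name `pfair_window` is illustrative.

```python
from typing import Tuple


def pfair_window(i: int, e: int, p: int) -> Tuple[int, int]:
    """Pseudo-release and pseudo-deadline of subtask i (1-based).

    The task needs e quanta of execution every p quanta (weight e/p).
    Only integer operations are used, so no floating point unit is needed.
    """
    if i < 1 or not 0 < e <= p:
        raise ValueError("need i >= 1 and 0 < e <= p")
    release = ((i - 1) * p) // e       # floor((i - 1) * p / e)
    deadline = (i * p + e - 1) // e    # ceil(i * p / e)
    return release, deadline
```

For a task with e = 3 and p = 5, the three subtasks of the first period get the windows [0, 2), [1, 4), and [3, 5).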
Temperature Aware Scheduling
Temperature aware energy management (TA-DVS) at run time using novel slack reclamation
[Figure: Task set, EDF Scheduler, Slack, Temperature Estimation, Temperature constrained slack, Feedback control, MF-DVS, TA-DVS]
Power Management: Summary
What We Addressed
• A Low Power Router Buffer Architecture
• Peak Power Management Heuristic
• Intelligent Dynamic Power Budget Distribution
• Reliability and Temperature aware task scheduling
Other Research Challenges
• Novel flow control to reduce power consumption further
• Context/Application aware power management
• System wide power policy management
Challenge #2
Test and Reliability
Many Core SoC Reliability Overview
Many Core SoCs present both challenges and advantages in achieving reliable computing
• Challenges
– Shrinking feature sizes
– Power density and temperature
– Reduced operational lifetime: the “bathtub” curve is getting shallower and narrower
• Advantages
– NoC can be used as a test delivery platform
– Redundancy
– Graceful degradation
– Adaptable architectures
Enhanced testing is necessary to meet application reliability requirements
How to Improve Reliability? - Test Infrastructure IP (TI-IP)
• Many Core SoC will contain:
– Processing, Memory, and I/O IP
– Infrastructure IP for Debug, Yield, and Testing (TI-IP)
• TI-IP provides reliability and availability
– No longer necessary to take the chip off-line to test
– Allows for fast, high-coverage testing
– Already being deployed in commercial automotive SoC (Freescale MPC564xL)
[Figure: IP cores, some with BIST, and Test I-IP blocks attached to the NoC through CNIs]
Distributed TI-IP – The Solution for Many Core SoC
TI-IP is composed of:
• Test Controller
• Test Vector Memory
• A single TI-IP becomes limited by scalability for Many Core
– Many TI-IPs distributed across the SoC
• Each SoC tile can have a TI-IP and test vectors
– Test vectors can be optimally divided across the SoC based on NoC topology
– This solves the problem of testing deeply embedded cores
[Figure: 5x5 2D-Torus Example with Test Vector Sets 1 to 5]
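To show how NoC topology can drive the split, the sketch below assigns every core in an n x n 2D torus to the TI-IP that is the fewest hops away. This nearest-source rule is a simple stand-in and is not claimed to be the optimal division the slides refer to. The names `torus_distance` and `assign_cores`, and the TI-IP placements in the usage note, are hypothetical.

```python
from typing import Dict, List, Tuple

Coord = Tuple[int, int]


def torus_distance(a: Coord, b: Coord, n: int) -> int:
    """Minimum hop count between two tiles of an n x n 2D torus.

    Each dimension wraps around, so a packet may take the shorter way.
    """
    dx = abs(a[0] - b[0])
    dy = abs(a[1] - b[1])
    return min(dx, n - dx) + min(dy, n - dy)


def assign_cores(n: int, test_ips: List[Coord]) -> Dict[Coord, Coord]:
    """Map each tile to its nearest TI-IP; ties go to the smaller coordinate."""
    assignment = {}
    for x in range(n):
        for y in range(n):
            core = (x, y)
            assignment[core] = min(
                test_ips, key=lambda t: (torus_distance(core, t, n), t))
    return assignment
```

In a 5x5 torus with TI-IPs at (0, 0) and (2, 2), the corner tile (4, 4) is only two hops from (0, 0) because of the wraparound links, so it is tested from there and not from the center.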
Test Vector Storage: The Benefits
• Experimental Analysis:
– Observe time to test SoC with and without distributed TI-IP
– Measured over a variety of SoC sizes for scalability
– Test time independent of SoC size for distributed TI-IP
[Figure: test latency (µs, 0 to 350) versus 2D-torus size (5x5, 8x8, 10x10, 13x13) for Distributed TI-IP and Single-Source TI-IP; annotations mark 85% and 94% reductions in test time]
SoC Test and Reliability: Summary
What We Addressed
• TI-IP: On-Line Testing
• Test Vector Storage
• Test Scheduling
Open Problems
• Life Time Reliability?
• Diagnosis & Recovery?
• Fault Resilience?
Challenge #3
Security
Security in Many Core SoC
[Figure: SoC Security, with the labels Attack Modeling, Threat Response Mechanism, Security IP Design, HW Security Solution, and Protocol Development]
Why Is Security Important in SoC?
• Weaknesses in SoC
– Heterogeneous system: IP from different vendors
– Tradeoff between security and performance
– Decreased visibility and control
– Improved attack techniques
• Sample Attacks
– Denial of Service (DoS)
– Bandwidth reduction
– Draining or Sleep Deprivation
– Extraction of secret info
– Hijacking of programmable components
State of the Art in SoC Security
• Divide the system into secure and unsecure areas
– Secure area (ASIC)
– Unsecure area (FPGA etc.)
• Secure Bus Design
– Extend conventional arbitration using Trojan Detection and Access Control
• Security IP Based Solution
– Transaction Monitoring
– Sandboxing
IP Based SoC Security Approach
• Central Security Core
• Dedicated communication channel for security protocol
• Secure agents at each Core
• Distributed Response protocol is necessary
Immune System Inspired Attack Response
SoC Security: Summary
State of the Art
• Bio Inspired Modeling of Threat Response Mechanism
• Security Monitor Core
• Core Level Anomaly Detector
Open Problems
• Attack Model Development
• Security Infrastructure Development
• Protocol Design
• Design of Security Core/IP
Challenge #4
Development Platform
Need for a Development Platform
• Major road block to the advent of many core SoC
• Need a better platform simulator and debugger
• Rapid development cycle
• Suitable benchmarks to effectively evaluate many core SoC
[Figure: Design Space Exploration, Performance Evaluation, Test and Debug, Benchmarking and Standardization]
The NoCBench Platform
• Full System SoC Simulation Platform
• Built using SystemC
• Generic and Extensible Simulation Engine
[Figure: NoCBench flow with the labels Design, Manual, Configuration XML, Representation Graph, Core Library, Component Library, Network Generation, Benchmarks, SystemC Simulation, and Report, followed by an OK? decision (NO / YES) and Done]
SoC Model in NoCBench
NoCBench System Model
Components of NoCBench
• System Kernel
– Provides scheduling
– Task management
• Core Library
– Processor cores
– Memory core
– Other IP
• Network On Chip
– NoC backbone with routers and CNIs
NoCBench Configuration and Results
Configuration Parameters
• Network on Chip
– Router Details
– Topology
– Injection Limit
– Power Model
• System
– Scheduler
– Task Configurations
– Core Types
Reported Metrics
• Network
– Throughput
– Latency
– Power
• Application
– Execution Time
– Cycles
– Power
Virtual Platform Using Carbon Design Tools
VPNoC
• Scalable architecture
• Supports 16 to 100 cores
• Run applications using traces
• Suitable for data analysis accelerators
• Can support performance, power, protocol, and security analysis
Challenges
• Slow when using a large number of nodes
• A fast model is essential
What About Benchmarking?
• Benchmarking is a Challenge
– Application Benchmark
– Communication Benchmark
– Large optimization problems
– Social media data analysis
– Many other large data analytic problems
• What kind of setup?
– What will future “many core SoC” look like?
Development Platform: Summary
What We Have
• VPNoC
• SoC simulation
• Micro Kernel
• Simple Scheduler
• Basic core library
• Limited IPC
What Do We Need
• More core support
• Thread Library Support
• Application Benchmarks
Conclusion
Lots to Explore!
• Many Core SoC is the future
• Challenges
– Performance & Power
– Test, Reliability & Security
– Benchmarking
• More Information
– http://codesign.cs.tamu.edu/index.php/research/socand-noc
– http://codesign.cs.tamu.edu/index.php/research/realtime-systems
Thank You!