FORGE: A Framework for Optimization of Distributed Embedded

A Model-Based Approach to System
Specification for Distributed Real-time
and Embedded Systems *
Radu Cornea1, Shivajit Mohapatra1, Nikil Dutt1, Rajesh Gupta2,
Ingolf Krueger2, Alex Nicolau1, Doug Schmidt3, Sandeep
Shukla4, Nalini Venkatasubramanian1
1 UC Irvine
2 UC San Diego
3 Vanderbilt
4 Virginia Tech
* This work is supported in part by NSF award ACI-0204028
Outline

FORGE: DRE System Design

System Specification

Compiler-Runtime Interaction

Case Studies
• Automatic Target Recognition
• Quality-driven Video Streaming
FORGE: RTAS MDES 2003
Motivation

New portable devices
• Substantial capabilities
• DRE applications
    • Video streaming
    • Avionics
    • Biomedical
    • Remote sensing
    • Space exploration
    • Command and control
    • Autonomous systems
• Networked, heterogeneous
• Performance, power, and reliability constraints

DRE development process
• Mostly manually driven
• Evolving end-to-end software architectures for complex systems
FORGE: DRE System Design

Systematic method for DRE application development
• Integrated specification of system requirements
    • Behavior, performance/QoS/power/RT constraints
    • Specification of heterogeneous platforms across levels
    • Formalized through description languages
• Flexible and optimized middleware solutions and operating systems
    • Adaptive and reflective middleware
    • Integration with application code
• Compiler/runtime tools for hardware abstraction layers
    • Particularly critical for power/performance management
Application Development Model
[Figure: application development model. Diagram labels:
  Application Functional Specification (including timing, power and other constraints)
  Middleware service objects
  ADL capturing the platform architecture
  RDL describing resource constraints
  Compiler
  Heterogeneous computing platform: DSP, µ-proc, DB, XScale]
Middleware: Adaptation and Reflection

Adaptive
• Statically
    • Reduce memory
    • Minimize dependencies
• Dynamically
    • Optimize response

Reflective
• Self-adjust capabilities
    • QoS
• Reallocate resources/change strategies for desired QoS

Need integration with lower levels!
Hardware Abstraction Layer Specification

Processor ADLs
• Traditionally used for synthesizing compilers and simulators
• Abstractions for micro-architectural resources
    • Structure
    • Behavior
• E.g., EXPRESSION

[Figure: EXPRESSION ADL
  Behavior Specification: Operation Specification, Instruction Description, Operation Mappings
  Structure Specification: Arch. Components, Pipelining/Data Routing, Memory Subsystem]

Need System-Level Extensions!
• Interfaces with OS and middleware
Resource and Architecture Description

Extend Processor ADL to complex systems
• Heterogeneous hardware/abstraction
• Communication structure
• System constraints/requirements
    • Power, reliability, ...
    • Deadlines, periodicity, ...
• Constructs for system composition
• Couple with middleware abstraction

Use Extended ADL
• Generate service specifications
• Check feasibility of meeting constraints
• Code mapping, given constraints/tradeoffs
Interactions Between Levels

OS/hardware -> Middleware
• Computing power
• Available memory
• Specialized functional units (coprocessors)
• Power budget (efficient discharge profile)
=> Middleware can then make better decisions

Middleware -> OS/hardware
• Part of the global view made available to OS
• Better profiling (time, power)
• Future schedule changes
• Relative task importance
=> Hardware can then make better decisions
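
The two-way exchange on this slide can be pictured as a pair of message types, one per direction. The sketch below is only an illustration in Python: every class and field name (`NodeStatus`, `MiddlewareHints`, `mips`, and so on) is hypothetical and not part of FORGE, the fields simply mirror the bullet items, and the values are illustrative.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class NodeStatus:
    """OS/hardware -> middleware report (hypothetical field names)."""
    node: str
    mips: int                     # computing power
    free_memory_mb: int           # available memory
    coprocessors: List[str] = field(default_factory=list)  # specialized units
    power_budget_wh: float = 0.0  # remaining power budget


@dataclass
class MiddlewareHints:
    """Middleware -> OS/hardware hints (hypothetical field names)."""
    upcoming_tasks: List[str] = field(default_factory=list)        # future schedule changes
    task_importance: Dict[str, int] = field(default_factory=dict)  # relative task importance


def most_important_task(hints: MiddlewareHints) -> str:
    """Return the task the OS could favor when resources are tight."""
    return max(hints.task_importance, key=hints.task_importance.get)


# Illustrative values only.
status = NodeStatus(node="MOBILE1", mips=400, free_memory_mb=32, power_budget_wh=50.0)
hints = MiddlewareHints(upcoming_tasks=["FFT"], task_importance={"TARG": 2, "FFT": 1})
```

Keeping the two directions as separate types makes it explicit which level owns which piece of information.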
Case Study 1: ATR

ATR: Automatic Target Recognition
• 4 main tasks per frame
    • Mainly independent
        • Can be parallelized
        • Pipelined version
• Distributed world
    • Hundreds of nodes (drones)
    • Geographically distributed
    • Heterogeneous network
    • Various capabilities
        • Wireless
        • Sensors (IR, visible)
        • Motion capable
• Complex decisions at runtime

[Figure: Application pipeline: Target Detection -> FFT -> Filter/IFFT -> Compute Distance]
ATR System Specification

Application
• Task decomposition
    • Main tasks: TARG, FFT, IFFT, DIST
• System-level constraints
    • Task characterization (requirements)

Resource description
• Nodes
    • Capabilities (processing power, memory)
    • Timing and power profiles (per task)
• Network layout
ATR Specification Example
Specification (application and node description):

(Application ATR
    (Contains TARG FFT IFFT DIST)
    (Paths
        (TARG FFT)
        (FFT IFFT)
        (IFFT DIST)
    )
    (Deadline 16ms)
    ...
    (Task TARG
        (FloatingPoint NO)
        (Scalable YES)
        (Memory 1Mb)
        ...
    )
    (Task FFT
        (FloatingPoint YES)
        (Scalable YES)
        (Memory 1Mb)
        ...
    )
    ...
)

(Node MAIN1
    (Processor 800MIPS 800MIPS)
    (Memory 1000Mb)
    (DPMCapable NO)
    (DVSCapable NO)
    (PowerSource
        (Line NOLIMIT)
    )
    (TaskProfile
        ...
    )
)

(Node MOBILE1
    (Processor 400MIPS)
    (Memory 32Mb)
    (DPMCapable NO)
    (DVSCapable YES)
    (DVSModes
        (m0 600Mhz 2.2V)
        (m1 500Mhz 1.8V)
        (m2 400Mhz 1.5V)
        (m3 300Mhz 1.1V)
    )
    (PowerSource
        (Battery 50Wh)
        (SolarCell 5Wh
            (Period 24h)
            (Duration 9h)
        )
    )
    (TaskProfile
        (Task TARG
            (m0 0.66ms 7W)
            (m1 0.79ms 4W)
            (m2 0.99ms 2W)
            (m3 1.32ms 0.9W)
        )
        (Task FFT
            (m0 0.29ms 6W)
            (m1 0.34ms 3.5W)
            (m2 0.43ms 1.8W)
            (m3 0.57ms 0.75W)
        )
        ...
    )
    (Sensors
        (Video
            (Spectra Visible)
        )
    )
    ...
)
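
To see how such a task profile might be used, the sketch below computes the energy of one task execution in each DVS mode (time × power, so ms × W gives mJ) and picks the lowest-energy mode that still fits a time budget. It is a minimal Python illustration, not FORGE code: the function names and the per-task time budget are assumptions, only the TARG and FFT figures listed for MOBILE1 are used, and the mode labels follow the DVSModes list m0 to m3.

```python
from typing import Optional

# (execution time in ms, power in W) per DVS mode on node MOBILE1,
# taken from the TaskProfile entries of the specification example.
TASK_PROFILE = {
    "TARG": {"m0": (0.66, 7.0), "m1": (0.79, 4.0), "m2": (0.99, 2.0), "m3": (1.32, 0.9)},
    "FFT": {"m0": (0.29, 6.0), "m1": (0.34, 3.5), "m2": (0.43, 1.8), "m3": (0.57, 0.75)},
}


def energy_mj(time_ms: float, power_w: float) -> float:
    """Energy of one execution: milliseconds times watts gives millijoules."""
    return time_ms * power_w


def pick_mode(task: str, budget_ms: float) -> Optional[str]:
    """Lowest-energy mode whose execution time fits the (assumed) budget."""
    candidates = [
        (energy_mj(t, p), mode)
        for mode, (t, p) in TASK_PROFILE[task].items()
        if t <= budget_ms
    ]
    if not candidates:
        return None  # no mode is fast enough
    return min(candidates)[1]


print(pick_mode("TARG", 1.0))  # prints m2
```

With these figures the slowest mode always costs the least energy per execution, so the budget is what decides how far the voltage can be scaled down.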
ATR Decision Tradeoffs

Reflective middleware: global view
• Decides on migrating components to free resources on constrained nodes
    • Reshape network topology
    • Requires info from architecture (OS) level
• Receives periodic status updates from lower level

OS/Hardware level: local view
• Handles operating modes, DVS
• Interacts with higher levels for control decisions
ATR Scenarios

Component migration between nodes
• Middleware decision (decrease load)
    • Information about hardware helps
        • E.g., integer/FP tasks vs. node FP capabilities

Network activation
• Target identified by a node
• Middleware wakes up nodes in the region
    • Sends commands to OS/hardware level (global info)
• OS/hardware decides on new power state
    • Low QoS: power saving; high QoS: full power
    • Dependent on target proximity
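
The migration scenario can be sketched as a filter over candidate nodes: a task that needs floating point is only moved to a node that supports it and has enough free memory, and the least-loaded eligible node is preferred. This is a hypothetical Python illustration; the `has_fpu`, `free_memory_mb`, and `load` fields and the ranking rule are assumptions, not the FORGE middleware policy.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Task:
    name: str
    needs_fp: bool
    memory_mb: int


@dataclass
class Node:
    name: str
    has_fpu: bool        # assumed capability flag
    free_memory_mb: int  # assumed to come from periodic status updates
    load: float          # assumed utilization in [0, 1]


def migration_targets(task: Task, nodes: List[Node], source: str) -> List[Node]:
    """Nodes other than `source` that can host `task`, least loaded first."""
    eligible = [
        n for n in nodes
        if n.name != source
        and (n.has_fpu or not task.needs_fp)
        and n.free_memory_mb >= task.memory_mb
    ]
    return sorted(eligible, key=lambda n: n.load)


# Illustrative nodes; names and numbers are made up.
fft = Task("FFT", needs_fp=True, memory_mb=1)
nodes = [
    Node("A", has_fpu=False, free_memory_mb=16, load=0.1),
    Node("B", has_fpu=True, free_memory_mb=8, load=0.7),
    Node("C", has_fpu=True, free_memory_mb=8, load=0.3),
]
targets = [n.name for n in migration_targets(fft, nodes, source="B")]
print(targets)  # prints ['C']
```

Node A is the least loaded but is skipped for the floating-point task, which is the kind of decision that needs hardware information at the middleware level.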
Case Study 2: Quality Driven Video Streaming

MPEG4 streams to mobile handhelds (iPAQs)

Problem: high energy requirements
• Short lifetime, user experience greatly affected
    • Video stream cannot be viewed to completion
    • Partly affected by interference with other users

Goal: trade off quality vs. power for the best user experience
• Maximize QoS while ensuring full service
• Main objective is not power minimization!

Problem: human perception of video quality
    • Subjective, different perception on small devices
Middleware/Hardware Integration


Aggregate techniques at different levels, for cumulative joint power gains

Middleware: coarse grain
• Controls quality of multimedia content and network transmission
    • Proxy-based admission control + video transcoding
    • Intelligent network streaming

Hardware/OS: fine tuning
• Architectural adaptation
    • Low-level performance knobs
        • Optimized cache configuration
        • Dynamic voltage scaling
    • Compiler techniques at device
Integration: feedback based QoS control
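
One way to picture feedback-based QoS control is a loop step that, given the battery energy left and the playback time still to go, picks the highest quality level the device can sustain to the end of the stream. The Python sketch below is an assumption-laden illustration, not the authors' controller: the function name, the per-level power figures, and the convention that level 1 is the highest quality are all hypothetical.

```python
from typing import Optional, Sequence


def choose_quality(remaining_energy_j: float,
                   remaining_time_s: float,
                   power_by_level_w: Sequence[float]) -> Optional[int]:
    """Return the best (lowest-numbered) quality level that can be sustained.

    power_by_level_w[i] is the assumed average device power at level i + 1,
    ordered from highest quality to lowest. Returns None if even the lowest
    quality would drain the battery before the stream ends.
    """
    for level, power_w in enumerate(power_by_level_w, start=1):
        if power_w * remaining_time_s <= remaining_energy_j:
            return level
    return None


# Hypothetical numbers: 1000 J left, 500 s of video remaining.
level = choose_quality(1000.0, 500.0, [3.0, 2.5, 2.0, 1.5])
print(level)  # prints 3
```

Re-running this step whenever the lower level reports a new energy estimate is what would close the loop: the goal is to finish the stream at the best affordable quality, not to minimize power.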
Experimental Results: CPU + Memory

Setup:
• Wattch/SimpleScalar
• Berkeley MPEG tools
• 8 video qualities
• Video content
    • Slow “news” to fast “action” type content
• 30-point cache search space
    • Size: 4-64
    • Associativity: 1-32

[Figure: Search Space for Cache Optimization]

Cache/DVS Best Operating Points + Savings

Quality   Cache   Cache   Best   Best      Initial   Best     Savings
(News)    Size    Assoc   MHz    Voltage   Energy    Energy
Q1        8       8       100    1         1.30      0.77     47.54%
Q2        8       8       100    1         1.10      0.65     47.83%
Q3        8       8       100    1         0.96      0.57     48.07%
Q4        32      2       66     0.9       0.55      0.26     57.67%
Q5        32      2       66     0.9       0.49      0.23     57.85%
Q6        32      2       33     0.9       0.43      0.21     58.07%
Q7        8       8       33     0.9       0.29      0.14     57.30%
Q8        8       8       33     0.9       0.24      0.12     57.58%

Cache Results
• 10-15% energy savings

Cache + DVS
• Up to 60% savings
Experimental Results: Network Card & System

Network card:
• Burst transmission
• “Sleep” between transmissions
• Other users in the network modeled as noise
• Savings: 70%

[Figure: Optimizing Burst Time]

Integrated framework
• Utility factor improvement by a few quality levels

Conclusion: improved user experience from integrated approach

[Figure: Integrated QoS Based Simulation]
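
The effect of burst transmission with sleep periods on the network card can be illustrated with a simple duty-cycle model: the card is active for a fraction of each period and asleep for the rest. The Python sketch below uses hypothetical power figures and hypothetical function names; it is not the model behind the savings reported on this slide.

```python
def average_power_w(active_w: float, sleep_w: float, duty_cycle: float) -> float:
    """Average power of a card that is active for `duty_cycle` of the time."""
    if not 0.0 <= duty_cycle <= 1.0:
        raise ValueError("duty_cycle must be between 0 and 1")
    return duty_cycle * active_w + (1.0 - duty_cycle) * sleep_w


def savings_fraction(active_w: float, sleep_w: float, duty_cycle: float) -> float:
    """Fraction of energy saved relative to keeping the card always active."""
    return 1.0 - average_power_w(active_w, sleep_w, duty_cycle) / active_w


# Hypothetical card: 1.0 W active, 0.1 W asleep, bursting half of the time.
saving = savings_fraction(1.0, 0.1, 0.5)
print(f"{saving:.2f}")  # prints 0.45
```

Shorter bursts lower the duty cycle and raise the savings, which is why the burst time is a tuning knob.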
Summary

FORGE:
• Brings together advances in
    • Architecture/hardware abstraction modeling
    • Software architecture
    • Distributed/real-time systems
• Provides capabilities for DRE development
    • Conceptualization of design knowledge
    • Exploitation of design knowledge across development phases for DRE systems
• Cross-optimization across disjoint abstractions
    • Current focus on hardware and middleware abstractions
    • Particularly critical for meeting power and QoS in DRE applications using mobile devices