An Introduction to the Prescience Lab
Peter A. Dinda
Prescience Lab
Department of Computer Science
Northwestern University
http://plab.cs.northwestern.edu
Outline
• Motivations
• Questions
• Projects
• Conclusions
2
How do we deliver arbitrary amounts of
computational power to ordinary people?
Assumptions:
Shared computing environments,
Limited utility of reservations
3
Distributed and
Parallel Computing
How do we deliver arbitrary amounts of
computational power to ordinary people?
Interactive
Applications
4
• How do we build adaptive distributed
interactive applications effectively?
• How does the demand for resources in these
applications vary over time?
• How does the supply of resources vary over
time?
• How can we use the adaptation mechanisms
exposed by an application to match its
resource demand with resource supply?
5
How do we build adaptive distributed
interactive applications effectively?
• Applications
– Virtualized Audio
• Immersive audio
– Interactive visualization of massive datasets
• Frameworks
– Virtuoso
• Grid computing using virtual machines
– Dv
6
Virtualized Audio (with Dong Lu, Curtis Barrett)
Distributed
Computational
Resources
Microphones, Headphones
GPS, head-tracking
Wireless connectivity
Limited local computation
Other Users or
Audio Sources
7
Virtualized Audio: Interactive Auralization
Listener
Performer
Room
Virtual Listening Room
Virtual Performer
Auralization
Sound
Field 2
HRTF
Headphones
Listener at
Virtual Location
• Auralization injects performer into listener’s space
• Auralization adapts as listener moves or room changes
– Recomputes impulse responses
8
Architecture of Interactive Auralization
[Figure: architecture diagram. Labeled components: User-driven Immersive Audio Client (Client, Binaural Audio Output); Current Spatial Model and source/sink positions; Scalable Real-time Simulation Server (Filter generation, six Parallel FD Simulation blocks); Scalable Audio Filtering Service; Streaming Audio Service; Master filtering server; one Filtering server per source (Source 1 through Source n); Mixing servers.]
Impulse response filters characterize user’s space
9
Adaptation in Virtualized Audio
• Numerous mechanisms
– Sampling rate, impulse response length,
algorithm for computing impulse response, filter
approximations, server selection, …
• Can vary computational load over many
orders of magnitude
• Compute/communicate ratio is huge
• How do we use these mechanisms to
achieve consistent real-time response?
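A back-of-the-envelope sketch of how two of these mechanisms (sampling rate and impulse response length) move the computational load. It assumes direct-form FIR filtering, and the operating points are illustrative values that do not come from the talk.

```python
def fir_macs_per_second(sample_rate_hz, impulse_response_s):
    """Multiply-adds per second to filter one source with a direct-form FIR.

    Assumption (not from the talk): time-domain convolution, so each output
    sample costs one multiply-add per tap of the impulse response.
    """
    taps = sample_rate_hz * impulse_response_s
    return sample_rate_hz * taps


# Hypothetical operating points, chosen only for illustration.
low = fir_macs_per_second(8_000, 0.1)    # low-rate audio, short response
high = fir_macs_per_second(96_000, 3.0)  # high-rate audio, long response

print(f"low:   {low:.3g} multiply-adds/s per source")
print(f"high:  {high:.3g} multiply-adds/s per source")
print(f"ratio: {high / low:.0f}x")
```

Under these assumed settings the two operating points differ by a factor of a few thousand per source, before counting algorithm choice, filter approximations, or the number of sources.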
10
Virtuoso (with Renato Figueiredo, Jose Fortes,
Ananth Sundararaj, Ashish Gupta)
• Make Grids like PCs
– User gets raw machine(s)
– Machine appears to be on his network
– User can install what he needs as owner
• Lower level of abstraction
– Classic virtual machine monitors
– Virtual networking
• Middleware support
– Instantiation, migration of machines
– Connectivity to remote files, machines
– Resource control
11
Classic Virtual Machine: VMware
12
Why Virtual Networking?
• A running machine is suddenly plugged
into your network. What happens?
– Does it get an IP address?
– Is it a routable address?
– Does the firewall let its traffic through?
– To any port?
Virtual machine hostile environment
13
A Simple Layer 2 Virtual Network
[Figure: diagram labels: Client, Server, SSH, VM monitor, Remote VM, Virtual NIC, Physical NIC, Friendly Local Network, Physical NIC, Hostile Remote Network.]
14
A Simple Layer 2 Virtual Network
[Figure: diagram labels: Client, Bridge, Server, SSH Tunnel, Bridge, VM monitor, Remote VM, Virtual NIC, Physical NIC, Friendly Local Network, Physical NIC, Hostile Remote Network.]
16
Bootstrapping the Virtual Network
• Star topology always possible
– TCP session from client must have been possible
• Better topology may be possible
– Depends on security at each site
• Topology may change
– Virtual machines can migrate
• Bootstrap to higher layers
– Virtual filesystems
17
How does the demand for resources vary over time?
How does the supply of resources vary over time?
• Resource demand in interactive applications
– Instrumented games, preceding applications, …
– Not much is known here
• Resource supply in distributed environments
– URGIS
• Grid Information based on the relational data model
– GridG
– Clairvoyance
• Online resource prediction for hosts and networks
– Tsunami
• Wavelet-based approaches to information dissemination
– Diffusion
• Zero-cost information dissemination
18
URGIS (with Beth Plale, Dong Lu)
• Unified Relational Grid Information Services
– GIS based on the relational data model
– Leverage results from database community
– Northwestern work: MySQL, Oracle RDBMSes
• Compositional queries
– Application-specific information aggregation
– Like decision support queries (TPC-H)
• Support for information of varying dynamicity
– Varying update rates and freshness requirements
– Seamless inclusion of streaming data
• A common data model and query language
– Powerful, high level, declarative, easy-to-optimize
19
Compositional Queries
• “Find four different hosts with a total
memory between 512 MB and 1 GB”
• “Find all available sensors and predictors
that provide information about the
network path between a and b”
• “Tell me when the load on any of these
four hosts diverges from the average by
more than 50%”
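A minimal sketch of the predicate behind the third query, written in Python rather than in the relational query language the talk proposes. The function name, the host names, and the reading of "diverges by more than 50%" as a fraction of the group average are assumptions.

```python
from statistics import mean


def diverging_hosts(loads, threshold=0.5):
    """Return hosts whose load differs from the group average by more than
    `threshold`, expressed as a fraction of that average.

    `loads` maps a host name to its current load.
    """
    avg = mean(loads.values())
    if avg == 0:
        return []  # relative divergence is undefined when the average is zero
    return sorted(
        host for host, load in loads.items()
        if abs(load - avg) / avg > threshold
    )


# Hypothetical load readings for four hosts.
print(diverging_hosts({"a": 1.0, "b": 1.1, "c": 0.9, "d": 3.0}))
```

A monitoring query of this kind would re-evaluate the predicate whenever a new load measurement arrives.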
20
Example
select
  host1.name, host2.name, host3.name, host4.name,
  hd1.mem+hd2.mem+hd3.mem+hd4.mem as TotalMem
from
  hosts as host1, hostdata as hd1,
  hosts as host2, hostdata as hd2,
  hosts as host3, hostdata as hd3,
  hosts as host4, hostdata as hd4
where
  host1.ip=hd1.ip and host2.ip=hd2.ip and
  host3.ip=hd3.ip and host4.ip=hd4.ip and
  hd1.mem+hd2.mem+hd3.mem+hd4.mem>=512 and
  hd1.mem+hd2.mem+hd3.mem+hd4.mem<=1024 and
  host1.ip!=host2.ip and host1.ip!=host3.ip and
  host1.ip!=host4.ip and host2.ip!=host3.ip and
  host2.ip!=host4.ip and host3.ip!=host4.ip
order by
  TotalMem desc
limit
  10
21
Time-bounded, nondeterministic queries
select nondeterministically
  host1.name, host2.name, host3.name, host4.name,
  hd1.mem+hd2.mem+hd3.mem+hd4.mem as TotalMem
from
  hosts as host1, hostdata as hd1,
  hosts as host2, hostdata as hd2,
  hosts as host3, hostdata as hd3,
  hosts as host4, hostdata as hd4
where
  host1.ip=hd1.ip and host2.ip=hd2.ip and
  host3.ip=hd3.ip and host4.ip=hd4.ip and
  hd1.mem+hd2.mem+hd3.mem+hd4.mem>=512 and
  hd1.mem+hd2.mem+hd3.mem+hd4.mem<=1024 and
  host1.ip!=host2.ip and host1.ip!=host3.ip and
  host1.ip!=host4.ip and host2.ip!=host3.ip and
  host2.ip!=host4.ip and host3.ip!=host4.ip
order by
  TotalMem desc
limit
  10
inlessthan
  5 seconds
usingheuristic
  prefer_depth_first
22
Implementation of Non-deterministic,
Time-bounded Queries
• Random number associated with each row in
each table (or insert)
• Query is rewritten to incorporate random
ranges on the input tables
• Range lengths chosen to meet deadline
– This is not trivial and we don’t have this translation yet
• Heuristics not yet incorporated
• Hopefully RDBMS-independent
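A minimal sketch of the random-range rewriting idea, using Python's built-in sqlite3 module and a two-host version of the query over a single hostdata table. The column name rnd, the function names, and the generated test data are assumptions for illustration. Choosing range lengths to meet a deadline is not attempted here.

```python
import random
import sqlite3


def build_db(num_hosts, seed=0):
    """Create an in-memory table with a random number stored in each row."""
    rng = random.Random(seed)
    db = sqlite3.connect(":memory:")
    db.execute(
        "create table hostdata (ip text primary key, mem integer, rnd real)"
    )
    db.executemany(
        "insert into hostdata values (?, ?, ?)",
        [(f"10.0.{i // 256}.{i % 256}",
          rng.choice([128, 256, 512, 1024]),   # made-up memory sizes in MB
          rng.random())                        # per-row random number in [0, 1)
         for i in range(num_hosts)],
    )
    return db


def find_host_pairs(db, selection_probability, limit=10, rng=random):
    """Two-host query restricted to one random range per input table."""
    p = selection_probability
    # Each range [lo, lo + p) keeps about a fraction p of that input's rows.
    lo1 = rng.uniform(0.0, 1.0 - p)
    lo2 = rng.uniform(0.0, 1.0 - p)
    sql = """
        select hd1.ip, hd2.ip, hd1.mem + hd2.mem as TotalMem
        from hostdata as hd1, hostdata as hd2
        where hd1.mem + hd2.mem >= 512 and hd1.mem + hd2.mem <= 1024
          and hd1.ip != hd2.ip
          and hd1.rnd >= ? and hd1.rnd < ?
          and hd2.rnd >= ? and hd2.rnd < ?
        order by TotalMem desc
        limit ?
    """
    return db.execute(sql, (lo1, lo1 + p, lo2, lo2 + p, limit)).fetchall()


db = build_db(500)
for row in find_host_pairs(db, 0.1):
    print(row)
```

With selection_probability=1.0 the ranges cover every row, so the rewritten query reduces to the original deterministic one. Smaller probabilities shrink the join inputs, trading the number of results for query time.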
23
RGIS1 Non-deterministic Query Performance
[Figure: number of results and query time versus number of hosts in the join (1 to 5), for 100,000 hosts. Query: find n hosts with a total memory of 1 GB.]
24
RGIS1 Non-deterministic Query Performance
[Figure: number of results and query time versus selection probability, for 100,000 hosts. Query: find 2 hosts with a total memory of 1 GB.]
25
Clairvoyance (with Jason Skicewicz, Yi Qiao)
• Measure, characterize, predict, and disseminate information
about dynamic resource supply
• Resource signals
– Discrete-time signals strongly correlated with resource supply
– Currently univariate, working on multivariate
– Currently:
• Host load
• Windows performance counters (using WatchTower)
• Network flow bandwidth and latency (using Remos)
• Any text-based source
• Online predictive modeling
– Simple models (MEAN, BESTMEAN, BESTMEDIAN, LAST, …)
– Box/Jenkins models (AR, MA, ARMA, ARIMA, …)
– Fractional ARIMAs
– Nonlinear modeling (TARs, wavelet decompositions)
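A minimal sketch of what the simplest models in this list might look like as one-step-ahead predictors over a resource signal. This is not the toolkit's code. The function names are hypothetical, and the window selection in best_window is an assumed reading of BESTMEAN.

```python
from statistics import mean


def predict_last(history):
    """LAST: predict that the next value equals the most recent value."""
    return history[-1]


def predict_mean(history):
    """MEAN: predict the average of everything seen so far."""
    return mean(history)


def predict_window_mean(history, window):
    """Predict the average of the most recent `window` values."""
    return mean(history[-window:])


def best_window(history, max_window):
    """Pick the window (1..max_window) with the lowest one-step-ahead squared
    error over the history. Assumed reading of BESTMEAN, not the RPS code."""
    def squared_error(window):
        return sum(
            (history[t] - predict_window_mean(history[:t], window)) ** 2
            for t in range(max_window, len(history))
        )
    return min(range(1, max_window + 1), key=squared_error)


# Synthetic signal (not real host load): alternates between 1.0 and 3.0.
signal = [1.0, 3.0] * 10
window = best_window(signal, 4)
print(predict_last(signal), predict_mean(signal),
      predict_window_mean(signal, window))
```

On a signal that alternates like this one, LAST is always wrong by the full swing, while a two-sample window averages the swing away, which is why the choice of model matters even among the simple ones.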
26
RPS Toolkit
• Extensible toolkit for implementing resource
signal prediction systems [CMU-CS-99-138]
• Growing: RTA, RTSA, Wavelets, GUI, etc
• Easy “buy-in” for users
• C++ and sockets (no threads)
• Prebuilt prediction components
• Libraries (sensors, time series, communication)
27
Measurement and Prediction
28
Multiscale Network Prediction
• Large, recent study of predictability
• Hundreds of NLANR and other traces
– Mostly WANs
• Different resolutions
– Binning and low-pass via wavelets
• Sweet Spot
– Predictability often maximized at particular
resolution
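A minimal sketch of how predictability could be compared across resolutions. It covers binning only, not the wavelet low-pass approach, and it uses the LAST predictor with a synthetic trace. The function names and the error-ratio metric are assumptions, not the study's methodology.

```python
from statistics import mean, pvariance


def bin_signal(samples, bin_size):
    """Average consecutive groups of `bin_size` samples (coarser resolution).
    Leftover samples that do not fill a whole bin are dropped."""
    usable = len(samples) - len(samples) % bin_size
    return [mean(samples[i:i + bin_size]) for i in range(0, usable, bin_size)]


def last_error_ratio(signal):
    """Mean squared error of the LAST predictor divided by signal variance.

    Lower means more predictable. Undefined for a constant signal, whose
    variance is zero.
    """
    errors = [(signal[t] - signal[t - 1]) ** 2 for t in range(1, len(signal))]
    return mean(errors) / pvariance(signal)


# Synthetic trace (not real data): a slow staircase plus fast alternation.
trace = [t // 16 + (t % 2) for t in range(256)]
for bin_size in (1, 2, 4, 8):
    ratio = last_error_ratio(bin_signal(trace, bin_size))
    print(bin_size, round(ratio, 4))
```

Sweeping the bin size this way over a real trace is one way to look for the resolution at which a given predictor does best.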
29
Multiresolution Prediction Example
[Figure: one series per predictor (last, bm(8), ma(8), ar(8), ar(32), arma(4,4), arima(4,1,4), arima(4,2,4), arfima(4,-1,4)) plotted against bin size in seconds (0.1 to 1000). Vertical axis runs from 0 to 0.3.]
30
Tsunami (with Jason Skicewicz)
• Efficient dissemination of resource signals
• Wavelet-based methods for
characterization, modeling, and prediction
• Tsunami toolkit will ship with the next RPS
release
31
The Tension
[Figure: a sensor takes resource-appropriate measurements, producing a resource signal by periodic sampling (example: host load), which crosses the network. A video app wants fine-grain measurement. A grid app wants coarse-grain measurement.]
32
Proposed System
[Figure: diagram labels: Sensor, Stream Interval, Wavelet Transform, Level 0, Level M-1, Level M, Network, Level 0, Level L, Inverse Wavelet Transform, Application.]
Application receives levels based on its needs
33
Delay
• Transforms introduce sample delay
– Depends on number of levels and type of filter used
– Exponential in the number of levels
– Affects both streaming and block transforms
– Seemingly inherent for wavelets
• Exploit prediction
– Limited success
• Exploit “wavelet-like” decompositions
– Trade-off between reconstruction accuracy and
delay
– Existing theory. Our evaluation not done yet.
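A minimal sketch of why the delay grows exponentially with the number of levels, using the Haar wavelet as the simplest case. The function names are hypothetical, and this is not the Tsunami code. The span formula for longer filters is the standard one for a cascade of filter-and-downsample stages.

```python
def haar_step(samples):
    """One level of the unnormalized Haar transform on an even-length list.

    Returns (approximation, detail), each half as long as the input.
    """
    pairs = list(zip(samples[0::2], samples[1::2]))
    approx = [(a + b) / 2 for a, b in pairs]
    detail = [(a - b) / 2 for a, b in pairs]
    return approx, detail


def haar_decompose(samples, levels):
    """Return [detail_1, ..., detail_levels, approx_levels]."""
    out, approx = [], list(samples)
    for _ in range(levels):
        approx, detail = haar_step(approx)
        out.append(detail)
    out.append(approx)
    return out


def samples_per_coefficient(levels, filter_length=2):
    """Input samples spanned by one coefficient at the deepest level.

    For Haar (filter_length=2) this is 2**levels, so a streaming transform
    must wait for that many samples before emitting the coefficient.
    """
    return (filter_length - 1) * (2 ** levels - 1) + 1


print(haar_decompose([1, 3, 5, 7], 2))
for levels in range(1, 6):
    print(levels, samples_per_coefficient(levels))
```

Each additional level doubles the number of input samples a coefficient depends on, and longer filters widen the span further.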
34
Wavelets and Prediction
• Predict each level of transformed signal
separately
– “Detail signals”
• Surprisingly ineffective in practice
• Whitens the signal
– “Approximation signals”
• Smoothing, used in network prediction work
discussed earlier
• Reasonably effective, worth pursuing
35
Diffusion (with Brian Cornell, Jack Lange)
• Efficient dissemination of resource signals
• Piggyback additional information on
existing packet transfers
– No additional packets
– Packet size unchanged
Zero Cost Information Dissemination
• Evaluations with traces, Minet
• Implementation as Linux kernel module
• >=86 bits per packet possible
• 17 bits per packet verified
36
Diffusion Implementation
[Figure: two protocol stacks (App, Transport, Network, Data Link, Physical), one with a Sensor and a Header Editing stage, the other with a Consumer and a Data Extraction stage.]
Sensor data piggybacked on application packets
37
SpyTalk
38
How can we use the adaptation mechanisms exposed by an
application to match its resource demand with resource supply?
• Application-level performance predictions
– Running Time Advisor
• Confidence interval for running time of a task on a
particular host
– Message Time Advisor
• Confidence interval for transfer time of a message
• Adaptation advisors
– Real-time Scheduling Advisor
• Choose the host, from a set, on which a task is most
likely to meet its deadline
• Real-time responsiveness requirement
• Service for interactive applications
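A minimal sketch of how a scheduling advisor could turn per-host running time predictions into a host choice. It assumes each prediction is summarized as a mean and standard deviation and that running time is normally distributed. Both are assumptions made for illustration, and the function and host names are hypothetical.

```python
from statistics import NormalDist


def probability_of_meeting_deadline(mean_s, stddev_s, deadline_s):
    """P(running time <= deadline) under an assumed normal distribution."""
    if stddev_s == 0:
        return 1.0 if mean_s <= deadline_s else 0.0
    return NormalDist(mean_s, stddev_s).cdf(deadline_s)


def choose_host(predictions, deadline_s):
    """Pick the host most likely to finish the task before the deadline.

    `predictions` maps a host name to (predicted mean, standard deviation)
    of the task's running time on that host, in seconds.
    """
    return max(
        predictions,
        key=lambda host: probability_of_meeting_deadline(
            *predictions[host], deadline_s
        ),
    )


# Hypothetical predictions for one task on two hosts.
predictions = {
    "fast_but_erratic": (8.0, 4.0),
    "slow_but_steady": (9.0, 0.5),
}
print(choose_host(predictions, deadline_s=10.0))
print(choose_host(predictions, deadline_s=8.5))
```

The host with the lowest expected running time is not always the best choice. With a loose deadline the steadier host is more likely to meet it, while a tight deadline favors the faster, more variable one.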
39
Running Time Advisor
40
Real-time Scheduling Advisor
41
• How do we build adaptive distributed interactive
applications effectively?
• How does the demand for resources in these
applications vary over time?
• How does the supply of resources vary over time?
• How can we use the adaptation mechanisms
exposed by an application to match its resource
demand with resource supply?
42
Distributed and
Parallel Computing
How do we deliver arbitrary amounts of
computational power to ordinary people?
Interactive
Applications
43
Future Directions
• Continue pushing on projects discussed
• New directly related projects
– Interactive hierarchical visualization of
huge datasets
– Resource demand characterization,
modeling, and prediction
• Other directions
– Intrusion detection using signal processing
44
For More Information
• Peter Dinda
– http://www.cs.northwestern.edu/~pdinda
• Prescience Lab
– http://plab.cs.northwestern.edu
45