XML + Query Processing: A Foundation for Intelligent Networks


Sensor Networks: Implications for Database Systems and Vice-Versa
UCB Sensor Day
Michael Franklin
January 2004
http://www.cs.berkeley.edu/~franklin




Query-based interface to sensor networks
Developed on TinyOS/Motes
Benefits
– Ease of programming and retasking
– Extensible aggregation framework
– Power-sensitive optimization and adaptivity
Sam Madden (Ph.D. Thesis) in collaboration with Wei
Hong (Intel) and guidance (?) from Franklin and
Hellerstein.
http://telegraph.cs.berkeley.edu/tinydb
Why Database Queries?

Declarative, Set-based approach.
– Programmer productivity.
– Robustness to change.
– Let the system manage efficiency.

Semantics and High-level operators.
– Framework for correctness criteria.
– Pushing semantics down enables smarter
implementations, code re-use.

Natural mapping of dataflow processing.
– Query plans are networks of operators.
– Query/Data duality enables intelligent routing.
These are the traditional arguments.
Here’s why the techniques carry over.
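The point that query plans are networks of operators can be sketched with Python generators. This is only an illustration, not TinyDB or Telegraph code: the operator names (`scan`, `select`, `project`) and the sample readings are assumptions made up for the example.

```python
from typing import Callable, Dict, Iterable, Iterator, List

Row = Dict[str, float]

def scan(rows: Iterable[Row]) -> Iterator[Row]:
    """Source operator: emit rows one at a time."""
    for row in rows:
        yield row

def select(pred: Callable[[Row], bool], child: Iterator[Row]) -> Iterator[Row]:
    """Filter operator: pass along only rows that satisfy the predicate."""
    return (row for row in child if pred(row))

def project(columns: List[str], child: Iterator[Row]) -> Iterator[Row]:
    """Projection operator: keep only the named columns."""
    return ({c: row[c] for c in columns} for row in child)

# Hypothetical readings; the temp values are invented for illustration.
readings = [
    {"nestNo": 1, "light": 455, "temp": 20},
    {"nestNo": 2, "light": 389, "temp": 21},
]

# The plan is a small dataflow network: scan -> select -> project.
plan = project(["nestNo", "light"],
               select(lambda r: r["light"] > 400, scan(readings)))
result = list(plan)
print(result)
```

Each operator only consumes rows from its child, so operators can be reordered or replaced without touching the query text.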
Declarative Queries in Sensor Nets

Many sensor network applications can be described using query
language primitives.
– Potential for tremendous reductions in development and
debugging effort.
“Report the light intensities of the bright nests.”
SELECT nestNo, light
FROM sensors
WHERE light > 400
EPOCH DURATION 1s
Sensors
Epoch  nestNo  Light  Temp  Accel  Sound
0      1       455    x     x      x
0      2       389    x     x      x
1      1       422    x     x      x
1      2       405    x     x      x
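As a rough check of what the query reports, the filter can be applied to the four Sensors rows shown on the slide. This is a plain-Python sketch of the query's meaning, not how TinyDB executes it; `run_query` is a hypothetical helper name.

```python
# (epoch, nestNo, light) rows copied from the Sensors table on the slide.
SENSORS = [
    (0, 1, 455),
    (0, 2, 389),
    (1, 1, 422),
    (1, 2, 405),
]

def run_query(rows, threshold=400):
    """Return (epoch, nestNo, light) tuples with light > threshold, in epoch order."""
    return [(e, n, l) for (e, n, l) in sorted(rows) if l > threshold]

results = run_query(SENSORS)
for epoch, nest_no, light in results:
    print(f"epoch={epoch} nestNo={nest_no} light={light}")
```

Nest 2 is dropped in epoch 0 (light 389) and reported in epoch 1 (light 405), since the predicate is re-evaluated every epoch.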
Aggregation Query Example
“Count the number of occupied nests in each loud region of the island.”
SELECT region, CNT(occupied),
AVG(sound)
FROM sensors
GROUP BY region
HAVING AVG(sound) > 200
EPOCH DURATION 10s
Regions w/ AVG(sound) > 200
Epoch  region  CNT(…)  AVG(…)
0      North   3       360
0      South   3       520
1      North   3       370
1      South   3       520
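The GROUP BY / HAVING semantics for one epoch can be sketched as follows. The per-nest readings are hypothetical (the slide shows only the aggregated result), chosen so that the output matches the epoch 0 rows. The sketch also assumes `occupied` is a 0/1 flag and that CNT counts the nests where it is set.

```python
from collections import defaultdict

# Hypothetical readings for a single epoch: (region, occupied, sound).
READINGS = [
    ("North", 1, 350), ("North", 1, 360), ("North", 1, 370),
    ("South", 1, 500), ("South", 1, 520), ("South", 1, 540),
    ("East", 0, 100), ("East", 1, 120),
]

def loud_regions(readings, min_avg_sound=200):
    """Group by region, keep groups whose average sound exceeds the threshold."""
    groups = defaultdict(list)
    for region, occupied, sound in readings:
        groups[region].append((occupied, sound))
    out = {}
    for region, rows in groups.items():
        avg_sound = sum(s for _, s in rows) / len(rows)
        if avg_sound > min_avg_sound:          # HAVING AVG(sound) > 200
            cnt = sum(1 for occ, _ in rows if occ)  # CNT(occupied)
            out[region] = (cnt, avg_sound)
    return out

print(loud_regions(READINGS))
```

The quiet East region is filtered out by the HAVING clause even though it contains an occupied nest.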
In Network Aggregation: Example
Benefits
[Figure: Total Bytes Xmitted vs. Aggregation Function. 2500 nodes, 50x50 grid, depth = ~10, neighbors = ~20. X axis: Aggregation Function (EXTERNAL, MAX, AVERAGE, COUNT, MEDIAN). Y axis: Total Bytes Xmitted, 0 to 100000.]
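One way to see where the savings can come from is to count messages on a small routing tree. The sketch below assumes an AVERAGE computed by merging (sum, count) partial records at each node, compared with relaying every raw reading to the root (the EXTERNAL case). The tree, the readings, and the function names are hypothetical, and it counts messages rather than bytes.

```python
# Hypothetical routing tree (parent -> children, node 0 is the root) and readings.
TREE = {0: [1, 2], 1: [3, 4], 2: [5], 3: [], 4: [], 5: []}
READING = {0: 10, 1: 20, 2: 30, 3: 40, 4: 50, 5: 60}

def in_network_avg(node):
    """Return ((sum, count), messages) for the subtree rooted at node."""
    total, count, msgs = READING[node], 1, 0
    for child in TREE[node]:
        (s, c), m = in_network_avg(child)
        total += s
        count += c
        msgs += m + 1  # the child sends one partial record to this node
    return (total, count), msgs

def external_messages(node, depth=0):
    """Messages sent when every raw reading is relayed hop by hop to the root."""
    return depth + sum(external_messages(c, depth + 1) for c in TREE[node])

(total, count), in_net_msgs = in_network_avg(0)
print("average:", total / count)
print("in-network messages:", in_net_msgs)
print("external messages:", external_messages(0))
```

With in-network merging each non-root node transmits once per epoch, while in the external case a reading costs one message per hop, so the gap widens as the tree gets deeper.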
Telegraph: Monitoring Data Streams






Streaming Data
– Network monitors
– Sensor Networks
– News feeds
– Stock tickers
B2B and Enterprise apps
– Supply-Chain, CRM, RFID
– Trade Reconciliation, Order Processing etc.
(Quasi) real-time flow of events and data
Must manage these flows to drive business (and other)
processes.
Can mine flows to create and adjust business rules.
Can also “tap into” flows for on-line analysis.
http://telegraph.cs.berkeley.edu
One View of the Design Space
Time Scale (seconds to years)
– Filtering, Cleaning, Alerts: on-the-fly processing
– Monitoring, Time-series; Data mining (recent history): combined stream/disk processing
– Archiving (provenance and schema evolution): disk-based processing
Another View of the Design Space
Geographic Scope (local to global)
– Filtering, Cleaning, Alerts: several readers
– Monitoring, Time-series; Data mining (recent history): regional centers
– Archiving (provenance and schema evolution): central office
One More View of the Design Space
Degree of Detail vs. Aggregate Data Volume
– Filtering, Cleaning, Alerts: Dup Elim (history: hrs)
– Monitoring, Time-series; Data mining (recent history): Interesting Events (history: days)
– Archiving (provenance and schema evolution): Trends/Archive (history: years)
“HiFi Systems”

High Fan-In, globally-distributed architecture
– Think RFID-enabled supply chain/logistics
– Telegraph-like nodes internal to the network
– TinyDB-like sensor networks at the edges

Large data volumes generated at edges
Successive aggregation as you move into the center
Strong spatio-temporal focus
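A toy sketch of successive aggregation in a high fan-in hierarchy, under assumed names: edge readers drop duplicate RFID reads, regional nodes count distinct tags per region, and the center sums the regional counts. The reader names, regions, and tag ids are all hypothetical.

```python
from collections import Counter

# Hypothetical raw RFID reads, keyed by (region, reader).
EDGE_READS = {
    ("west", "dock-1"): ["tagA", "tagA", "tagB"],
    ("west", "dock-2"): ["tagC"],
    ("east", "dock-3"): ["tagD", "tagD", "tagD", "tagE"],
}

def edge_dedup(reads):
    """Edge level: eliminate duplicate reads of the same tag."""
    return sorted(set(reads))

def regional_counts(edge_reads):
    """Regional level: count distinct tags reported by each region's readers."""
    counts = Counter()
    for (region, _reader), reads in edge_reads.items():
        counts[region] += len(edge_dedup(reads))
    return dict(counts)

def central_total(region_counts):
    """Center: a single number summarizing all regions."""
    return sum(region_counts.values())

raw_reads = sum(len(r) for r in EDGE_READS.values())
per_region = regional_counts(EDGE_READS)
print(raw_reads, per_region, central_total(per_region))
```

Eight raw reads at the edge shrink to two regional counts and then to one total at the center, which is the volume reduction the bullets describe.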



Would love to talk with people who have applications
that might need this kind of infrastructure.