Query Processing for Sensor Networks

Download Report

Transcript Query Processing for Sensor Networks

Query Processing for Sensor
Networks
Yong Yao and Johannes Gehrke
(Presentation: Anne Denton
March 8, 2003)
Outline




What sensor networks are we talking about?
What are the issues?
What are the choices?
Network issues


Database issues


Routing
Query plans
Related work
What Sensor Networks are we
talking about?
Commercially available:
 Size: a few cubic inches


Operating system




Projected according to Moore’s law: ¼ inch
available soon (not sure sure if Moore talked about
batteries …)
Embedded version of Linux (redhat) or Windows
ce.net
Wireless multi-hop RF radio
Powered by batteries
(LAN-attached with permanent power sources
Berkeley MICA Mote
http://www.xbow.com/Products/Product_pdf_files/Wireless_pdf/MICA.pdf
Note related
work to
Gehrke’s
is done at
Berkeley
(TinyDB)
Issues

Wireless





Power consumption



1 year idle
1 week under full load
Computation


Limited QoS
Latency with high variance
Limited bandwith
Frequently drops packets
Limited memory and computing power
Uncertainty in sensor readings
Supported Sensors





Temperature
Light
Magnetometers
Accelerometers
Microphones
Example Uses

Buildings



Biology



“Is Yong in his office”
“Is there an empty seat in the meeting room”
Find out about existence of specific species of bird
Map bird’s trail
MICA Mote developed under DARPA grant …
Choices

Query layer should be declarative



In-Network processing



Preservation of energy and bandwidth
Ratio of sending 1 bit vs. executing one instruction 220 to
2900 depending on architecture
Different trade-offs => job of query layer



Abstract user from physical details
(Why are database people interested …)
Long-term, e.g., monitoring environment
Short-term, e.g., battlefield
Query Proxy between network and application layer
(bypasses routing layer to some extent)

Must be closely linked with network layer
More Choices
Special nodes to access network
 Gateway nodes
 Noise requires “fusing” of data
 Aggregation important
 Queries need DURATION and EVERY
 Event-oriented model (triggers)
desirable but not implemented

In-Network Aggregation

Why?


Partial aggregation



Energy to transmit is heaviest burden
Possible for algebraic aggregate operators
(MAX, MIN, SUM, AVG)
Impossible for holistic operator (MEDIAN)
Otherwise: packet merging
http://citeseer.nj.nec.com/gray97data.html
Synchronization

Necessary for partial aggregation and packet
merging
AVG and SUM are duplicate sensitive aggregate
operators:
 Spanning tree
 MIN and MAX are not duplicate sensitive
 DAG may be sufficient

 Pragmatic approach to synchronization
 Problem: Predictions may fail due to network

reorganization or query results
bi-directional prediction
Routing

Differences to wired network



Many ad-hoc routing algorithms exist


Everybody has to share the routing job
Network is unstable
Routing layer in protocol stack
Database approach requires changes to
routing protocol

Gehrke points out that that’s not unusual:
Database file-access also bypasses operating
system to some extent
Changes to Routing Protocol

Intercepting of packets to achieve



Differences in communication pattern



Packet merging
Partial aggregation
Communication with leader rather than point-topoint
Knowledge about neighbors
Route initialization and maintainance

…
Query Plans

Example query “What is the quietest open
classroom in Upson Hall”

2 levels of aggregation



Query plan has



Compute average value for each qualified class room
Select minimum average over all class rooms
Flow blocks
Leader nodes
Differences to traditional optimizers


Focus on communication cost
Flow block instead of relational operator
Flow blocks

Task



Collect data
Perform computations
Parameters




Set of source nodes
Leader selection policy
Routing structure, e.g., DAG, tree
Computation
Query Optimization Example
SELECT
D.gid, AVG(D.value)
FROM
SensorData D
GROUP BY D.gid
HAVING
AVG(D.value)>Threshold
 Flow block for each group



Good if nodes in group physically close
In-Network Aggregation
Single flow block for all



Better if nodes in group are interspersed
No In-Network Aggregation possible
Packet merging more efficient
Experiments





Using a simulator
IEEE 802.11 as MAC layer
Prove energy decrease from in-Network
aggregation and packet merging
Extra delay overcompensated by
reduced collisions
… prove that the rest works too
Summary


Interesting database as well as network
issues
No data mining issues in this paper
(although I could think of some …)