Query Processing for Sensor Networks
Download
Report
Transcript Query Processing for Sensor Networks
Query Processing for Sensor
Networks
Yong Yao and Johannes Gehrke
(Presentation: Anne Denton
March 8, 2003)
Outline
What sensor networks are we talking about?
What are the issues?
What are the choices?
Network issues
Database issues
Routing
Query plans
Related work
What Sensor Networks are we
talking about?
Commercially available:
Size: a few cubic inches
Operating system
Projected according to Moore’s law: ¼ inch
available soon (not sure sure if Moore talked about
batteries …)
Embedded version of Linux (redhat) or Windows
ce.net
Wireless multi-hop RF radio
Powered by batteries
(LAN-attached with permanent power sources
Berkeley MICA Mote
http://www.xbow.com/Products/Product_pdf_files/Wireless_pdf/MICA.pdf
Note related
work to
Gehrke’s
is done at
Berkeley
(TinyDB)
Issues
Wireless
Power consumption
1 year idle
1 week under full load
Computation
Limited QoS
Latency with high variance
Limited bandwith
Frequently drops packets
Limited memory and computing power
Uncertainty in sensor readings
Supported Sensors
Temperature
Light
Magnetometers
Accelerometers
Microphones
Example Uses
Buildings
Biology
“Is Yong in his office”
“Is there an empty seat in the meeting room”
Find out about existence of specific species of bird
Map bird’s trail
MICA Mote developed under DARPA grant …
Choices
Query layer should be declarative
In-Network processing
Preservation of energy and bandwidth
Ratio of sending 1 bit vs. executing one instruction 220 to
2900 depending on architecture
Different trade-offs => job of query layer
Abstract user from physical details
(Why are database people interested …)
Long-term, e.g., monitoring environment
Short-term, e.g., battlefield
Query Proxy between network and application layer
(bypasses routing layer to some extent)
Must be closely linked with network layer
More Choices
Special nodes to access network
Gateway nodes
Noise requires “fusing” of data
Aggregation important
Queries need DURATION and EVERY
Event-oriented model (triggers)
desirable but not implemented
In-Network Aggregation
Why?
Partial aggregation
Energy to transmit is heaviest burden
Possible for algebraic aggregate operators
(MAX, MIN, SUM, AVG)
Impossible for holistic operator (MEDIAN)
Otherwise: packet merging
http://citeseer.nj.nec.com/gray97data.html
Synchronization
Necessary for partial aggregation and packet
merging
AVG and SUM are duplicate sensitive aggregate
operators:
Spanning tree
MIN and MAX are not duplicate sensitive
DAG may be sufficient
Pragmatic approach to synchronization
Problem: Predictions may fail due to network
reorganization or query results
bi-directional prediction
Routing
Differences to wired network
Many ad-hoc routing algorithms exist
Everybody has to share the routing job
Network is unstable
Routing layer in protocol stack
Database approach requires changes to
routing protocol
Gehrke points out that that’s not unusual:
Database file-access also bypasses operating
system to some extent
Changes to Routing Protocol
Intercepting of packets to achieve
Differences in communication pattern
Packet merging
Partial aggregation
Communication with leader rather than point-topoint
Knowledge about neighbors
Route initialization and maintainance
…
Query Plans
Example query “What is the quietest open
classroom in Upson Hall”
2 levels of aggregation
Query plan has
Compute average value for each qualified class room
Select minimum average over all class rooms
Flow blocks
Leader nodes
Differences to traditional optimizers
Focus on communication cost
Flow block instead of relational operator
Flow blocks
Task
Collect data
Perform computations
Parameters
Set of source nodes
Leader selection policy
Routing structure, e.g., DAG, tree
Computation
Query Optimization Example
SELECT
D.gid, AVG(D.value)
FROM
SensorData D
GROUP BY D.gid
HAVING
AVG(D.value)>Threshold
Flow block for each group
Good if nodes in group physically close
In-Network Aggregation
Single flow block for all
Better if nodes in group are interspersed
No In-Network Aggregation possible
Packet merging more efficient
Experiments
Using a simulator
IEEE 802.11 as MAC layer
Prove energy decrease from in-Network
aggregation and packet merging
Extra delay overcompensated by
reduced collisions
… prove that the rest works too
Summary
Interesting database as well as network
issues
No data mining issues in this paper
(although I could think of some …)