
Zachary G. Ives
University of Pennsylvania
joint work with Svilen Mihaylov, Marie Jacob,
Mengmeng Liu, Sudipto Guha, Boon Thau Loo
Funded by NSF IIS-0713267, CNS-0721541, and DARPA
DMSN 2008
August 24, 2008
Sensor Networks – Today
Many of today’s sensor apps focus on passive monitoring:
 Zoology, biology, oceanography, building security, etc.
 Many cool apps like ZebraNet (Princeton), CarTel (MIT), intelligent
parking lots (USC, others)
Has driven research throughout the software stack, e.g.:
 Overcoming hardware limitations
 Ad hoc networking
 Distributed declarative aggregation queries over data
 Approximation
… Where is sensor network technology heading next?
Sensor Networks – Tomorrow
Will join a larger ecosystem: connected to databases, streaming
Web sources, and the Internet as a whole!
 Sensors monitoring physical environment
 “Soft sensors” monitoring logical state of devices, network, nodes
 Displays and actuators that “follow” and guide / interact with the user
High-level programming that abstracts the heterogeneity of this
environment, detects events, defines high-level views
Applications: interactive environments that integrate, correlate
 “Smart buildings” that help occupants / visitors
 Hospital: Finding a patient
 Home: Reminders to take medications (or feed the fish)
 Campus: Where to find an available computer terminal
 Cross-layer, cross-system monitoring (e.g., security)
A Smart Environment – Penn “CIStem”
Our 5 buildings extended with soft & real sensors:
 Servers, routers, robots, etc.
 Soft: load, users, applications, reservations
 Hard: Machine temperature; AC status; power status; position
 Workstations
 Soft: Machine active; screen saver active
 Hard: Light on in lab; audio at keyboard
 Conference rooms, classrooms
 Soft: Reservations according to calendar
 Hard: Movement in room
Displays:
 PDAs, monitors in halls, LEDs, iPhones, etc.
Some Example Use Cases
Query: “Direct me to a free workstation with MS Word”
Lights & motion in lab; machine available; apps installed; lab near user
Query: “How many machines can we keep in this lab – given cooling constraints
– and how many requests can they handle?”
Run a machine under load; measure throughput; measure temperature
Query: “Where are the GRASP mobile robots located?”
On-board positioning; nearby ceiling-mounted sensors; complex
interpolation function
Event: reminders trigger and appear wherever you are!
Your calendar; your RFID and position; nearest display
Event: gracefully shut down machines near air conditioning outage or fire alarm
Air cond. cooling zone, alarm state, machine state, lookup of IP address
from location
A mix of sensors, DBs, etc.; lots of correlation (join)
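As a rough illustration of the correlation involved, the first use case ("Direct me to a free workstation with MS Word") can be sketched as a join over three inputs: lab sensors, machine state, and an installed-software table. All record layouts, names, and sample values below are hypothetical and are not taken from the CIStem system. The sketch also assumes that "lights on" means the lab is open. It is a minimal in-memory version, not a distributed query plan.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LabReading:       # hard sensor reading for a lab (hypothetical layout)
    lab: str
    lights_on: bool     # assumption: lights on means the lab is open
    distance_m: float   # assumed distance from the user to the lab


@dataclass(frozen=True)
class MachineState:     # soft sensor: is the workstation in use?
    machine: str
    lab: str
    active: bool


@dataclass(frozen=True)
class InstalledApp:     # database relation: software installed per machine
    machine: str
    app: str


def free_workstations(labs, machines, apps, wanted_app):
    """Join labs, machine state, and installed apps; nearest lab first."""
    open_labs = {l.lab: l for l in labs if l.lights_on}
    has_app = {a.machine for a in apps if a.app == wanted_app}
    matches = [
        (open_labs[m.lab].distance_m, m.machine, m.lab)
        for m in machines
        if not m.active and m.lab in open_labs and m.machine in has_app
    ]
    return [(machine, lab) for _, machine, lab in sorted(matches)]


labs = [
    LabReading("lab-A", True, 40.0),
    LabReading("lab-B", True, 15.0),
    LabReading("lab-C", False, 5.0),
]
machines = [
    MachineState("ws1", "lab-A", False),
    MachineState("ws2", "lab-B", True),
    MachineState("ws3", "lab-B", False),
    MachineState("ws4", "lab-C", False),
]
apps = [InstalledApp(m, "MS Word") for m in ("ws1", "ws2", "ws3", "ws4")]

result = free_workstations(labs, machines, apps, "MS Word")
print(result)
```

In this sample, ws2 is excluded because it is in use and ws4 because its lab is dark, so the query returns ws3 (nearest) followed by ws1.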
To build these rich, interactive applications that combine
many types of data, we need a new architecture:
 Capabilities for integrating stream data:
 Queries across different device types, network types
 Views and abstraction layers
 Cleaning and conflict resolution
 Distributed optimization and query processing
 Sensing of logical state within devices:
 Network monitoring (at different layers, e.g., link, transport)
 App, node state (e.g., server jobs, console in-use state)
 The ability to “subscribe” displays or apps to views
Our target devices may include many non-mote devices
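The idea of "subscribing" displays or apps to views might look like the sketch below. The `View` class, its methods, and the sample readings are hypothetical, introduced only for illustration. The slides do not describe a concrete API, and a real system would evaluate the view in a distributed fashion rather than in one process.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]


class View:
    """A named filter plus projection over a stream of rows (hypothetical)."""

    def __init__(self, predicate: Callable[[Row], bool], columns: List[str]):
        self.predicate = predicate
        self.columns = columns
        self.subscribers: List[Callable[[Row], None]] = []

    def subscribe(self, callback: Callable[[Row], None]) -> None:
        """Register a display or app that wants the view's output."""
        self.subscribers.append(callback)

    def push(self, row: Row) -> None:
        """Feed one source row; matching rows are sent to every subscriber."""
        if self.predicate(row):
            out = {c: row[c] for c in self.columns}
            for callback in self.subscribers:
                callback(out)


# Two "displays", modeled here as plain lists that collect delivered rows.
hall_display: List[Row] = []
phone_display: List[Row] = []

# Assumed threshold of 70 degrees C, chosen only for the example.
hot_machines = View(lambda r: r["temp_c"] > 70, ["machine", "temp_c"])
hot_machines.subscribe(hall_display.append)
hot_machines.subscribe(phone_display.append)

for reading in [
    {"machine": "srv1", "temp_c": 65, "room": "101"},
    {"machine": "srv2", "temp_c": 82, "room": "101"},
]:
    hot_machines.push(reading)

print(hall_display)
```

Both displays receive only the srv2 row, projected to the view's two columns.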
Our “Vision Statement”
Take a declarative query, posed over a view over any
kind of data, sourced by any device…
 Partition and distribute the query across a variety of
networks and systems…
 and feed its output to anyone “subscribing” to it…
 in a way that maximizes performance and reliability, while
minimizing use of precious resources!
The Key Research Challenges
The “right” primitives for stream information integration,
based on, but going beyond traditional data integration
 Views, joins, aggregation, etc. but also…
 User defined functions
 Regions, neighborhoods, and paths (transitive closure)
Highly distributed stream query processing capabilities
 Distributing queries across a heterogeneous network
 Impact of wireless, multi-hop networks on query processing
– failures, changes in topology, etc.
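To make the "regions, neighborhoods, and paths" primitive concrete, here is a minimal sketch of transitive closure as reachability over neighbor links. The link list and node names are made up, and links are assumed to be undirected. This is an ordinary centralized breadth-first search, not the in-network evaluation the slides have in mind.

```python
from collections import deque


def reachable(links, start):
    """Return every node reachable from start over undirected links."""
    neighbors = {}
    for a, b in links:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in neighbors.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


# Hypothetical radio links between motes; m4 and m5 form a separate region.
links = [("m1", "m2"), ("m2", "m3"), ("m4", "m5")]
region = reachable(links, "m1")
print(sorted(region))
```

A region is then the set of nodes connected to a seed node; a windowed variant would first drop links not observed within the current time window.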
Some of the Challenges in Distributed Query Processing
 Highly distributed, dynamic networks make computation,
coordination, optimization hard!
 Joins get “horizontally partitioned” very heavily: how do we
execute, optimize, adapt to changes, and handle mobility?
 How do we support windowed computation over regions,
paths, etc.? (windowed transitive closure queries)
 How do we determine how much work to place at different
nodes in a heterogeneous environment? Performance vs.
reliability vs. battery…
 How do we do decentralized optimization of queries? (One
answer: adaptive techniques!)
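One common way to run a join over horizontally partitioned inputs is to re-route both inputs by a hash of the join key and then join locally at each node. The sketch below shows that strategy only as an illustration. The slides do not say which strategy is used, and the function names, node count, and sample rows are assumptions. It also ignores failures, mobility, and windows.

```python
import zlib
from collections import defaultdict


def node_for(key, num_nodes):
    """Pick a node for a join key using a deterministic hash."""
    return zlib.crc32(str(key).encode("utf-8")) % num_nodes


def partitioned_join(left, right, key, num_nodes):
    """Route both inputs by join key, then join locally at each node."""
    left_parts = defaultdict(list)
    right_parts = defaultdict(list)
    for row in left:
        left_parts[node_for(row[key], num_nodes)].append(row)
    for row in right:
        right_parts[node_for(row[key], num_nodes)].append(row)

    results = []
    for node in range(num_nodes):
        # Local hash join: build on the right fragment, probe with the left.
        index = defaultdict(list)
        for r in right_parts[node]:
            index[r[key]].append(r)
        for l in left_parts[node]:
            for r in index[l[key]]:
                results.append({**l, **r})
    return results


# Hypothetical temperature and load readings keyed by machine name.
temps = [{"machine": "ws1", "temp_c": 41}, {"machine": "ws2", "temp_c": 55}]
loads = [
    {"machine": "ws1", "load": 0.2},
    {"machine": "ws2", "load": 0.9},
    {"machine": "ws3", "load": 0.1},
]

joined = partitioned_join(temps, loads, "machine", num_nodes=3)
print(sorted(joined, key=lambda r: r["machine"]))
```

Because rows with equal keys always land on the same node, each node can join its fragments independently. The hard questions on the slide start where this sketch stops: what happens when nodes move, fail, or differ widely in capacity.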
Conclusions – Much Remains to Do!
 The sensor world is going to be heterogeneous!
 Many network types, device types
 Soft sensors, physical sensors, routers, applications, …
 All sorts of data formats, including video
 Can we build a unified infrastructure for managing
heterogeneity, consistency, and data acquisition /
integration?
 Can we make it perform?