A Two Tier Sensor Storage Architecture Using Interval Skip Graphs

Download Report

Transcript A Two Tier Sensor Storage Architecture Using Interval Skip Graphs

TSAR*: A Two Tier Sensor Storage
Architecture Using Interval Skip Graphs
(*Tiered Storage ARchitecture)
Peter Desnoyers, Deepak Ganesan, and Prashant Shenoy
University of Massachusetts, Amherst
Department of Computer Science
University of Massachusetts, Amherst
Why do we need archival storage?
Applications need historical sensor information.
Why? Trigger events:
• Traffic monitoring - crash
• Surveillance - break-in
• Environmental monitoring - natural disaster
lead to requests for past information.
This requires archival storage.
UNIVERSITY OF MASSACHUSETTS, AMHERST
Existing storage and indexing approaches
◊Streaming query systems
 TinyDB
(Madden 2005), etc.
 Data storage and indexing is performed outside of network.
Optimized for continuous queries. High energy cost if
used for archival - data must be transmitted to central
data store.
◊In-network storage and indexing
 DCS,
GHT (Ratnasamy 2002)
 Dimensions (Ganesan 2003)
 Directed Diffusion (Intangonwiwat 2000)
Limited by lack of sufficient, energy-efficient storage
and of communication and computation resources on
current sensor platforms.
UNIVERSITY OF MASSACHUSETTS, AMHERST
Technology Trends
Radio
mJ/byte
Flash
Max Flash
mJ/byte
size
Mica2
30
4.5
0.5MB
MicaZ
3.4
4.5
0.5MB
Telos
3.4
1
1MB
0.01
>1GB
UMass
NAND
New flash technologies
enable large storage
systems on small energyconstrained sensors.
100x
1000x
UNIVERSITY OF MASSACHUSETTS, AMHERST
Hierarchical Storage and Indexing
Hierarchical deployments are being
used to provide scaling:
• James Reserve (CENS)
Higher powered micro-servers are
deployed alongside resource
constrained sensor nodes.
Application
Proxies
Key challenge:
• Exploit proxy resources to
perform intelligent search
across data on resourceconstrained nodes.
Sensors
UNIVERSITY OF MASSACHUSETTS, AMHERST
Key Ideas in TSAR
◊ Exploit storage trends for archival.

Use cheap, low-power, high capacity flash memory in
preference to communication.
◊ Index at proxies and store at sensors.

Exploit proxy resources to conserve sensor resources
and improve system performance.
◊ Extract key searchable attributes.

Distill sensor data into concise attributes such as ranges
of time or value that may be used for location and
retrieval but require less energy to transmit.
UNIVERSITY OF MASSACHUSETTS, AMHERST
TSAR Architecture
1. Interval Skip Graph-based
index between proxies.
• Exploit proxy resources to
locate data stored on sensors
in response to queries.
2. Summarization process

• Extracts identifying information:
e.g. time period during which
events were detected, range of
event values, etc.
3. Local sensor data archive
• Stores detailed sensor
information: e.g. images,
events.
Sensor node archive
UNIVERSITY OF MASSACHUSETTS, AMHERST
TSAR Architecture
1. Interval Skip Graph-based
index between proxies.
• Exploit proxy resources to
locate data stored on sensors
in response to queries.
2. Summarization process
Summarization
function
• Extracts identifying information:
e.g. time period during which
events were detected, range of
event values, etc.
3. Local sensor data archive
• Stores detailed sensor
information, e.g. images,
events.
UNIVERSITY OF MASSACHUSETTS, AMHERST

TSAR Architecture
1. Interval Skip Graph-based
index between proxies
Distributed index
• Exploit proxy resources to
locate data stored on sensors
in response to queries.
2. Summarization process
• Extracts identifying information:
e.g. time period during which
events were detected, range of
event values, etc.
3. Local sensor data archive
• Stores detailed sensor
information, e.g. images,
events.
UNIVERSITY OF MASSACHUSETTS, AMHERST

Example - Camera Sensing
Sensor archives information and transmits
summary to proxy.
Cyclops
camera
image


Birds(t1,t2)=1
summarize
<id>
storage
UNIVERSITY OF MASSACHUSETTS, AMHERST
<id>
Summary
handle
Sensor node
Example - Indexing
Index


Network of proxies
Birds(t1,t2)=1
<id>
Birds
t1,t2 1
<id>
proxy
Summary and location information are
stored and indexed at proxy.
UNIVERSITY OF MASSACHUSETTS, AMHERST
Example - Querying and Retrieval
Cyclops camera
Birds in interval
(t1,t2)?


summarize
Birds
Cyclops camera
t1,t2 1
<id>


summarize
proxy
Query is sent to any proxy.
UNIVERSITY OF MASSACHUSETTS, AMHERST
Example - Querying and Retrieval
Cyclops camera
Birds in interval
(t1,t2)?


summarize
Birds
Cyclops camera
t1,t2 1
<id>


summarize
<id>
proxy
Index is used to locate sensors
holding matching records.
UNIVERSITY OF MASSACHUSETTS, AMHERST
Example - Querying and Retrieval
Cyclops camera


summarize
Birds
Cyclops camera
t1,t2 1
<id>


<id>
proxy
Record is retrieved from storage
and returned to application.
UNIVERSITY OF MASSACHUSETTS, AMHERST
Outline of Talk
◊ Introduction and Motivation
◊ Architecture
◊ Example
◊ Design
Skip Graph
 Interval Search
 Interval and Sparse Interval Skip Graph

◊ Experimental Results
◊ Related Work
◊ Conclusion and Future Directions
UNIVERSITY OF MASSACHUSETTS, AMHERST
Goals of Index Structure
The index should:
• support range queries over time
or value,
(|
|)?
• be fully distributed among
proxies, and
• Support interval keys indicating
a range in time or value.
Distributed
index
insert(|
UNIVERSITY OF MASSACHUSETTS, AMHERST
|)
What is a Skip Graph?
(Aspnes & Shah, 2003,
Harvey et al. 2003)
Distributed extension of
Skip Lists (Pugh ‘90):
Probabilistically balanced no global rebalancing needed.
Ordered by key - provides
efficient range queries.
Fully distributed - data is
indexed in place.
2
3
5
6
9
12
18
19
Properties: Log(N) search and insert
No single root - load balancing,
robustness
UNIVERSITY OF MASSACHUSETTS, AMHERST
Single key and
associated pointers
Interval search
Query: x=4
9
8
10
6
8
5
4
2
3
2
5
1
3
0
0
1
2
3
4
5
6
7
8
Given intervals [low,high] and query X:
1 - order by low
2 - find first interval with high <= X
3 - search until low > X
UNIVERSITY OF MASSACHUSETTS, AMHERST
9
10
Interval search
Query: x=4
8
9
6
10
5
2
4
2
3
1
5
0
0
8
3
1
2
3
4
5
6
7
8
9
10
Given intervals [low,high] and query X:
1 - order by low
2 - find first interval with high <= X
3 - search until low > X
UNIVERSITY OF MASSACHUSETTS, AMHERST
Interval search
Query: x=4
8
9
6
10
8
5
2
4
2
3
1
5
0
0
3
1
2
3
4
5
6
7
8
Given intervals [low,high] and query X:
1 - order by low
2 - find first interval with high <= X
3 - search until low > X
UNIVERSITY OF MASSACHUSETTS, AMHERST
9
10
Simple Interval Skip Graph
Method:
Index two
increasing values:
low, max
Search on either as
needed.
0-3
0-1
1-5
2-4
5-8
6-10
8-9
9-12
3
3
5
5
8
10
10
12
Derived from Interval Tree,
Interval keys: YES
Cormen et al. 1990
logN search: YES
logN update: NO - (worst case O(N))
UNIVERSITY OF MASSACHUSETTS, AMHERST
Sparse Interval Skip Graph
Goal: efficient update of max(high)
values in Interval Skip Graph.
M proxies
Approach: take advantage of ratio of
proxies (M) to data items (N)
Solution: eliminate redundant links
and corresponding updates.
Before: complete search tree rooted at
each data item.
After: retain M trees, one rooted at
each proxy, keeping robustness and load
balancing properties.
- - - - - -
Worst-case complexity:
Search: O(logM)
Update: O(M)
UNIVERSITY OF MASSACHUSETTS, AMHERST
N data items
Adaptive Summarization
How accurately should the
summary information represent
the original data?
Detailed summaries =
more summaries,
precise index
updates
queries
Precise index =
fewer wasted queries
UNIVERSITY OF MASSACHUSETTS, AMHERST
Adaptive Summarization
How accurately should the
summary information represent
the original data?
Approximate summaries =
fewer summaries,
imprecise index
updates
queries
?
?
imprecise index =
more wasted queries
UNIVERSITY OF MASSACHUSETTS, AMHERST
Adaptive Summarization
Goal: balance update and
query cost.
Approach: adaptation.
a = summarization (summaries / data)
r = EWMA( wasted queries / data )
Target range:
r0
Decrease a if:
r > r0
Increase a if:
r <
updates
queries
UNIVERSITY OF MASSACHUSETTS, AMHERST
+e
r0 - e
Prototype and Experiments
◊ Software:
Em* (proxy),
TinyOS (sensor)
◊ Hardware: Stargate
Mica2 mote
◊ Network: 802.11 ad-hoc,
multihop BMAC 11%
◊ Data:
James Reserve [CENS] dataset
30s temperature readings
34 days
For physical experiments, data stream was stored on sensor
node and replayed.
UNIVERSITY OF MASSACHUSETTS, AMHERST
Index performance
How does the index performance
scale with the number of proxies
and size of dataset?
Queries
Interval skip graph index
Tested in:
Em* emulation
Tasks:
insert, query
Variables:
number of proxies (1-48)
size of dataset
Metric:
proxy-to-proxy messages
UNIVERSITY OF MASSACHUSETTS, AMHERST
Sensor data
Index results
Sparse skip graph provides >2x
decrease in message traffic for
small numbers of proxies.
Sparse skip graph shows
virtually flat message cost for
larger index sizes.
UNIVERSITY OF MASSACHUSETTS, AMHERST
Query performance
What is the query performance
on real hardware and real data?
queries
4-proxy
network
Tested on: 4 Stargate proxies
12 Mica2 sensors in
tree configuration
Task:
query
3-level
multi-hop
sensor field
Variables: size of dataset
Metric:
query latency (ms)
data
UNIVERSITY OF MASSACHUSETTS, AMHERST
Query results
Sensor link
latency dominates
Proxy link
delay is negligible
The sensor communication
consists only of a query and a
response - the minimal
communication needed to retrieve
the data.
Validates the approach of using proxy resources to
minimize the number of expensive sensor operations.
UNIVERSITY OF MASSACHUSETTS, AMHERST
Adaptive Summarization
How well does the
adaptation mechanism
track changes in
conditions?
Varied query
rate
Query/data
= 0.2
Query/data
= 0.1
1/a
Query/data
=0.03
Tested in: Em*, EMTOSSIM
emulation
Task:
data and queries
Variables: query/data ratio
Metric:
Summary rate
adapts
summarization
factor a
Summary algorithm adapts
to data and query
dynamics.
UNIVERSITY OF MASSACHUSETTS, AMHERST
Related Work
◊ In-network Storage:



DCS (Ratnasamy 2002)
Dimensions (Ganesan 2003)
…
◊ In-network Indexing:




GHT (Ratnasamy 2002)
DIFS (Greenstein 2003)
DIM (Li 2003)
…
◊ Hierarchical Sensor Systems:

Tenet (CENS, USC)
◊ Sensor Flash File Systems:


ELF (Dai 2004)
Matchbox (Hill et al. 2000)
UNIVERSITY OF MASSACHUSETTS, AMHERST
Conclusions and Future Work
◊ Proposed novel Interval Skip Graph-based index
structure and adaptive summarization mechanism for
multi-tier sensor archival storage.
◊ Implemented these ideas in the TSAR system.
◊ Demonstrated index scalability, query performance, and
adaptation of summarization factor, both in emulation and
running on real hardware.
Future Work
◊ Investigate other index structures.
◊ Alternate interval- and non-interval-based summary
mechanisms.
For more information:
http://presto.cs.umass.edu
UNIVERSITY OF MASSACHUSETTS, AMHERST