Rethinking Data Management for Storage
Download
Report
Transcript Rethinking Data Management for Storage
Rethinking Data Management for
Storage-centric Sensor Networks
Yanlei Diao, Deepak Ganesan, Gaurav Mathur, and
Prashant Shenoy
CIDR 2007
Proceedings of the Third Biennial Conference on Innovative Data
Systems Research (CIDR), Asilomar, CA, January 2007.
1
STONES Project
STONES
STOrage for Networked Embedded Systems
http://sensors.cs.umass.edu/projects/stones/
Energy-efficient Storage for Sensors
Sensor Database
PRESTO, TSAR, Capsule, …etc
2
Papers
Proxy Cache
PRESTO: Feedback-Driven Data Management in
Sensor Networks
ACM/USENIX NSDI 2006
TSAR: A Two Tier Sensor Storage Architecture Using
Interval Skip Graphs
ACM Sensys 2005
Sensor Nodes
Capsule: An Energy-Optimized Object Storage
System for Memory-Constrained Sensor Devices
ACM Sensys 2006
3
PRESTO
Introduction
PRESTO Proxy
precision(query) >
confidence interval
x(t+1)-x(t) >
worst-case
deviation
PRESTO Sensor
4
TSAR
Introduction
5
Capsule
Introduction
6
Outline
Introduction
StonesDB Architecture
Local Database
Distributed Data Management
Current Status and Conclusions
References
7
Introduction
Data Management in Sensor Networks
Live Data Management
Real-time queries
Only small window of data is important
Event detection and notification
Push-down Filters, AQP, …etc
Archival Data Management
Database outside the sensor networks
View sensor networks as database
Analysis of past events, Historical trends
8
Introduction
Example: Smart Home and Smart Biz
9
Introduction
Centralized Archival Data Management
?
Internet
Sensors with high data rate?!
camera, acoustic, vibration…
DBMS
Database Management System
Low data rate, High query rate
lossless aggregation…
10
22.1 ˚C
21.5 ˚C
21.8 ˚C
Introduction
Storage-centric Archival Data Management
User Query
?
Data Access
Internet
Push query to sensors!
limited capabilities
flash memory efficiency
Flash Memory
acoustic
image
High data rate,
Low query rate
Local Storage
11
Introduction
Sensor Node Hardware Today
Mica2 mote
6 MHz Processor
4 KB RAM
128 KB FLASH
iMote2
13 – 416 MHz Processor
32 MB RAM
32 MB FLASH
12
Introduction
Technology Trends
Energy Cost (per byte)
Energy cost of storage compared to that of communication
Communication
Storage
Generation of Sensor Platforms
13
Design Goals
Exploit local flash memory
Exploit resources-rich proxies
cache data and split query plans
Support a rich set of queries
Cheap, energy-efficient flash memory
SQL-type queries
data mining and similarity search queries
Support heterogeneity
configurable to heterogeneous sensor platforms
14
StonesDB Architecture
Two-tier Sensor Networks
user specified confidence bound
Distributed Data Management Layer
Local Database
15
StonesDB Architecture
System Operations
Image Retrieval
找出沒有洋蔥頭臉部表
情的圖片...
Proxy Cache of Image Summaries
…
16
StonesDB Architecture
System Operations
Image Retrieval
找出沒有洋蔥頭臉部表
情的圖片...
Sensor Local Storage
Query Engine
Partitioned Access Methods
…
17
Local Database
Architecture of Local Database
Stream
Index
Summary
18
Local Database
Costs and Benefits of Access Methods
Cost for B+ tree insertion
H (Cr Cw )
Cost for sequential scan
H:
H-level B+ tree
Cr , C w : cost for page read/write
R page : readings per page
Cr / R page
Lazy index construction!
When depth of B+ tree is 2…
Sequential scan is 340 times more energy efficiency!
Scan is better when data is not accessed very frequently.
19
Local Database
Partitioned Access Methods
B+ Tree R Tree
temporal segments
Write-Once Indexing!
20
Local Database
Summarization and Aging
Multi-resolution Summarization: Local Storage Allocation
Resolution 4 Resolution 3 Resolution 2 Resolution 1
local storage capacity
All available storage gets filled…
When to drop these summaries?
How to drop these summaries?
Graceful query quality degradation.
21
Local Database
Summarization and Aging
Low query accuracy
High compactness
High query accuracy
Low compactness
22
How long should a summary be stored in the network?
Local Database
Summarization and Aging
Nsi 4 i si
Agei
i 1
Ri
ri
Query Accuracy
95%
Quality Difference
Qsystem system-provided step function
Quser
user-desired quality degradation
50%
Agei
Min 0tT (Max(qdiff (t )))
present
past
Time
Objective: minimize the worst case quality difference
23
Distributed Data Management
The Problems
I want the data of …
Proxy Cache of Image Summaries
What summaries to cache?
What resolution of summaries?
How should a query plan be split?
24
Distributed Data Management
Querying the Proxy Cache
min
summaries from sensors to the proxy
queries from the proxy to sensors
query execution at the sensors
Internet
results back to the proxy
Statistical models?
Gateway, Proxy
Low-resolution data?
Metadata of images?
Sensor Nodes
25
Distributed Data Management
Querying the Sensor Tier
User Query
Cache miss…
Not meet accuracy requirement?
Gateway, Proxy
How to split the query plan?
Use Query Processing Engine…
Sensor Nodes
26
Distributed Data Management
Querying the Sensor Tier
Number of cars over past half hour?
Proxy Cache of Image Summaries
01
02
03
04
Gateway, Proxy
04
01
03
02
Sensor Nodes
Partially process the query at the proxy! 27
Distributed Data Management
Querying the Sensor Tier
Average temperature between PM 1:00 - 2:00
Proxy Cache of Data Summaries
Gateway, Proxy
20.0 ˚C
19.5 ˚C
18.6 ˚C
21.1
˚C
19.2 ˚C
20.2 ˚C
…
average temperature every two hours
01
04
03
02
Sensor Nodes
Refine the result at the sensor node!
28
Current Status and Conclusions
Implemented Capsule
Other related systems
Flash-based object store
Energy-efficient data structure (lists, arrays, trees)
Currently extended with summarization, aging, and
partitioned indexing.
TSAR: separate data from metadata
PRESTO: implement a proxy cache
No running system for StonesDB architecture.
29
References
The STONES Project
STONES
CAPSULE
http://sensors.cs.umass.edu/projects/capsule/
PRESTO
http://sensors.cs.umass.edu/projects/stones/
http://presto.cs.umass.edu/
The Wireless Sensor Networks Group at UMASS
http://sensors.cs.umass.edu/index.shtml
30