Multiprocessor Scheduling - University of Massachusetts


Storage Systems and Sensor Storage Research Overview
Computer Science
Storage Research Overview
• Hyperion
– High volume stream archival system
• Bandwidth-efficient data migration in enterprise storage systems
• Use of flash storage in data centers
Hyperion Stream Store
[Usenix 2007]
• Streaming data is common in environments such as network monitoring, system monitoring, sensors, and RFID
– Archive data for retrospective querying and forensics
• Hyperion: high-volume stream archival for distributed network monitoring
– Gigabit link: 250K packets per second
– Archive and index in real time while supporting interactive querying
– Neither commodity RDBMS nor general-purpose file systems are suitable
Hyperion Design
• Multiple monitor nodes, each monitoring multiple network links
• StreamFS: high-performance stream file system
• Local index: multi-level signature index based on Bloom filters
• Distributed index for querying multiple nodes
• Can scale to a million pkts/s with StreamFS and 200K pkts/s indexing per core on a commodity multi-core PC
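The local index described above can be illustrated with a small sketch. This is not Hyperion's implementation: it assumes one Bloom-filter signature per fixed-size block of records and shows only a single level of the index. The class names, filter size, hash count, and block size are all hypothetical. The point is that a query checks each block's signature first and scans only blocks that might hold the key.

```python
import hashlib


class BloomSignature:
    """Fixed-size Bloom filter summarizing the keys stored in one block."""

    def __init__(self, num_bits=1024, num_hashes=4):
        self.num_bits = num_bits      # assumed size, not from the slides
        self.num_hashes = num_hashes  # assumed hash count
        self.bits = 0                 # Python int used as a bit array

    def _positions(self, key):
        # Derive num_hashes bit positions from salted SHA-256 digests.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits |= 1 << pos

    def might_contain(self, key):
        # True may be a false positive; False is always correct.
        return all((self.bits >> pos) & 1 for pos in self._positions(key))


class SignatureIndex:
    """One signature per block of records (block_size is an assumption)."""

    def __init__(self, block_size=4):
        self.block_size = block_size
        self.blocks = []      # list of lists of (key, payload)
        self.signatures = []  # parallel list of BloomSignature

    def append(self, key, payload):
        if not self.blocks or len(self.blocks[-1]) == self.block_size:
            self.blocks.append([])
            self.signatures.append(BloomSignature())
        self.blocks[-1].append((key, payload))
        self.signatures[-1].add(key)

    def query(self, key):
        """Return matching payloads, scanning only candidate blocks."""
        results = []
        for block, sig in zip(self.blocks, self.signatures):
            if sig.might_contain(key):
                results.extend(p for k, p in block if k == key)
        return results
```

A multi-level variant would presumably add coarser signatures that each summarize several blocks, so that a query can skip whole groups at once; that layering is not modeled here.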
[Figure: Hyperion node with monitor/capture, signature index, and StreamFS components, plus a distributed index]
Online Data Migration
[ICAC 06]
• Enterprise storage systems: multiple volumes mapped onto each array
– Load imbalances and hotspots can occur
• Goal: automatically resolve hotspots on volumes in large storage systems
• Focus: minimize migration cost (bytes migrated to resolve a hotspot)
• Bandwidth-to-space ratio algorithm
– Displace and swap of volumes
– Implemented in Linux LVM
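The slide names a bandwidth-to-space ratio algorithm without spelling it out. As a rough illustration of why such a ratio helps minimize bytes migrated, the sketch below greedily picks the volumes with the most bandwidth per byte until a hot array drops below a load threshold. The `Volume` fields, the `capacity` threshold, and the greedy rule are assumptions made for this example, and the displace and swap steps are not modeled.

```python
from dataclasses import dataclass


@dataclass
class Volume:
    name: str
    bandwidth: float  # observed load, e.g. MB/s (hypothetical unit)
    size: float       # bytes that must move if the volume is migrated


def pick_volumes_to_migrate(volumes, capacity):
    """Greedily choose volumes to move off an overloaded array.

    Volumes are considered in decreasing bandwidth-to-space ratio, so
    each migrated byte removes as much load as possible. Returns the
    chosen volumes and the array's remaining load.
    """
    load = sum(v.bandwidth for v in volumes)
    chosen = []
    by_ratio = sorted(volumes, key=lambda v: v.bandwidth / v.size, reverse=True)
    for vol in by_ratio:
        if load <= capacity:
            break  # hotspot resolved; stop migrating
        chosen.append(vol)
        load -= vol.bandwidth
    return chosen, load
```

For example, with volumes of (bandwidth, size) = (50, 100), (40, 10), and (30, 300) and a capacity of 90, the sketch moves only the second volume: it sheds 40 units of load for 10 units of data.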
Semantic-aware Replication
• Replication for disaster recovery: synchronous replication for tight recovery point objectives
– Latency increases with geographic separation
– Use of intermediary does not improve consistency
– Too stringent for certain applications
• Semantic-aware replication: hybrid approach
– Use synchronous replication for “important” writes
– Use asynchronous replication for other writes
– Automatically infer which mode to use for each request
– Transparent to applications
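The hybrid approach above can be pictured as a write dispatcher. The slide does not say how importance is inferred, so the sketch below takes a caller-supplied predicate, `is_important`, as a stand-in for that step. All names are hypothetical, the remote site is simulated by an in-memory dict, and flushing queued writes before a synchronous write is an ordering choice made for this example only.

```python
from collections import deque


class HybridReplicator:
    """Route each write to synchronous or asynchronous replication."""

    def __init__(self, is_important):
        self.is_important = is_important  # placeholder for the inference step
        self.local = {}
        self.remote = {}        # stands in for the remote replica
        self.pending = deque()  # writes not yet shipped to the replica

    def write(self, key, value):
        self.local[key] = value
        if self.is_important(key, value):
            # Synchronous path: ship earlier queued writes first (an
            # assumption of this sketch), then update the replica
            # before returning to the caller.
            self.flush()
            self.remote[key] = value
            return "sync"
        # Asynchronous path: return immediately, replicate later.
        self.pending.append((key, value))
        return "async"

    def flush(self):
        """Ship queued asynchronous writes to the replica in order."""
        while self.pending:
            key, value = self.pending.popleft()
            self.remote[key] = value
```

Because callers use a single `write` method, the choice of mode stays hidden from the application, which matches the transparency goal on the slide.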
Flash Storage in Data Centers
• Flash-based storage becoming popular
– Higher performance but also higher cost than disk drives
• How can flash storage be exploited in data centers?
• Use flash drives as an accelerator between disk storage and servers
– Focus on video storage where performance is key
• Exploit flash disk as non-volatile storage in servers
– Fast hibernate / resume => efficient power management in data centers
Sensor Storage Overview
• Flash memory becoming extremely energy-efficient
• Exploit flash memory trends to design more efficient in-network sensor storage and querying systems
– Capsule: flash-based object storage system
– STONES: storage-centric sensor networks
[Figure: energy cost (uJ/byte) of communication (CC1000, CC2420) and storage (Atmel NOR, Telos STM NOR, Micron NAND 128MB) by generation of sensor platform]
Capsule Overview
[SenSys 06]
• Object-based storage abstraction
• Energy and memory optimized library of objects
• Checkpointing and rollback for failure recovery
• Storage reclamation to deal with finite storage capacity
• Portable to NAND/NOR flash memories and different sensor platforms
StonesDB Overview
[CIDR 07]
• StonesDB: flash memory-optimized archival data management architecture that supports sensor data storage, indexing, and aging of data
[Figure: StonesDB architecture, including the Query Engine and Partitioned Access Methods]
Extra Slides
Mapping App Data Needs to Storage
[Figure: application data needs (debug logs, calibration tables, data archival and indexing, signal processing, packet queue, data processing) mapped to Capsule objects (Stream, File, Index, Array, Queue, Stack) stored as pages on flash]
Map application data structures to Capsule objects that offer efficient flash implementation
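To make the idea of a flash-friendly object concrete, here is a minimal sketch of a stack kept in an append-only log, where each record stores the log offset of the previous top so that no record is ever rewritten in place. This is an illustration under stated assumptions, not Capsule's code: the `FlashLog` and `LogStack` names are hypothetical, and a Python list stands in for flash pages.

```python
class FlashLog:
    """Append-only log; a list stands in for write-once flash pages."""

    def __init__(self):
        self._records = []

    def append(self, record):
        self._records.append(record)
        return len(self._records) - 1  # offset of the new record

    def read(self, offset):
        return self._records[offset]


class LogStack:
    """Stack whose elements live in the log and link to the previous top."""

    def __init__(self, log):
        self.log = log
        self.top = None  # offset of the top record; the only state kept in RAM

    def push(self, value):
        # Each record carries a back pointer, so no in-place update is needed.
        self.top = self.log.append((value, self.top))

    def pop(self):
        if self.top is None:
            raise IndexError("pop from empty stack")
        value, previous = self.log.read(self.top)
        self.top = previous  # the old record is left behind as reclaimable garbage
        return value
```

Popped records stay in the log until something reclaims them, which is one reason a storage reclamation mechanism matters when capacity is finite.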
Local Data Management Stack
Distributed Data Management Stack
STONES
• Design an archival data management architecture that:
– Supports energy-efficient sensor data storage, indexing, and aging by optimizing for flash memories.
– Supports energy-efficient processing of SQL-type queries, as well as data mining and search queries.
– Is configurable to heterogeneous sensor platforms with different memory and processing constraints.
Technology Trends in Storage
[Figure: energy cost (uJ/byte) of communication (CC1000, CC2420) and storage (Atmel NOR, Telos STM NOR, Micron NAND 128MB) by generation of sensor platform]