Transcript ppt

Cost-effective Hybrid Storages
Flash Group, Cao Qingling
Motivation
• Dry up things with the least money!
Motivation
• High cost, low density, low reliability.
• Replacement as HDD is not recommended, especially
when the data volume is very large.
• Cache as HDD, make up the gap between RAM and
HDD. Hurt the lifetime.
• A permanent store at the same level as HDD, store
some special data.
Introduction
• Put forward a Hybrid Hbase with SSD.
• Storing system component of Hbase in SSD,
which at the same level as HDD.
• Perform quantitative assessment, Hybrid
Hbase perform 1.5-2 times better.
HBase
• Column-based key-value
store.
• Each region server has a
write-ahead log(WAL).
• First write WAL and then
in-memory memstore.
• Region is a horizontal
division.
• A region could split.

Data on disks is stored as Log-structured merge(LSM) trees.
HBase System Component
Zookeeper:
• Clients contact it for -ROOT- table.
• Master contacts it to know available region servers.
• Region servers contact with it in a heartbeat keepalive mechanism.
• Zookeeper is I/O intensive.
Catalog Tables:
• -ROOT- and .META. Tables.
• Mostly read intensive and are not updated frequently.
HBase System Component
Write-ahead-log(WAL):
• Any write is first done on the WAL.
• The size grows with: i) WAL committed; ii) write rate;
iii) the size of key-value pair.
Temporary Storage:
• Used when a region is split or merged.
• Sequentially read or write.
Assessment
Price: 1:10
1% of the database size. Gain more than 10% performance.
Experimental Evaluation
• Experiment: Intel processor(4 cores and 4
threads at 3 GHz) with 8 GB RAM, Western
Digital 1TB HDD, Kingston 128 GB SSD.
• Yahoo! Cloud Serving Benching(YCSB).
• Workloads: 100w queries on database with
6000w records. Record size is 1KB. Totally 72
regions.
Experimental Evaluation
Experimental Evaluation
Introduction
• Approximate membership query data
structure(AMQ). Bloom Filter.
• Larger than RAM, performance decays.
• Quotient Filter:better data locality, squential
operations, available delete, dynamically
resized, space-saving.
• Buffered Quotient Filter(BQF) and Cascade
Filter(CF) designed for flash.
Introduction
• Approximate membership query data
structure(AMQ). Bloom Filter.
• Larger than RAM, performance decays.
• Quotient Filter:better data locality, squential
operations, available delete, dynamically
resized, space-saving.
• Buffered Quotient Filter(BQF) and Cascade
Filter(CF) designed for flash.
Quotient Filter
 Fingerprint: f = fq2r + fr.
• fr = f mod 2r
• fq =
• T[fq] = fr
Quotient Filter
run
Physical Storage
• is_occupied: check if fq = i, namely if T[i] has data.
• is_shifted: if fr belongs to slot i.
• is_continuation: if blongs to the same run with i-1.
Quotient Filter
• Check if f in the QA:
step1:
step2: to the beginning of the cluster.
step3: to the start of the run.
step4: search f.
• Insert a f.
• Delete a f.
Quotient Filters on Flash
• Buffered Quotient Filter
- BQF: one QF as the buffer, another on SSD.
- Optimized for lookup performance.
• Cascade Filter
- Optimized for insertion.
- Offer a trade off between lookup and insertion.
Quotient Filter on Flash
• Cascade Filter
- Based on cache-oblivious lookahead arrary(COLA).
Evaluation
Conclusions
• Bloom Filter has wide use in key-value storage.
• Change the way of thinking.
• Gain inspiration from traditional algorithms of
database.
• Design corresponding hybrid system by
applications.
Bloom Filter
• Initial state:
• Insert: H(1), H(b).
• Can not expend, support no delete, poor data
locality.