V²LDB - University of Wisconsin–Madison
V²LDB
Boris Gelman
Vice President
Architecture
Information Services
VISA
[email protected]
V²LDB: The Concept
V²LDB = Very Very Large Database:
New concept or change to VLDB concept?
Data Structure:
Petabyte tables with hundreds of billions of rows
Complex table structures
Non-uniform physical data representation of petabyte tables
Query:
Well-defined subsets (index and/or partition) on tables: small (~10,000) -> medium (~300,000) -> large (~1,000,000)
Undefined subsets: very large (~1,000,000,000) -> very very large (~100,000,000,000)
Complex joins
Complex group by’s and sorts
Workload:
Multiple categories of queries running concurrently (transaction research, analytics, data mining)
Inserts and selects concurrently against the same tables
24 * 7 operation with very limited maintenance windows
SLAs are very strict
V²LDB: Problems
Data Partitioning:
Smart partitioning: hash, expression, … -> hybrid multi-level partitioning
Smart partition manipulation: detach / attach partition online
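The two partitioning bullets above can be illustrated with a small in-memory sketch. It assumes a two-level scheme (an expression on the transaction date picks a monthly partition, then a hash of the account id picks a sub-partition) and models detach / attach as moving a whole monthly partition in or out of the table. The names `partition_for`, `PartitionedTable`, and `HASH_BUCKETS` are hypothetical and not taken from the slides or from any database product.

```python
import hashlib
from datetime import date

HASH_BUCKETS = 8  # assumed number of hash sub-partitions per month


def partition_for(txn_date, account_id):
    """Return (month partition, hash sub-partition) for one row."""
    # Level 1: expression partitioning on the transaction date.
    month = f"{txn_date.year:04d}-{txn_date.month:02d}"
    # Level 2: hash partitioning on the account id (stable across runs).
    digest = hashlib.sha256(account_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % HASH_BUCKETS
    return month, bucket


class PartitionedTable:
    """Toy table: month -> {bucket -> list of rows}."""

    def __init__(self):
        self.partitions = {}

    def insert(self, txn_date, account_id, amount):
        month, bucket = partition_for(txn_date, account_id)
        buckets = self.partitions.setdefault(month, {})
        buckets.setdefault(bucket, []).append((txn_date, account_id, amount))

    def detach(self, month):
        # Remove a whole month without touching any other partition.
        return self.partitions.pop(month)

    def attach(self, month, buckets):
        # Re-attach (or load) a month prepared outside the table.
        if month in self.partitions:
            raise ValueError(f"partition {month} is already attached")
        self.partitions[month] = buckets
```

In this sketch a detach is a single dictionary operation, which is the property the slide asks for: partition maintenance should not require scanning or locking the rest of the table.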
Query Execution:
Hash join on petabyte tables?
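The slide only raises the question. For context, one textbook way to keep a hash join's build side small is a Grace-style partitioned hash join: both inputs are first split by a hash of the join key, and each pair of matching partitions is then joined independently. The sketch below is an in-memory illustration of that technique, not a method proposed in the deck; the function name and the `num_partitions` parameter are assumptions.

```python
from collections import defaultdict


def partitioned_hash_join(left, right, key_left, key_right, num_partitions=4):
    """Join two lists of tuples on left[key_left] == right[key_right]."""
    # Phase 1: split both inputs by a hash of the join key, so that
    # matching rows always land in the same partition number.
    left_parts = defaultdict(list)
    right_parts = defaultdict(list)
    for row in left:
        left_parts[hash(row[key_left]) % num_partitions].append(row)
    for row in right:
        right_parts[hash(row[key_right]) % num_partitions].append(row)

    # Phase 2: join one partition pair at a time; only one partition's
    # hash table has to be held in memory at any moment.
    result = []
    for p in range(num_partitions):
        build = defaultdict(list)
        for row in left_parts.get(p, []):
            build[row[key_left]].append(row)
        for row in right_parts.get(p, []):
            for match in build.get(row[key_right], []):
                result.append((match, row))
    return result
```

At petabyte scale the partitions would live on disk or on separate nodes, and whether the partitioning pass itself is affordable is exactly the open question the slide poses.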
Performance Tuning does not work:
Adaptive and buffer-pool aware query optimization?
System-category aware query optimization?
Optimizer efficiency?
Backup/Restore does not work:
Data replication is not a substitute for backup: data corruption, application errors, human errors
Smart backup/restore related to smart data partitioning!
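One way to read "smart backup/restore related to smart data partitioning" is that backup and restore operate on individual partitions instead of the whole table. The sketch below assumes that reading: it copies only partitions whose content checksum changed since the last run and restores a single damaged partition. The dictionaries standing in for the table, the backup store, and the catalog, as well as all function names, are hypothetical.

```python
import hashlib
import json


def checksum(rows):
    """Content hash of one partition (rows must be JSON-serializable)."""
    payload = json.dumps(rows, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def incremental_backup(partitions, catalog, store):
    """Copy only partitions that changed since the previous backup."""
    backed_up = []
    for name, rows in partitions.items():
        digest = checksum(rows)
        if catalog.get(name) != digest:
            # A deep copy stands in for writing the partition to backup media.
            store[name] = json.loads(json.dumps(rows))
            catalog[name] = digest
            backed_up.append(name)
    return backed_up


def restore_partition(partitions, store, name):
    """Restore one partition without touching the others."""
    partitions[name] = json.loads(json.dumps(store[name]))
```

A short usage example: after a full first pass, only the modified month is copied again, and a corrupted month can be restored on its own.

```python
table = {"2024-01": [[1, 10.0]], "2024-02": [[2, 5.0]]}
catalog, store = {}, {}
incremental_backup(table, catalog, store)   # copies both partitions
table["2024-02"].append([3, 7.5])
incremental_backup(table, catalog, store)   # copies only "2024-02"
table["2024-01"] = []                       # simulated application error
restore_partition(table, store, "2024-01")
```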
V²LDB: Problems
Database Federation:
A single database system cannot hold a combination of an ODS (> 1 PB) and a cross-functional multi-subject DW (> 200 TB) - it is impractical
Data Abstraction Layer: federated tables partitioned across multiple database systems!
Federated Database is easier to maintain and backup, and availability is higher!
Federated Database Performance = Single Database System Performance!!!
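The data abstraction layer described above can be sketched as a federated table that hides several member database systems behind one insert/select interface. The sketch assumes rows are routed to a member by a hash of a partitioning key and that queries fan out to every member and merge the results. `MemberDatabase` and `FederatedTable` are hypothetical names, and each member is an in-memory list standing in for a real database system.

```python
import zlib


class MemberDatabase:
    """Stand-in for one physical database system in the federation."""

    def __init__(self, name):
        self.name = name
        self.rows = []

    def insert(self, row):
        self.rows.append(row)

    def select(self, predicate):
        return [row for row in self.rows if predicate(row)]


class FederatedTable:
    """One logical table partitioned across several member systems."""

    def __init__(self, members, key):
        self.members = members
        self.key = key

    def _member_for(self, row):
        # crc32 gives a routing hash that is stable across processes.
        digest = zlib.crc32(str(row[self.key]).encode("utf-8"))
        return self.members[digest % len(self.members)]

    def insert(self, row):
        self._member_for(row).insert(row)

    def select(self, predicate):
        # Fan the query out to every member and merge the partial results.
        result = []
        for member in self.members:
            result.extend(member.select(predicate))
        return result
```

In this toy version the fan-out is sequential. Meeting the slide's goal of federated performance equal to single-system performance would at minimum require running the member queries in parallel and pruning members that cannot hold matching rows.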