V²LDB - University of Wisconsin–Madison

V²LDB
Boris Gelman
Vice President
Architecture
Information Services
VISA
V²LDB: The Concept
• V²LDB = Very Very Large Database:
  - New concept, or a change to the VLDB concept?
• Data Structure:
  - Petabyte tables with 100s of billions of rows
  - Complex table structures
  - Non-uniform physical data representation of petabyte tables
• Query:
  - Well-defined subsets (index and/or partition) on tables: small (~10,000) -> medium (~300,000) -> large (~1,000,000)
  - Undefined subsets: very large (~1,000,000,000) -> very very large (~100,000,000,000)
  - Complex joins
  - Complex group-bys and sorts
• Workload:
  - Multiple categories of queries running concurrently (transaction research, analytics, data mining)
  - Inserts and selects running concurrently against the same tables
  - 24 * 7 operation with very limited maintenance windows
  - SLAs are very strict
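
To give a feel for the gap between the well-defined and the undefined subsets listed above, here is a back-of-envelope sketch. The subset sizes come from the slide and are read here as row counts; the worker count and per-worker scan rate are assumed figures chosen only for illustration, not measurements from the source.

```python
# Idealized scan-time estimate for the subset sizes named on the slide.
# WORKERS and ROWS_PER_SEC_PER_WORKER are assumptions, not source data.
SUBSET_ROWS = {
    "small": 10_000,
    "medium": 300_000,
    "large": 1_000_000,
    "very large": 1_000_000_000,
    "very very large": 100_000_000_000,
}
WORKERS = 100                        # assumed degree of parallelism
ROWS_PER_SEC_PER_WORKER = 1_000_000  # assumed scan rate per worker


def scan_seconds(rows, workers=WORKERS, rate=ROWS_PER_SEC_PER_WORKER):
    """Time to touch `rows` rows, assuming perfect parallel speed-up."""
    return rows / (workers * rate)


if __name__ == "__main__":
    for name, rows in SUBSET_ROWS.items():
        print(f"{name:>16}: {rows:>15,} rows  ~{scan_seconds(rows):,.4f} s")
```

Under these assumed rates the largest subset takes about 1,000 seconds even with perfect parallelism, while the well-defined subsets finish in milliseconds. That is one way to read why undefined subsets sit uneasily with strict SLAs.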
V²LDB: Problems
• Data Partitioning:
  - Smart partitioning: hash, expression, … -> hybrid multi-level partitioning
  - Smart partition manipulation: detach / attach partition online
• Query Execution:
  - Hash join on petabyte tables?
• Performance Tuning does not work:
  - Adaptive and buffer-pool-aware query optimization?
  - System-category-aware query optimization?
  - Optimizer efficiency?
• Backup/Restore does not work:
  - Data replication is not a substitute for backup: data corruption, application errors, human errors
  - Smart backup/restore related to smart data partitioning!
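
The partitioning bullets above can be made concrete with a minimal in-memory sketch of hybrid two-level partitioning: an expression on the date at the first level, a hash of a key at the second. It also shows detach / attach of a whole first-level partition. The class `PartitionedTable`, the column names, and `HASH_BUCKETS` are hypothetical, introduced only for illustration; a real system would do this inside the storage engine.

```python
import zlib
from datetime import date

HASH_BUCKETS = 4  # assumed number of second-level hash partitions


def partition_key(txn_date: date, account_id: str) -> tuple:
    """Level 1: expression on the date (year-month). Level 2: hash bucket."""
    month = f"{txn_date.year:04d}-{txn_date.month:02d}"
    bucket = zlib.crc32(account_id.encode("utf-8")) % HASH_BUCKETS
    return (month, bucket)


class PartitionedTable:
    """Hypothetical table stored as a dict of (month, bucket) partitions."""

    def __init__(self):
        self.partitions = {}

    def insert(self, txn_date, account_id, amount):
        key = partition_key(txn_date, account_id)
        self.partitions.setdefault(key, []).append((txn_date, account_id, amount))

    def select(self, month, account_id):
        """Partition pruning: read only the one partition that can match."""
        bucket = zlib.crc32(account_id.encode("utf-8")) % HASH_BUCKETS
        rows = self.partitions.get((month, bucket), [])
        return [r for r in rows if r[1] == account_id]

    def detach(self, month):
        """Remove every partition of one month; other months stay queryable."""
        detached = {k: v for k, v in self.partitions.items() if k[0] == month}
        for key in detached:
            del self.partitions[key]
        return detached

    def attach(self, parts):
        """Re-attach detached partitions; refuse to overwrite existing ones."""
        clash = [k for k in parts if k in self.partitions]
        if clash:
            raise ValueError(f"partitions already attached: {clash}")
        self.partitions.update(parts)


if __name__ == "__main__":
    table = PartitionedTable()
    table.insert(date(2024, 1, 5), "acct-1", 120)
    table.insert(date(2024, 2, 7), "acct-1", 80)
    january = table.detach("2024-01")   # e.g. hand this unit to a backup job
    print(table.select("2024-02", "acct-1"))
    table.attach(january)
    print(table.select("2024-01", "acct-1"))
```

In this sketch the detached month is a self-contained unit. That is one possible reading of "smart backup/restore related to smart data partitioning": back up or restore one partition at a time instead of the whole table.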
V²LDB: Problems
• Database Federation:
  - A single database system cannot hold a combination of ODS (> 1 PB) and cross-functional multi-subject DW (> 200 TB); it is impractical
  - Data Abstraction Layer: federated tables partitioned across multiple database systems!
  - A federated database is easier to maintain and back up, and availability is higher!
  - Federated Database Performance = Single Database System Performance!!!
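
As a rough illustration of the data abstraction layer idea above, the sketch below presents one logical table whose rows are spread across several member databases. In-memory SQLite databases stand in for the member systems. The class `FederatedTable`, the `txn` schema, and the hash-based routing rule are all assumptions made for this example, not the design described in the slides.

```python
import sqlite3
import zlib


class FederatedTable:
    """Hypothetical abstraction layer: one logical table over many databases."""

    def __init__(self, member_count):
        # Each in-memory SQLite database stands in for a separate system.
        self.members = [sqlite3.connect(":memory:") for _ in range(member_count)]
        for db in self.members:
            db.execute("CREATE TABLE txn (account_id TEXT, amount INTEGER)")

    def _member_for(self, account_id):
        """Route a key to exactly one member by hashing it (assumed rule)."""
        index = zlib.crc32(account_id.encode("utf-8")) % len(self.members)
        return self.members[index]

    def insert(self, account_id, amount):
        db = self._member_for(account_id)
        db.execute("INSERT INTO txn VALUES (?, ?)", (account_id, amount))
        db.commit()

    def total_for(self, account_id):
        """Keyed query: only the member that owns the key is contacted."""
        row = self._member_for(account_id).execute(
            "SELECT COALESCE(SUM(amount), 0) FROM txn WHERE account_id = ?",
            (account_id,),
        ).fetchone()
        return row[0]

    def total_all(self):
        """Unkeyed query: fan out to all members and merge partial sums."""
        return sum(
            db.execute("SELECT COALESCE(SUM(amount), 0) FROM txn").fetchone()[0]
            for db in self.members
        )


if __name__ == "__main__":
    table = FederatedTable(member_count=3)
    table.insert("acct-1", 10)
    table.insert("acct-1", 5)
    table.insert("acct-2", 7)
    print(table.total_for("acct-1"), table.total_all())
```

Callers see a single table. Keyed queries touch one member, while unkeyed aggregates are computed per member and merged. Whether such a layer can match single-system performance, as the last bullet demands, depends on how much work can be pushed down to the members.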