Vortragstitel
Download
Report
Transcript Vortragstitel
A Hybrid Row-column OLTP Database
Architecture for Operational Reporting
Jan Schaffner, Anja Bog, Jens Krüger,
Alexander Zeier
Agenda
2
■ Operational Reporting
■ Related Work
■ Architecture of Hybrid System
■ Virtual Cube
■ Outlook and Discussion
August 24, 2008 | A Hybrid Row-column OLTP Database Architecture
Operational Reporting
3
Dinstinction according to Inmon:
■ Informational Reporting
□ Supports long-term, strategic decisions
□ Summarized data
□ Long-term horizons
Typically done using a data warehouse (DW)
■ Operational Reporting
□ Supports day-to-day decisions
□ Data on a more detailed level
□ Takes up-to-the-minute data into account
Done using a DW or an OLTP system?
August 24, 2008 | A Hybrid Row-column OLTP Database Architecture
Operational Reporting (contd.)
4
■ Using a DW for Operational Reporting
□ DW must be designed to the same level of granularity
as the OLTP systems huge data volumes
□ Updates are required to frequently be replicated into the
DW endless optimization
■ Using an OLTP Store for Operational Reporting
□ Operational reporting queries are relatively long-running in
comparison to pure OLTP workloads
□ Resource contention:
Locks of long-running queries block the short-running ones
□ Different data model:
Not optimized for reporting (i.e. no star-schema)
August 24, 2008 | A Hybrid Row-column OLTP Database Architecture
Common Data Warehouse Architecture
5
■ DW contains ETL processor which
□ ...extracts data from various OLTP sources into a staging area
□ ...applies transformations for cleansing and integration
□ ...stores data in a dimensional layout
■ OLAP engine runs queries against
dimensional data store
August 24, 2008 | A Hybrid Row-column OLTP Database Architecture
“Real-Time” DW Architectures
6
■ Microbatch
□ Configure ETL process to run in very short intervals
□ Up-to-date data but very resource intensive
■ Push Architectures
□ Handling of deltas on a business or database transaction level
□ Up-to-date data but still resource intensive
■ Operational Data Store (ODS)
□ Store copy of the OLTP data using an integrated schema
□ High data granularity but no up-to-date data
August 24, 2008 | A Hybrid Row-column OLTP Database Architecture
“Real-Time” DW Architectures (contd.)
7
■ ELT
□ Data is extracted from the OLTP sources and loaded into the ODS
□ Transformations are done in the warehouse at query-runtime
□ High granularity (transactional data) but no up-to-date data
■ Virtual ODS
□ Virtual in the sense that queries are redirected against OLTP system
□ High granularity (transactional data) and up-to-date data
□ Performs ETL on-the-fly
□ Affects performance of OLTP system
August 24, 2008 | A Hybrid Row-column OLTP Database Architecture
Column-Stores: New “Trend” for OLAP
8
□ Column-store databases:
◊ Vertical fragmentation
◊ Fast aggregations (sum, min, max, avg, …)
more flexibility for ad-hoc reporting
◊ Each column can be compressed individually
□ Both disk-based …
◊ Vertica
c1
◊ Greenplum
□ … and in-memory:
c2
c3
sID
c1
sID
c2
sID
c3
v11 v12 v13
1
v11
1
v12
1
v13
v21 v22 v23
2
v21
2
v22
2
v23
v31 v32 v33
3
v31
3
v32
3
v33
row-oriented
column-oriented
◊ SAP BIA
◊ MonetDB
◊ Exasol
August 24, 2008 | A Hybrid Row-column OLTP Database Architecture
Encoding Schemes
9
Few distinct values
Ordered
Unordered
Sequence of triples:
• value
• offset position
• # occurrences
Sequence of tuples:
• value
• bitmap for positional
occurence
Many distinct values
Delta representation
?
August 24, 2008 | A Hybrid Row-column OLTP Database Architecture
Architecture of Hybrid System
10
■ Essentially integration between row- and column store DBs
■ MaxDB is used as the row store
□ Database underlying SAP Business ByDesign
□ Supports ACID transactions
■ TREX is used as the column store
□ Main memory
□ Engine underlying SAP BIA
□ Has a copy of (some of) the OLTP data
□ Primary OLTP system and main-memory database (MMDB) are
governed using a single resource manager
August 24, 2008 | A Hybrid Row-column OLTP Database Architecture
Architecture of Hybrid System (contd.)
11
August 24, 2008 | A Hybrid Row-column OLTP Database Architecture
Virtual Cube
■ Similar architecture as virtual ODS
□ Virtual Cube provides the same interface
as a typical cube (slice, dice, drill-down, …)
□ Virtual Cube rewrites queries and issues
them against the MMDB (TREX in our case)
□ TREX has a copy of the OLTP data
□ Primary OLTP system and MMDB are
tied together as described above
August 24, 2008 | A Hybrid Row-column OLTP Database Architecture
Outlook
13
■ Build a “real” hybrid database in-memory as part of ChunkyStore
■ Data can be stored as either:
□ Rows
□ Columns
□ Chunks (adjacent fragments of rows and columns)
■ DB decides which physical storage alternative is most suitable
■ Main-memory implementation will cater for fast updates
as well as fast operational reporting capabilities
August 24, 2008 | A Hybrid Row-column OLTP Database Architecture
Thank you
14
■ Questions?
August 24, 2008 | A Hybrid Row-column OLTP Database Architecture