How to Use the PowerPoint Template
Oracle Database In-Memory
Tirthankar Lahiri
Vice President
Oracle Data Technologies and TimesTen
Vineet Marwah
Senior Director
Oracle Data Technologies
Copyright © 2014, Oracle and/or its affiliates. All rights reserved. |
Oracle Database In-Memory Goals
• 100x: Orders of Magnitude Faster Analytics
• 2x-10x: Accelerate Mixed Workload OLTP
• Simple: No Changes to Applications
• Cost Effective
What is a Controversy?
“A discussion marked especially by the expression of opposing views”
Merriam-Webster
Decades-Long Controversy in Database Systems
• Row Format vs. Column Format
Row Format Databases vs. Column Format Databases
Row format (SALES table)
• Transactions run faster on row format
– Example: Insert or query a sales order
– Fast processing for few rows, many columns
Column format (SALES table)
• Analytics run faster on column format
– Example: Report on sales totals by region
– Fast accessing few columns, many rows
Until now, you had to choose one format and suffer the tradeoffs
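The tradeoff on this slide can be sketched in a few lines of Python. This is only an illustration with made-up SALES rows; the table contents and the helper names `fetch_order` and `totals_by_region` are assumptions, not Oracle internals. The row layout keeps each order together, so reading one order is one lookup, while the column layout keeps each column contiguous, so a report by region reads only two columns.

```python
# Hypothetical SALES data; values are illustrative only.
rows = [
    {"order_id": 1, "region": "West", "amount": 100},
    {"order_id": 2, "region": "East", "amount": 250},
    {"order_id": 3, "region": "West", "amount": 50},
]

# Column format: one contiguous list per column, in the same row order.
columns = {name: [r[name] for r in rows] for name in rows[0]}


def fetch_order(row_store, position):
    """Row format: a whole order (all columns) is one lookup."""
    return row_store[position]


def totals_by_region(column_store):
    """Column format: the report reads only the region and amount columns."""
    totals = {}
    for region, amount in zip(column_store["region"], column_store["amount"]):
        totals[region] = totals.get(region, 0) + amount
    return totals
```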
Oracle Database In-Memory: Dual Format Architecture
Diagram: the SALES table is held both in the existing buffer cache (row format) and in the new In-Memory format (column format)
• BOTH row and column formats for same table
• Simultaneously active and consistent
• OLTP uses existing row format
• Analytics uses new In-Memory Column format
In-Memory Columnar Format
Pure In-Memory Columnar
• Highly optimized compressed columnar format
• Pure in-memory format:
• Cheap to maintain – no logging or IO
• Allows efficient OLTP
• No changes to disk format
• Transparent to Applications
• Can be enabled for subset of database
Selective Enabling In-Memory Columnar Storage
• New INMEMORY clause
• Selectively enable in-memory storage
• Applies to Tables, Partitions, Sub-Partitions, Materialized Views, Tablespaces
• Can also exclude unneeded columns
• Coming: automatic in-memory placement
ALTER TABLE orders INMEMORY;
CREATE TABLE sales ……
PARTITION BY RANGE(date)
(PARTITION p1 …… INMEMORY,
PARTITION p2 …… NO INMEMORY);
ALTER TABLE accounts INMEMORY
NO INMEMORY (photo);
Populating the In-Memory Column Store
CREATE TABLE orders
(c1 number,
c2 varchar(20),
c3 number)
INMEMORY PRIORITY CRITICAL;
ALTER TABLE sales
INMEMORY PRIORITY MEDIUM;
ALTER TABLE accounts
INMEMORY PRIORITY NONE;
• In-Memory objects are populated in the background
• Always accessible via buffer cache in the meantime
• Queries do not block for populate
• Default behavior - initiate populate on first access
• Optional pre-populate via PRIORITY sub-clause
• Initiates populate without waiting for queries
• CRITICAL > HIGH > MEDIUM > LOW
• Controls order (not speed) of populate
In-Memory Column Store Performance Optimizations
• Vector Processing
• Software on Silicon
• Operation pushdown
• In-Memory Storage Index
• Elimination of Analytic Indexes
Vector Processing: Additional Advantage of Column Format
• Each CPU core scans only required columns
• SIMD vector instructions used to process multiple values in each instruction
– E.g. Intel AVX instructions with 256 bit vector registers
• Billions of rows/sec scan rate per CPU core
• Row format is millions/sec
• > 100x Faster
Example: Find all sales in state of CA
Diagram: the CPU loads multiple region values from the STATE column in memory into a vector register, then a vector compare checks all values against CA in 1 instruction
DBIM Software On Silicon
SPARC M7 On-Chip DBIM Accelerator/Compressor
Diagram: row-format DB data and compressed column-format DB data in MEMORY or L3$ feed up to 32 concurrent DB streams (bit/byte-packed, padded, indexed vectors) into the M7 in-silicon query engines, which produce up to 32 concurrent result streams
• SIMD Vector Instructions were originally designed for High Performance Computing and Graphics – not for Databases
• New SPARC M7 chip has 32 Database Acceleration Engines (DAX) on chip
• Like having 32 specialized cores for DBIM query processing
Oracle Confidential – Highly Restricted
Performance: Database In-Memory Acceleration Engines
Diagram: SPARC M7 with cores, shared cache, and 32 Database Accelerators (DAX)
• DAX includes specialized query functions for predicates, conversions, set membership tests, etc.
• Independently process streams of columns:
– E.g. find all values that match ‘California’
– 10x performance gains
– Up to 170 billion rows per second
Capacity: Database In-Memory Decompression Engines
• Compression increases in-memory capacity
• FOR QUERY mode compresses column values
– Fastest for queries: no decompression needed
• FOR CAPACITY mode compresses column bit patterns
– Data must be decompressed prior to access
– Uses custom Oracle Zip (OZIP): Superfast bit pattern decompressor
• DAX includes specialized OZIP decompression engine
• Run OZIP decompress at full memory speed, > 120 GB/sec
• Pipelines decompression and data processing (predicate evaluation, aggregation, comparisons, etc.) in hardware
• Doubles memory capacity with no performance penalty
Diagram: SPARC M7 cores and shared cache with 32 OZIP decompressors; 2x memory capacity
Operation Pushdown: Bloom Filter
Example: Find total sales in outlet stores
Diagram: the scan of Stores with Type=‘Outlet’ builds a Bloom filter on Store ID (StoreID in 15, 38, 64); the filter is applied to Store ID while scanning Sales, and Amount is summed
• Bloom filter created on dimension scan
• Bloom filter pushdown:
– Filtering pushed down to fact scan
– Returns only rows that are likely to be join candidates
• Joins tables 10x faster
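The Bloom filter pushdown on this slide can be sketched as follows. This is a minimal Python illustration under stated assumptions: the `BloomFilter` class, its sizes, the SHA-256 based hashing, and the sample Stores and Sales rows are all hypothetical, and the code is not how Oracle implements the feature. It shows the three steps: build the filter during the dimension scan, apply it during the fact scan, then let the join discard any false positives.

```python
import hashlib


class BloomFilter:
    """Tiny Bloom filter: may report false positives, never false negatives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = [False] * num_bits

    def _positions(self, key):
        # Derive num_hashes bit positions by salting a SHA-256 digest.
        for salt in range(self.num_hashes):
            digest = hashlib.sha256(f"{salt}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = True

    def might_contain(self, key):
        return all(self.bits[pos] for pos in self._positions(key))


# Hypothetical dimension rows (store_id, type) and fact rows (store_id, amount).
stores = [(15, "Outlet"), (20, "Mall"), (38, "Outlet"), (64, "Outlet")]
sales = [(15, 10.0), (20, 99.0), (38, 5.0), (64, 2.5), (15, 1.0)]

# 1. Dimension scan builds the filter from qualifying store IDs.
bloom = BloomFilter()
outlet_ids = set()
for store_id, store_type in stores:
    if store_type == "Outlet":
        bloom.add(store_id)
        outlet_ids.add(store_id)

# 2. Fact scan applies the filter first; survivors are only *likely* matches.
candidates = [(sid, amt) for sid, amt in sales if bloom.might_contain(sid)]

# 3. The join proper removes any false positives before summing.
total = sum(amt for sid, amt in candidates if sid in outlet_ids)
```

The filter never drops a true match, which is why it is safe to apply before the join; the exact join afterwards is what keeps the result correct.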
Operation Pushdown: Vector Group By
Example: Report sales of footwear in outlet stores
Diagram: scans of Products (Footwear) and Stores (Outlets) build an In-Memory Report Outline, which is filled in with amounts during the Sales scan
• Create (empty) in-memory report outline during dimension scan
• Push down report outline aggregation to fact scan
• Reduces complex aggregations to series of fast in-memory scans
• Reports run 10x faster
• Without predefined cubes
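The report-outline idea on this slide can be sketched roughly as below. This is a simplified Python illustration, not Oracle's implementation: the tables, the `build_key_vector` helper, and the nested-list "outline" are assumptions chosen to show the shape of the technique. Each dimension scan maps qualifying join keys to a dense group ID, an empty accumulator is created with one cell per group combination, and the fact scan adds each amount into its cell.

```python
# Hypothetical dimension and fact tables.
products = {1: "Footwear", 2: "Hats", 3: "Footwear"}  # product_id -> category
stores = {10: "Outlet", 11: "Mall", 12: "Outlet"}     # store_id -> type
# Each sale is (product_id, store_id, amount).
sales = [(1, 10, 20), (2, 10, 5), (3, 12, 30), (1, 11, 7), (3, 10, 1)]


def build_key_vector(dimension, wanted):
    """Dimension scan: map each qualifying join key to a dense group id."""
    groups = {}      # group value -> dense id
    key_vector = {}  # join key -> dense id
    for key, value in dimension.items():
        if value in wanted:
            key_vector[key] = groups.setdefault(value, len(groups))
    return key_vector, groups


product_kv, product_groups = build_key_vector(products, {"Footwear"})
store_kv, store_groups = build_key_vector(stores, {"Outlet"})

# Empty "report outline": one accumulator cell per (product group, store group).
outline = [[0] * len(store_groups) for _ in range(len(product_groups))]

# Fact scan: look up both key vectors and add into the matching cell.
for product_id, store_id, amount in sales:
    p = product_kv.get(product_id)
    s = store_kv.get(store_id)
    if p is not None and s is not None:
        outline[p][s] += amount
```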
In-Memory Storage Index
Example: Find stores with sales greater than $10,000
• In-Memory Compression Units (IMCU): unit of column store allocation, ~0.5 million rows
• Per-column min/max values
– Check predicates against min/max values
– Skip IMCU if predicate not satisfied
• Eliminates accessing unnecessary IMCUs
• Eliminates predicate evaluation when all values pass
Diagram: three IMCUs with Min $4000 / Max $7000, Min $8000 / Max $12000, and Min $13000 / Max $15000
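The min/max check on this slide can be sketched as below, using the three min/max ranges from the diagram and the $10,000 predicate from the example. This is a minimal Python illustration; the individual values inside each IMCU, the dictionary layout, and the function names are assumptions, not Oracle's data structures.

```python
# Hypothetical IMCUs, each holding one column's values plus its min/max.
def make_imcu(values):
    return {"min": min(values), "max": max(values), "values": values}


imcus = [
    make_imcu([4000, 5500, 7000]),
    make_imcu([8000, 12000, 9500]),
    make_imcu([13000, 15000, 14000]),
]


def scan_greater_than(imcus, threshold):
    """Return matching values plus counts of skipped, checked, and bulk-accepted IMCUs."""
    matches = []
    stats = {"skipped": 0, "evaluated": 0, "all_pass": 0}
    for imcu in imcus:
        if imcu["max"] <= threshold:
            # No value can match: skip the IMCU without reading its values.
            stats["skipped"] += 1
        elif imcu["min"] > threshold:
            # Every value matches: accept all without per-value evaluation.
            stats["all_pass"] += 1
            matches.extend(imcu["values"])
        else:
            # Range straddles the threshold: evaluate the predicate per value.
            stats["evaluated"] += 1
            matches.extend(v for v in imcu["values"] if v > threshold)
    return matches, stats


matches, stats = scan_greater_than(imcus, 10000)
```

With these ranges the first IMCU is skipped, the second is evaluated value by value, and the third is accepted whole, which covers both eliminations named in the bullets.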
Complex OLTP is Slowed by Analytic Indexes
Diagram: Table with 1–3 OLTP Indexes and 10–20 Analytic Indexes
• Most Indexes in complex OLTP (e.g. ERP) databases are only used for analytic queries
• Inserting one row into a table requires updating 10-20 analytic indexes: Slow!
• Indexes only speed up predictable queries & reports
OLTP is Slowed Down by Analytic Indexes
• Insert rate decreases as number of indexes increases
Chart x-axis: # of Fully Cached Indexes (Disk Indexes are much slower)
Column Store Replaces Analytic Indexes
Diagram: Table with 1–3 OLTP Indexes and the In-Memory Column Store
• Fast analytics on any columns
• Better for unpredictable analytics
• Less tuning & administration
• Column Store not persistent so update cost is much lower
• OLTP & batch run faster
Schneider In-Memory Compression
Schneider General Ledger Compression Factors
• Over 2 billion General Ledger Entries
– Query Low: 8.6x
– Query High: 13x
– Capacity Low: 16x
– Capacity High: 19x
Schneider Speedup Across 1545 Queries
• 7x to 128x faster
• 2 billion General Ledger Entries
• 1545 queries
– Originally took 34 hours to complete
– Combination of filter queries, aggregations and summations
Chart: Seconds per Query (0 to 100), Buffer Cache vs. IN-MEMORY, by million rows returned by query (2000M, 300M, 30M, 5M, 0.5M)
Schneider Transactions Speedup
• Data: Sales Accounts
• Main table has 1 Primary Key + 21 secondary indexes
• Test: 303 million transactions
• From 5x to 9x faster
Chart: Millions of transactions per day vs. millions of records in the target table; legend labels: Primary Index Only, Primary Index Plus In-Memory Columns, No Index, No Index + IM, Full Index, All Indexes
Schneider Storage and IO Reduction
Over 70% reduction in storage usage due to analytic index removal
Chart: Size in GBs (0 to 350) for Objects and Redo, with series TABLES & PK INDEXES and SECONDARY INDEXES
Scale-Out In-Memory Database to Any Size
• Scale-Out across servers to grow memory and CPUs
– DISTRIBUTE clause: by Partition, Sub-Partition, or Rowid Range
• In-Memory queries parallelized across servers to access local column data
• Direct-to-Wire InfiniBand protocol speeds messaging
Scale-Out In-Memory with Fault Tolerance
• Ability to duplicate IMCUs on another node
• Enabled via DUPLICATE subclause
• Application transparent
• Similar to storage mirroring
• Downtime eliminated by using duplicate after failure
Only Available on Engineered Systems
Conclusion: The End of a Controversy
Dual Format
In-Memory Database Technology Across Tiers
Application Tier: In-Memory Row Store
TimesTen In-Memory Database
• Embeddable In-Memory Database for Application Tier
• Primary use case: Latency-critical custom OLTP
– Microsecond Response Time
• Standalone Database or as Application-Tier Cache for Oracle Database
Database Tier: In-Memory Column Store
Oracle Database In-Memory
• Dual-Format In-Memory Database
• Primary use case: Real-Time Analytics on any source
– Billions of Rows/Sec analytic data access
• Faster mixed-workload enterprise OLTP
• Storage-Tiering: Combines best of memory, flash, disk
• Transparent: packaged apps run with no changes
Summary
• Dual Format Architecture
– Fully consistent row and column format
– Best of both worlds OLTP and Analytics performance.
– Typically, row format (Buffer cache) memory < 10% of column format memory
• New In-Memory Column Format
– In-memory only representation
– Seamlessly built into Oracle Database Engine
– Compatible with all Oracle Database features
• Cost Effective
– Use in-memory for hot data, flash for intermediate data, disk for cold data
Backup Slides
In-Memory Area: Composition
• Contains two subpools:
– Pool of In Memory Compression Units (IMCUs)
– Pool of Snapshot Metadata Units (SMUs)
• IMCUs contain column formatted data
• SMUs contain metadata and transactional information
Diagram: In Memory Area with a pool of IMCUs (Column Format Data) and a pool of SMUs (Metadata)
Column Compression Unit (CU)
Diagram: Dictionary with VALUE (Audi, BMW, Cadillac) and ID (0, 1, 2); column value list BMW, Audi, BMW, Cadillac, BMW, Audi, Audi; Column CU with Min: Audi, Max: Cadillac and stored IDs 2, 0, 2, 1, 2, 0, 0
• Contiguous storage per column in an IMCU
• All CUs automatically store Min/Max values
• Multiple formats:
– For example, Dictionary Compression: CU stores (smaller) dictionary IDs instead of full values
– Additional compression schemes
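Dictionary compression as described on this slide can be sketched in Python as below. This is an illustration only: the function names and the dictionary layout are assumptions, and IDs are assigned here in sorted value order, which may differ from the ID assignment shown in the slide's diagram. The sketch stores small integer IDs in place of the full values and records the column's min and max alongside them.

```python
def encode_column(values):
    """Dictionary-encode a column; returns the dictionary, IDs, and min/max."""
    # Assumption: IDs follow sorted value order.
    dictionary = {value: i for i, value in enumerate(sorted(set(values)))}
    ids = [dictionary[value] for value in values]
    return {
        "dictionary": dictionary,
        "ids": ids,
        "min": min(values),
        "max": max(values),
    }


def decode_column(cu):
    """Rebuild the original values from the stored IDs."""
    by_id = {i: value for value, i in cu["dictionary"].items()}
    return [by_id[i] for i in cu["ids"]]


# Column value list taken from the slide.
column = ["BMW", "Audi", "BMW", "Cadillac", "BMW", "Audi", "Audi"]
cu = encode_column(column)
```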
In-Memory Compression Unit (IMCU)
• Unit of column store allocation
– Columnar representation of a large number of rows (e.g. 0.5 million)
– Rows in one or more table extents
– Variable size
• Contains contiguous runs for each column (column compression units)
Diagram: IMCU header plus Column CUs (ROWID, EMPID, NAME, DEPT, SALARY) covering Extent #13 (Blocks 20-120), Extent #14 (Blocks 82-182), and Extent #15 (Blocks 201-301)
In-Memory Column Store Populate
• Populate: Initial creation of IMCU from Row Format
• Repopulate: Recreation of IMCU after modification
– Threshold Repopulate, after exceeding change threshold
– Trickle Repopulate, constant activity, any changed IMCU is a potential candidate
Diagram: (initial) populate builds the IMCU and SMU from row format; DML operations from workload transactions change them, and a change threshold or trickle action triggers repopulate
In-Memory Column Store Compression
• IMCUs compressed during population
• Controlled by MEMCOMPRESS sub-clause
• Ascending order of compression levels:
– FOR DML (for heavily updated tables, slower for queries)
– FOR QUERY LOW (default: light compression and fastest)
– FOR QUERY HIGH (slightly more compression)
– FOR CAPACITY LOW (balances capacity and performance)
– FOR CAPACITY HIGH (maximizes capacity)
• Allows in-memory storage tiering
ALTER MATERIALIZED VIEW mv1
INMEMORY MEMCOMPRESS FOR QUERY;
CREATE TABLE history
(Name varchar(20),
Desc varchar(200))
INMEMORY
MEMCOMPRESS FOR CAPACITY LOW;
In-Memory Column Store Compression
Diagram: compression levels ordered from Hotter Partitions to Colder Partitions: DML, QUERY LOW, QUERY HIGH, CAPACITY LOW, CAPACITY HIGH
In-Memory Column Store Compression
Mode | Compression Factor, Typical Range | Compression Factor, Observed Max | Query Speed
QUERY | 2x-7x | Up to 10x | Fastest
CAPACITY LOW | 4x-9x | Up to 20x | Very Fast
CAPACITY HIGH | 7x-12x | Up to 30x | Fast
• MEMCOMPRESS FOR QUERY LOW or HIGH
– LOW: Fastest performance
– HIGH: Slightly greater compression
– Lightweight compression schemes
– Queries run on compressed data
– Faster than on uncompressed data
Note: Compression ratios can be highly inflated by choosing a bad uncompressed format, or by reporting the most compressible table. Our results are measured relative to Oracle’s efficient row format for customer data.
In-Memory Column Store Compression
Mode | Compression Factor, Typical Range | Compression Factor, Observed Max | Query Speed
QUERY | 2x-6x | Up to 10x | Fastest
CAPACITY LOW | 4x-9x | Up to 20x | Very Fast
CAPACITY HIGH | 7x-12x | Up to 30x | Fast
• MEMCOMPRESS FOR CAPACITY LOW
– Balances throughput and capacity
– Adds Oracle custom ZIP (OZIP) on top of COMPRESS FOR QUERY
– World’s fastest decompressor, tuned for DB query performance
– 2x - 3x faster than LZO (standard for fast zip)
– Further optimized on SPARC M7 via Software on Silicon
In-Memory Column Store Compression
• MEMCOMPRESS FOR CAPACITY HIGH
– Compress with emphasis on space-savings
– Heavier weight decompression required before query processing
– Trades off some performance for capacity
– Extra 1.5-2x compression
Scale-Up for Maximum In-Memory Performance
• Scale-Up on large SMPs
– NUMA optimizations
– Parallel Execution
• SMP scaling removes overhead of distributing queries across servers
• Memory interconnect far faster than any network
M6-32 Big Memory Machine
World’s largest SMP System
• 32 TB DRAM
• 32 Sockets
• 384 Cores
• 3072 processing threads
• 3 Terabyte/sec Bandwidth
Scale-Out In-Memory Database to Any Size
• Distribution allows in-memory segments larger than single host memory
• Policy is automatic or user-specifiable
• Controlled by DISTRIBUTE subclause
– Distribute by rowid range
– Distribute by partition
– Distribute by sub-partition
– Distribute AUTO (default)
ALTER TABLE sales INMEMORY
DISTRIBUTE BY PARTITION;
ALTER TABLE COSTS INMEMORY
DISTRIBUTE AUTO;
Scale-Out: Distribute by Partition
• Distribute by Partition (top-level partition for composite partitioned tables)
• Ideal for Hash Partitions
• Also for other partition types if uniformly accessed
• Allows in-memory partition-wise joins
Diagram: ORDERS, PARTITION BY HASH ON ORDER_ID, partitions 0, 1, 2, 3, 4, ….
Scale-Out: Distribute by Sub-Partition
• For composite partitions, can distribute by Sub-Partition
• Ideal for Hash Sub-Partitions
• Also for other sub-partition types if uniformly accessed
• Allows in-memory partition-wise joins
Diagram: ORDERS, PARTITION BY RANGE ON ORDER_DATE, SUBPARTITION BY HASH ON ORDER_ID; sub-partitions Nov ’13 1, Nov ’13 2, Nov ’13 3, Nov ’13 4, Dec ’13 1, ….
Scale-Out: Distribute by Rowid Range
• Distributes IMCUs by uniform hash on first rowid
• For non-partitioned tables
• Also for partitioned tables with skewed access across partitions
• Ensures uniform distribution of load across instances
Diagram: ORDERS rowid ranges 1-105, 106-201, 202-310, 311-421, 422-535, ….
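Placement by hashing the first rowid of each range can be sketched as below. This is a rough Python illustration under several assumptions: rowids are treated as plain integers taken from the slide's ranges, CRC32 stands in for whatever hash the database actually uses, and `home_instance` and the instance count of 3 are hypothetical. The point is only that the placement depends on the IMCU's first rowid and not on which partition the rows belong to.

```python
import zlib

# First rowid of each range shown on the slide (simplified to integers).
first_rowids = [1, 106, 202, 311, 422]


def home_instance(first_rowid, num_instances):
    """Pick an instance from a hash of the IMCU's first rowid."""
    # CRC32 is a stand-in; the real hash function is not given in the source.
    return zlib.crc32(str(first_rowid).encode()) % num_instances


# Map each IMCU (by first rowid) to one of three hypothetical instances.
placement = {rowid: home_instance(rowid, 3) for rowid in first_rowids}
```

Because the function is deterministic, every instance can compute the same placement independently for a given first rowid.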