Oracle Archiving Best Practices

Download Report

Transcript Oracle Archiving Best Practices

Oracle Data Archiving
Taming the Beast
Dave Moore
Neon Enterprise Software
INTELLIGENCE. INNOVATION. INTEGRITY
Agenda
Archiving Defined
Requirements and Solutions
Oracle Archiving Strategies
Oracle Row Removal Options
Oracle Post Archive Operations
INTELLIGENCE. INNOVATION. INTEGRITY
Dave
 Oracle ACE
 Using Oracle since 1991
 Product Author at Neon Enterprise Software
 Creator of OracleUtilities.com
 Author of “Oracle Utilities” from Rampant
Tech Press
 Core competencies include performance,
utilities and data management
INTELLIGENCE. INNOVATION. INTEGRITY
Database Archiving
Database Archiving:
The process of removing selected data records from
operational databases that are not expected to be referenced
again and storing them in an archive data store where
they can be retrieved if needed.
Purge
INTELLIGENCE. INNOVATION. INTEGRITY
Trends Impacting Archive Needs
Data Retention Issues:
Volume of data
Length of retention
requirement
Varied types of data
Security issues
0
Time Required
30+ Yrs
INTELLIGENCE. INNOVATION. INTEGRITY
Archiving All Types of Data
Paper
Blueprints
Forms
Claims
Word
Excel
PDF
XML
IMS
DB2
ORACLE
SYBASE
SQL Server
IDMS
VSAM
Programs
UNIX Files
Outlook
Lotus Notes
Attachments
Sound
Pictures
Video
INTELLIGENCE. INNOVATION. INTEGRITY
Data Archiving and ILM
Create
Operational
Needed for
completing
business
transactions
Reference
Archive
Needed for
reporting
or expected
queries
Needed for
compliance
and business
protection
Discard
Mandatory Retention Period
INTELLIGENCE. INNOVATION. INTEGRITY
Some Sample Regulations
Impacting Data Retention
INTELLIGENCE. INNOVATION. INTEGRITY
What Does It All Mean?
Enterprises must recognize that there is a
business value in organizing their information
and data.
Organizations that fail to respond run the risk
of seeing more of their cases decided on
questions of process rather than merit.
(Gartner, 20-April-2007, Research Note G00148170:
Cost of E-Discovery Threatens to Skew Justice System)
INTELLIGENCE. INNOVATION. INTEGRITY
Operational Efficiency
Database Archiving can be undertaken to
improve operational efficiency
 Large volumes of data can interfere with
production operations
– efficiency of transactions
– efficiency of utilities: BACKUP/RESTORE, REORG, etc.
– Storage
» Gartner: databases copied an average of 6 times!
INTELLIGENCE. INNOVATION. INTEGRITY
What Solutions Are Out There?
 Keep Data in Operational Database
— Problems with authenticity of large amounts of
data over long retention times
 Store Data in UNLOAD files (or backups)
— Problems with schema change and reading
archived data;
using backups poses even more serious problems
 Move Data to a Parallel Reference Database
— Combines problems of the previous two
 Move Data to a Database Archive
INTELLIGENCE. INNOVATION. INTEGRITY
Components of a
Database Archiving Solution
Production
Database
Data
Extract
Captured Structure
Archive Policies
Data Retention
Archive Data Store
and Retrieve
Recall
Database
Data
Recall
Archive Store
Archive Data
Query Access
Metadata
Policies
History
Data &
Metadata
Archive
Administration
INTELLIGENCE. INNOVATION. INTEGRITY
Archiving Requirements
 Policy based archiving: logical selection
 Keep data for very long periods of time
 Store very large amounts of data in archive
 Maintain Archives for ever changing operational systems
 Become independent from Applications/DBMS/Systems
 Protect authenticity of data
 Access data when needed; as needed
 Discard data after retention period automatically
INTELLIGENCE. INNOVATION. INTEGRITY
Policy based archiving
 Why :
— Business objects are archived, not files
— Rules for when something is ready can be complex
— Data ready to be archived is distributed over database
 Implications:
— User must provide policies for when something is to
be archived
 How:
— Full metadata description of data
— Flexible specification of policy : “WHERE clause”
INTELLIGENCE. INNOVATION. INTEGRITY
For Example…
Parts Master is
the parent table
to all other tables
STORAGE INFO
Part Number
Bin Number
Qty on Hand
Qty on Order
Qty Backorder
PARTS MASTER
Part Number
Type
Description
Unit Type
Cost
Price
Substitute Parts
ORDER INFO
Part Number
PO Number
Vendor ID
Quantity Ordered
Unit Cost
Date Ordered
Date Received
SUMMARY BY
QUARTER
Part Number
Year
Q1 Disbursed
Q2 Disbursed
Q3 Disbursed
Q4 Disbursed
DISBURSEMENT
Part Number
Dept. ID
CHIT ID
Qty Disbursed
Date Disbursed
INTELLIGENCE. INNOVATION. INTEGRITY
Keep Data for a Long Time
 Why: retention requirements in decades
 Implications:
— Archive will outlive applications/DBMS/systems that
generated them
— Archive will outlive people who designed and managed
operational systems
— Archive will outlive media we store it on
 How:
— Unique data store
— Application/DBMS/system independence
— Metadata independence
— Continuous management of storage
— Continuous management of archive content
INTELLIGENCE. INNOVATION. INTEGRITY
Maintain Archive for
Changing Operational Systems
 Why :
— Metadata changes frequently
— Applications are re-engineered periodically
–
–
–
–
Change DBMS platform
Change System platform
Replace with new application
Consolidate after merger or acquisition
 Implications:
— Archive must support multiple variations of an application
— Archive must deal with metadata changes
 How:
— Manage applications as major archive streams having
multiple minor streams with metadata differences
— Achieve independence from operating environment
INTELLIGENCE. INNOVATION. INTEGRITY
Achieve Metadata Independence
 Why :
— Operational metadata is inadequate
— Operational metadata changes
— Operational systems keep only the “current” metadata
— Data in archive often does not mirror data in operational
structures
 Implications:
— Archive must encapsulate metadata
— Metadata must be improved
 How:
— Metadata Capture, Validate, Enhance capabilities
— Store structure that encapsulates with data
— Keeps multiple versions of metadata
INTELLIGENCE. INNOVATION. INTEGRITY
Protect Authenticity of Data
 Why :
— Potential use in lawsuits/ investigations
— Potential use in business analysis
 Implications:
— Protect from unwanted changes
— Show original input
— Cannot be managed in operational environment
 How:
—
—
—
—
—
—
—
SQL Access that does not support I/U/D
Do not modify archive data on metadata changes
Encryption as stored
Checksum for detection of sabotage
Limit access to functions
Audit use of functions
Maintain offsite backup copies for restore if sabotaged
INTELLIGENCE. INNOVATION. INTEGRITY
Access Data Directly From Archive
 Why :
— Cannot depend on application environment
 Implications:
— Full access capability within archive system
 How:
— Industry standard interface (e.g. JDBC)
— LOAD format output for
– For load into a database
– May be different from source database
— Requires full and accurate metadata
— Ability to review metadata
— Ability to function across metadata changes
INTELLIGENCE. INNOVATION. INTEGRITY
Discard Function
 Why :
— Legal exposure for data kept too long
 Implications:
— Data cannot be kept in archive beyond retention period
— Must be removed with no exposure to forensic software
 How:
— Policy based discard
— System level function
— Tightly controlled and audited
— True “zero out” capability
— Discard from backups as well
INTELLIGENCE. INNOVATION. INTEGRITY
Database or Archive?
Keep in DB
Keep in Archive
Performance
Space
Compliance
INTELLIGENCE. INNOVATION. INTEGRITY
Based on Data Availability
Keep in DB
Keep in Archive
Purge
Must be Available to App
Must be Available
Must Be Secure
Not Needed
INTELLIGENCE. INNOVATION. INTEGRITY
Oracle Archiving Strategies
 Designed Up Front (Yeah, right)
 Determined by Application Owner
 Implemented by ____________
 Utilize Oracle Features
INTELLIGENCE. INNOVATION. INTEGRITY
Finding Large Tables
 DBA_SEGMENTS (bytes)
 DBA_TABLES (num_rows)
 or based on I/O
INTELLIGENCE. INNOVATION. INTEGRITY
Rolling Windows
 Self Managing
 Mostly based on DATE
 Utilize DBMS Features
— Partitioning
— Transportable Tablespaces
–
–
–
–
Exchange Partition
Set tablespace read only
Expdp
Copy export file and data file
INTELLIGENCE. INNOVATION. INTEGRITY
Rolling Windows via Partitioning
P1 …
Data Profile
Storage Profile
Probably
Never
Accessed
Cheap as
you can get
Read Only /
Compressed
… P47
Rarely
Accessed
Heavily
Accessed
Not so fast
or expensive
Fast,
expensive
Read Only
Read / Write
INTELLIGENCE. INNOVATION. INTEGRITY
Why not use transportable tablespaces
or Oracle exports for data retention?
INTELLIGENCE. INNOVATION. INTEGRITY
The Problem with Oracle Files
 Transportable Tablespaces
 Exports
 Backups
Oracle
Export Files &
Datafiles
Import
Trans Tsp
Version 16Z
Year 2007
Year 2030
Not a good method for LT Data Retention
INTELLIGENCE. INNOVATION. INTEGRITY
Partitioning (Old ways)
 Range Partitioning
Data is distributed based on partition key range
of values – usually a date.
Good When: Data is date-based.
INTELLIGENCE. INNOVATION. INTEGRITY
Partitioning (Old Ways)
 Hash Partitioning
Uses hash algorithm to create equally sized
buckets of data.
Good When: No natural partition key and
desire I/O balancing (hot spots).
INTELLIGENCE. INNOVATION. INTEGRITY
Partitioning (Old Ways)
 List Partitioning
Data is distributed based on LIST of values in
partition key.
Good When: Have short list of values (States,
Regions, Account Types)
INTELLIGENCE. INNOVATION. INTEGRITY
Partitioning (New Ways – 11G)
 Interval Partitioning
Initial Partition is created manually, the rest are
automatically created as new data arrives.
Good When: Need a rolling window!
INTELLIGENCE. INNOVATION. INTEGRITY
Partitioning (New Ways – 11G)
 REF partitioning
Related Tables benefit from same partitioning strategy,
whether column exists in children or not!
Good When: Desire related data to be partitioned in
the same manner.
INTELLIGENCE. INNOVATION. INTEGRITY
Partitioning (New Ways – 11G)
 Virtual Column Partitioning
Partition key may be based on virtual column
Good When: Virtual column is required for
partition key.
INTELLIGENCE. INNOVATION. INTEGRITY
Rows Gotta Go
INTELLIGENCE. INNOVATION. INTEGRITY
Row Removal Options
 SQL DELETE
 CTAS / DROP / RENAME
 TRUNCATE
 Row Marking
INTELLIGENCE. INNOVATION. INTEGRITY
SQL DELETE
 Good for small number of rows
 RI handled automatically
 Oracle was born to DELETE, better than any
PL/SQL that you write.
 Issue with Un-indexed Foreign Keys 
INTELLIGENCE. INNOVATION. INTEGRITY
DELETE Optimization
 Work in batches, committing (only when
programmatically DELETING)
 Use parallel DML (Partitioned tables only)
 Drop Indexes before (if possible)
 Index FK columns
INTELLIGENCE. INNOVATION. INTEGRITY
CTAS

Works well for PURGE, not archive

Perfect when you want to keep low percentage of
rows in the table

Doesn’t handle RI – no DELETE was issued.

Process
1.
Create table with rows you want to keep
2.
Drop old table
3.
Rename table
4.
Recreate indexes
create table new_table unrecoverable as select * from old_table where ...
INTELLIGENCE. INNOVATION. INTEGRITY
TRUNCATE
 Congratulations if your application lends
itself to TRUNCATE without losing new data
 What about RI?
 May truncate or drop individual partitions
INTELLIGENCE. INNOVATION. INTEGRITY
DROP
 DROP PARTITION
 What would you do before you drop it?
 Exchange partition with table
 Transportable tablespace.
INTELLIGENCE. INNOVATION. INTEGRITY
Things to Remember
• Benchmark the best way for you
• Benchmark against real data if possible
• Use parallel DML
INTELLIGENCE. INNOVATION. INTEGRITY
Design Summary
 Create an architecture that lends itself to
aging, archiving, deleting
 This architecture should compensate for
business requirements
— For instance, customer orders not accessible after
6 months … or
— top query performance needed for all ‘ACTIVE’
accounts … etc
 Implement it – THE EASY PART
INTELLIGENCE. INNOVATION. INTEGRITY
Post Archive Challenges
INTELLIGENCE. INNOVATION. INTEGRITY
Post Archive Challenges
“I have successfully deleted 10
billion rows from the table.
HoooAhhhh! Performance will be
great, space will be available, and
I will get credit for optimizing our
data warehouse application, saving
the company billions of dollars”
INTELLIGENCE. INNOVATION. INTEGRITY
… 2 Days Later …
INTELLIGENCE. INNOVATION. INTEGRITY
Post Archive Challenges
Hmmmmm. It looks like …
- Queries are not any faster . . .
- The Select count(*) took the same amount of time . . .
- Space was not freed in Oracle (DBA_FREE_SPACE) . . .
- Space was not freed in the operating system . . .
WHY NOT ????? Where are the benefits ???
INTELLIGENCE. INNOVATION. INTEGRITY
From Swiss to Provolone
After DELETE
After Maintenance
INTELLIGENCE. INNOVATION. INTEGRITY
Post Archive Challenges
 Statistics are not fresh
 High Water Marks are very high
 Space has not been freed within
Oracle (if that’s what you want)
 Space has not been freed to the OS
INTELLIGENCE. INNOVATION. INTEGRITY
Refresh Statistics
 Help the optimizer, easy enough
 dbms_stats provides many options
INTELLIGENCE. INNOVATION. INTEGRITY
Automatic Stats
 Recommended by Oracle
 Calls DBMS_STATS_JOB_PROC
 Enabled via:
Begin
dbms_auto_task_admin.enable(
client_name => ‘auto optimizer stats collection’,
operation => NULL,
window_name => NULL);
END;
/
INTELLIGENCE. INNOVATION. INTEGRITY
When do you go manual ?
 High transaction DELETEs or TRUNCATEs
 Bulk loads which add more than 10% of table
size
So there’s our answer – go manual.
INTELLIGENCE. INNOVATION. INTEGRITY
How do we Gather Them?
 NOT the Analyze Command
 Instead DBMS_STATS package
exec dbms_stats.gather_table_stats(ownname => 'BDB',
tabname => 'MASTER', estimate_percent =>
dbms_stats.auto_sample_size);
INTELLIGENCE. INNOVATION. INTEGRITY
High Water Mark
INTELLIGENCE. INNOVATION. INTEGRITY
High Water Mark
INTELLIGENCE. INNOVATION. INTEGRITY
Reset High Water Mark (HWM)
 DROP or TRUNCATE
 Multiple OTHER ways to do this depending on version
 In v9 … alter table move tablespace [tsp name];
— Row movement must be enabled
— Tablespace must be a LMT
— Can move into same tablespace
— Will occupy 2X space temporarily
— Must then rebuild indexes
 In v10 … alter table <table_name> shrink space;
INTELLIGENCE. INNOVATION. INTEGRITY
Freeing Allocated Space
INTELLIGENCE. INNOVATION. INTEGRITY
Create table, check space
SQL> create table space_example as select * from dba_source;
Table created.
SQL> select count(*) from space_example;
COUNT(*)
---------296463
SQL> exec dbms_space.unused_space(‘DAVE', 'SPACE_EXAMPLE');
Total blocks: 6328
Unused blocks: 1
Unused bytes: 8192
Last Used Block: 55
Last Used Block ID: 10377
Last Used Ext File ID: 4
INTELLIGENCE. INNOVATION. INTEGRITY
Check datafile space
Size Current
Poss.
FILE_NAME
Poss.
Size Savings
-------------------------------------------------- -------- -------- -------/export/home/ora102/oradata/ora102/qasb001.dbf
29
46
17
/export/home/ora102/oradata/ora102/example01.dbf
69
100
31
/export/home/ora102/oradata/ora102/qasb002.dbf
41
41
0
/export/home/ora102/oradata/ora102/system01.dbf
493
500
7
/export/home/ora102/oradata/ora102/sysaux01.dbf
430
430
0
/export/home/ora102/oradata/ora102/undotbs01.dbf
91
175
84
/export/home/ora102/oradata/ora102/users01.dbf
44
83
39
/export/home/ora102/oradata/ora102/test.dbf
51
70
19
INTELLIGENCE. INNOVATION. INTEGRITY
Delete rows, check space
SQL> delete from space_example;
296463 rows deleted.
SQL> commit;
SQL> exec dbms_space.unused_space(‘DAVE', 'SPACE_EXAMPLE');
Total blocks: 6328
Unused blocks: 1
Unused bytes: 8192
Last Used Block: 55
Last Used Block ID: 10377
Last Used Ext File ID: 4
Nothing Changed !
INTELLIGENCE. INNOVATION. INTEGRITY
Shrink it, check space
SQL> alter table space_example enable row movement;
SQL> alter table space_example shrink space;
SQL> exec dbms_space.unused_space('BDB', 'SPACE_EXAMPLE');
Total blocks: 8
Unused blocks: 4
Unused bytes: 32768
Last Used Block: 4
Last Used Block ID: 5129
Last Used Ext File ID: 4
Space Freed From Table, but still in Oracle
INTELLIGENCE. INNOVATION. INTEGRITY
Check space again
Size Current
Poss.
FILE_NAME
Poss.
Size Savings
-------------------------------------------------- -------- -------- -------/export/home/ora102/oradata/ora102/qasb001.dbf
29
46
17
/export/home/ora102/oradata/ora102/example01.dbf
69
100
31
/export/home/ora102/oradata/ora102/qasb002.dbf
41
41
0
/export/home/ora102/oradata/ora102/system01.dbf
493
500
7
/export/home/ora102/oradata/ora102/sysaux01.dbf
430
430
0
/export/home/ora102/oradata/ora102/undotbs01.dbf
171
175
4
/export/home/ora102/oradata/ora102/users01.dbf
44
83
39
/export/home/ora102/oradata/ora102/test.dbf
1
70
69
This datafile should be resized to save 69 MB
SQL> alter database datafile '/export/home/ora102/oradata/ora102/test.dbf' resize 1m;
INTELLIGENCE. INNOVATION. INTEGRITY
Free the Space
 Space is still reserved for future inserts and
updates, just not freed back to the OS
 Space will not be automatically freed –
confirm by checking DBA_FREE_SPACE
 Ways to set it free
— drop
— truncate
— alter table move …
— alter table shrink space …
INTELLIGENCE. INNOVATION. INTEGRITY
Unindexed Foreign Keys Example
1 Million Rows
PARENT
COL1
ON DELETE CASCADE
CHILD
COL1
COL1_PARENT
1 Million Rows
SQL> DELETE FROM PARENT WHERE COL1 < 1000;
Fky.sql
INTELLIGENCE. INNOVATION. INTEGRITY
Before Index
delete from parent
where
col1 < 1000
call
count
------- -----Parse
1
Execute
1
Fetch
0
------- -----total
2
cpu
elapsed
disk
query
current
-------- ---------- ---------- ---------- ---------0.01
0.08
2
27
0
0.90
0.80
4
2208799
6062
0.00
0.00
0
0
0
-------- ---------- ---------- ---------- ---------0.91
0.88
6
2208826
6062
rows
---------0
999
0
---------999
delete from "DAVE"."CHILD"
where
"COL1_PARENT" = :1
call
count
------- -----Parse
1
Execute
999
Fetch
0
------- -----total
1000
cpu
elapsed
disk
query
current
-------- ---------- ---------- ---------- ---------0.00
0.00
0
0
0
285.94
293.11
1543900
2208789
1029
0.00
0.00
0
0
0
-------- ---------- ---------- ---------- ---------285.94
293.11
1543900
2208789
1029
rows
---------0
999
0
---------999
INTELLIGENCE. INNOVATION. INTEGRITY
SQL> create index prnt_ndx on child(col1_parent);
delete from parent
where
col1 < 1000
call
count
------- -----Parse
1
Execute
1
Fetch
0
------- -----total
2
cpu
elapsed
disk
query
current
-------- ---------- ---------- ---------- ---------0.00
0.00
0
0
0
0.53
0.47
7
13
7053
0.00
0.00
0
0
0
-------- ---------- ---------- ---------- ---------0.53
0.47
7
13
7053
rows
---------0
999
0
---------999
delete from "DAVE"."CHILD"
where
"COL1_PARENT" = :1
call
count
------- -----Parse
1
Execute
999
Fetch
0
------- -----total
1000
cpu
elapsed
disk
query
current
-------- ---------- ---------- ---------- ---------0.00
0.00
0
0
0
0.42
0.46
2
3002
4058
0.00
0.00
0
0
0
-------- ---------- ---------- ---------- ---------0.42
0.46
2
3002
4058
rows
---------0
999
0
---------999
INTELLIGENCE. INNOVATION. INTEGRITY
Unindexed Foreign Keys
 Problem is not limited to DELETE statements
 Search database for unindexed FK columns
 Script is on asktom
— Search for unindex.sql
INTELLIGENCE. INNOVATION. INTEGRITY
Summary Points
Create sound Archiving strategy based on
Oracle technical features as well as business
and/or legal requirements
 Leverage partitioning
 Move partitions to cheap disk when
appropriate
 Make partitions ‘read only’ and compressed
 Remove data via DROP or TRUNCATE if
possible
 If SQL DELETE, make sure to perform
maintenance operations
Consider 3rd partyINTELLIGENCE.
solutions
INNOVATION. INTEGRITY
“Well done is better than well said”
Ben Franklin
Questions?
INTELLIGENCE. INNOVATION. INTEGRITY