Transcript Flashback
Session id: 40164
How Oracle Database 10g Revolutionizes Availability and Enables the Grid
Juan Loaiza
Vice President, Systems Technologies
Oracle Corporation
From High Quality Parts to High Quality Systems
Traditionally Low Cost = Low Quality
– High quality systems were built by combining high quality, high cost parts (Mainframe model)
Oracle enables a new model
– Oracle combines high volume inexpensive processors and storage to produce a high quality system
Unbreakable Inexpensive Systems
Low Cost Fault Tolerance
[Diagram: Unplanned Downtime splits into Computer Failures and Data Failures; Planned Downtime splits into System Changes and Data Changes]
Grid Clusters
Commercial Grids and Availability
Grid pools standard low cost nodes and modular disk arrays
Perfect for RAC HA
Failover can happen to any node on the grid
Grid load balancing will redistribute load over time
Designed to Tolerate Failures
New Economics for Data Protection & Recovery
[Diagram: Unplanned Downtime splits into Computer Failures and Data Failures; Planned Downtime splits into System Changes and Data Changes]
Disk Based Recovery
Trade cheap disk space for expensive downtime
New World: Disk Based Data Recovery
Disk economics are close to tape
– 1980s: 200 MB; 2000s: 200 GB (1000x increase)
Disk is better than tape
– Random access to any data
We rearchitected our recovery strategy to take advantage of these economics
– Random access allows us to back up and recover just the changes to the database
Backup and Recovery goes from hours to minutes
Resiliency using Low Cost Storage
[Diagram: Unplanned Downtime splits into Computer Failures and Data Failures; Planned Downtime splits into System Changes and Data Changes. Data Failures expands into Storage Failure, Human Error, Corruption, and Site Failure]
Four Failure Types
Data Mirroring with ASM
ASM mirrors data across inexpensive modular storage arrays
No additional logging or expensive NVRAM to recover mirrors
– Database logging recovers mirrors
Automatically remirrors when disk or array fails
Designed to tolerate failures
Failure Resiliency using Low Cost Storage
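The remirroring idea can be pictured with a small model. This is only a sketch under stated assumptions, not ASM's actual placement algorithm: the names `place_mirrors` and `remirror`, the round-robin placement, and the "least loaded survivor" rule are all hypothetical choices made for illustration.

```python
def place_mirrors(extents, arrays):
    """Give every extent two copies on two different arrays (round-robin)."""
    if len(arrays) < 2:
        raise ValueError("mirroring needs at least two arrays")
    placement = {}
    for i, extent in enumerate(extents):
        placement[extent] = {arrays[i % len(arrays)], arrays[(i + 1) % len(arrays)]}
    return placement


def remirror(placement, failed, arrays):
    """Re-create lost copies on surviving arrays after `failed` goes away."""
    survivors = [a for a in arrays if a != failed]
    if len(survivors) < 2:
        raise ValueError("not enough surviving arrays to restore two copies")
    for copies in placement.values():
        if failed not in copies:
            continue
        copies.discard(failed)
        # Count copies per survivor so the new copy lands on the emptiest array.
        load = {a: sum(a in c for c in placement.values()) for a in survivors}
        candidates = [a for a in survivors if a not in copies]
        copies.add(min(candidates, key=lambda a: (load[a], a)))
    return placement


arrays = ["array1", "array2", "array3"]
placement = place_mirrors(range(6), arrays)
remirror(placement, "array2", arrays)   # array2 fails; every extent regains two copies
```

After the simulated failure each extent again has two copies, none of them on the failed array, which is the property the slide describes as "designed to tolerate failures".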
Collapsing the Cost of Human Error
[Diagram: Unplanned Downtime splits into Computer Failures and Data Failures; Planned Downtime splits into System Changes and Data Changes. Data Failures expands into Storage Failure, Human Error, Corruption, and Site Failure]
Human Error
Human Errors: Single Biggest Cause of Downtime
[Chart: Human Errors compared with Other Downtime]
Goal is to quickly analyze and repair
– For localized damage
  Need surgical analysis and repair
  Example: deleted wrong order
– For widespread damage
  Need complete back-out to avoid long downtime
  Example: batch job deletes this month's orders
Flashback Time Navigation
Flashback Query
– Query all data at point in time
Select * from Emp AS OF '2:00 P.M.' where …
Flashback Versions Query
– See all versions of a row between two times
– See transactions that changed the row
Select * from Emp VERSIONS BETWEEN '2:00 PM' and '3:00 PM' where …
Flashback Transaction Query
– See all changes made by a transaction
Select * from FLASHBACK_TRANSACTION_QUERY where xid = '000200030000002D';
[Diagram: row versions on a timeline, written by Tx 1, Tx 2, and Tx 3]
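The three kinds of time navigation can be illustrated with a toy versioned table. This is a conceptual sketch only, not how Oracle stores undo: the class `VersionedTable`, its method names, and the integer "HHMM" timestamps are assumptions made for the example, and writes are assumed to arrive in time order.

```python
class VersionedTable:
    """Keeps every version of every row so past states can be queried."""

    def __init__(self):
        # key -> list of (time, txn_id, value); value None marks a delete
        self.history = {}

    def write(self, time, txn_id, key, value):
        self.history.setdefault(key, []).append((time, txn_id, value))

    def as_of(self, time):
        """Flashback-query analogue: the table as it stood at `time`."""
        snapshot = {}
        for key, versions in self.history.items():
            visible = [v for v in versions if v[0] <= time]
            if visible and visible[-1][2] is not None:
                snapshot[key] = visible[-1][2]
        return snapshot

    def versions_between(self, key, start, end):
        """Versions-query analogue: every version of one row in a time window."""
        return [v for v in self.history.get(key, []) if start <= v[0] <= end]

    def changes_by(self, txn_id):
        """Transaction-query analogue: everything one transaction changed."""
        return [(key, t, value)
                for key, versions in self.history.items()
                for (t, tx, value) in versions if tx == txn_id]


emp = VersionedTable()
emp.write(1300, "tx1", "smith", 100)    # tx1 inserts the row at 1:00 PM
emp.write(1430, "tx2", "smith", 200)    # tx2 updates it at 2:30 PM
emp.write(1445, "tx3", "smith", None)   # tx3 deletes it at 2:45 PM

at_two = emp.as_of(1400)
window = emp.versions_between("smith", 1400, 1500)
by_tx2 = emp.changes_by("tx2")
```

Here `at_two` still shows the row as written by `tx1`, while `window` exposes the later update and delete together with the transactions responsible, which is the information needed for surgical repair.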
Flashback Database
[Diagram: on each disk write, the new block version goes to the Data Files and the old block version goes to the Flashback Log]
A new strategy for point in time recovery
Flashback Log captures old versions of
changed blocks
– Think of it as a continuous backup
– Replay log to restore DB to time
– Restores just changed blocks
It’s fast - recover in minutes, not hours
It’s easy - single command restore
Flashback Database to ‘2:05 PM’
“Rewind” button for the Database
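A minimal sketch of the "rewind" idea, assuming a simplified model in which every write saves the old block image before overwriting it. The class `FlashbackDb` and its methods are hypothetical names; the real feature is more involved than this model, which only shows why the work is proportional to the number of changed blocks rather than to database size.

```python
class FlashbackDb:
    def __init__(self, blocks):
        self.blocks = dict(blocks)   # block number -> current contents
        self.log = []                # (time, block number, old contents), oldest first

    def write(self, time, block_no, value):
        # Save the old version before the new one reaches the data files.
        self.log.append((time, block_no, self.blocks[block_no]))
        self.blocks[block_no] = value

    def flashback_to(self, time):
        """Undo every write made after `time`, newest first."""
        while self.log and self.log[-1][0] > time:
            _, block_no, old_value = self.log.pop()
            self.blocks[block_no] = old_value


db = FlashbackDb({1: "a", 2: "b", 3: "c"})
db.write(1400, 1, "a2")
db.write(1410, 2, "b2")
db.write(1420, 1, "a3")
db.flashback_to(1405)   # rewind to 2:05 PM: only blocks 1 and 2 are touched
```

Block 3 was never written, so the rewind never reads or restores it; only the two writes made after the target time are reversed.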
Flashback Error Correction
Recovery at all levels
Database Level
– Flashback Database restores the whole database to time
  Uses Flashback Logs
Table Level
– Flashback Table restores rows in a set of tables to time
  Uses UNDO in database
– Flashback Drop restores a dropped table or an index
  Recycle bin for DROPs
Row Level
– Restore individual rows
  Uses Flashback Query
[Diagram: a Database containing Customer and Order tables]
Flashback for All Users
END USER
• Flashback Query
• Flashback Versions Query
DEVELOPER
• Flashback Versions Query
• Flashback Transaction Query
• Flashback Table
DATABASE ADMIN
• Flashback Database
• Flashback Drop
SYSTEM ADMIN
• Data Guard
Revolution in Recovery
Flashback Revolutionizes Recovery
– Operates on just the changed data
– Time to correct error equals time to make error
  Minutes instead of hours
Correction Time = Error Time (no longer Error Time + f(DB_SIZE))
Flashback is Easy
– Single command instead of complex procedure
Prevention & Recovery of Corruptions
[Diagram: Unplanned Downtime splits into Computer Failures and Data Failures; Planned Downtime splits into System Changes and Data Changes. Data Failures expands into Storage Failure, Human Error, Corruption, and Site Failure]
Flash Recovery Area
Fully automatic disk based backup and recovery
– Set and Forget
Nightly incremental backup rolls forward recovery area backup
– Changed blocks are tracked in production DB
– Full scan is never needed
– Dramatically faster (20x)
Blocks validated to prevent corruption of backup copy
Use low cost ATA disk array for recovery area
Two Independent Disk Systems
[Diagram: the Database Area feeds a nightly validated incremental that is applied to the Flash Recovery Area, which is archived to tape weekly]
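The combination of change tracking, validation, and roll-forward can be sketched as follows. This is an illustrative model, not RMAN's implementation: `ProductionDb`, `roll_forward`, and the use of a CRC-32 per block are assumptions chosen to keep the example small.

```python
import zlib


def checksum(data: bytes) -> int:
    return zlib.crc32(data)


class ProductionDb:
    def __init__(self, blocks):
        self.blocks = [(b, checksum(b)) for b in blocks]
        self.changed = set()            # block numbers written since the last backup

    def write(self, block_no, data):
        self.blocks[block_no] = (data, checksum(data))
        self.changed.add(block_no)      # tracked at write time, so no full scan later

    def take_incremental(self):
        """Return only the changed blocks, then reset the tracking."""
        delta = {n: self.blocks[n] for n in sorted(self.changed)}
        self.changed.clear()
        return delta


def roll_forward(backup, delta):
    """Validate every incoming block first; apply only if all of them pass."""
    for block_no, (data, stored) in delta.items():
        if checksum(data) != stored:
            raise ValueError(f"block {block_no} failed validation; backup left unchanged")
    for block_no, block in delta.items():
        backup[block_no] = block


prod = ProductionDb([b"a", b"b", b"c"])
backup = list(prod.blocks)              # initial full copy in the recovery area
prod.write(1, b"B")
delta = prod.take_incremental()         # contains block 1 only
roll_forward(backup, delta)             # backup copy is now current
```

Validating before applying means a corrupted block is rejected without touching the backup copy, so the recovery area stays usable even when the production side is damaged.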
Low Cost No Compromise Disaster Recovery
[Diagram: Unplanned Downtime splits into Computer Failures and Data Failures; Planned Downtime splits into System Changes and Data Changes. Data Failures expands into Storage Failure, Human Error, Corruption, and Site Failure]
Existing Site Recovery Tradeoffs
[Diagram: Production Database ships transactions to a Standby Database, which applies them after a 4 Hour Delay; reporting runs on delayed data]
User can delay log apply to protect from user errors but:
– Failover takes hours
– Reports run on hours old data
After failing over to standby, production DB must be rebuilt
– Production has updates that did not get to standby
Low Cost No Compromise Disaster Recovery
[Diagram: Production Database ships transactions to a Standby Database with Real Time Apply and No Delay; reporting runs on real time data; some standby nodes are used for other computing; each site keeps a Flashback Log]
Flashback DB removes need to delay apply of logs to correct errors
Flashback DB removes the need to reinstantiate primary on failover
Real-time log apply enables real-time reporting on standby
Data Guard works transparently across GRID clusters
– Standby can use fewer CPU resources than primary
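Why flashback removes the need to rebuild the old primary can be shown with lists standing in for applied changes. This is a deliberately simplified sketch: `fail_over` is a hypothetical helper, list positions stand in for the point at which the two sites diverged, and nothing here reflects Data Guard's actual interfaces.

```python
def fail_over(old_primary, standby, later_changes):
    """Return (new_primary, reinstated_old_primary) as lists of applied changes."""
    divergence = len(standby)                  # last change the standby received
    new_primary = standby + later_changes      # standby takes over and keeps working
    rewound = old_primary[:divergence]         # flashback drops changes that never shipped
    reinstated = rewound + new_primary[divergence:]   # then it catches up as a standby
    return new_primary, reinstated


# c3 was committed on the old primary but never reached the standby.
new_primary, reinstated = fail_over(["c1", "c2", "c3"], ["c1", "c2"], ["c4", "c5"])
```

The old primary ends up identical to the new primary without being copied from scratch; only the unshipped change is discarded and the later changes are applied.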
Highest Data Protection
Lowest Cost
Dramatic Advances in Ease of Use
ASM Mirroring: Storage Failure Protection
Flashback: Human Error Protection
Flash Recovery Area: Corruption Protection
Data Guard: Site Failure Protection
Combine the Features to Achieve Any Level of Data Protection
No Cost System Changes
[Diagram: Unplanned Downtime splits into Computer Failures and Data Failures; Planned Downtime splits into System Changes and Data Changes]
Goal
Allow any change to the system with no downtime
Online Reconfiguration
Rolling Upgrades
No Cost System Changes – Capacity on Demand
CPU
– Add/remove CPUs on SMP online
Cluster Nodes
– Add/remove cluster nodes online
– No data movement needed
Memory
– Grow and shrink shared memory and buffer cache online
– Auto tuning of memory online
Disk
– Add/remove disks online
– Automatically rebalance
– Move datafiles
Rolling Patch Upgrade using RAC
1. Initial RAC Configuration: clients on nodes A and B
2. Clients on A, Patch B
3. Clients on B, Patch A
4. Upgrade Complete: clients on A and B
Oracle Patch Upgrades
Operating System Upgrades
Hardware Upgrades
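The four steps amount to patching one node at a time while the remaining nodes carry the clients. A minimal sketch, assuming a hypothetical `rolling_patch` helper and a dictionary of node versions (neither is an Oracle interface):

```python
def rolling_patch(versions, order, new_version):
    """Patch nodes one at a time; return (patched node, nodes serving clients) per step."""
    if len(versions) < 2:
        raise ValueError("a rolling patch needs at least two nodes")
    steps = []
    for node in order:
        serving = sorted(n for n in versions if n != node)   # clients move off `node`
        steps.append((node, serving))
        versions[node] = new_version                          # patch it while it is idle
    return steps


cluster = {"A": "base", "B": "base"}
steps = rolling_patch(cluster, ["B", "A"], "patched")   # patch B first, then A
```

At every step the list of serving nodes is non-empty, which is the property that keeps the application available throughout the upgrade.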
Rolling Release Upgrade using Data Guard
1. Initial SQL Apply Config: clients on A, both nodes at Version X, logs ship
2. Upgrade node B to X+1: A stays at X, logs queue
3. Run mixed to test: A at X, B at X+1, logs ship
4. Switch to B, upgrade A: both nodes at X+1, logs ship
Patch Set Upgrades
Major Release Upgrades
Cluster Software & Hardware Upgrades
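The role of the log queue during the upgrade can be sketched with two sites and a shipping link. This is a conceptual model only: `Site`, `LogShipping`, and the version strings are hypothetical names, and the switchover is reduced to re-pointing the link.

```python
class Site:
    def __init__(self, name, version):
        self.name = name
        self.version = version
        self.applied = []    # changes applied at this site, in order
        self.online = True   # False while the site is down for its upgrade


class LogShipping:
    """Ships changes from a source site to a target, queueing while the target is down."""

    def __init__(self, source, target):
        self.source, self.target = source, target
        self.queue = []

    def commit(self, change):
        self.source.applied.append(change)
        self.queue.append(change)
        self.drain()

    def drain(self):
        if self.target.online:
            self.target.applied.extend(self.queue)
            self.queue.clear()


a, b = Site("A", "X"), Site("B", "X")
link = LogShipping(a, b)
link.commit("t1")                      # step 1: both at X, logs ship

b.online = False                       # step 2: take B down and upgrade it
link.commit("t2")                      # clients keep working on A; the log queues
queued_during_upgrade = list(link.queue)
b.version, b.online = "X+1", True
link.drain()                           # step 3: mixed versions, queued logs are applied

link = LogShipping(b, a)               # step 4: switch clients to B, then upgrade A
a.online = False
link.commit("t3")
a.version, a.online = "X+1", True
link.drain()
```

Clients always have one site accepting changes, and each site catches up from the queue once its upgrade finishes, so both end at the new version with identical data.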
No Cost Data Changes
[Diagram: Unplanned Downtime splits into Computer Failures and Data Failures; Planned Downtime splits into System Changes and Data Changes]
Goal
Competitive pressures demand continual change
Need to change data with no interruption to the application
– location, format, indexing, or even definition
Online Redefinition
Evolution without Interruption
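One way to picture redefinition without interruption is a copy to an interim table that is kept in sync while the application continues to write. The sketch below is a simplified model under that assumption; the class `OnlineRedefinition` and its start/sync/finish phases are illustrative names, not Oracle's package interface.

```python
class OnlineRedefinition:
    def __init__(self, table, transform):
        self.table = table          # live table (key -> row); stays writable throughout
        self.transform = transform  # maps an old-format row to the new format
        self.interim = {}
        self.pending = []           # keys changed after the copy started

    def start(self):
        """Bulk-copy current rows into the interim table in the new format."""
        self.interim = {k: self.transform(r) for k, r in self.table.items()}

    def apply_dml(self, key, row):
        """An application write during redefinition; row None means delete."""
        if row is None:
            self.table.pop(key, None)
        else:
            self.table[key] = row
        self.pending.append(key)

    def sync(self):
        """Replay changes made since the copy so the interim table catches up."""
        for key in self.pending:
            if key in self.table:
                self.interim[key] = self.transform(self.table[key])
            else:
                self.interim.pop(key, None)
        self.pending.clear()

    def finish(self):
        """Final sync, then hand back the redefined table to swap in."""
        self.sync()
        return self.interim


orders = {1: {"name": "ann"}, 2: {"name": "bob"}}
redef = OnlineRedefinition(orders, lambda row: {"NAME": row["name"].upper()})
redef.start()
redef.apply_dml(2, None)               # application deletes a row mid-redefinition
redef.apply_dml(3, {"name": "cy"})     # and inserts another
new_table = redef.finish()
```

The application never stops writing to the live table; the only coordinated moment is the final swap, after the last sync has brought the interim copy up to date.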
Maximum Availability Architecture (MAA)
Operational Practices are key
– Technology alone is not enough
MAA is a blueprint for achieving HA & DR
– Tested, validated, and documented best practices
– Database, Storage, Cluster, Network
10 person year effort
otn.oracle.com/deploy/availability
Maximum Availability = Unbreakable Architecture + Best Practices
Highest Availability at Lowest Cost
Highest Availability
– Fault Tolerant Clusters
– Flashback Error Correction
– Automated Disk Backup
– No Compromise Disaster Recovery
– Rolling Upgrades
– Online Redefinition
At Lowest Cost
– Low Cost Grid servers
– Low Cost Modular Storage Arrays
– Automated & Simple to Use
Oracle10g is Unbreakable & Inexpensive
Next Steps
High Availability Sessions from Oracle
Tuesday in Moscone Room 304
Wednesday in Moscone Room 304
11:00 AM
8:30 AM
How Oracle Database 10g
Revolutionizes Availability and
Enables the Grid
Oracle Database 10g - RMAN and ATA
Storage in Action
11:00 AM
3:30 PM
Oracle Recovery Manager (RMAN)
10g: Reloaded
Oracle Data Guard: Maximum Data
Protection at Minimum Cost
1:00 PM
5:00 PM
Proven Techniques for Maximizing
Availability
Oracle Database 10g Time Navigation:
Human-Error Correction
4:30 PM
Data Guard SQL Apply: Back to the
Future
For More Info On Oracle HA Go To http://otn.oracle.com/deploy/availability/
Next Steps
High Availability Sessions from Oracle
Thursday
– 8:30 AM in Moscone Room 304: Oracle Database 10g Data Warehouse Backup and Recovery: Automatic, Simple, Reliable
– 8:30 AM in Moscone Room 104: Building RAC Clusters over InfiniBand
Database HA Demos All Four Days In The Oracle Demo Campground
– Real Application Clusters
– Data Guard
– Database Backup & Recovery
– Flashback Recovery
– LogMiner, Online Redefinition, and Cross Platform Transportable Tablespaces
Questions & Answers
New Oracle Database 10g HA Features
Clusters
– Portable Clusterware
– Cluster file system for Linux & Windows
– Automated Patching
Data Guard SQL Apply
– Support for Longs
– Support for multi-byte CLOBs and NCLOBs
– Support for Index Organized Tables
– Simplified zero data loss failover
– Real time apply allows real time reporting
– Zero downtime instantiation
Data Guard Generic
– Data Guard Broker support for RAC
– Named Data Guard Configurations
– Real Time Apply
– Flashback Standby Database
– Flashback Reinstantiation
– Improved Recovery Parallelism
Rolling Upgrades
– Rolling Upgrades Using Data Guard SQL Apply
Online Redefinition
– Support of Unique Indexes
– One Step Cloning of Dependent Objects
– Columns can be Populated Using Sequences & Sysdate
– Signature Based Dependency Tracking Using Synonyms
– Online Segment Shrink
New Oracle Database 10g HA Features
Flash Backup & Recovery
– Automated Management of B&R Disk Space
– Simplified Backup Using Image Copy
– Change Aware Incremental Backups
– Incrementally Updated Backups
– Compressed archive logs
Tuning
– Improved Recovery Parallelism
– Faster Instance Startup & Cache Warm
Backup & Recovery
– Simplified Recovery Through Resetlogs
– Restore Tolerates Missing Backups
– Proxy Backup of Archives
– Automated TSPITR Instantiation
– Full DB Begin Backup
– Automated Backup Channel Failover
– Simplified RMAN cataloging of backup files
– Automated File Creation during Recovery
– Drop Database
– Rename Tablespace
Flashback
– Flashback Drop
– Flashback Row History
– Flashback Table
– Flashback Transaction History
– Flashback Database
– Better map of time to SCN for flashback query
LogMiner
– Automated Specification of Logs to Mine
– Support for Shared Server Configurations
– Fine Grained Supplemental Logging