Inside the Interconnect
Download
Report
Transcript Inside the Interconnect
Data Guard
Basics
Julian Dyke
Independent Consultant
Web Version - February 2008
1
© 2008 Julian Dyke
juliandyke.com
Agenda
2
© 2008 Julian Dyke
Data Guard
The Theory
The Reality
juliandyke.com
Data Guard
The Theory
3
© 2008 Julian Dyke
juliandyke.com
Data Guard
Reasons for Deployment
4
Site Failures
Power failure
Air conditioning failure
Flooding
Fire
Storm damage
Hurricane
Earthquake
Terrorism
Sabotage
Plane crash
Planned Maintenance
HUMAN ERROR
© 2008 Julian Dyke
juliandyke.com
Data Guard
Standby Database
Primary Database
Standby Database
Primary
Standby
Instance
Database
Site 1
5
© 2008 Julian Dyke
Redo
Instance
Database
Site 2
juliandyke.com
Data Guard
Physical Standby
6
Physical Standby
Technology introduced in Oracle 7.2
Marketed as Data Guard in Oracle 8.1.7 and above
Standby is identical copy of primary database
Redo changes
transported from primary to standby
applied on standby (Redo Apply)
Can switch operations to standby
Planned (switchover / switchback)
Unplanned (failover)
Failover time dependent on various factors
Rate of redo generation / size of redo logs
Redo transport / apply configuration
© 2008 Julian Dyke
juliandyke.com
Data Guard
Logical Standby
7
Introduced in Oracle 9.2
Subset of database objects
Redo copied from primary to standby
Changes converted into logical change records (LCR)
Logical change records applied on standby (SQL Apply)
Standby database can be opened for updates
Can modify propagated objects
Can create new indexes for propagated objects
May need larger system for logical standby
LCR apply can be less efficient than redo apply
Array updates on primary become single row updates on
standby
© 2008 Julian Dyke
juliandyke.com
Data Guard
Protection Modes
8
Three protection modes:
Maximum protection - zero data loss
Redo synchronously transported to standby database
Redo must be applied to at least one standby before
transactions on primary can be committed
Processing on primary is suspended if no standby is
available
Maximum availability - minimal data loss
Similar to maximum protection mode
If no standby database is available processing
continues on primary
Maximum performance (default)
Redo asynchronously shipped to standby database
If no standby database is available processing
continues on primary
© 2008 Julian Dyke
juliandyke.com
Data Guard
Redo Log Shipping
9
ARCH background process
Copies completed redo log files to standby
LGWR background process - modes are:
ASYNC - asynchronous
Oracle 10.1 and below
redo written by LGWR to dedicated area in SGA
read from SGA by LNSn background process
Oracle 10.2 and above
redo written by LGWR to local disk
read from disk by LNSn background process
SYNC - synchronous
Redo written to standby by LGWR - modes are:
AFFIRM - wait for confirmation redo written to disk
NOAFFIRM - do not wait
© 2008 Julian Dyke
juliandyke.com
Data Guard
ARCH Redo Transmission
Primary Database
Primary
Database
Standby Database
LGWR
RFS
Online
Redo
Log
Standby
Redo
Log
ARC0
ARC1
MRP
LSP
Standby
Database
ARCn
LOG_ARCHIVE_DEST_1
Archived
Redo
Logs
10
© 2008 Julian Dyke
Archived
Redo
Logs
juliandyke.com
Data Guard
LGWR (ASYNC) Redo Transmission
Primary Database
Primary
Database
LGWR
Online
Redo
Log
Standby Database
RFS
LNSn
ARCn
MRP
LSP
Standby
Database
Standby
Redo
Log
ARCn
LOG_ARCHIVE_DEST_1
Archived
Redo
Logs
11
© 2008 Julian Dyke
Archived
Redo
Logs
juliandyke.com
Data Guard
LGWR (SYNC) Redo Transmission
Primary Database
Primary
Database
LGWR
LNSn
Standby Database
RFS
Online
Redo
Log
Standby
Redo
Log
ARCn
ARCn
MRP
LSP
Standby
Database
LOG_ARCHIVE_DEST_1
Archived
Redo
Logs
12
© 2008 Julian Dyke
Archived
Redo
Logs
juliandyke.com
Data Guard
Role Transitions
13
There are two types of role transition
Switchover
Planned failover to standby database
Original primary becomes new standby
Original standby becomes new primary
No data loss
Can switchback at any time
Failover
Unplanned failover to standby database
Original standby becomes new primary
Original primary may need to be rebuilt
Possible data loss
© 2008 Julian Dyke
juliandyke.com
Data Guard
Switchover
Before Switchover
Site1
Site2
Site1
Site2
Primary
Physical
Standby
Physical
Standby
Primary
Instance
Instance
Instance
14
After Switchover
Redo
Redo
Instance
Database
Database
Database
Database
Primary
Database
Standby
Database
Standby
Database
Primary
Database
© 2008 Julian Dyke
juliandyke.com
Data Guard
Failover
Before Failover
Site1
Site2
Site1
Site2
Primary
Physical
Standby
Primary
Physical
Primary
Standby
Instance
Instance
Instance
15
After Failover
Redo
Redo
Instance
Database
Database
Database
Database
Primary
Database
Standby
Database
Unavailable
Standby
Primary
Database
© 2008 Julian Dyke
juliandyke.com
Data Guard
Read-Only Mode
16
Physical standby database can be opened in read-only mode
(Managed) Recovery must be suspended
Reports can use temporary tablespaces
Sorts
Temporary tables
Reports cannot modify permanent objects
Failover times may be affected
Suspended redo must be applied
© 2008 Julian Dyke
juliandyke.com
Data Guard
Delayed Redo Application
17
Delay in redo application can be configured
Redo is transported immediately
Provides protection against site failure
Redo is not applied immediately
Provides protection against human error
Increases potential failover times
In Oracle 10.1 and above flashback database can be used as
an alternative to delayed redo application
© 2008 Julian Dyke
juliandyke.com
Data Guard
Data Guard Broker
18
Introduced in Oracle 9.2
Stable in Oracle 10.2 and above
Managed using DGMGRL utility
Contains Data Guard configuration
Additional layer of complexity
Used by Enterprise Manager to manage standby
Mandatory for some new functionality e.g.
Fast Start Failover
© 2008 Julian Dyke
juliandyke.com
Data Guard
Fast Start Failover
Primary
Observer
Node 1
Standby
Node 2
Site3
19
Database
Database
Site1
Site2
© 2008 Julian Dyke
juliandyke.com
Data Guard
Fast Start Failover
20
Detects failure of primary database
Automatically fails over to nominated standby database
Requirements include
Flashback logging must be configured
DGMGRL must be used
Observer process running in third independent site
Highly available in Oracle 11.1 and above
MAXIMUM AVAILABILITY protection mode
Standby database archive log destination must be
configured as LGWR SYNC
MAXIMUM PERFORMANCE protection mode
Oracle 11.1 and above
Primary database can potentially be reinstated automatically
Using flashback logs
© 2008 Julian Dyke
juliandyke.com
Data Guard
Fast Start Failover
21
Advantages
No interconnect network required between sites
No storage network required between sites
RAC licences not required if each site is a single-instance
Disadvantages
Active / Passive
Requires Enterprise Edition licence
Remaining infrastructure must also failover
Network
Application tier
Clients
© 2008 Julian Dyke
juliandyke.com
Data Guard
Oracle 11g New Features
22
Snapshot Standby
Standby can be converted to snapshot standby
Can be opened in read-write mode (for testing)
Redo transport continues
Redo apply delayed
Standby can subsequently be converted back to physical
standby
Active Data Guard
Separately licensed option
Updates applied to primary
Changes can be read immediately on standby databases
Standby database can be opened in read-only mode
Redo can continue to be applied
© 2008 Julian Dyke
juliandyke.com
Data Guard
Licensing
23
Standby database nodes must by fully licensed
Same metric as primary (named user, CPU etc)
Standard Edition
Cannot use Data Guard
Use user-defined scripts to transport redo
Use Automatic Recovery to apply redo
Manually resolve archive log gaps
Enterprise Edition
Use Managed Recovery to apply redo
Use Fetch Archive Logging to resolve archive log gaps
Additional licenses required for Active Data Guard
© 2008 Julian Dyke
juliandyke.com
Data Guard
Alternatives
24
Standard Edition
Manual log shipping using scripts
SAN level Replication technologies
Netapp SnapMirror, MetroCluster
EMC SRDF, Mirrorview
HP StorageWorks
Redo log replication technologies
Quest Shareplex
© 2008 Julian Dyke
juliandyke.com
Data Guard
The Reality
25
© 2008 Julian Dyke
juliandyke.com
Data Guard
The Reality
26
Many sites run physical standbys
Well proven technology
Spare capacity on standby often used for development or
testing during normal operations
Relatively few sites run a logical standby
Streams is much more popular
Many sites enable flashback logging
In both development and production environments
Very few using Automatic Failover
Very few sites working with Oracle 11g yet
Consequently none using Active Data Guard
© 2008 Julian Dyke
juliandyke.com
Data Guard
The Reality
27
Failover times
Normally dependent on management decisions
Usually some investigation before failover
Time to failover database is minimal (5-10 minutes)
Time to failover infrastructure can be hours
Network configuration
DNS
Application / web servers
Clients
Failover SLAs often up to 48 hours
Rebuild times
Can take minutes using flashback logging
Can take much longer depending on reason for failover
© 2008 Julian Dyke
juliandyke.com
Thank you for your interest
28
References
http://www.juliandyke.com/References/References.html
Questions
[email protected]
© 2008 Julian Dyke
juliandyke.com