Using Oracle Data Guard for Applications Disaster Recovery

Download Report

Transcript Using Oracle Data Guard for Applications Disaster Recovery

Jeff Slavitz
Independent Consultant
Oracle Applications DBA and Developer
[email protected]
What Would You Do???
 It’s Monday morning.
 Fatal production system hardware problem, or
production hardware location is now a smoking hole
 Expect Production to be down for an unforseen
amount of time
 It is year-end, you have a new manager and your senior
DBA is on vacation
Jeff Slavitz - NorCal OAUG Training Day 2010 v3
What is your backup situation?
 Do you have a cold backup? On-site or off-site?
 Do you have a hot backup. On disk? On tape? On
another accessible site? How recent is it?
 Will you be able to read your tape backups?
 How recent are the archive logs that are now on the
system that is down?
 Where are your archive log backups since the last hot
backup?
Jeff Slavitz - NorCal OAUG Training Day 2010 v3
Options
 Restore and clone latest backup to a test box. How




familiar is your DBA with RMAN recover? Is the box
big enough to run the company? Will interfaces work
(EDI, Payroll, …)
Rent or buy a box. How long would it take to find and
setup a new box?
How long will it take to get the company up and
running? Have you practiced this process or is this a
fire drill? How many P1s will you need?
How much data will the company lose?
What does the company do while Oracle is down?
Jeff Slavitz - NorCal OAUG Training Day 2010 v3
How Do You Feel?
How Does Your Manager Feel?
Jeff Slavitz - NorCal OAUG Training Day 2010 v3
Or…
 In 15 minutes you’ve fully switched Oracle Applications
over to a standby database
 You’ve tested the procedure with all DBAs
 The failover procedure is fully supported and
documented by Oracle
 Interfaces are tested on the standby database
 You run the entire company once a month for a short
period of time on the standby database
 You have a test standby database that DBAs practice on
and you use for reporting purposes
Jeff Slavitz - NorCal OAUG Training Day 2010 v3
How Do You Feel?
How Does Your Manager Feel?
Jeff Slavitz - NorCal OAUG Training Day 2010 v3
My Goal Today
 Tell you how you can be relaxed, not stressed, in a
disaster recovery situation
 What is Data Guard
 Describe added value Data Guard and Flashback
provide in addition to disaster recovery
 On a high level, review steps to implement Data Guard
and Flashback
 Lessons learned - what Metalink notes don’t tell you
 Please ask questions as I go along
 Apologies to the non-DBAs
Jeff Slavitz - NorCal OAUG Training Day 2010 v3
What is Data Guard?
 Data Guard is part of Oracle, not a separate
product, which provides a set of services that
create, maintain, manage, and monitor one or
more standby databases
 Data Guard maintains these standby databases as
transactionally consistent copies of the production
database
 Data Guard can switch any standby database to the
production role
Jeff Slavitz - NorCal OAUG Training Day 2010 v3
Without Data Guard
Oracle® Data Guard Concepts and Administration 10g Release 2
Jeff Slavitz - NorCal OAUG Training Day 2010 v3
With Data Guard
Oracle® Database High Availability Best Practices 10g Release 2 (10.2)
Jeff Slavitz - NorCal OAUG Training Day 2010 v3
Why Not Use Data Guard?
 Too expensive – not! You already have and license it if you




have Oracle Enterprise Edition. Nothing to purchase in order
to implement what we discuss today.
Too bleeding edge. Data Guard has been around since Oracle
9i and Oracle 8i as Standby Server.
Too difficult to implement. Well documented by Oracle and
completely supported. This presentation is a roadmap
through the process plus my own observations and
experiences.
No DR box. Use test box for standby. Doesn’t protect against
smoking hole but it’s better than nothing.
How often do you really need DR? Is once enough? Plus a
standby database can be used for other purposes.
Jeff Slavitz - NorCal OAUG Training Day 2010 v3
Why You Should Use Data Guard
 If your Production Oracle Applications instance went
away would anybody notice?
 Our job is to protect the company’s data
 Your manager will thank you – DR is important.
 Data Guard plus Flashback provide added value:



Instant clone
Reporting instance
Many exciting possibilities!
Jeff Slavitz - NorCal OAUG Training Day 2010 v3
What Is Flashback?
 Introduced in Oracle 9i
 Flashback is part of your Oracle database – no
licensing requirement
 Provides a rewind button for your data
 Flashback Database - return database to a previous
point in time
 Flashback Table - return a table to a previous point in
time
 Flashback Query – see data at a previous point in time
Jeff Slavitz - NorCal OAUG Training Day 2010 v3
Data Guard and Flashback Together
Many possibilities when used together …
Scenario:
Show stopper problem in Production
Need to test a patch or data fix with current production
data on test system ASAP
Old school: Clone
Time to complete: ??? Hours
Data Guard: Use standby database as an instant clone
Time to complete: 10 minutes
Jeff Slavitz - NorCal OAUG Training Day 2010 v3
Instant Clone
 Activate standby database
 Perform testing on standby database
 When testing complete, flashback standby database to
activation point in time
 Test again? Allows destructive testing.
 When all done, flashback and re-activate standby
database
Jeff Slavitz - NorCal OAUG Training Day 2010 v3
Reporting Database
Scenario:
You want a data warehouse which is a snapshot of
Production as of 9pm the day before. Users need access
via Oracle Applications and Discoverer
Old School: Daily clone
Data Guard: Use standby database as a reporting
database
Jeff Slavitz - NorCal OAUG Training Day 2010 v3
Reporting Database
 Activate standby database
 Users login to standby database with Oracle Applications,




Discoverer, or any other tool. Users can make changes in
Application though they won’t be saved after Flashback.
Archive logs accumulate on standby database
At 9pm each day flashback database to the way it was at
9pm the day before, apply archive logs created since then,
reopen database
Can have multiple standby databases – one for DR, one for
reporting, one for ???
Oracle 11i Active Data Guard allows read-only access to
standby database while it is being updated from primary
Jeff Slavitz - NorCal OAUG Training Day 2010 v3
Bumps Along the Way
 Time implementation so it’s not at year-end
 Bumpy transition into production – low Test database
activity is not an accurate picture of how Data Guard
and Flashback will work in Production
 3mb network was not enough for standby to keep up
with primary - needed 10mb just for Data Guard
 Do periodic full company test of standby system
 Data Guard and Flashback implementation are well
documented but there is a learning curve
Jeff Slavitz - NorCal OAUG Training Day 2010 v3
Implementing Data Guard and
Flashback
Jeff Slavitz - NorCal OAUG Training Day 2010 v3
Hardware Configuration
Primary
(California)
Standby
(Nevada)
Database
Database
45mb of which
8-10mb used by
Data Guard
Application
Jeff Slavitz - NorCal OAUG Training Day 2010 v3
Application
Software Configuration
 Database
 10.2.0.4
 Data Guard Physical Bundle patch 7936993
 Data Guard Logical Bundle patch 7937113
 Data Guard Broker Bundle patch 7936793
 Applications 11.5.10.2
Jeff Slavitz - NorCal OAUG Training Day 2010 v3
Data Guard Setup
Two Metalink notes:
 Note 216212.1: Business Continuity for Oracle
Applications Release 11i, Database Releases 9i and 10g
 Note 452056.1: Business Continuity for Oracle
Applications Release 12 on Database Release 10gR2
(earned my vote for best Metalink note ever)
 Use Release 12 note
 Better configuration, simpler and works fine with 11i
 Substitute APPL_TOP for Instance Top in text
Jeff Slavitz - NorCal OAUG Training Day 2010 v3
Logical vs Physical
Two types of standby databases
 Physical
 Byte-for-byte duplicate of the primary database
 Archived redo logs transferred from primary database are
directly applied to the standby database
 When standby in recovery mode it is mounted not open
 Logical
 Different structure than the primary database
 Oracle uses SQL statements to update standby
 Standby can be open while it is being updated
Today we are talking about Physical standby databases
Jeff Slavitz - NorCal OAUG Training Day 2010 v3
Maximum Protection Mode
 Three types of protection – Maximum Protection is
highest level of data security
 Primary database changes are not committed until it
has been confirmed that the data is available on at
least one standby database
 If Oracle determines that the redo data cannot be
transferred from the primary server to the standby
server, it will automatically stop the primary
database instance
 Data transmitted synchronously - network
implications
Jeff Slavitz - NorCal OAUG Training Day 2010 v3
Maximum Performance Mode
 Default protection mode
 The commit operation on the primary database is not
contingent upon the data being received by the
standby server
 If all of the standby servers become unavailable,
processing will continue on the primary database
 Data transmitted asynchronously
 This performance mode is what are discussing today
Jeff Slavitz - NorCal OAUG Training Day 2010 v3
Maximum Availability Mode
 Second highest level of data security
 Primary database changes are not committed until it
has been confirmed that the data is available on at
least one standby database
 If the standby database becomes unavailable for any
reason, the protection mode is temporarily lowered to
maximum performance until the problem has been
corrected
Jeff Slavitz - NorCal OAUG Training Day 2010 v3
Data Guard Installation Overview
 Develop a naming convention for your primary and






standby database servers e.g. PROD_<primary server>,
PROD_<standby server>
Create standby redo logs: one more than number of redo
logs, same size as redo logs, do not multiplex
Clone database to standby
Copy datafiles to standby with closed database or with
open database using RMAN (see note 753241.1, Configuring
Standby Database on R12 using RMAN Hot Backup)
Test primary and standby and listeners using tnsping
Clone application to standby
You are now ready to turn on Data Guard
Jeff Slavitz - NorCal OAUG Training Day 2010 v3
Primary init.ora ifile Parameters
# Global unique name
db_unique_name=PROD_CAMELDB1
# Flash recovery area. This is the default location for control files, online
# redo logs, archived redo logs, flashback logs, RMAN backups.
db_recovery_file_dest=/u01/oracle/flash_recovery_area
# First destination for archived redo logs
log_archive_dest_1 =
'LOCATION=USE_DB_RECOVERY_FILE_DEST MANDATORY‘
# Second destination for archived redo logs
log_archive_dest_2 = 'SERVICE=PROD_LNXDRDB1
valid_for=(online_logfiles,primary_role)
db_unique_name=PROD_LNXDRDB1 LGWR ASYNC=20480
OPTIONAL REOPEN=15 NET_TIMEOUT=30‘
Jeff Slavitz - NorCal OAUG Training Day 2010 v3
Primary init.ora ifile Parameters
# Limit on the total space available to the flash recovery area.
db_recovery_file_dest_size=100g
# In minutes, how long you want to keep flashback files. Since using
# guaranteed flashback really don't need this to be too big.
db_flashback_retention_target=120
# On/Off switch for sending logs to standby system
log_archive_dest_state_2 = defer
# Databases in Data Guard configuration using my instance naming
convention
log_archive_config='dg_config=(PROD_CAMELDB1,PROD_LNXDRDB1)‘
# Log gap detection and resolution when this database is the standby
fal_server = 'PROD_LNXDRDB1‘
fal_client = 'PROD_CAMELDB1'
Jeff Slavitz - NorCal OAUG Training Day 2010 v3
Start Data Guard
Start primary and second database and listeners
 Primary: startup
 Standby: startup mount
Turn on archive redo log transport on primary
 alter system set log_archive_dest_state_2=enable;
Put standby in recovery mode
 Recover managed standby database using current logfile
disconnect;
Confirm logs are shipping to standby
 Primary: alter system archive log current;
 Standby: Check archive directory or query v$archive_log
Jeff Slavitz - NorCal OAUG Training Day 2010 v3
Care and Feeding of Data Guard
 Check alert log of primary and standby on a regular basis
 Set Enterprise Manager to email you when an error shows
up in alert log
 Automatic archive log gap resolution usually works but
network connectivity issue can require you to manually fix
 Primary alert log will notify you when a gap exists:
FAL[client]: Failed to request gap sequence
GAP - thread 1 sequence 435-442
 Copy missing archive logs from primary to standby
 On standby: alter database register logfile …
Jeff Slavitz - NorCal OAUG Training Day 2010 v3
Flashback
 db_recovery_file_dest
 defines Flash recovery area disk location
 This is the default location for control files, online redo logs, archived redo
logs, flashback logs, RMAN backups
 Archive logs are placed in flashback area in sub-directory with date
 DO NOT delete files in flashback area manually; database calculates space
available based on what RMAN has done
 db_recovery_file_dest_size
 Specifies limit on the total space to be used by the flash recovery area
 Check space usage using v$flash_recovery_area_usage
 When recovery area is full database stops!
 db_flashback_retention_target
 Defines in minutes, how long you want to keep flashback files
 Since using guaranteed flashback don't need this to be too big
 To turn flashback on: mount database and alter database flashback on
Jeff Slavitz - NorCal OAUG Training Day 2010 v3
Switchover – Database Steps
 Seamless switch from primary to standby
 Switch existing primary to new standby:
 Shutdown application
 Confirm all logs received and applied on standby
 alter database commit to switchover to standby with session
shutdown;
 Recover managed standby database using current logfile disconnect
 Switch existing standby to new primary:
 alter database commit to switchover to primary;
 alter database open;
 alter system set log_archive_dest_state_2=enable;
 Confirm logs are shipping to standby by checking alert log, archive
directory or query v$archive_log
Jeff Slavitz - NorCal OAUG Training Day 2010 v3
Switchover – Application Steps
 Some of these steps are not in Metalink note but are required
 Clear application context: exec fnd_conc_clone.setup_clean
 Run adautocfg on database and application tiers
 Run cmclean.sql (note 134007.1) to avoid Output Post Processing
manager not starting
 Update fnd_concurrent_requests
 Change logfile_name and outfile_name to new APPLCSF path
 Change logfile_node_name and outfile_node_name to new
host name
 Update fnd_conc_req_outputs (for XML reports)
 Change file_name to new APPLCSF patch
 Change file_node_name to new host name
Jeff Slavitz - NorCal OAUG Training Day 2010 v3
Switchover – Application Steps
 Update wf_notification_attributes.text_value with new node







name
Update wf_item_attribute_values.text value with new node
name
Check for any other profile values that contain old node name
Clear Apache cache
Start application and send users new login url
Synchronize APPLCSF log and out files
Use rsync
 Faster than scp and only copies changed files
rsync -avz
$APPLCSF/out/${TWO_TASK}*/ applprod@lnxdrapp1:$APPLCSF/out/${TWO_TASK}*
Consider setting cron job to synchornize APPLCSF in case of
system failure
Jeff Slavitz - NorCal OAUG Training Day 2010 v3
Failover
 Immediate switch of primary to standby system
 Some archive log data may be lost due to network
latency
 On standby-soon-to-be-primary:
 Cancel database recovery
 alter database commit to switchover to primary
 Configure application as shown in Switchover
 Even with cron job synchronizing APPLCSF log and
out files you probably won’t get all report output
 Repair former primary system
Jeff Slavitz - NorCal OAUG Training Day 2010 v3
Failover – Sync Old Primary as New Standby
 Establish new standby by copying primary datafiles
using RMAN, cold backup or flashback. To use
flashback:
 Find the SCN when the existing primary database
became the primary:
select to_char(standby_became_primary_scn) from v$database;
 On new standby:




Startup mount
flashback database to scn <above #>
alter database convert to physical standby;
recover managed standby database using current logfile
disconnect;
Jeff Slavitz - NorCal OAUG Training Day 2010 v3
Instant Clone or Reporting Database
 On standby:
Cancel standby recovery
create restore point STBY_ON guarantee flashback database
Active and open database
Configure application as in Switchover
 Users can now login to the standby database through Oracle
Applications, Discoverer or anything.
 Apply patches, test, update data in standby database.
 To revert to restore point STBY_ON and do further testing as needed:
Shutdown immediate
Startup mount
Flashback database to guaranteed restore point STBY_ON
Alter database open
Jeff Slavitz - NorCal OAUG Training Day 2010 v3
Resume Standby Operation
On Standby:
 Shutdown application
 Shutdown database
 Startup mount
 Flashback database to restore point STBY_ON
 Alter database convert to physical standby
Jeff Slavitz - NorCal OAUG Training Day 2010 v3
Lessons Learned
 Script as much as possible – you will make typos under
time pressure
 Use Enterprise Manager to see how Data Guard affects
your system and tune accordingly
 Implement Data Guard and flashback at different
times
 Setup one test database using Data Guard for testing
and training
Jeff Slavitz - NorCal OAUG Training Day 2010 v3
For the Future
 Try Maximum Availability protection
 Use Enterprise Manager to manage Data Guard. This
requires Data Guard Broker implementation which
requires use of spfile.
 Completely script switchover and failover as shown in
note 452056.1
 Use RMAN to backup standby database only
 Upgrade database to 11G and experiment with Active
Data Guard. This allows you to create a read-only
database that is open and is constantly updated from
primary.
Jeff Slavitz - NorCal OAUG Training Day 2010 v3
For More Information
 Google (it knows everything)
 http://blogs.oracle.com/stevenchan
 Note 452056.1 – Business Continuity for Oracle Apps Release 12 on







Database 10GR2 (works fine for 11i)
Note 216212.1 – Business Continuity for Oracle Apps Release 11i,
Database Releases 9i and 10G
Note 753241.1 – Configuring Standby Database on R12 using RMAN Hot
Backup (works fine for 11i)
Note 805438.1 – How to Open Physical Standby For Read Write Testing
and Flashback
Data Guard Redo Transport & Network Best Practices Oracle 10GR2
White Paper
Oracle Data Guard Concepts and Administration manual
High Availability Best Practices manual
Backup and Recovery Reference manual
Jeff Slavitz - NorCal OAUG Training Day 2010 v3
Questions?
Want the latest version of this presentation?
www.OracleAppsPro.com
Jeff Slavitz
[email protected]
(415) 388 - 3003
Jeff Slavitz - NorCal OAUG Training Day 2010 v3