Oracle 10gR2 Availability
Oracle Infrastructure
Overview
CC Yu 余苓華
Senior Sales Consultant
Oracle Corporation
Agenda
Oracle 10g Overview
HA Solution – Real Application Clusters (RAC)
Disaster Recovery Solution – Data Guard
Oracle MAA Architecture
(Maximum Availability Architecture)
Customer's Practice
Q&A
Data Mirroring with ASM
ASM mirrors data across inexpensive modular storage arrays
Automatically remirrors when a disk or array fails
Designed to tolerate failures
Failure Resiliency using Low Cost Storage
Flashback Database
[Diagram: on each disk write, the new block version goes to the data files and the old block version goes to the Flashback Log]
A new strategy for point in time recovery
Flashback Log captures old versions of changed blocks
– Think of it as a continuous backup
– Replay the log to restore the database to a point in time
– Restores just the changed blocks
It’s fast - recover in minutes, not hours
It’s easy - single command restore
Flashback Database to ‘2:05 PM’
“Rewind” button for the Database
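The rewind idea can be illustrated with a toy model. The Python sketch below is not Oracle code: the names `ToyDatabase`, `write_block`, and `flashback_to` are hypothetical, and it simplifies heavily (for instance, it logs every write and does not model redo). It only shows how keeping old block versions lets a database be rewound by restoring just the blocks that changed.

```python
class ToyDatabase:
    """Toy model: before a block is overwritten, its old version is logged."""

    def __init__(self):
        self.blocks = {}          # block id -> current contents ("data files")
        self.flashback_log = []   # (time, block id, old contents or None)

    def write_block(self, time, block_id, contents):
        # Capture the old version first, then write the new version.
        self.flashback_log.append((time, block_id, self.blocks.get(block_id)))
        self.blocks[block_id] = contents

    def flashback_to(self, target_time):
        # Undo, newest first, every write made after target_time.
        while self.flashback_log and self.flashback_log[-1][0] > target_time:
            _, block_id, old = self.flashback_log.pop()
            if old is None:
                self.blocks.pop(block_id, None)   # block did not exist yet
            else:
                self.blocks[block_id] = old


db = ToyDatabase()
db.write_block(1, "A", "a1")
db.write_block(2, "B", "b1")
db.write_block(3, "A", "a2-bad")   # the mistaken change
db.flashback_to(2)                 # rewind to time 2
print(db.blocks)                   # {'A': 'a1', 'B': 'b1'}
```

Only block A is touched by the rewind; block B, which did not change after time 2, is left alone.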
Flashback Error Correction
Recovery at all levels
Database Level
– Flashback Database restores the whole database to a point in time
– Uses Flashback Logs
Table Level
– Flashback Table restores rows in a set of tables to a point in time
– Uses UNDO in database
– Flashback Drop restores a dropped table or an index
– Recycle bin for DROPs
Row Level
– Restore individual rows
– Uses Flashback Query
[Diagram labels: Database, Customer, Order]
Restore Points (10gR2)
10gR2 allows the declaration of named restore points
Create Restore Point Known_good_point;
…..
Flashback Database to Restore Point Known_good_point;
– Easy way to return to a known time
– Guaranteed restore points ensure that flashback logs are retained
10gR2 can flashback through:
– Previous database recovery and open resetlogs
– Switchover to standby
– Previous Flashback DB
Flashback and incremental backup apply can now be combined to maintain a reporting database
– Similar to split mirror and merge mirror
Flashback for All Users
END USER
• Flashback Query
• Flashback Row History
DEVELOPER
• Flashback Row History
• Flashback Transaction History
• Flashback Table
DATABASE ADMIN
• Flashback Database
• Flashback Drop
SYSTEM ADMIN
• Data Guard
RMAN is Oracle’s Recommended Database Backup Tool
RMAN’s deep integration with the database engine makes it the best tool for DB backup & recovery
RMAN is used at thousands of enterprise sites
Smart
– Sophisticated backup and recovery strategies
Fast
– Optimized backup to disk for fastest recovery
– No extra redo during backup
– Block level incremental backup
Reliable
– Block contents are validated during backup
Easy
– Simple management with Enterprise Manager
– Supports over 20 Media Managers: Veritas, Legato, Tivoli, HP, Oracle Backup, etc.
[Diagram labels: Enterprise Manager & 3rd Party Tools, Oracle Database, Tape Libraries]
Oracle Secure Backup – The Lowest Cost Tape Backup Manager
Oracle Secure Backup is ideal for customers seeking a low cost alternative to complex backup products
Best integrated end-to-end backup of Oracle Databases
– Media manager for RMAN backup and recovery of Oracle9i and 10g databases to tape
– Fastest Database Backup on the market
Back up Oracle Home, App Server and other file systems
Oracle Secure Backup includes:
– Centralized management of network backups
– Scalability to low 100’s of servers, 10’s of millions of files
– Easy management through Enterprise Manager
“Express” edition bundled with Oracle Database – replaces LSSV
Single vendor support
[Diagram labels: File Systems (Linux, Unix, Windows, Filers); Databases; Supports popular tape libraries & drives]
Oracle 10g Manageability Out of Box
Installation
– Fast, lightweight install including Automated Pre and Post Install Steps
– Installation Media Optimization
– Easy, fast client install
– Enhanced silent install for ISVs
Simplified Creation & Configuration
– Pre-configured Database
– 90% Reduction in Configuration parameters
– Automatic setup of common tasks, backups, stats gathering etc
– Out of Box Database Console
Data Load
– Data Pump
– Cross-Platform Transportable TS
– Restartable Data Load
Ongoing System Management
– Automatic Storage Management
– Automatic Shared Memory Tuning
– Advisors Out of the Box: Segment Advisor, Undo Advisor, Redo Log file Size Advisor
– Automatic Undo Retention
– Alert generation, out of the box thresholds
– Resource Manager
– Automatic Backup Management
– ..and a lot more
* Not a comprehensive list
Automatic Database Diagnostic Monitor (ADDM)
Self-Diagnostic Engine in the Database
– Integrates all components together
– Automatically provides database-wide performance diagnostics, including RAC
– Real-time results using the Time Model
– Provides impact and benefit analysis, non-problem areas
– Provides information vs. raw data
– Runs proactively out of the box, reactively when required
[Diagram: Intelligent Infrastructure underlying Application & SQL Management, Storage Management, System Resource Management, Backup & Recovery Management, Space Management, and Database Management]
How Does ADDM Work?
Top Down Analysis Using AWR Snapshots
Throughput centric - Focus on reducing ‘DB time’
Classification Tree - based on decades of Oracle performance tuning expertise
Real-time results
– Don’t need to wait hours to see the results
Pinpoints root cause
– Distinguishes symptoms from the root cause
Reports non-problem areas
– E.g. I/O is not a problem
[Diagram: Snapshots in the Automatic Workload Repository feed the Self-Diagnostic Engine (Automatic Diagnostic Engine), which reports High-load SQL (SQL Advisor), IO / CPU issues (System Resource Advice), and RAC issues (Network + DB config Advice)]
With Oracle 10g and Diagnostics Pack….
System is maxed out on CPU with most waits in the concurrency wait class.
ADDM Findings
ADDM has automatically identified that high CPU utilization was caused by repeated hard parses ……
ADDM Findings
…and recommends a solution as well as explaining how it diagnosed the problem
Good Performance Page
Once the solution is applied, CPU utilization falls dramatically
..and waits disappear
Life Before and After ADDM
Scenario: Hard parse problems
Before
Examine system utilization
Look at wait events
Observe latch contention
– See wait on shared pool and library cache latch
Review v$sysstat
– See “parse time elapsed” > “parse time cpu” and #hard parses greater than normal
Identify SQL by..
– Identifying sessions with many hard parses and tracing them, or
– Reviewing v$sql for many statements with the same plan hash value
Examine and review SQL
– Identify “hard parse” issue by observing that the SQL contains literals
Enable cursor sharing
Oracle10G
Review ADDM recommendations
ADDM recommends use of cursor_sharing
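To see why literals cause repeated hard parses, consider the rough Python sketch below. It is not how Oracle parses SQL: `normalize_literals` is a hypothetical helper using a simple regular expression, and the table and column names are made up. It only illustrates that statements differing solely in their literals collapse to one shareable text once the literals become bind-style placeholders.

```python
import re

# Matches single-quoted string literals or standalone numbers.
_LITERAL = re.compile(r"'(?:[^']|'')*'|\b\d+(?:\.\d+)?\b")


def normalize_literals(sql):
    """Replace each literal with a numbered bind-style placeholder."""
    counter = 0

    def repl(match):
        nonlocal counter
        counter += 1
        return f":b{counter}"

    return _LITERAL.sub(repl, sql)


statements = [
    "SELECT name FROM customers WHERE id = 101",
    "SELECT name FROM customers WHERE id = 102",
    "SELECT name FROM customers WHERE id = 103",
]

distinct_before = len(set(statements))
distinct_after = len({normalize_literals(s) for s in statements})
print(distinct_before, distinct_after)   # 3 1
```

Three distinct statement texts would each need their own parse; after normalization there is one text that all three executions can share.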
Oracle 10g
Storage - ASM
Human Error Protection - Flashback
Backup & Recovery – RMAN , Secure Backup
Manageability – Enterprise Management tool
HA Solution – Real Application Clusters (RAC)
Highest Availability at Lowest Cost
Traditionally High Quality = High Cost
– High quality systems were built by combining high quality, high cost parts
– Mainframe model
Oracle enables a new model
– Oracle’s vision is to attain the highest possible availability using low cost computers and low cost storage
High Quality AND Low Cost
Single Instance
Instance
Server
Listener
Database
Clients
What do you do when there is more than one?
Real Application Clusters
[Diagram: Listeners in front of Instance 1 on Server 1, Instance 2 on Server 2, and Instance 3 on Server 3, all sharing one Database on Shared Disk]
Client-Side Connection Load Balancing
sales.us.acme.com=
  (DESCRIPTION=
    (ADDRESS_LIST=
      (LOAD_BALANCE=on)
      (ADDRESS=(PROTOCOL=tcp)(HOST=sales1)(PORT=1521))
      (ADDRESS=(PROTOCOL=tcp)(HOST=sales2)(PORT=1521)))
    (CONNECT_DATA=
      (SERVICE_NAME=sales.us.acme.com)))
[Diagram labels: Clients, Listeners]
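The effect of LOAD_BALANCE=on in the entry above can be sketched in Python. This is an illustration of the behaviour, not Oracle Net code: the function names and the `is_up` callback are hypothetical, and falling through to the next address assumes connect-time failover is in effect for the address list.

```python
import random

ADDRESSES = [("sales1", 1521), ("sales2", 1521)]  # hosts from the entry above


def connection_order(addresses, rng=random):
    """Return the addresses in a random order, as LOAD_BALANCE=on implies."""
    order = list(addresses)
    rng.shuffle(order)
    return order


def connect(addresses, is_up, rng=random):
    """Try addresses in random order; return the first one that is reachable."""
    for address in connection_order(addresses, rng):
        if is_up(address):
            return address
    raise ConnectionError("no listener reachable")


# Usage: with sales1 down, every connection lands on sales2.
print(connect(ADDRESSES, is_up=lambda a: a[0] != "sales1"))   # ('sales2', 1521)
```

Because each client picks its own random order, connection requests spread across the listeners without any coordination between clients.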
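Server Side Connection Load Balancing
[Diagram: an Application Server asks the LISTENER for service RAC; over the network the listener can hand the connection to RAC1 on N1, RAC2 on N2, or RAC3 on N3 of the RAC Database]
Connection Load Balancing
[Diagram: Clients connect through the Listeners; the LISTENER resolves service RAC to RAC1 on N1, RAC2 on N2, or RAC3 on N3 of the RAC Database]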
Oracle Database 10g Release 2 – Real Application Clusters
Improved Robustness
– Cluster Verification Utility
– High Availability of Cluster Required Files
High Availability API for integrated application availability
Load Balancing Advisory
Runtime Connection Load Balancing
Improved performance
Certified to 100 nodes
[Diagram labels: ERP, CRM, DW]
What if there are Multiple Applications?
Automatic Workload Management
Application workloads can be defined as Services
– Individually managed and controlled
– Assigned to instances during normal startup
– On instance failure, automatic re-assignment
– Service performance individually tracked
– Finer grained control with Resource Manager
– Integrated with other Oracle tools / facilities (e.g. Scheduler, Streams)
Automatic Workload Management: server allocation scenarios
– Normal Server Allocation: Order Entry, Spare, Supply Chain
– End of Quarter: Order Entry, Supply Chain
– Normal Server Allocation: Order Entry, Spare, Supply Chain
– Server Fails: Order Entry, Spare, Supply Chain
– Reallocate Spare server to Order Entry: Order Entry, Supply Chain
– Failed Server Restored: Order Entry, Spare, Supply Chain
– Application Resource Requirements Grow: Order Entry, Order Entry, Supply Chain, Supply Chain
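The "Server Fails" and "Reallocate Spare server" scenarios can be sketched as a small Python function. This is only a model of the idea under stated assumptions: the server names (`s1` to `s5`), the pool sizes, and the function `reallocate_on_failure` are hypothetical and do not correspond to any Oracle interface.

```python
def reallocate_on_failure(allocation, failed_server):
    """Move a spare server to the service that lost `failed_server`.

    `allocation` maps a service name (or "Spare") to a list of server names.
    Returns a new allocation; the input is left unchanged.
    """
    new_alloc = {service: list(servers) for service, servers in allocation.items()}
    for service, servers in new_alloc.items():
        if failed_server in servers:
            servers.remove(failed_server)
            if service != "Spare" and new_alloc.get("Spare"):
                servers.append(new_alloc["Spare"].pop(0))
            break
    return new_alloc


normal = {
    "Order Entry": ["s1", "s2"],
    "Spare": ["s3"],
    "Supply Chain": ["s4", "s5"],
}
after_failure = reallocate_on_failure(normal, "s2")
print(after_failure)
# {'Order Entry': ['s1', 's3'], 'Spare': [], 'Supply Chain': ['s4', 's5']}
```

Order Entry keeps two servers after the failure because the spare takes over; once the failed server is repaired it can rejoin the pool as the new spare.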
Use EM to Define Services
Use EM to Manage Services
Topology View – Grid Control
Load Balancing Advisory
Load Balancing Advisory is an advisory for balancing work across RAC instances.
Load balancing advice
– Is available to ALL applications that send work.
– Directs work to where services are executing well and resources are available.
– Adjusts distribution for different power nodes, different priority and shape workloads, changing demand.
– Stops sending work to slow, hung, failed nodes early.
Runtime Connection Load Balancing with JDBC, ODP.NET
[Diagram: CRM requests a connection from the connection cache, which splits requests 60% / 30% / 10% by reported load: Instance 1 “CRM is bored”, Instance 2 “CRM is very busy”, Instance 3 “CRM is busy”]
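A connection cache that follows such advice behaves like a weighted random choice. The Python sketch below is illustrative only and is not the JDBC or ODP.NET implementation; the mapping of the 60%, 30%, and 10% shares to instances is an assumption (the least busy instance is taken to receive the largest share), and `pick_instance` is a hypothetical name.

```python
import random
from collections import Counter

# Assumed mapping of the slide's percentages to instances: the least busy
# instance receives the largest share.
ADVISORY = {"Instance 1": 60, "Instance 3": 30, "Instance 2": 10}


def pick_instance(advisory, rng=random):
    """Pick an instance with probability proportional to its advised share."""
    names = list(advisory)
    weights = [advisory[name] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]


rng = random.Random(42)
counts = Counter(pick_instance(ADVISORY, rng) for _ in range(10_000))
print(counts.most_common())
```

Over many requests the counts approach the advised shares, so work drifts toward the instance that is serving CRM well and away from the one that is very busy.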
Commercial Grids and Availability
Grid pools standard low cost nodes and modular disk arrays
Perfect for RAC HA
Failover can happen to any node on the grid
Grid load balancing will redistribute load over time
Designed to Tolerate Failures
Disaster Recovery Solution – Data Guard
Introducing Oracle Data Guard
Oracle’s disaster recovery solution for Oracle data
Automates the creation and maintenance of one or more synchronized copies (standby) of the production (or primary) database
If the primary database becomes unavailable (disasters, maintenance), a standby database can be activated and assume the primary role
Feature of Oracle Database Enterprise Edition (EE)
– Available at no extra cost
– Primary and standby databases need to be licensed EE
Data Guard Configuration
[Diagram: the Broker manages a Primary Database at the Primary Site and Standby Databases at Standby Site A and Standby Site B]
Managed as a single configuration
Primary and standby databases can be Real Application Clusters or single-instance Oracle
Up to nine standby databases supported in a single configuration
Oracle Data Guard Architecture
[Diagram: the Production Database ships redo (sync or async) over the network to a Physical Standby Database (Redo Apply, used for backup) and to a Logical Standby Database (redo transformed to SQL, SQL Apply, open for reports); the sites are labeled Dallas, Chicago, and Boston, and the configuration is managed by the Broker]
Switchover and Failover
Primary and Standby role transitions
Switchover
– Planned role reversal
– No database reinstantiation required
– Used for maintenance of OS or hardware
Failover
– Unplanned failure (e.g. disasters) of primary
– Primary database must be reinstantiated / flashed back [10g]
– Automatic failover possible [10g]
Initiated using simple SQL / GUI interface
Data Guard automates the processes involved
Flexible Data Protection Modes
Protection Mode | Risk of Data Loss | Redo Shipment
Maximum Protection | Zero Data Loss, Double Failure Protection | Synchronous redo shipping to 2 sites
Maximum Availability | Zero Data Loss, Single Failure Protection | Synchronous redo shipping
Maximum Performance | Minimal data loss, usually 0 to a few seconds | Asynchronous redo shipping
Balance cost, availability, performance, and transaction protection
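The difference between synchronous and asynchronous redo shipping can be shown with a toy model. The Python sketch below is an illustration, not Data Guard code: `ToyPrimary` and its methods are hypothetical, and it does not model how Maximum Protection and Maximum Availability differ when the standby is unreachable.

```python
class ToyPrimary:
    """Toy primary that ships redo records to one standby."""

    def __init__(self, synchronous):
        self.synchronous = synchronous
        self.standby_redo = []   # redo already received by the standby
        self.pending = []        # redo committed locally but not yet shipped

    def commit(self, record):
        if self.synchronous:
            self.standby_redo.append(record)   # shipped before commit returns
        else:
            self.pending.append(record)        # shipped later, in background

    def ship_pending(self):
        self.standby_redo.extend(self.pending)
        self.pending.clear()

    def lost_on_failure(self):
        """Records the standby would be missing if the primary died now."""
        return list(self.pending)


sync_primary = ToyPrimary(synchronous=True)
sync_primary.commit("t1")
sync_primary.commit("t2")
print(sync_primary.lost_on_failure())    # []

async_primary = ToyPrimary(synchronous=False)
async_primary.commit("t1")
async_primary.ship_pending()
async_primary.commit("t2")               # not shipped yet
print(async_primary.lost_on_failure())   # ['t2']
```

With synchronous shipping nothing committed is missing on the standby, at the cost of waiting for it on every commit; with asynchronous shipping commits return sooner, and only the last unshipped redo is at risk.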
Data Protection Modes (contd.)
[Diagram: Maximum Protection shows primary P with standbys S1 and S2; Maximum Availability and Maximum Performance each show primary P with standby S1]
Low Cost No Compromise Disaster Recovery
[Diagram: the Production Database ships transactions (Real Time Apply, no delay) to the Standby Database; each side keeps a Flashback Log; the standby supports reporting on real-time data, and some nodes are used for other computing]
Flashback DB removes need to delay apply of logs to correct errors
Flashback DB removes the need to reinstantiate primary on failover
Real-time log apply enables real-time reporting on standby
Data Guard works transparently across GRID clusters
– Standby can use fewer CPU resources than primary
Fast-Start Failover
If the primary database is lost in a disaster, Data Guard automatically fails over to a previously-chosen, synchronized standby, without requiring any manual steps to invoke the failover
Used in a Broker configuration (DGMGRL or Enterprise Manager), with a new Broker capability – the Observer, which monitors the environment, and triggers a failover if necessary
Used in Maximum Availability protection mode, and with Flashback Database – no data loss incurred
After failover completes, the Broker automatically reinstates the old primary database as a new standby database
Specialized events generated to facilitate post-failover tasks such as automatic application failover
Fast-Start Failover: sequence of events
[Diagram labels: Primary Site, Standby Site, Observer]
1. Data Guard in steady state – transmitting redo
2. Observer monitoring state of the configuration
3. Disaster strikes the primary – connections lost
4. Observer <=> primary connection times out (timeout threshold configurable)
5. Observer asks target standby if it is ready to fail over
6. Observer begins Fast-Start Failover
7. Target standby automatically becomes new primary
8. After old primary is repaired, Observer re-establishes connection
9. Observer automatically reinstates old primary to be a new standby
10. Redo transmission starts from new primary to new standby
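The ten steps can be condensed into a small state machine. The Python sketch below is a simplified illustration rather than the Observer's actual logic: the state names, the `observer_step` function, and the 30-second threshold used in the example are assumptions made for the sketch.

```python
def observer_step(state, primary_reachable, seconds_unreachable,
                  threshold_seconds, standby_ready):
    """Return the next configuration state as seen by the observer.

    States: "STEADY" (steps 1-2), "FAILED_OVER" (steps 6-7),
    "REINSTATED" (steps 8-10).
    """
    if state == "STEADY":
        timed_out = (not primary_reachable
                     and seconds_unreachable >= threshold_seconds)
        if timed_out and standby_ready:
            return "FAILED_OVER"      # standby becomes the new primary
        return "STEADY"
    if state == "FAILED_OVER":
        if primary_reachable:
            return "REINSTATED"       # old primary becomes a new standby
        return "FAILED_OVER"
    return state


state = "STEADY"
state = observer_step(state, True, 0, 30, True)     # steps 1-2: stays STEADY
state = observer_step(state, False, 10, 30, True)   # step 3: not timed out yet
state = observer_step(state, False, 30, 30, True)   # steps 4-7: FAILED_OVER
state = observer_step(state, True, 0, 30, True)     # steps 8-10: REINSTATED
print(state)   # REINSTATED
```

Note that the failover only starts when both conditions hold: the timeout has expired and the target standby reports that it is ready.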
SQL Apply – Rolling Database Upgrades
1. Initial SQL Apply Config: clients on A, redo shipped from A to B, both at Version X
2. Upgrade node B to X+1: A stays at X, redo from A waits in the Logs Queue
3. Run in mixed mode to test: A at X, B at X+1, redo shipped from A to B
4. Switchover to B, upgrade A: A and B both at X+1
Applies to: Patch Set Upgrades, Major Release Upgrades, Cluster Software & Hardware Upgrades
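The order of the rolling upgrade can be written down as a short Python generator. This is only a sketch of the sequence shown on the slide: representing versions as integers, the step labels, and the function name `rolling_upgrade` are assumptions for illustration.

```python
def rolling_upgrade(version):
    """Yield (step, A's version, B's version, serving node) for each step."""
    a = b = version
    yield ("initial SQL Apply config", a, b, "A")
    b = version + 1                      # B is upgraded while A serves clients
    yield ("upgrade B", a, b, "A")
    yield ("run in mixed mode to test", a, b, "A")
    a = version + 1                      # after switchover, A is upgraded
    yield ("switchover to B, upgrade A", a, b, "B")


for step in rolling_upgrade(10):
    print(step)
```

At every step one node keeps serving clients, and the two nodes never differ by more than one version.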
Oracle MAA Architecture (Maximum Availability Architecture)
Data Guard and RAC
Data Guard and Real Application Clusters are complementary and should be used together for Maximum Availability Architecture
Real Application Clusters provides high availability
– Provides rapid and automatic recovery from node failures or an instance crash
– Provides low cost, application transparent scale-out using commodity hardware
Data Guard provides disaster protection and prevents data loss
– By maintaining transactionally consistent copies of primary database
– Protects against disasters, data corruption and user errors
– Does not require expensive and complex HW/SW mirroring
Maximum Availability Architecture (MAA)
Operational Practices are key
– Technology alone is not enough
MAA is a blueprint for achieving HA & DR
– Tested, validated, and documented best practices
– Database, Storage, Cluster, Network
20 person year effort
otn.oracle.com/deploy/availability
Maximum Availability = Unbreakable Architecture + Best Practices
[Diagram: M.A.A. – How to Prevent, Tolerate, & Recover From Outages]
Data Guard + RAC Configuration
[Diagram: a RAC Primary Database at the Primary Site and a RAC Standby Database at the Standby Site, linked by Data Guard and managed by the Broker]
Data Guard + RAC: end-to-end Data Protection and HA
Basis of Maximum Availability Architecture
Managed as a single configuration
Customer's Practice
Usage Examples
Example A
[Diagram labels: Instance 1, Instance 2 - RAC, Database]
Example B
Maximize primary and standby resources
[Diagram labels: Chicago, Dallas; two Primary Databases and two Standby Databases]
Example C
Standby machine must be powerful enough to support multiple production instances after switchover / failover
[Diagram labels: Primary Site A, Primary Site B, Primary Site C, each with a Primary Database; Standby Site with three Standby Databases]
Usage Examples
Example D
Primary Site
Standby Site A
– Physical Standby
– Synchronous transport
– LAN attached
– Used to offload backups
– First choice for switchover candidate
Standby Site B
– Logical Standby
– Synchronous transport
– LAN attached
– Used to offload reporting
Standby Site C
– Physical Standby
– Asynchronous transport
– WAN attached
– Provides DR and data protection
Redo Apply – Oracle Database 10g
E-Business Suite, Global Single Instance
– With ~300 CPUs supporting production database, probably one of the largest ERP deployments in the world
– 8 TB database
– 150 transactions per sec
– 7,000 concurrent users generating 8 MB of data/second
Maximum Availability Architecture – RAC + Data Guard
– 4-node primary with 4-node standby, located 1,000 miles apart
Benefit – standby systems used for development, test, and for other databases & applications while in standby role
http://www.oracle.com/technology/deploy/availability/htdocs/HA_CaseStudies.html
Redo Apply – Oracle Database 10g
Data Guard used for Online Mortgage Banking, Customer Service and other applications
Maximum Availability Architecture (MAA)
– RAC (5 nodes), Data Guard, RMAN, ASM on Linux
Zero Data Loss – synchronous redo transport
Production and standby sites located 20 miles apart
Benefits – reduced cost & enhanced data protection by replacing remote-mirroring with Data Guard
http://www.oracle.com/technology/deploy/availability/htdocs/HA_CaseStudies.html
Additional Case Studies
Thomson Financial – MAA (RAC & Data Guard)
– www.oracle.com/pls/cis/Profiles.print_html?p_profile_id=101166
PayTec – MAA (RAC & Data Guard)
– www.oracle.com/technology/oramag/oracle/04-mar/o24available_feature.html
Osram Sylvania and BASF – SAP & MAA
– www.oracle.com/newsletters/sap/volumes/volume14-en.pdf
ADT Security Services – SQL Apply over WAN
– www.oracle.com/technology/deploy/availability/htdocs/HA_CaseStudies.html
Kemira GrowHow – Data Guard replaces outsourced DR service
– www.oracle.com/technology/deploy/availability/htdocs/HA_CaseStudies.html
Amadeus – Data Guard for rolling upgrades
– www.oracle.com/technology/deploy/availability/htdocs/HA_CaseStudies.html
Fannie Mae – Data Guard – High Transaction Rates
– www.oracle.com/technology/deploy/availability/htdocs/HA_CaseStudies.html
Highest Availability at Lowest Cost
Highest Availability (RAC & Data Guard)
– Fault Tolerant Clusters
– Flashback Error Correction
– Automated Disk Backup
– No Compromise Disaster Recovery
– Rolling Upgrades
– Online Redefinition
At Lowest Cost
– Low Cost Grid servers
– Low Cost Modular Storage Arrays
– Automated & Simple to Use
High Quality AND Low Cost
Questions & Answers
For More Information?
Oracle Sales Manager Lisa Chen 陳志勳
0800-672-251 ext. 62185
[email protected]
Oracle 2 Day DBA Course
http://www.oracle.com/technology/obe/2day_dba/index.html
Oracle By Example (OBE) - Oracle Database 10g Release 2
http://www.oracle.com/technology/obe/admin/db10gr2.html
Useful Website http://otn.oracle.com