Oracle 10gR2 Availability

Oracle Infrastructure Overview
CC Yu 余苓華
Senior Sales Consultant
Oracle Corporation
Agenda
• Oracle 10g Overview
• HA Solution – Real Application Clusters (RAC)
• Disaster Recovery Solution – Data Guard
• Oracle MAA Architecture (Maximum Availability Architecture)
• Customer's Practice
• Q&A

Data Mirroring with ASM
• ASM mirrors data across inexpensive modular storage arrays
• Automatically remirrors when disk or array fails
• Designed to tolerate failures
Failure Resiliency using Low Cost Storage

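The mirroring and remirroring behavior described on the ASM slide can be pictured with a toy model. This Python sketch is not how ASM is implemented: the function names `place_extents` and `remirror`, the round-robin placement, and the two-copy default are all assumptions made for illustration.

```python
def place_extents(n_extents, arrays, copies=2):
    """Assign each extent to `copies` distinct arrays, round-robin (assumed policy)."""
    layout = {}
    for ext in range(n_extents):
        layout[ext] = {arrays[(ext + k) % len(arrays)] for k in range(copies)}
    return layout


def remirror(layout, failed, arrays, copies=2):
    """Drop the failed array and restore redundancy on the surviving arrays."""
    survivors = [a for a in arrays if a != failed]
    for homes in layout.values():
        homes.discard(failed)
        for candidate in survivors:
            if len(homes) >= copies:
                break
            homes.add(candidate)  # no-op if the extent already lives there
    return layout


arrays = ["array_a", "array_b", "array_c"]   # hypothetical storage arrays
layout = place_extents(6, arrays)
remirror(layout, "array_b", arrays)          # every extent is back to two copies
```

With three arrays and two copies per extent, losing any one array still leaves a copy of every extent, which is the property the slide calls "designed to tolerate failures".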
Flashback Database
[Diagram: on each disk write, the new block version goes to the data files and the old block version goes to the Flashback Log]
• A new strategy for point-in-time recovery
• Flashback Log captures old versions of changed blocks
  – Think of it as a continuous backup
  – Replay the log to restore the database to a point in time
  – Restores just the changed blocks
• It’s fast - recover in minutes, not hours
• It’s easy - single command restore
    Flashback Database to ‘2:05 PM’
“Rewind” button for the Database

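The "rewind" idea can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not Oracle's mechanism: the class `FlashbackStore` and its methods are hypothetical, timestamps are plain integers, and the model only restores saved before-images.

```python
class FlashbackStore:
    """Toy block store that keeps before-images of every changed block."""

    def __init__(self):
        self.blocks = {}   # block id -> current contents (the "data files")
        self.log = []      # (time, block id, old contents): the "flashback log"

    def write(self, t, block, value):
        # Save the old version before overwriting; None means the block did not exist.
        self.log.append((t, block, self.blocks.get(block)))
        self.blocks[block] = value

    def flashback_to(self, t):
        # Undo, newest first, every change made after time t.
        while self.log and self.log[-1][0] > t:
            _, block, old = self.log.pop()
            if old is None:
                self.blocks.pop(block, None)
            else:
                self.blocks[block] = old


db = FlashbackStore()
db.write(1, "customers", "v1")
db.write(2, "orders", "v1")
db.write(3, "customers", "v2")   # the change we regret
db.flashback_to(1)               # only the blocks changed after t=1 are touched
```

Only blocks that changed after the target time are restored, which is why the slide contrasts minutes with hours: the work is proportional to the changes, not to the size of the database.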
Flashback Error Correction
[Diagram labels: Database; Customer; Order]
• Recovery at all levels
• Database Level
  – Flashback Database restores the whole database to a point in time
    • Uses Flashback Logs
• Table Level
  – Flashback Table restores rows in a set of tables to a point in time
    • Uses UNDO in database
  – Flashback Drop restores a dropped table or an index
    • Recycle bin for DROPs
• Row Level
  – Restore individual rows
    • Uses Flashback Query

Restore Points (10gR2)
• 10gR2 allows the declaration of named restore points
    Create Restore Point Known_good_point;
    …..
    Flashback Database to Restore Point Known_good_point;
  – Easy way to return to a known time
  – Guaranteed restore points ensure that flashback logs are retained
• 10gR2 can flashback through:
  – Previous database recovery and open resetlogs
  – Switchover to standby
  – Previous Flashback DB
• Flashback and incremental backup apply can now be combined to maintain a reporting database
  – Similar to split mirror and merge mirror

Flashback for All Users
END USER
• Flashback Query
• Flashback Row History
DEVELOPER
• Flashback Row History
• Flashback Transaction History
• Flashback Table
DATABASE ADMIN
• Flashback Database
• Flashback Drop
SYSTEM ADMIN
• Data Guard
RMAN is Oracle’s Recommended Database Backup Tool
[Diagram labels: Enterprise Manager & 3rd Party Tools; Oracle Database; Tape Libraries]
• RMAN’s deep integration with the database engine makes it the best tool for DB backup & recovery
  – RMAN is used at thousands of enterprise sites
  – Smart
    • Sophisticated backup and recovery strategies
  – Fast
    • Optimized backup to disk for fastest recovery
    • No extra redo during backup
    • Block level incremental backup
  – Reliable
    • Block contents are validated during backup
  – Easy
    • Simple management with Enterprise Manager
  – Supports over 20 Media Managers
    • Veritas, Legato, Tivoli, HP, Oracle Backup, etc.

Oracle Secure Backup – The Lowest Cost Tape Backup Manager
[Diagram labels: File Systems (Linux, Unix, Windows, Filers); Databases; Supports popular tape libraries & drives]
• Oracle Secure Backup is ideal for customers seeking a low cost alternative to complex backup products
• Best integrated end-to-end backup of Oracle Databases
  – Media manager for RMAN backup and recovery of Oracle9i and 10g databases to tape
  – Fastest Database Backup on the market
• Backup Oracle Home, App Server and other file systems
• Oracle Secure Backup includes:
  – Centralized management of network backups
  – Scalability to low 100’s of servers, 10’s of millions of files
  – Easy management through Enterprise Manager
• “Express” edition bundled with Oracle Database – replaces LSSV
  – Single vendor support

Oracle 10g Manageability Out of Box
Installation
• Fast, lightweight install including automated pre- and post-install steps
• Installation media optimization
• Easy, fast client install
• Enhanced silent install for ISVs
Data Load
• Data Pump
• Cross-Platform Transportable TS
• Restartable Data Load
Ongoing System Management
• Automatic Storage Management
• Automatic Shared Memory Tuning
• Advisors Out of the Box
  – Segment Advisor
  – Undo Advisor
  – Redo Log File Size Advisor
• Automatic Undo Retention
• Alert generation, out-of-the-box thresholds
• Resource Manager
• Automatic Backup Management
Simplified Creation & Configuration
• Pre-configured Database
• 90% reduction in configuration parameters
• Automatic setup of common tasks, backups, stats gathering etc
• Out of Box Database Console
• ..and a lot more
* Not a comprehensive list

Automatic Database Diagnostic Monitor (ADDM)
[Diagram labels: Intelligent Infrastructure; Application & SQL Management; Storage Management; System Resource Management; Backup & Recovery Management; Database Management; Space Management]
• Self-Diagnostic Engine in the Database
• Integrate all components together
• Automatically provides database-wide performance diagnostic, including RAC
• Real-time results using the Time Model
• Provides impact and benefit analysis, non problem areas
• Provides Information vs. raw data
• Runs proactively out of the box, reactively when required

How Does ADDM Work?
[Diagram: Snapshots in Automatic Workload Repository feed the Automatic Diagnostic Engine (Self-Diagnostic Engine); findings shown: High-load SQL → SQL Advisor; IO / CPU issues → System Resource Advice; RAC issues → Network + DB config Advice]
• Top Down Analysis Using AWR Snapshots
• Throughput centric - Focus on reducing time ‘DB time’
• Classification Tree - based on decades of Oracle performance tuning expertise
• Real-time results
  – Don’t need to wait hours to see the results
• Pinpoints root cause
  – Distinguishes symptoms from the root cause
• Reports non-problem areas
  – E.g. I/O is not a problem

With Oracle 10g and Diagnostics Pack….
System is maxed out on CPU with most waits in the concurrency wait class.

ADDM Findings
ADDM has automatically identified that high CPU utilization was caused by repeated hard parses ……

ADDM Findings
…and recommends a solution, as well as explaining how it diagnosed the problem

Good Performance Page
Once the solution is applied, CPU utilization falls dramatically and the waits disappear

Life Before and After ADDM
Scenario: Hard parse problems
Before
1. Examine system utilization
2. Look at wait events
3. Observe latch contention
4. See wait on shared pool and library cache latch
5. Review v$sysstat
6. See “parse time elapsed” > “parse time cpu” and #hard parses greater than normal
7. Identify SQL by..
   • Identifying sessions with many hard parses and tracing them, or
   • Reviewing v$sql for many statements with the same plan hash value
8. Examine and review SQL
9. Identify “hard parse” issue by observing the SQL contains literals
10. Enable cursor sharing
Oracle10G
1. Review ADDM recommendations
2. ADDM recommends use of cursor_sharing

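Why literals cause repeated hard parses, and why sharing cursors fixes it, can be shown with a small text-level experiment. The Python sketch below is only an illustration: the `normalize` helper, its regular expression, and the `:b` placeholder are assumptions and do not reproduce how the database rewrites statements.

```python
import re

# Matches quoted string literals (with '' escapes) or standalone numbers.
LITERAL = re.compile(r"'(?:[^']|'')*'|\b\d+(?:\.\d+)?\b")


def normalize(sql):
    """Replace string and numeric literals with a bind-style placeholder."""
    return LITERAL.sub(":b", sql)


statements = [
    "SELECT name FROM customers WHERE id = 101",
    "SELECT name FROM customers WHERE id = 102",
    "SELECT name FROM customers WHERE id = 103",
]

# Each distinct statement text needs its own parse ...
distinct_before = len(set(statements))
# ... but once literals are replaced, all three share one statement text.
distinct_after = len({normalize(s) for s in statements})
```

In this model the three statements differ only in a literal, so they count as three distinct texts before normalization and one after it, which mirrors the drop in parsing work once cursor sharing is enabled.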
Oracle 10g
• Storage – ASM
• Human Error Protection – Flashback
• Backup & Recovery – RMAN, Secure Backup
• Manageability – Enterprise Management tool

Highest Availability at Lowest Cost
• Traditionally High Quality = High Cost
  – High quality systems were built by combining high quality, high cost parts
  – Mainframe model
• Oracle enables a new model
  – Oracle’s vision is to attain the highest possible availability using low cost computers and low cost storage
High Quality AND Low Cost

Single Instance
[Diagram labels: Clients; Listener; Instance; Server; Database]

What do you do when there is more than one?
[Diagram: Real Application Clusters. Listeners; Instance 1 on Server 1, Instance 2 on Server 2, Instance 3 on Server 3; one Database on Shared Disk]

Client-Side Connection Load Balancing
[Diagram labels: Clients; Listeners]
sales.us.acme.com=
 (DESCRIPTION=
  (ADDRESS_LIST=
   (LOAD_BALANCE=on)
   (ADDRESS=(PROTOCOL=tcp)(HOST=sales1)(PORT=1521))
   (ADDRESS=(PROTOCOL=tcp)(HOST=sales2)(PORT=1521)))
  (CONNECT_DATA=
   (SERVICE_NAME=sales.us.acme.com)))

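The effect of LOAD_BALANCE=on in the address list can be approximated as "pick a listener at random instead of always trying the first one". The Python sketch below models that idea only. The `connect` function and the `try_connect` callback are hypothetical, and moving on to the next address when one does not answer is an assumption of this sketch rather than something the entry above states.

```python
import random


def connect(addresses, try_connect, rng=random):
    """Try listener addresses in random order; return the first that answers."""
    order = list(addresses)
    rng.shuffle(order)              # no fixed first choice, so load spreads out
    for host, port in order:
        if try_connect(host, port):
            return host, port
    raise ConnectionError("no listener reachable")


addresses = [("sales1", 1521), ("sales2", 1521)]   # hosts from the entry above

# Stand-in for a real network attempt: pretend only sales2 is up.
chosen = connect(addresses, lambda host, port: host == "sales2")
```

Because every client shuffles independently, new connections spread across the listeners without any coordination between clients.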
Server Side Connection Load Balancing
[Diagram: an Application Server asks the LISTENER for service "RAC" over the network; the RAC Database offers it on RAC1 on N1, RAC2 on N2, and RAC3 on N3]

Connection Load Balancing
[Diagram: Clients ask the Listeners for service "RAC"; the RAC Database offers it on RAC1 on N1, RAC2 on N2, and RAC3 on N3]

Oracle Database 10g Release 2 – Real Application Clusters
[Diagram labels: ERP; CRM; DW]
• Improved Robustness
  – Cluster Verification Utility
  – High Availability of Cluster Required Files
• High Availability API for integrated application availability
• Load Balancing Advisory
• Runtime Connection Load Balancing
• Improved performance
• Certified to 100 nodes

What if there are Multiple Applications?

Automatic Workload Management
• Application workloads can be defined as Services
  – Individually managed and controlled
  – Assigned to instances during normal startup
  – On instance failure, automatic re-assignment
  – Service performance individually tracked
  – Finer grained control with Resource Manager
  – Integrated with other Oracle tools / facilities (e.g. Scheduler, Streams)

Automatic Workload Management (diagram sequence; server groups shown in each frame)
1. Normal Server Allocation: Order Entry | Spare | Supply Chain
2. End of Quarter: Order Entry | Supply Chain
3. Normal Server Allocation: Order Entry | Spare | Supply Chain
4. Server Fails: Order Entry | Spare | Supply Chain
5. Reallocate Spare server to Order Entry: Order Entry | Supply Chain
6. Failed Server Restored: Order Entry | Spare | Supply Chain
7. Application Resource Requirements Grow: Order Entry | Order Entry, Supply Chain | Supply Chain

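The "server fails, spare is reallocated" frames can be modeled as a small placement function. This Python sketch is an illustration under assumptions: the function `place_services`, the node names, and the dictionary layout are hypothetical, with each service listing the instances it normally runs on ("preferred") and the spares it may take over ("available").

```python
def place_services(services, up):
    """Map each service to the instances that currently run it.

    services: name -> {"preferred": [...], "available": [...]}
    up: set of instance names that are running
    """
    placement = {}
    for name, cfg in services.items():
        running = [i for i in cfg["preferred"] if i in up]
        missing = len(cfg["preferred"]) - len(running)
        # Replace each failed preferred instance with a running spare.
        spares = [i for i in cfg["available"] if i in up]
        placement[name] = running + spares[:missing]
    return placement


services = {
    "order_entry": {"preferred": ["n1", "n2"], "available": ["n3"]},
    "supply_chain": {"preferred": ["n4"], "available": ["n3"]},
}

normal = place_services(services, {"n1", "n2", "n3", "n4"})   # spare n3 stays idle
after_failure = place_services(services, {"n1", "n3", "n4"})  # n2 failed
```

In the failure case the spare node n3 joins Order Entry while Supply Chain is untouched, matching frames 4 and 5 of the sequence.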
Use EM to Define Services
Use EM to Manage Services
Topology View – Grid Control

Load Balancing Advisory
• Load Balancing Advisory is an advisory for balancing work across RAC instances.
• Load balancing advice
  – Is available to ALL applications that send work.
  – Directs work to where services are executing well and resources are available.
  – Adjusts distribution for different power nodes, different priority and shape workloads, changing demand.
  – Stops sending work to slow, hung, failed nodes early.

Runtime Connection Load Balancing with JDBC, ODP.NET
[Diagram: CRM requests a connection from the connection cache, which spreads requests across the instances]
• Instance 1: “CRM is bored” (60%)
• Instance 2: “CRM is very busy” (10%)
• Instance 3: “CRM is busy” (30%)

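The 60/30/10 split in the diagram amounts to weighted random selection from the connection cache. The Python sketch below illustrates that idea only. The `pick_instance` function and the instance names are hypothetical, and the weights are taken from the percentages on the slide rather than from a live advisory.

```python
import random


def pick_instance(weights, rng=random):
    """Pick an instance with probability proportional to its advisory weight."""
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]


# Percentages from the slide: the least busy instance gets the most work.
advisory = {"instance_1": 60, "instance_2": 10, "instance_3": 30}

rng = random.Random(0)                       # seeded only to make the demo repeatable
tally = {name: 0 for name in advisory}
for _ in range(10_000):
    tally[pick_instance(advisory, rng)] += 1
```

An instance whose weight drops to zero stops receiving new requests in this model, which corresponds to the advisory's goal of not sending work to slow, hung, or failed nodes.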
Commercial Grids and Availability
• Grid pools standard low cost nodes and modular disk arrays
• Perfect for RAC HA
• Failover can happen to any node on the grid
• Grid load balancing will redistribute load over time
Designed to Tolerate Failures

Introducing Oracle Data Guard
• Oracle’s disaster recovery solution for Oracle data
• Automates the creation and maintenance of one or more synchronized copies (standby) of the production (or primary) database
• If the primary database becomes unavailable (disasters, maintenance), a standby database can be activated and assume the primary role
• Feature of Oracle Database Enterprise Edition (EE)
  – Available at no extra cost
  – Primary and standby databases need to be licensed EE

Data Guard Configuration
[Diagram: Primary Site (Primary Database), Standby Site A (Standby Database), Standby Site B (Standby Database), Broker]
• Managed as a single configuration
• Primary and standby databases can be Real Application Clusters or single-instance Oracle
• Up to nine standby databases supported in a single configuration

Oracle Data Guard Architecture
[Diagram: the Production Database ships redo over the network (sync or async), coordinated by the Broker, to a Physical Standby Database (Redo Apply; Backup) and a Logical Standby Database (redo transformed to SQL, SQL Apply; open for reports). Sites shown: Dallas, Chicago, Boston]

Switchover and Failover
• Primary and Standby role transitions
• Switchover
  – Planned role reversal
  – No database reinstantiation required
  – Used for maintenance of OS or hardware
• Failover
  – Unplanned failure (e.g. disasters) of primary
  – Primary database must be reinstantiated / flashed back [10g]
  – Automatic failover possible [10g]
• Initiated using simple SQL / GUI interface
• Data Guard automates the processes involved

Flexible Data Protection Modes
Protection Mode       | Risk of Data Loss                               | Redo Shipment
Maximum Protection    | Zero Data Loss, Double Failure Protection       | Synchronous redo shipping to 2 sites
Maximum Availability  | Zero Data Loss, Single Failure Protection       | Synchronous redo shipping
Maximum Performance   | Minimal data loss, usually 0 to few seconds     | Asynchronous redo shipping
Balance cost, availability, performance, and transaction protection

Data Protection Modes (contd.)
[Diagrams: Maximum Availability: P with S1; Maximum Protection: P with S1 and S2; Maximum Performance: P with S1]

Low Cost No Compromise Disaster Recovery
[Diagram labels: Production Database; Transaction Shipping (Real Time Apply); No Delay; Standby Database; Reporting On Real Time Data; Some Nodes Used for Other Computing; Flashback Log on each database]
• Flashback DB removes need to delay apply of logs to correct errors
• Flashback DB removes the need to reinstantiate primary on failover
• Real-time log apply enables real-time reporting on standby
• Data Guard works transparently across GRID clusters
  – Standby can use fewer CPU resources than primary

Fast-Start Failover
• If primary database lost in a disaster, Data Guard automatically fails over to a previously-chosen, synchronized standby, without requiring any manual steps to invoke the failover
• Used in a Broker configuration (DGMGRL or Enterprise Manager), with a new Broker capability – the Observer, which monitors the environment, and triggers a failover if necessary
• Used in Maximum Availability protection mode, and with Flashback Database – no data loss incurred
• After failover completes, the Broker automatically reinstates the old primary database as a new standby database
• Specialized events generated to facilitate post-failover tasks such as automatic application failover

Fast-Start Failover (diagram sequence; each frame shows Primary Site, Standby Site, Observer)
1. Data Guard in steady state – transmitting redo
2. Observer monitoring state of the configuration
3. Disaster strikes the primary – connections lost
4. Observer <=> primary connection times out (timeout threshold configurable)
5. Observer asks target standby if it is ready to fail over
6. Observer begins Fast-Start Failover
7. Target standby automatically becomes new primary
8. After old primary is repaired, Observer re-establishes connection
9. Observer automatically reinstates old primary to be a new standby
10. Redo transmission starts from new primary to new standby

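The ten steps above can be read as a small state machine driven by the Observer's polling. The Python sketch below is a simplified model, not the Broker's logic: the `Config` dataclass, the `observe` function, the site names, and the one-second polling interval are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Config:
    primary: str
    standby: str
    threshold_s: int               # step 4: configurable timeout
    silent_s: int = 0              # how long the primary has been unreachable
    needs_reinstate: bool = False  # set after a failover, cleared in step 9


def observe(cfg, reachable, standby_ready=True, elapsed_s=1):
    """One observer poll; returns "failover", "reinstate", or None."""
    if cfg.needs_reinstate:
        # Steps 8-9: the old primary is back, so make it the new standby.
        if cfg.standby in reachable:
            cfg.needs_reinstate = False
            return "reinstate"
        return None
    if cfg.primary in reachable:
        cfg.silent_s = 0           # steps 1-2: steady state
        return None
    cfg.silent_s += elapsed_s      # step 3: primary is not answering
    # Steps 4-6: timeout reached and the target standby confirms it is ready.
    if cfg.silent_s >= cfg.threshold_s and standby_ready:
        cfg.primary, cfg.standby = cfg.standby, cfg.primary  # step 7
        cfg.silent_s = 0
        cfg.needs_reinstate = True
        return "failover"
    return None


cfg = Config(primary="site_a", standby="site_b", threshold_s=3)
events = [observe(cfg, {"site_b"}) for _ in range(3)]   # site_a is down for 3 polls
```

The model deliberately ignores a second failure while the old primary is still waiting to be reinstated; it only traces the single-disaster path shown in the frames.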
SQL Apply – Rolling Database Upgrades
1. Initial SQL Apply Config: A and B both at Version X
2. Upgrade node B to X+1: A at X, B at X+1
3. Run in mixed mode to test: A at X, B at X+1
4. Switchover to B, upgrade A: A and B both at X+1
[Other diagram labels: Clients; Redo; Upgrade; Logs Queue]
Applies to:
• Patch Set Upgrades
• Major Release Upgrades
• Cluster Software & Hardware Upgrades

Data Guard and RAC
• Data Guard and Real Application Clusters are complementary and should be used together for Maximum Availability Architecture
• Real Application Clusters provides high availability
  – Provides rapid and automatic recovery from node failures or an instance crash
  – Provides low cost, application transparent scale-out using commodity hardware
• Data Guard provides disaster protection and prevents data loss
  – By maintaining transactionally consistent copies of primary database
  – Protects against disasters, data corruption and user errors
  – Does not require expensive and complex HW/SW mirroring

Maximum Availability Architecture (MAA)
[Graphic: M.A.A. – How to Prevent, Tolerate, & Recover From Outages]
• Operational Practices are key
  – Technology alone is not enough
• MAA is a blueprint for achieving HA & DR
  – Tested, validated, and documented best practices
    • Database, Storage, Cluster, Network
    • 20 person year effort
otn.oracle.com/deploy/availability
Maximum Availability = Unbreakable Architecture + Best Practices

Data Guard + RAC Configuration
[Diagram: Primary Site (RAC, Primary Database) linked by Data Guard and the Broker to Standby Site (RAC, Standby Database)]
• Data Guard + RAC: end-to-end Data Protection and HA
• Basis of Maximum Availability Architecture
• Managed as a single configuration

Usage Examples
[Diagrams; layout lost in extraction. Recoverable labels:]
• Example A: Primary Database and Standby Database; sites Chicago and Dallas; Instance 1, Instance 2 - RAC
• Example B: Maximize primary and standby resources; each site shows a Primary Database and a Standby Database
• Example C: Primary Site A, Primary Site B, and Primary Site C, each with a Primary Database; one Standby Site with the Standby Databases
  – Standby machine must be powerful enough to support multiple production instances after switchover / failover

Usage Examples
Example D
Primary Site
• Standby Site A
  – Physical Standby
  – Synchronous transport
  – LAN attached
  – Used to offload backups
  – First choice for switchover candidate
• Standby Site B
  – Logical Standby
  – Synchronous transport
  – LAN attached
  – Used to offload reporting
• Standby Site C
  – Physical Standby
  – Asynchronous transport
  – WAN attached
  – Provides DR and data protection

• Redo Apply – Oracle Database 10g
• E-Business Suite, Global Single Instance
  – With ~300 CPUs supporting production database, probably one of the largest ERP deployments in the world
    • 8 TB database
    • 150 transactions per sec
    • 7,000 concurrent users generating 8 MB of data/second
  – Maximum Availability Architecture – RAC + Data Guard
    • 4-node primary with 4-node standby, located 1,000 miles apart
• Benefit – standby systems used for development, test, and for other databases & applications while in standby role
http://www.oracle.com/technology/deploy/availability/htdocs/HA_CaseStudies.html

• Redo Apply – Oracle Database 10g
• Data Guard used for Online Mortgage Banking, Customer Service and other applications
• Maximum Availability Architecture (MAA)
  – RAC (5 nodes), Data Guard, RMAN, ASM on Linux
• Zero Data Loss – synchronous redo transport
• Production and standby sites located 20 miles apart
• Benefits – reduced cost & enhanced data protection by replacing remote-mirroring with Data Guard
http://www.oracle.com/technology/deploy/availability/htdocs/HA_CaseStudies.html

Additional Case Studies
• Thomson Financial – MAA (RAC & Data Guard)
  – www.oracle.com/pls/cis/Profiles.print_html?p_profile_id=101166
• PayTec – MAA (RAC & Data Guard)
  – www.oracle.com/technology/oramag/oracle/04-mar/o24available_feature.html
• Osram Sylvania and BASF – SAP & MAA
  – www.oracle.com/newsletters/sap/volumes/volume14-en.pdf
• ADT Security Services – SQL Apply over WAN
  – www.oracle.com/technology/deploy/availability/htdocs/HA_CaseStudies.html
• Kemira GrowHow – Data Guard replaces outsourced DR service
  – www.oracle.com/technology/deploy/availability/htdocs/HA_CaseStudies.html
• Amadeus – Data Guard for rolling upgrades
  – www.oracle.com/technology/deploy/availability/htdocs/HA_CaseStudies.html
• Fannie Mae – Data Guard – High Transaction Rates
  – www.oracle.com/technology/deploy/availability/htdocs/HA_CaseStudies.html

Highest Availability at Lowest Cost
• Highest Availability (RAC & Data Guard)
  – Fault Tolerant Clusters
  – Flashback Error Correction
  – Automated Disk Backup
  – No Compromise Disaster Recovery
  – Rolling Upgrades
  – Online Redefinition
• At Lowest Cost
  – Low Cost Grid servers
  – Low Cost Modular Storage Arrays
  – Automated & Simple to Use
High Quality AND Low Cost

Questions & Answers
For More Information
• Oracle Sales Manager Lisa Chen 陳志勳
  0800-672-251 ext. 62185
  [email protected]
• Oracle 2 Day DBA Course
  http://www.oracle.com/technology/obe/2day_dba/index.html
• Oracle By Example (OBE) - Oracle Database 10g Release 2
  http://www.oracle.com/technology/obe/admin/db10gr2.html
• Useful Website http://otn.oracle.com