Think of a hint as something which affects the optimizer`s route
Download
Report
Transcript Think of a hint as something which affects the optimizer`s route
News from the DBA front
Nilo Segura Chinchilla
Oracle Support
CERN IT/DES
31-Oct-2007
Castor Face 2 Face Meeting Bologna
1
Agenda ?
I will just talk about different items
related to the Castor DB services
31-Oct-2007
Castor Face 2 Face Meeting Bologna
2
New RAC/NAS based hardware
Moving to a “certified” Oracle RAC system
Red Hat Certified Vendor
No more “Honest Jim” Computers…
Dell,Sun,HP,IBM,Fujitsu,NEC…
NAS/SAN ?
31-Oct-2007
NAS simpler to manage than SAN
Cluster filesystem!!
No need of Oracle’s ASM :)
NAS boxes used to be pricey.. But not any longer
Castor Face 2 Face Meeting Bologna
3
New RAC/NAS based Hardware
Current setup based on
HP DL380G5 2xCPU Intel dual core 8Gb memory
Next batch will be Quad core based
IBM NAS 5300
Actually re-branded Netapp 3050 boxes
Fiber Channel disks, two controllers
Can upgrade the controllers independently without
interrupting the service
Both controller do have access to all disks in the box.
HP Procurve switches for private interconnects
Gb Ethernet connections (private/public)
Provided by the IT Network team
31-Oct-2007
Castor Face 2 Face Meeting Bologna
4
RAC/NAS Experience
We are very happy with the results
Best thing since the last best thing
Test in ITDC Stager db showed very good results
in terms of transactions per minute
Good good good
Quality, stability, reliability, availability
Moving all our DB services into RAC/NAS
Engineering, HR, Finance, Accelerators,
Infrastructure
31-Oct-2007
Castor Face 2 Face Meeting Bologna
5
Installation
-
Red Hat 4.0 64bits Upgrade 5
-
-
RDBMS 10.2.0.3 64bits with some patches for
-
-
Brings asynchronous i/o over NFS
Init.ora filesystem_option=setall
Clusterware (crs)
Critical Patch Update
Some ORA-600
Automatic installation
-
For the OS using Quattor/CDB Templates technology
-
-
Home made RPMs for the RDBMS/CRS including patches
-
31-Oct-2007
About 30minutes for the installation
Using Oracle Enterprise Manager cloning technology
About 5-10minutes to complete the installation
Castor Face 2 Face Meeting Bologna
6
NAS Configuration
Netapp Data ONTAP 7.2.2
One problem with a box that reported an environmental
temperature of 0 degrees!!
RAID_DP (Double Parity): Improved RAID 6 implementation
We can survive two disk crashes..
Autosize Volume Option
Works pretty well once you set up the correct threshold at the
NAS level
Snapshot option to copy very quickly an image of the database
before doing an upgrade
Allows for a quick db recovery without going to RMAN
Monitoring using Oracle Enterprise Manager agent
Quattor can not be used since no agent can be installed in Data
ONTAP 7.2.2
31-Oct-2007
Castor Face 2 Face Meeting Bologna
7
31-Oct-2007
Castor Face 2 Face Meeting Bologna
8
Database configuration
Use two NAS boxes per RAC
Firsts box contains datafiles, controlfiles,
redo log groups,voting disks, OCR files
Second box contains voting disk files, OCR
files, one of the controlfiles, one log
member file of each redo log group
31-Oct-2007
Castor Face 2 Face Meeting Bologna
9
Database configuration
Set of standard tuning values for the init.ora
SGA_MAX_SIZE 50% of physical memory
SGA_TARGET disabled
PGA_AGGREGATE_TARGET = 400M
DB_CACHE_SIZE=1G
DB_CACHE_ADVICE=OFF
SHARED_POOL_SIZE=400M
Included the space required for the RMAN Block change
tracking feature
31-Oct-2007
Castor Face 2 Face Meeting Bologna
10
Database configuration
Use of dbms_stats
gather_system_stats
gather_dictionary_stats
gather_fixed_stats
Let Oracle to figure out
log_buffer
db_file_multiblock_read_count
31-Oct-2007
Castor Face 2 Face Meeting Bologna
11
Database configuration
Enabled in all Stager and DLF
databases
db_block_checking (default is false)
db_block_checksum (default is true)
We can detect very quickly if there were
corruptions at the db block level
31-Oct-2007
Castor Face 2 Face Meeting Bologna
12
Database configuration
We have set up an Automatic Test
Recovery system
Point in Time recovery of all our databases
Restore + Recovery + Archive Logs
We can discover “early enough” problems
in our backup chain
TSM Server + RMAN scripts
31-Oct-2007
Castor Face 2 Face Meeting Bologna
13
Saving table SUBREQUEST
In the beginning it was the table…
The the DBA said “Let be the Partition”
And the Castor team rejoiced because they saw it and it
was good..
31-Oct-2007
Castor Face 2 Face Meeting Bologna
14
Saving table SUBREQUEST
Non partitioned with an index on status + Function Based Index (FBI) to access
specific STATUS values
Not possible to do ALTER TABLE .. SHRINK SPACE due to limitation in
10gR2
Full Table Scans operation did suffer a lot
Moved to a two partition table
Processing & processed status values
No generic index on STATUS
Still with FBI, not possible to SHRINK SPACE
Better performance
Move to a N-partitions table
No FBI, less indexes to maintain during DML operations on the table
Can SHRINK SPACE (global indexes are maintained automatically with
cascade option)
Multiblock Read Block features improve the FTS performance
Even better performance :)
31-Oct-2007
Castor Face 2 Face Meeting Bologna
15
Performance
Jonathan Lewis on “Why do people think
that Oracle ignores hints ?”
“There are two main reasons. First, the available hints are not properly
documented. Second, the hints are rarely used properly - because they
are not documented properly. (And third, there are a few bugs that make
things go really wrong anyway).Think of a hint as something which affects
the optimizer’s route through it’s code path.
To handle any one statement, the optimizer may work through many join
orders, and each join order may entail several thousand calculations and
decision points. At each moment, a hint may change a calculation, or affect
the outcome of a decision.
But the fact that the optimizer has adjusted some of the numbers, or had its
options pruned, at various steps through the code path does not mean that
you will see the effect you were expecting.”
31-Oct-2007
Castor Face 2 Face Meeting Bologna
16
Performance
Use of PL/SQL helps a lot to change code in
case of performance issues
A new version can be deployed without stopping
any of the stager daemons
A new version of
SELECTTAPECOPIESFORMIGRATION was deployed
during Vlado’s presentation
Only the real databases will give us clues of
what to modify/tune in the code
Test/Stress db instances do (can) not match real
users
31-Oct-2007
Castor Face 2 Face Meeting Bologna
17
Performance
Only ATLAS Stager experienced a bad
behavior due to mistakes on the experiment
side.
CMS, LHCb, Alice are generally speaking rather
stable and well behaved (so far).
DLF db do not pose any problem
Simple system, with well defined set of SQL stmt
DLF dynamically creates one tablespace per day
Older tablespaces are switched to READ ONLY mode to
save backup efforts
31-Oct-2007
Castor Face 2 Face Meeting Bologna
18
Performance
Apparently we have a set of SQl stmt that will
flip flop from one good query plan to a bad
one depending on the amount of data involved
It was the case for selecttapecopiesformigration
A heavy migration exercise in Atlas stager with a lot of
small files triggered the problem
Coding alternative SQL stmts with different
optimizer hints ?
This is common practice in the db community
31-Oct-2007
Castor Face 2 Face Meeting Bologna
19
Performance
Started checking SRM performance
Couple of missing indexes identified
Part of the next release
Excellent performance during the test
period
It seems that performance wise, it will be
less demanding than the Stager
31-Oct-2007
Castor Face 2 Face Meeting Bologna
20
Performance - Summary
For the current workloads, the Stager db system does a good job
A different workload may stress in a different way the system and
may require further tuning at the schema level
I do not believe the init.ora will require major changes
How will it behave once we have (many) fearless individual users
interacting with Castor ?
I am pretty confident it will work o.k :)
If more CPU and/or memory is required, it is pretty easy to
upgrade the system
Adding a new node or replacing one is very simple with
RAC
I/O seems not to be a problem as we manage to cache a
large fraction of the data
31-Oct-2007
Castor Face 2 Face Meeting Bologna
21
Plans for the future
Move DLF to the new RAC system
Maybe integrate Stager/DLF into a single RAC ?
Stager in one node, DLF in the other
I/O would not be a problem
NAS boxes are not stressed at all
Working on Data Guard technologies to create/maintain
synchronized remote copies of the databases in the CERN
Prevessin site
Use of RMAN disk backup to avoid the recovery from tapes (slower)
Goal is to minimize the recovery time after a major problem
Waiting for HW
Re-test PL/SQL compilation in next releases
Problems with a single procedure forced us to disable it
31-Oct-2007
Castor Face 2 Face Meeting Bologna
22
Plans for the future
Incoming upgrades
10.2.0.4 by the end 2007
10.2.0.5 Summer 2008 - Terminal release ?
Security : CPU January, April
Dates already available
11gR1 – not in 2008, planning for 2009.
Wait for at least 1 patchset
We may skip 11gR1 and wait for 11gR2
Some interesting features
Server side PLSQL Functions caching
Direct NFS
Referential partitioning
Automatic creation of partitions based on the partition criteria
No more ALTER TABLE .. ADD PARTITION …
31-Oct-2007
Castor Face 2 Face Meeting Bologna
23
My wishlist…
DB constraints properly implemented in Stager and SRM
Tom Kyte (asktom.oracle.com) proved (at CERN) that a
missing NOT NULL constraint could change an execution
plan
SRM + Stager +DLF + Name Server together in a single
RAC/NAS system
One per experiment
Better isolation, easier to administer, patch
Open the gate to a unique API to deal with files in stager and
name server
Via PL/SQL
31-Oct-2007
Castor Face 2 Face Meeting Bologna
24
Q&A
More reading on the CERN RAC/NAS setup can be
found in the following CHEP 2007 paper
http://indico.cern.ch/contributionDisplay.py?contribId=
110&sessionId=28&confId=3580
31-Oct-2007
Castor Face 2 Face Meeting Bologna
25