Backup and Recovery

Download Report

Transcript Backup and Recovery

Backup and Recovery
Part 1
How oracle handle changes to data
• Changes to data are results of:
– Table being updated/deleted/inserted
– Database objects (tables, views, users) being
created/altered/dropped – (internally those result in
changes to Oracle data dictionary)
• Changes are always handled at the data block
level:
– Oracle keeps track of which blocks have changed
and when they have changed
How oracle handle changes to data
• When data block is changed:
– It is modified in memory and marked as dirty
– Change record (redo record) is added to the log
buffer
– Log buffer is later written to the current redo log
file
– Dirty blocks are later written to data files
Dirty blocks
• Dirty blocks are written to data files:
– When there is no space left in memory, e.g.:
• Query execution needs to load data into memory
• If there is no free memory, some old data needs to be
flushed
• If flushed buffers are dirty, they need to be written to
data files
–
–
–
–
When database is being closed
During checkpoint
Sometimes, during log switch
In some other situations too (depending on
configuration)
Log buffer
• Log buffer – memory buffer
• Every change to data block is recorded in a log
buffer (redo record)
• Log buffer is written to current redo log file:
– At every commit
– When the buffer is full (or almost full)
• Redo records are never used during normal
database operation, they are used for:
– Automatic instance recovery
– Manual media recovery
Changes to data – summary
• Changes to data:
– Are written to data files (delayed)
– Are written to redo log files (almost immediately)
• In case of instance failure:
– Oracle uses redo log files to recover changes that
were not written to data files
Redo log files
• Database must contain at least two redo log files
(online redo log files)
• Redo log files are written sequentially:
–
–
–
–
–
Redo log 1
Redo log 2
…
Redo log n
Redo log 1
• At any time, there is exactly one current redo log file
(the one being written by the database)
Redo log files
• Online redo log can be in three states:
– current – database is writing changes to this redo
log (exactly one online redo log is always current)
– active – if the database crashes now, redo log will
be used for recovery – it contains changes not yet
written to data files
– inactive – redo log contains changes not required
for database recovery – all information is already
written to data files
Checkpoint
• Redo log is active as long as there are dirty
buffers in the database corresponding to
changes written to that redo log
• At checkpoint all dirty buffers are written to
data files
• After checkpoint:
– One redo log file is current
– All other redo log files are inactive
• Checkpoint can be triggered manually:
– alter system checkpoint
Log switch
• Online redo log files have fixed size
• When the online redo log file becomes full,
Oracle switches to the next file – log switch
• log file to switch to must be inactive
• log switch can be triggered manually by the
DBA
– alter system switch logfile
Checkpoint at log switch
• Sometimes Oracle wants to do a log switch, but
the next redo log is active, then Oracle:
• stops all database operation
• performs a checkpoint
• when the next redo log file becomes inactive,
resumes operation
• prints message to alert log: "checkpoint not
complete"
System change number
• System change number – SCN:
– Oracle counts all transactions in the database
– Every time a transaction is committed, SCN is
incremented
– SCN is stored in:
• Control file – to know the current SCN of the database
• Data files – to "timestamp" the file. If file is replaced
with an old version from backup – Oracle detects it by
comparing the SCN in control file with SCN in data file
Oracle startup sequence
• At startup Oracle performs 3 steps
– STARTUP NOMOUNT – start the instance
– STARTUP MOUNT – open all control files
– OPEN DATABASE – open all remaining files –
data files, temp files, online redo logs
Instance startup
• During instance startup (STARTUP
NOMOUNT) Oracle:
– Reads instance parameters from PFILE or SPFILE
– Starts database processes according to the
parameters
– Finds the location of control files, but does not
open them
Mounting the database
• During database mount:
– Oracle opens and reads all copies of Control files
• If there is a problem with any of the control files, the
database cannot be mounted
– Oracle reads location of all data files, temp files,
redo log files, but does not open them
Opening the database
• During database open Oracle:
– Opens and reads headers of all data files
– If the database was closed properly (SCN in each
header file matches SCN in control file), there are
no additional steps
– If the database was not closed properly Oracle tries
to perform automatic instance recovery
Automatic instance recovery
•
During instance recovery Oracle:
1. Restores all data files to the state just before the
crash
–
–
Redo log files, which were active or current at the time
of the crash are read and changes from them are
applied to the data files
This phase is called rolling forward phase
2. Rolls back all transactions which were not
committed at the time of the crash
–
This phase is called rolling back phase
Media recovery
•
Instance recovery can fail for the following
reasons:
– Data file can be missing, corrupted
– Online redo log can be missing, corrupted
– Data file can be too old to be recovered (e.g. data
file was restored from old backup)
– Data file is too new (e.g. all files except one were
restored from the backup)
•
If instance recovery fails, manual media
recovery must be used to recover the database
Backups
•
There are two basic types of backups:
– Offline backup – backup of an inactive database,
that was shut down properly
– Online backup – backup of an open database
•
Oracle backups are performed by copying
Oracle files at operating system level
(Oracle is not involved).
Offline backup
• When performing offline backup:
• backup all data files, control files, server
parameter file (or parameter file)
• do not backup online redo log files! (redo log
files are used for recovery, they are not used
for clean database startup)
• Note: database must be shut down cleanly (e.g.
shutdown immediate). After shutdown abort redo
logs are required to open the database
Restoring offline backup
• To restore offline backup to original database directory:
• Restore all control files, data files, temp files from
backup
• If necessary, restore parameter file or server
parameter file
• Do not restore online redo logs (you can delete old
online redo logs if they are present)
• Startup and open the database with resetlogs option:
• STARTUP MOUNT
• ALTER DATABASE OPEN RESETLOGS
Archivelog mode
• Database can operate in two modes:
• noarchivelog mode – redo logs can be overwritten
as soon as they become inactive
• archivelog mode – redo logs are archived to safe
location before they can be overwritten
• Archivelog mode enables to recover database after
media failure
• In noarchivelog
• database can recover from instance failure
• usually database cannot recover from media
failure
Archivelog mode
• To switch between archivelog and noarchivelog
mode:
• startup mount
• alter database
archivelog/noarchivelog
• archive log list – shows information
• alter database open
• After switching database modes, shutdown the
database and do offline backup.
Archivelog mode
• Archivelog mode enables:
• online backups – backups done while the database
is running
• media recovery – recovering from loss of a data
file
• point in time recovery – recovery until specified
point in time or SCN – useful for recovering from
human errors
Media recovery
• Media recovery – recovery after a loss of a data file
• Media recovery requires:
• backup (online or offline) from before the crash
• archived redo logs from the time of the backup to
the time of the crash
• online redo logs
• Complete recovery – recovery of all committed
transactions
• Incomplete recovery – not all committed transactions
are recovered, some transactions are lost
Incomplete recovery
• Incomplete recovery is performed:
• when some redo log files are missing, e.g. one of
the archived redo logs is missing or online redo
log is missing
• when doing point in time recovery (recovery until
specified time or SCN)
• After incomplete recovery the database must be
opened with RESETLOGS option
Incomplete recovery - example
• Database running in ARCHIVELOG mode
• SCN: 1000 – Full backup
• SCN: 1250 – one of the archived redo logs is
accidentally erased
• SCN: 1500 – disk failure, data file is lost
• Recovery:
• restore the backup
• recover database using archived redo logs
• recovery stops at SCN 1250 and the remaining
transactions are lost
• we open the database at SCN 1250
Example cont.
• After recovery – full backup at SCN 1250
• The database is running and fails again at SCN 1700
• We have two sets of archived logs:
• SCN 1000 – 1500 (with missing 1250)
• SCN 1250 – 1700
• If incorrect redo logs are used during recovery –
database becomes corrupted
• To prevent incorrect usage of archived logs, Oracle
requires RESETLOGS option after incomplete
recovery
Complete recovery
• Online redo logs are required for complete recovery
• Committed transactions are not lost (as in incomplete
recovery)
• There is no need to open database with RESETLOGS
option
Performing recovery
• Recovery can be performed on open or closed
database
• For open database recovery, recovered
datafiles/tablespaces must be taken offline
• SYSTEM tablespace cannot be taken offline SYSTEM tablespace can only be recovered on
closed database
• It is possible to recover:
• Single datafile: RECOVER DATAFILE 'path';
• Single tablespace: RECOVER TABLESPACE users;
• Entire database: RECOVER DATABASE;
Closed database recovery
• Copy damaged files from backup (only damaged files,
not all data files)
• Make archived redo logs available to database
• startup mount
• Issue recover command:
• recover database (for entire database recovery)
• recover tablespace users (tablespace recovery)
• recover datafile ‘filename’ (datafile recovery)
• alter database open (after complete recovery)
• alter database open resetlogs (after incomplete
recovery)
Open database recovery
• Damaged datafiles are automatically taken offline by
Oracle
• Make damaged tablespace offline:
• alter tablespace XXX offline temporary;
• Copy damaged files from backup (only damaged
files, not all data files)
• Make archived redo logs available to database
• Issue recover command:
• recover tablespace XXX (tablespace recovery)
• alter tablespace XXX online;
Opening damaged database
• Database can be opened with some damaged datafiles
(except files from SYSTEM tablespace)
• In order to open the database:
• startup mount;
• alter database datafile ‘filename’
offline;
• alter database open
• alter tablespace XXX offline;
• To recover the datafile/tablespace perform open
database recovery
Recovery
• In order to open database after recovery all datafiles
must be recovered until the same SCN
• Recovery on open database must be complete in
order to make recovered file online
• Incomplete recovery requires restoring full backup,
not only damaged files
• Incomplete recovery can be:
• time based (recover until time)
• change based (recover until SCN)
• cancel based (user is prompted for redo logs and
can stop the recovery at any time)
Incomplete recovery
• Time based recovery:
STARTUP MOUNT
RECOVER DATABASE UNTIL TIME
'2004-04-01:15:12:00‘
ALTER DATABASE OPEN RESETLOGS
• Change based recovery:
STARTUP MOUNT
RECOVER DATABASE UNTIL CHANGE 100343
ALTER DATABASE OPEN RESETLOGS
Incomplete recovery
• Cancel based recovery:
STARTUP MOUNT
RECOVER DATABASE UNTIL CANCEL
... answer questions ...
CANCEL
ALTER DATABASE OPEN RESETLOGS
NOARCHIVELOG database
• Database in NOARCHIVELOG mode cannot be
recovered
• In case of failure – restore most recent backup of
datafiles and controlfiles (don’t backup and restore
online redo logs!)
• Execute:
STARTUP MOUNT
RECOVER DATABASE UNTIL CANCEL
CANCEL
ALTER DATABASE OPEN RESETLOGS
Failures while Oracle is running
• If Oracle detects disk failure while it is running:
• control file or log file -> terminate instance
• SYSTEM tablespace datafile -> terminate instance
• other datafile:
• in NOARCHIVELOG mode -> terminate
instance
• in ARCHIVELOG mode -> take datafile offline,
continue running