Exchange 2003 Backup Restore & Recovery

INF 311
Exchange 2003
Backup Restore & Recovery
Ronen Gabbay
► Microsoft Exchange Regional Director ► Exchange MVP
► Microsoft Secure and Well Managed Infrastructure Specialist ► Microsoft Israel
► Microsoft Exchange Server User Group leader
► E-mail [email protected]
Agenda
Are you aware
Exchange is a mission critical application
How to prepare:
Exchange Database Architecture
Backup and Restore
Recovery Storage Group
Dial Tone Recovery
Exchange Snapshot Architecture
How to avoid and what to do when things go bad
Exchange Errors and how to avoid them
ESEUtil & ISInteg and all in between
Server & Alternate Server Restore procedure
Exchange is a mission critical application
Imagine what life would be like without Exchange
No inbound or outbound mail
No Calendar
No mail between company users
Have you considered how much money your company loses?
How long can you bear to be without Exchange?
Do you have an SLA?
Do you have a written policy in case……
Are you prepared for the worst?
Storage Groups
[Diagram: the STORE hosts Storage Group 1 and Storage Group 2. Each storage group is an ESE instance with its own set of LOG files and one or more databases, each an EDB/STM pair.]
Exchange 2003 Databases
*.EDB Files
Properties Database
MAPI Messages and Attachments
Headers for STM pages
*.STM Files
Raw ‘streaming’ data (MIME, documents, multimedia, etc.)
Current database = EDB + STM + unflushed log entries
Storing Data
[Diagram: a transaction log file, memory, and a database file made of numbered 4 KB pages; modified pages appear in the log and in memory before they reach the database file.]
ESE Database Consistency
[Diagram: Normal Operation ends in either a Normal Shutdown or an Improper Shutdown.]
Normal Operation
[Diagram: transaction log file, memory, and database file (4 KB pages) during normal operation.]
Previous Log Files
Current Logs Are Renamed After 5 MB of Data Is Accepted
[Diagram: the current log (Edb.log / EXX.log, 5 MB) is renamed when full and a new current log is created.]
Previous logs: EXX00001.LOG, EXX00002.LOG, EXX00003.LOG, ...
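As a rough illustration of the naming scheme above, the sketch below builds the file name of a previous log from a storage group prefix and a generation number. It assumes the usual ESE convention of a three-character prefix followed by five hexadecimal digits; the function name and the `E00` prefix are illustrative only.

```python
def log_file_name(prefix: str, generation: int) -> str:
    """Return the assumed file name of a closed (previous) log generation."""
    if len(prefix) != 3:
        raise ValueError("prefix is assumed to be three characters, e.g. E00")
    if not 1 <= generation <= 0xFFFFF:
        raise ValueError("generation must fit in five hexadecimal digits")
    # The generation number is written in hexadecimal, zero-padded to 5 digits.
    return f"{prefix}{generation:05X}.log"


names = [log_file_name("E00", g) for g in (1, 2, 3, 26)]
print(names)
```

Because the counter is hexadecimal, generation 26 is written as `1A`, which matters when checking a log sequence for gaps by eye.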
Circular Logging
[Animation: a fixed set of log files (EXX0001.log to EXX0005.log) is reused in rotation as their entries are written to the database.]
In most cases Circular Logging should NOT be enabled
Checkpoint File
EXX.chk marks the boundary between:
Transaction log entries written to the database
Transaction log entries not yet written to the database (EXXnnnnn.log)
Starting an ESE Database
Locating the Checkpoint
Check File Signatures match
Check Database File Consistency
Recovering a Database at Startup
Online Maintenance
Must not clash with backup schedules
Database size, Backup/Restore SLAs
Purge Indices (public and mailbox stores)
Tombstone Maintenance (public and mailbox stores)
Dumpster Cleanup (public and mailbox stores)
Cleanup Deleted Mailboxes (mailbox stores)
Events for each Database processed
700 Defrag Started
701 Defrag Ended
1221 Defrag ended; reports the amount of free space in the file
Exchange & Active Directory ™ 2003
Active Directory ™
Directory stores Exchange server information
The Exchange configuration is stored in the
Active Directory configuration partition
Mailboxes are not directory objects
A Mailbox is an attribute of a user
Mailboxes can be reconnected to other users
Mailbox retention time
Deleted Items retention time
Dumping Headers using ESEUtil
Configure Exchange EVS on Cluster
Exchange 2003 Backup Process Flow
Backup APIs Called
• Backup program calls the Exchange backup API
• Store informs ESE of each database to back up
ESE Backup Mode
• The checkpoint file freezes
• Online maintenance pauses
Begin Backup
• Agent requests DB pages sequentially, 64 KB at a time
• Pages are check-summed as read
End Backup
• Page read completed
• Logs copied to tape
• Logs truncated
• Backup set closed
Backup Complete: ESE Normal Mode
• The checkpoint file unfreezes
• Online maintenance resumes
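The "Begin Backup" step can be pictured with the sketch below: it reads a file sequentially in 64 KB requests and verifies each 4 KB page as it is read. This is not the Exchange backup API. The checksum is a stand-in (a CRC-32 stored in the first four bytes of each demo page), chosen only to make the example runnable; ESE uses its own page checksum format. All function names are hypothetical.

```python
import io
import struct
import zlib

PAGE_SIZE = 4 * 1024     # page size used by Exchange 2003 databases
CHUNK_SIZE = 64 * 1024   # size of each sequential read request


def make_page(payload: bytes) -> bytes:
    """Build a demo page: 4-byte CRC-32 of the body, then the body (stand-in format)."""
    body = payload.ljust(PAGE_SIZE - 4, b"\0")
    return struct.pack("<I", zlib.crc32(body)) + body


def page_is_valid(page: bytes) -> bool:
    """Recompute the stand-in checksum and compare it with the stored one."""
    (stored,) = struct.unpack("<I", page[:4])
    return stored == zlib.crc32(page[4:])


def scan(stream) -> list[int]:
    """Read 64 KB at a time; return the indexes of pages that fail verification."""
    bad, index = [], 0
    while chunk := stream.read(CHUNK_SIZE):
        for start in range(0, len(chunk), PAGE_SIZE):
            if not page_is_valid(chunk[start:start + PAGE_SIZE]):
                bad.append(index)
            index += 1
    return bad


pages = [make_page(f"page {i}".encode()) for i in range(40)]
damaged = bytearray(pages[17])
damaged[100] ^= 0x01          # flip one bit to simulate on-disk damage
pages[17] = bytes(damaged)
bad_pages = scan(io.BytesIO(b"".join(pages)))
print(bad_pages)
```

The point of the design is that verification happens during the read itself, so a damaged page is detected before the backup set is closed and before any logs are truncated.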
Backup
The IS must be running and all databases must be mounted
Support for storage groups and for a single store
Best practice: back up the entire storage group so that logs are truncated
Concurrent backups and restores are supported
Truncate Log Files (*.log)
Check-summing
Database files (*.edb)
Stream files (*.stm)
Backup Types
Type            Copies DB   Copies Logs   Truncates Logs
Full (Normal)       X            X              X
Copy/Daily          X            X
Incremental                      X              X
Differential                     X
Snapshot            X                           X
Offline             X                                      (Not recommended)
Exchange 2003 Restore Process Flow
Dismount Database
• Backup application or Administrator dismounts database
ESE Restore Mode
• Store informs ESE and restore mode entered
• Restore SG created
Begin Restore
• Agent copies EDB/STM from tape to DB path
• Log files from backup set are copied to temp restore location
End Restore
• Logs are processed by ESE restore instance
• Current logs processed by ESE restore instance
• Cleanup / Restore SG removed
Restore Complete: ESE Normal Mode
• DB is mounted by SG
• Data deleted from temporary directory
Restore
MSExchangeIS must be running to restore
Databases to restore must be dismounted and marked "This database can be overwritten by a restore"
Remaining databases can stay mounted
System Attendant is not used in restore
Restore.env file is created
Hard recovery via command line
Eseutil /cc
Restore: Restoring Files
Databases
Backup Agent asks the Store where to place the database, based on the database GUID
Store has the databases placed on top of the existing databases
Log files
Placed in a temporary log directory
Location of the temporary directory is specified by the user
Restore.env file is created
Restore.env
Replaces Restore In Progress Key
Placed in temporary log directory during restore
Data included in Restore.env
Restore path
Restore log file path
Storage group
System parameters for the restore storage group
Log file range
Restore time
Restore: Log File Replay
Log file signature checked
Log files replayed
First from temporary log file location
Then from running storage group
No logs can be replayed if Circular Logging is enabled
In this case the database is rolled back to the time of the backup
If there are multiple databases in a storage group, only the log records applicable to the failed database are replayed; the others are ignored
On-line vs. Off-line Backups
The IS must be running and the database mounted
Page-level checksum during online backup
Automatic log file truncation indicates a successful backup
Only online backups are supported
VSS or offline backups need a manual integrity check
TIP: Moving large files (Eseutil or ESEFILE)
When backing up to disk with NTBackup, use the /FU switch (file unbuffered)
http://www.msexchange.org/tutorials/OptimizeExchange-2003-Performance-Part2.html
Recovery Storage Group
Enables the administrator to verify backup consistency
Enables recovery of a single item from a backup
During a long recovery process, enables an instant solution for mail connectivity (Dial Tone)
It becomes the default location used by the Backup API for restores
After using the RSG, remove it, or set the registry key so that the Backup API ignores it
Location: HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\MSExchangeIS\ParametersSystem
Parameter: Recovery SG Override
Type: REG_DWORD Value: 0x00000001
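One way to apply the setting above is to import a .reg file. The sketch below only renders the text of such a file from the location, parameter name, and value given on the slide; the helper name is hypothetical, and the output should be reviewed before it is imported on a server.

```python
KEY = (r"HKEY_LOCAL_MACHINE\System\CurrentControlSet"
       r"\Services\MSExchangeIS\ParametersSystem")


def render_reg_file(key: str, name: str, dword: int) -> str:
    """Render the text of a .reg file that sets a single REG_DWORD value."""
    if not 0 <= dword <= 0xFFFFFFFF:
        raise ValueError("REG_DWORD must fit in 32 bits")
    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        f"[{key}]",
        f'"{name}"=dword:{dword:08x}',   # .reg files write DWORDs as 8 hex digits
        "",
    ]
    return "\r\n".join(lines)


reg_text = render_reg_file(KEY, "Recovery SG Override", 1)
print(reg_text)
```

Keeping the value in a file makes the change easy to document and to reverse once the Recovery Storage Group has been removed.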
Dial Tone Recovery Explained
Restore mail service immediately; restore data later.
New e-mail vs. historical data
Why swap databases?
When should you use this method?
Q282496: Considerations and Best Practices When
Resetting an Exchange Mailbox Database
Dramatically reduce EXMerged data
Preserve single instance storage and avoid database
bloat
Setting End User Expectations
Send immediate status email
Data is being recovered, but service has been
restored in the meantime
Time expectations
If you intend to swap the original database back in, users should not reconfigure rules, views, or offline files
Merge process
Outlook 2003 Exchange
Recovery Mode
The Message
“Exchange is currently in recovery mode. You can either
connect to your Exchange server using the network,
work offline, or cancel this logon”
The OST problem
Linked to a specific mailbox.
Dependent on a matching key in the current user profile.
Key is destroyed by connecting to a different mailbox.
Q282496: Considerations and Best Practices When
Resetting an Exchange Mailbox Database
Exchange Recovery Mode Options
Offline – Access to current (Old) OST only
Online – Access to new (Rest) OST only
Creating the Recovery Storage Group
Stop the entire storage group
Why? So you can preserve all log files
COPY all log files and MOVE affected database files
Start blank store
Send status message to all affected mailboxes
Restoring to the Recovery Storage Group
If possible, set paths to the same logical drive as the original database
Recover the database in the Recovery Storage Group
Mount and dismount the database in the Recovery Storage Group
Notify users of an impending outage
Swap database methods:
Move database files between folders on the same
drive
Move database files between drives
Swap logical paths
Simple Backup & Restore
Dump the restore.env file
Using the Recovery Storage Group
Dial Tone Restore using RSG
Exchange Snapshot Support
VSS components
Windows Server™ 2003 (VSS service and framework)
Requestor
Third-party backup applications
NTBackup does not use VSS to back up Exchange
Writer
Application-specific logic for participating in snapshot
process, and restore/recovery
Provider
Third-party hardware control software
Windows® includes a software provider
Snapshot services
Exchange VSS Basics
ExWriter.dll Installed with Exchange 2003
Exwriter and file system allows for a completely safe
snap
Fast restores (sometimes minutes)
During an extended snap:
Clients may hour-glass during submits
Outlook 2003 in cached mode experiences zero
production impact
VSS and Exchange 2003
Two restore types:
Point in time restore (victimized restore)
(Recovery is to time of backup)
Roll forward recovery
(Recovery is to time of failure)
VSS can restore to:
Same location
Alternate forest and server
What can VSS back up and restore?
Backup
Only Read-only access is allowed
Storage Group Administration functions prohibited
Backup Choices
Minimum selection is the storage group (SG)
(to truncate log files)
Can snap multiple storage groups at the same time
(best practice will be to snap individual SG)
On Backup Complete
Requestor validates Shadow image consistency using ESEUtil /k
Log files are truncated
Storage Group Administration functions resume
Writing to the database is allowed
Restore
Restore choices
Entire storage group
Single database
Multiple databases from a single SG
VSS Best Practice
Snapshot is not a complete backup replacement but an alternative to streaming; you still need to back up the databases
Make sure the requestor performs page checksumming to identify corruption, using the Eseutil /K switch
Exchange 2003 supports full backups and copy backups
Must restore to the same logical drive letters
Put each database on its own LUN
This allows snapshot restore of a single database
No native support for VSS restores to RSG
NTBackup does not support Exchange VSS
Using OPC to extract data from snapshot
Going from Pages to Mailboxes
Database is composed of pages.
Pages are linked together into B+-Trees
B+-Trees are collected into tables.
Tables cross-reference each other and store folders,
messages, mailboxes and database metadata
Eseutil ISInteg and all in between
ESE Level
Database is seen as tables and indexes, not folders,
messages, attachments.
Application (Information Store) level
Database is seen as folders, mailboxes, messages,
etc.
Eseutil understands the database at the ESE level
only.
ISInteg understands the database at the application
level only.
ESEUtil switches
/D = Defrag Mode
New signatures
Log files mismatch, immediate backup required
/T can be used to specify alternate location for the temp files
/M = Dump headers (/ML, /MK, /MH)
/R = Recovery performs soft recovery
/K = Integrity Check for Snapshots
/G = Checks Integrity at the ESE Level
No Changes Are Made, This is a reporting Tool
/F = Copy Mode
/C = Hard Recovery Mode (direct to restore.env)
/CC = Force Hard Recovery
/CM = Dump the restore.env file header
/P = Repair
/P /CreateSTM
When to perform a repair
/p -> /d -> “Isinteg –fix –test alltests”
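To make the switch list concrete, the sketch below assembles a few Eseutil command lines as argument lists without running them. It covers only the simplest form (one switch plus one target); the file paths and the `E00` log prefix are placeholders, and real runs usually need further options for log and system paths.

```python
# Switch -> short description, following the list above.
MODES = {
    "/mh": "dump database header",
    "/mk": "dump checkpoint header",
    "/ml": "dump log file header",
    "/k": "page checksum check",
    "/g": "ESE-level integrity check (read-only)",
    "/d": "offline defragmentation",
    "/p": "repair",
    "/r": "soft recovery (target is the log prefix)",
    "/cc": "hard recovery (target is the Restore.env folder)",
    "/cm": "dump Restore.env (target is the Restore.env folder)",
}


def eseutil_args(switch: str, target: str) -> list[str]:
    """Return an argument list for one Eseutil mode; nothing is executed."""
    switch = switch.lower()
    if switch not in MODES:
        raise ValueError(f"unsupported switch in this sketch: {switch}")
    return ["eseutil", switch, target]


plan = [
    eseutil_args("/MH", r"D:\Exchsrvr\mdbdata\priv1.edb"),   # placeholder path
    eseutil_args("/R", "E00"),                               # placeholder prefix
    eseutil_args("/CC", r"D:\TempLog\First Storage Group"),  # placeholder folder
]
for args in plan:
    # Quote arguments that contain spaces so the printed line can be pasted.
    print(" ".join(f'"{a}"' if " " in a else a for a in args))
```

Building the commands as lists first makes it easy to review the exact sequence (for example repair, then defragmentation, then ISInteg) before anything touches a database.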
Dumping File Headers using ESEUtil
Database Header
Database signature, log signature
State: Consistent or Inconsistent
Checkpoint Header
Shows checkpoint log file and signature
Transaction Log File Headers
Generation number
Log file signature
The attached databases
ISInteg
-Dump: Dumps Database folders and indexes
Isinteg –dump
-Test: Test for integrity level errors
-Fix: Fixes integrity level errors
Isinteg –fix –test alltests.
Logical vs. Physical Corruption
You must understand the cause of the failure
Log files are missing or corrupted
Hardware / Virus software etc…
Database file is corrupted
There are three layers of Information Store database
corruption
Page level
ESE level
Store level
Strategies for removing corruption
Restore an uncorrupted backup of the database
If possible
In case the backup is too old or cannot be used
Repair the database
Expunge the corrupted pages from the database
Salvage data and generate a new database
Repairing the Database
Remember that if the cause of the problem is corrupted or lost log files, the assumption is that the database is undamaged
Sometimes a simple check-disk does the trick
Running the Repair Function
Attempts to repair links
If it finds physical corruption, it will delete the page
Database signature is changed if fixes are made
After Running Repair
Run ISInteg -fix
Perform a full backup of the database
Lesson: Error -1018, -1019 and -1022
Error -1018: JET_errReadVerifyFailure
Bad checksum
Wrong page number
Error -1019: JET_errPageNotInitialized
Page expected in use is un-initialized (pgno =
0x00000000)
Error -1022: JET_errDiskIO
Generic disk I/O failure
-1018 Error Reporting
Exchange 2003 SP1
Event ID: 474
Raw Event ID: 474
Record Nr.: 34715
Category: Database Page Cache
Source: ESE
Type: Error
Generated: 10/10/2005 01:00:00
Written: 10/10/2005 01:00:00
Machine: Exchange
Message:
Information Store (4884) The database page read from the file "C:\EXCHSRVR\MDBDATA\C4SG2DB2.edb" at offset 14080122880 (0x00000003473da000) for 4096 (0x00001000) bytes failed verification due to a page checksum mismatch. The expected checksum was 1506336388 (0x59c8de84) and the actual checksum was 1237900932 (0x49c8de84). The read operation will fail with error -1018 (0xfffffc06). If this condition persists then please restore the database from a previous backup.
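The numbers in this event can be cross-checked with a few lines of arithmetic. The sketch below converts the offset to a page position, compares the two checksums bit by bit, and reinterprets the hexadecimal error code as a signed 32-bit value. The step from file position to logical page number assumes that two header pages precede page 1, which is the commonly documented ESE layout; treat that line as an assumption.

```python
PAGE_SIZE = 4096                 # bytes, from the event text
offset = 14080122880             # file offset, from the event text
expected = 0x59C8DE84            # expected checksum
actual = 0x49C8DE84              # actual checksum
raw_error = 0xFFFFFC06           # error code as printed in hexadecimal


def to_signed32(value: int) -> int:
    """Interpret an unsigned 32-bit value as a signed 32-bit integer."""
    return value - (1 << 32) if value & 0x80000000 else value


file_page_index = offset // PAGE_SIZE   # position counted in 4 KB units
assumed_pgno = file_page_index - 1      # ASSUMPTION: two header pages before page 1
diff = expected ^ actual                # bits that differ between the checksums
flipped_bits = bin(diff).count("1")

print("offset is page-aligned:", offset % PAGE_SIZE == 0)
print("position in 4 KB units:", file_page_index)
print("assumed logical page number:", assumed_pgno)
print("differing bits:", hex(diff), "count:", flipped_bits)
print("signed error code:", to_signed32(raw_error))
```

In this event the expected and actual checksums differ in exactly one bit, and the hexadecimal code decodes to -1018. A single differing bit is consistent with the hardware-level root causes discussed for this error.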
-1018 Error Reporting
Exchange 2003 SP2
Root Causes of Error -1018?
Hardware
Firmware
File system corruption
Virus Protection software
Not Exchange!
With some exceptions--having to do with false positives
and negatives, not actually causing -1018s
A -1018 would be Exchange’s fault if it:
Constructed the wrong checksum for a page
Dropped a page in the wrong place in the database
This does not mean that Exchange has no corrupting
bugs—but these errors are not -1018.
How serious is a -1018? When do you see it? What’s on
the page?
During normal operation? (somewhat serious)
During startup? (likely fatal)
During backup (may be minor)
What causes Error -1022?
Any disk I/O failure
File damage or truncation
File locked by another process
Anti-virus software
Almost always fatal to the service, but does not
necessarily indicate database damage
Troubleshooting
Check file size/lock status
Reboot to clear locks and other problems
Do not conclude the database is damaged until you
see that it is
Perform Soft Recovery Using Eseutil
Repair Exchange Database Using Eseutil
Running ISInteg /Fix
Exchange DRA tool
Restore AD Requirements
Exchange 2003 disaster recovery assumes that the
Active Directory is available and if necessary fully
recovered as well
Running Setup /DisasterRecovery rebuilds the
local box and does not re-write ANY data to the AD
Note: Exchange setup does not enforce or check
that objects already exist in the AD
Full Exchange Server Restore
Reconfigure hardware drives similar to original server
Reinstall operating system using the old server name
Install service packs and fixes
Install Exchange
Setup /DisasterRecovery
Install Exchange service packs and fixes
Restore Exchange databases
Any certificates must be restored separately
Using the Setup /Disasterrecovery switch
Alternate Server Restores
[Diagram: two Active Directory forests, one holding the production server and one holding the restore server. The backup is restored on the restore server and data is copied back to production as .pst.]
Alternate Server Restores
Configuring the restore server
Configure new Windows 2003 Server with the latest
hotfixes/service packs
DCPROMO to create a new forest
Only one Exchange org per AD forest
alternatedomain.corp.mycompany.com
Install DNS as a standard primary, allowing dynamic
updates, and point to self as DNS server
Install Exchange using same org name and
Administrative Group name
Alternate Server Restores
LegacyExchangeDNs must match
/O=organization/OU=site/CN=container/CN=object
First Exchange 2003 admin group usually is “First
Administrative Group”
AG display names can be changed, but this doesn’t
change the legacyExchangeDN
To match DNs
Use Event 1088 on the Eventvwr or use ISInteg
/DUMP
Use LegacyDN tool to modify legacyExchangeDNs
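The matching requirement can be checked mechanically before a restore is attempted. The sketch below splits a legacyExchangeDN of the form shown above into its components and compares the organization and administrative group parts (/O and /OU) of two values. The sample DNs are made up for illustration, and the helper names are hypothetical.

```python
def parse_legacy_dn(dn: str) -> list[tuple[str, str]]:
    """Split '/O=Org/OU=Site/CN=...' into (type, value) pairs."""
    if not dn.startswith("/"):
        raise ValueError("a legacyExchangeDN is expected to start with '/'")
    pairs = []
    for part in dn[1:].split("/"):
        key, sep, value = part.partition("=")
        if not sep or not value:
            raise ValueError(f"malformed component: {part!r}")
        pairs.append((key.upper(), value))
    return pairs


def same_org_and_site(dn_a: str, dn_b: str) -> bool:
    """True when both DNs share the same /O and /OU values (case-insensitive)."""
    def prefix(dn: str) -> list[tuple[str, str]]:
        return [(k, v.lower()) for k, v in parse_legacy_dn(dn) if k in ("O", "OU")]
    return prefix(dn_a) == prefix(dn_b)


# Illustrative values only.
production = "/O=Contoso/OU=First Administrative Group/CN=Recipients/CN=jsmith"
restore_ok = "/o=contoso/ou=first administrative group/cn=Configuration/cn=Servers"
restore_bad = "/O=Contoso/OU=Recovery AG/CN=Configuration/CN=Servers"

match_ok = same_org_and_site(production, restore_ok)
match_bad = same_org_and_site(production, restore_bad)
print(match_ok, match_bad)
```

A mismatch in either part is the situation in which the LegacyDN tool would be needed on the restore server.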
Alternate Server Restores
Create a new storage group and database with
matching display names
Or rename existing ones
Dismount database and mark “This database can be
overwritten by a restore”
Restore backup sets and mount database
Run Cleanup Agent and reconnect
Log on and extract data
To .PST using Outlook
Or, use Exmerge
Ronen Gabbay
Alternate Server Restore
Rename the Legacy DN using the LegacyDN tool
Mounting the database
Create all users using Mailbox Recovery Center
Bulk reconnect all users to their mailboxes
Disaster Recovery Strategies
Service Level Agreements (SLA)
Monitoring and Notifications
Exchange Technical Expertise
Use relatively small database files
Build Documentation / Change Control / Patch
Management
Firmware Updates
Software Updates
Documented Disaster Recovery Plan
Regular Recovery Testing
Review
Exchange is a mission critical application
Exchange Database Architecture
Deleted items & Mailbox retention time.
Backup and Restore Architecture
Recovery Storage Group
Dial Tone Restore
Exchange Snapshot Architecture
Eseutil & ISInteg and all in between
Full Server Restore procedure
Alternate Server Restore procedure
Thank You !
© 2005 Microsoft Corporation. All rights reserved. This presentation is for informational
purposes only.
MICROSOFT MAKES NO WARRANTIES, EXPRESS OR IMPLIED, IN THIS SUMMARY.