and “When” before delving into “How to” options
Download
Report
Transcript and “When” before delving into “How to” options
Exploring Database Archival Strategies
With emphasis on “Why” and “When”
before delving into “How to” options
By: Ben Aminnia
President, L.A. SQL Server Professionals Group www.sql.la
Database Architect, Pointer Corporation www.pointercorp.com
Objectives
What “is” and what “is not” covered in this
presentation
The first question we should ask ourselves is
not “How to archive” but rather …
Why do we need to archive and what happens
if we don’t?
And then …
2
Objectives
The “How to” part will encompass multiple
questions …
How do we plan and design the archive from
an architect’s perspective?
How do we look for alternative approaches?
How do we choose among those alternative
approaches?
And more importantly …
Remember that we’re not alone in the decision
process.
3
Management Concerns
Before the archive process starts,
management must approve the approach.
How do we defend and justify our selected
approach?
When talking to IT Management
When talking to Non-IT Management; CEO’s; CFO’s
How do we put ourselves in their shoes?
How do we build a decision matrix to compare
various alternative approaches?
What criteria columns should we put on the
decision matrix?
What’s the cost / benefit summary of different 4
alternatives on our decision matrix?
Management Concerns
How do we measure cost / benefit?
One-time / Initial Costs
Recurring / Annual Costs
What about measuring benefits? Have you thought
about this?
Do we have a Service Level Agreement
(SLA)?
5
Management Concerns
But the challenge of seeking management
approval may go well beyond that;
They may also ask: How did we get here?
Why didn’t we do this and that earlier?
Why can’t we save nothing and just recreate?
Trying to solve one problem while creating
other problems …
L.A. Traffic is Bad
Make all highways one-way to Big Bear Lake
That will solve L.A.’s traffic problem
Let the mayor of Big Bear Lake worry about
their traffic problem; that’s not my problem!6
Why Do We Archive?
Increasing Cost of Storage / Hardware
Performance Degradation / Response
Time
Regulatory / Government Requirements
Application Requirements
Must show current year only
Data transfer to disconnected users
Part of a bigger picture, beyond the scope of
our role in the project
It’s part of SLA!
7
When Do We Archive?
Once a year
During spring cleaning season
When something breaks unexpectedly
and then everyone wakes up and says
“Oops! We forgot to archive.”
When we have budget
When we have nothing else to do
When we are told to get it done by
Monday
8
When Do We Archive?
When DB size approaches a predefined
threshold …
1 GB
10 GB
100 GB
1 TB
The important point is to understand the issue
and to have a strategy for addressing it.
9
Time for a Quick De-Tour!
10
Using ASP.NET, XSLT, and XML to Take
SQL Server to a New Height
SQL Server as a Document Repository
Part 4 - The Database Archival Challenge
By: Ben Aminnia
President, L.A. SQL Server Professionals Group www.sql.la
Database Architect, Pointer Corporation www.pointercorp.com
Agenda
Part 1 Review Summary: Background and
Overview of the VIP System Architecture
Part 2 Review Summary: Generating Reports &
Graphs with SSRS and MS-Chart
Part 3 Review Summary: The Road Ahead –
Using SQL Server as a Document Repository
Part 4: The Database Archival Challenge!
Questions and Answers
12
Architectural Notes and
Challenge for the DBA
Each record is about 100 KB large;
So it takes about ten thousand records to reach
one GB in DB size;
There’s no physical deletion; deleted records
are only marked for deletion (with
[isdeleted]=1);
13
Some Fundamental Questions
There are many questions on How, What, and
Why to deal with database archival process.
Who’s on 1st?
Should we plan archival when we’re running out
of space? … or performance has gone down? …
or some other company policy mandates it?
OR … as the famous database architect, Julie
Andrews, sings in The Sound of Music …
LET’S START AT THE VERY BEGINNING!
14
The First Question:
HOW?, WHAT?, or WHY?
There are six different possible orders to ask
these questions …
I think the answer is:
•
WHY?
•
WHAT?
•
HOW?
15
Why?
Increasing cost of storage / hardware
Performance Degradation / Response Time
Application requirements (e.g. must show
current year only)
16
What and Where to?
The whole record is moved to another location
and deleted from the main location
Only part of the record is moved
Destination could be …
To another DB
To the file system
No longer online
17
How?
How to archive?
How to retrieve the archived record?
What are the possible alternative from an
architectural perspective?
18
Four Alternative Ways …
Method 1: Store in archived
location (e.g. on the network
file system) from the beginning.
There’s nothing to archive periodically or at a
later time.
This used to be the most common way for
document archival, before the XML technology
which we started to use in the VIP Letters
system.
I stored over 50,000 documents from one of
my applications this way.
19
Four Alternative Ways …
Method 2:
Periodic archive of the
whole record to a different database.
Move last year’s data to a different database
with identical format.
Delete the entire archived record from the main
database.
Main database remains small and portable.
Helps with response time.
Also helps with portability (e.g. when laptop
users need to have a small version of database
on their local drive, while disconnected from
corporate network or the internet).
20
Four Alternative Ways …
Method 3: Partial archive of the
old records …
This is the case of our VIP Letters architecture.
Each record is about 100 KB.
10,000 records are almost 1.0 GB.
Most of it is the XML column data which holds
the saved letter.
The archive process will move the XML part of it
to the network share on the file system …
Keeping the other data columns in the main DB.
You then set the “IsArchived” column to 1.
21
Four Alternative Ways …
Method 4: No longer online …
Very common practice when a government or
regulatory agency mandates only x number of
years to keep records online.
The archived records are then scanned and
stored offline (on a tape or in paper form).
22
Back to Our Discussion
Here
23
In Summary
Archival is done in Phase 4
And that’s when it should be done
Nobody said archival should be done in
Phase 1
But we should have it in sight – on the
horizon – from the beginning
Archival planning / implementation
should not come as a surprise!
Example: DB and Website Size Tracking
(before the archival time)
24
Four Methods to Archive
Again, from my other presentation, we
looked into four methods to archive:
Method 1: Store in archived location (e.g.
on the network file system) from the
beginning.
Method 2: Periodic archive of the whole
record to a different database.
Method 3: Partial archive of the old records
Method 4: No longer online
25
A Closer Look at Method 3
Method 3: Partial archive of the old records
Save a 2nd copy of the document on the file
system and then delete it from the XML
column of the database
What tracking columns do we add to the main
Document Archive table?
How / when to copy document(s) to the file
system …
During the original creation
Later; one-by-one; on demand
Later; in batches (e.g. older than 1/1/2005)
How to retrieve it back from the file system 26
A Closer Look at Method 4
Method 4: No longer online
That is, neither in the XML column nor on the
file system
How do we recreate it later when needed?
Scan the paper copy?
I don’t think so!
Regenerate using original letter’s parameter
values which are still in the database?
What if the original template (XSLT) has
changed and recreate doesn’t look like the
original anymore or it fails?
27
Beyond Four Methods
Method 5: Totally move from primary database
to anther (non-DB) medium (e.g. network
share / tape)
More common in legacy system; Not one of my
options.
Method 6: Don’t move anything; just set the
archived flag and use a VIEW to filter out
archived records.
Makes sense if the objective is “visibility” and
not “space” or “response time” or “data
transmission between network and local for
disconnected mode”
28
Beyond Four Methods
Method 7: Use “The Cloud”
A viable alternative for space limitations in a
hosted environment or cost considerations
It’s really NOT an archival alternative.
Gave it a shot
Asked for a compatibility test
Nice presentation, but not compatible!
Analogy with a personal power-generator vs.
electrical outlet connected to DWP …
29
The “Cloud” - Before
My Personal
Power Generator
30
The “Cloud” - After
X
31
The “Cloud” - After
X
What if I need a 150 V generator?
32
Final Thoughts
Before Meeting with Management
Be Prepared to Answer Questions
Which functions will NOT work on archived
records? (e.g. Full-Text Search)
Have a Decision Matrix, showing all options
with pros and cons of each
User Interface to Retrieve an Archived Record
From the main application
From a secondary application solely for the purpose
of archive retrieval
By sending a request to a service application or email
to a designated contact
Turnaround time and other pros and cons for
each of the above approaches
33
The Solution Matrix
Solution Alternatives
Problems Solved
Cost of
Storage
Performance / Application
Response Time Requirements
1. Store in archive location from beginning ?
?
?
2. Move from primary to history DB
√
√
√
3. Partially move; e.g. keep the record but
empty the XML column
4. Just delete the XML column without
moving anything; rebuild (based on most
recent template) if necessary
√
√
X
√
√
X
5. Move from primary table to anther
(non-DB) medium (e.g. network / tape)
6. Don’t move anything; use filters
√
√
√
X
X
√
7. Use “the cloud”
√
X
X
34
The Solution Matrix
Final Thoughts
Ideally, each solution alternative on the matrix
needs the following:
Development Costs
Infrastructure Costs
Maintenance Costs
Which problem(s) are being addressed
Which problem(s) are NOT being addressed
What new problems might be introduced
Eventually, as the IT architects, we will
be responsible for the outcome!
35
Questions and Answers
36
Contact Information
•
Emails:
•
[email protected]
[email protected]
Websites:
www.sql.la
www.pointercorp.com
www.vipletters.com
37
Thank You!
38