SEAS VLDB - Microsoft

Download Report

Transcript SEAS VLDB - Microsoft

SEAS – 2006
SQL Server 2005
VLDB Management & Operation
Agenda




The challenge of Very Large Databases
Take the VLDB challenge and break into smaller
manageable designs
SQL Server 2005 VLDB technologies
 Table partitioning
 Online operations
 Improved performance
 Improved concurrency
 Piecemeal management
Capacity Planning
 Storage






Disks
Tempdb
Transaction Logs
Memory
CPU
VLDB Lessons Learned
SEAS06 - VLDB Management & Operation
2
Very Large Databases
Trends
Drive capacity doubles every
12-18 months
10 TB
Drive cost drops
Data retention driven by
• Regulatory requirements
• Business analysis
5 TB
2 TB
1 TB
Database size doubles every
year for the same cost
SEAS06 - VLDB Management & Operation
3
WinterCorp Study - SQL Server/Windows Growth projections
2001
2003
2005
2007
2009
2011
Minimum entry
criteria for
Windows
100 GB
500 GB
1 TB
2TB
4TB
8TB
Total number of total
entries for
all
databases
168
141
175
???
???
???
Number of entries,
Windows
only
20
36
60
95
150
250
Number of entries,
SQL Server
10
24
43
80
140
240
Number of entries,
SQL Server,
> 1 TB
0
15
43
80
140
240
800 GB
5TB
20TB
100TB +
500TB +
2000TB +
Largest SQL Server
Instance
Projections
SEAS06 - VLDB Management & Operation
4
Challenge of Very Large Databases

Large scale operations including…








Index creation and maintenance
Bulk insertion of new data
Bulk deletion of old data
Concurrent reporting and updates
Backup and restore
Disaster recovery
Detecting and recovering from media errors
Tuning and Performance
…while maintaining availability as dictated
by business requirements.
SEAS06 - VLDB Management & Operation
5
Agenda




The challenge of Very Large Databases
Take the VLDB challenge and break into smaller
manageable designs
SQL Server 2005 VLDB technologies
 Table partitioning
 Online operations
 Improved performance
 Improved concurrency
 Piecemeal management
Capacity Planning
 Storage






Disks
Tempdb
Transaction Logs
Memory
CPU
VLDB Lessons Learned
SEAS06 - VLDB Management & Operation
6
Design Philosophy VLDB


Don’t think of it as Large complex
database, think of it as smaller
manageable components
Partitioning data into smaller manageable
sizes is your best friend


Data Partitioning vs Function Partitioning
Methods for data partitioning
 Database Partitioning
 Instance Partitioning
 SQL Server Partitioning
SEAS06 - VLDB Management & Operation
7
Data separation: Data Partitioning vs
Function Partitioning

Examples of data partitioning by data



Data broken up by States in the US: Alabama,
Arkansas, Alaska…..
Data broken up by months in a year: Jan, Feb,
Mar……
Examples of data partitioning by function


Data separated by business users: sales data,
consulting data, competitive data….
Data separated by business function:
manufacturing data, design data, inventory
data…….
SEAS06 - VLDB Management & Operation
8
Example #1: VLDB Case Study
Function Partitioning




Manage 13TBs of data
Manage large files from several external
customers / day
Load millions of rows from multiple files
and run batch reporting at the same time
How do I manage different SLAs from
different customers with same system.
SEAS06 - VLDB Management & Operation
9
Example #1: VLDB case study –
Function Partitioning
Daily Customer Source files
Account Function
Tracking DB
Individual Account
Databases
Function X Tables
Common Database
Function Y Tables
Common Function
Procedures - Data
SEAS06 - VLDB Management & Operation
Function Z Tables
10
Function Partitioning is scale out

Partitioning is a VERY good idea




Partitioning by business function is very logical
Application must developed to partitioned data versus
using database technology to access partitioned data.
NOT DIFFICULT
Putting all the data in one single database
doesn’t make you a hero.
No matter how you look at it 13TB of data is
13TB of data used to solve a business problem.
SEAS06 - VLDB Management & Operation
11
Example #2: VLDB Case Study
data partitioning






Loads into SQL Server 350 million call detail
records / day. Bulk loaded
Must keep a history
Reporting done all day long based off daily,
weekly, monthly quarterly and yearly intervals
Very partitionable by period of time
5-7 TB database
This was based off a prototype, project being
developed.
SEAS06 - VLDB Management & Operation
12
Example #2: VLDB case study
data partitioining

3 types of databases







Type 1 contains 24 - 1 hour tables
Type 2 contains 7 - 1 day tables
Type 3 contains 7 – 1 week tables
Online consolidation was much more difficult, so all 3
are built in parallel. This is good design if Hardware has
head room
Data is loaded without indexes and then indexed
Data made available to users once indexed
Yesterday, Last Week and Last 7 weeks are available
while Today and This week are being loaded.
SEAS06 - VLDB Management & Operation
13
Example #2:
VLDB Design if using Single server
SQL Server
View
PView
Last week
Week n-7
Last week
This week
This week
Yesterday
Today
SEAS06 - VLDB Management & Operation
Yesterday
Today
14
Example #2: VLDB Design if
changed from 1 Server to 4 Servers
Last week
Week n-7
Last week
This week
This week
Yesterday
Today
SEAS06 - VLDB Management & Operation
Yesterday
Today
15
Example: #2: Design is use 7 server
partitioning
Last week
Last week
This week
This week
Yesterday
Yesterday
Today Today
SEAS06 - VLDB Management & Operation
Yesterday
Today
16
Scalable Shared Database

Read-only copy of a Data Warehouse
database on a SAN can be shared by up to
eight SQL query servers




Network Load Balancing is possible
Inexpensive scale-out solution as alternative to
scaled-up server
Easily support a large or growing group of
query users
SANs can create a shared read-only copy of
the database instantly using SAN Snapshot
technology


Copy on write technology
Refresh the snapshot after new data is loaded
SEAS06 - VLDB Management & Operation
17
Example #3: VLDB Scale out partitioning
SEAS06 - VLDB Management & Operation
18
Agenda




The challenge of Very Large Databases
Take the VLDB challenge and break into smaller
manageable designs
SQL Server 2005 VLDB technologies
 Table partitioning
 Online operations
 Improved performance
 Improved concurrency
 Piecemeal management
Capacity Planning
 Storage






Disks
Tempdb
Transaction Logs
Memory
CPU
VLDB Lessons Learned
SEAS06 - VLDB Management & Operation
19
VLDB SQL Server Technologies:
Table Partitioning

Designed for:




Easy management of very large tables and
indexes
Maintain high availability
Make effective use of large machines (16, 32,
and more CPUs) for any type of workload
Provide performance options for maintenance
SEAS06 - VLDB Management & Operation
20
Table Partitioning: Common Scenario
“Sliding Window”

New data is added continually to a very large
database (Daily, Weekly, Monthly)



Data must be scrubbed, indexed, etc.
As data ages, it is modified less frequently
Eventually, it may be discarded
Sliding Window
Time
Obsolete
Read-Only
Read-Mostly
SEAS06 - VLDB Management & Operation
Hot
21
Locating A Row
Key (partitioning column)
Partition
Scheme
Partition
Function
FileGroup 1
FileGroup 2
SEAS06 - VLDB Management & Operation
FileGroup 1
22
Aligned Index
One-to-one partition correspondence
Index Partition
Table Partition
SEAS06 - VLDB Management & Operation
23
VLDB SQL Server Technologies:
Online Operations




Index creation
Index maintenance
Online restore
Availability during rollback recovery
SEAS06 - VLDB Management & Operation
24
Online Index Operations
SQL Server 2005

Introduced as WITH (ONLINE=ON)





CREATE INDEX
DROP INDEX
ALTER INDEX REBUILD (was DBCC DBREINDEX)
ALTER INDEX add or drop constraint (primary key,
unique)
ONLINE is supported on partitioned tables but it
is not supported for ALTER INDEX REBUILD on
single partition
SEAS06 - VLDB Management & Operation
25
VLDB SQL Server Technologies:
Online Restore

Microsoft® SQL Server™ 2000


Database is not available during restore
SQL Server 2005

Database remains online



Only data being restored is unavailable
Works on Enterprise Edition “by default”, on nonenterprise the whole database is always offline
Options


File / filegroup restore
Damaged page tracking and restore
SEAS06 - VLDB Management & Operation
26
VLDB SQL Server Technologies:
Improved Performance

Bulk insertion and deletion of data


File initialization




Most improvement through partitioning (ALTER TABLE SWITCH
mentioned before)
Skip zeroing out the file(s)
Permission grantable only by Windows® administrator
SE_MANAGE_VOLUME_NAME special privilege
Tuning


ITW renamed to DTA (Database Tuning Advisor)
Time-bound tuning
SEAS06 - VLDB Management & Operation
27
VLDB SQL Server Technologies:
Improved Concurrency

Transaction isolation levels

Database Snapshot
SEAS06 - VLDB Management & Operation
28
Data Access Concurrency

Transaction isolation levels
 Serializable
 Repeatable Read
 Read Committed

Read Committed Snapshot

Read Uncommitted

Snapshot
SEAS06 - VLDB Management & Operation
29
VLDB SQL Server Technologies:
Snapshot Isolation

Increased data availability for read applications




Permits writes, which can cause conflicts



Allows non-blocking consistent reads in an OLTP environment
Writers do not block readers
Readers do not block writers
BUT… includes mandatory conflict detection, therefore…
Consider Snapshot Isolation when the cost of locking data
outweighs the cost of occasionally rolling back a transaction
Whitepaper

“SQL Server 2005 Beta 2 Snapshot Isolation” on
http://www.microsoft.com/technet/prodtechnol/sql/2005/SQL05B.mspx
SEAS06 - VLDB Management & Operation
30
VLDB SQL Server Technologies:
Database Snapshot

Snapshot of a database at a point in time




Base database continues to change



Created instantly
Read only
Transactionally consistent, time-consistent
Database Snapshot does not restrict access to the base
database
Multiple Snapshots are allowed
Database Snapshots can exist forever

Constrained by resources
SEAS06 - VLDB Management & Operation
31
Database Snapshot


Database snapshots do consume resources on
the server. Example: Buffer Pool
Tested – TPC-C run lost less than 15% perf with
single Database snapshot taken
CREATE DATABASE dbSnap AS SNAPSHOT OF mydb
USE mydb
mydb – Database
UPDATE (pages 4, 9, 10)
Page
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
dbSnap – Read-Only Database Snapshot
USE mydbSnap
SELECT (pages 4, 6, 9, 10, 14)
SEAS06 - VLDB Management & Operation
32
VLDB SQL Server technologies:
Backup & Restore

Common Challenge vs. Time



SQL Server Solutions




Backup large data volume
Restore/recover large data volume
SQL Server backup/restore improvements
Online piecemeal restore
Database Snapshots
3rd party solutions


SAN based solutions
Software based solutions
SEAS06 - VLDB Management & Operation
33
Filegroups
Basic unit of availability


Earlier Versions of SQL Server
 Entire database
SQL Server 2005
 Database can be online, yet not all filegroups are
available
SEAS06 - VLDB Management & Operation
34
Example
Server startup
Database
Primary
Filegroup
Filegroup A
Database is available
Filegroup A is
unavailable due
to missing file
Filegroup is now available
Filegroup B
DBA locates file
and recovers the
filegroup
SEAS06 - VLDB Management & Operation
35
Piecemeal Backup-Restore



Partial database availability
during restore
Based on file / filegroup backup
Works with all recovery models



Full, Simple, Bulk-Logged
Supported only for databases with
multiple filegroups
Filegroups must be recovered to consistent point
SEAS06 - VLDB Management & Operation
36
Example
Piecemeal restore for disaster recovery

Online restore of filegroups
in priority order
Backups
Database
Available
Primary
Filegroup
Primary
Filegroup A
Log
Filegroup A
Filegroup B
Filegroup B
SEAS06 - VLDB Management & Operation
37
Online Piecemeal Restore

Principle




Benefit


Filegroups can be restored in stages
Single object restoration (through filegroup) into temporary
database
Logical correction possible to corect error
Granular and quick restoration of table and partition at filegroup
level
Steps



Backup tail of active transaction log
Using piecemeal restore to create temporary work database
 Restore primary db/log to temporary work db with
“norecovery”, “partial”
 Restore filegroup
 Restore from backup of current log
Logically recovers data
See SQLCAT blog by Tom Davidson –
http://blogs.msdn.com/sqlcat/archive/2005/10/27/485580.aspx
SEAS06 - VLDB Management & Operation
38
Short Restore Sequence
For Read-Only Data

Log backups need not be restored when:




File / filegroup is unchanged since backup
File has been rolled forward to where it is read-only
Useful for read-only tables or partitions
Set filegroup to read-only

Recommended for FULL and BULK-LOGGED databases



Prevents inadvertent change
Prevents changes by background tasks
Required for SIMPLE model databases

Log backup for roll forward not available
SEAS06 - VLDB Management & Operation
39
VLDB SQL Server technologies:
Statistics Management

Upgrade from SQL Server 2000




SS2K statistics are rendered invalid
SS2K5 will automatically build statistics after
upgrade upon query execution
 If you don’t run it manually initial performance
may/will suffer
SS2K5 statistics has changed
 Now use formula O(sqrt(tablecard))
 For very large tables this most likely will be less
than 1%
Recommendation: For large tables use 10%
sampling, but you must test and try different
values. The right answer could be 5% or 50%.
SEAS06 - VLDB Management & Operation
40
Agenda




The challenge of Very Large Databases
Take the VLDB challenge and break into smaller
manageable designs
SQL Server 2005 VLDB technologies
 Table partitioning
 Online operations
 Improved performance
 Improved concurrency
 Piecemeal management
Capacity Planning
 Storage and object management







Disks
TempDB
Index Management
Transaction Logs
Memory
CPU
VLDB Lessons Learned
SEAS06 - VLDB Management & Operation
41
Storage Planning…

Planning for SAN



Most VLDB uses SAN storage
Standard disk/controller number may not
apply due to large cache and pre-fetch
algorithm
Recommendation


Using SQLIO or IOMETER to test/benchmark
the SAN prior to implementing SQL Server
Define storage throughput requirement for the
application – e.g. my data warehouse need to
push through 500mb/sec
SEAS06 - VLDB Management & Operation
42
Understanding your SAN IO
Subsystem
Server
SAN
CPU
Client
Mirrore
d
Cache
SQL Server
memory
CPU
CPU ……
SAN Cache
SQL Server
Disks
Controllers
SEAS06 - VLDB Management & Operation
43
Example – insert stmt / commit to log
Server
SAN
CPU
Client
Mirrore
d
Cache
SQL Server
memory
CPU
CPU ……
SAN Cache
Data
Page
Log Record Only
SQL Server
Disks
Controllers
SEAS06 - VLDB Management & Operation
44
Example: write completed
note: Data not on disk
Server
SAN
CPU
Client
Mirrore
d
Cache
SQL Server
memory
CPU
CPU ……
SAN Cache
Data
page
IO COMPLETE
SQL Server
Disks
Controllers
SEAS06 - VLDB Management & Operation
45
Next System Checkpoint – data
page(s) get written to SAN – note:
again not on disk
Server
SAN
CPU
Client
Mirrore
d
Cache
SQL Server
memory
CPU
CPU ……
SAN Cache
Data
Page
SQL Server
Disks
Controllers
SEAS06 - VLDB Management & Operation
46
Periodically the SAN will flush its
data to disk (de-staging)
Server
SAN
CPU
Client
Mirrore
d
Cache
SQL Server
memory
CPU
CPU ……
SAN Cache
SQL Server
Disks
Controllers
SEAS06 - VLDB Management & Operation
47
Storage Planning


SQL Server guidelines – IO throughput
Capacity planning considerations:


# of spindles to handle mb/sec read requirements of
the application
# of spindles to handle de-staging of the SAN cache
Device
Speed
Comments
Disk Speed
200-250 IO(s)/sec
- 13-16mb/sec(64K)
- 2mb/sec (8K)
Note: HW vendors will tell you disk are 500+ IOs/sec and
125+mb/sec. These numbers are raw file speeds with very large
block sizes of 256K or greater. SQL Server reads at 8K, 64K or
larger and writes at 8K or 64K
Controller
125-150 mb/sec
Note: based on per channel / per controller basis
SQL Server random read
4-5 ms
Note: This is just the IO completion, not the complete transaction
SQL Server write
1 ms
Notes:
 Assuming write through cache on controller and SAN.
 This is cost of Windows write in perfmon – AVG DISK SEC /
WRITE counter.
 Log only as data pages are lazy written
 Numbers do not include de-staging of SAN Cache
SEAS06 - VLDB Management & Operation
48
Transaction Log Planning…





Database scalability is limited by the maximum
throughput of the transaction log (Disk I/O)
Plan on 1 ms / log write (utilizing write through
cache) – Avg Disk Sec/Write counter in perfmon
(using SAN)
Note: SQL Server will combine multiple write
requests into a single log write (See diagram next
page) but keep order
Fastest configuration is multiple files and
multiple disks striped to handle de-staging of
cache and multiple log operations (Replication,
Triggers, etc.)
Ballpark – (Depends on many variables): plan on
40-60K database transactions / single log
SEAS06 - VLDB Management & Operation
49
Database Log Thoughput
SQL Server
Update, insert, delete
requests
SS Memory
Data
pages
.
.
.
2. Data page written to SS Memory
3. Write request issued to write
out log to IO Subsystem
Log Buffer
1. Insert happens
4. By the time Log write IO is issued
Several other log writes piggyback
SEAS06 - VLDB Management & Operation
50
TEMPDB Planning…

SQL Server 2005 tempdb requires more
space




Some new features in SQL Server 2005
Some features that use tempdb in earlier
versions of SQL Server may require more
tempdb disk space in SQL Server 2005.
Some features that do not use tempdb in
earlier versions of SQL Server use tempdb in
SQL Server 2005
Good MSDN article: Capacity Planning for
tempdb: http://msdn2.microsoft.com/enus/library/ms345368.aspx
SEAS06 - VLDB Management & Operation
51
TempDB capacity planning

On line index:


2x-3x size of index – Sort size, temp index and rollback
Versioning:



[Size of Version Store] = 2 *
[Version store data generated per minute] *
[Longest running time (minutes) of your transaction] *
number of concurrent transactions/users
Note: Version store data generated per minute and
version store size are now a perfmon parameters
Recommendation repeated: Be sure to tune TempDB
for proper sizing as well as performance
SEAS06 - VLDB Management & Operation
52
TEMPDB capacity Planning…
a more reliable method

Determine tempdb space used
select
sum(user_object_reserved_page_count)*8 as
user_objects_kb,
sum(internal_object_reserved_page_count)*8 as
internal_objects_kb,
sum(version_store_reserved_page_count)*8 as
version_store_kb,
sum(unallocated_extent_page_count)*8 as
freespace_kb
from sys.dm_db_file_space_usage
where database_id = 2
SEAS06 - VLDB Management & Operation
53
TEMPDB capacity planing.....
What tasks are using tempdb?

Determine tempdb space used by Task
SELECT t1.session_id,
(t1.internal_objects_alloc_page_count + task_alloc)
as allocated,
(t1.internal_objects_dealloc_page_count + task_dealloc)
as deallocated
from sys.dm_db_session_space_usage as t1,
(select session_id,
sum(internal_objects_alloc_page_count)
as task_alloc,
sum (internal_objects_dealloc_page_count)
as task_dealloc
from sys.dm_db_task_space_usage group by session_id)
as t2
where t1.session_id = t2.session_id
and t2.session_id >50
order by allocated DESC
SEAS06 - VLDB Management & Operation
54
Index Operations Lessons Learned

Online index build logs the entire contents of the target
index pages. Hence log space requirements increase
significantly.




You can specify to use TempDB to avoid log sizing and
contention.
Recommendation: specify sort_in_tempdb option for online
index build and tune temp_db to avoid excessive log activity
Online index not available on a per-partition basis for
partitioned tables.
ALTER INDEX … REBUILD of a corrupted non-clustered
index may use another index (and that may be
theoretically corrupted as well).

Recommendation: Therefore use DBCC repair rather than alter
index…rebuild
SEAS06 - VLDB Management & Operation
55
TempDB for Index Operation…
TEMPDB space used / Size of the index
Single column index
8% of row size
(4 bytes out of 46 bytes)
Mappin
g Index
Operation
Sort
runs
Total
3 columns index
25% of row size
(10 bytes out of 42 bytes)
All columns index
100% of row size
(42 bytes out of 42 bytes)
Mappin
g Index
Mappin
g Index
Sort
runs
Total
Sort
runs
Total
Online clustered
index create
0.33
1.1
1.43
0.52
1.12
1.64
1.21
1.11
2.32
Online clustered
index rebuild
0.17
0
0.17
0.31
0
0.31
1.01
0
1.01
Online non-clustered
Index create
n/a
n/a
n/a
n/a
n/a
n/a
n/a
n/a
n/a
Online non-clustered
index rebuild
0
0
0
0
0
0
0
0
0
Offline clustered
index create
0
1.02
1.02
0
1.02
1.02
0
1.01
1.01
Offline clustered
index rebuild
0
0
0
0
0
0
0
0
0
Offline non-clustered
index create/rebuild
0
1.07
1.07
0
1.05
1.05
0
1.02
1.02
* Above table illustrates the amount of TEMPDB space used with respect to the size of the index
Online clustered index creation can take up to 2.5x size of heap!
SEAS06 - VLDB Management & Operation
56
Index Management

Defrag only necessary


Online Index maintenance in VLDB



Fragmentation has little performance impact on index-seek
Offline operation still much faster than online
On average, Insert/Delete operation slower during online index –
up to 30%
Non-clustered index with included columns to replace
covering index
Operation
Duration
SQL Server 2000 Offline index rebuild
3:34
SQL Server 2005 offline index rebuild
2:36
SQL Server 2005 online index rebuild
10:52
SQL Server 2005 online index rebuild with
normal workload
11:30
SEAS06 - VLDB Management & Operation
Notes
27% faster than SQL
Server 2000
57
Unused Index


Unused index cause unnecessary index
maintenance overhead during insert operation
Determine unused index
select object_name(i.object_id), i.name
from sys.indexes i, sys.objects o
Where objectproperty(o.object_id,'IsUserTable') =
1
And i.index_id NOT IN (select s.index_id
from sys.dm_db_index_usage_stats s
where s.object_id=i.object_id and
i.index_id=s.index_id and
database_id = <dbid> )
and o.type = 'U'
and o.object_id = i.object_id
order by object_name(i.object_id) asc
SEAS06 - VLDB Management & Operation
58
Memory Planning

Avoid paging


Use “Lock Pages in Memory” option
Proper sizing
SQL Server Max Memory = Total system memory
– OS/Application
– <Max worker thread> x <Thread stack size>
Thread size: x86 – 512K; X64 – 2MB; IA64 – 4MB

Monitor Buffer/Plan cache



System with poor plan reuse may starve buffer
Use dbcc freeproccache
Use dbcc freesystemcache
SEAS06 - VLDB Management & Operation
59
CPU Planning

Degree of Parallelism




Affinity Mask




OLTP environment favors low MAXDOP
OLAP/DW environment favors higher MAXDOP
Dynamically changeable setting MAXDOP=0 for
regular business hour, and change to x during index
maintenance
“affinity mask” controls CPU 0-31
“affinity64 mask” controls CPU 32-64
When set, SQL Server 2005 scheduler affinitize to
CPU
NUMA & application partitioning


Associating multiple TCP/IP ports to a single node
Associating a single TCP/IP port to multiple nodes
SEAS06 - VLDB Management & Operation
60
Agenda




The challenge of Very Large Databases
Take the VLDB challenge and break into smaller
manageable designs
SQL Server 2005 VLDB technologies
 Table partitioning
 Online operations
 Improved performance
 Improved concurrency
 Piecemeal management
Capacity Planning
 Storage






Disks
Tempdb
Transaction Logs
Memory
CPU
Summary
SEAS06 - VLDB Management & Operation
61
Summary


VLDB is just many smaller manageable
problems to solve if you partition the big
into smaller components
SQL Server 2005 has many new features
to help VLDB

Partitioning, online utilities, DMV(s), high
performance, etc.
SEAS06 - VLDB Management & Operation
62
© 2006 Microsoft Corporation. All rights reserved.
This presentation is for informational purposes only. Microsoft makes no warranties, express or implied, in this summary.
SEAS06 - VLDB Management & Operation
63