Transcript DBA421
DBA421
Building Very Large Databases
with SQL Server 2000 (32/64bit)
Gert E.R. Drapers
Software Architect
Customer Advisory Team
SQL Server Development
SQL Server Development
Customer Advisory Team
We invest in Large Scale SQL Server projects
across the world
Technical Assistance with design, performance
tuning, etc.
Product group programs
We use our customer experiences to make a
better product
We share the knowledge learned from these
challenging implementations so that more can
be built
Session Objective
Show that large scale projects on the
Microsoft database platform are not unique
Give you some thoughts on clever design
and implementation tips for dealing with large
scale SQL Server applications
Make you feel comfortable that it
absolutely can be done
Scaling up / out with SQL Server - It can be done!
Agenda
General comments on building large scale
applications
Scaling up Large Scale OLTP
Best Practices
Case Study Example
Scaling up Large Scale Data Warehouse
Best Practices
Case Study Example
Scale out - Clarification
64-bit – Awesome, but understand the value
Summary / Q & A
What is large scale?
Any database application that requires special
configuration work outside the defaults
Numbers alone do not mean much, but I always get asked,
so here are some guidelines:
TB+ size databases
1,000s of database transactions / sec
1,000s of connected users
Throughput of the IO subsystem exceeding 80 MB/sec
More than an 8-way processor
More than 8 GB of RAM
Large scale is when you think it is challenging and your
vendor says it is a piece of cake. That is when you start
worrying
Building Large Scale
SQL Server applications
The General Implementation Rules Change slightly
(true for all RDBMS)
It’s not quite auto-tuning
You need to understand your entire platform: Hardware, OS
and Database Server configuration
You may have to change defaults that work great for 99% of
other applications, for example (see the sketch below):
DOP (max degree of parallelism)
Recovery Interval
Query / Index Hints
etc.
Experience and know-how are what make large scale
applications work
It’s not a piece of cake, but it ABSOLUTELY CAN BE
DONE!
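As a minimal sketch of what changing those defaults can look like (the values here are hypothetical and workload-dependent):

    EXEC sp_configure 'show advanced options', 1
    RECONFIGURE
    EXEC sp_configure 'max degree of parallelism', 4   -- cap DOP instead of the default 0 (use all CPUs)
    EXEC sp_configure 'recovery interval', 1           -- minutes between automatic checkpoints
    RECONFIGURE

Query and index hints, by contrast, are applied per statement (for example OPTION (MAXDOP 1) on a single query) rather than server-wide.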
Some interesting SQL Server
implementations in Production
25+ applications of a terabyte or greater in production
We are currently active on at least as many as we speak
7+TB Telco Billing reporting system growing to 15TB
3TB GIS mapping application growing to 10-20TB
13TB Debt processing financial application
Currently working on system that will grow to 50 TBs
Many high-volume, mission-critical applications in
production on SQL Server
Trading application exceeding 60,000 database transactions
/ sec
Credit card processing system processing 3,000
authentications/sec
Banking application processing money transfers of greater
than 40 million transactions/day while reporting is active
Agenda
General comments on building large scale
applications
Scaling up Large Scale OLTP
Best Practices
Case Study Example
Scaling up Large Scale Data Warehouse
Best Practices
Case Study Example
Scale out - Clarification
64-bit – Awesome, but understand the value
Summary / Q & A
OLTP Large Scale Tips and
Best Practices
Eliminate Log Bottleneck
Eliminate ineffective Stored Procedure
Recompiles
Eliminate Lock escalation that causes
performance issues or deadlocks
Understand your scheduler and what
hardware resources are needed
Eliminate Log Bottleneck
Set log on its own drive or RAID 10 stripe set
Assign its own Controller or own LUN(s) if on
SAN
Set controller or IO Subsystem cache for log as
100% write through (unless reading log with
replication or a lot of trigger activity)
Monitor waitstats to ensure the log is not a wait point:
DBCC SQLPERF (waitstats, clear)
GO
DBCC SQLPERF (waitstats)
Create Waitstats Script
dbcc sqlperf(waitstats, clear)   -- clear out wait statistics
go
if object_id('waitstats') is not null
    drop table waitstats
go
create table waitstats (
    Wait_Type varchar(80),
    Requests numeric(18,1),
    Wait_Time numeric(18,1),
    Signal_Wait_Time numeric(18,1),
    timenow datetime default getdate())
go
declare @start int, @finish int
select @start = 1, @finish = 10
while (@start < @finish)
begin
    begin transaction
    insert into waitstats (Wait_Type, Requests, Wait_Time, Signal_Wait_Time)
        exec ('dbcc sqlperf(waitstats)')
    commit
    select @start = @start + 1
    waitfor delay '00:00:10'   -- sample every 10 seconds
end
go
select * from waitstats order by Wait_Time desc
Eliminate ineffective Stored
Procedure Recompiles
In some cases, stored procedure recompilation behavior adversely
affects performance
Multiple recompilations
The entire stored procedure query plan is recompiled (especially costly
for very large stored procedures)
Stored procedure recompiles cause serialization; executions of the proc
wait serially during recompilation
Recompiles are triggered by
SET option changes
DML/DDL mixed
TempDB (temporary table) usage
Table cardinality changes
etc.
Customers often miss the recompilation aspect of performance
Possible solutions / suggestions
Whitepaper on MSDN and KB article Q243586
Use the KEEPFIXED PLAN hint at the statement level (sketched below)
Statement-level recompilation (Yukon)
Be on the lookout for recompilations using Perfmon
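A minimal sketch of the statement-level KEEPFIXED PLAN hint mentioned above; the procedure and table names are made up for illustration:

    -- Hypothetical procedure: OPTION (KEEPFIXED PLAN) keeps the plan for this
    -- statement from being recompiled when the row count in #work changes.
    CREATE PROCEDURE dbo.usp_ProcessBatch
    AS
    BEGIN
        CREATE TABLE #work (id int, amount money)
        INSERT INTO #work (id, amount)
            SELECT id, amount FROM dbo.SourceTable   -- hypothetical source table
        SELECT id, SUM(amount) AS total
        FROM #work
        GROUP BY id
        OPTION (KEEPFIXED PLAN)
    END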
Locking / Lock Escalation
Lock Escalation
SQL Server will escalate to a table lock when a specific
percentage of memory is used up for locking or the system
allocates 2500 locks (5000 in SQL Server 2000 SP2)
Table lock escalation saves on memory allocation but can be a
cause of deadlocks and other locking problems
Resolution:
Force page locks, which allocate fewer locks (see the sketch below)
Or get clever like one customer did
Insert dummy record 9999 99 99999 999 999
Open a thread and grab an update lock on the dummy record
The next thread cannot escalate, because you cannot escalate to
a table lock while another lock is held on the same table
Worst case – there is a trace flag to shut escalation off; call
Microsoft support prior to using this option
Monitor through the normal locking tools: Profiler, sp_who,
sp_who2, sp_lock and sp_blockinfo
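Two hedged sketches of the workarounds above, using a hypothetical dbo.Orders table:

    -- 1) Force page locks on a statement so far fewer locks are allocated
    --    and escalation is less likely.
    UPDATE dbo.Orders WITH (PAGLOCK)
    SET status = 'P'
    WHERE batch_id = 42

    -- 2) The dummy-record trick: from a helper connection, hold an update lock
    --    on the dummy row for the life of the batch; other sessions then cannot
    --    escalate to a table lock on dbo.Orders.
    BEGIN TRAN
    SELECT id FROM dbo.Orders WITH (UPDLOCK, ROWLOCK) WHERE id = 999999999
    -- ...leave the transaction open while the batch runs, then COMMIT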
Locking script
Send email to [email protected] for:
sp_blockinfo script
waitstats script
Other documentation
SQL Server Scheduler – understand it!
[Diagram: five BCP.exe jobs assigned across four UMS schedulers on a 4-way CPU]
1) Connections get assigned to a UMS scheduler
2) UMS schedulers are spread across the processors
3) A connection stays on its UMS scheduler for the life of the thread
4) Two heavy threads on the same UMS scheduler will fight each other
SQL Server Scheduler – continued
[Diagram: the five BCP.exe load tasks mapped onto the four UMS schedulers on the 4-way CPU; tasks 1 and 5 land on the same scheduler]
1) Each load job takes 2 minutes to run
2) Tasks 2, 3 and 4 each finished in 2 minutes
3) Tasks 1 and 5 took closer to 4 minutes
4) Using 4 threads instead of 5 would be a better solution for this example
5) Monitor using DBCC SQLPERF (UMSSTATS)
UMSSTATS example
DBCC SQLPERF (UMSSTATS,clear)
go
waitfor delay '00:00:15'
go
DBCC SQLPERF (UMSSTATS)
OLTP Case Study –
60000+ database trx/sec
Financial trading messages: Bids / Asks
coming off Tandem to message queue at
a rate of 12000-15000/sec
Each message goes to a SQL Server
stored procedure and makes
approximately 5 database transactions
Simultaneously, real-time queries are
reading from the active data
Topology
[Topology diagram: brokers and real-time queries on the client side; a Tandem Non-Stop SQL system feeds a message queue; the application tier reads from the queue and calls SQL Server stored procedures against Microsoft SQL Server on an 8-way Dell hosting 6 different databases]
OLTP Lessons learned
Challenges
Scheduling a mixed workload evenly across
schedulers
Database Log to handle 60000+ database
trx/sec
Real time reporting and loading data
Can be done with SQL Server 2000
Multiple database logs
Read-only queries
Priority scheduling of connections
Agenda
General comments on building large scale
applications
Scaling up Large Scale OLTP
Best Practices
Case Study Example
Scaling up Large Scale Data Warehouse
Best Practices
Case Study Example
Scale out - Clarification
64-bit – Awesome, but understand the value
Summary / Q & A
Data Warehouse
Lessons Learned
Loading large quantities of data quickly
Maintaining very large databases
Optimizing TempDB
Optimizing your IO subsystem
Data Loading Strategy For VLDB
Bulk Insert without indexes or with minimal indexes seems to
be fastest (hundreds of thousands of rows/sec) but is not
applicable to larger systems with incremental data loads
Bulk Insert with a single nonclustered index scales fairly linearly
Bulk Insert with a clustered index is less linear due to
fragmentation
Parallel Bulk Insert with sorted data and partitioned load files
can be the fastest, though it is not linear
No contention using multiple Bulk Inserts (one per table)
if the physical model works for data access
(e.g. horizontal partitioning)
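A hedged sketch of a minimally logged load; the file path, terminators and table name are placeholders:

    -- One of several parallel BULK INSERT streams, each loading its own
    -- partitioned file into an un-indexed (or minimally indexed) table.
    BULK INSERT dbo.CDR_20030715
    FROM 'D:\loads\cdr_20030715_part1.dat'
    WITH (
        TABLOCK,                      -- needed for minimally logged, parallel loads
        FIELDTERMINATOR = '|',
        ROWTERMINATOR = '\n',
        BATCHSIZE = 100000
    )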
Maintenance Tips:
Reorg & Create Index
Run DBCC SHOWCONTIG and look at Scan
Density (logical extent density) and Avg. Page
Density (average free bytes per page). If these fall below 30-50%, a
reorg is recommended
Run DBCC INDEXDEFRAG if your data is not interleaved (that is,
a single table/index in its own filegroup)
Run DBCC DBREINDEX if you want all clustered and nonclustered indexes rebuilt
Run CREATE INDEX with DROP_EXISTING if you only want the
clustered index rebuilt and INDEXDEFRAG is not helping
CREATE INDEX runs fastest with more memory
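The commands above, shown as a rough sketch against a hypothetical table and index:

    DBCC SHOWCONTIG ('CallDetail', 'IX_CallDetail_CallDate')      -- check Scan Density / Avg. Page Density
    DBCC INDEXDEFRAG (0, 'CallDetail', 'IX_CallDetail_CallDate')  -- online defrag, for non-interleaved data
    DBCC DBREINDEX ('CallDetail')                                 -- rebuild every index on the table
    CREATE CLUSTERED INDEX IX_CallDetail_CallDate
        ON CallDetail (call_date)
        WITH DROP_EXISTING                                        -- rebuild only the clustered index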
Index Defrag with Interleaved Data – Single File Group
[Diagram: extents 1-25 in a single filegroup, with Tab A and Tab B pages interleaved and out of logical order]
Run DBCC INDEXDEFRAG on Tab A
Index Defrag with Interleaved Data – Single File Group (continued)
[Diagram: the same filegroup partway through DBCC INDEXDEFRAG; Tab A extents are being swapped toward logical order while Tab B pages stay in place]
1. Start with the first extent and swap it with the next smallest extent
2. If pages are not at the fill factor, they will be filled up to the fill factor
3. If pages are greater than the fill factor, records will NOT be removed to
meet the fill factor
4. Other objects in the filegroup will not be reorganized, therefore data will not be
contiguous following Index Defrag
Index Defrag with Interleaved Data – Single File Group (continued)
[Diagram: the filegroup after DBCC INDEXDEFRAG completes on Tab A; its pages are in logical order but still interleaved with Tab B pages]
• When completed, the data is in correct logical order
• Data may or may not be contiguous
• Free space is used up but not given back
• DBCC DBREINDEX will put the data back in contiguous order, but it takes longer
to run and is more intrusive
TempDB slows down when busy!
Challenge
Performance degradation during peak load
Many concurrent users creating temporary tables
Performance will start to degrade
Sysprocesses shows locks on DBID #2 (TempDB)
Resolution
More files for TempDB file group
Place TempDB across more spindles
Could possibly use a trace flag to disable single-page
allocations, but please call Microsoft support before
turning this trace flag on
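A hedged sketch of the first resolution; file names, paths and sizes are hypothetical:

    -- Add data files to tempdb so allocation contention is spread across
    -- more files (ideally on separate spindles).
    ALTER DATABASE tempdb
    ADD FILE (NAME = tempdev2, FILENAME = 'E:\tempdb\tempdev2.ndf', SIZE = 2048MB)
    ALTER DATABASE tempdb
    ADD FILE (NAME = tempdev3, FILENAME = 'F:\tempdb\tempdev3.ndf', SIZE = 2048MB)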
Challenge:
New to SAN technology
Never having worked with SAN technology can be
challenging
Do the data layout rules change?
What does the cache really buy me?
How do I monitor?
How do I determine if the problem is SAN or database?
Some thoughts:
Test your SAN throughput out prior to database activity. Use
IOMETER or just copy large files
You should be getting approximately 120 MB/sec throughput per
channel
If it is a heavily updated system, then weight the cache toward
writes vs. reads. This will significantly help log writes (you should
get about 1 ms per write – Avg. Disk sec/Write)
Still make sure data is spread across as many drives as
possible. Individual disks are still the slowest component for
database activity.
Case Study Data Warehouse
Large Telco processing call detail records (CDRs)
Loads into SQL Server 350 million call detail
records / day. Bulk loaded
Must keep a history
Reporting done all day long based on daily,
weekly, monthly, quarterly and yearly intervals
Very partitionable by period of time
5-7 TB database
This was based on a prototype; the project is
now being developed
Case Study –
Data Warehouse design
3 types of databases
Type 1 contains 24 one-hour tables
Type 2 contains 7 one-day tables
Type 3 contains 7 one-week tables
Online consolidation was much more difficult, so all 3 types
are built in parallel. This is a good design if the hardware has
headroom
Data is loaded without indexes and then indexed
Data is made available to users once indexed
Yesterday, Last Week and the Last 7 Weeks are available
while Today and This Week are being loaded
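As a rough sketch (table and column names are invented for illustration), exposing two of the one-day tables through a local partitioned view, along the lines of the partitioned-view option mentioned in the case-study summary, might look like this:

    -- Each one-day table carries a CHECK constraint on the partitioning column,
    -- so the optimizer only touches the relevant day(s).
    CREATE TABLE dbo.CDR_Day1 (
        call_date datetime NOT NULL
            CHECK (call_date >= '20030707' AND call_date < '20030708'),
        caller_id varchar(20) NOT NULL,
        duration_sec int NOT NULL
    )
    CREATE TABLE dbo.CDR_Day2 (
        call_date datetime NOT NULL
            CHECK (call_date >= '20030708' AND call_date < '20030709'),
        caller_id varchar(20) NOT NULL,
        duration_sec int NOT NULL
    )
    GO
    CREATE VIEW dbo.CDR_ThisWeek
    AS
    SELECT call_date, caller_id, duration_sec FROM dbo.CDR_Day1
    UNION ALL
    SELECT call_date, caller_id, duration_sec FROM dbo.CDR_Day2
    GO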
Challenges for data warehouse
case study
Real time reporting
Partition management – History
Maintenance of very large databases
Auto-updating statistics on new high-volume
data coming in (see the sketch below)
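One hedged way to handle the statistics challenge (the table name is hypothetical): turn automatic statistics updates off for the hot table and refresh the statistics explicitly right after each load.

    -- Stop auto-update of statistics on the high-volume table...
    EXEC sp_autostats 'dbo.CDR_Today', 'OFF'
    -- ...then refresh statistics once the daily load has finished.
    UPDATE STATISTICS dbo.CDR_Today WITH FULLSCAN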
Design using 1 server
[Diagram: a single SQL Server instance; a view / partitioned view (PView) spans the Today, Yesterday, This Week, Last Week, through Week n-7 tables, and user queries go through the view]
Design using 4 servers
[Diagram: the same Today, Yesterday, This Week, Last Week, through Week n-7 tables spread across four servers]
Design using 7 servers
[Diagram: the Today, Yesterday, This Week and Last Week tables spread across seven servers]
Case Study Data Warehouse –
Summary
Large database with a very large requirement for the
number of rows loaded per day
Loaded in parallel to different tables, with hardware
being the determining bottleneck
Partitioning using a data value of time
Could be partitioned using Partitioned Views or
partitioned within the application
Data partitioning allowed for easy scalability to multiple
databases, instances or servers with minimal changes to
the code
Maintenance decisions weren’t made because
of the prototype nature of the project;
they are being made now
Agenda
General comments on building large scale
applications
Scaling up Large Scale OLTP
Best Practices
Case Study Example
Scaling up Large Scale Data Warehouse
Best Practices
Case Study Example
Scale out - Clarification
64-bit – Awesome, but understand the value
Summary / Q & A
What is Scale Out?
Multiple Nodes
Evenly distributed data as in data
partitioning
Data Warehouse Telco example is data
partitioning
Distributed Partitioned Views are data
partitioning
Evenly distributed workload/data as in
function partitioning
Online shopping is function partitioning
Break up unrelated functions across resources
vs breaking up related data
Scale Out Case Study:
13TB Data Analysis System
Provides bad loan/credit collection service and reporting
(Online and batch).
Designed to create an environment where credit
grantors, debt buyers, and collection agencies can meet
their business objectives
Large volume of data inserted every night from several
sources
Heavy analytical reporting/data mining performed all
day long to predict and quantify the likelihood of
collecting each delinquent account in a client’s portfolio.
<24-hour turnaround on customer requests
Scale Out Case Study:
Challenges
Manage 13TBs of data
Manage large files from several external
customers / day
Load millions of rows from multiple files
and run batch reporting at the same time
How do I manage different SLAs from
different customers with the same system?
Scale Out Case Study:
Function Partitioning
[Diagram: daily customer source files, an account function tracking DB, the individual account databases, the Function X / Y / Z tables, and a common database holding common function procedures and data]
Function Partitioning
Partitioning is a VERY good idea
Partitioning by business function is very logical
The application must be developed against the partitioned data
rather than relying on database technology to access the
partitioned data. NOT DIFFICULT
Putting all the data in one single database
doesn’t make you a hero.
No matter how you look at it, 13TB of data is
13TB of data used to solve a business problem.
Agenda
General comments on building large scale
applications
Scaling up Large Scale OLTP
Best Practices
Case Study Example
Scaling up Large Scale Data Warehouse
Best Practices
Case Study Example
Scale out - Clarification
64-bit – Awesome, but understand the value
Summary / Q&A
64-bit – Is it an upgrade lock?
What is the hype of 64-bit?
Produced world record benchmarks
New advancements in CPU – Itanium
It raises the questions:
Should you automatically upgrade?
Is it a performance guarantee?
Is it an easy upgrade?
64-Bit Clear Winners
OLAP Applications with large dimensions
Breaks through the 3GB limit
Essential for dimensions with millions of members
Also can leverage large file system cache
Relational applications requiring extreme
memory (> 32GB)
VLDBs with random access patterns that can be
‘cached’ to DB buffers for performance
Any DB requiring > 64GB buffers
Alternative to 32-bit apps using AWE
No pressure on ‘real’ memory
No overhead for mapping AWE pages
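For contrast, the 32-bit alternative the last bullets refer to is the AWE configuration, roughly as follows (memory values are hypothetical):

    -- 32-bit SQL Server 2000 using AWE memory above 4GB (requires /PAE in boot.ini
    -- and the Lock Pages in Memory right for the service account).
    EXEC sp_configure 'show advanced options', 1
    RECONFIGURE
    EXEC sp_configure 'awe enabled', 1
    EXEC sp_configure 'max server memory', 28672   -- MB; AWE memory is not dynamically managed
    RECONFIGURE
    -- Restart the SQL Server service for 'awe enabled' to take effect.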
64-bit winners…
If you need to keep the procedure/statement cache in the virtual
address space
If you need to keep the page headers in the virtual address space.
The more memory we use via AWE, the more page headers we
have to keep and the less space for real data in the virtual address
space
If you need to keep memory associated with a server-side cursor in
the virtual address space
Memory for hash joins and sort operations can only be calculated
against the virtual address space; AWE memory cannot be used for
such operations. Example: spills out to TempDB
Going beyond 16GB, we have to reduce the virtual address space
to 2GB. This is counterproductive, since we get more data to hold
in the virtual address space (page headers)
Concurrency issues with AWE under Windows 2000 were improved in
Windows Server 2003; the problem showed up on 32-way hardware
Some realities!
Documentation states that Itanium has twice the CPU
power/speed of Xeon despite a slower clock speed
You pay a premium for an Itanium with a slower clock compared to
Xeon
CPU-bound SQL relational workloads will see a
performance improvement on the same number of CPUs
BUT, this does NOT mean a 4-way Itanium is equal to an
8-way Xeon
Very large memory systems can take a long time to shut down,
because the shutdown checkpoint has a lot of data to flush
Recommendation – keep the checkpoint (recovery) interval small
(example: 1 minute)
Takeaways
For relational DBMS workloads, 64-bit is
not a slam-dunk
Often better to have more CPUs
Often better to have faster CPUs
Qualify before recommending. Look for
Memory-intensive workloads
Beyond what can be gracefully handled with
AWE today, e.g. >> 32 GB
Large-Scale OLAP workloads
Agenda
General comments on building large scale
applications
Scaling up Large Scale OLTP
Best Practices
Case Study Example
Scaling up Large Scale Data Warehouse
Best Practices
Case Study Example
Scale out - Clarification
64-bit – Awesome, but understand the value
Summary / Q & A
Summary
We can build VLDB systems with SQL Server
today and we are doing it.
Design Wins Program may be able to assist.
Send email to your local Microsoft
representative or [email protected]
It is not auto-tuning or auto-configuration; it will
take effort from you. Your Microsoft MCS contact can
help
Having this knowledge will help you be more
successful while building large scalable
systems
Questions…
[email protected]
Community Resources
http://www.microsoft.com/communities/default.mspx
Most Valuable Professional (MVP)
http://www.mvp.support.microsoft.com/
Newsgroups
Converse online with Microsoft Newsgroups, including Worldwide
http://www.microsoft.com/communities/newsgroups/default.mspx
User Groups
Meet and learn with your peers
http://www.microsoft.com/communities/usergroups/default.mspx
Suggested Reading And Resources
The tools you need to put technology to work!
Microsoft® SQL Server™ 2000 Administrator's Companion (ISBN 0-7356-1051-7) – available today
Microsoft® SQL Server™ 2000 High Availability (ISBN 0-7356-1920-4) – available 7/9/03
Microsoft Press books are 20% off at the TechEd Bookstore
Also buy any TWO Microsoft Press books and get a FREE T-Shirt
evaluations
© 2003 Microsoft Corporation. All rights reserved.
This presentation is for informational purposes only. MICROSOFT MAKES NO WARRANTIES, EXPRESS OR IMPLIED, IN THIS SUMMARY.