Running SQL Server on Windows Azure Virtual Machine

Download Report

Transcript Running SQL Server on Windows Azure Virtual Machine

Windows Azure
Conference 2014
Running SQL Server on
Windows Azure Virtual Machine
Performance Considerations
@vinodk_sql
Windows Azure Conference 2014
Session Objectives And Takeaways
Session Objective(s):
Learn the performance characteristics of running SQL Server on Windows Azure VM (IaaS)
Understand the performance best practices for running SQL Server on Windows Azure VM
Key Takeaway 1
What impacts performance of SQL Server running on Windows Azure VM
Key Takeaway 2
How to monitor and troubleshoot performance of SQL Server running on Windows Azure VM
Windows Azure Conference 2014
Performance tuning
•
Analyzing performance impact
– Is it a problem with my SQL application?
or
– Is it a problem with how I configured the Azure IaaS platform for my usage
Key SQL Server Performance Considerations
Windows Azure IaaS Characteristics
Best Practices
Monitoring
Windows Azure Conference 2014
Key SQL Server Performance
Considerations
• KPIs
– Throughput
– Response time (aka latency)
Metric
OLTP
DW
Read/Write mix
•
•
•
Large number of small
transactions
Transactions are mostly similar in
pattern
Significant amount of reads
•
•
Between 8 and 64K
Mostly random
•
•
IO size and pattern
•
•
•
# users
Windows Azure Conference 2014
high
low
Transaction Log
Scan intensive, large
portions of data at a
time
Bulk loading
•
•
Mostly writes
Requires low
latency
1 64KB read per 8
512KB reads
Mostly sequential
MB/s a critical metric
•
Highly sequential
n/a
Windows Azure VM Characteristics
Architecture
Latency
Variability
Configuration options
Windows Azure Conference 2014
VM Configuration Options
Windows Azure Conference 2014
Virtual Machine
size
Network
bandwidth
Disk types and
configurations
Disk cache
settings
Windows Azure VM Size
Virtual
Machine Size
CPU Cores
Memory
OS Disk
Space
Allocated
Bandwidth
(Mbps)
Temporary
Disk Space
Maximum #
of data disks
(1 TB each)
Maximum
IOPS (500
maximum
per disk)
ExtraSmall
Shared
768 MB
127 GB
20 GB
5
1
1x500
Small
1
1.75 GB
127 GB
70 GB
100
2
2x500
Medium
2
3.5 GB
127 GB
135 GB
200
4
4x500
Large
4
7 GB
127 GB
285 GB
400
8
8x500
ExtraLarge
8
14 GB
127 GB
605 GB
800
16
16x500
A6
4
28 GB
127 GB
285 GB
1,000
8
8x500
A7
8
56 GB
127 GB
605 GB
2,000
16
16x500
http://msdn.microsoft.com/en-us/library/windowsazure/dn197896.aspx
Windows Azure Conference 2014
VM Disk Types & Configurations
•
•
•
OS disk (persistent)
– 127GB disk optimized for OS access patterns (e.g. boot up)
Data disk (persistent)
– A VHD you can attach to a VM to store app data
– Up to 1TB in size
– Up to 16 disks for XL VMs
Temporary local disk (non-persistent)
– Used for transient/temporary data storage & OS page files
– Hosted in attached disks on physical host
– Cleaned up in the event of a VM failure or recycling
– Physical disks shared across other VMs on same physical machine
– Not recommended for user or system database files
Windows Azure Conference 2014
How does VM Disk Caching work?
•
•
Caches VM data
inside physical host
machine
Can reduce disk I/O
latency by reducing #
transactions against
Windows Azure
Storage
• 2-tier cache
• Recently accessed data stored in host RAM cache - space shared by all VMs on
machine
• Less recently accessed data stored on local hard disks of physical machine
• Reserved cache space for VM “OS Disk” and “Data Disks” based on the VM size
Windows Azure Conference 2014
What are Disk Caching Best Practices
for SQL Server?
OS Disk
•
“Read Write” (default) reduces read latency for IO intensive
workloads with smaller DBs (<=10GB)
– Working set can fit in disk cache or memory, reducing
blob storage IO
Data disks
•
Cache setting depends on the IO pattern and workload
intensity
•
Use default of “None” (disable) for higher rate of random
IOs (e.g. OLTP) & higher throughput
– Bypasses physical host local disks, maximizing IO rate
•
Consider enabling read cache for latency sensitive read
heavy workloads
Windows Azure Conference 2014
Best Practices
Network
Latency
Windows Azure Conference 2014
Disk
configuration
TempDB
placement
IO
optimization
VM network performance
•
•
•
Network latency can be higher compared to on-prem
– Virtualization, security, load balancing, proximity
– Reducing network round trips can have more impact compared to on-prem
Consolidate “chatty” application layers on the same machine
Host VMs in the same Virtual Network / Cloud Service
– Allows communication via internal IP addresses (DIPs)
– VMs involved in the same function should be in the same Cloud Service
(example: VMs participating in an AG, or VMs in a given app tier).
• Load balance multiple VMs in the same cloud service via public virtual IP
addresses (for example: read workload with AlwaysOn AG)
– Use Windows Azure Virtual Network for VMs in different cloud services
Windows Azure Conference 2014
Windows Azure Conference 2014
Single Data Disk Configuration
•
•
•
Recommended for <1TB storage
Acceptable performance
Minimal complexity, simpler recovery
Sample SQL IO Measurement tests for single disk:
Random I/O (8KB
Pages)
Sequential I/O (64KB
Extents)
Sequential I/O (256KB Blocks)
Reads
Writes
Reads
Writes
Reads
Writes
IOPS
500
500
500
300
300
300
Bandwidth
4 MB/s 4 MB/s
30 MB/s
20 MB/s
70 MB/s
70 MB/s
Windows Azure Conference 2014
Multiple Disk Configuration
•
•
Recommended for
– >1TB DB files, OR
– for higher IOPS / bandwidth requirements
Two Configuration Choices
– Option 1: Use Filegroups with multiple database files, and place each
database file on a separate data disk
• This showed the best performance in our testing
• Does not help scaling transaction log throughput and bandwidth
– Option 2: Create OS volume on top of multiple data disks (e.g. OS striped
volume or Windows Server 2012 storage space)
• Storage spaces recommended over OS striped volumes
Windows Azure Conference 2014
Aggregated Measurement Samples
4 disks
Random I/O (8KB Pages)
Sequential I/O (64KB Extents)
Sequential I/O (256KB Blocks)
Reads
Writes
Reads
Writes
Reads
Writes
IOPS
2000
2000
2000
1300
700
1100
Bandwidth
20 MB/s
20 MB/s
120 MB/s
80 MB/s
170 MB/s
270 MB/s
8 disks
Random I/O (8KB Pages)
Sequential I/O (64KB Extents)
Sequential I/O (256KB Blocks)
Reads
Writes
Reads
Writes
Reads
Writes
IOPS
4000
4000
2500
2600
700
2200
Bandwidth
30 MB/s
30 MB/s
150 MB/s
160 MB/s
170 MB/s
550 MB/s
16 disks
Random I/O (8KB Pages)
Sequential I/O (64KB Extents)
Sequential I/O (256KB Blocks)
Reads
Writes
Reads
Writes
Reads
Writes
IOPS
8000
8000
2500
5000
700
2400
Bandwidth
60 MB/s
60 MB/s
150 MB/s
300 MB/s
170 MB/s
600 MB/s
Windows Azure Conference 2014
Disk Configuration: Scaling Reads and
Writes
•
•
•
•
Spreading the I/O workload across a number of data disks benefits smaller random
operations (more common in OLTP scenarios) where IOPS and bandwidth scale in a nearly
linear fashion.
As the I/O block size increases, for read operations, IOPS or bandwidth doesn’t linearly scale
by adding more data disks. IOPS and bandwidth taper off after 4 disks.
• If your workload is read intensive with analytical queries, adding more disks will
not necessarily help.
For write intensive workload, adding more data disks can increase performance in a nearly
linear fashion. This means that you can benefit from placing each transaction log for
multiple databases on a separate data disk.
For large sequential I/O block sizes, writes generally scale better than reads.
Windows Azure Conference 2014
Should TEMPDB go on D: drive?
•
•
•
•
•
•
Short answer: No
Long answer: No
Why?
Predictable performance:
OS or data disk can provide same or better performance but D: drive can be more variable,
being a physical disk sharing IO with other VMs on the host. Size and performance also
depends on VM size
Configuration overhead:
SQL Server has to recreate TEMPDB in D: if VM goes down – SQL Server service account
requires Admin privileges. If stored in a separate folder this needs to be created at startup.
TEMPDB can be critical to application performance
– Follow tempDB IO best practices
Windows Azure Conference 2014
Windows Azure Conference 2014
Reduce IO with Instant File Initialization
•
•
•
Not default in Azure VM images
Reduces IO for
– Creating a DB
– Restoring a DB
– Adding files to a DB
– Extending file size
– Autogrow, etc.
Add SQL service
account to Perform Volume
Maintenance Tasks security
policy
Restart SQL Server
Note: Doesn’t help Transaction Log
Windows Azure Conference 2014
Impact of Instant File Initialization
60
50
Time (minutes)
•
•
40
30
20
10
0
Create 100 GB database
Restore 100 GB database
Without Instant File Initialization
With Instant File Initialization
Instant File Initialization – How to?
Windows Azure Conference 2014
Instant File Initialization Recommendations
•
•
•
•
•
Pre-size all database files appropriately
– Autogrow is insurance
– Not a license for drunk driving
Configure Instant File Initialization before creating database
Configure Instant File Initialization before extending files. Example: TEMPDB
Configure Instant File Initialization before restoring database on a new server
Don’t forget to restart SQL Server after configuring Instant File Initialization
Windows Azure Conference 2014
What else to consider for data disk
performance?
•
•
•
Disk warm-up
NTFS Allocation Unit Size ?
Single vs. multiple storage accounts with a single VM
– DO NOT SPREAD DATA FILES OF A SINGLE DATABASE INTO MULTIPLE STORAGE
ACCOUNTS !!!
• Data in different blobs not written at the same time
• BLOBs that make up the stripe set could be out of sync
– Instead:
• Spread the data files across multiple disks to achieve higher IOPS / bandwidth
– Note: a storage account has a limit of 20K tps
Windows Azure Conference 2014
Performance Monitoring &
Troubleshooting
Key tools
Windows Azure Conference 2014
Storage
Analytics
VM
Dashboard
SQL Perf
counters
Performing IO Benchmarking tests
•
Goal – determine IO capacity of VM configurations
– SQLIO
–
–
–
Disk Subsystem Benchmark Tool
Performance metrics
DMVs
Windows Azure Conference 2014
Windows Azure Storage Analytics
Metrics
•
•
Tracks aggregated storage usage for Blobs, Tables and Queues
– Capacity – e.g. #containers, total #blobs
– Requests - #requests, total ingress/egress, average E2E latency and server latency, total # failures
by category, etc.
– Access via storage account namespace
https://<accountname>.table.core.windows.net/Tables("$MetricsTransactionsBlob")
Enable in portal or using Set Blob Service Properties (REST API)
– Set retention policy
See Windows Azure Storage Metrics: Using Metrics to Track Storage Usage
Windows Azure Conference 2014
Performance Charts on the WA
Portal
•
•
VM Dashboard
Monitor tab for storage account
– Enabled under the “configure”
tab
– VM read and write to their VHDs
using GetBlob and PutPage
commands respectively
Windows Azure Conference 2014
Use Perf counters
Typical SQL KPIs
Typical Web App KPIs
•
Max val for \Process(SQLServ)\% Processor Time
•
Max val for \ASP.NET Applications (_Total_)\Reqests/sec
•
Avg val for \Process(SQLServ)\% Processor Time
•
Avg val for \ASP.NET Applications (_Total_)\Reqests/sec
•
Max val for \Processor(_Total)\% Processor Time
•
Avg val for \Memory\Available Mbytes
•
Avg val for \Processor(_Total)\% Processor Time
•
Max val for \Processor(_Total)\% Processor Time
•
Max val for \SQLServer:SQL Statistics\Batch Requests/sec
•
Avg val for \Processor(_Total)\% Processor Time
•
Avg val for \SQLServer:SQL Statistics\Batch Requests/sec
•
Avg val for \ASP.NET\Request Wait Time
•
Avg val for \ASP.NET\Request Execution Time
•
Avg val for \ASP.NET\Requests Queued
•
Avg val for \ASP.NET\Requests Rejected
•
Avg val for \ASP.NET\Requests Current
Typical User/test characteristics
• Number of concurrent users
• Average/Max request execution time
• Number of web servers
• Ramp up period, test method
• Start and end time of test
Windows Azure Conference 2014
Troubleshooting Common VM Issues
Issue
KPIs To Monitor
Actions to Consider
CPU at or near 80%
% Processor Time (_Total)
SOS_SCHEDULER_YIELD waits
Increase instance size
Identify top consuming queries and tune
Load balance (e.g. move DB to another instance)
Near I/O capacity limits or
IO Latency Increases
Memory resource pressure
Windows Azure Conference 2014
Average disk reads per second
Average disk writes per second
Disk reads per second
Disk writes per second
io_virtual_file_stats
PAGEIOLATCH waits
SQL Server: Buffer Manager\Page Life Expectancy
Memory: Available Bytes
Memory: Pages per second
SQL Server: Buffer Manager\Page Life Expectancy
Process: Working Set (for SQL Server)
RESOURCE_SEMAPHORE waits
Check Page Life Expectancy counter, for mem pressure.
Increase instance size
Identify which DB and log files have I/O bottleneck
Add more data disks and separate data files if near IOPS
limits per disk
Tune queries to reduce reads and writes
Consider enabling row or page compression
Check max server memory setting for SQL Server.
Use high memory instance
Identify SQL component (such as, CLR, high memory
grants for app queries, et.), tune appropriately.
Whitepaper
Performance Guidance for SQL Server in
Windows Azure Virtual Machines
Published: June 2013
Download it here:
http://go.microsoft.com/fwlink/?Link
Id=306266
Windows Azure Conference 2014
Conclusions
• Read the Performance Guidance for SQL Server in Windows
Azure Virtual Machines white paper
• Plan and test for IO perf variability
• Identify optimal VM size
• Optimize for reduced IO and network round trips
• Use Filegroups and multiple data disks for large DBs
• Identify your KPIs to monitor
• Revisit optimization decisions as workload changes
Windows Azure Conference 2014
Windows Azure Conference 2014
Windows Azure Conference 2014