Mission Critical SQL Server

Download Report

Transcript Mission Critical SQL Server

http://blogs.msdn.com/b/sqlcat/
• SQL Server 2016, Azure SQL DB v12, SQL DW and SQL Server on Azure VM
Objectives
Provide insight into how SQL Server is used in Mission Critical Environments
Provide real-world customer engagements and design to meet business critical needs with SQL
Server
Takeaways
 SQL Server is used in Mission Critical Business Environments
 SQL Server is Enterprise ready
“[We] operate assembly factories 24
hours a day, seven days a week”
“If this system is down our
business suffers”
“We require 100% uptime for
32400 seconds each business
day”
“Planned or unplanned system
downtime is not acceptable to our
business users or our customers.”
“Failure is not an option”
“Each minute of downtime
costs us money and
customers”
 BWin.party – Online Betting and Gaming

 Nasdaq - Financial

 Quanta Computer - Technology

 SAT – Government

 Temenos – Financial ISV
 Develop core banking software with high throughput on SQL Server
Company Profile
•
•



Business Requirements
•
•
•
•
•
•
•
SQL Server Environment
• 100+ SQL Server Instances
• 120+ TB of data
• 1,400+ Databases
• 1,600+ TB storage
• 450,000+ SQL Statements per second on a single server
• 500+ Billion database transactions per day
Core component in solutions designated for:
• Financial transactions
• Gaming environments
• Tracking user state throughout the system
Solutions primarily scale-up using commodity hardware
TSI-LB2 Farm: BookieTool
BCM Server
(WebServices)
awp2w3ii6081
awp2w3ii6082
WCF Soap
Configure Event Data
Result Mapper
(Windows Service
fr1p1wwsbk005
fr1p1wwsbk006
fr1p1wwsbk007
fr1p1wwsbk008
fr1p1wwsbk101
fr1p1wwsbk102
fr1p1wwsbk103
fr1p1wwsbk104
fr1p1wwsbk105
fr1p1wwsbk106
- Odds
- Betting content
- Results
TaskServer
Posting msgs for automated resulting
IXI
BWinBookmaking
TSI
At1p1wsmsc011
(inactiv)
At2p1wsmsc011
(activ)
At1p1wsmsc012
(activ)
At2p1wsmsc012
(inactiv)
at1p1wwsbk027
at1p1wwsbk028
At1p1wwsbk029
at1p1wwsbk030
at1p1wwsbk031
at1p1wwsbk032
at1p1wwsbk026
Bookmaking DB
Loading Events and Markets
and send Results)
WCF (Soap)
fr1p1wwsbk001
fr1p1wwsbk002
fr1p1wwsbk003
fr1p1wwsbk004
at2p1wwsbk021
at2p1wwsbk022
at2p1wwsbk023
at2p1wwsbk024
at2p1wwsbk025
at2p1wwsbk026
at2p1wwsbk027
at2p1wwsbk028
at2p1wwsbk029
at2p1wwsbk030
at2p1wwsbk031
at2p1wwsbk032
at2p1wwsbk041 at2p1wwsbk043
at2p1wwsbk042 at2p1wwsbk044
at1p1wwsbk041
at1p1wwsbk042
at1p1wwsbk043
at1p1wwsbk037
at1p1wwsbk038
ComAr
at1p1wwsbk039
at1p1wwsbk040
ComMx
at1p1wwsbk045
at1p1wwsbk046
Betoto
at1p1wwsbk035
at1p1wwsbk036
www2
IT
at1p1wwsbk044
at2p1wwsbk037 at2p1wwsbk038
at2p1wwsbk039 at2p1wwsbk040
at2p1wwsbk045 at2p1wwsbk046
at2p1wwsbk035 at2p1wwsbk036
Data Input
Load every second
Std BCM
client/serverDirect WebService calls
- Creating markets
communication
- Searching
(WCF Soap)
Configure
Market TG to be resulted
Direct WebService calls
- Resulting
- Payouts
RNG
(Java
BCM Client
Scoreboard
Loading Bet content
Load every second
Logic resultings, how market
template groups should be
resu lted based on happenings )
Odds are stored, liabilities read
(high load)
Ixipts001150 ( inactiv)
Betcache 5
Replicator
at1p1wsmsc011
at1p1wsmsc012
at2p1wsmsc011
Every second
at1p1wsmsc012
ixipac177104
ixipac177105
LiveBets Deploy
Manager
IXI I NT.betcache
IP:10.130.64.20
AWP0DBSQL501
enriched with some
additional fields
At2p1wwbmk001 (active)
Betcache 3
Replicator
INTNET
ixipac177103
Awp2inapp005 (activ)
RNG database
AWP0DBSQL502
At2p1wwbmk002 (passive)
Scoreboard
XML (Scores)
Mainfeed XML
(Event Data,
Odd Changes,
Markets..)
BetCache 5 DB
AWP0DBSQL503
LEFTNAV (Bet
Content Live
Events)
AWP0DBSQL504
BetCache 3 DB DMZNET
BetCache 3 DB INTNET
Every second
Load during start up
BCM Client
Assemblies (BCM UI
controls used in LBAI)
LBAI Server
- .Net Remoting
Singelton
- Windows Service
.Net Remoting
LBAI Client
- WinForms
Trade
ixipac177102
Sending XML Scoreboard
TSI LB2 Farm: ht tp.lbai ( PriorityGroups)
Configure
ixipac177101
Betcache 3
Replicator
DMZNET
TaskServer
LBAI Client
Assemblies (LBAI UI
controls used in
BCM)
IXI D MZ.betcache
IP:10.130.176.200
Load every second
Translations
Load every 15/60 seco nds
XMLs Published
AT0P0WDSQL020 /awp0dbon001
BetCache DLL
Holding current
content
Only few SPs often used
BetAndWin DB
- Heartbeat server
- Odds
Soccer odds pulled
Tibco Nod es
Bookmaker
from queue and
updated in DB
asp1inmsc001
asp1inmsc002
- User data
- Slips and Transactions
BetandWin
XML Scoreboards
are persisted periodically
and lo aded if newer
State Server
Bet Placement Pre-Checks
Sportsbook Web
(Live/Mainbook WebService)
Live Bets Replicator
(4 instances)
Request/Reply Odds Calculation
- For All Donau supported sports
- For Mainbook and Livebook events
TSI
IXI
All Sports in Mainbook
via Request/Reply
All Live Events
- call with input parameters
Live Soccer Odds sent
~ every 30 seco nds
Ixiplal130050
(IXI_Group1)
Awp2inapp026
(TSI_Group1)
Ixiplal130051
(IXI_Group2)
Awp2inapp026
(TSI_Group2)
LBAI Scoreboard DB
lbaiscoreboard
IXI
Donau Service
(implemented as WebService within IIS)
IXI LB2 Farm:ht tp.donau.intranet
IP:10.130.0.72
Request/Reply Odds Calculation
- For All Donau supported sports
- For Mainbook and Livebook events
Farm: live.bwin.co m
IP:195.72.134.115 (round Robin)
IXI_Group1
IXI_Group2
TSI
Bet Placement Pre-Checks
(CSS Republisher)
BPS
(Bet Placement Service)
Placed Bets
Live FeedWebs
(12 servers for
each replicator)
CSS DB
Farm: live.bwin.co m
IP:195.72.135.26 (round Robin)
TSI_Group1
Write placed Bets
Daily Sync of
Placed bets
Placed Bets
(Oracle DB holding Placed Bets)
TSI_Group2
LEFTNAF
(Bet Co ntent
Live Events) +
Translation
Scoreboard
(Scores)
Input for ELI
Send cumulated
mybets view data
CSS
Mainfeed (Event
Data, Odd
Changes,
Markets..) +
Translation
(Customer Slip Store)
Java
Live Betting Services
(Live/Mainbook WebService)
(MSSQL DB holding Bet Slip Status)
BwinBPSBetStatusStore
Generate HTML
IXI
Racebook Server
(.Net WebService)
externally by GNI
Log Calculations
- For all Request/Reply calls?
TSI
TSI
MyBets View
C# Wrapper for CSS
State & Scheduling DB
Bwin.com client browser window
Racebook Client
(Flex Web client)
Horse Racing
Live Betting Client
(ELI)
**SQL Server DBs
Customer
IXI
IXI
- Storing input/output of Donau calls - State and timing storage for soccer
MyBets View ELI
View betting history
Timing Service
(handles bet
delays)
Calendar, State,
Translations,
MultiEvent ...
Myb ets.itsfogo.co m IP:195.72.135.53
Auditing DB
BPS Status DB
Place Bet
Mainbook HTML page
(Includes MyBets view in
HTML)
TSI
Multiple databases, and multiple availability groups in the topology
Plan for the worst case scenario: Loss of a complete datacenter
RPO: Zero data loss
RTO: 10 seconds or less
>99.99% availability in the last number of years
>99.99% availability even with maintenance
Multiple data centers (3 utilized in most cases)
Data centers hosting primary and secondary replicas are 11 km apart
SYNC
Primary
SYNC
AutoFailover
Partner
File Share
Witness
ASYNC
DataWarehouse
Application
Session State is at the heart of BWin.party platform
Controls user context as they move through the system
Initial configuration able to handle approximately 15,000 users per SQL Server instance
Scale-out 18 SQL Servers
SessionState
WebServer Farm
Latching lead to need to scale-out without fully utilizing system resources.
Faster site and page loads would allow for more user bets and provide a better overall experience
Memory-Optimized Tables
Size in Memory: ~20 GB
Rows in largest tables: 5 million peak
Durability: SCHEMA_ONLY
Indexing: Mostly HASH indexes, with one nonclustered (range) index
Natively Compiled Procedures
Conversions Required:
Migrated all to insert/update/delete to Natively Compiled
InterOp:
Used wrapper procedures for handling conflicts (to avoid client receiving error)
and BLOB passing.
Kept other parts as InterOp as no need to migrate as not the bottleneck
Hardware
4-socket blade. 60 Cores, HT disabled, 512 GB RAM, 900 GB HDD (SAS, RAID-1),
10 GB NIC
Development time
Initial prototype implementation done in 2 days
No application or web server code changes required
In Production for almost 2 years
Before Migration
SQL Server In-Memory OLTP
Company Profile
Established in 1971 as the world’s first electronic stock market
Owns and operates:
3 clearing houses, 5 central securities depositories
26 markets with a combined value that exceeds US$8
trillion.
In addition, Nasdaq's exchange technology, including trading, clearing, CSD
and market surveillance systems, is in operation in over 100 marketplaces
across USA, Europe, Asia, Australia, Africa and Middle East.
Business critical to world financial markets operations
Business Requirements
Availability: Data must be 100% available during business day
Performance: Is as critical as data availability. Handle a million
trans/second with sub ms latency
Managing large data volumes: Working with petabytes worth of
data
SQL Server Infrastructure
•
Almost 200 production SQL Server instances
Single High Transaction throughput system provides:
Mission critical to the business in terms of performance and
availability
Project Description
Maintains US Equities and Options trading data
Processing 10’s of billions of transactions per day
Average over 1 million business transactions/sec into SQL Server
Peak: 10 million/sec
Require last 7 years of online data
Data is used to comply with government regulations
Requirements for “real-time” query and analysis
Approximately 500 TB per year, totaling over 2PB of uncompressed data
Largest tables approaching 10TB (page compressed) in size
Early Adopter and upgrade to SQL Server 2014 in-order-to:
Better manage data growth
Improve query performance
Reduce database maintenance time
Data at this scale require breaking things down into manageable units:
Separate data into different logical areas:
• A database per subject area (17)
• A database per subject area per year (last 7 years)
Table and Index Partitioning:
• 255 partitions per database
• 25,000 filegroups
• Filegroup to partition alignment for easier management/less impact moving data
• Filegroup backups
Taking advantage of compression:
• Compression per partition
• Backup compression
Space Savings:
Year 2012 Database (in TB)
• Compression savings:
600
• 2x-4x over page
compressed
• 5-8x over
uncompressed data
500
400
300
• Additional 2x with backup
compression
200
100
0
Uncompressed
Page Compressed
CCI compression
CCI + with backup
compression
• Archival compression testing
provided 20% further
compression
Load Performance:
Eliminated need for index creation after data-load. Loading directly into Columnstore
Better positioned to keep up with ever-increasing data volumes
On Average 1.5x throughput in loading files. Insert-only via bulk API
Query Performance:
Type of Query
Page Compressed
CCI
Gains
Single Day
5 seconds
1 second
5x
History Report queries*
16.5 min
1.5 min
12x
*More common
Ease of migration:
• Transparent changes to end-user. No application code changes
• Greatly reduced maintenance and troubleshooting overhead. No need to do index
maintenance such as defragment the index
Company Profile




Business Requirements




MSS : Manufacturing Services Solution
Quanta ERP Architecture
E-Customs
Finance Management
OF/
B2Bi Solution
Warehouse Management
SQL Server ERP System
Sales/Marketing Business
Manufacturing
Purchase/Warehouse
Supply Chain
Management System
Computer
Integrated
Manufacturing
Process
Modelling
Workflow
Product Data
Management
Engineering
Change
Documentation






To provide High Availability use
SQL Server AlwaysOn Availability
Groups (AGs)
• Synchronous mode to
provide HA for DB tier
• Use Log Shipping for DR
to remote site
Corporate HQ
Hardware:
2 x HP DL980 1TB RAM
FusionIO cards accelerate log
write performance
Log Shipping
DL980, 1TB RAM
AlwaysOn AGs are handling peak
workloads on Quanta SAP ERP
system
SYNC
DL980, 1TB RAM
Remote DR site
Working with SQL Server 2008 (in 2009)





SQL Server Today

 DB size is kept at ~10TB via aggressive data archiving
Measure of Users, throughput and Latency:
Over 5000 users in the system,
Transactions:
>15,000,000 SAP dialog steps per day
Translate into more than 25,000,000 database transactions/day
CPU
utilization:
Avg <= 30%
Peak 70%
Company Profile



Business Requirements




• Raw data reside in Azure Blob Storage
• Over 90 databases with multi-TBs of metadata
• Mix of Azure SQL Databases and SQL VMs
• Cold vs warm data
• SQL VMs have multiple AlwaysOn AG replicas
• Each AlwaysOn
replica has 20+
Datafiles on Blob
Storage
• Utilize multiple
storage accounts on
Azure to maximize
throughput
• Testing recently
released Premium
Storage to increase
bandwidth and
reduce latency
Company Profile


Business Requirements


Benchmark Results on SQL Server
 The results of the benchmark testing exceeded Temenos expectations
 At peak performance, the system processed 11,500 transactions per second in
online business testing
 Averaged more than 10,000 interest accrual and capitalizations per second during
close-of-business testing,
 Processing 25 million capitalizations and account accruals in less than 42 minutes,
with average database requests over 260,000/s.
 Testing showed a near linear scalability of 95 percent as the number of
agents increased
 The performance results make the combination of T24 and Microsoft
technologies a viable solution for the world’s largest banks
SQL Server is used by many businesses, across geographical region and industry, to
meet their Enterprise Mission Critical requirements
SQL Server is used to handle some of the most demanding workload in the world
SQL Server features such as: AlwaysOn Availability Groups and In-Memory
technologies are utilized to meet challenging business needs around High
Availability and Performance
Businesses are utilizing SQL Server relational capabilities on premise, and in
cloud(Azure) both with VMs and Azure SQL DB (PaaS) to meet their needs
 Case Studies
 BWin.party
 Nasdaq
 Quanta
 Service of Tax Administration
 Temenos
http://myignite.microsoft.com