StretchDB-joeyongx
Download
Report
Transcript StretchDB-joeyongx
SQL SERVER
STRETCH
DATABASE TOP
TO BOTTOM
JOE YONG
SR. PROGRAM MANAGER
M I C R O S O F T C O R P.
LIGHTNING POLL
When you think about your organization’s enterprise
storage or SAN experience are you
1
5
10
What is your data center footprint and DBA team size
compared to database size and complexity growth?
Thinking of your largest OLTP databases, how many rows in
the largest tables? What percentage is cold data?
If you have databases with very large tables
What is your maintenance (e.g. re-index, performance tuning)
experience?
Is there impact to their availability and performance SLAs?
ACTUAL CUSTOMER RESPONSES
Is there a 0 or negative rating?
I want to “talk” to the person who came up with that “do more with less” thing
45 billion rows in a single table (99% cold)
1.3 trillion rows in a partitioned table (>80% cold)
8TB of data in a single table
Re-index? That’s not funny man
We can’t even update statistics beyond 10% sample
We are down for about one day if we need to do a full DB restore though
about 70-80% of the data is cold
~11TB database, no backup, no DBCC CHECKDB in 1.5 years
WHY STRETCH DATABASE?
Ever growing databases
Existing solutions
Massive tables (hundreds of
millions/billions of rows, TBs size)
Keep adding storage and
datacenter capacity
Users want to keep all their data and
use it too
Closed business (cold) data accessed
infrequently but must be online
Does not solve maintenance
and SLA issues (unless you put
cold data in SSDs)
Maintenance challenges (e.g. re-index)
Move data to another location –
database, flat-files, backups, tapes
Business SLAs at risk (e.g. restore time)
Data is near-line or offline
Delete older data
Unknown if data is valuable
Unacceptable or illegal in some
organizations/industries
STRETCH SQL SERVER INTO AZURE
Stretch Database
Microsoft Azure
Mike Wan
Order history
Name
ox7ff654ae6d
3/18/2005
Stretch to cloud
SSN
Steve Karimi
Date
Customer data
cm61ba906fd
2/28/2005
Mike Wan
ox7ff654ae6d
3/18/2005
Product data
Sapna Matuszyk
SQL Mojoe
4/10/2005
i2y36cg776rg
Order History
nx290pldo90l
4/27/2005
Blazej Zwilling
ypo85ba616rj
5/12/2005
Cheng Jeswani
bns51ra806fd
5/22/2005
Nikhat Lindell
mci12hh906fj
6/07/2005
Whitney Lang
utb76b916gi
6/18/2014
Lorenzo Olds
px61hi9306fj
7/1/2014
Sophie Cook
ol43bi506gd
7/12/2014
Aida Durham
tx83hal916fi
7/29/2014
Jaclyn Wade
nb95re926gi
8/10/2014
Adrienne Battle
vc61ira536fe
8/23/2014
Query
App
Stretch cold tables securely from
SQL Server databases to Azure
with remote query processing
Benefits
Cost effective online cold data
Entire table is online and
remains accessible from existing
apps
No application changes
Addresses hard problems
arising from very large tables
Secure by default with optional
additional security with Always
Encrypted and Row Level
Security
SECURE BY DEFAULT (AND THEN SOME)
Data in motion always via
secure channels (TLS1.2 / 1.1)
Supports Always Encrypted
Encryption key remains onpremises
Row level security
SQL Server and SQL Azure
audit
Integrated authentication
(pass-through) with AAD
integration
KEY TAKEAWAYS
Secure communication channel is mandatory
Additional SQL Server data encryption technologies
work transparently
Adheres to SQL Sever security hierarchy
SQL Server and Windows integrated authentication
No change to authorization, permissions model
Audit SQL Server and/or Azure SQL Stretch database
TRICKLE DATA MOVEMENT
Azure SQL Stretch
Database
On-premises
application(s)
On-premises
SQL Server
instance
Trickle data migration
DB in SAN/Local
Storage
SQL Server
Ord_Detail
Txn_detail
Ord_detail_
archive
table
Hot + cold
rows in
same table
Compute
Storage
Ord_detail_
archive
table
Txn_detail
(cold rows
only)
Entire archive Only cold rows
moved
table moved
KEY TAKEAWAYS
Transactionally consistent data movement
Moved in batches and tracked with batch IDs
Fixed batch sizes in SQL Server 2016 release
Optimized for low impact to existing production
workload
DBAs have some control by scheduling pause/resume
via SQL Agent
SMART QUERY PROCESSING
Azure SQL Stretch
Database
Local +
Local data
remote data
only queries
queries
On-premises
application(s)
On-premises
SQL Server
instance
Trickle data migration
Transparent remote
data access
DB in SAN/Local
Storage
SQL Server
Compute
Ord_Detail
Txn_detail
Ord_detail_
archive
table
Hot + cold
rows in
same table
Storage
Ord_detail_
archive
table
Txn_detail
(cold rows
only)
KEY TAKEAWAYS
All cases
No modifications to existing queries
Common optimization techniques still apply
Azure SQL Stretch DB service will “get smarter” without user
action (tune, upgrade, etc…)
Users can increase/decrease performance level of Azure SQL
Stretch DB as needed
Stretch entire table
Queries always pushed to local and remote
Stretch cold rows only (predicate based stretch)
Some smarts about query execution – predicate injection to
query tree during algebrazation so optimizer can prune
remote branch if appropriate
HIGH AVAILABILITY AND DISASTER
RECOVERY
Supports existing SQL Server
HA/DR solutions, except
Backup/restore
DB mirroring – deprecated for SQL
Server
Replication – not supported for
StretchDB in general
DBAs backup/restore local SQL
Server hot data only
StretchDB ensures remote data
is transactionally-consistent with
local
Upon completion of local
restore, SQL Server reconciles
with remote using metadata
BACKUP & RESTORE
Azure SQL Stretch
Database
SQL Server
On-premises
SQL Server
instance
On-premises
application(s)
Compute
Storage
“Shallow”
backup
Regular backup
& restore
restore
&
Table 1
Table 1’
Table 2
Table 2
Table 2
DB in SAN/Local
Storage
Ord_detail
Txn_detail
Ord_detail_arc
hive
table
Txn_detail
(cold rows
only)
KEY TAKEAWAYS
Existing HA/DR technologies and practices still apply
Always On Availability Group with readable secondary works
transparently
Backup/restore experience improves after most of the cold data
has been migrated to Azure
Users backup/restore SQL Server database and logs
Upon completion of restore, run sp_reauthorize_db – triggers
reconcile
Full data copy operation if restore with copy of Azure SQL
Stretch database
SQL Server database restore time and reconcile with metadata if
restore in-place for Azure SQL Stretch database (e.g. 20TB
database, 80% cold data, restore time is restore 4TB SQL Server
and reconcile with Azure SQL Stretch database)
CURRENT LIMITATIONS AND NON-USE
CASES
Limitations
Certain data types, replication, computed columns, etc…
Table with primary key in a PK/FK relationship – cannot stretch parent
tables, child tables ok
Cross database constraints not supported
Detailed list at https://msdn.microsoft.com/enus/library/mt605114.aspx
Non-use cases
Not for frequently accessed data (hot)
Can connect directly to Azure SQL Stretch database but not
recommended
Does not alleviate performance/scale issues for application that have
inefficient design/queries
Can amplify existing performance issues
COMMON CONCERNS
We do not trust public clouds
Encrypt sensitive data with Always Encrypted, keep encryption key on-premises;
Microsoft never has access
Azure is designed to meet/exceed security and compliance requirements with regular
tests and audits
Price is higher than expected
Current entry level optimized for dense/premium compute
Lower entry level work in progress
No customer should be blocked by pricing, contact [email protected]
for help
Licensing and capacity planning is really complicated
StretchDB enabled on all editions
Pay based on consumption; storage and compute charged separately
What is with the PK/FK, unique constraint, XML type and other limitations?
Only stretch child table in parent/child relationship today
Assumes uniqueness already validated before data turns cold
Most limitations are short term and will be removed gradually in future
iterations based on customer feedback
STRETCH DATABASE TAKEAWAYS
Works without code change for most applications
Can save customers a lot more than money – business
SLA, performance, administration, design simplicity
Secure by default and supports advanced security
requirements
Existing HA/DR solutions works as-is, may have
significant restore time reduction for some scenarios
Performance will be at reasonable levels for cold data
scenarios; not intended for active OLTP data
SQL Server 2016 targets customers with large amounts
(TBs) of cold data
WE WANT TO HEAR FROM YOU
Customer engagements, technical/business questions,
project issues
[email protected]
Product/feature feedback, bugs, feature requests
https://connect.microsoft.com/SQLServer/Feedback
Twitter
#stretchdb
A BIG THANKS TO OUR SPONSORS