Understanding the Azure Data Stack
Matan Yungman, CTO, Madeira
Where Did We Come From?
Things Slowly Started to Change
• Volume
• Variety
• Velocity
Things Slowly Started to Change
• New Data Models
• Key-Value, Document, Graph, Column-Oriented, …
• Not conforming to old rules
Things Slowly Started to Change
• Distributed Computing
• Cloud Computing
About Me
@MatanYungman
MadeiraData.com
SQLServerRadio.co.il
SQLServerRadio.com
How to Start
• Technologies used in your company
• Architecture view
• Just do it
Cloud Models
IaaS - Azure VMs
[Diagram: a server (>_) and the devices around it: POS terminals, smartphones, servers, ATMs, security devices, Kinect, PCs/laptops, kiosks, self-checkout stations, slates/tablets, automation devices, point-of-service devices, digital signs, vending machines, thin clients, handhelds, logic controllers, remote medical monitors, specialized devices, diagnostic equipment]
Azure VMs - Databases
• Hadoop, Cloudera
• SQL, Oracle, DB2 Server
• MongoDB, Couch, etc.
[Diagram: data labeled Structured and Unstructured/NoSQL, coming from the device set shown on the previous slide (POS terminals, smartphones, PCs/laptops, kiosks, servers, ATMs, and so on)]
Azure SQL Database
A relational database-as-a-service, fully managed by Microsoft.
For cloud-designed apps when near-zero administration and enterprise-grade capabilities are key.
[Comparison table: SQL Server in a VM vs. Azure SQL Database, by: Best for…, Resources, TCO benefits, Scalability]
Azure SQL Database
• SLA: Increased from 99.9% to 99.99% uptime SLA
• Performance: New service design point enables scale up of resources, delivering predictable throughput & performance
• Protection: Point-in-time restore, geo-restore, and standard and active geo-replication protect against human- and environment-initiated events
• Compliance: Azure certifications (ISO, HIPAA BAA, EU Model Clause); auditing on SQL Database
• Flexibility: Hourly billing & broad set of price points
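To make the SLA change concrete, here is a small worked calculation of the downtime each uptime percentage allows. The 30-day month is an assumption made for this sketch; the slide does not say which period the SLA is measured over.

```python
# Allowed downtime implied by an uptime SLA.
# Assumption: the SLA is evaluated over a 30-day month.
MINUTES_PER_MONTH = 30 * 24 * 60  # 43,200 minutes


def allowed_downtime_minutes(sla_percent: float) -> float:
    """Return the maximum downtime, in minutes per 30-day month, under the SLA."""
    return round(MINUTES_PER_MONTH * (100 - sla_percent) / 100, 2)


old_sla = allowed_downtime_minutes(99.9)    # roughly 43 minutes per month
new_sla = allowed_downtime_minutes(99.99)   # roughly 4 minutes per month
print(f"99.9%: {old_sla} min/month, 99.99%: {new_sla} min/month")
```

Going from three nines to four cuts the allowed downtime by a factor of ten.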
Columnstore index representation
Up to 100x faster queries
Up to 15x more compression
Updateable clustered columnstore vs. table with customary indexing
[Diagram: parallel query execution, from Query to Results]
• Store data in columnar format for massive compression
• Load data into or out of memory for next-generation performance with up to 60% improvement in data loading speed
• Updateable and clustered for real-time trickle loading
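Why does a columnar layout compress so well? A minimal sketch, assuming a tiny made-up sales table: once values are stored per column, repeated values sit next to each other and can be collapsed. Run-length encoding is used here only as an illustration; it is not a description of SQL Server's actual on-disk format, and the table and function names are hypothetical.

```python
from itertools import groupby

# Hypothetical sales rows, as a row store would hold them.
rows = [
    {"region": "EU", "product": "A", "amount": 10},
    {"region": "EU", "product": "A", "amount": 12},
    {"region": "EU", "product": "B", "amount": 7},
    {"region": "US", "product": "B", "amount": 9},
    {"region": "US", "product": "B", "amount": 11},
    {"region": "US", "product": "B", "amount": 5},
]


def to_columns(records):
    """Pivot a list of row dicts into one list per column."""
    return {name: [r[name] for r in records] for name in records[0]}


def run_length_encode(values):
    """Collapse consecutive repeats into (value, count) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]


columns = to_columns(rows)
encoded_region = run_length_encode(columns["region"])    # 6 values become 2 pairs
encoded_product = run_length_encode(columns["product"])  # 6 values become 2 pairs

# An aggregate over one column reads only that column's values.
total_amount = sum(columns["amount"])
```

The same idea explains the query speedup: an aggregate such as a sum reads one column instead of every field of every row.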
Connecting islands of data with PolyBase
Bringing Hadoop point solutions and the data warehouse together for users and IT
[Diagram: a Select… query goes through PolyBase in SQL Server Parallel Data Warehouse to Microsoft Azure HDInsight, Hortonworks for Windows and Linux, and Cloudera Hadoop, and comes back as a result set]
• Provides a single T-SQL query model for PDW and Hadoop with rich features of T-SQL, including joins without ETL
• Uses the power of MPP to enhance query execution performance
• Supports Windows Azure HDInsight to enable new hybrid cloud scenarios
• Provides the ability to query non-Microsoft Hadoop distributions, such as Hortonworks and Cloudera
Introducing SQL Data Warehouse
Fully managed relational data warehouse-as-a-service
The first elastic cloud data warehouse with enterprise-grade capabilities
Support your smallest to largest data sets
Elastic Scale
Spin up for heavy workloads, cycle down for daily activity
Buy time to insight based on what you need, when you need it
Choose the combo of compute and storage that meets your needs
Sample - Portal UX
Pause
Data remains in place – no reloading / restoring of data
When paused, cloud-scale storage is min cost
Automate via PowerShell/REST API
$$$$
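A rough sketch of why pausing matters for the bill: storage is charged all the time, compute only while the warehouse is running. Every rate below is a hypothetical placeholder chosen for easy arithmetic, not an actual Azure price, and `monthly_cost` is a name made up for this example.

```python
# All rates below are hypothetical placeholders, not actual Azure prices.
COMPUTE_RATE_PER_HOUR = 1.50   # assumed compute cost while running
STORAGE_RATE_PER_HOUR = 0.10   # assumed storage cost, billed paused or not
HOURS_PER_MONTH = 30 * 24      # 720 hours in a 30-day month


def monthly_cost(running_hours: int) -> float:
    """Storage is billed for every hour; compute only for hours not paused."""
    if not 0 <= running_hours <= HOURS_PER_MONTH:
        raise ValueError("running_hours must be between 0 and 720")
    storage = STORAGE_RATE_PER_HOUR * HOURS_PER_MONTH
    compute = COMPUTE_RATE_PER_HOUR * running_hours
    return round(storage + compute, 2)


always_on = monthly_cost(HOURS_PER_MONTH)  # never paused
business_hours = monthly_cost(22 * 8)      # running 8 hours on 22 working days
print(f"always on: {always_on}, business hours only: {business_hours}")
```

Under these assumed rates, running only during business hours costs well under a third of leaving the warehouse on all month, and the data stays in place throughout.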
SQL Server Compatibility
Mature enterprise-ready SQL for sophisticated DW scenarios
Existing SQL Server scripts and tools just work
Continuous enhancements on language surface
• Modular programming (write once, execute multiple times)
• Faster code execution
• Encapsulated programming logic
• Easier maintenance of large tables
• Improves performance
• Enhanced scalability and availability
• Allows proper use and comparisons of characters in different languages
• Mature Column-Store technology for best-in-class DW query performance
Document Database
What is a document database?
Ideally suited to this kind of document:
{
  "id": "13244_user",
  "firstName": "John",
  "lastName": "Smith",
  "age": 25,
  "employmentHistory": [
    {
      "company": "Contoso Inc",
      "start": {"date": "Thu, 02 Apr 2015 20:54:45 GMT", "epoch": 1428008086},
      "position": "CEO"
    },
    {
      "start": {"date": "Thu, 02 Apr 2012 20:54:45 GMT", "epoch": 1428008086},
      "end": {"date": "Thu, 01 Apr 2015 20:54:45 GMT", "epoch": 1428008086},
      "position": "GM"
    }
  ],
  "address": {
    "streetAddress": "21 2nd Str",
    "city": "New York",
    "state": "NY",
    "postalCode": "10021"
  },
  "children": [
    {"name": "Megan", "age": 10},
    {"name": "Bruce", "age": 7},
    {"name": "Angus", "sports": ["football", "basketball", "hockey"]}
  ],
  "mobileNumber": "212 555-1234"
}
Data normalization
Come as you are
What is a document database?
• Part of the NoSQL family of databases
• Built for simplicity, scale and performance
• Non-relational, no schema enforced
• Flexible query options
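What "no schema enforced" looks like in practice, as a minimal sketch: two documents with different shapes sit side by side, and a query simply skips whatever a document does not have. The first document is a trimmed copy of the sample on the slide; the second document and the `children_playing` helper are hypothetical. This is plain Python over parsed JSON, not a DocumentDB API.

```python
import json

# Two documents in the same collection with different shapes (no schema enforced).
raw_documents = [
    # Trimmed version of the sample document from the slide.
    """{"id": "13244_user", "firstName": "John", "lastName": "Smith", "age": 25,
        "address": {"city": "New York", "state": "NY"},
        "children": [{"name": "Megan", "age": 10},
                     {"name": "Bruce", "age": 7},
                     {"name": "Angus", "sports": ["football", "basketball", "hockey"]}]}""",
    # Hypothetical second document: no address, no children, one extra field.
    """{"id": "99999_user", "firstName": "Jane", "nickname": "JD"}""",
]
documents = [json.loads(text) for text in raw_documents]


def children_playing(docs, sport):
    """Return (parent id, child name) pairs where the child plays the given sport."""
    matches = []
    for doc in docs:
        for child in doc.get("children", []):     # a missing array is simply skipped
            if sport in child.get("sports", []):  # so is a missing field
                matches.append((doc["id"], child["name"]))
    return matches


hockey_players = children_playing(documents, "hockey")
print(hockey_players)
```

No table had to be altered to store the second document, and the query reaches into nested arrays without any joins.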
Microsoft Azure Data Services
Fully managed, scalable, queryable, schema-free JSON document database service for modern applications
• Transactional processing
• Rich query
• Managed as a service
• Elastic scale
• Schema-free data model
• Internet accessible HTTP/REST
• Arbitrary data formats
Azure DocumentDB
Fully-managed, highly-scalable, NoSQL document database service
• SQL: query over schema-free JSON
• JS: multi-document transactions
• Tunable, high performance
• Fully managed and designed for massive scale
Azure Stream Analytics
Process real-time data in Azure
• Consumes millions of real-time events from Event Hub collected from devices, sensors, infrastructure, and applications
• Performs time-sensitive analysis using a SQL-like language against multiple real-time streams and reference data
• Outputs to persistent stores, dashboards or back to devices
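The "time-sensitive analysis" above usually means grouping events into time windows. Here is a minimal sketch of one such grouping, a fixed non-overlapping (tumbling) window, written in plain Python rather than the service's SQL-like language. The events, device names, and the `tumbling_window_average` function are all made up for illustration, and a real engine would process events incrementally as they arrive instead of from a list.

```python
from collections import defaultdict

# Hypothetical events: (timestamp in seconds, device id, reading).
events = [
    (0, "pos-1", 20.0),
    (4, "pos-1", 22.0),
    (7, "atm-9", 5.0),
    (11, "pos-1", 30.0),
    (12, "atm-9", 7.0),
    (19, "atm-9", 9.0),
]


def tumbling_window_average(stream, window_seconds):
    """Average reading per device per fixed, non-overlapping time window."""
    sums = defaultdict(lambda: [0.0, 0])  # (window start, device) -> [total, count]
    for timestamp, device, reading in stream:
        window_start = (timestamp // window_seconds) * window_seconds
        bucket = sums[(window_start, device)]
        bucket[0] += reading
        bucket[1] += 1
    return {key: total / count for key, (total, count) in sorted(sums.items())}


averages = tumbling_window_average(events, window_seconds=10)
for (window_start, device), average in averages.items():
    print(f"window starting at {window_start}s, {device}: {average}")
```

Each event lands in exactly one window, so the per-window averages can be sent to a dashboard or a store as soon as the window closes.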
[Diagram: event sources: POS terminals, smartphones, servers, ATMs, security devices, Kinect, PCs/laptops, kiosks, self-checkout stations, slates/tablets, automation devices, point-of-service devices, digital signs, vending machines, thin clients, handhelds, logic controllers, remote medical monitors, specialized devices, diagnostic equipment]
Power BI Dashboards
Power BI Graphs
Power BI NLP
Power BI Integrations
Azure Data Factory
Orchestrate trusted information production in Azure
• Connect to relational or non-relational data that is on-premises or in the cloud
• Orchestrate data movement & data processing: MapReduce, Hive, Pig, C#, Stored Procedures, Azure Machine Learning
• Publish to Power BI users as a searchable data view
• Operationalize (schedule, manage, debug) workflows
• Lifecycle management, monitoring
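Orchestration boils down to running activities in dependency order. A minimal sketch of that idea, using Python's standard-library `graphlib`: the activity names and the `run_pipeline` helper are hypothetical, and none of this is the Data Factory API, which defines pipelines in its own format.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical activities; each maps to the set of activities it depends on.
dependencies = {
    "copy_onprem_sales": set(),
    "copy_cloud_logs": set(),
    "hive_join": {"copy_onprem_sales", "copy_cloud_logs"},
    "score_with_ml": {"hive_join"},
    "publish_to_power_bi": {"score_with_ml"},
}


def run_pipeline(graph, actions):
    """Run every activity after its dependencies and return the execution order."""
    order = []
    for name in TopologicalSorter(graph).static_order():
        actions[name]()  # a real orchestrator would also schedule, retry and log here
        order.append(name)
    return order


log = []
# Each action just records that it ran; real activities would move or process data.
actions = {name: (lambda n=name: log.append(f"ran {n}")) for name in dependencies}
execution_order = run_pipeline(dependencies, actions)
print(execution_order)
```

Both copy steps run before the Hive join, and publishing to Power BI always comes last, whatever order the activities were declared in.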
Wasn’t Covered but Worth Mentioning
• Azure Tables
• Azure Search
• Azure Redis Cache
• Cortana
How to Start
• Technologies used in your company
• Architecture view
• Just do it
Please fill out the evaluation forms
Special thanks to our great sponsors!