Big data analytics
Rafal Lukawiecki
Strategic Consultant
Project Botticelli Ltd
[email protected]
@rafaldotnet
Objectives
The information herein is for informational purposes only and represents the opinions and views of Project
Botticelli and/or Rafal Lukawiecki. The material presented is not certain and may vary based on several factors.
Microsoft makes no warranties, express, implied or statutory, as to the information in this presentation.
Portions © 2014 Project Botticelli Ltd & entire material © 2014 Microsoft Corp unless noted otherwise. Some
slides contain quotations from copyrighted materials by other authors, as individually attributed or as already
covered by Microsoft Copyright ownerships. All rights reserved. Microsoft, Windows, Windows Vista and other
product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries. The
information herein is for informational purposes only and represents the current view of Project Botticelli Ltd as of
the date of this presentation. Because Project Botticelli & Microsoft must respond to changing market conditions,
it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft and Project Botticelli
cannot guarantee the accuracy of any information provided after the date of this presentation. Project Botticelli
makes no warranties, express, implied or statutory, as to the information in this presentation. E&OE.
Register on
projectbotticelli.com
Introduction to BI & Big Data
DAX
MDX
Data Mining
Excel BI
projectbotticelli.com/ppt
Big data, or just complex data?
[Diagram labels: Data (volume, velocity, variety, complexity); Domain (preparing, interpreting)]
Common big data scenarios
Financial services
  Modeling true risk
  Threat analysis and fraud detection
  Trade surveillance
  Credit scoring and analysis
Media & Entertainment
  Recommendation engines
  Ad targeting
  Search quality
  Abuse and click fraud detection
Retail
  Point of sales transaction analysis
  Customer churn analysis
  Sentiment analysis
Telecommunications
  Customer churn prevention
  Network performance optimization
  Call Detail Record (CDR) analysis
  Network failure prediction
Government
  Cyber security (botnets, fraud)
  Traffic congestion and re-routing
  Environmental monitoring
  Antisocial monitoring via social media
Healthcare
  Genomics research
  Cancer research
  Health pandemics early detection
  Air quality monitoring
Which big data?
PDW: near real-time insights
Real-time with complex event processing
Low latency
Sub-second processing of large event streams
Continuous insight through historical data mining
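To make the idea of low-latency processing over an event stream concrete, here is a minimal sketch of a time-based sliding window that keeps a running average as events arrive. It is an illustration only, not the API of any Microsoft complex event processing product; the class name `SlidingWindowAverage` and its parameters are hypothetical.

```python
from collections import deque


class SlidingWindowAverage:
    """Running average over the events seen in the last `window_seconds`.

    Hypothetical helper for illustration; assumes events arrive in
    timestamp order and that window_seconds is positive.
    """

    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        self.events = deque()  # (timestamp, value) pairs, oldest first
        self.total = 0.0

    def add(self, timestamp, value):
        """Record one event and return the average over the current window."""
        self.events.append((timestamp, value))
        self.total += value
        # Evict events that have fallen out of the window.
        cutoff = timestamp - self.window_seconds
        while self.events and self.events[0][0] <= cutoff:
            _, old_value = self.events.popleft()
            self.total -= old_value
        return self.total / len(self.events)


if __name__ == "__main__":
    window = SlidingWindowAverage(window_seconds=10)
    for ts, reading in [(0, 10), (5, 20), (12, 30)]:
        print(ts, window.add(ts, reading))
```

Each event is handled in amortised constant time, which is the property that lets this style of processing keep up with a large stream instead of re-scanning history on every event.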
Advanced analytics
Descriptive & predictive
Clustering, neural nets, decision trees, time series, naïve Bayes, sequence clustering, linear and logistic regression
Semantic search
Conceptual similarities
Geospatial
Geometry and geography
Big data
Hadoop, Mahout
Microsoft HDInsight
Apache Hadoop distribution
Developed by Hortonworks & Microsoft
Integrated with Microsoft BI
[Diagram labels: Big, fast, or complex data; Microsoft HDInsight, Tabular, OLAP, SQL, PDW + Polybase; Interaction, exploration, reporting, visualisation]
Hadoop principles
Practical method for massive parallelisation of analytical data processing
Part 1: the job
Hadoop data
Hadoop MapReduce
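As a rough illustration of the MapReduce model, the sketch below runs the classic word count as three steps (map, shuffle, reduce) inside a single Python process. It only mimics the programming model: a real Hadoop job would distribute the map and reduce calls across the cluster and use the Hadoop APIs, which are not shown here. All function names are hypothetical.

```python
from collections import defaultdict
from itertools import chain


def map_phase(line):
    """Mapper: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by key, as the framework would."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reducer: combine all values for one key into a single result."""
    return key, sum(values)


def run_job(lines):
    """Run map, shuffle, and reduce in sequence over the input lines."""
    mapped = chain.from_iterable(map_phase(line) for line in lines)
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    sample = ["big data big insight", "fast data"]
    print(run_job(sample))
```

Because each mapper call looks at one line only and each reducer call looks at one key only, both phases can be spread over many machines without the calls needing to talk to each other.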
Hadoop cluster
Yahoo! Hadoop cluster, about 2007.
Source: http://developer.yahoo.com. Picture used with permission.
Hadoop cluster
Buster Cluster, an early research project by Miles Osborne, University of Edinburgh, School of Informatics.
Picture used with permission. http://homepages.inf.ed.ac.uk/miles/
Cloud
rent-a-Hadoop-cluster, or:
“Supercomputer for cents”
Windows Azure HDInsight
Processing logic in HDInsight
Versions: 1.6, 2.1, 3.0
Processing logic in HDInsight 3.0
Hadoop 2.2
Hadoop data science
Mahout 0.9 (not yet in HDInsight 3.0)
Collaborative filtering, recommenders, clustering, singular value decomposition, parallel frequent pattern mining, naïve Bayes, decision trees
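To give a flavour of collaborative filtering, the first technique listed above, here is a small item-based recommender in plain Python. It is a sketch under stated assumptions and does not use or reproduce Mahout's API: the function names, the cosine similarity measure, and the toy ratings are all chosen for illustration.

```python
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two sparse vectors (dicts of user -> rating)."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[u] * b[u] for u in common)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b)


def recommend(ratings, user, top_n=3):
    """Rank items the user has not rated by similarity-weighted ratings.

    `ratings` maps user -> {item: rating}; hypothetical layout for this sketch.
    """
    # Invert to item -> {user: rating} so items can be compared to each other.
    by_item = {}
    for u, items in ratings.items():
        for item, rating in items.items():
            by_item.setdefault(item, {})[u] = rating

    seen = ratings[user]
    scores = {}
    for candidate, vector in by_item.items():
        if candidate in seen:
            continue
        weighted = total_sim = 0.0
        for item, rating in seen.items():
            sim = cosine(vector, by_item[item])
            weighted += sim * rating
            total_sim += sim
        if total_sim > 0:
            scores[candidate] = weighted / total_sim
    return sorted(scores, key=scores.get, reverse=True)[:top_n]


if __name__ == "__main__":
    toy = {
        "ann": {"a": 5, "b": 4},
        "bob": {"a": 5, "b": 5, "c": 4},
        "cat": {"b": 1, "d": 5},
    }
    print(recommend(toy, "ann"))
```

On a cluster, the expensive part is the item-to-item similarity computation, which is the step a library like Mahout parallelises; the scoring logic itself stays this simple.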
Part 2: the results
Summary
projectbotticelli.com
BI video tutorials, PPTs, and articles
15% Off: 2014SWISS15
Valid in March 2014 only
Follow: @rafaldotnet
Email: [email protected]
Discover: rafal.net