The “DMA Analytics Council Presents” Series


Analytics-CRM Community
Town Hall
Machine Learning & Automated Modeling
March 18, 2015
We will be starting at the top of the hour.
Please stay on mute (*6), not on hold.
This session is NOT being recorded.
This is Your Member Community
Members are always welcome to:
» Share knowledge with other analytics pros.
» Contribute to industry thought leadership.
» Get published or be a discussion leader.
» Network to advance your business and career.
» Email [email protected] with your questions or suggestions.
Analytics-CRM Community
» Monthly Calls
» Analytics Journal
» Analytics Challenge 2015 – details soon
» Analytics Advantage – be a guest blogger
» Industry Visibility – be a speaker or discussion facilitator
» Thought Leader – assist in DMA content planning
» Awareness – advertise or sponsor
» Self-Regulation – be compliant
» Members access all at http://thedma.org/acc
Upcoming Events
Apr 8: Big Data, Small Data, Clean Data, Messy Data (Stephen Yu, Willow Data Strategy)
May 13: Town Hall
Jun 10: Big Data Enabled Analytics for Actionable Customer Insight (Amit Deshpande, Epsilon)
Jul 8: Town Hall
Aug 12: Practical Text Analytics (Steven Struhl, Converge Analytic)
Register at www.thedma.org
Today’s Town Hall Topic
Machine Learning & Automated Modeling
Led by:
» Marty Rose, Senior Data Scientist, Acxiom
» Peter Zajonc, Senior Director, Epsilon
Please stay on mute *6, not hold, until prompted for
questions or use chat function to post questions.
Where To Next?
Goedel machines are self-referential universal problem solvers making provably optimal self-improvements.
http://people.idsia.ch/~juergen/goedelmachine.html
Definitions – What it is
Machine Learning
Automated modeling
Big Data
Differences
How is automated modeling different in the context of Machine Learning and Big Data?
What are the benefits and why should we pay attention to this?
Automated Analytic Techniques
» Data Reduction and Standardization
» Predictive Modeling
• Traditional: Simple to Complex Regression
• Statistical Learning Techniques
» Unsupervised Classification Techniques
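As a minimal illustration of the first two bullets, the sketch below standardizes one predictor (z-scores) and then fits a simple one-variable least-squares regression. The data and the function names are hypothetical and are not taken from the presentation. It only shows the kind of step an automated modeling pipeline would run without hand tuning.

```python
from statistics import mean, pstdev


def standardize(values):
    """Rescale values to zero mean and unit (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def fit_simple_regression(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Hypothetical toy data: number of prior purchases vs. spend.
purchases = [1, 2, 3, 4, 5]
spend = [12.0, 14.0, 16.0, 18.0, 20.0]

z = standardize(purchases)
intercept, slope = fit_simple_regression(z, spend)
print(f"intercept={intercept:.3f}, slope per std. dev.={slope:.3f}")
```

Because the predictor is standardized first, the slope reads as the change in spend per one standard deviation of purchases, which makes coefficients comparable across variables measured on different scales.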
Too much data incoming?!
Two Real life examples
1. Recommender System – Beauty Products at Retail Counter
2. New Data Product Extensions
Letting go is hard…(!)
Future
What’s the role of the future analyst in this brave new world?
-or-
Do the analysts still hold the keys to the car? Who will be driving?
References – Will be posted on web site
Useful intro videos
» Machine Learning: Real Basics, with Ron Bekkerman (LinkedIn Tech Talks) -- https://www.youtube.com/watch?v=wjTJVhmu1JM
» Lecture 01 - The Learning Problem (CalTech) -- https://www.youtube.com/watch?v=mbyG85GZ0PI
» Lecture 1 | Machine Learning for Engineers (Stanford) -- https://www.youtube.com/watch?v=UzxYlbK2c7E
» For do-it-yourselfers (Amazon Web Services and Python) -- https://www.youtube.com/watch?v=k890Dr5OkZg&list=PLRJx8WOUx5XdosSIpI34ijGVAxCSG_jjT
Critique and Articles
» http://www.kdnuggets.com/2015/03/all-machine-learning-models-have-flaws.html
» http://www.kdnuggets.com/2015/03/machine-learning-data-science-common-mistakes.html
» http://ml.posthaven.com/machine-learning-done-wrong
» Forrester Wave reports: Big Data Streaming Analytics, Web Analytics, Cross-Channel Business Analytics, Big Data Hadoop Solutions, etc.
» Gartner Magic Quadrant reports, which classify participating companies into four quadrants: visionaries, leaders, challengers, and niche players.
Marty’s Book List
» An Introduction to Statistical Learning with Applications in R by James, Witten, Hastie & Tibshirani – This book is fantastic and has helped me quite a bit. It provides an overview of several methods, along with the R code for how to complete them. 426 Pages.
» The Elements of Statistical Learning by Hastie, Tibshirani & Friedman – This is an in-depth overview of methods, complete with theory, derivations & code. I’d definitely consider this a graduate level text. I’d also consider it one of the best books available on the topic of data mining. 745 Pages.
» A Programmer’s Guide to Data Mining by Ron Zacharski – This one is an online book, each chapter downloadable as a PDF. It’s also still in progress, with chapters being added a few times each year.
References (continued)
» Probabilistic Programming & Bayesian Methods for Hackers by Cam Davidson-Pilon – This book is absolutely fantastic. The author explains Bayesian statistics, provides several diverse examples of how to apply them, and includes Python code. Each chapter is an IPython notebook that can be downloaded.
» Think Bayes, Bayesian Statistics Made Simple by Allen B. Downey – Another great, easy to digest introduction to Bayesian statistics. The author’s premise is that Bayesian statistics is easier to learn & apply within the context of reusable code samples. It includes a number of examples complete with Python code. 195 Pages.
» Data Mining and Analysis, Fundamental Concepts and Algorithms by Zaki & Meira – This title is new to me. It’s a textbook that looks to be a complete introduction with derivations & plenty of sample problems. 599 Pages.
» An Introduction to Data Science by Jeffrey Stanton – Overview of the skills required to succeed in data science, with a focus on the tools available within R. It has sections on interacting with the Twitter API from within R, text mining, plotting, regression as well as more complicated data mining techniques. 195 Pages.
» Machine Learning by Chebira, Mellouk & others – This is an introduction to more advanced machine learning methods. It includes chapters on neural networks, discriminant analysis, natural language processing, regression trees & more, complete with derivations. Each chapter is downloadable as a PDF. 422 Pages.
» Machine Learning – The Complete Guide – This one is new to me. It’s a collection of Wikipedia articles organized into chapters & downloadable in a number of formats. I didn’t realize they did this, but it’s a great idea. Because it’s a collection of individual articles, it covers quite a bit more material than a single author could write.
» Bayesian Reasoning and Machine Learning by David Barber – This is an undergraduate textbook. It includes an overview, derivations, sample problems and MATLAB code. 648 Pages.
» A Course in Machine Learning by Hal Daumé III – Another complete introduction to machine learning topics. Each chapter is individually downloadable. 189 Pages.
» Information Theory, Inference and Learning Algorithms by David J.C. MacKay – Nice overview of machine learning topics, including an introduction and derivations. One nice feature of this book is that it has a chart that shows how various topics are related to one another. 628 Pages.
» Advanced R by Hadley Wickham – To get the most out of this book, you’ll need to have written a decent amount of code in R or another programming language.
Your turn
Questions | Comments
Use the chat function or toggle *6 to unmute/mute yourself.
For free flow discussion, this session is NOT being recorded.
Speakers Contact Info
Peter Zajonc, Epsilon
845-358-1955
[email protected]
Martin Rose, Acxiom
501-342-8432
[email protected]
Continue the conversation
LinkedIn: Official DMA + the Analytics Community Group
Twitter: @DMA_USA #dmacommunities
Suggest topics: [email protected]
Participate in the planning of projects
Connect with more Communities: www.thedma.org
THANK YOU FOR ATTENDING