Machine Learning
F# and Accord.NET
Alena Hall
• Software architect, MS in Computer Science
• Member of the F# Software Foundation Board of Trustees
• Researcher in the field of mathematical theoretical abstractions possible in modern programming concepts
• Speaker and active software engineering community member
@lenadroid
Machine Learning
Questions
• Why machine learning?
• What is the data?
• How?
Data Questions.
Data reality :\
Path to grasping machine learning and data science…
Contents
• Multiple Linear Regression
• Logistic Regression Classification
• K-Means Clustering
• What’s next?
F# for machine learning and data science!
Why F#?
1. Exploratory programming, interactive environment
2. Functional programming, referential transparency
3. Data pipelines
4. Algebraic data types and pattern matching
5. Strong typing, type inference, Type Providers
6. Units of measure
7. Concurrent, distributed and cloud programming
Data pipelines
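A minimal sketch of what a data pipeline can look like in F#, using the forward pipe operator `|>`. The `rentals` list and its values are made up for illustration.

```fsharp
// Hypothetical daily bike-rental counts (illustrative values only).
let rentals = [ 120; 85; 0; 240; 310; 0; 95 ]

// Each stage passes its result to the next one through the |> operator.
let averageOnActiveDays =
    rentals
    |> List.filter (fun count -> count > 0)   // drop days with no rentals
    |> List.map float                         // convert to float for averaging
    |> List.average

printfn "Average rentals on active days: %.1f" averageOnActiveDays
```

Reading top to bottom mirrors the order in which the data is transformed, which is the main appeal of the pipeline style.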
Algebraic data types
// Discriminated Union
Pattern matching
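A short sketch of a discriminated union together with pattern matching. The `Model` type and its cases are hypothetical names chosen to match the topics of this talk.

```fsharp
// Hypothetical model descriptions, expressed as a discriminated union.
type Model =
    | LinearRegression of weights: float list
    | LogisticRegression of weights: float list
    | KMeans of clusterCount: int

// Pattern matching handles every case; the compiler warns if one is missing.
let describe model =
    match model with
    | LinearRegression weights -> sprintf "Linear regression with %d weights" weights.Length
    | LogisticRegression weights -> sprintf "Logistic regression with %d weights" weights.Length
    | KMeans k -> sprintf "K-Means with %d clusters" k

printfn "%s" (describe (KMeans 3))
```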
Type Providers
Units of measure
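A small sketch of units of measure. The units `km` and `h` and the values are chosen only for illustration.

```fsharp
// Units of measure are checked at compile time and erased at run time.
[<Measure>] type km
[<Measure>] type h

let distance = 42.0<km>
let time = 2.0<h>

// The compiler infers the type float<km/h> for the result.
let speed = distance / time

// Mixing incompatible units does not compile:
// let wrong = distance + time

// Dividing by one unit of the same measure yields a plain float.
printfn "Speed: %.1f km/h" (speed / 1.0<km/h>)
```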
Linear Regression
How to predict?
1. Make a guess.
2. Measure how wrong the guess is.
3. Fix the error.
Make a guess!
MATH
Make a guess?
What does it mean?...
Hypothesis / guess:
weights
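In common textbook notation (an assumption here: θ denotes the weights and x_1 … x_n the input features), a linear hypothesis is usually written as:

$$h_\theta(x) = \theta_0 + \theta_1 x_1 + \dots + \theta_n x_n$$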
Find out our mistake…
Cost function / Mistake function:
… and minimize it:
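A common form of this cost function is the mean squared error below. The notation is an assumption: h_θ(x) is the guess for input x, m is the number of training examples, and (x^(i), y^(i)) is the i-th example.

$$J(\theta) = \frac{1}{2m}\sum_{i=1}^{m}\left(h_\theta\!\left(x^{(i)}\right) - y^{(i)}\right)^2, \qquad \min_{\theta} J(\theta)$$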
Mistake function looks like…
Global minimums
How to reduce the mistake?
Update each slope parameter until the minimum of the Mistake Function is reached:
Alpha: learning rate
Derivative: direction of moving
All parameters are updated simultaneously
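In the usual gradient descent notation (assumed here: θ_j is the j-th parameter, α the learning rate, J the mistake function), the update step reads:

$$\theta_j := \theta_j - \alpha\,\frac{\partial}{\partial \theta_j} J(\theta) \qquad \text{for all } j \text{ simultaneously}$$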
Fix the error
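The three steps (make a guess, measure the mistake, fix the error) can be sketched as batch gradient descent in plain F#. This is an illustration under assumptions, not the talk's Accord.NET code: the function names, the toy data, the learning rate and the iteration count are all hypothetical.

```fsharp
// Guess: weighted sum of the features (x.[0] is a constant 1.0 for the intercept).
let predict (theta: float[]) (x: float[]) =
    Array.fold2 (fun acc t xi -> acc + t * xi) 0.0 theta x

// One gradient descent step for the squared-error mistake function.
let gradientStep (alpha: float) (xs: float[][]) (ys: float[]) (theta: float[]) =
    let m = float xs.Length
    // Errors use the old theta, so all weights are updated simultaneously.
    let errors = Array.map2 (fun x y -> predict theta x - y) xs ys
    theta
    |> Array.mapi (fun j t ->
        let gradient = (Array.map2 (fun (x: float[]) e -> e * x.[j]) xs errors |> Array.sum) / m
        t - alpha * gradient)

// Repeat the step a fixed number of times, starting from all-zero weights.
let train (alpha: float) (iterations: int) (xs: float[][]) (ys: float[]) =
    let initial = Array.zeroCreate<float> xs.[0].Length
    List.fold (fun theta _ -> gradientStep alpha xs ys theta) initial [ 1 .. iterations ]

// Toy data generated from y = 1 + 2x (made up for illustration).
let xs = [| [| 1.0; 0.0 |]; [| 1.0; 1.0 |]; [| 1.0; 2.0 |]; [| 1.0; 3.0 |] |]
let ys = [| 1.0; 3.0; 5.0; 7.0 |]
let theta = train 0.1 5000 xs ys
printfn "theta = %A" theta
```

On this toy data the learned weights should approach the intercept 1 and slope 2 used to generate it. A learning rate that is too large makes the updates diverge, which is why alpha usually needs tuning.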
Multiple Linear Regression
X [ ] – Predictors:
Statistical data about bike rentals for previous years or months.
Y – Output:
Number of bike rentals we should expect today or on some other day in the future.
* Y is not nominal; here it is a value from a continuous numerical range.
Make a guess!
Fix the error
“Talk is cheap. Show me the code.”
Multiple linear regression: Bike rentals demand
What to remember?
Linear Regression
1. Simplest regression algorithm
2. Very fast: a prediction is a single weighted sum over the features
3. Good at numerical data with lots of features
4. Output is a value from a continuous numerical range
5. Linear hypothesis
6. Uses gradient descent
Logistic Regression
Hypothesis function
Estimated probability that Y = 1 on input X
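In standard notation (assumed here: θ is the weight vector and x the feature vector), the logistic hypothesis is the sigmoid of a linear combination of the features:

$$h_\theta(x) = \frac{1}{1 + e^{-\theta^{T} x}} = P(y = 1 \mid x;\ \theta)$$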
Mistake function
Mistake function is the cost for a single training data example
h(x)
Full mistake function
1. Uses the principle of maximum likelihood estimation.
2. We minimize it the same way as with Linear Regression
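The usual textbook form of these two functions is given below. The notation is an assumption: h_θ(x) is the estimated probability that y = 1, and m is the number of training examples.

$$\mathrm{Cost}\big(h_\theta(x), y\big) = \begin{cases} -\log\big(h_\theta(x)\big) & \text{if } y = 1 \\ -\log\big(1 - h_\theta(x)\big) & \text{if } y = 0 \end{cases}$$

$$J(\theta) = -\frac{1}{m}\sum_{i=1}^{m}\Big[\, y^{(i)} \log h_\theta\!\left(x^{(i)}\right) + \left(1 - y^{(i)}\right)\log\!\left(1 - h_\theta\!\left(x^{(i)}\right)\right) \Big]$$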
“Talk is cheap. Show me the code.”
Logistic Regression Classification Example
What to remember?
Logistic Regression
• Classification algorithm
• Relatively small number of predictors
• Uses the logistic function for the hypothesis
• Has a convex cost function
• Uses gradient descent for correcting the mistake
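The points above can be sketched in plain F# as follows. This is a minimal illustration under assumptions, not the talk's Accord.NET example: the function names, the toy data, the learning rate and the iteration count are hypothetical.

```fsharp
// Logistic (sigmoid) function used for the hypothesis.
let sigmoid z = 1.0 / (1.0 + exp (-z))

// Estimated probability that y = 1 for input x (x.[0] is a constant 1.0).
let probability (theta: float[]) (x: float[]) =
    Array.fold2 (fun acc t xi -> acc + t * xi) 0.0 theta x |> sigmoid

// One gradient descent step on the log-likelihood based mistake function.
let step (alpha: float) (xs: float[][]) (ys: float[]) (theta: float[]) =
    let m = float xs.Length
    let errors = Array.map2 (fun x y -> probability theta x - y) xs ys
    theta
    |> Array.mapi (fun j t ->
        let gradient = (Array.map2 (fun (x: float[]) e -> e * x.[j]) xs errors |> Array.sum) / m
        t - alpha * gradient)

// Repeat the step a fixed number of times, starting from all-zero weights.
let fit (alpha: float) (iterations: int) (xs: float[][]) (ys: float[]) =
    let start = Array.zeroCreate<float> xs.[0].Length
    List.fold (fun theta _ -> step alpha xs ys theta) start [ 1 .. iterations ]

// Toy data: the label is 1 for the two larger feature values (made up).
let samples = [| [| 1.0; 1.0 |]; [| 1.0; 2.0 |]; [| 1.0; 3.0 |]; [| 1.0; 4.0 |] |]
let labels = [| 0.0; 0.0; 1.0; 1.0 |]
let weights = fit 0.5 2000 samples labels

// Classify by thresholding the estimated probability at 0.5.
let classify x = if probability weights x >= 0.5 then 1 else 0
printfn "%d %d" (classify [| 1.0; 1.5 |]) (classify [| 1.0; 3.5 |])
```

The update step has the same shape as for linear regression; only the hypothesis changes from a weighted sum to the sigmoid of that sum.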
At this point…
Machine Learning
What society thinks I do…
What other programmers think I do…
What I really do is…
K-Means Clustering
What’s next?
I’m Lena
@lenadroid
Thank you!
What if it doesn’t work?
Algorithm debugging tips
• Try more data
• Try more features
• Try fewer features
• Try feature combinations
• Try polynomial features
• …
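As an illustration of the polynomial-features tip, a single input can be expanded into its powers before training. The helper name `polynomialFeatures` is hypothetical.

```fsharp
// Expand a single feature x into [| 1; x; x^2; ...; x^degree |].
let polynomialFeatures (degree: int) (x: float) =
    [| for power in 0 .. degree -> x ** float power |]

printfn "%A" (polynomialFeatures 3 2.0)
```

A linear model trained on the expanded features can then fit a curve instead of a straight line, at the price of a higher risk of overfitting.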
What else can go wrong?
Ideally... the hypothesis will
… just fit the model
Underfitting … Overfitting
Try out different values for the regularization parameter.
• Regularization…?
• Regularization parameter too large?
-> underfitting: the line is over-smoothed
• Regularization parameter too small?
-> overfitting: too closely fitted to the training data
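One common way to add regularization is an L2 penalty on the weights. Which form the talk used is not stated, so the formula below is an assumption: λ is the regularization parameter, m the number of training examples, n the number of features, and the intercept θ_0 is left unpenalized.

$$J(\theta) = \frac{1}{2m}\left[\sum_{i=1}^{m}\left(h_\theta\!\left(x^{(i)}\right) - y^{(i)}\right)^2 + \lambda \sum_{j=1}^{n} \theta_j^2\right]$$

A large λ pushes the weights toward zero and smooths the fit; λ = 0 recovers the unregularized cost.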