Special Topics in
Educational Data Mining
HUDK5199
Spring term, 2013
February 27, 2013
Today’s Class
• Feature Engineering and Distillation - How
Special Rule for Today
• Everyone Who Turned in a Homework
Participates
Let’s go back to the list of features
from the last class
• As I read features off
• If you used this feature (or something very
similar), raise your hand
For the features someone used
• Did it end up in your final model?
• Does this match the class’s overall intuition?
List other features
• I’d like everyone who turned in Assignment 4
• To tell me all of your other features that
ended up in final models
We now have…
• We now have a list of features that ended up
being used in models
So let’s…
• Go through how several of them were created
– Actually do it… Re-create it in real-time, or show us
your code…
• Everyone who turned in the homework will show
the class at least one feature
• No one can show a second feature until everyone
has had a chance to show at least one
Comments? Questions?
What tools were used?
• Did anyone use any additional tools?
• How else could you have created features?
Now let’s…
• Make a supermodel!
Comments or Questions
• About Assignment 5?
Final Thoughts?
If you enjoyed today’s class…
• At some point in the next 2 years, we’ll be
offering a Feature Engineering Design Studio
course…
Next Class
• Monday, March 4
• Automated Feature Creation and Selection
• Assignment Due: None
Excel
• Plan is to go as far as we can by 5pm
• We will continue after next class session
• Vote on which topics you most want to hear
about
Topics
• Using average, count, sum, stdev (asgn. 4 data set)
• Relative and absolute referencing (made up data)
• Copy and paste values only (made up data)
• Using sort, filter (asgn. 4 data set)
• Using countif (asgn. 4 data set)
• Making scatterplot (Jan. 28 class data set)
• Making histogram (asgn. 4 data set)
• Z-test (made up data)
• 2-sample t-test (made up data)
• Other topics?
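For anyone who wants to check their spreadsheet results outside Excel, here is a minimal sketch of the first and fifth topics (average, count, sum, stdev, countif) using only Python's standard library. The `scores` values are made up for illustration and are not the Assignment 4 data set; the `countif` helper is a hypothetical name, not a built-in.

```python
import statistics

# Made-up values for illustration only; not the Assignment 4 data set.
scores = [2, 4, 4, 4, 5, 5, 7, 9]


def countif(values, predicate):
    """Count the values for which predicate(value) is true (hypothetical helper)."""
    return sum(1 for v in values if predicate(v))


summary = {
    "average": statistics.mean(scores),   # comparable to Excel's AVERAGE
    "count": len(scores),                 # comparable to Excel's COUNT
    "sum": sum(scores),                   # comparable to Excel's SUM
    "stdev": statistics.stdev(scores),    # sample standard deviation, like Excel's STDEV
    "countif_gt_4": countif(scores, lambda v: v > 4),  # like COUNTIF(range, ">4")
}

for name, value in summary.items():
    print(name, value)
```

Note that `statistics.stdev` divides by n - 1 (the sample standard deviation), which is also what Excel's STDEV does; `statistics.pstdev` would be the counterpart of STDEVP.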
The End