Educational Data Mining - Pittsburgh Science of Learning Center


Data mining with DataShop
Ken Koedinger
CMU Director of PSLC
Professor of Human-Computer Interaction & Psychology
Carnegie Mellon University
Ryan S.J.d. Baker
PSLC/HCII
Carnegie Mellon University
Overview
Motivation for educational data mining (next)
DataShop
Learning curves to improve cognitive models
Past project example
Conclusion
What is educational data mining?

“The area of scientific inquiry centered
around the development of methods for
making discoveries within the unique kinds of
data that come from educational settings, and
using those methods to better understand
students and the settings which they learn
in.” (Baker, under review)
What is educational data mining?

More informally: using “large” data sets to
answer educational and psychological
questions


What “large” means is always changing
Developing methods or algorithms to aid in
discovery
What is educational data mining?

One popular data source is “instrumented”
computer tutors


Fine grained, longitudinal, often across contexts
Other data sources


Records of online courses (e.g. WebCAT)
District or university-level student records

Example: www.icpsr.umich.edu/IAED
Educational Data Mining is a hot
topic!



2008: First International Conference on
Educational Data Mining
2008: Launch of Journal of Educational Data
Mining
2009: Second International Conference on
Educational Data Mining


Submissions due in March 2009
www.educationaldatamining.org
Data Mining Questions &
Methods

How can we reliably model student knowledge
or achievement?
 Bayesian Knowledge Tracing


Simple type of “Bayes Net”, getting less simple all the time
Item Response Theory (IRT)



Basis for standardized tests, SAT, GRE, TIMSS…
Version of “logistic regression”
Many variations & generalizations …
 See slides of Brian Junker’s EDM08 invited talk
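A minimal sketch of the Bayesian Knowledge Tracing update named on this slide, assuming the usual four-parameter formulation (initial knowledge, learning/transition, guess, slip). The function names and the default parameter values are illustrative assumptions, not values from the talk.

```python
def bkt_update(p_known, correct, p_transit, p_guess, p_slip):
    """One knowledge-tracing step: condition on the response, then apply learning."""
    if correct:
        # A correct answer comes from knowing (and not slipping) or from guessing.
        evidence = p_known * (1 - p_slip)
        posterior = evidence / (evidence + (1 - p_known) * p_guess)
    else:
        # An incorrect answer comes from slipping or from not knowing (and not guessing).
        evidence = p_known * p_slip
        posterior = evidence / (evidence + (1 - p_known) * (1 - p_guess))
    # The student may also learn the skill at this opportunity.
    return posterior + (1 - posterior) * p_transit


def bkt_trace(responses, p_init=0.3, p_transit=0.1, p_guess=0.2, p_slip=0.1):
    """Return the estimated probability of mastery after each response (assumed defaults)."""
    p = p_init
    trace = []
    for correct in responses:
        p = bkt_update(p, correct, p_transit, p_guess, p_slip)
        trace.append(p)
    return trace
```

Calling bkt_trace([True, True, False]) returns one mastery estimate per response; the estimate rises after correct answers and falls after an error.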
Data Mining Questions &
Methods


What’s the nature of knowledge students are
learning?
How can we discover cognitive models of
student learning?
 Learning Factors Analysis (LFA)
 Extends IRT to account for learning
 Search algorithm: Discover cognitive
model(s) that capture how student learning
transfers over tasks over time
 Rule space, knowledge space, …
Data Mining Questions &
Methods

How can we model students, beyond just what they
know?
 Models of

Choices: Metacognitive & Motivational






Help-seeking
Gaming the System
Off-Task Behavior
Self-explanation
Affect
Involves prediction methods such as classification,
regression (not just linear regression)
Data Mining Questions &
Methods

What features of a tutor lead to the most
learning?
 Learning Decomposition


Explores different rates of learning due to
different forms of pedagogical support
Close relative of Learning Factors Analysis
Data Mining Questions &
Methods

How to extract reliable inferences about
causal mechanisms from correlations in
data?
 Causal modeling using Tetrad
Data Mining Questions &
Methods

And one generally useful tool for figuring out what’s
going on, in any of these cases:
Exploratory data analysis




Summary & visualization tools in DataShop
Tools in Excel
Clustering algorithms
Visualization packages
Overview
Motivation for educational data mining
DataShop (next)
Learning curves to improve cognitive models
Past project example
Conclusion
Find DataShop at learnlab.org/datashop
Video Intro of DataShop …

View here:
DataShop – Dataset Tabs
Datasets you can
view or edit. You
have to be a project
member or PI for the
dataset to appear
here.
Private datasets you
can’t view. Email us
and the PI to get
access.
Public datasets that
you can view only.
Analysis Tools






Dataset Info
Performance Profiler
Learning Curve
Error Report
Export
Sample Selector
Dataset Info
Papers and Files storage
Metadata for given dataset
PIs get ‘edit’ privileges; others must request them
Problem Breakdown table
Dataset Metrics
Performance Profiler
Multipurpose tool to help identify areas that are too hard or easy
View measures of
• Error Rate
• Assistance Score
• Avg # Hints
• Avg # Incorrect
• Residual Error Rate
Aggregate by
• Step
• Problem
• KC
• Dataset Level
Learning Curve
Visualizes changes in student performance over time
View by KC or Student, Assistance Score or Error Rate
Time is represented on the x-axis as ‘opportunity’, or the # of times a student (or students) had an opportunity to demonstrate a KC
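A rough sketch of how an error-rate learning curve of this kind could be computed from step records, assuming each record is a (student, KC, correct-on-first-attempt) triple in chronological order. The record layout and function name are assumptions for illustration, not DataShop's internal implementation.

```python
from collections import defaultdict


def learning_curve(steps):
    """Map each opportunity number to the error rate observed at that opportunity.

    steps: iterable of (student, kc, correct_first_attempt) in time order.
    """
    seen = defaultdict(int)    # opportunities so far per (student, kc)
    errors = defaultdict(int)  # incorrect first attempts per opportunity number
    totals = defaultdict(int)  # observations per opportunity number
    for student, kc, correct in steps:
        seen[(student, kc)] += 1
        opportunity = seen[(student, kc)]
        totals[opportunity] += 1
        if not correct:
            errors[opportunity] += 1
    return {opp: errors[opp] / totals[opp] for opp in sorted(totals)}
```

Later opportunities are usually reached by fewer students, so the right-hand end of such a curve rests on fewer observations and tends to be noisier.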
Error Report
View by Problem or KC
Provides a breakdown of problem information (by step) for fine-grained analysis of problem-solving behavior
Attempts are categorized by student
Export
• Two types of export available
• By Transaction
• By Step
• Anonymous, tab-delimited file
• Easy to import into Excel!
You can also export the
Problem Breakdown
table and LFA values!
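Besides Excel, a tab-delimited export can be read with a few lines of Python. This is a small sketch using only the standard library; the column names (student, kc, first_attempt) and the sample rows are made-up placeholders, not the actual DataShop export headers.

```python
import csv
import io


def read_export(handle):
    """Parse a tab-delimited export into a list of dicts keyed by column header."""
    return list(csv.DictReader(handle, delimiter="\t"))


# Tiny made-up sample; these column names are placeholders, not real export headers.
sample = (
    "student\tkc\tfirst_attempt\n"
    "s01\tcircle-area\tincorrect\n"
    "s01\tcircle-area\tcorrect\n"
    "s02\tcircle-area\tcorrect\n"
)
rows = read_export(io.StringIO(sample))

# Fraction of rows whose first attempt was not correct.
error_rate = sum(r["first_attempt"] != "correct" for r in rows) / len(rows)
```

For a real file, the same function can be given open(path, newline="", encoding="utf-8"), with the column names adjusted to match the export's header row.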
Sample Selector
Easily create a sample/filter to view a smaller subset of data
Shared (only owner can edit) and private samples
Filter by
• Condition
• Dataset Level
• Problem
• School
• Student
• Tutor Transaction
Help/Documentation
Glossary of common terms, tied in with PSLC Theory wiki
• Extensive documentation with examples
• Contextual by tool/report
• http://learnlab.web.cmu.edu/datashop/help
New Features
Manage Knowledge Component models
• Create, Modify & Delete KC models within DataShop
Addition of Latency Curves to Learning Curve Reporting
• Time to Correct
• Assistance Time
Problem Rollup & Export
Enhanced Contextual Help
Overview
Motivation for educational data mining
DataShop
Learning curves to improve cognitive models (next)
Past project example
Conclusion
Cognitive Modeling Challenge


Premise: High quality instructional design requires a
high quality cognitive model of student thinking
Problem: Creating such a Cognitive Model is hard to
get right



Hard to program, but more importantly …
A high quality cognitive model requires a deep
understanding of student thinking
Cognitive models created by intuition are often wrong
(e.g., Koedinger & Nathan, 2004)
Significance of improving a cognitive
model

A better cognitive model means better:




Assessment
Instructional feedback & hints (model tracing)
Activity selection & pacing (knowledge tracing)
Better cognitive models advance basic
cognitive science
Using student data to build better cognitive models
Cognitive Task Analysis methods
Think alouds, Difficulty Factors Assessment: General lecture Tuesday
Peer collaboration dialog analysis: TagHelper track
Data mining of student interactions with on-line tutors: DataShop track
Knowledge components
are the “germ theory” of
transfer
Germs are hidden elements that carry disease from one agent to
another
Knowledge components are hidden elements that carry learning
experiences from one situation to another -- they account for
transfer
DataShop Supports Theory
Integration


Makes micro theory concrete
Knowledge decomposability hypothesis


Acquisition of academic competencies can be
decomposed into units, called knowledge
components, that yield predictions about student
task performance & the transfer of learning.
Not obviously true

“learning, cognition, knowing, and context are
irreducibly co-constituted and cannot be treated
as isolated entities or processes” (Barab &
Squire, 2004)
Learning curves show performance changes over time
Learning curves:
 Student data
 Statistical model fit (blue line)
Based on micro level analysis:
 Learning event opportunities
 Averaged across knowledge components
Not a smooth learning curve -> this
knowledge component model is
wrong. Does not capture genuine
student difficulties.
This more specific knowledge
component (KC) model (2 KCs) is
also wrong -- still no smooth drop in
error rate.
Ah! Now we get smoother learning
curve. A more specific decomposition
(12 KCs) better tracks nature of
student difficulties & transfer from
one problem situation to another
(Rise near end due to fewer observations
biased toward poorer students)
Summary: KC
model as “germ
theory”
Without decomposition, using just a single “Geometry” KC,
no smooth learning curve.
But with decomposition,
12 KCs for area concepts,
a smooth learning curve.
Upshot: A decomposed KC model
fits learning & transfer data better
than a “faculty theory” of mind
Overview
Motivation for educational data mining
DataShop
Learning curves to improve cognitive models
Past project example (next)
Conclusion
Past Project Example




Rafferty (Stanford) & Yudelson (Pitt)
Analyzed a data set from Geometry
Applied Learning Factors Analysis (LFA)
Driving questions:


Are students learning at the same rate as
assumed in prior LFA models?
Do we need different cognitive models (KC models) to account for low-achieving vs. high-achieving students?
A Statistical Model for Learning Curves


Predicts whether student is correct depending on knowledge & practice
Additive Factor Model (Draney, et al. 1995, Cen, Koedinger, Junker, 2006)
Learning rate is different for different
skills, but not for different students
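The equation on this slide was an image and is not recoverable here. As a hedged sketch, the Additive Factor Model is commonly written as: log-odds of a correct response = student proficiency + sum over the step's KCs of (KC easiness + KC learning rate × prior opportunities on that KC). The parameter names below (theta, beta, gamma) follow that common form and are assumptions, not the slide's own notation.

```python
import math


def afm_probability(theta, kcs, beta, gamma, opportunities):
    """Predicted probability of a correct response under an additive factor model.

    theta: proficiency of this student (assumed name)
    kcs: knowledge components required by the step
    beta: dict of KC -> easiness intercept
    gamma: dict of KC -> learning rate (per KC, shared by all students)
    opportunities: dict of KC -> prior practice opportunities for this student
    """
    logit = theta + sum(
        beta[kc] + gamma[kc] * opportunities.get(kc, 0) for kc in kcs
    )
    return 1.0 / (1.0 + math.exp(-logit))
```

Because gamma is indexed by KC only, every student gains the same amount in log-odds per practice opportunity on a given skill, which is the assumption the driving questions above put to the test.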
Low-Start High-Learn (LSHL) group has a faster
learning rate than other groups of students
Rafferty & Yudelson Results 2

Is it “faster” learning or “different” learning?

Fit with a more compact model is better for low start high learn


Students with an apparent faster learning rate are learning a more
“compact”, general and transferable domain model
Resulted in best Young Researcher Track paper at AIED07
Overview
Motivation for educational data mining
DataShop
Learning curves to improve cognitive models
Past project example
Conclusion (next)
Lots of interesting questions to be
addressed with Ed Data Mining!!

Assessment questions
• Can on-line embedded assessment replace standardized tests?
• Can assessment be accurate if students are learning during test?
Learning theory questions
• What are the “elements of transfer” in human learning?
• Is learning rate driven by student variability or content variability?
• Can conceptual change be tracked & better understood?
Instructional questions
• What instructional moves yield the greatest increases in learning?
• Can we replace ANOVA with learning curve comparison to better evaluate learning experiments?
Metacognition & motivation questions
• Can student affect & motivation be detected in on-line click stream data?
• Can student metacognitive & self-regulated learning strategies be detected in on-line click stream data?
Data Mining-Data Shop Offerings
Data Mining Track:
Tues 9:15 Using DataShop for Exploratory Data Analysis
Tues 1:30 Learning from learning curves
Item Response Theory
Learning Factors Analysis
Wed 9:30 Discovery with Models
General lecture:
Tues 3:30 Educational Data Mining
Bayesian models of knowledge tracing
Causal models with Tetrad
Questions?
Extra slides …
Sample tutor interactions (from
1997 version) that generated
Geometry Area data set used in
example of learning curves …
TWO_CIRCLES_IN_SQUARE problem:
Initial screen
TWO_CIRCLES_IN_SQUARE problem:
An error a few steps later
TWO_CIRCLES_IN_SQUARE problem:
Student follows hint & completes prob
Learning curve contrast in Physics dataset …
Not a smooth learning curve -> this
knowledge component model is
wrong. Does not capture genuine
student difficulties.
More detailed cognitive model
yields smoother learning curve.
Better tracks nature of student
difficulties & transfer
(Few observations after 10
opportunities yields noisy data)
Best BIC (parsimonious fit) for
Default (original) KC model
Better than simpler Single-KC
model
And better than more complex
Unique-step (IRT) model
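The comparison above rests on BIC, which penalizes a model's fit by its number of parameters (lower is better). A minimal sketch of that comparison follows; the log-likelihoods, parameter counts and observation count are invented purely for illustration and are not the values behind these slides.

```python
import math


def bic(log_likelihood, n_params, n_obs):
    """Bayesian Information Criterion: k * ln(n) - 2 * ln(L). Lower is better."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood


# Hypothetical fits of three KC models to the same 5000 observations.
# Each entry is (log-likelihood, number of parameters); all numbers are made up.
fits = {
    "single-KC": (-3000.0, 10),
    "default": (-2700.0, 40),
    "unique-step": (-2600.0, 200),
}
scores = {name: bic(ll, k, 5000) for name, (ll, k) in fits.items()}
best = min(scores, key=scores.get)
```

In this invented example the unique-step model has the highest likelihood but pays a large parameter penalty, so the mid-sized model ends up with the lowest BIC, mirroring the pattern described on the slides.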
QuickTime™ and a
TIFF (LZW) decompressor
are needed to see this picture.