Intelligent Learning Object Guide (iLOG)
A Framework for Automatic Empirically-Based Metadata Generation

S.A. Riley^a, L.D. Miller^a, L.-K. Soh^a, A. Samal^a, and G. Nugent^b
^a University of Nebraska—Lincoln, Department of Computer Science and Engineering
^b University of Nebraska—Lincoln, Center for Research on Children, Youth, Families and Schools
Overview
• Introduction:
  – What is a Learning Object (LO)?
  – Why do we need LO metadata?
  – Metadata problems and iLOG solution
• iLOG Framework:
  – LO Wrapper
  – MetaGen (metadata generator):
    · Data Logging
    · Data Extraction
    · Data Analysis (feature selection, rule mining, statistics)
• Conclusions and Future Work
Introduction: What is a learning object?
• Self-contained learning content
• Ideally, each covers a single topic
• Serves as a building block for lessons, modules, or courses
• Can be reused in multiple instructional contexts
[Diagram: a Learning Object paired with its LO Metadata]
Introduction: What is a learning object?
• The iLOG LOs contain a tutorial, exercises, and an assessment
• Each covers a 'bite-sized' introductory computer science topic
[Diagram: iLOG learning object structure — content (tutorial, exercises, assessment) plus metadata]
Introduction: Why do we need LO metadata?
• Repositories for LOs are being constructed
• However, there are barriers to effective utilization of these repositories:
  – Learning Context: not all LOs, even on the same topic, are suitable for use in a given learning context
  – Uncertainty: we cannot be certain what will happen with real-world usage
  – Search and Retrieval: current metadata is not machine-readable, and thus is not adequate to automate the search for LOs
[Diagram: an LO repository containing many Learning Objects, each paired with its LO Metadata]
Introduction: Why do we need LO metadata?
• Learning Context:
  – Students are highly varied: pre-existing knowledge, cultural background, motivation, self-efficacy, etc.
• Uncertainty:
  – We cannot be certain what will happen when actual students use an actual LO; for example, an LO may be:
    · Good for students with low self-efficacy
    · Affected by an inherent gender bias
    · Bad for students without Calculus experience
• Search and Retrieval:
  – Metadata is fundamental to an instructor's ability to use LOs:
    · Guides the LO selection process
    · Helps prevent the feeling that e-learning is 'too complicated'
So… how do we enable instructors to locate appropriate LOs for their students?
Introduction: Metadata problems and iLOG solution
• Current metadata standards are insufficient (Friesen, 2008)
• There are ample opportunities for making e-learning more "intelligent" (Brooks et al., 2006)

Current Metadata:
• Manual generation by the course designer
• Based only on designer intuition
• Metadata format inconsistent / incomplete
• Human- but not machine-readable

Ideal Metadata:
• Automated generation
• Based on empirical usage
• Consistent metadata suitable for guiding LO selection
• Both human- and machine-readable
Introduction: Metadata problems and iLOG solution
The iLOG solution is:
• General: iLOG is based on established learning standards
  – We use the SCORM learning object standard, the IEEE LOM metadata standard, and the Blackboard LMS
  – Furthermore, it is compatible with existing LOs and does not require modification to the LOs (noninvasive)
  – The iLOG framework can also be applied to other standards
• Automatic: iLOG metadata is automatically generated and updated
• Interpretable: iLOG metadata is both human- and machine-readable
Introduction: Metadata problems and iLOG solution
• LO Wrapper: logs student behaviors during LO use
• MetaGen: generates empirical usage metadata using data mining techniques
• Works noninvasively with pre-existing LOs on standard learning management systems (LMSs)
[Diagram: several Learning Objects, each surrounded by an LO Wrapper and paired with LO Metadata, connected to a Learning Management System (LMS)]
Related Work
• Automatic metadata generation:
  – Primarily focuses on content taxonomies (Roy et al., 2008; Jovanovic et al., 2006)
• Mining student behavior log files:
  – Mining has been shown to have a positive impact on instruction and learning (Kobsa et al., 2007)
• Standardization of educational log file data:
  – Significant progress has been made with the tutor-message format standard (PSLC DataShop)
Overview
• Introduction:
  – What is a Learning Object (LO)?
  – Why do we need LO metadata?
  – Metadata problems and iLOG solution
• iLOG Framework:
  – LO Wrapper
  – MetaGen (metadata generator):
    · Data Logging
    · Data Extraction
    · Data Analysis (feature selection, rule mining, statistics)
• Conclusions and Future Work
iLOG Framework
[Diagram: the LO Wrapper surrounds the Learning Object and its LO Metadata and produces log files; within MetaGen, Data Logging stores the log files and existing metadata in a Database, Data Extraction builds the iLOG dataset, Data Analysis performs feature selection (yielding a feature subset) and rule mining, and Statistics Generation produces usage statistics; the resulting rules and statistics feed back into the LO Metadata]
Two components: the LO Wrapper and MetaGen
iLOG Framework: LO Wrapper
[Diagram: the LO Wrapper surrounding a Learning Object and its LO Metadata]
LO Wrapper:
• 'Wraps' around an existing LO
• Intercepts student interactions and logs them to a database (see the sketch below)
• Does not require changing the LO
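Although the actual wrapper runs as client-side JavaScript (see the implementation slide), a minimal Java sketch of the kind of timestamped interaction event it might capture can make the logging concrete. Every field name here is a hypothetical illustration, not iLOG's actual schema.

```java
import java.time.Instant;

// Hypothetical shape of one logged interaction; the real iLOG wrapper
// captures events in JavaScript and posts them to the server side.
public record InteractionEvent(
        String studentId,   // anonymized learner identifier
        String loId,        // which learning object was in use
        String eventType,   // e.g., "page_view", "click", "answer_submitted"
        String target,      // e.g., tutorial page or question identifier
        Instant timestamp) {

    public static void main(String[] args) {
        InteractionEvent e = new InteractionEvent(
                "s-1042", "logic-2", "page_view", "tutorial/page-3", Instant.now());
        System.out.println(e); // in iLOG this would be sent to Data Logging instead
    }
}
```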
iLOG Framework: MetaGen
[Diagram: within MetaGen, Data Logging feeds the Database, Data Extraction builds the iLOG dataset, and Data Analysis performs feature selection, rule mining, and statistics generation]
MetaGen modules:
• Data Logging, Data Extraction, Data Analysis
iLOG Framework: MetaGen—Logging
[Diagram: the LO Wrapper sends log files through Data Logging into the Database]
Potential data sources:
• Interactions: clicks, time spent, etc.
• Surveys: demographic, motivation, self-efficacy, evaluation
• Assessment scores
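The deployment described later uses PHP to write events into MySQL; as a rough, language-neutral illustration, here is a minimal JDBC sketch of what the Data Logging step does. The connection URL, credentials, and the interaction_log table are hypothetical.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.sql.Timestamp;

// Minimal sketch of the Data Logging step: persist one interaction event.
// Assumes a MySQL JDBC driver on the classpath and a hypothetical
// `interaction_log` table; iLOG itself does this with PHP.
public class DataLogging {
    public static void logEvent(String studentId, String loId,
                                String eventType, String target) throws SQLException {
        String sql = "INSERT INTO interaction_log "
                   + "(student_id, lo_id, event_type, target, logged_at) "
                   + "VALUES (?, ?, ?, ?, ?)";
        try (Connection conn = DriverManager.getConnection(
                     "jdbc:mysql://localhost/ilog", "ilog_user", "secret");
             PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setString(1, studentId);
            ps.setString(2, loId);
            ps.setString(3, eventType);
            ps.setString(4, target);
            ps.setTimestamp(5, new Timestamp(System.currentTimeMillis()));
            ps.executeUpdate();
        }
    }

    public static void main(String[] args) throws SQLException {
        logEvent("s-1042", "logic-2", "page_view", "tutorial/page-3");
    }
}
```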
iLOG Framework: MetaGen—Logging
• Data sources used in our iLOG deployment:

  Static Learner Data: baseline motivation, baseline self-efficacy, gender, major, GPA, SAT/ACT score, …
  Static LO Data: topic, length, degree of difficulty, level of feedback, Bloom's level for assessment questions, …
  Interaction Data: total time on tutorial, total time on exercises, total time on assessment, min time spent on a tutorial page, max time spent on a tutorial page, avg. time per assessment question, …
iLOG Framework: MetaGen—Extraction
[Diagram: Data Extraction queries the Database and produces the iLOG dataset]
Data Extraction:
• Uses a Java application to query the relational database and extract a 'flat dataset' suitable for data mining:
  – Student behaviors: average time per tutorial page, total time on assessment, etc.
  – Student characteristics: total motivation self-rating, GPA, gender, etc.
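A minimal sketch of what this extraction could look like: query the relational log, aggregate per student, and write one flat CSV row per student. The schema (interaction_log, students) and the derived features are hypothetical; the real iLOG dataset contains many more columns.

```java
import java.io.PrintWriter;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

// Sketch of Data Extraction: flatten the relational log into one row per
// student, suitable for data mining. Table/column names are hypothetical.
public class DataExtraction {
    public static void main(String[] args) throws Exception {
        String sql = "SELECT student_id, gender, gpa, "
                   + "       SUM(seconds_on_page) AS total_tutorial_seconds, "
                   + "       AVG(seconds_on_page) AS avg_tutorial_seconds "
                   + "FROM interaction_log JOIN students USING (student_id) "
                   + "WHERE event_type = 'page_view' "
                   + "GROUP BY student_id, gender, gpa";
        try (Connection conn = DriverManager.getConnection(
                     "jdbc:mysql://localhost/ilog", "ilog_user", "secret");
             Statement st = conn.createStatement();
             ResultSet rs = st.executeQuery(sql);
             PrintWriter out = new PrintWriter("ilog_dataset.csv")) {
            out.println("student_id,gender,gpa,total_tutorial_seconds,avg_tutorial_seconds");
            while (rs.next()) {
                out.printf("%s,%s,%.2f,%d,%.1f%n",
                        rs.getString("student_id"), rs.getString("gender"),
                        rs.getDouble("gpa"), rs.getLong("total_tutorial_seconds"),
                        rs.getDouble("avg_tutorial_seconds"));
            }
        }
    }
}
```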
iLOG Framework: MetaGen—Analysis
[Diagram: Data Analysis (feature selection) consumes the iLOG dataset and produces a feature subset]
Data Analysis (feature selection):
• Uses an ensemble of feature selection algorithms
• Seeks to identify student behaviors and characteristics that are relevant to learning outcomes
iLOG Framework: MetaGen—Analysis
• Feature selection (FS) is used to find a subset of variables (features) that is sufficient to describe a dataset (Guyon et al., 2003)
• Different techniques may generate different results
• Our goal, however, was to find ALL features relevant to learning outcomes
• Thus, the feature selection ensemble members 'vote' on which features they identify as most relevant (see the sketch below)
[Venn diagram: the features selected by ensemble members FS#1, FS#2, and FS#3 within the set of all features]
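A minimal sketch of such an ensemble using Weka's attribute selection API. The deck says Weka is used but does not name the member algorithms, so the three evaluators here are illustrative: each member ranks the attributes, and its top-k picks count as votes.

```java
import java.util.HashMap;
import java.util.Map;

import weka.attributeSelection.ASEvaluation;
import weka.attributeSelection.AttributeSelection;
import weka.attributeSelection.GainRatioAttributeEval;
import weka.attributeSelection.InfoGainAttributeEval;
import weka.attributeSelection.Ranker;
import weka.attributeSelection.ReliefFAttributeEval;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

// Sketch of the feature selection ensemble: each member ranks attributes,
// and the top-k picks from each member count as 'votes' for those attributes.
public class FeatureSelectionEnsemble {
    public static void main(String[] args) throws Exception {
        Instances data = new DataSource("ilog_dataset.arff").getDataSet();
        data.setClassIndex(data.numAttributes() - 1); // pass/fail outcome

        ASEvaluation[] members = { new InfoGainAttributeEval(),
                new GainRatioAttributeEval(), new ReliefFAttributeEval() };
        Map<String, Integer> votes = new HashMap<>();
        int k = 5; // each member votes for its top-5 attributes

        for (ASEvaluation eval : members) {
            AttributeSelection selector = new AttributeSelection();
            Ranker ranker = new Ranker();
            ranker.setNumToSelect(k);
            selector.setEvaluator(eval);
            selector.setSearch(ranker);
            selector.SelectAttributes(data);
            for (int idx : selector.selectedAttributes()) {
                if (idx == data.classIndex()) continue; // Ranker appends the class index
                votes.merge(data.attribute(idx).name(), 1, Integer::sum);
            }
        }
        votes.forEach((attr, n) -> System.out.println(attr + " selected " + n + " time(s)"));
    }
}
```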
iLOG Framework: MetaGen—Analysis
Notable results:
• Relevant features varied widely across LOs
• Discovered unexpected patterns: possible gender bias, Calculus bias, etc.

Logic 2:
  Attribute                        Number of Times Selected
  highestMath                      16
  gender                           13
  takenCalculus                    13
  assessStdDevSecAboveAvg?         13
  wasAnyPartConfusing?             13

Searching:
  Attribute                        Number of Times Selected
  GPA                              14
  assessMinSecPageBelowAvg?        11
  assessmentMinSecondsOnAPage      10
  believeLODifficultToUnderstand   10
  courseLevel                      9
iLOG Framework: MetaGen—Analysis
[Diagram: Rule Mining consumes the feature subset produced by Feature Selection]
Rule Mining:
• Uses the Tertius algorithm for predictive rule mining
• Generates rules from the selected features, along with rule strengths:
  takenCalculus? = no → fail (.52)
  currentTotalMotivationAboveAvg? = no → fail (.52)
  gender = female → fail (.36)
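A minimal invocation sketch, assuming the Tertius implementation that shipped with older Weka releases (the implementation slide names Weka and the Flach (2001) algorithm); the input file name is hypothetical.

```java
import weka.associations.Tertius;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

// Sketch of the rule mining step using the Tertius associator bundled with
// older Weka releases (e.g., 3.4/3.5). Tertius searches for predictive rules
// and reports a confirmation score ("strength") for each one.
public class RuleMining {
    public static void main(String[] args) throws Exception {
        // Hypothetical file: the selected-feature subset plus the pass/fail class.
        Instances subset = new DataSource("ilog_feature_subset.arff").getDataSet();
        subset.setClassIndex(subset.numAttributes() - 1);

        Tertius tertius = new Tertius();
        tertius.buildAssociations(subset);
        // Prints rules such as "gender = female ==> outcome = fail" with strengths.
        System.out.println(tertius);
    }
}
```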
iLOG Framework: MetaGen—Analysis
[Diagram: Statistics Generation consumes the iLOG dataset and produces usage statistics]
Statistics Generation:
• Empirical data: time to complete, pass/fail rates, and student ratings of the LO:
  successRate = 51%
  averageTime = 433 seconds
  averageStudentRating = 4.3/5.0
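These statistics are straightforward aggregates over per-student sessions. A minimal sketch, with a hypothetical StudentSession record standing in for the extracted data:

```java
import java.util.List;

// Sketch of Statistics Generation from per-student session summaries.
// The StudentSession record and its sample values are hypothetical.
public class StatisticsGeneration {
    record StudentSession(boolean passed, int totalSeconds, double rating) {}

    public static void main(String[] args) {
        List<StudentSession> sessions = List.of(
                new StudentSession(true, 410, 4.5),
                new StudentSession(false, 520, 4.0),
                new StudentSession(true, 370, 4.4));

        double successRate = 100.0
                * sessions.stream().filter(StudentSession::passed).count() / sessions.size();
        double averageTime = sessions.stream()
                .mapToInt(StudentSession::totalSeconds).average().orElse(0);
        double averageRating = sessions.stream()
                .mapToDouble(StudentSession::rating).average().orElse(0);

        System.out.printf("successRate = %.0f%%%n", successRate);
        System.out.printf("averageTime = %.0f seconds%n", averageTime);
        System.out.printf("averageStudentRating = %.1f/5.0%n", averageRating);
    }
}
```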
iLOG Framework: MetaGen—Analysis
Logic 2—Intro CS for non-majors (successRate = 51%, averageTime = 433 seconds, averageStudentRating = 4.3/5.0):
  assessmentStdDevSecondsAboveAvg? = yes → fail (.35)
  assessmentMaxSecondsOnAQuestion = high → fail (.33)
  highestMath = precalculus → fail (.28)
  gender = female → fail (.24)
Logic 2—Intro CS for majors (successRate = 38%, averageTime = 688 seconds, averageStudentRating = 4.16/5.0):
  baselineStdDevMotivation = low → fail (.72)
  takenCalculus? = no → fail (.52)
  currentTotalMotivationAboveAvg? = no → fail (.52)
Logic 2—Honors Intro CS for majors (successRate = 55%, averageTime = 799 seconds, averageStudentRating = 3.43/5.0):
  OpinionOfLOUsability = negative → fail (.59)
  BelieveLOAnAidToUnderstanding = yes → pass (.49)
  BelieveLONeedsMoreDetail = yes → fail (.43)
  gender = female → fail (.36)

There appear to be different predictors of success in different learning contexts:
• Honors: student impression of the LO, gender
• Majors: motivation, math experience
• Non-majors: long time spent on the assessment, math experience, gender
iLOG Framework: MetaGen—Analysis
(Same Logic 2 results as on the previous slide.)
Inverse relationship between time spent on an LO and student ratings:
• Advanced students may have higher expectations (lower ratings)
• Advanced students may care more about the material (more time spent)
iLOG Framework: MetaGen—Analysis
[Diagram: rules from Rule Mining and usage statistics from Statistics Generation combine into Rules and Statistics, which feed the LO Metadata]
Rules and Statistics:
• Usage statistics and rules are combined to form the empirical usage metadata
iLOG Framework: Our Implementation
LO Wrapper:
• An HTML document that uses JavaScript to record and timestamp student interactions with the LO (e.g., page navigation, clicks on a page, etc.)
• Uses a modification of the Easy SCO Adapter(1) to interface with the SCORM API and retrieve student assessment results from the LMS
• Uses JavaScript to transmit interaction data to MetaGen
MetaGen:
• Data logging: uses PHP to store student interaction data in a MySQL database
• Data extraction: uses Java to query the database and process the data into the iLOG dataset
• Data analysis: uses the Weka (Witten, 2005) implementations of several feature selection algorithms to generate the iLOG data subset, and the Tertius (Flach, 2001) predictive rule mining algorithm to generate empirical usage metadata rules
(1) http://www.ostyn.com/standards/demos/SCORM/wraps/easyscoadapterdoc.htm#license
Overview
• Introduction:
  – What is a Learning Object (LO)?
  – Why do we need LO metadata?
  – Metadata problems and iLOG solution
• iLOG Framework:
  – LO Wrapper
  – MetaGen (metadata generator):
    · Data Logging
    · Data Extraction
    · Data Analysis (feature selection, rule mining, statistics)
• Conclusions and Future Work
Conclusions
iLOG: a framework for automatic, empirical metadata generation:
• LO Wrapper component:
  – 'Wraps' noninvasively around pre-existing learning objects (LOs)
  – Automatically collects and logs student interaction data
  – Resulting LOs can be played on a standard LMS, such as Blackboard
• MetaGen component (metadata generator):
  – Uses data mining to create empirical usage metadata:
    · Feature selection: provides insights into which student characteristics and behaviors may contribute to success in different learning contexts
    · Rule mining: uses salient features to generate rules predicting success
    · Usage statistics: empirical evidence of time to complete, scores, etc.
• iLOG's empirical usage metadata should enable instructors to locate LOs that are appropriate to their students' learning context
Future Work: Closing the Loop
[Diagram: the full iLOG pipeline, with a new step writing Rules and Statistics back into the LO Metadata]
• Method to automatically write empirical usage metadata to the LO metadata file
• Method to integrate new metadata with existing metadata
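A sketch of what the first item could look like: appending an empirical-usage block to an LO's IEEE LOM XML file with the standard Java DOM API. The <empiricalUsage> element and its children are assumptions for illustration; a real implementation would follow LOM's extension conventions.

```java
import java.io.File;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Sketch of writing empirical usage metadata back into a LOM XML file.
// The <empiricalUsage> element and its children are hypothetical; real
// placement would follow IEEE LOM extension conventions.
public class MetadataWriter {
    public static void main(String[] args) throws Exception {
        File lomFile = new File("logic2_lom.xml"); // hypothetical LOM file
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().parse(lomFile);

        Element usage = doc.createElement("empiricalUsage");

        Element stat = doc.createElement("statistic");
        stat.setAttribute("name", "successRate");
        stat.setTextContent("51%");
        usage.appendChild(stat);

        Element rule = doc.createElement("rule");
        rule.setAttribute("strength", "0.52");
        rule.setTextContent("takenCalculus? = no -> fail");
        usage.appendChild(rule);

        doc.getDocumentElement().appendChild(usage);
        TransformerFactory.newInstance().newTransformer()
                .transform(new DOMSource(doc), new StreamResult(lomFile));
    }
}
```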
References
• IEEE 1484.12.1-2002 Standard for Learning Object Metadata (LOM). Retrieved January 7, 2009, from http://ltsc.ieee.org/wg12/files/LOM_1484_12_1_v1_Final_Draft.pdf
• N. Friesen, The International Learning Object Metadata Survey. Retrieved August 7, 2008, from http://www.irrodl.org/index.php/irrodl/article/view/195/277/
• C. Brooks, J. Greer, E. Melis, C. Ullrich, Combining ITS and eLearning Technologies: Opportunities and Challenges, Proc. 8th Int. Conf. on Intelligent Tutoring Systems (2006), 278-287.
• D. Roy, S. Sarkar, S. Ghose, Automatic Extraction of Pedagogic Metadata from Learning Content, Int. J. of Artificial Intelligence in Education 18 (2008), 287-314.
• J. Jovanovic, D. Gasevic, V. Devedzic, Ontology-Based Automatic Annotation of Learning Content, Int. J. on Semantic Web and Information Systems 2(2) (2006), 91-119.
• B. Jong, T. Chan, Y. Wu, Learning Log Explorer in E-Learning Diagnosis, IEEE Transactions on Education 50(3) (2007), 216-228.
• E. Garcia, C. Romero, S. Ventura, C. Castro, An Architecture for Making Recommendations to Courseware Authors Using Association Rule Mining and Collaborative Filtering, User Modeling and User-Adapted Interaction (to appear).
• E. Kobsa, V. Dimitrova, R. Boyle, Adaptive Feedback Generation to Support Teachers in Web-Based Distance Education, User Modeling and User-Adapted Interaction 17 (2007), 379-413.
• I. Guyon, A. Elisseeff, An Introduction to Variable and Feature Selection, Journal of Machine Learning Research 3 (2003), 1157-1182.
• P.A. Flach, N. Lachiche, Confirmation-Guided Discovery of First-Order Rules with Tertius, Machine Learning 42 (2001), 61-95.
• I.H. Witten, E. Frank, Data Mining: Practical Machine Learning Tools and Techniques, 2nd Edition, Morgan Kaufmann, San Francisco, 2005.
Contact and Acknowledgement
iLOG project website: http://cse.unl.edu/agents/ilog
Authors: S.A. Riley^a, L.D. Miller^a, L.-K. Soh^a, A. Samal^a, and G. Nugent^b
Email: [email protected], [email protected], [email protected], [email protected], [email protected]
• This material is based upon work supported by the National Science Foundation under Grant No. 0632642 and an NSF GAANN fellowship.