Item Response Theory Pattern Scoring


The ABC’s of Pattern Scoring
Dr. Cornelia Orr
Vocabulary
• Measurement – Psychometrics is a specialized application
• Classical test theory
• Item Response Theory – IRT (AKA latent trait theory)
• 1 – 2 – 3 parameter IRT models
• Pattern Scoring
General & Specialized
Measurement
• Assigning numbers to objects or events
• Ex. – time, height, earthquakes, hurricanes, stock market
Psychometrics
• Assigning numbers to psychological characteristics
• Ex. – personality, IQ, opinion, interests, knowledge
Different Theories of Psychometrics
Classical Test Theory
• Item discrimination values
• Item difficulty values (p-values)
• Guessing (penalty)
Number correct scoring
Item Response Theory
a) Item discrimination values
b) Item difficulty values
c) Guessing (pseudo-guessing) values
Pattern scoring
Similar constructs – Different derivations
Different Methods of Scoring
Number-Correct Scoring
• Simple Mathematics
• Raw scores (# of points)
– Mean, SD, SEM, % correct
• Number right scale
• Score conversions
– Scale scores, percentile ranks, etc.
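The number-correct quantities listed above can be sketched in a few lines of Python. The response matrix is hypothetical, and the SEM here is derived from KR-20 reliability, which is one common classical choice and an assumption on our part; the slides do not say which reliability estimate is used.

```python
import math

# Hypothetical scored responses (1 = correct, 0 = incorrect): 4 students x 5 items.
responses = [
    [1, 1, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [1, 1, 1, 1, 1],
    [1, 0, 0, 0, 0],
]

n_students = len(responses)
n_items = len(responses[0])

raw_scores = [sum(r) for r in responses]               # number of points per student
pct_correct = [100 * s / n_items for s in raw_scores]  # percent correct per student

mean = sum(raw_scores) / n_students
variance = sum((s - mean) ** 2 for s in raw_scores) / n_students  # population variance
sd = math.sqrt(variance)

# KR-20 reliability (assumed estimator), then SEM = SD * sqrt(1 - reliability).
p_values = [sum(r[i] for r in responses) / n_students for i in range(n_items)]
sum_pq = sum(p * (1 - p) for p in p_values)
kr20 = (n_items / (n_items - 1)) * (1 - sum_pq / variance)
sem = sd * math.sqrt(1 - kr20)

print(raw_scores, pct_correct, round(mean, 2), round(sd, 2), round(sem, 2))
```

Note that the first two students answer different items correctly but receive the same raw score of 3; number-correct scoring ignores which items were answered correctly.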
Pattern Scoring
• Complex Mathematics
• Maximum likelihood estimates
– Item statistics, student’s answer pattern, SEM
• Theta scale (mean=0, standard dev=1)
• Score conversions
– Scale scores, percentile ranks, etc.
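A minimal sketch of pattern scoring on the theta scale, assuming the 3-parameter logistic model with scaling constant D = 1. The item parameters, the grid search (operational programs often use an iterative solver instead), and the linear conversion `300 + 50 * theta` are all hypothetical choices for illustration, not values from the slides.

```python
import math

def p_correct(theta, a, b, c):
    """3PL probability of a correct response (D = 1 assumed)."""
    return c + (1 - c) / (1 + math.exp(-a * (theta - b)))

def log_likelihood(theta, items, pattern):
    """Log-likelihood of an answer pattern (list of 0/1) at a given theta."""
    total = 0.0
    for (a, b, c), u in zip(items, pattern):
        p = p_correct(theta, a, b, c)
        total += math.log(p) if u == 1 else math.log(1 - p)
    return total

def ml_theta(items, pattern, lo=-4.0, hi=4.0, steps=801):
    """Grid-search maximum likelihood estimate of theta on [lo, hi]."""
    grid = [lo + i * (hi - lo) / (steps - 1) for i in range(steps)]
    return max(grid, key=lambda t: log_likelihood(t, items, pattern))

def sem_theta(theta, items):
    """SEM as 1 / sqrt(test information) at theta under the 3PL model."""
    info = 0.0
    for a, b, c in items:
        p = p_correct(theta, a, b, c)
        info += a ** 2 * ((1 - p) / p) * ((p - c) / (1 - c)) ** 2
    return 1 / math.sqrt(info)

# Hypothetical item statistics (a, b, c) on the theta scale.
items = [(1.2, -1.0, 0.2), (1.0, -0.5, 0.2), (0.8, 0.0, 0.2),
         (1.0, 0.5, 0.2), (1.2, 1.0, 0.2)]
pattern = [1, 1, 1, 0, 0]  # the student's answer pattern

theta_hat = ml_theta(items, pattern)
sem = sem_theta(theta_hat, items)
scale_score = 300 + 50 * theta_hat  # hypothetical linear score conversion

print(round(theta_hat, 2), round(sem, 2), round(scale_score))
```

The inputs match the slide's list: item statistics and the student's answer pattern go in, and a theta estimate with its SEM comes out, ready for conversion to a reporting scale.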
Comparison: Number Correct and Pattern Scoring
Similarities
• The relationship of derived scores is the same
For example, a scale score obtained on a test corresponds to the same percentile for both methods.
Differences
• Methods of deriving scores
• The number of scale scores possible
– Number right = limited to the number of items
– IRT = unlimited or limited by the scale (ex. 100-500)
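The difference in the number of possible scores can be illustrated by simple counting. This sketch assumes a hypothetical 5-item test scored right/wrong; in practice several patterns may map to the same reported scale score, so the pattern count is an upper bound.

```python
from itertools import product

n_items = 5  # hypothetical test length

patterns = list(product([0, 1], repeat=n_items))  # every possible answer pattern
raw_scores = {sum(p) for p in patterns}           # distinct number-right scores

print(len(raw_scores))  # 6 distinct raw scores (0 through 5)
print(len(patterns))    # 32 distinct answer patterns
```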
Choosing the Scoring Method
• Which model?
• Simple vs. Complex?
• Best estimates?
• Advantages/Disadvantages?
Ex. – Why do the same number correct get different scale scores?
Ex. – Flat screen TV – how do they do that?
Disadvantages of IRT and Pattern Scoring
• Complex Mathematics – Technical
– Difficult to explain
– Difficult to understand
• It doesn’t add up!
• Perceived as Hocus Pocus
Advantages of IRT and Pattern Scoring
• Better estimates of an examinee’s ability – the score that is most likely, given the student’s responses to the questions on the test (maximum likelihood scoring)
• More information about students and items is used
• More reliability than number right scoring
• Less measurement error (SEM)
Item Characteristic Curve (ICC)
[Figure: Probability of Correct Response (0.0 to 1.0) plotted against Achievement Index (Theta) (-4 to 4) for an item with Discrimination = 1, Difficulty = 0.5, Pseudo-Guessing = 0.13]
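A curve of this kind can be sketched with the 3-parameter logistic function, using the parameter values shown in the figure. The scaling constant D = 1 is an assumption (the slide does not state it), and the function name `icc` is ours.

```python
import math

def icc(theta, a=1.0, b=0.5, c=0.13):
    """3PL item characteristic curve: probability of a correct response at theta.

    a = discrimination, b = difficulty, c = pseudo-guessing (D = 1 assumed).
    """
    return c + (1 - c) / (1 + math.exp(-a * (theta - b)))

# Evaluate the curve at the tick marks of the figure's horizontal axis.
for theta in (-4, -2, 0, 2, 4):
    print(theta, round(icc(theta), 3))
```

The curve approaches the pseudo-guessing value (0.13) for very low theta, approaches 1.0 for very high theta, and passes through the midpoint between them, 0.565, at theta equal to the difficulty.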
Examples
5 items (Effects of item discrimination)
No  Type  a       b        c
1   MC    0.0250  300.000  0.2
2   MC    0.0200  300.000  0.2
3   MC    0.0150  300.000  0.2
4   MC    0.0100  300.000  0.2
5   MC    0.0050  300.000  0.2
4 examinees’ response patterns (1=correct)
Pattern  SEM  SS
12345
11100    39   300
01110    46   278
00111    61   258
10011    94   260
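The effect of item discrimination on these four patterns can be explored with a rough sketch. It assumes a 3PL model with D = 1 applied directly on the scale-score metric and a grid search over 100 to 500 (borrowed from the scale example earlier in the deck). The slides do not give the estimation details, so this sketch is not expected to reproduce the SS and SEM values shown; it only illustrates why the same number correct can lead to different scores.

```python
import math

# Item parameters (a, b, c) copied from the example above.
items = [(0.0250, 300.0, 0.2), (0.0200, 300.0, 0.2), (0.0150, 300.0, 0.2),
         (0.0100, 300.0, 0.2), (0.0050, 300.0, 0.2)]
patterns = ["11100", "01110", "00111", "10011"]  # each has 3 items correct

def p_correct(score, a, b, c):
    """3PL probability of a correct response (D = 1 assumed)."""
    return c + (1 - c) / (1 + math.exp(-a * (score - b)))

def log_likelihood(score, pattern):
    """Log-likelihood of an answer pattern (string of '0'/'1') at a given score."""
    ll = 0.0
    for (a, b, c), u in zip(items, pattern):
        p = p_correct(score, a, b, c)
        ll += math.log(p if u == "1" else 1 - p)
    return ll

def ml_score(pattern, lo=100, hi=500):
    """Most likely integer score in [lo, hi]; the range is an assumption."""
    return max(range(lo, hi + 1), key=lambda s: log_likelihood(s, pattern))

estimates = {p: ml_score(p) for p in patterns}
for p in patterns:
    print(p, estimates[p])
```

Under these assumptions, correct answers on the more discriminating items (items 1 to 3) pull the estimate above 300, while correct answers only on the weakly discriminating items (items 3 to 5) leave it below 300.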
Examples
5 items (Effects of item difficulty)
No     Type  a       b        c
1   1  MC    0.0150  250.000  0.1
2   1  MC    0.0150  275.000  0.1
3   1  MC    0.0150  300.000  0.1
4   1  MC    0.0150  325.000  0.1
5   1  MC    0.0150  350.000  0.1
4 examinees’ response patterns (1=correct)
Pattern  SEM  SS
12345
11100    43   300
01110    43   305
00111    43   299
10011    43   310
Missing easy items can result in a lower score.
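One way to see why the pattern matters here is to score the same four patterns with and without a guessing parameter. This is a sketch under stated assumptions: a logistic model with D = 1 on the scale-score metric, a grid search over 100 to 500, and the a and b values from the second example. It is not expected to reproduce the SS values shown on the slide.

```python
import math

# (a, b) copied from the item-difficulty example; c is varied below.
item_ab = [(0.0150, 250.0), (0.0150, 275.0), (0.0150, 300.0),
           (0.0150, 325.0), (0.0150, 350.0)]
patterns = ["11100", "01110", "00111", "10011"]  # each has 3 items correct

def ml_score(pattern, c, lo=100, hi=500):
    """Most likely integer score in [lo, hi] for a given guessing value c."""
    def log_likelihood(score):
        ll = 0.0
        for (a, b), u in zip(item_ab, pattern):
            p = c + (1 - c) / (1 + math.exp(-a * (score - b)))
            ll += math.log(p if u == "1" else 1 - p)
        return ll
    return max(range(lo, hi + 1), key=log_likelihood)

no_guessing = {p: ml_score(p, c=0.0) for p in patterns}    # equal a, no guessing
with_guessing = {p: ml_score(p, c=0.1) for p in patterns}  # c from the example

for p in patterns:
    print(p, no_guessing[p], with_guessing[p])
```

With equal discriminations and no guessing, all four patterns receive the same estimate, because the number correct carries all the information. Once guessing is allowed, correct answers on hard items combined with misses on easy items look more like lucky guesses, so in this sketch the pattern 00111 is scored lower than 11100.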