
Grade 3 FCAT – Test Construction & Equating
June 1, 2007
Cornelia S. Orr, Assistant Deputy Commissioner of Accountability, Research, and Measurement (ARM)
Office of Assessment and School Performance
Florida Department of Education
“Experience teaches only the teachable.”
Aldous Huxley (1894-1963)
Topics
• The Grade 3 Test in 2006
• Test Construction
• Process and Product
• Science and Art
• Psychometric Primer
• Test Calibration and Equating
The Grade 3 Test in 2006
• Passages – Questions – Forms
• Student scores based on 5 passages & 45 questions
• 30 different forms, each with 1 passage & 7-8 questions
• Forms are used for anchor and field test questions
• One of the 6 passage positions is used for anchor and field test questions
2006 Grade 3 FCAT Test Passages and Positions
[Diagram: the six passage positions arranged across the two sessions, Day 1 – Session 1 and Day 2 – Session 2]
The Grade 3 Test in 2006
2006 Grade 3 FCAT Test Passages and Positions

Day/Session   Passage Position   Number of Questions   Passage Description
1             1                  8                     Ladybird, Ladybird, Fly Away Home (Lit.)
1             2                  7 or 8                Anchor and Field Test Passages (Varies)
1             3                  10                    A Gift of Trees (Inform.)
2             4                  13                    Swim, Baby, Swim (Lit.)
2             5                  8                     Slip, Slop, Slap/Sunny Sidebar (Inform.)
2             6                  6                     Making Spring (Lit.)
              TOTAL              52-53
“Test Construction”
• Process of building the test
• Occurs the summer before a test
• Based on available passages, questions, and statistics
• Guidelines for building the test
• Test Construction Specifications
• Building the test is an iterative process
Test Construction Cycles
[Timeline, 1999-2006: each annual cycle includes Item Review Meetings (Oct/Nov in most years; Nov/Dec and May/Aug in others), Field Test Construction (Sept/Oct in the first cycle, June thereafter), June Test Construction, a March Field Test, and that year's Test Administration (2001 Test Admin. through 2006 Test Admin.)]
Test Construction Specifications – 1
• Guidelines for building the test
• Ranges for each category
• Iterative process
• Content Guidelines
• Reading Passages (type and word counts)
• Benchmark Coverage
• Reporting Category (Strand) Coverage
• Multicultural & Gender Representation
• Cognitive Level Guidelines
Test Construction Specifications – 2
Statistical Guidelines for Questions
• Classical Item Difficulty and Discrimination
• IRT Difficulty, Discrimination, and Guessing
• Differential Item Functioning (DIF)
• IRT Model Fit Statistics
Statistical Guidelines for Tests
• Test Characteristic Curves
• Test Information Functions
• Standard Error Curves
Test Construction Specifications – 3
Anchor Item Guidelines
• Number and position of questions
• Content Representation – Mini Test
• Performance Characteristics (range of difficulty)
• Previous use as a Core or Anchor
• No change in wording
• Passage position
Test Construction
Review and Approval Process
• 1st Draft of Content – Harcourt Content Staff
• Review of Content – DOE Content Staff
• Review of Statistics – Harcourt Psychometric Staff
• Review of Statistics – DOE Psychometric Staff
• Approval by DOE FCAT team leadership
Psychometric Primer – 1
Classical Item Statistics:
• P-value or difficulty – the percent (P) of students who answer the question correctly.
• Discrimination (point-biserial) – the degree to which students with high total scores answer the question correctly and students with low total scores do not (similar to a correlation).
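
A minimal sketch of how these two classical statistics could be computed from a 0/1 student-by-question response matrix. This is illustrative only, not the operational FCAT code; the corrected point-biserial (with the item removed from the total score) is one common variant and is an assumption here.

    # Classical item statistics from a 0/1 response matrix (students x questions).
    import numpy as np

    def classical_item_stats(responses):
        """responses: 2-D array of 0/1 scores, shape (n_students, n_items)."""
        responses = np.asarray(responses, dtype=float)
        # Difficulty: proportion answering correctly (x100 gives the percent form above).
        p_values = responses.mean(axis=0)
        totals = responses.sum(axis=1)
        point_biserials = []
        for j in range(responses.shape[1]):
            rest = totals - responses[:, j]      # total score excluding item j
            r = np.corrcoef(responses[:, j], rest)[0, 1]  # corrected point-biserial
            point_biserials.append(r)
        return p_values, np.array(point_biserials)

    # Example: 4 students, 3 questions
    p, pbis = classical_item_stats([[1, 0, 1],
                                    [1, 1, 1],
                                    [0, 0, 1],
                                    [1, 0, 0]])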
Psychometric Primer – 2
Item Response Theory (IRT) Statistics:
• A-parameter – discrimination, or how well the question differentiates between lower- and higher-performing students.
• B-parameter – difficulty, or the level of ability on the 100-500 scale required to answer the question correctly.
• Guessing (C-parameter) – the probability that examinees with extremely low ability levels answer correctly.
• FIT – how well the scores for a given item fit, or match, the distribution expected under the model.
• DIF (Differential Item Functioning) – the degree to which the question performs similarly for all demographic groups of the same ability.
Item Characteristic Curve – Figure 1
[Figure 1: item characteristic curve for an item with Discrimination = 1, Difficulty = 0.5, and Pseudo-Guessing = 0.13; x-axis: Achievement Index (Theta), -4 to 4; y-axis: Probability of Correct Response, 0.0 to 1.0]
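
A minimal sketch of the curve in Figure 1, assuming the standard three-parameter logistic (3PL) form with the commonly used D = 1.7 scaling constant (that constant is an assumption, not stated on the slide), using the figure's values a = 1, b = 0.5, c = 0.13.

    # 3PL item characteristic curve, evaluated on the theta axis of Figure 1.
    import numpy as np

    def icc_3pl(theta, a, b, c, D=1.7):
        """Probability of a correct response at ability theta under the 3PL model."""
        return c + (1.0 - c) / (1.0 + np.exp(-D * a * (theta - b)))

    theta = np.linspace(-4, 4, 161)               # Achievement Index (Theta)
    prob = icc_3pl(theta, a=1.0, b=0.5, c=0.13)   # Probability of Correct Response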
Test Characteristic Curve – Figure 2
[Figure 2: test characteristic curves (expected percent-correct score vs. scale score, 100-500) for Reading 2006 Grade 3 Core version 7, comparing the Base Form (2005, dotted) with the New Form (solid); scale scores 259, 284, 332, and 394 are marked]
Standard Error Curve – Figure 3
[Figure 3: standard error of measurement vs. scale score (100-500) for Reading 2006 Grade 3 Core version 7, comparing the Base Form (2005, dotted) with the New Form (solid); the same scale scores 259, 284, 332, and 394 are marked]
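
A minimal sketch of how curves like Figures 2 and 3 relate to the item parameters: the test characteristic curve is the sum of the item characteristic curves (expressed as percent correct), and the standard error of measurement is one over the square root of the test information. The item parameters below are made-up placeholders, and the curves are computed on the theta metric; the FCAT figures are plotted on the 100-500 scale score, a transformation of theta.

    # Test characteristic curve and SEM curve built from 3PL item parameters.
    import numpy as np

    D = 1.7  # common logistic scaling constant (an assumption, not an FCAT value)

    def icc_3pl(theta, a, b, c):
        return c + (1.0 - c) / (1.0 + np.exp(-D * a * (theta - b)))

    def tcc_percent_correct(theta, items):
        """Expected percent-correct score at each theta for a list of (a, b, c) items."""
        probs = np.array([icc_3pl(theta, a, b, c) for a, b, c in items])
        return 100.0 * probs.sum(axis=0) / len(items)

    def sem_curve(theta, items):
        """Standard error of measurement: 1 / sqrt(sum of 3PL item information)."""
        info = np.zeros_like(theta)
        for a, b, c in items:
            p = icc_3pl(theta, a, b, c)
            q = 1.0 - p
            info += (D * a) ** 2 * (q / p) * ((p - c) / (1.0 - c)) ** 2
        return 1.0 / np.sqrt(info)

    theta = np.linspace(-4, 4, 161)
    items = [(1.0, -0.5, 0.15), (0.8, 0.0, 0.20), (1.2, 0.7, 0.13)]  # placeholders
    tcc = tcc_percent_correct(theta, items)
    sem = sem_curve(theta, items)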
Test Calibration and Equating
• Calibration – Converting from Raw Scores to IRT Scores
• Equating – Making Scores Comparable Across Years
• Florida uses Item Response Theory (IRT) to score and equate FCAT results from year to year.
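
As one illustration of what converting from raw item responses to an IRT score involves, the sketch below estimates a single student's ability by maximum likelihood under an assumed 3PL model, using a simple grid search for clarity. The item parameters are placeholders and the operational FCAT scoring procedure may differ.

    # Maximum-likelihood theta estimate for one student under a 3PL model.
    import numpy as np

    D = 1.7  # common logistic scaling constant

    def icc_3pl(theta, a, b, c):
        return c + (1.0 - c) / (1.0 + np.exp(-D * a * (theta - b)))

    def ml_theta(responses, items, grid=np.linspace(-4, 4, 801)):
        """Grid-search ML theta. responses: 0/1 scores; items: (a, b, c) tuples."""
        loglik = np.zeros_like(grid)
        for u, (a, b, c) in zip(responses, items):
            p = icc_3pl(grid, a, b, c)
            loglik += u * np.log(p) + (1 - u) * np.log(1.0 - p)
        return grid[np.argmax(loglik)]

    # Example with placeholder item parameters
    items = [(1.0, -0.5, 0.15), (0.9, 0.2, 0.20), (1.1, 0.8, 0.13)]
    theta_hat = ml_theta([1, 1, 0], items)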
FCAT Calibration and Equating Process
Step 1: Determine unscaled item parameters for the current-year core and anchor questions with a single calibration of all items together (a, b, & c parameters).
Step 2: Analyze the change in the anchor items by comparing the current-year anchor item values to the prior-year values.
Step 3: Identify the equating adjustments (M1 & M2).
Step 4: Apply the equating adjustments and convert to scaled item parameters.
Step 5: Generate student scores using the equated score scale.
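
The slide does not define M1 & M2. One common way to obtain such adjustments from anchor items is the mean/sigma method, sketched below under the assumption that M1 & M2 act as a slope and intercept; the operational FCAT procedure may use a different criterion (for example, Stocking-Lord).

    # Mean/sigma anchor equating: place the current-year calibration on the
    # prior-year scale, then transform the item parameters.
    import numpy as np

    def mean_sigma_constants(b_current_anchor, b_prior_anchor):
        """Slope and intercept mapping current-year b-values onto the prior-year scale."""
        b_cur = np.asarray(b_current_anchor, dtype=float)
        b_pri = np.asarray(b_prior_anchor, dtype=float)
        slope = b_pri.std(ddof=1) / b_cur.std(ddof=1)
        intercept = b_pri.mean() - slope * b_cur.mean()
        return slope, intercept

    def transform_item(a, b, c, slope, intercept):
        """Apply the linear transformation to one item's 3PL parameters."""
        return a / slope, slope * b + intercept, c  # c is unaffected by the scale shift

    # Example with placeholder anchor b-values
    slope, intercept = mean_sigma_constants([-0.40, 0.10, 0.55, 1.20],
                                            [-0.25, 0.20, 0.70, 1.35])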
Equating Solutions
• 2006 equating solution – anchor questions ???
• Identify a “better” equating solution
• Define “better”
• Process considerations
• Select anchor questions
• Follow the guidelines
• Evaluate the quality of the anchor
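
One simple way to evaluate the quality of the anchor is to check how much each anchor item's difficulty drifted between the prior- and current-year calibrations. The sketch below is illustrative; the drift criterion and the 0.3 threshold are assumptions, not FCAT guidelines.

    # Screen anchor items for difficulty drift between two calibrations.
    import numpy as np

    def flag_drifting_anchors(b_prior, b_current, max_drift=0.3):
        """Return indices of anchor items whose b-value shifted more than max_drift,
        along with the correlation between prior and current b-values."""
        b_prior = np.asarray(b_prior, dtype=float)
        b_current = np.asarray(b_current, dtype=float)
        drift = np.abs(b_current - b_prior)
        flagged = list(np.where(drift > max_drift)[0])
        return flagged, float(np.corrcoef(b_prior, b_current)[0, 1])

    flagged, r = flag_drifting_anchors([-0.4, 0.1, 0.6, 1.2], [-0.3, 0.9, 0.5, 1.3])
    # flagged -> [1]; the second anchor item moved by 0.8 and would be re-examined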