Chapter 15 - standardized testing
© 2006 The McGraw-Hill Companies, Inc. All rights reserved. Santrock, Educational Psychology, Second Edition, Classroom Update
CHAPTER
15
Standardized Tests and Teaching
Learning Goals
1. Discuss the nature of standardized tests.
2. Compare aptitude and achievement testing and describe current uses of achievement tests.
3. Identify the teacher’s role in standardized testing.
4. Evaluate some key issues in standardized testing.
Standardized Tests and Teaching
The Nature of Standardized Tests
• What Is a Standardized Test?
• The Purposes of Standardized Tests
• Criteria for Evaluating Standardized Tests
The Nature of Standardized Tests
Standardized Tests
• Have uniform procedures for administration and
scoring.
• Allow comparison of student scores by age,
grade level, local and national norms.
• Attempt to include material common across most
classrooms.
Enter the Debate
Should students have to pass a test to
earn a high school diploma?
YES
NO
Purposes of Standardized Tests
• Diagnose students’ strengths and weaknesses
• Provide information about student progress and program placement
• Contribute to accountability
• Provide information for planning and instruction
• Help in program evaluation
The Nature of Standardized Tests
Standards-based tests assess skills
that students are expected to have
mastered before they can be permitted
to move to the next grade or be
permitted to graduate.
High-stakes testing is using tests in a
way that will have important
consequences for the student, affecting
major educational decisions.
Evaluating Standardized Tests
Norms – Does the normative group represent all
students who may take the test?
Reliability – Are test scores stable, dependable
and relatively free from error?
Validity – Does the test measure what it is
purported to measure?
Correlation
The correlation coefficient is a statistical measure of the relationship between two variables.
Example: r = +0.37
• The sign indicates the direction of the relationship (positive or negative).
• The absolute value (0.00 to 1.00) indicates the strength of the relationship.
Pearson correlation coefficient
• r = the Pearson coefficient
• r measures the amount that the two
variables (X and Y) vary together (i.e.,
covary) taking into account how much
they vary apart
• Pearson’s r is the most common
correlation coefficient; there are others.
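Computing the Pearson correlation coefficient
• To put it another way:
$$r = \frac{\text{degree to which } X \text{ and } Y \text{ vary together}}{\text{degree to which } X \text{ and } Y \text{ vary separately}}$$
• Or
$$r = \frac{\text{covariability of } X \text{ and } Y}{\text{variability of } X \text{ and } Y \text{ separately}}$$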
Sum of Products of Deviations
• Measuring X and Y individually (the denominator):
– compute the sums of squares for each variable
• Measuring X and Y together: Sum of Products
– Definitional formula
$$SP = \sum (X - \bar{X})(Y - \bar{Y})$$
– Computational formula
$$SP = \sum XY - \frac{\sum X \sum Y}{n}$$
• n is the number of (X, Y) pairs
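Correlation Coefficient:
• the equation for Pearson’s r:
$$r = \frac{SP}{\sqrt{SS_X \, SS_Y}}$$
• expanded form:
$$r = \frac{\sum XY - \frac{\sum X \sum Y}{n}}{\sqrt{\left(\sum X^2 - \frac{(\sum X)^2}{n}\right)\left(\sum Y^2 - \frac{(\sum Y)^2}{n}\right)}}$$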
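The computational formulas for SP and the sums of squares translate directly into a few lines of code. The following is a minimal sketch, not a reference implementation: the function name `pearson_r` and the sample data (`hours`, `scores`) are made up for illustration.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r from the computational formulas for SP and SS."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 pairs")
    n = len(xs)  # number of (X, Y) pairs
    sum_x, sum_y = sum(xs), sum(ys)
    # SP = sum(XY) - (sum X)(sum Y) / n
    sp = sum(x * y for x, y in zip(xs, ys)) - sum_x * sum_y / n
    # SS = sum(X^2) - (sum X)^2 / n, computed separately for X and Y
    ss_x = sum(x * x for x in xs) - sum_x ** 2 / n
    ss_y = sum(y * y for y in ys) - sum_y ** 2 / n
    # r is undefined (division by zero) if either variable never varies
    return sp / math.sqrt(ss_x * ss_y)

# Hypothetical data: hours of study and quiz scores for five students
hours = [1, 2, 3, 4, 5]
scores = [2, 4, 5, 4, 5]
r = pearson_r(hours, scores)
print(f"r = {r:+.2f}")
```

For this made-up data, SP = 6, SS for X = 10, and SS for Y = 6, so r = 6 / √60, or roughly +0.77.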
Correlation Coefficient Interpretation
Coefficient Range    Strength of Relationship
0.00 - 0.20          Practically None
0.20 - 0.40          Low
0.40 - 0.60          Moderate
0.60 - 0.80          High Moderate
0.80 - 1.00          Very High
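The interpretation ranges can be sketched as a small lookup. The slide's ranges share their endpoints (0.20 appears in two rows), so this sketch assumes a boundary value belongs to the higher band. The function name `describe_strength` is hypothetical.

```python
def describe_strength(r):
    """Map a correlation coefficient to the slide's strength label.

    Assumption: a value exactly on a boundary (e.g. 0.20) falls in the
    higher band, since the slide does not say which band owns it.
    """
    size = abs(r)  # strength ignores the sign; the sign gives direction
    if size > 1.0:
        raise ValueError("a correlation coefficient lies between -1 and +1")
    if size < 0.20:
        return "Practically None"
    if size < 0.40:
        return "Low"
    if size < 0.60:
        return "Moderate"
    if size < 0.80:
        return "High Moderate"
    return "Very High"

label = describe_strength(0.37)
direction = "positive" if 0.37 > 0 else "negative"
print(f"r = +0.37: {label}, {direction}")
```

Applied to the earlier example of r = +0.37, the label is "Low" and the direction is positive.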
Reliability
Test-retest: The extent to which a test yields
the same score when given to a student
on two different occasions
Alternate-forms: Giving two different forms of the
same test on two different occasions to
determine the consistency of the scores
Split-half: Divide the test items into two halves;
scores on the two halves are compared to
determine test score consistency
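One way to picture split-half reliability is to score each half of a test separately and correlate the two sets of half-scores. The sketch below makes several assumptions: the test is split into odd-numbered and even-numbered items, the comparison uses a Pearson correlation, and the response data is invented. None of these choices come from the slide.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(xs)
    sp = sum(x * y for x, y in zip(xs, ys)) - sum(xs) * sum(ys) / n
    ss_x = sum(x * x for x in xs) - sum(xs) ** 2 / n
    ss_y = sum(y * y for y in ys) - sum(ys) ** 2 / n
    return sp / math.sqrt(ss_x * ss_y)

# Hypothetical item responses: one row per student, 1 = correct, 0 = incorrect
responses = [
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 0, 1, 1],
    [1, 0, 1, 1, 0, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0],
]

# Assumed split: items 1, 3, 5 form one half and items 2, 4, 6 the other
first_half = [sum(row[0::2]) for row in responses]
second_half = [sum(row[1::2]) for row in responses]

split_half_r = pearson_r(first_half, second_half)
print(f"Correlation between halves: {split_half_r:.2f}")
```

A correlation near 1.00 between the halves suggests the two halves rank students consistently.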
Methods of Studying Reliability
Interrater Reliability- The consistency of a test to
measure a skill, trait, or domain across examiners.
This type of reliability is most important when
responses are subjective or open-ended.
Terry Overton
Assessing Learners with Special Needs, 5e
Copyright ©2006 by Pearson Education, Inc.
Upper Saddle River, New Jersey 07458
All rights reserved.
Types of Validity…
Content: Test’s ability to sample the content that
is being measured
Criterion-related:
1. Concurrent: The relation between a test’s
score and other available criteria
2. Predictive: The relationship between test’s
score and future performance
Construct: The extent to which there is evidence
that a test measures a particular construct
Factor Analysis
A statistical technique that uses the correlations between
observed variables to estimate common factors and the
structural relationships linking factors to observed
variables. Two observed variables can correlate because
of their relationships with a common factor.
Standardized Tests and Teaching
Aptitude and Achievement Tests
• Comparing Aptitude and Achievement Tests
• Types of Standardized Achievement Tests
• District and National Tests
• High-Stakes State-Mandated Tests
Aptitude vs. Achievement Tests
Aptitude Tests
• Predict a student’s ability to learn a skill or accomplish a task.
• (Stanford-Binet, Wechsler, SAT when used to predict success)
Achievement Tests
• Measure what the student has learned or mastered.
• (California Achievement, Iowa Basic Skills, SAT when used to determine what has been learned)
High-Stakes State-Mandated Tests
Possible Advantages
- Improved student performance
- More teaching time
- Higher student expectations
- Identification of poor-performing schools/teachers
- Improved confidence in schools
Criticisms
- “Dumbing down” and more emphasis on rote memorization
- Less time for problem-solving and critical thinking skills
- Teachers “teaching to the test”
- Discrimination against low-SES and ethnic minority children
National Assessment of Educational Progress
A federal “census-like” exam of students’ knowledge,
skills, understanding, and attitudes
Subject    Years        Grades          Trend
Reading    1992–2000    4th             no improvement
Reading    1992–1998    8th and 12th    no improvement
Math       1990–2000    4th and 8th     improvement
Math       1990–2000    12th            decline
Science    1996–2000    4th and 8th     no change
Science    1996–2000    12th            decline
Standardized Tests and Teaching
The Teacher’s Role
• Preparing Students to Take Standardized Tests
• Administering Standardized Tests
• Using Standardized Test Scores to Plan and Improve Instruction
• Understanding and Interpreting Test Results
The Don’ts of Standardized Testing
DON’T
• Teach to the test
• Use the standardized test format for
classroom tests
• Describe tests as a burden
• Tell students that important decisions will be
made solely on the results of a single test
• Use previous forms of the test to prepare
students
• Convey a negative attitude about the test
Descriptive statistics are the mathematical procedures
that are used to describe and summarize data.
Counting the Data: Frequency
Look at the set of data that follows.
A tally mark was made each time a score occurred.
Which number most likely represents the average score?
Which number is the most frequently occurring score?
Frequency Distribution
Scores    Tally        Frequency
100       1            1
99        1            1
98        11           2
94        11           2
90        1111         5
89        1111 11      7
88        1111 1111    10
82        1111 1       6
75        11           2
74        1            1
68        1            1
60        1            1
Average score? 88
Most frequent score? 88
This frequency count represents data that
closely approximate a normal distribution.
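Tallying scores by hand is the same job a counter does in code. The following is a rough sketch using the frequencies from the distribution above. The variable names are illustrative, and the raw scores are rebuilt from the frequency column rather than taken from an original score list.

```python
from collections import Counter

# Frequencies copied from the frequency distribution (score -> count)
table = {100: 1, 99: 1, 98: 2, 94: 2, 90: 5, 89: 7,
         88: 10, 82: 6, 75: 2, 74: 1, 68: 1, 60: 1}

# Expand the table back into individual scores, as if read off answer sheets
raw_scores = [score for score, count in table.items() for _ in range(count)]

# Counter makes one "tally mark" per occurrence of each score
tally = Counter(raw_scores)
for score in sorted(tally, reverse=True):
    print(f"{score:>3}  {'|' * tally[score]:<10}  {tally[score]}")

# The most frequently occurring score and how often it occurs
most_frequent, count = tally.most_common(1)[0]
print(f"Most frequent score: {most_frequent} ({count} times)")
```

The counter reports 88 as the most frequent score, matching the answer on the slide.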
Descriptive Statistics
Frequency Polygons
[Figure: frequency polygon. x-axis: Scores (60, 68, 74, 75, 88, 89, 90, 94, 98, 99, 100); y-axis: frequency (1 to 5)]
Data: 100, 99, 98, 98, 94, 94, 90, 90, 90, 90, 90, 89, 89, 89, 89, 88, 88, 75, 75, 74, 68, 60
Measures of Central Tendency
Measures of central tendency provide information
about the average or typical score in a data set
Mean: The numerical average of a group of scores
Median: The score that falls exactly in the middle of
a data set
Mode: The score that occurs most often
Central tendency = representative or typical value in a distribution
Mean
• Same thing as an average
• Computed by
– Summing all the scores (sigma, Σ)
– Dividing by the number of scores (N)
$$M = \frac{\sum X}{N}$$
Mean: To find the mean, simply add the
scores and divide by the number of scores
in the set of data.
98 + 94 + 88 + 75 = 355
Divide by the number of scores: 355/4 = 88.75
Mean = 88.75
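The same arithmetic can be checked in a couple of lines. This is a minimal sketch with illustrative variable names.

```python
# The four scores from the worked example
scores = [98, 94, 88, 75]

total = sum(scores)         # sum of all the scores
mean = total / len(scores)  # divide by the number of scores, N

print(f"Sum = {total}, N = {len(scores)}, Mean = {mean}")
```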
Measures of Central Tendency
• Steps to computing the median
1. Line up scores from highest to lowest
2. Count up to middle score
• If there is 1 middle score, that’s the
median
• If there are 2 middle scores, median
is their average
Median: The middlemost point in a set of data
Data Set 1: 100, 99, 99, 98, 97, 96, 90, 88, 85, 80, 79
Median: 96
Data Set 2: 100, 99, 98, 97, 86, 82, 78, 72, 70, 68
The median is 84 for this set (the average of the two middle scores, 86 and 82).
84 represents the middlemost point in this set of data.
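The median steps (line up the scores, count to the middle, average if there are two middle scores) can be sketched as follows. The function name `median` is chosen here for illustration. Python's standard `statistics.median` does the same job.

```python
def median(scores):
    """Median following the slide's steps: order the scores, find the middle."""
    if not scores:
        raise ValueError("need at least one score")
    ordered = sorted(scores, reverse=True)  # highest to lowest
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd number of scores: exactly one middle score
        return ordered[mid]
    # Even number of scores: average the two middle scores
    return (ordered[mid - 1] + ordered[mid]) / 2

data_set_1 = [100, 99, 99, 98, 97, 96, 90, 88, 85, 80, 79]
data_set_2 = [100, 99, 98, 97, 86, 82, 78, 72, 70, 68]

median_1 = median(data_set_1)
median_2 = median(data_set_2)
print(median_1, median_2)
```

Data Set 1 has eleven scores, so the sixth score is the median. Data Set 2 has ten, so the fifth and sixth scores are averaged.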
Mode: The most frequently occurring score in a set of data.
Find the modes for the following sets of data:
Data Set 3: 99, 89, 89, 89, 89, 75
Mode: 89
Data Set 4: 99, 88, 88, 87, 87, 72, 70
88 and 87 are both modes for this set of data. This is called a bimodal distribution.
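Finding one or more modes can be sketched with the standard library. This example assumes Python 3.8 or later, where `statistics.multimode` is available. It returns every score tied for most frequent, which covers the bimodal case.

```python
from statistics import multimode

data_set_3 = [99, 89, 89, 89, 89, 75]
data_set_4 = [99, 88, 88, 87, 87, 72, 70]

# multimode returns a list of all scores that share the highest frequency
modes_3 = multimode(data_set_3)
modes_4 = multimode(data_set_4)

print("Data Set 3 modes:", modes_3)
print("Data Set 4 modes:", modes_4)
print("Data Set 4 is bimodal:", len(modes_4) == 2)
```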
Measures of Variability (Dispersion)
Range- Distance between the highest
and lowest scores in a set of data.
100 - 65 = 35
35 is the range in this set of scores.
Variance - Describes the total amount
that a set of scores varies from the
mean.
1. Subtract the mean from each score.
When the mean for a set of data is 87, subtract 87 from each score.
100 - 87 = 13
98 - 87 = 11
95 - 87 = 8
91 - 87 = 4
85 - 87 = -2
80 - 87 = -7
60 - 87 = -27
2. Next, square each difference (multiply each difference by itself).
13 x 13 = 169
11 x 11 = 121
8 x 8 = 64
4 x 4 = 16
-2 x -2 = 4
-7 x -7 = 49
-27 x -27 = 729
3. Sum these squared differences. The total is the sum of squares.
169 + 121 + 64 + 16 + 4 + 49 + 729 = 1,152
4. Divide the sum of squares by the number of scores.
1,152 divided by 7 = 164.5714
This number represents the variance for this set of data.
Standard Deviation: Represents the typical
amount that a score is expected to vary
from the mean in a set of data.
5. To find the standard deviation, find the
square root of the variance. For this
set of data, find the square root of
164.5714.
The standard deviation for this set of data
is 12.83, or about 13.
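The five steps above can be followed one at a time in code. This sketch divides by the number of scores, N, as the slides do (the population formula). Dividing by N - 1 instead would give the sample variance, which the slides do not use. Variable names are illustrative.

```python
import math

scores = [100, 98, 95, 91, 85, 80, 60]
mean = sum(scores) / len(scores)  # 609 / 7 = 87

# Step 1: subtract the mean from each score
deviations = [x - mean for x in scores]

# Step 2: square each difference
squared = [d * d for d in deviations]

# Step 3: sum the squared differences (the sum of squares)
sum_of_squares = sum(squared)

# Step 4: divide the sum of squares by the number of scores
variance = sum_of_squares / len(scores)

# Step 5: the standard deviation is the square root of the variance
std_dev = math.sqrt(variance)

print(f"Sum of squares = {sum_of_squares:.0f}")
print(f"Variance = {variance:.4f}")
print(f"Standard deviation = {std_dev:.2f}")
```

The sum of squares is 1,152 and the variance is about 164.5714. The square root is roughly 12.83 when rounded to two decimal places.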
Ceiling and Floor Effects
• Ceiling effects
– Occur when scores can go
no higher than an upper
limit and “pile up” at the
top
– e.g., scores on an easy
exam, as shown on the
right
– Causes negative skew
• Floor effects
– Occur when scores can go
no lower than a lower limit
and pile up at the bottom
– e.g., household income
– Causes positive skew
Skewed Frequency Distributions
• Normal distribution (a)
• Skewed right (b)
– Fewer scores right of the peak
– Positively skewed
– Can be caused by a floor effect
• Skewed left (c)
– Fewer scores left of the peak
– Negatively skewed
– Can be caused by a ceiling effect
Understanding Descriptive Statistics
The Normal Distribution: A “bell-shaped” curve in which
most of the scores are clustered around the mean; the farther
from the mean, the less frequently the score occurs.
Bell Curve
Commonly Reported Test Scores Based
on the Normal Curve
Z Scores
• When values in a distribution are
converted to Z scores, the distribution will
have
– Mean of 0
– Standard deviation of 1
• Useful
– Allows variables to be compared to one
another even when they are measured on
different scales, have very different
distributions, etc.
– Provides a generalized standard of comparison
Z Scores
• To compute a Z
score, subtract the
mean from a raw
score and divide
by the SD
• To convert a Z
score back to a
raw score, multiply
the Z score by the
SD and then add
the mean
$$Z = \frac{X - M}{SD}$$
$$X = (Z)(SD) + M$$
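Both conversions are one-line formulas, sketched below. The function names `to_z` and `to_raw` and the scale with a mean of 100 and an SD of 15 are assumptions for illustration. The second half checks the claim that a standardized distribution has a mean of 0 and a standard deviation of 1, using the SD computed by dividing by N.

```python
import math

def to_z(raw, mean, sd):
    """Z = (X - M) / SD"""
    if sd <= 0:
        raise ValueError("SD must be positive")
    return (raw - mean) / sd

def to_raw(z, mean, sd):
    """X = (Z)(SD) + M"""
    return z * sd + mean

# Hypothetical scale with M = 100 and SD = 15
z_example = to_z(115, 100, 15)      # one SD above the mean
raw_example = to_raw(-2.0, 100, 15)  # two SDs below the mean
print(z_example, raw_example)

# Standardizing a whole set of scores
scores = [100, 98, 95, 91, 85, 80, 60]
m = sum(scores) / len(scores)
sd = math.sqrt(sum((x - m) ** 2 for x in scores) / len(scores))
z_scores = [to_z(x, m, sd) for x in scores]

z_mean = sum(z_scores) / len(z_scores)
z_sd = math.sqrt(sum((z - z_mean) ** 2 for z in z_scores) / len(z_scores))
print(f"Mean of Z scores = {z_mean:.2f}, SD of Z scores = {z_sd:.2f}")
```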
Standardized Tests and Teaching
Issues in Standardized Testing
• Standardized Tests, Alternative Assessments, High-Stakes Testing
• Diversity and Standardized Testing
Issues in Standardized Testing
Alternative Assessments
• Assessments of oral presentations
• Real-world problems
• Projects
• Portfolios
Diversity and Standardized Tests
• Gaps on standardized tests have been
attributed to environmental rather than
hereditary factors
• Special concern in creating culturally
unbiased tests
Crack the Case
Standardized Tests
1. What are the issues involved in this situation?
2. Examine Ms. Carter’s testing procedures. What
does she do incorrectly? How might this reduce
the validity of the students’ scores?
3. How would you answer each of the parents’
questions?
Reflection & Observation
Reflection:
What standardized tests have you
taken?
How have these tests affected your
perceptions of competence?
Observation:
What are some of the mother’s
concerns regarding her son’s
standardized test scores?
What error does the teacher make in
interpreting one of the test scores?
How would you explain this score?