Transcript day6
Instrumentation (cont.)
February 28
Note: Measurement Plan Due Next Week
Unobtrusive Measures
• Many instruments require the cooperation of the respondent in one
way or another.
• Such an intrusion into an ongoing activity can provoke a negative
reaction in the respondent.
• To eliminate this, researchers use unobtrusive measures: data
collection procedures that involve no intrusion into the naturally
occurring course of events.
• In most cases no instrument is used; however, good record keeping
is necessary.
• They are valuable as supplements to the use of interviews and
questionnaires, often providing a useful way to corroborate what
more traditional data sources reveal.
Types of Scores
• Quantitative data is reported in the form of scores
• Scores are reported as either raw or derived scores
– Raw score is the initial score obtained
• Taken by itself, a raw score is difficult to interpret, since it has little meaning
– Derived scores are scores that have been computed from raw scores and
standardized
• They enable researchers to say how well the individual performed compared to
others taking the same test
• Examples include:
– Age and Grade-level Equivalents
– Percentile Ranks
– Standard scores: mathematically derived scores having comparable
meaning on different instruments
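To make the raw-versus-derived distinction concrete, here is a minimal Python sketch that converts one raw score into two derived scores: a percentile rank and a z-score (one kind of standard score). The norm-group scores and the function names `percentile_rank` and `z_score` are hypothetical, and "percent at or below" is only one of several percentile-rank conventions in use.

```python
from statistics import mean, stdev

def percentile_rank(score, norm_scores):
    """Percent of norm-group scores at or below the given raw score."""
    at_or_below = sum(1 for s in norm_scores if s <= score)
    return 100 * at_or_below / len(norm_scores)

def z_score(score, norm_scores):
    """Distance of the raw score from the norm-group mean, in SD units."""
    return (score - mean(norm_scores)) / stdev(norm_scores)

# Hypothetical norm group (made-up raw scores, for illustration only)
norm_group = [85, 80, 70, 60, 55, 50, 45, 40, 30, 25]
raw = 70

print("raw score:", raw)
print("percentile rank:", percentile_rank(raw, norm_group))
print("z-score:", round(z_score(raw, norm_group), 2))
```

The raw score of 70 says little by itself; the two derived scores place it relative to everyone else in the group.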
Four Types of Measurement Scales
Norm-Referenced vs. Criterion-Referenced Instruments
• All derived scores give meaning to individual scores by
comparing them to the scores of a group.
• The group used to determine derived scores is called the
norm group and the instruments that provide such
scores are referred to as norm-referenced instruments.
• An alternative to norm-referenced achievement or performance
instruments is the criterion-referenced test.
• It is based on a specific goal or target (the criterion) for
each learner to achieve.
• The difference between the two is that criterion-referenced
tests focus more directly on instruction.
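The contrast between the two kinds of interpretation can be sketched in a few lines of Python. This is only an illustration: the norm-group scores, the criterion of 75, and both function names are hypothetical.

```python
def norm_referenced_report(score, norm_scores):
    """Interpret a score by comparing it with the norm group."""
    outscored = sum(1 for s in norm_scores if s < score)
    return f"higher than {outscored} of {len(norm_scores)} norm-group scores"

def criterion_referenced_report(score, criterion):
    """Interpret a score by comparing it with a fixed target."""
    return "criterion met" if score >= criterion else "criterion not yet met"

norm_group = [25, 30, 40, 45, 50, 55, 60, 70, 80, 85]  # hypothetical scores

print(norm_referenced_report(60, norm_group))
print(criterion_referenced_report(60, criterion=75))
```

The same score of 60 looks respectable against the group yet falls short of the target, which is why the two kinds of instrument answer different questions.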
Descriptive Statistics
Statistics vs. Parameters
• A parameter is a characteristic of a population.
– It is a numerical or graphic way to summarize data
obtained from the population
• A statistic is a characteristic of a sample.
– It is a numerical or graphic way to summarize data
obtained from a sample
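A small simulation may help keep the two terms apart. The sketch below assumes a made-up population of 500 scores and a random sample of 25; the numbers and variable names are hypothetical.

```python
import random
from statistics import mean

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical population: 500 made-up test scores
population = [rng.randint(40, 100) for _ in range(500)]
sample = rng.sample(population, 25)

parameter = mean(population)  # summarizes the whole population
statistic = mean(sample)      # summarizes only the sample

print(f"population mean (parameter): {parameter:.2f}")
print(f"sample mean (statistic):     {statistic:.2f}")
```

The statistic usually lands near the parameter without matching it exactly, and it changes from sample to sample while the parameter stays fixed.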
Types of Numerical Data
• There are two fundamental types of numerical data:
1) Categorical data: obtained by determining the
frequency of occurrences in each of several
categories
2) Quantitative data: obtained by determining
placement on a scale that indicates amount or
degree
Techniques for Summarizing and
Presenting Quantitative Data
• Visual
– Frequency Distributions
– Histograms
– Stem and Leaf Plots
– Distribution curves
• Numerical
– Central Tendency
– Variability
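Two of the visual techniques listed above are simple enough to produce as plain text. The following Python sketch builds a frequency distribution and a stem-and-leaf plot; the scores and function names are hypothetical, and the stem-and-leaf helper assumes two-digit whole-number scores.

```python
from collections import Counter, defaultdict

def frequency_distribution(values):
    """Return (score, frequency) pairs, highest score first."""
    return sorted(Counter(values).items(), reverse=True)

def stem_and_leaf(values):
    """Group two-digit scores by tens digit (stem) and ones digit (leaf)."""
    plot = defaultdict(list)
    for v in sorted(values):
        plot[v // 10].append(v % 10)
    return dict(plot)

scores = [85, 80, 70, 60, 60, 55, 55, 50, 45, 40, 30, 25]  # hypothetical

print("Score  Frequency")
for score, freq in frequency_distribution(scores):
    print(f"{score:>5}  {freq}")

print("Stem | Leaves")
for stem, leaves in stem_and_leaf(scores).items():
    print(f"{stem:>4} | {' '.join(str(leaf) for leaf in leaves)}")
```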
Summary Measures
• Central Tendency
– Arithmetic Mean
– Median
– Mode
• Variation
– Range
– Variance
– Standard Deviation
Measures of Central Tendency
• Average (Mean)
– Sample mean: $\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n}$
– Population mean: $\mu = \frac{\sum_{i=1}^{N} X_i}{N}$
• Median
• Mode
Mean
• The most common measure of central
tendency
• Affected by extreme values (outliers)
[Figure: dot plot on a 0 to 10 axis; Mean = 5]
[Figure: dot plot on a 0 to 14 axis with an extreme value; Mean = 6]
Median
• Robust measure of central tendency
• Not affected by extreme values
[Figure: dot plot on a 0 to 10 axis; Median = 5]
[Figure: dot plot on a 0 to 14 axis with an extreme value; Median = 5]
• In an ordered array, the median is the
“middle” number
– If n or N is odd, median is the middle number
– If n or N is even, median is the average of the
two middle numbers
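The different sensitivity of the mean and the median to an extreme value can be checked with Python's `statistics` module. The two data sets below are hypothetical; they were chosen so that the results match the values quoted on the slides (Mean = 5, then 6; Median = 5 both times).

```python
from statistics import mean, median

typical = [1, 3, 5, 7, 9]        # hypothetical data, no extreme value
with_outlier = [1, 3, 5, 7, 14]  # same data, largest score moved out to 14

print(mean(typical), median(typical))            # 5 5
print(mean(with_outlier), median(with_outlier))  # 6 5

# With an even number of scores the median is the
# average of the two middle numbers: (3 + 5) / 2
print(median([1, 3, 5, 7]))                      # 4.0
```

Moving one score from 9 to 14 pulls the mean up by a full point, while the median does not move at all.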
Mode
• A measure of central tendency
• Value that occurs most often
• Not affected by extreme values
• Used for either numerical or categorical data
• There may be no mode
• There may be several modes
[Figure: dot plot on a 0 to 14 axis; Mode = 9]
[Figure: dot plot on a 0 to 6 axis; No Mode]
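The cases listed above (one mode, several modes, no mode, categorical data) can be sketched with a small helper. The function name `modes` and all data are hypothetical, and the convention that a data set has no mode when every value occurs exactly once is an assumption; some texts define it differently.

```python
from collections import Counter

def modes(values):
    """Return the most frequent value(s); [] if every value occurs once."""
    if not values:
        return []
    counts = Counter(values)
    top = max(counts.values())
    if top == 1:
        return []  # no value repeats, so there is no mode
    return sorted(v for v, c in counts.items() if c == top)

print(modes([1, 3, 9, 9, 9, 12]))       # one mode
print(modes([1, 1, 2, 2, 3]))           # several modes
print(modes([1, 2, 3, 4, 5, 6]))        # no mode
print(modes(["red", "blue", "red"]))    # works for categorical data too
```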
Variability
• Refers to the extent to which the scores on a quantitative
variable in a distribution are spread out.
• The range represents the difference between the highest
and lowest scores in a distribution.
• A five-number summary reports the lowest score, the first
quartile, the median, the third quartile, and the highest score.
– Five-number summaries are often portrayed graphically by the
use of box plots.
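As a rough sketch, the range and the five-number summary can be computed with Python's `statistics.quantiles`. The scores are the ten raw scores used in the worked variance table in these notes; the helper name `five_number_summary` is hypothetical. Quartile conventions differ between textbooks and software, so the "inclusive" method used here is an assumption and other tools may report slightly different quartiles.

```python
from statistics import quantiles

def five_number_summary(values):
    """Return (lowest, Q1, median, Q3, highest)."""
    q1, med, q3 = quantiles(values, n=4, method="inclusive")
    return min(values), q1, med, q3, max(values)

scores = [85, 80, 70, 60, 55, 50, 45, 40, 30, 25]

low, q1, med, q3, high = five_number_summary(scores)
print("range:", high - low)
print("five-number summary:", (low, q1, med, q3, high))
```

These five values are exactly what a box plot draws: the whisker ends, the box edges, and the line inside the box.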
Variance
• The variance, $s^2$, represents the amount of variability of the
data relative to their mean
• As shown below, the variance is the “average” of the
squared deviations of the observations about their mean

$s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1}$
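The word "average" is in quotation marks because the sample variance divides by n - 1 rather than n. A minimal Python sketch, using three made-up values, compares a hand-written version of the formula with the standard library's `variance` (divides by n - 1) and `pvariance` (divides by N, for a whole population). The helper name `sample_variance` is hypothetical.

```python
from statistics import mean, variance, pvariance

def sample_variance(values):
    """Sum of squared deviations about the mean, divided by n - 1."""
    m = mean(values)
    return sum((x - m) ** 2 for x in values) / (len(values) - 1)

data = [2, 4, 6]  # made-up values; mean is 4, squared deviations are 4, 0, 4

print(sample_variance(data))  # 4.0  (8 / 2)
print(variance(data))         # same result from the standard library
print(pvariance(data))        # 8 / 3, the population version
```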
Standard Deviation
• Considered the most useful index of variability.
• It is a single number that represents the spread of a
distribution.
• If a distribution is normal, then the mean plus or minus 3
SD will encompass about 99.7% of all scores in the
distribution.
Calculation of the Variance and Standard
Deviation of a Distribution (Definitional formula)
Raw Score   Mean   X – X̄   (X – X̄)²
    85       54      31       961
    80       54      26       676
    70       54      16       256
    60       54       6        36
    55       54       1         1
    50       54      -4        16
    45       54      -9        81
    40       54     -14       196
    30       54     -24       576
    25       54     -29       841
                  Σ(X – X̄)² = 3640

Variance (SD²) = $\frac{\sum (X - \bar{X})^2}{N - 1} = \frac{3640}{9} = 404.44$

Standard deviation (SD) = $\sqrt{\frac{\sum (X - \bar{X})^2}{N - 1}}$
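The worked example can be reproduced step by step in Python. The sketch below uses the ten raw scores from the table and follows the definitional formula directly; the variable names are hypothetical.

```python
from math import sqrt

scores = [85, 80, 70, 60, 55, 50, 45, 40, 30, 25]

n = len(scores)
mean = sum(scores) / n                    # 54.0
deviations = [x - mean for x in scores]   # X minus the mean
squared = [d ** 2 for d in deviations]    # squared deviations
sum_sq = sum(squared)                     # 3640.0
variance = sum_sq / (n - 1)               # divide by N - 1 = 9
sd = sqrt(variance)                       # square root of the variance

print(f"{'X':>4} {'X - mean':>9} {'squared':>8}")
for x, d, sq in zip(scores, deviations, squared):
    print(f"{x:>4} {d:>9.0f} {sq:>8.0f}")
print(f"sum of squares = {sum_sq:.0f}")
print(f"variance = {variance:.2f}")
print(f"standard deviation = {sd:.2f}")
```

Running it gives the same sum of squares (3640) and variance (404.44) as the table, and a standard deviation of about 20.11.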
Comparing Standard Deviations
Data A: Mean = 15.5, S = 3.338
Data B: Mean = 15.5, S = 0.9258
Data C: Mean = 15.5, S = 4.57
[Figure: three dot plots, each on the same 11 to 21 axis]
Facts about the Normal Distribution
• 50% of all the observations fall on each side of the
mean.
• 68% of scores fall within 1 SD of the mean in a
normal distribution.
• 27% of the observations fall between 1 and 2 SD
from the mean (both sides combined), so about 95% fall within 2 SD.
• 99.7% of all scores fall within 3 SD of the mean.
• This is often referred to as the 68-95-99.7 rule
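These percentages can be checked numerically. The sketch below uses `statistics.NormalDist` from the Python standard library with a standard normal curve (mean 0, SD 1); the helper name `share_within` is hypothetical.

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Proportion of a normal distribution within k SD of the mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} SD: {share_within(k):.1%}")

# Scores between 1 and 2 SD from the mean, both sides combined
print(f"between 1 and 2 SD: {share_within(2) - share_within(1):.1%}")
```

The rounded figures in the rule (68, 95, 99.7) are approximations of these computed proportions.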
The Normal Curve
Different Distributions Compared
Fifty Percent of All Scores in a Normal
Curve Fall on Each Side of the Mean
Probabilities Under the Normal Curve
Correlation
Correlation Coefficients
• Pearson product-moment correlation
– Expresses the direction and degree of the relationship
between two quantitative variables.
• Positive: As one variable increases (or decreases)
so does the other.
• Negative: As one variable increases the other
decreases.
– Magnitude or strength of relationship
• -1.00 to +1.00
– Correlation does not equate to causation
Positive Correlation
Negative Correlation
No Correlation
Correlations
• Thickness of scatter plot determines strength of
correlation, not slope of line.
– For example see:
• http://noppa5.pc.helsinki.fi/koe/corr/cor7.html
• Remember: correlation does not equate to
causation.
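A short Python sketch of the Pearson coefficient may make these points concrete. It uses the usual textbook definition (sum of cross-products of deviations divided by the square root of the product of the sums of squares); all data and the function name `pearson_r` are hypothetical.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

hours = [1, 2, 3, 4, 5]              # hypothetical x values
steep = [10 * h for h in hours]      # perfect line, steep slope
shallow = [0.1 * h for h in hours]   # perfect line, shallow slope
falling = [-2 * h for h in hours]    # perfect line, negative slope
scattered = [2, 1, 4, 3, 5]          # points scattered around a rising line

print(round(pearson_r(hours, steep), 3))
print(round(pearson_r(hours, shallow), 3))
print(round(pearson_r(hours, falling), 3))
print(round(pearson_r(hours, scattered), 3))
```

The steep and shallow lines both give a correlation of +1, so slope does not set the strength; only the scattered data, whose points stray from the line, produce a weaker value (0.8 here).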
Negative Correlation
Validity and Reliability
Chapters 8
Validity and Reliability
• Validity is an important consideration in the choice of an
instrument to be used in a research investigation
– It should measure what it is supposed to measure
– Researchers want instruments that will allow them to make
warranted conclusions about the characteristics of the subjects
they study
• Reliability is another important consideration, since
researchers want consistent results from instrumentation
– Consistency gives researchers confidence that the results
actually represent the achievement of the individuals involved
Reliability
• Test-retest reliability
• Inter-rater reliability
• Parallel forms reliability
• Internal consistency (a.k.a. Cronbach’s alpha)
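Of the four types, internal consistency is the one usually computed from a single administration of an instrument. The sketch below applies the standard formula for Cronbach's alpha, alpha = k/(k - 1) × (1 - sum of item variances / variance of total scores), to a made-up table of responses. The data and the function name `cronbach_alpha` are hypothetical.

```python
from statistics import variance

def cronbach_alpha(responses):
    """responses: one row per respondent, one column per item."""
    k = len(responses[0])                      # number of items
    items = list(zip(*responses))              # regroup scores by item
    item_var_sum = sum(variance(col) for col in items)
    total_var = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - item_var_sum / total_var)

# Hypothetical data: 5 respondents answering 3 items on a 1-5 scale
responses = [
    [3, 4, 3],
    [4, 4, 5],
    [2, 3, 2],
    [5, 5, 4],
    [1, 2, 2],
]

print(round(cronbach_alpha(responses), 2))
```

Values closer to 1 indicate that the items rank respondents in a consistent way; the other three types of reliability are typically estimated by correlating two sets of scores instead.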
Validity
• Face
– Does it appear to measure what it purports to
measure?
• Content
– Do the items cover the domain?
• Construct
– Does it measure the unobservable attribute
that it purports to measure?
Validity
• Criterion
– Predictive
– Concurrent
• Consequential
Types of validity
[Figures: diagrams showing how "the instrument" overlaps "the construct"]
• Here the instrument samples some of the construct, and only the construct
• Here the instrument samples all and more of the construct
• Here the instrument fails to sample ANY of the construct
• Here the instrument samples some but not all of the construct
• Perfection! The construct and the instrument!
Reliability and Validity