Transcript Z-scores

Psych 230
Psychological Measurement
and Statistics
Pedro Wolf
September 16, 2009
Today….
• Symbols and definitions reviewed
• Understanding Z-scores
• Using Z-scores to describe raw scores
• Using Z-scores to describe sample means
Symbols and Definitions Reviewed
Definitions: Populations and Samples
• Population: all possible members of the group of interest
• Sample: a representative subset of the population
Symbols and Definitions: Mean
• Mean
– the most representative score in the distribution
– our best guess at how a random person scored
• Population Mean = $\mu$
• Sample Mean = $\bar{X}$
Symbols and Definitions
• Number of Scores or Observations = N
• Sum of Scores = $\sum X$
• Sum of Deviations from the Mean = $\sum (X - \bar{X})$
• Sum of Squared Deviations from Mean = $\sum (X - \bar{X})^2$
• Sum of Squared Scores = $\sum X^2$
• Sum of Scores Squared = $(\sum X)^2$
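As an illustration, the quantities listed above can be computed for a small set of scores. The scores here are made up for this sketch and are not from the lecture.

```python
# Hypothetical scores, chosen only for illustration.
scores = [2, 4, 4, 6, 9]

n = len(scores)                                     # N, number of scores
sum_x = sum(scores)                                 # sum of scores
mean = sum_x / n                                    # the mean, X-bar
sum_dev = sum(x - mean for x in scores)             # sum of deviations (always 0)
sum_sq_dev = sum((x - mean) ** 2 for x in scores)   # sum of squared deviations
sum_x_sq = sum(x ** 2 for x in scores)              # sum of squared scores
sq_sum_x = sum_x ** 2                               # sum of scores, then squared

print(n, sum_x, mean, sum_dev, sum_sq_dev, sum_x_sq, sq_sum_x)
```

Note that the sum of squared scores (153 here) and the squared sum of scores (625 here) are different quantities, which is why the two symbols are kept apart.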
Symbols and Definitions: Variability
• Variance and Standard Deviation
– how spread out are the scores in a distribution
– how far the scores are, on average, from the mean
• Standard Deviation ($S$) is the square root of the Variance ($S^2$)
• In a normal distribution:
– 68.26% of the scores lie within 1 std dev. of the mean
– 95.44% of the scores lie within 2 std dev. of the mean
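The two percentages above are areas under the normal curve. A small sketch, assuming Python's standard-library `statistics.NormalDist` (a tool the lecture itself does not use), recomputes them. Computed directly, the values differ from the table-based figures on the slide by about 0.01 percentage point, because printed table entries are rounded.

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0, sigma=1)  # standard normal curve

def proportion_within(k):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

print(round(proportion_within(1) * 100, 2))  # within 1 std dev
print(round(proportion_within(2) * 100, 2))  # within 2 std dev
```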
Symbols and Definitions: Variability
• Population Variance = $\sigma^2_X$
• Population Standard Deviation = $\sigma_X$
• Sample Variance = $S^2_X$
• Sample Standard Deviation = $S_X$
• Estimate of Population Variance = $s^2_X$
• Estimate of Population Standard Deviation = $s_X$
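A minimal sketch of how the sample values and the estimated population values differ. It assumes the common textbook convention that the sample variance divides by N while the estimate of the population variance divides by N - 1; the slide does not state the formulas, so that convention is an assumption here, and the scores are made up.

```python
from statistics import pvariance, pstdev, variance, stdev

scores = [2, 4, 4, 6, 9]          # hypothetical scores

sample_var = pvariance(scores)    # divides by N (assumed sample formula)
sample_sd = pstdev(scores)        # square root of sample_var
est_pop_var = variance(scores)    # divides by N - 1 (assumed estimate formula)
est_pop_sd = stdev(scores)        # square root of est_pop_var

print(sample_var, round(sample_sd, 3))
print(est_pop_var, round(est_pop_sd, 3))
```

Dividing by N - 1 always gives a slightly larger value than dividing by N, and the gap shrinks as N grows.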
Normal Distribution and the Standard Deviation
[Figure: frequency histogram of HEIGHT. Mean = 66.57, Var = 16.736, StdDev = 4.091. Marked on the x-axis: 58.38, 62.48, 70.66, 74.75 (approximately the mean ± 2 and ± 1 standard deviations)]
Normal Distribution and the Standard Deviation
• IQ is normally distributed with a mean of 100 and
standard deviation of 15
[Figure: normal curve of IQ scores, x-axis marked at 70, 85, 100, 115, and 130, with two regions each labeled 13%]
Understanding Z-Scores
The Next Step
• We now know enough to be able to accurately
describe a set of scores
– measurement scale
– shape of distribution
– central tendency (mean)
– variability (standard deviation)
• How does any one score compare to others in the
distribution?
The Next Step
• You score 82 on the first exam - is this good or bad?
• You paid $14 for your haircut - is this more or less
than most people?
• You watch 12 hours of tv per week - is this more or
less than most?
• To answer questions like these, we will learn to
transform scores into z-scores
– necessary because we usually do not know whether a
score is good or bad, high or low
Z-Scores
• Using z-scores will allow us to describe the relative
standing of the score
– how the score compares to others in the sample or
population
Frequency Distribution of Attractiveness Scores
Interpreting each score in relative terms:
Slug: below mean, low frequency score, percentile low
Binky: above mean, high frequency score, percentile medium
Biff: above mean, low frequency score, percentile high
To calculate these relative scores precisely, we use z-scores
Z-Scores
• We could figure out the percentiles exactly for every single
distribution
– the formula for the normal curve involves the constants e ≈ 2.7183 and π ≈ 3.1416
• But, this would be incredibly tedious
• Instead, mathematicians have figured out the percentiles for a
distribution with a mean of 0 and a standard deviation of 1
– A z-distribution
• What happens if our data doesn’t have a mean of 0 and standard
deviation of 1?
– Our scores really don’t have an intrinsic meaning
– We make them up
• We convert our scores to this scale - create z-scores
• Now, we can use the z-distribution tables in the book
Z-Scores
• First, compare the score to an “average” score
• Measure distance from the mean
– the deviation, $X - \bar{X}$
– Biff: 90 - 60 = +30
• Then divide the deviation by the standard deviation (here, 10)
– Biff: z = 30/10 = +3
– Biff is 3 standard deviations above the mean.
Z-Scores
• Therefore, the z-score simply describes the distance
from the score to the mean, measured in standard
deviation units
• There are two components to a z-score:
– positive or negative, corresponding to the score being
above or below the mean
– value of the z-score, corresponding to how far the score
is from the mean
Z-Scores
• Like any score, a z-score is a location on the
distribution. A z-score also automatically
communicates its distance from the mean
• A z-score describes a raw score’s location in terms
of how far above or below the mean it is when
measured in standard deviations
– therefore, a z-score is measured in units of standard deviations
Raw Score to Z-Score Formula
• The formula for computing a z-score for a raw score
in a sample is:
XX
z
SX
Z-Scores - Example
• Compute the z-scores for Slug and Binky
• Slug scored 35. Mean = 60, StdDev=10
• Slug: z = (35 - 60) / 10 = -25 / 10 = -2.5
• Binky scored 65. Mean = 60, StdDev=10
• Binky: z = (65 - 60) / 10 = 5 / 10 = +0.5
$z = \frac{X - \bar{X}}{S_X}$
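The worked examples for Slug and Binky can be checked with a short function. This is only a sketch; the function name `z_score` is an illustrative choice, not part of the lecture.

```python
def z_score(raw, mean, sd):
    """Distance of a raw score from the mean, in standard deviation units."""
    return (raw - mean) / sd

# Attractiveness scores from the slides: Mean = 60, StdDev = 10.
slug = z_score(35, mean=60, sd=10)
binky = z_score(65, mean=60, sd=10)
biff = z_score(90, mean=60, sd=10)

print(slug, binky, biff)  # -2.5 0.5 3.0
```

The sign says whether the score is above or below the mean, and the size says how far away it is.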
Z-Scores - Your Turn
XX
z
SX
• Compute the z-scores for the following heights in
the class. Mean = 66.57, StdDev=4.1
•
•
•
•
•
65 inches
66.57 inches
74 inches
53 inches
62 inches
Z-Scores - Your Turn
XX
z
SX
• Compute the z-scores for the following heights in
the class. Mean = 66.57, StdDev=4.1
•
•
•
•
•
65 inches: (65 - 66.57) / 4.1 = -1.57 / 4.1 = -0.38
66.57 inches: (66.57 - 66.57) / 4.1 = 0 / 4.1 = 0
74 inches: (74 - 66.57) / 4.1 = 7.43 / 4.1 = 1.81
53 inches: (53 - 66.57) / 4.1 = -13.57 / 4.1 = -3.31
62 inches: (62 - 66.57) / 4.1 = -4.57 / 4.1 = -1.11
Z-Score to Raw Score Formula
• When a z-score and the associated $S_X$ and $\bar{X}$ are known, we can calculate the original raw score. The formula for this is:
$X = (z)(S_X) + \bar{X}$
Z-Score to Raw Score : Example
$X = (z)(S_X) + \bar{X}$
• Attractiveness scores. Mean = 60, StdDev=10
• What raw score corresponds to the following z-scores?
• +1 : X = (1)(10) + 60 = 10 + 60 = 70
• -4 : X = (-4)(10) + 60 = -40 + 60 = 20
• +2.5: X = (2.5)(10) + 60 = 25 + 60 = 85
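The reverse conversion can be sketched the same way. The function name `raw_score` is hypothetical, introduced only for this example.

```python
def raw_score(z, mean, sd):
    """Convert a z-score back to a raw score: X = (z)(S_X) + mean."""
    return z * sd + mean

# Attractiveness scores from the slide: Mean = 60, StdDev = 10.
for z in (1, -4, 2.5):
    print(z, raw_score(z, mean=60, sd=10))
```

Applying `raw_score` to a z-score computed from the same mean and standard deviation returns the original raw score, so the two formulas undo each other.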
Z-Score to Raw Score : Your Turn
$X = (z)(S_X) + \bar{X}$
• Height in class. Mean=66.57, StdDev=4.1
• What raw score corresponds to the following z-scores?
• +2
• -2
• +3.5
• -0.5
Z-Score to Raw Score : Your Turn
$X = (z)(S_X) + \bar{X}$
• Height in class. Mean=66.57, StdDev=4.1
• What raw score corresponds to the following z-scores?
• +2: X = (2)(4.1) + 66.57 = 8.2 + 66.57 = 74.77
• -2: X = (-2)(4.1) + 66.57 = -8.2 + 66.57 = 58.37
• +3.5: X = (3.5)(4.1) + 66.57 = 14.35 + 66.57 = 80.92
• -0.5: X = (-0.5)(4.1) + 66.57 = -2.05 + 66.57 = 64.52
Using Z-scores
Uses of Z-Scores
• Describing the relative standing of scores
• Comparing scores from different distributions
• Computing the relative frequency of scores in any
distribution
• Describing and interpreting sample means
Z-Distribution
• A z-distribution is the distribution produced by
transforming all raw scores in the data into z-scores
• This will not change the shape of the distribution,
just the scores on the x-axis
• The advantage of looking at z-scores is that they directly communicate each score’s relative position
• z-score = 0
• z-score = +1
Distribution of Attractiveness Scores
[Figure: distribution of attractiveness scores, x-axis in raw scores]
Z-Distribution of Attractiveness Scores
[Figure: the same distribution, x-axis in z-scores]
• In a normal distribution, most scores lie between -3 and +3
Characteristics of the Z-Distribution
• A z-distribution always has the same shape as the raw score distribution
• The mean of any z-distribution always equals 0
• The standard deviation of any z-distribution always equals 1
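These characteristics can be checked on any set of scores. The sketch below uses made-up raw scores and assumes the standard deviation is computed with the divide-by-N formula (`statistics.pstdev`); with that choice the transformed scores have mean 0 and standard deviation 1, up to floating-point rounding.

```python
from statistics import mean, pstdev

raw = [35, 50, 55, 60, 60, 65, 70, 85]   # hypothetical raw scores

m = mean(raw)        # mean of the raw scores
s = pstdev(raw)      # standard deviation, dividing by N (assumption)

# Transform every raw score into a z-score.
z = [(x - m) / s for x in raw]

print(round(mean(z), 6), round(pstdev(z), 6))
```

The order and spacing of the scores relative to one another are unchanged, which is why the shape of the distribution stays the same.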
Characteristics of the Z-Distribution
• Because of these characteristics, all normal z-distributions are similar
• A particular z-score will be at the same relative location on every distribution
– Attractiveness: z-score = +1
– Height: z-score = +1
• You should interpret z-scores by imagining their location on the distribution
Using Z-Scores to compare variables
• On your first Stats exam, you get a 21. On your first Abnormal Psych exam you get an 87. How can you compare these two scores?
• The solution is to transform the scores into z-scores; then they can be compared directly
• z-scores are often called standard scores
Using Z-Scores to compare variables
• Stats exam, you got 21. Mean = 17, StdDev = 2
• Abnormal exam you got 87. Mean = 85, StdDev = 3
• Stats Z-score: (21-17)/2 = 4/2 = +2
• Abnormal Z-score: (87-85)/3 = 2/3 = +0.67
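A short sketch of the comparison, using the means and standard deviations given on this slide. The helper name `z_score` is an illustrative choice.

```python
def z_score(raw, mean, sd):
    """Distance of a raw score from the mean, in standard deviation units."""
    return (raw - mean) / sd

stats_z = z_score(21, mean=17, sd=2)       # Stats exam
abnormal_z = z_score(87, mean=85, sd=3)    # Abnormal Psych exam

# The larger z-score marks the better performance relative to the class.
better = "Stats" if stats_z > abnormal_z else "Abnormal Psych"
print(stats_z, round(abnormal_z, 2), better)
```

Although 87 is the larger raw score, the Stats score sits further above its own class mean once both are expressed in standard deviation units.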
Comparison of two Z-Distributions
Stats: $\bar{X} = 30$, $S_X = 5$
English: $\bar{X} = 40$, $S_X = 10$
Millie scored 20
Millie scored 30
Althea scored 38
Althea scored 45
Using Z-Scores to compute relative frequency
• Remember your score on the first stats exam:
• Stats z-score: (21-17)/2 = 4/2 = +2
• So, you scored 2 standard deviations above the
mean
• Can we compute how many scores were better and
worse than 2 standard deviations above the mean?
Proportions of Area under the Standard Normal
Curve
Relative Frequency
• Relative frequency can be computed using the proportion of the total area under the curve.
• The relative frequency of a particular z-score will be the same on all normal z-distributions.
• The standard normal curve serves as a model for any approximately normal z-distribution
Z-Scores
XX
z
SX
• z-scores for the following heights in the class.
– Mean = 66.57, StdDev=4.1
•
•
•
•
•
65 inches: (65 - 66.57) / 4.1 = -1.57 / 4.1 = -0.38
66.57 inches: (66.57 - 66.57) / 4.1 = 0 / 4.1 = 0
74 inches: (74 - 66.57) / 4.1 = 7.43 / 4.1 = 1.81
53 inches: (53 - 66.57) / 4.1 = -13.57 / 4.1 = -3.31
62 inches: (62 - 66.57) / 4.1 = -4.57 / 4.1 = -1.11
• What are the relative frequencies of these heights?
Z-Scores
• How can we find the exact relative frequencies for
these z-scores?
• 65 inches: z = -0.38
• 66.57 inches: z = 0
• 74 inches: z = 1.81
• 53 inches: z = -3.31
• 62 inches: z = -1.11
Proportions of Area under the Standard Normal Curve
[Figure: standard normal curve with z = -0.38 marked]
• How many scores lie in this portion of the curve?
Z-Scores
• To find out the relative frequencies for a particular
z-score, we use a set of standard tables
– z-tables
– They’re in the book
• 65 inches: z = -0.38

Z       area between mean & z    area beyond z in tail
0.38    0.1480                   0.3520
Proportions of Area under the Standard Normal Curve
[Figure: standard normal curve with z = -0.38 marked]
• 0.3520 of scores lie between this z-score and the tail
• 0.1480 of scores lie between this z-score and the mean
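The two table columns for z = -0.38 can be reproduced without the printed table. This is a sketch assuming Python's standard-library `statistics.NormalDist`; the function name `table_areas` is hypothetical. Because the curve is symmetric, the table is entered with the absolute value of z.

```python
from statistics import NormalDist

def table_areas(z):
    """Return (area between mean and z, area beyond z in the tail)."""
    below = NormalDist().cdf(abs(z))   # proportion of scores below |z|
    return below - 0.5, 1 - below

between, beyond = table_areas(-0.38)
print(round(between, 4), round(beyond, 4))
```

The two areas always add up to 0.5, since together they cover exactly one half of the curve.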
Z-Scores - Your turn
• Find out what percentage of people are taller than
the heights given below:
– z-tables
• 65 inches: z = -0.38
• 66.57 inches: z = 0
• 74 inches: z = 1.81
• 53 inches: z = -3.31
• 62 inches: z = -1.11
Z-Scores - Your turn
• Find out what percentage of people are taller than
the heights given below:
– z-tables
• 65 inches: z = -0.38, 64.8% are taller
• 66.57 inches: z = 0, 50% are taller
• 74 inches: z = 1.81, 3.51% are taller
• 53 inches: z = -3.31, 99.95% are taller
• 62 inches: z = -1.11, 86.65% are taller
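The percentages for this exercise can be checked with a few lines of code. The sketch assumes heights are approximately normally distributed (the same assumption the z-table relies on) and uses Python's standard-library `statistics.NormalDist`; the name `percent_taller` is hypothetical.

```python
from statistics import NormalDist

def percent_taller(z):
    """Percentage of the distribution lying above a given z-score."""
    return (1 - NormalDist().cdf(z)) * 100

# Heights in inches mapped to the z-scores computed in the lecture.
heights_z = {65: -0.38, 66.57: 0.0, 74: 1.81, 53: -3.31, 62: -1.11}

for height, z in heights_z.items():
    print(f"{height} inches: {percent_taller(z):.2f}% are taller")
```

For a negative z-score, the area between the mean and z is added to the 50% above the mean; for a positive z-score, only the tail beyond z counts.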
Using Z-scores to describe sample means
Sampling Distribution of Means
• We can now describe the relative position of a
particular score on a distribution
• What if instead of a single score, we want to see
how a particular sample of scores fit on the
distribution?
Sampling Distribution of Means
• For example, we want to know if students who sit in
the back score better or worse on exams than others
• Now, we are no longer interested in a single score’s relative position, but in that of a sample of scores
• What is the best way to describe a sample?
– by its mean
• So, we want to find the relative position of a sample mean
Sampling Distribution of Means
• To find the relative position of a sample mean, we
need to compare it to a distribution of sample
means
– just as, to find the relative position of a particular score, we needed to compare it to a distribution of scores
• So first we need to create a new distribution, a
distribution of sample means
• How to do this?
Sampling Distribution of Means
• We want to compare the people in a sample to
everyone else
• To create a distribution of sample means, we can
select 10 names at random from the population and
calculate the mean of this sample
• $\bar{X}_1 = 3.1$
• Do this over and over again, randomly selecting 10 people at a time
• $\bar{X}_2 = 3.3$, $\bar{X}_3 = 3.0$, $\bar{X}_4 = 2.9$, $\bar{X}_5 = 3.1$, $\bar{X}_6 = 3.2$, etc.
Sampling Distribution of Means
[Figure: frequency distribution of sample means, x-axis from 2.3 to 3.9]
• Each score is not a raw score, but is instead a sample mean
Sampling Distribution of Means
• In reality, we cannot infinitely draw samples from
our population, but we know what the distribution
would be like
• The central limit theorem defines the shape, mean
and standard deviation of the sampling distribution
Central Limit Theorem
• The central limit theorem allows us to envision the
sampling distribution of means that would be
created by exhaustive random sampling of any raw
score distribution.
Sampling Distribution of Means: Characteristics
• A sampling distribution is approximately normal
• The mean of the sampling distribution ($\mu$) is the same as the mean of the raw scores
• The standard deviation of the sampling distribution ($\sigma_{\bar{X}}$) is related to the standard deviation of the raw scores
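These characteristics can be seen in a small simulation. The sketch below builds a made-up population (its mean of 3.1 and spread of 0.6 are assumptions for illustration, not class data), repeatedly draws random samples of 10 as described in the lecture, and records each sample mean. A fixed seed keeps the run repeatable.

```python
import random
from statistics import mean, pstdev

rng = random.Random(230)  # fixed seed so the sketch is repeatable

# Hypothetical population of 10,000 raw scores.
population = [rng.gauss(3.1, 0.6) for _ in range(10_000)]

# Draw 2,000 random samples of 10 and record each sample's mean.
sample_means = [mean(rng.sample(population, 10)) for _ in range(2_000)]

print("means:", round(mean(population), 2), round(mean(sample_means), 2))
print("std devs:", round(pstdev(population), 2), round(pstdev(sample_means), 2))
```

The mean of the sample means lands very close to the population mean, while their standard deviation is clearly smaller than that of the raw scores, because averaging 10 scores smooths out extreme values.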
Sampling Distribution of Means
[Figure: frequency distribution of sample means, x-axis from 2.3 to 3.9]
• Shape of distribution is normal
• Mean is the same as raw score mean
• SD related to raw score SD
Standard Error of the Mean
• The standard deviation of the sampling distribution
of means is called the standard error of the mean.
The formula for the true standard error of the
mean is:
$\sigma_{\bar{X}} = \frac{\sigma_X}{\sqrt{N}}$
Standard Error of the Mean - Example
• Estimating Professor’s Age:
• N = 197
• Standard deviation ($\sigma_X$) = 4.39
• Standard error of the mean = 4.39 / √197 = 4.39 / 14.04 = 0.31
$\sigma_{\bar{X}} = \frac{\sigma_X}{\sqrt{N}}$
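The standard error calculation in this example can be sketched as a one-line function. The name `standard_error` is an illustrative choice.

```python
from math import sqrt

def standard_error(sigma, n):
    """Standard error of the mean: population std dev divided by sqrt(N)."""
    return sigma / sqrt(n)

se = standard_error(sigma=4.39, n=197)
print(round(se, 2))  # 0.31
```

Because N sits under a square root, quadrupling the sample size only halves the standard error.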
Z-Score Formula for a Sample Mean
• The formula for computing a z-score for a sample
mean is:
$z = \frac{\bar{X} - \mu}{\sigma_{\bar{X}}}$
Z-Score for a Sample Mean - Example
• Mean of population = 36
• Mean of sample = 34
• Standard error of the mean = 0.31
• z = (34 - 36) / 0.31 = -2 / 0.31 = -6.45
$z = \frac{\bar{X} - \mu}{\sigma_{\bar{X}}}$
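The same arithmetic as a small function, using the values from this example. The name `z_for_sample_mean` is hypothetical.

```python
def z_for_sample_mean(sample_mean, pop_mean, std_error):
    """z-score of a sample mean on the sampling distribution of means."""
    return (sample_mean - pop_mean) / std_error

z = z_for_sample_mean(sample_mean=34, pop_mean=36, std_error=0.31)
print(round(z, 2))  # -6.45
```

The structure matches the raw-score formula: a deviation from the mean divided by a standard deviation, except that the standard deviation here is the standard error of the mean.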
Sampling Distribution of Means - Why?
• We want to compare the people in a sample to everyone else in the population
• Creating a sampling distribution gives us a normal distribution of all possible sample means
• Once we have this, we can determine the relative standing of our sample
– use z-scores to find the relative frequency
Done for today
• Read for next week.
• Pick up quizzes at front of class.