Transcript Chapter 1

Chapter 3
The Normal Curve
Where have we been?
To calculate SS, the variance, and the
standard deviation: find the deviations from
mu, square and sum them (SS), divide by N (sigma^2),
and take the square root (sigma).
Example: Scores on a Psychology quiz
Student     X     X - mu    (X - mu)^2
John        7     +1.00     1.00
Jennifer    8     +2.00     4.00
Arthur      3     -3.00     9.00
Patrick     5     -1.00     1.00
Marie       7     +1.00     1.00
Sum of X = 30, N = 5, mu = 6.00
Sum of (X - mu) = 0.00
Sum of (X - mu)^2 = SS = 16.00
sigma^2 = SS/N = 3.20
sigma = square root of 3.20 = 1.79
The variance and standard
deviation are numbers that
describe how far, on the
average, scores are from
their mean, mu.
But we often want additional detail about how
scores will fall around their mean.
We may also wish to theorize about how scores
should fall around their mean.
Describing and theorizing
about how scores fall
around their mean.
Frequency distributions
Stem and leaf displays
Bar graphs and histograms
Theoretical frequency distributions
Frequency distributions
# of         Absolute     Cumulative    Cumulative Relative
accidents    Frequency    Frequency     Frequency
0            117          117           .165
1            157          274           .387
2            158          432           .610
3            115          547           .773
4             78          625           .883
5             44          669           .945
6             21          690           .975
7              7          697           .984
8              6          703           .993
9              1          704           .994
10             3          707           .999
11             1          708           1.000
Total        708
Cumulative frequencies
show number of scores
at or below each point.
Calculate by adding the
frequencies at or below each point.
Cumulative relative
frequencies show the
proportion of scores at
or below each point.
Calculate by dividing
cumulative frequencies
by N at each point.
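As a rough sketch, both cumulative columns can be rebuilt from the absolute frequencies alone. The list below copies the accident frequencies from the table; the variable names are assumptions made for illustration.

```python
from itertools import accumulate

# Absolute frequencies for 0 through 11 accidents
freqs = [117, 157, 158, 115, 78, 44, 21, 7, 6, 1, 3, 1]

n = sum(freqs)                      # total number of scores
cum = list(accumulate(freqs))       # running total = cumulative frequency
cum_rel = [c / n for c in cum]      # divide by N = cumulative relative frequency

for accidents, (f, c, r) in enumerate(zip(freqs, cum, cum_rel)):
    print(f"{accidents:>2}  {f:>4}  {c:>4}  {r:.3f}")
```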
Stem and Leaf Display
Reading time data
Interval size (i) = .05
Number of intervals (#i) = 10
Reading
Time     Leaves
2.9      5,5,6,6,6,6,8,8,9
2.9      0,0,1,2,3,3,3
2.8      5,5,5,5,5,6,6,6,7,7,7,7,7,7,7,8,9,9,9,9
2.8      0,0,1,2,3,3,3,3,4,4,4
2.7      5,5,5,5,6,6,6,8,9,9
2.7      0,0,0,1,2,3,3,3,4,4
2.6      5,6,6,6
2.6      0,1,1,1,2,3,3,4
2.5      6,6,8,8,8,8,8,9,9,9
2.5      0,1,1,1,2,2,2,4,4,4,4
Transition to Histograms
[Figure: the stem-and-leaf display turned on its side. The leaves for each interval, from 2.50 – 2.54 through 2.95 – 2.99, are stacked in a column above that interval, so the height of each column shows its frequency.]
Histogram of reading
times
[Figure: histogram. Y axis: Frequency, 0 to 20 in steps of 2. X axis: Reading Time (seconds), in intervals of .05 from 2.50 – 2.54 through 2.95 – 2.99.]
Normal Curve
Principles of theoretical
frequency distributions
Expected frequency = Theoretical relative
frequency X N
Expected frequencies are your best estimates
because they are closer, on the average, than
any other estimate when we square the error.
Law of Large Numbers - The more observations
that we have, the closer the relative frequencies
should come to the theoretical distribution.
Using the theoretical
frequency distribution
known as the normal
curve
The Normal Curve
Described mathematically by Gauss in 1809, so
it is also called the “Gaussian” distribution. It
looks something like a bell, so it is also called a
“bell shaped” curve.
The normal curve is a figural representation of a
theoretical frequency distribution.
The frequency distribution represented by the
normal curve is symmetrical.
The mean (mu) falls exactly in the middle.
68.26% of scores fall within 1 standard deviation of the mean.
95.44% of scores fall within 2 standard deviations of the mean.
99.74% of scores fall within 3 standard deviations of mu.
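These percentages can be reproduced, approximately, from the error function in Python's standard library. This is a sketch rather than the method used to build the slides: the slide figures come from doubling four-place Z table entries (for example 2 x .3413), so the exact values printed here differ by about .01 in the last digit.

```python
from math import erf, sqrt

def within(k):
    """Proportion of a normal curve within k standard deviations of the mean."""
    return erf(k / sqrt(2))

for k in (1, 2, 3):
    print(f"within {k} SD of the mean: {100 * within(k):.2f}%")
```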
IMPORTANT CONCEPT:
Since the curve is
symmetrical around the
mean, whatever happens
on one side of the curve is
exactly mirrored on the
other side.
The normal curve and Z scores
The normal curve is the theoretical relative
frequency distribution that underlies most
variables that are of interest to psychologists.
A Z score expresses the number of standard
deviations that a score is above or below the
mean in a normal distribution.
Any point on a normal curve can be referred to
with a Z score
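A Z score is computed as (score - mu) / sigma. The sketch below assumes, purely for illustration, a distribution with mu = 500 and sigma = 100; those two values are not stated on this slide, although they are consistent with the score of 470 that appears at Z = -0.30 in the later figures.

```python
def z_score(x, mu, sigma):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mu) / sigma

# Assumed parameters for illustration only
mu, sigma = 500, 100

print(z_score(470, mu, sigma))   # a score below the mean gives a negative Z
print(z_score(530, mu, sigma))   # a score above the mean gives a positive Z
```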
The Z table and the curve
The Z table shows the normal curve in tabular
form as a cumulative relative frequency
distribution.
That is, the Z table lists the proportion of a
normal curve between the mean and points
further and further from the mean.
The Z table shows only the cumulative
proportion in one half of the curve. The highest
proportion possible on the Z table is therefore
.5000
Why does the Z table show
cumulative relative frequencies
only for half the curve?
- The cumulative relative frequencies for half the curve
are all one needs for all relevant calculations.
- Remember, the curve is symmetrical.
- So the proportion of the curve between the mean and a
specific Z score is the same whether the Z score is
above the mean (and therefore positive) or below the
mean (and therefore negative).
- Separately showing both sides of the curve in the Z
table would therefore be redundant and (unnecessarily)
make the table twice as long.
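The symmetry argument can be sketched in code. The helper below, whose name mu_to_z is an assumption for illustration, stands in for a Z table lookup: it takes the absolute value of Z, so a positive and a negative Z of the same size return the same proportion. Rounding to four places mimics a printed table.

```python
from math import erf, sqrt

def mu_to_z(z):
    """Proportion of the curve between the mean and z, rounded like a Z table."""
    return round(0.5 * erf(abs(z) / sqrt(2)), 4)

print(mu_to_z(0.30), mu_to_z(-0.30))   # same proportion on either side
print(mu_to_z(1.00), mu_to_z(-1.00))
```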
IMPORTANT CONCEPT:
The proportion of the curve
between any two points on the
curve represents the theoretical
relative frequency (TRF) of
scores between those points.
Area of the curve between
two points
If the area of the curve between two
points is 56.32% of the curve, we would
expect to find a proportion of .5632 of the
scores between those two points.
With a little arithmetic, using the
Z table, we can determine:
The proportion of the curve above or below any Z
score.
Which equals the proportion of the scores
we can expect to find above or below any
Z score.
The proportion of the curve between any two Z
scores.
Which equals the proportion of the scores
we can expect to find between any two Z
scores.
Normal Curve – Basic
Geography
[Figure: normal curve. Y axis: Frequency. X axis: standard deviations from the mean (3, 2, 1 below; 0 at the mean; 1, 2, 3 above), with the matching Z scores from -3.00 to +3.00.]
Percentages of the curve between the mean and each point (the same on both sides):
Mean to 1 standard deviation (Z = 1.00): 34.13%
Mean to 2 standard deviations (Z = 2.00): 47.72%
Mean to 3 standard deviations (Z = 3.00): 49.87%
The Z table
The Z table contains pairs of
columns: columns of Z scores
coordinated with columns of
proportions from mu to Z.
The columns of proportions show
the proportion of the scores that
can be expected to lie between
the mean and any other point on
the curve.
The Z table shows the cumulative
relative frequencies for half the
curve.
Z          Proportion
Score      mu to Z
0.00       .0000
0.01       .0040
0.02       .0080
0.03       .0120
0.04       .0160
.          .
1.960      .4750
2.576      .4950
.          .
3.90       .49995
4.00       .49997
4.50       .499997
5.00       .4999997
Another important concept: Most
scores are close to the mean! So if
you have two equal sized intervals, the
one closer to mu contains a higher
proportion of scores
What proportion of scores falls in the interval
between Zs of -.50 to +.50 (an interval of one
standard deviation right around the mean)?
Proportion = .1915 + .1915 = .3830 (almost 40%)
Note: this is the one-standard-deviation-wide interval
with the highest proportion anywhere on the curve.
Intervals further from mu
What proportion of scores falls in the interval between Zs of 0.00
to +1.00 (an interval of one standard deviation starting at the
mean, but not right around it)?
This one can be read directly from the table - .3413
(It is a little over a third)
What proportion of scores falls in the interval between Zs of +.50
to +1.50 (an interval of one standard deviation a little further
from the mean)?
.4332 - .1915 = .2417
This time we are down to less than a quarter of the population.
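The three one-standard-deviation-wide intervals can be compared side by side. This sketch uses an illustrative helper, mu_to_z, in place of a Z table lookup (rounded to four places, as a printed table would be).

```python
from math import erf, sqrt

def mu_to_z(z):
    """Proportion of the curve between the mean and z, rounded like a Z table."""
    return round(0.5 * erf(abs(z) / sqrt(2)), 4)

centered = mu_to_z(-0.50) + mu_to_z(0.50)   # -.50 to +.50, right around the mean
from_mean = mu_to_z(1.00)                   # 0.00 to +1.00, read directly
shifted = mu_to_z(1.50) - mu_to_z(0.50)     # +.50 to +1.50, further out

print(f"{centered:.4f}  {from_mean:.4f}  {shifted:.4f}")
```

Each interval is exactly one standard deviation wide, yet the proportion shrinks as the interval moves away from mu.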
Common Z scores – memorize
these scores and proportions
Z          Proportion
Score      mu to Z
0.00       .0000
1.00       .3413
2.00       .4772
3.00       .4987
1.960      .4750   (x 2 = 95% between Z = -1.960 and Z = +1.960)
2.576      .4950   (x 2 = 99% between Z = -2.576 and Z = +2.576)
USING THE Z TABLE - Proportion of the scores
between a specific Z score and the mean.
[Figure: normal curve. Y axis: Frequency. X axis: standard deviations, with a score of 470 marked at Z = -0.30.]
Proportion mu to Z for -0.30 = .1179
Proportion score to mean = .1179
USING THE Z TABLE - Proportion of the scores in
a population between two Z scores that are identical
in size, but have opposite signs.
[Figure: normal curve. Y axis: Frequency. X axis: standard deviations, with scores of 470 (Z = -0.30) and 530 (Z = +0.30) marked.]
Proportion mu to Z for -0.30 = .1179
Proportion between +Z and -Z = .1179 + .1179 = .2358
The critical values of the
normal curve
Critical values of a distribution show which
symmetrical interval around mu contains
95% and 99% of the curve.
In the Z table, the critical values are
starred and shown to three decimal places
95% (a proportion of .9500) is found
between Z scores of –1.960 and +1.960
99% (a proportion of .9900) is found
between Z scores of –2.576 and +2.576
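As a check, the two critical values can be recovered from Python's statistics.NormalDist (available in Python 3.8 and later). This is a sketch, not the way the Z table itself was built: it asks for the Z score that leaves half of the remaining 5% (or 1%) in the upper tail.

```python
from statistics import NormalDist

std_normal = NormalDist()   # standard normal curve: mean 0, standard deviation 1

for central in (0.95, 0.99):
    tail = (1 - central) / 2                  # proportion left in each tail
    z_crit = std_normal.inv_cdf(1 - tail)     # Z with that proportion above it
    print(f"{central:.0%} of the curve lies between Z = -{z_crit:.3f} and Z = +{z_crit:.3f}")
```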
USING THE Z TABLE – Proportion of the scores in
a population above a specific Z score.
[Figure: normal curve. Y axis: Frequency. X axis: standard deviations, with a score of 470 marked at Z = -0.30.]
Proportion mu to Z for -0.30 = .1179
Proportion above score = .1179 + .5000 = .6179
USING THE Z TABLE - Proportion of scores
between two different Z scores on opposite sides of
the mean. (ADD THE TWO PROPORTIONS!).
[Figure: normal curve. Y axis: Frequency. X axis: Z scores from -3.00 to 3.00, with -1.06 and +0.37 marked.]
Proportion mu to Z for -1.06 = .3554
Proportion mu to Z for +0.37 = .1443
Percent between two scores.
                  Area        Area        Add/Sub     Total
Z1       Z2       mu to Z1    mu to Z2    Z1 to Z2    Area      Percent
-1.06    +0.37    .3554       .1443       Add         .4997     49.97%
USING THE Z TABLE - Proportion of scores
between two Z scores on the same side of the mean.
(Subtract the smaller proportion from the larger one.)
[Figure: normal curve. Y axis: Frequency. X axis: Z scores from -3.00 to 3.00, with +1.12 and +1.50 marked.]
Proportion mu to Z for 1.12 = .3686
Proportion mu to Z for 1.50 = .4332
Percent between two scores.
                  Area        Area        Add/Sub     Total
Z1       Z2       mu to Z1    mu to Z2    Z1 to Z2    Area      Percent
+1.50    +1.12    .4332       .3686       Sub         .0646     6.46%
What proportion of the curve falls
between a Z of -.30 and +1.30?
What proportion of the curve falls
between a Z of +.30 and +1.30?
Add the proportions if Z
scores are on opposite sides
of the mean, subtract if the
scores are on the same side
Between the mean and a Z of -.30 there
is a proportion of .1179.
Between the mean and a Z of +1.30 there
is a proportion of .4032
Between -.30 and +1.30 there is a
proportion of .1179 +.4032 = .5211
Between +.30 and +1.30 there is a
proportion of .4032-.1179 = .2853
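The add-or-subtract rule can be written as one small function. This is a minimal sketch: mu_to_z and proportion_between are illustrative names, and mu_to_z stands in for a four-place Z table lookup.

```python
from math import erf, sqrt

def mu_to_z(z):
    """Proportion of the curve between the mean and z, rounded like a Z table."""
    return round(0.5 * erf(abs(z) / sqrt(2)), 4)

def proportion_between(z1, z2):
    """Proportion of the curve between two Z scores."""
    a1, a2 = mu_to_z(z1), mu_to_z(z2)
    if z1 * z2 < 0:                  # opposite sides of the mean: add
        return round(a1 + a2, 4)
    return round(abs(a1 - a2), 4)    # same side (or one Z at the mean): subtract

print(proportion_between(-0.30, 1.30))   # .1179 + .4032
print(proportion_between(0.30, 1.30))    # .4032 - .1179
```

The same function handles the two earlier table examples, proportion_between(-1.06, 0.37) and proportion_between(1.12, 1.50).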
Obtaining expected
frequencies (EF) from the
normal curve.
Basic rule: To find an expected frequency,
multiply the proportion of scores expected
in the part of the curve by the total N.
Expected frequency = theoretical relative
frequency x N.
Expected frequencies are
another least squares,
unbiased prediction.
Expected frequencies usually must be
wrong, as they are routinely written to
two decimal places, while it is impossible
to actually find 65 hundredths of a score
anywhere.
So, expected frequencies are another set of
least squares, unbiased predictions.
Such predictions can be expected to be
wrong, but close.
EF=TRF x N
In the examples that I’ve solved that follow, let’s
assume we have a population of size 300
(N=300)
To find the expected frequency, compute the
proportion of the curve between two specific Z
scores, just as we have been doing.
Then multiply that proportion (also called the
theoretical relative frequency or TRF) by N.
Expected frequency = theoretical relative frequency x number of
participants (EF=TRF*N). TRF from mean to Z = -.30 = .1179.
If N = 300: EF= .1179*300 = 35.37.
[Figure: normal curve. Y axis: Frequency. X axis: standard deviations, with a score of 470 marked at Z = -0.30.]
Proportion mu to Z for -0.30 = .1179
EF = .1179 x 300 = 35.37
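The EF = TRF x N rule is a single multiplication, sketched below for the solved example (Z = -.30, N = 300). The helper names are illustrative, and mu_to_z stands in for the Z table lookup.

```python
from math import erf, sqrt

def mu_to_z(z):
    """Proportion of the curve between the mean and z, rounded like a Z table."""
    return round(0.5 * erf(abs(z) / sqrt(2)), 4)

def expected_frequency(trf, n):
    """Expected frequency = theoretical relative frequency x N."""
    return trf * n

trf = mu_to_z(-0.30)                 # proportion between the mean and Z = -.30
ef = expected_frequency(trf, 300)    # population size from the solved example
print(f"TRF = {trf:.4f}, EF = {ef:.2f}")
```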
In the questions that you will
solve that follow, assume
N=500.
If N=500, how many
participants should score
between the mean and a Z
score of +1.30?
Expected frequency = theoretical relative
frequency x number of participants
(EF=TRF * N).
TRF from mean to Z = +1.30 = .4032.
If N = 500: EF= .4032 * 500 = 201.60
Expected frequency above
a score.
This is like asking the EF between your Z score and a
Z score of +100.00. Half the curve lies between mu
and a Z score of +100.00
If Z is below mu, TRF is between two Z scores on
opposite sides of the mean. To get TRF, add half of
the curve (.5000) to the area from mu to Z. To get EF,
then multiply TRF by N.
If Z is above mu, TRF is between two Z scores on the
same side of the mean. To get TRF, subtract the area
from mu to Z from half of the curve (.5000).
To get EF, then multiply TRF by N
TRF above Z of -.30 is .1179 + .5000 = .6179.
If N = 300: EF=.6179 x 300 = 185.37
[Figure: normal curve. Y axis: Frequency. X axis: standard deviations, with Z = -.30 marked.]
Proportion mu to Z for -.30 = .1179
Proportion above score = .1179 + .5000 = .6179
If N=500, how many
participants should score
above a Z score of +1.30?
Expected frequency = theoretical relative
frequency x number of participants (EF=TRF*N).
Z is above the mean, so you must subtract the
TRF between the mean and Z from the
proportion of the curve above the mean (which
is half the curve or a proportion of .5000).
TRF from mu to Z = +1.30 = .4032. N = 500:
EF= (.5000-.4032) * 500 = .0968*500 =48.40
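Both cases for an expected frequency above a score (Z below mu: add; Z above mu: subtract) can be sketched in one function. The names are illustrative, and mu_to_z stands in for the Z table lookup.

```python
from math import erf, sqrt

def mu_to_z(z):
    """Proportion of the curve between the mean and z, rounded like a Z table."""
    return round(0.5 * erf(abs(z) / sqrt(2)), 4)

def proportion_above(z):
    """TRF above z: add to .5000 if z is below mu, subtract if z is above mu."""
    area = mu_to_z(z)
    trf = 0.5 + area if z < 0 else 0.5 - area
    return round(trf, 4)

def ef_above(z, n):
    """Expected frequency above z in a population of size n."""
    return proportion_above(z) * n

print(f"{ef_above(-0.30, 300):.2f}")   # Z below the mean, N = 300
print(f"{ef_above(1.30, 500):.2f}")    # Z above the mean, N = 500
```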
EF below a score.
This is the opposite of expected frequencies above a
score. It is like asking the EF between your Z score
and a Z score of -100.00. Half the curve lies between
mu and a Z score of -100.00
If Z is above mu, TRF is between two Z scores on
opposite sides of the mean. To get TRF, add half of
the curve (.5000) to the area from mu to Z. To get EF,
then multiply TRF by N.
If Z is below mu, TRF is between two Z scores on
the same side of the mean. To get TRF, subtract the
area from mu to Z from half of the curve (.5000).
To get EF, then multiply TRF by N
If N = 300, what is the EF of scores
below a Z of +1.00?
Expected frequency below a score: If Z is
above mu, to get TRF, add half of the curve
(.5000) to the area from mu to Z.
TRF below Z = +1.00 is .3413 + .5000 =
.8413.
If N = 300: EF=.8413 x 300 = 252.39.
[Figure: normal curve. Y axis: Frequency. X axis: standard deviations (scores in inches).]
Proportion = .5000 up to mean + .3413 for 1 SD = .8413
Percentile Rank
Percentile rank is the proportion
of the population you score as
well as or better than times 100.
The proportion you score as well
as or better than is shown by the
part of the curve below (to the
left of) your score.
Computing percentile rank
Above the mean, add the proportion of the
curve from mu to Z to .5000.
Below the mean, subtract the proportion of
the curve from mu to Z from .5000.
In either case, then multiply by 100 and
round to the nearest integer
(if 1st to 99th).
For example, a Z score of –2.10
Proportion mu to Z = .4821
Proportion at or below Z = .5000 - .4821 =.0179
Percentile = .0179 x 100 = 1.79 = 2nd percentile
Percentile Rank is the percent of the population you score as
well as or better than = TRF below your Z score times 100. What is
the percentile rank of someone with a Z score of +1.00?
[Figure: normal curve. Y axis: Frequency. X axis: standard deviations (scores in inches).]
Percentile: .5000 up to mean + .3413 = .8413
.8413 x 100 = 84.13 = 84th percentile
A rule about rounding
percentile rank
- Between the 1st and 99th percentiles, you round off to
the nearest integer.
- Below the first percentile and above the 99th, use as
many decimal places as necessary to express percentile
rank.
- For example, someone who scores at Z=+1.00 is at the
100(.5000+.3413) = 84.13 = 84th percentile.
- Alternatively, someone who scores at Z=+3.00 is at the
100(.5000+.4987)=99.87= 99.87th percentile. Above 99
and below 1, don’t round to integers.
- We never say that someone is at the 0th or 100th
percentile.
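The percentile rank procedure, including the rounding rule, can be sketched as follows. The function names are illustrative, mu_to_z stands in for a four-place Z table lookup, and two decimal places are assumed to be enough outside the 1st to 99th range.

```python
from math import erf, floor, sqrt

def mu_to_z(z):
    """Proportion of the curve between the mean and z, rounded like a Z table."""
    return round(0.5 * erf(abs(z) / sqrt(2)), 4)

def percentile_rank(z):
    """Percent of the curve at or below z, rounded by the rule above."""
    area = mu_to_z(z)
    below = 0.5 + area if z >= 0 else 0.5 - area   # add above mu, subtract below
    pct = round(below * 100, 2)
    if 1 <= pct <= 99:
        return floor(pct + 0.5)    # 1st to 99th: nearest integer
    return pct                     # below 1st or above 99th: don't round

for z in (-2.10, 1.00, 3.00):
    print(z, percentile_rank(z))
```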
Calculate percentiles
Z         Area        Add to .5000 (if Z > 0)       Proportion     Percentile
Score     mu to Z     Sub from .5000 (if Z < 0)     at or below
-2.22     .4868       .5000 - .4868                 .0132          1st
-0.68     .2517       .5000 - .2517                 .2483          25th
+2.10     .4821       .5000 + .4821                 .9821          98th
+0.33     .1293       .5000 + .1293                 .6293          63rd
+0.00     .0000       .5000 +/- .0000               .5000          50th
Below the 1st percentile
and above the 99th: Don’t
round!
What percentile are you at if your Z score is
+3.04?
Area mu to Z = .4988.
Since Z is above the mean, add proportion mu
to Z to .5000
Percentile = (.4988+.5000)*100 = 99.88
Above 99th percentile, DON’T ROUND!
The answer is the 99.88th percentile