Normal distribution - IB Class of 2011 @ Gateway

The Normal Distribution
“the bell curve”
Some slides downloaded from
www.registart.co.uk
The Most Important Distribution in
Statistics!

Describes the characteristics of many real-world data sets:
– test scores for large groups of students
– actual sizes (length, width) of jeans at Kohl’s
– eyesight of all 20-year-olds in Kissimmee
– actual lifetimes of 1000 AA batteries
– testosterone level of all male students at GHS
– length of middle finger of 250 students
Characteristics




Symmetric, bell-shaped curve.
X can take any value (continuous RV)
Shape of curve depends on 2 parameters:
– Center of distn is population mean μ
– Spread is determined by std deviation σ
Most values fall around the mean, but a few values are much smaller and a few are much larger (equal chance).
Probability Density Function (PDF)

“X is distributed normally with mean μ and standard deviation σ”
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}}, \qquad -\infty < x < \infty$$
Shape Depends on Mean, Std. Dev
Bell-shaped curve
(Figure: two density curves for Grades, x-axis from 40 to 100, y-axis Density from 0.00 to 0.08: a taller, narrower curve for Mean = 70, SD = 5 and a lower, wider curve for Mean = 70, SD = 10)
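The effect of the standard deviation on the curve can be checked numerically. The following is a minimal sketch in Python, not part of the original slides; the helper name `normal_pdf` is made up for illustration, and the standard-library `statistics.NormalDist` is used only as a cross-check.

```python
import math
from statistics import NormalDist

def normal_pdf(x, mu, sigma):
    """Normal density f(x) for mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

# Same mean, different spread (the two curves on the slide).
peak_narrow = normal_pdf(70, 70, 5)    # height at the mean when SD = 5
peak_wide = normal_pdf(70, 70, 10)     # height at the mean when SD = 10

print(round(peak_narrow, 4), round(peak_wide, 4))

# Cross-check one point against the standard library.
print(round(NormalDist(70, 5).pdf(60), 4), round(normal_pdf(60, 70, 5), 4))
```

Doubling the standard deviation halves the height of the peak, because the total area under the curve has to stay equal to 1.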
As a Histogram
(Area of rectangle = probability)
(Figure: histogram of the symmetrical binomial distribution B(10, 0.5); x-axis r from 0 to 10, y-axis P(X=r))
Decrease interval size...
(Figure: histogram of the symmetrical binomial distribution B(30, 0.5); x-axis r from 0 to 30, y-axis P(X=r))
Decrease interval size more….
(Figure: histogram of the binomial distribution B(100, 0.5); x-axis r from 0 to 100, y-axis P(X=r). Almost a nice continuous curve)
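The histograms can be compared with a bell curve numerically. This is a rough sketch only: it assumes the matching normal curve has mean np and standard deviation √(np(1−p)), which the slides do not state, and the helper names `binomial_pmf` and `max_gap` are invented for illustration.

```python
import math
from statistics import NormalDist

def binomial_pmf(n, r, p=0.5):
    """P(X = r) for X ~ B(n, p)."""
    return math.comb(n, r) * p**r * (1 - p)**(n - r)

def max_gap(n, p=0.5):
    """Largest difference between the B(n, p) bar heights and a normal
    density with the same mean and standard deviation (assumed match)."""
    mu = n * p
    sigma = math.sqrt(n * p * (1 - p))
    curve = NormalDist(mu, sigma)
    return max(abs(binomial_pmf(n, r, p) - curve.pdf(r)) for r in range(n + 1))

# The three cases shown on the slides.
gaps = {n: max_gap(n) for n in (10, 30, 100)}
for n, gap in gaps.items():
    print(n, round(gap, 5))
```

The gap shrinks as n grows, which is the numerical version of the bars turning into "almost a nice continuous curve".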
Probability = Area under Curve

Curve describes probability of getting a range of values
– e.g., P(X > 60), P(X < 30), P(20 < X < 50)
Area under whole curve = 1
Probability of getting specific number is 0, e.g. P(X = 60) = 0
– so P(X < 60) is the same as P(X ≤ 60)
Probability that X is less than a #
(Figure: normal curve f(x) with mean 20, std dev 5; the shaded area to the left of x = 23 is P(X < 23) [or P(X ≤ 23)])
Probability that X is more than a #
P(X > 23) = 1 – P(X < 23)
(Figure: normal curve f(x) with mean 20, std dev 5; the shaded area to the right of x = 23 is P(X > 23) [or P(X ≥ 23)])
Probability that X is between 2 #s
P(13 < X < 21) = P(X < 21) – P(X < 13)
(Figure: normal curve f(x) with mean 20, std dev 5; the shaded area between x = 13 and x = 21 is P(13 < X < 21) [or P(13 ≤ X < 21), etc.])
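The three kinds of area (less than, more than, between) can be worked out in a few lines. This is a small sketch using Python's standard-library `statistics.NormalDist`, which is not what the slides use; the variable names are illustrative.

```python
from statistics import NormalDist

# The distribution drawn on the three slides: mean 20, std dev 5.
X = NormalDist(mu=20, sigma=5)

p_less = X.cdf(23)                   # P(X < 23): area to the left of 23
p_more = 1 - X.cdf(23)               # P(X > 23) = 1 - P(X < 23)
p_between = X.cdf(21) - X.cdf(13)    # P(13 < X < 21) = P(X < 21) - P(X < 13)

print(round(p_less, 4), round(p_more, 4), round(p_between, 4))
```

The first two results add up to 1, since together they cover the whole area under the curve.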
Standard Deviation

Graph (H&H p.730)
Draw with GDC

Set window
– X from μ - 3σ to μ + 3σ (99.7% of all values)
– Y from 0 to 1/(σ√(2π))
Draw
– 2nd PRGM (DRAW) → CLRDRAW (#1)
– 2nd VARS (DISTR) → DRAW → ShadeNorm(lower limit, upper limit, [μ, σ])
if omit [μ, σ] → 0, 1
Draw with GDC (cont’d)
For normally distributed X with mean 15, std dev 2:
P(8 ≤ X ≤ 12): use ShadeNorm(8, 12, 15, 2)
P(X ≥ 17): use ShadeNorm(17, E99, 15, 2)
P(X ≤ 16): use ShadeNorm(-E99, 16, 15, 2)
use E99 in place of ∞, -E99 in place of -∞
Calculate with GDC

2nd VARS (DISTR) → #2
normalcdf(lower limit, upper limit, [μ, σ])
if omit [μ, σ] → 0, 1
For normally dist’d X with mean 71.5, std dev 3.8:
P(62.1 ≤ X ≤ 68.7): use normalcdf(62.1, 68.7, 71.5, 3.8)
P(X ≥ 89.0): use normalcdf(89.0, E99, 71.5, 3.8)
P(X ≤ 42.5): use normalcdf(-E99, 42.5, 71.5, 3.8)
Note: P(62.1 ≤ X ≤ 68.7) is just P(X ≤ 68.7) - P(X ≤ 62.1)
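For checking calculator answers away from the GDC, the same calculation can be imitated in Python. This is only a sketch: the function below borrows the calculator's name `normalcdf` for familiarity but is an illustrative stand-in built on `statistics.NormalDist`, and `E99 = 1e99` simply mimics the calculator's very large number.

```python
from statistics import NormalDist

E99 = 1e99   # stands in for infinity, as on the calculator

def normalcdf(lower, upper, mu=0.0, sigma=1.0):
    """Area under the normal curve between lower and upper.
    If mu and sigma are omitted, they default to 0 and 1."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(upper) - dist.cdf(lower)   # P(X <= upper) - P(X <= lower)

# The three examples from the slide (mean 71.5, std dev 3.8).
p_mid = normalcdf(62.1, 68.7, 71.5, 3.8)
p_upper = normalcdf(89.0, E99, 71.5, 3.8)
p_lower = normalcdf(-E99, 42.5, 71.5, 3.8)

print(round(p_mid, 4), p_upper, p_lower)
```

The last two probabilities are extremely small because 89.0 and 42.5 are both more than four standard deviations from the mean.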
Standard Normal Distribution (Z-Distn)

To make a table of values for X, need to know both μ and σ
– One table for each combination of μ and σ
– → LOTS of tables!!!
Make a new random variable Z = (X – μ)/σ
Z follows the standard normal distribution
Need only one table of values for Z, since μ = 0 and σ = 1 always
Symmetric, so P(Z < -k) = P(Z > k)
The Standard Normal Distribution
(Z)
Z ~ N (0,1)
Z-Values (“Z-Scores”)

Value of Z is just the # of standard deviations from the mean:
– Z = -2 corresponds to X = μ - 2σ
– Z = -1 corresponds to X = μ - σ
– Z = 0 corresponds to X = μ
– Z = 1 corresponds to X = μ + σ
– Z = 2 corresponds to X = μ + 2σ
– Etc.

(Insert graph of preceding slide)
Z-Values with GDC

P(-1.5 ≤ Z ≤ 2.1) → normalcdf(-1.5, 2.1)
(Omitting μ and σ means μ = 0 and σ = 1)
If starting with X-values (μ ≠ 0 and/or σ ≠ 1), don’t forget to convert to Z, then back to X
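Converting to Z does not change the probability, which a short check can confirm. The sketch below is in Python rather than on the GDC; the mean 15 and std dev 2 are reused from the ShadeNorm slide, while the limits 12 and 17 are picked only for illustration.

```python
from statistics import NormalDist

mu, sigma = 15, 2
x_low, x_high = 12, 17

# Convert the X-limits to Z-scores: Z = (X - mu) / sigma.
z_low = (x_low - mu) / sigma
z_high = (x_high - mu) / sigma

Z = NormalDist()            # standard normal: mean 0, std dev 1
X = NormalDist(mu, sigma)

p_from_z = Z.cdf(z_high) - Z.cdf(z_low)   # area computed on the Z scale
p_from_x = X.cdf(x_high) - X.cdf(x_low)   # same area on the X scale

# Convert a Z-score back to an X-value: X = mu + Z * sigma.
x_back = mu + z_high * sigma

print(z_low, z_high, round(p_from_z, 4), round(p_from_x, 4), x_back)
```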
The Standard Normal Distribution
Z ~ N (0,1)
The probabilities are
given by the area under
the curve
P(Z<-1.6)
P(Z < -1.6) = 0.0548
By symmetry:
P(Z < -1.6) = P(Z > 1.6)
P(Z < -1.6) = 1 - P(Z < 1.6)
Z-Values from Tables

Table in formula packet
“Area under the standard normal curve (topic 6.11)”
Gives probability that Z is less than (actually < or ≤) a specified value
Table is for positive values of Z, only
Before using table, convert X-values to Z
Reading Table of Z-Values


(INSERT picture of table), with animations
showing how to read z to 2 decimal places
Highlight Z-values on top and on left,
highlight cross-indexed area
Example : Using Table of Z-Values
For X with mean μ = 26, std dev σ = 1.4, find P(X < 27.1)
Z = (X – μ)/σ = (27.1 – 26)/1.4 = 0.786 → use Z = 0.79
P(X < 27.1) = P(Z < 0.79) = 0.7852
compare to answer from normalcdf(-E99, 27.1, 26, 1.4)
→ P(X < 27.1) = 0.7840 (slightly different because we rounded Z)
P(X < 27.106) = P(Z < 0.79) (no rounding) = 0.7852 (to 4 d.p.’s)
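The small gap between the table answer and the calculator answer comes entirely from rounding Z to 2 decimal places. A rough Python sketch of both routes, with `statistics.NormalDist` standing in for the printed table and for normalcdf (an assumption of this sketch, not something from the slides):

```python
from statistics import NormalDist

mu, sigma = 26, 1.4
x = 27.1

z_exact = (x - mu) / sigma          # about 0.786
z_table = round(z_exact, 2)         # the table only lists Z to 2 d.p.

Z = NormalDist()
p_table = round(Z.cdf(z_table), 4)  # table route: look up the rounded Z
p_exact = round(Z.cdf(z_exact), 4)  # calculator route: no rounding of Z

print(z_table, p_table, p_exact)
```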
Extending the Table

Table from formula packet only works for:
– P(Z < z)
– Positive Z-values
What to do if you want P(Z > z), or if Z is a negative value?
Think of the graph and which areas you should shade…
Calculating P(Z > z) from Table


Use the fact that the total area under the bell curve equals 1
P(Z < z) + P(Z > z) = 1 (remember P(Z = z) = 0)
→ P(Z > z) = 1 – P(Z < z)
Example: P(Z > z) from Table
Find P(Z > 2.58)
P(Z > 2.58) = 1 – P(Z < 2.58)
From table, P(Z < 2.58) = 0.9951
→ P(Z > 2.58) = 1 – 0.9951 = 0.0049
Example: P(X > x)
For X with mean μ = 26 and std dev σ = 1.4, find P(X > 26.8)
Z = (X – μ)/σ = (26.8 – 26)/1.4 = 0.571 → use Z = 0.57
P(X > 26.8) ≈ P(Z > 0.57)
= 1 - P(Z < 0.57) = 1 - 0.7157 = 0.2843
compare to P(X > 26.8) using normalcdf(26.8, E99, 26, 1.4):
P(X > 26.8) = 0.2839
(again, difference due to rounding Z)
Using Table with Negative Z’s




Use the fact that the bell curve is symmetric!
(insert graph)
P(Z < -z) = P(Z > z) = 1 – P(Z < z)
P(Z > -z) = P(Z < z)
Example: Using Table with Z < 0
Given normally dist’d X with μ = 54.4, σ = 6.7, find P(X < 49.8)
Z = (X – μ)/σ = (49.8 – 54.4)/6.7 = -0.687 → use Z = -0.69
P(Z < -0.69) = P(Z > 0.69)
= 1 – P(Z < 0.69) = 1 – 0.7549
= 0.2451
Compare: normalcdf(-E99, 49.8, 54.4, 6.7) = 0.2462
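The symmetry trick can be written as a small routine. In this sketch the function `table_lookup` is a hypothetical stand-in for the printed table (positive Z only, 2 d.p. in, 4 d.p. out), built on Python's `statistics.NormalDist`; neither name comes from the slides.

```python
from statistics import NormalDist

Z = NormalDist()

def table_lookup(z):
    """Stand-in for the printed table: P(Z < z) for z >= 0 only."""
    if z < 0:
        raise ValueError("the table only covers positive Z-values")
    return round(Z.cdf(round(z, 2)), 4)

def p_less_than(z):
    """P(Z < z) for any z, using symmetry when z is negative."""
    if z >= 0:
        return table_lookup(z)
    return 1 - table_lookup(-z)      # P(Z < -z) = 1 - P(Z < z)

z = (49.8 - 54.4) / 6.7              # about -0.687
p = p_less_than(z)

print(round(z, 3), round(p, 4))
```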
Using Table for P(a < X < b)



Subtract the areas
P(a < X < b) = P(X < b) – P(X < a)
INSERT pictures
Example: P(a < X < b) from Table
Given normally dist’d X with μ = 54.4, σ = 6.7, find
P(45.0 < X < 49.8)
Z1 = (X – μ)/σ = (45.0 – 54.4)/6.7 = -1.40
Z2 = (X – μ)/σ = (49.8 – 54.4)/6.7 = -0.69
P(-1.40 < Z < -0.69) = P(Z < -0.69) – P(Z < -1.40)
= P(Z > 0.69) – P(Z > 1.40)
= [1 – P(Z < 0.69)] – [1 – P(Z < 1.40)]
= [1 – 0.7549] – [1 – 0.9192] = 0.1643
Compare: normalcdf(45.0, 49.8, 54.4, 6.7) = 0.1659
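Both routes in this example can be reproduced side by side. A minimal Python sketch, assuming `statistics.NormalDist` as a substitute for the table and the calculator:

```python
from statistics import NormalDist

mu, sigma = 54.4, 6.7
a, b = 45.0, 49.8

Z = NormalDist()
z1 = round((a - mu) / sigma, 2)      # rounded to 2 d.p. for the table
z2 = round((b - mu) / sigma, 2)

# Table route: both Z-values are negative, so rewrite each area
# with positive Z-values: P(Z < -z) = 1 - P(Z < z).
p_table = (1 - Z.cdf(-z2)) - (1 - Z.cdf(-z1))

# Direct route with no rounding of Z (what normalcdf does).
X = NormalDist(mu, sigma)
p_direct = X.cdf(b) - X.cdf(a)

print(z1, z2, round(p_table, 4), round(p_direct, 4))
```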
Inverse Normal Probabilities

Now we work backwards:
– know the probability
– want to find corresponding value of X (or Z)
Examples of questions:
– Find k such that P(X ≤ k) = 95.4%
– If P(-0.10 < X < b) = 0.357 (i.e., 35.7%), find b
– Find μ so that P(X > 0.771) = 80.8%
Could use trial and error, but there’s a better way
Inverse Normal Probabilities by GDC



Use 2nd VARS (DISTR) → #3
invNorm(area, [μ, σ])
area is the area to the left of the value, i.e. P(X ≤ k)
μ and σ are optional
If omitted, then μ = 0 and σ = 1 (omit when using Z-score, not X)
Example: Inv. Normal Prob. by GDC
X is normally distributed with μ = 70, σ = 10.
Find k such that P(X ≤ k) = 0.954 (i.e., 95.4%)
2nd VARS (DISTR) → invNorm(0.954, 70, 10) = 86.8
→ k = 86.8
Check: normalcdf(-E99, 86.8, 70, 10) = 0.954
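The same inverse calculation can be sketched in Python. The helper name `inv_norm` is made up to echo the calculator's invNorm; it relies on `NormalDist.inv_cdf` from the standard library, which takes the area to the left.

```python
from statistics import NormalDist

def inv_norm(area, mu=0.0, sigma=1.0):
    """Value k with P(X <= k) = area (area is the left-tail probability)."""
    return NormalDist(mu, sigma).inv_cdf(area)

k = inv_norm(0.954, 70, 10)
check = NormalDist(70, 10).cdf(k)    # going forwards again recovers the area

print(round(k, 1), round(check, 3))
```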

Inverse Normal Probabilities by Table

Table in formula packet (2 pages)
“Inverse Normal Probabilities (topic 6.11)”
Gives the value z for which P(Z < z) (actually < or ≤) equals a specified probability
Table is for probabilities between 0.5 and 0.999, and only for positive values of Z
After reading Z from the table, convert back to X
Reading Inverse Probability Table


(INSERT picture of table), with animations
showing how to read z to 2 decimal places
Highlight probabilities on top and on left,
highlight cross-indexed Z-score
Examples: Using Inverse Table
Find k such that P(Z < k) = 0.695
p = 0.695 → read Z = 0.5101 → k = 0.5101
Check: normalcdf(-E99, 0.5101) = 0.695 (omit μ, σ)

Find k such that P(Z > k) = 0.128
P(Z < k) = 1 – P(Z > k)
= 1 – 0.128 = 0.872
p = 0.872 → read Z = 1.1359 → k = 1.1359
Check: normalcdf(1.1359, E99) = 0.128 (omit μ, σ)
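Both examples can be checked without the table. A short sketch, assuming Python's `statistics.NormalDist` in place of the inverse table:

```python
from statistics import NormalDist

Z = NormalDist()

# P(Z < k) = 0.695: the area is already a left-tail area.
k_left = Z.inv_cdf(0.695)

# P(Z > k) = 0.128: convert to a left-tail area first, 1 - 0.128 = 0.872.
k_right = Z.inv_cdf(1 - 0.128)

# Checks, in the spirit of the normalcdf checks on the slide.
check_left = Z.cdf(k_left)
check_right = 1 - Z.cdf(k_right)

print(round(k_left, 4), round(k_right, 4))
```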

Example: Using Inverse Table for X
X is dist’d normally with μ = 24.6, σ = 0.8
For what value of k is P(X < k) = 0.602?
p = 0.602 → read Z = 0.2585
Z = (X – μ)/σ → X = Zσ + μ
= (0.2585)(0.8) + 24.6
= 24.8
Check: normalcdf(-E99, 24.8, 24.6, 0.8) = 0.599
(difference due to rounding X)
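The two steps (find Z, then convert back to X) look like this in code. A minimal Python sketch; `NormalDist().inv_cdf` plays the role of the inverse table here, which is an assumption of the sketch.

```python
from statistics import NormalDist

mu, sigma = 24.6, 0.8
p = 0.602

z = NormalDist().inv_cdf(p)          # the table gives about 0.2585
k = z * sigma + mu                   # undo Z = (X - mu) / sigma

# Rounding k to 1 d.p. is what causes the small mismatch in the check.
k_rounded = round(k, 1)
check = NormalDist(mu, sigma).cdf(k_rounded)

print(round(z, 4), k_rounded, round(check, 3))
```

Passing μ and σ straight to the inverse function, as invNorm allows, gives the same k in one step.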
Extending the Inverse Table

Table from formula packet only works for 0.5 < p < 0.999 and Z > 0
What to do if p < 0.5?
– use P(Z < k) + P(Z > k) = 1
What to do if P(Z > k)?
– use P(Z > k) = P(Z < -k) (symmetry)
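The two rules can be combined into one small routine. In this sketch `inverse_table` is a hypothetical stand-in for the printed table that refuses probabilities below 0.5; the upper limit is loosened to p < 1 for simplicity, and all function names are invented for illustration.

```python
from statistics import NormalDist

Z = NormalDist()

def inverse_table(p):
    """Stand-in for the printed inverse table: only accepts 0.5 <= p < 1."""
    if not 0.5 <= p < 1:
        raise ValueError("table only covers probabilities from 0.5 upward")
    return Z.inv_cdf(p)

def k_for_left_area(p):
    """k with P(Z < k) = p, for any 0 < p < 1."""
    if p >= 0.5:
        return inverse_table(p)
    return -inverse_table(1 - p)     # P(Z < k) = p means P(Z < -k) = 1 - p

def k_for_right_area(p):
    """k with P(Z > k) = p, using P(Z > k) = P(Z < -k)."""
    return -k_for_left_area(p)

print(round(k_for_left_area(0.210), 4), round(k_for_right_area(0.128), 4))
```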
Example: Inverse Table for p < 0.5
For what value of k is P(Z < k) = 0.210?
P(Z < k) = 1 – P(Z > k) → P(Z > k) = 1 – 0.210 = 0.79
(which is > 0.5, so we can use the table now)
By symmetry, P(Z > k) = P(Z < -k)
(the table requires P(Z < k))
p = 0.79 → read Z = 0.8064
-k = 0.8064 → k = -0.8064
Check: normalcdf(-E99, -0.8064) = 0.210

Example: Inverse Table, P(a < X < b)
Insert example…
Example: Inverse Table, μ = ?
X is dist’d normally with σ = 1.75 but unknown μ.
Find μ if P(X > 4.92) = 0.4.
P(X > 4.92) = 0.4 → P(Z > k) = 0.4
1 - P(Z < k) = 0.4
P(Z < k) = 1 - 0.4 = 0.6 → read Z = 0.2534
Z = (X – μ)/σ
0.2534 = (4.92 – μ)/1.75
μ = 4.92 – (0.2534)(1.75)
= 4.48
Check: normalcdf(4.92, E99, 4.48, 1.75) = 0.401
(difference due to rounding μ)
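Solving for an unknown mean follows the same pattern: find Z from the probability, then rearrange Z = (X – μ)/σ. A minimal Python sketch, with `NormalDist().inv_cdf` standing in for the inverse table (an assumption of this sketch):

```python
from statistics import NormalDist

sigma = 1.75
x = 4.92
p_right = 0.4                           # P(X > 4.92)

z = NormalDist().inv_cdf(1 - p_right)   # P(Z < z) = 0.6
mu = x - z * sigma                      # rearranged from z = (x - mu) / sigma

# Check with the unrounded mean: the right-tail area should come back as 0.4.
check = 1 - NormalDist(mu, sigma).cdf(x)

print(round(z, 4), round(mu, 2), round(check, 3))
```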