Standard Normal Distribution

Normal distributions
• The most important continuous probability distribution in the
entire field of statistics is the normal distribution.
• All normal distributions have the same overall shape.
• The exact density curve for a particular normal distribution is
specified by giving its mean μ and its variance σ².
• The mean is located at the center of the symmetric density curve
and is the same as the median and the mode.
• Changing μ without changing σ moves the normal curve along the
horizontal axis without changing its spread.
STA286 week 6
• The standard deviation σ controls the spread of a normal curve.
• The density function of the normal random variable is given by
$$ f_X(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x - \mu}{\sigma}\right)^2}, \quad -\infty < x < \infty $$
• Notation: A normal distribution with mean μ and variance σ² is
denoted by N(μ, σ²).
• Note that there are other symmetric bell-shaped density curves that
are not normal, e.g. the t distribution.
The 68-95-99.7 rule
In the normal distribution with mean μ and standard deviation σ,
• Approximately 68% of the observations fall within σ of the mean μ.
• Approximately 95% of the observations fall within 2σ of the mean μ.
• Approximately 99.7% of the observations fall within 3σ of the mean μ.
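The rule can be checked numerically. The sketch below assumes Python's standard-library `statistics.NormalDist` as the tool (not part of the course material); the helper name `proportion_within` is illustrative. It computes the proportion of a normal population within k standard deviations of the mean, which does not depend on the particular mean and standard deviation.

```python
from statistics import NormalDist

def proportion_within(k, mu=0.0, sigma=1.0):
    """Proportion of a normal population within k standard deviations of mu."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(mu + k * sigma) - dist.cdf(mu - k * sigma)

# Proportions for 1, 2 and 3 standard deviations.
for k in (1, 2, 3):
    print(k, round(proportion_within(k), 4))
```

The printed proportions round to roughly 68%, 95% and 99.7%, which is where the rule gets its name.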
Standardizing and z-scores
• If x is an observation from a distribution that has mean μ and standard
deviation σ, the standardized value of x is given by
$$ z = \frac{x - \mu}{\sigma} $$
• A standardized value is often called a z-score.
• A z-score tells us how many standard deviations the original
observation falls away from the mean of the distribution.
• Standardizing is a linear transformation that transforms the data into
the standard scale of z-scores. Therefore, standardizing does not
change the shape of a distribution, but it changes the mean to 0
and the standard deviation to 1.
Example
• The heights of women are approximately normal with mean
μ = 64.5 inches and standard deviation σ = 2.5 inches.
• The standardized height is
$$ z = \frac{\text{height} - 64.5}{2.5} $$
• The standardized value (z-score) of height 68 inches is
$$ z = \frac{68 - 64.5}{2.5} = 1.4 $$
or 1.4 std. dev. above the mean.
• A woman 60 inches tall has standardized height
$$ z = \frac{60 - 64.5}{2.5} = -1.8 $$
or 1.8 std. dev. below the mean.
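The two standardizations above can be reproduced with a few lines of Python. This is a minimal sketch; the function name `z_score` is illustrative, and the mean and standard deviation are the values given in the example.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd

MEAN_HEIGHT = 64.5  # inches, from the example
SD_HEIGHT = 2.5     # inches, from the example

for height in (68, 60):
    print(height, round(z_score(height, MEAN_HEIGHT, SD_HEIGHT), 2))
```

A positive result means the height is above the mean and a negative result means it is below.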
The Standard Normal distribution
• The standard normal distribution is the normal distribution N(0, 1),
that is, the mean μ = 0 and the standard deviation σ = 1.
• If a random variable X has normal distribution N(μ, σ), then the
standardized variable
$$ Z = \frac{X - \mu}{\sigma} $$
has the standard normal distribution.
• Areas under a normal curve represent proportions of observations
from that normal distribution.
• There is no closed-form formula for areas under a normal curve.
Calculations use either software or a table of areas. The table and
most software calculate one kind of area: cumulative proportions.
A cumulative proportion is the proportion of observations in a
distribution that fall at or below a given value and is also the area
under the curve to the left of a given value.
The standard normal tables
• Table A.3 gives cumulative proportions for the standard normal
distribution. The table entry for each value z is the area under the
curve to the left of z; the notation used is P(Z ≤ z),
e.g. P(Z ≤ 1.4) = 0.9192.
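Software obtains the same cumulative proportions numerically. A minimal sketch, assuming the standard identity Φ(z) = ½[1 + erf(z/√2)] and Python's `math.erf`; the function name `phi` is illustrative.

```python
import math

def phi(z):
    """Cumulative proportion P(Z <= z) for the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

print(round(phi(1.4), 4))  # compare with the table entry for z = 1.4
print(round(phi(0.0), 4))  # half of the area lies below the mean
```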
Standard Normal Distribution
z     .00    .01    .02    .03    .04    .05    .06    .07    .08    .09
0.0  .5000  .5040  .5080  .5120  .5160  .5199  .5239  .5279  .5319  .5359
0.1  .5398  .5438  .5478  .5517  .5557  .5596  .5636  .5675  .5714  .5753
0.2  .5793  .5832  .5871  .5910  .5948  .5987  .6026  .6064  .6103  .6141
0.3  .6179  .6217  .6255  .6293  .6331  .6368  .6406  .6443  .6480  .6517
0.4  .6554  .6591  .6628  .6664  .6700  .6736  .6772  .6808  .6844  .6879
0.5  .6915  .6950  .6985  .7019  .7054  .7088  .7123  .7157  .7190  .7224
0.6  .7257  .7291  .7324  .7357  .7389  .7422  .7454  .7486  .7517  .7549
0.7  .7580  .7611  .7642  .7673  .7703  .7734  .7764  .7794  .7823  .7852
0.8  .7881  .7910  .7939  .7967  .7995  .8023  .8051  .8078  .8106  .8133
0.9  .8159  .8186  .8212  .8238  .8264  .8289  .8315  .8340  .8365  .8389
1.0  .8413  .8438  .8461  .8485  .8508  .8531  .8554  .8577  .8599  .8621
1.1  .8643  .8665  .8686  .8708  .8729  .8749  .8770  .8790  .8810  .8830
1.2  .8849  .8869  .8888  .8907  .8925  .8944  .8962  .8980  .8997  .9015
1.3  .9032  .9049  .9066  .9082  .9099  .9115  .9131  .9147  .9162  .9177
1.4  .9192  .9207  .9222  .9236  .9251  .9265  .9279  .9292  .9306  .9319
1.5  .9332  .9345  .9357  .9370  .9382  .9394  .9406  .9418  .9429  .9441
1.6  .9452  .9463  .9474  .9484  .9495  .9505  .9515  .9525  .9535  .9545
1.7  .9554  .9564  .9573  .9582  .9591  .9599  .9608  .9616  .9625  .9633
1.8  .9641  .9649  .9656  .9664  .9671  .9678  .9686  .9693  .9699  .9706
1.9  .9713  .9719  .9726  .9732  .9738  .9744  .9750  .9756  .9761  .9767
2.0  .9772  .9778  .9783  .9788  .9793  .9798  .9803  .9808  .9812  .9817
2.1  .9821  .9826  .9830  .9834  .9838  .9842  .9846  .9850  .9854  .9857
2.2  .9861  .9864  .9868  .9871  .9875  .9878  .9881  .9884  .9887  .9890
2.3  .9893  .9896  .9898  .9901  .9904  .9906  .9909  .9911  .9913  .9916
2.4  .9918  .9920  .9922  .9925  .9927  .9929  .9931  .9932  .9934  .9936
2.5  .9938  .9940  .9941  .9943  .9945  .9946  .9948  .9949  .9951  .9952
2.6  .9953  .9955  .9956  .9957  .9959  .9960  .9961  .9962  .9963  .9964
2.7  .9965  .9966  .9967  .9968  .9969  .9970  .9971  .9972  .9973  .9974
2.8  .9974  .9975  .9976  .9977  .9977  .9978  .9979  .9979  .9980  .9981
2.9  .9981  .9982  .9982  .9983  .9984  .9984  .9985  .9985  .9986  .9986
3.0  .9987  .9987  .9987  .9988  .9988  .9989  .9989  .9989  .9990  .9990
The table shows the area to the left of z under the standard normal curve.
For a negative number, -z:
Area below (-z) = Area above (z) = 1 - Area below (z)
The standard normal tables - Example
• What proportion of the observations of a N(0,1) distribution
takes values
a) less than z = 1.4 ?
b) greater than z = 1.4 ?
c) greater than z = -1.96 ?
d) between z = 0.43 and z = 2.15 ?
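One way to work these four parts is sketched below, assuming Python's standard-library `statistics.NormalDist` in place of the printed table. From the table the same steps give (a) 0.9192, (b) 1 - 0.9192 = 0.0808, (c) 1 - 0.0250 = 0.9750 and (d) 0.9842 - 0.6664 = 0.3178.

```python
from statistics import NormalDist

Z = NormalDist(0, 1)  # standard normal distribution

a = Z.cdf(1.4)                 # (a) area to the left of 1.4
b = 1 - Z.cdf(1.4)             # (b) area to the right of 1.4
c = 1 - Z.cdf(-1.96)           # (c) area to the right of -1.96
d = Z.cdf(2.15) - Z.cdf(0.43)  # (d) area between 0.43 and 2.15

for label, value in zip("abcd", (a, b, c, d)):
    print(label, round(value, 4))
```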
Properties of Normal distribution
• If a random variable Z has a N(0,1) distribution then P(Z = z) = 0: the
area under the curve above any single point is 0.
• The area between any two points a and b (a < b) under the standard
normal curve is given by
P(a ≤ Z ≤ b) = P(Z ≤ b) – P(Z ≤ a)
• As mentioned earlier, if a random variable X has a N(μ, σ)
distribution, then the standardized variable
$$ Z = \frac{X - \mu}{\sigma} $$
has a standard normal distribution and any calculations about X
can be done using the following rules:
• P(X = k) = 0 for all k.
• $P(X \le a) = P\left(Z \le \frac{a - \mu}{\sigma}\right)$
• $P(X \ge b) = 1 - P\left(Z \le \frac{b - \mu}{\sigma}\right)$
• $P(a \le X \le b) = P\left(\frac{a - \mu}{\sigma} \le Z \le \frac{b - \mu}{\sigma}\right)$
• The solution to the equation P(X ≤ k) = p is
k = μ + σ z_p
where z_p is the value z from the standard normal table that has
area (and cumulative proportion) p below it, i.e. z_p is the 100p-th
percentile of the standard normal distribution.
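These rules translate directly into code. The sketch below reuses the women's heights example (mean 64.5 inches, standard deviation 2.5 inches) and assumes Python's standard-library `statistics.NormalDist` for the standard normal table; the function names are illustrative.

```python
from statistics import NormalDist

MU, SIGMA = 64.5, 2.5  # women's heights example, in inches
Z = NormalDist()       # standard normal, mean 0 and standard deviation 1

def prob_at_most(a):
    """P(X <= a) = P(Z <= (a - mu) / sigma)."""
    return Z.cdf((a - MU) / SIGMA)

def prob_between(a, b):
    """P(a <= X <= b) = P(Z <= (b - mu)/sigma) - P(Z <= (a - mu)/sigma)."""
    return prob_at_most(b) - prob_at_most(a)

def value_with_area_below(p):
    """Solve P(X <= k) = p for k, using k = mu + sigma * z_p."""
    return MU + SIGMA * Z.inv_cdf(p)

print(round(prob_at_most(68), 4))            # proportion shorter than 68 inches
print(round(prob_between(60, 68), 4))        # proportion between 60 and 68 inches
print(round(value_with_area_below(0.90), 2)) # height with 90% of women below it
```

Every probability about X is computed on the z scale first, and a percentile of X is obtained by mapping z_p back with μ + σ z_p.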
Questions
1. The marks of STA286 students have an N(65, 15) distribution. Find the
proportion of students having marks
(a) less than 50.
(b) greater than 80.
(c) between 50 and 80.
2. Scores on the SAT verbal test follow approximately the N(505, 110)
distribution. How high must a student score in order to place in
the top 10% of all students taking the SAT?
3. The time it takes to complete a STA286 term test is normally
distributed with mean 100 minutes and standard deviation 14
minutes. How much time should be allowed if we wish to ensure
that at least 9 out of 10 students (on average) can complete it?
4. General Motors of Canada has a deal: ‘an oil filter and lube job in 25
minutes or the next one free’. Suppose that you worked for GM and
knew that the time needed to provide these services was approximately
normal with mean 15 minutes and std. dev. 2.5 minutes. How many
minutes would you have recommended to put in the ad above if it was
decided that about 5 free services for 100 customers was reasonable?
5. In a survey of patients of a rehabilitation hospital the mean length of
stay in the hospital was 12 weeks with a std. dev. of 1 week. The
distribution was approximately normal.
a) Out of 100 patients, how many would you expect to stay longer than 13
weeks?
b) What is the percentile rank of a stay of 11.3 weeks?
c) What percentage of patients would you expect to be in longer than 12
weeks?
d) What is the length of stay at the 90th percentile?
e) What is the median length of stay?
Normal Approximation to the Binomial
• If X has a Binomial distribution with mean µ = np and variance σ² =
npq, where q = 1 - p, then the limiting form of the distribution of
$$ Z = \frac{X - np}{\sqrt{npq}} $$
as n → ∞, is the standard normal distribution.
• It turns out that the normal distribution provides a fairly good
approximation even when n is not so large (section 6.5).
• As a rule of thumb, we will use this approximation for values of n
and p that satisfy np ≥ 10 and n(1-p) ≥ 10 .
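To get a feel for the quality of the approximation, the sketch below compares an exact binomial cumulative probability with its normal approximation. The values n = 100, p = 0.3 and k = 35 are hypothetical and chosen only for illustration, the function names are not from the course material, and no continuity correction is applied because the slides do not mention one.

```python
import math
from statistics import NormalDist

def rule_of_thumb_ok(n, p):
    """Check the conditions np >= 10 and n(1 - p) >= 10."""
    return n * p >= 10 and n * (1 - p) >= 10

def binom_cdf_exact(k, n, p):
    """Exact P(X <= k) for X ~ Bin(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def binom_cdf_normal(k, n, p):
    """Normal approximation to P(X <= k), standardizing with mean np and sd sqrt(npq)."""
    z = (k - n * p) / math.sqrt(n * p * (1 - p))
    return NormalDist().cdf(z)

# Hypothetical values, for illustration only.
n, p, k = 100, 0.3, 35
print(rule_of_thumb_ok(n, p))
print(round(binom_cdf_exact(k, n, p), 4), round(binom_cdf_normal(k, n, p), 4))
```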
week 8
Example
• You are planning a sample survey of small businesses in your area.
You will choose an SRS of businesses listed in the Yellow Pages.
Experience shows that only about half the businesses you contact
will respond.
(a) If you contact 150 businesses, it is reasonable to use the
Bin(150; 0.5) distribution for the number X who respond. Explain
why.
(b) What is the expected number (the mean) who will respond?
(c) What is the probability that 70 or fewer will respond?
(d) How large a sample must you take to increase the mean
number of respondents to 100?
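A possible numerical sketch of parts (b) to (d), assuming the normal approximation without a continuity correction and Python's standard-library `statistics.NormalDist`; the variable names are illustrative.

```python
import math
from statistics import NormalDist

n, p = 150, 0.5                  # businesses contacted, response probability

mean = n * p                     # (b) expected number of respondents
sd = math.sqrt(n * p * (1 - p))  # standard deviation sqrt(npq)

z = (70 - mean) / sd             # (c) standardize 70 respondents
prob_70_or_fewer = NormalDist().cdf(z)

n_needed = math.ceil(100 / p)    # (d) smallest n whose mean n*p is at least 100

print(mean, round(sd, 3), round(z, 2), round(prob_70_or_fewer, 4), n_needed)
```

Here np = n(1 - p) = 75, so the rule of thumb for using the normal approximation is comfortably satisfied.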
Exercise
• According to government data, 21% of American children
under the age of six live in households with incomes less than
the official poverty level. A study of learning in early
childhood chooses an SRS of 300 children.
a) What is the mean number of children in the sample who come
from poverty-level households? What is the standard
deviation of this number?
b) Use the normal approximation to calculate the probability that
at least 80 of the children in the sample live in poverty.
The Chi-Square distribution
• The Chi-Squared densities are a special case of the gamma family of
distributions. They are obtained by letting α = υ/2 and λ = ½, where υ
is a positive integer.
• The parameter of the Chi-Squared distribution, υ, is called degrees of
freedom.
• The Chi-Squared density is given by
• The mean and variance of the Chi-Squared distribution are…
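The density and moments are not reproduced in this transcript. For reference, the standard textbook forms are given below, assuming the gamma parametrization stated above (shape α = υ/2, rate λ = ½) and writing ν for the degrees of freedom. Treat this as a reconstruction from that parametrization, not as the slide's own wording.

```latex
f(x) = \frac{1}{2^{\nu/2}\,\Gamma(\nu/2)}\, x^{\nu/2 - 1} e^{-x/2}, \quad x > 0,
\qquad
E(X) = \nu, \qquad \operatorname{Var}(X) = 2\nu .
```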
• Note: $\Gamma\left(\tfrac{1}{2}\right) = \sqrt{\pi}$
• We can use Table A.5 in the Appendix to answer questions like:
Find the value k for which $P(\chi^2_{10} \le k) = 0.975$. k is the upper 2.5
percentile of the $\chi^2_{10}$ distribution. Notation: $\chi^2_{0.025,\,10}$.
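Without the table, the same value can be found numerically. The sketch below is one possible approach, not the course's method: it relies on the closed-form chi-square CDF that holds for even degrees of freedom, P(χ²₂ₘ ≤ x) = 1 - e^(-x/2) Σ (x/2)^j / j! for j = 0, ..., m - 1, and then solves for k by bisection. The function names and the search interval are assumptions made for illustration.

```python
import math

def chi2_cdf_even(x, df):
    """P(chi-square with even df <= x), using the closed form for even df."""
    if df % 2 != 0:
        raise ValueError("this sketch only handles even degrees of freedom")
    if x <= 0:
        return 0.0
    half = x / 2.0
    tail = sum(half**j / math.factorial(j) for j in range(df // 2))
    return 1.0 - math.exp(-half) * tail

def chi2_quantile_even(p, df, lo=0.0, hi=200.0, tol=1e-9):
    """Find k with P(chi-square_df <= k) = p by bisection on the increasing CDF."""
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if chi2_cdf_even(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

k = chi2_quantile_even(0.975, 10)
print(round(k, 3))  # compare with the tabulated upper 2.5 percentile for 10 d.f.
```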
Weibull Distribution
• The continuous random variable X has a Weibull distribution,
with parameters α and β, if its density function is given by
$$ f_X(x) = \begin{cases} \alpha\beta x^{\beta - 1} e^{-\alpha x^{\beta}}, & x > 0 \\ 0, & \text{otherwise} \end{cases} $$
• The mean and variance of the Weibull Distribution are…
• The Weibull distribution is applied to reliability and life-testing
problems such as time to failure or life length.
• In general the Weibull distribution does not have the lack of memory
property (the exception is β = 1, the exponential case).
• The cumulative distribution function is given by…
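The mean, variance and cumulative distribution function are not reproduced in this transcript. For reference, the standard results are given below, assuming the density has the form f(x) = αβ x^(β-1) e^(-α x^β) for x > 0; this is a reconstruction under that parametrization and other textbooks parametrize the Weibull differently.

```latex
F_X(x) = 1 - e^{-\alpha x^{\beta}}, \quad x \ge 0,
\qquad
E(X) = \alpha^{-1/\beta}\,\Gamma\!\left(1 + \frac{1}{\beta}\right),
\qquad
\operatorname{Var}(X) = \alpha^{-2/\beta}\left\{\Gamma\!\left(1 + \frac{2}{\beta}\right) - \left[\Gamma\!\left(1 + \frac{1}{\beta}\right)\right]^{2}\right\}.
```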
Example
The service life, in years, of a hearing aid battery is a random
variable having a Weibull distribution with α = ½ and β = 2.
a) How long can such a battery be expected to last?
b) What is the probability that such a battery will be operating
after 2 years?
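A minimal sketch of both parts, assuming the Weibull density αβ x^(β-1) e^(-α x^β), for which the mean is α^(-1/β) Γ(1 + 1/β) and P(X > t) = e^(-α t^β). The function names are illustrative.

```python
import math

ALPHA, BETA = 0.5, 2.0  # Weibull parameters from the example

def weibull_mean(alpha, beta):
    """Mean of the Weibull density alpha*beta*x**(beta-1)*exp(-alpha*x**beta)."""
    return alpha ** (-1.0 / beta) * math.gamma(1.0 + 1.0 / beta)

def weibull_survival(t, alpha, beta):
    """P(X > t) = exp(-alpha * t**beta) for t >= 0."""
    return math.exp(-alpha * t ** beta)

print(round(weibull_mean(ALPHA, BETA), 4))           # (a) expected life in years
print(round(weibull_survival(2.0, ALPHA, BETA), 4))  # (b) P(still operating after 2 years)
```

Under these assumptions the expected life is √(2π)/2, about 1.25 years, and the probability of lasting beyond 2 years is e^(-2), about 0.135.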
Failure Rate for the Weibull Distribution
• The time to failure, T, of a component is often described by the
Weibull distribution.
• The Weibull distribution is helpful in determining the failure rate
(also called hazard rate) in order to get a sense of wear or
deterioration of the component.
• The reliability of a component is the probability that it will last for at
least a specified time under specific experimental conditions.
• The reliability of a component at time t is given by
$$ R(t) = P(T > t) = \int_t^{\infty} f_T(x)\,dx = 1 - F_T(t) $$
• The failure rate of a component is the change over time of the
conditional probability that the component lasts an additional ∆t units
of time given that it has lasted to time t.
• The failure rate at time t is given by:
$$ Z(t) = \alpha\beta t^{\beta - 1}, \quad t > 0 $$
• If β = 1, Z(t) = α, which is a constant. This is the special case of the
exponential distribution, which has the lack of memory property.
• If β > 1, Z(t) is an increasing function of t, indicating that the
component wears over time.
• If β < 1, Z(t) is a decreasing function of t, indicating that the
component strengthens over time.
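The three regimes can be seen by evaluating the failure rate at a few time points. The sketch below assumes the Weibull forms R(t) = e^(-α t^β) and Z(t) = αβ t^(β-1); the parameter values are hypothetical and chosen only to show constant, increasing and decreasing rates.

```python
import math

def reliability(t, alpha, beta):
    """R(t) = P(T > t) = 1 - F(t) = exp(-alpha * t**beta) for a Weibull lifetime."""
    return math.exp(-alpha * t ** beta)

def failure_rate(t, alpha, beta):
    """Z(t) = alpha * beta * t**(beta - 1), defined for t > 0."""
    return alpha * beta * t ** (beta - 1)

# Hypothetical parameter values, chosen only to show the three regimes.
for beta in (0.5, 1.0, 2.0):
    rates = [round(failure_rate(t, 1.0, beta), 3) for t in (1.0, 2.0, 4.0)]
    print(beta, rates)
```

With β = 0.5 the printed rates fall as t grows, with β = 1 they stay constant, and with β = 2 they rise.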
Example
• The life of a certain automobile seal has the Weibull distribution with
failure rate given by: $Z(t) = 1/\sqrt{t}$.
• Find the probability that the seal is still intact after 4 years.
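One way to approach this, taking the failure rate to be Z(t) = 1/√t: matching it with the Weibull form Z(t) = αβ t^(β-1) gives β = ½ and α = 2, and the probability of surviving beyond t is then R(t) = e^(-α t^β). The sketch below is a worked check under that reading, not the official solution.

```python
import math

# Matching Z(t) = alpha * beta * t**(beta - 1) with 1 / sqrt(t) = t**(-1/2):
beta = 0.5          # because beta - 1 = -1/2
alpha = 1.0 / beta  # because alpha * beta = 1, so alpha = 2

def reliability(t):
    """R(t) = exp(-alpha * t**beta), the probability of lasting beyond time t."""
    return math.exp(-alpha * t ** beta)

print(round(reliability(4.0), 4))  # P(seal still intact after 4 years)
```

Under this reading the answer is e^(-4), roughly 0.018.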