Probability Distributions
Normal Distribution
Introduction
Compare to
Discrete Variables
• No. of Doctor’s Visits During the Year

No. of Visits   No. of Patients   P
0               400               0.14
1               950               0.34
2               850               0.30
3+              600               0.21
Total           2800              0.99
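A minimal sketch of how the P column can be derived from the patient counts, assuming each probability is simply count divided by total. The variable names are illustrative only.

```python
# Patient counts per number of doctor's visits, taken from the table.
counts = {"0": 400, "1": 950, "2": 850, "3+": 600}

total = sum(counts.values())  # total number of patients

# Probability of each outcome = its share of all patients.
probs = {visits: n / total for visits, n in counts.items()}

for visits, p in probs.items():
    print(f"{visits:>2} visits: {p:.2f}")

# The unrounded probabilities add to 1; the two-decimal values in the
# table add to 0.99 only because each one was rounded first.
print(f"sum = {sum(probs.values()):.2f}")
```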
Histograms
• The height of each bar represents the
probability of that event
• If each bar is one unit in width, then the
area also equals the probability
• The total area under all the bars has to
add to 1.
[Figure: histogram "Doctor's Visits"; horizontal axis: No. of Visits (0, 1, 2, 3+); vertical axis labeled "Probability", with ticks from 0 to 1000]
Continuous Variables
Patient’s Weight   Frequency
300                2
290                3
280                7
But… Can take on any value
• Can make the weight intervals as small as
we want: every 10 lbs or 5 or 1, or … 0.5,
0.1, 0.001
• Histogram: As the intervals get smaller, the
bars decrease in width
Line Graph
• Completely continuous, no width at all.
Just connected points
[Figure: line graph; vertical axis from 0 to 100]
Infinitesimally Small Intervals
• Then really just points on a smooth curve.
• We can also have n, the number of cases,
increase to infinity.
• The total probability is still one.
Infinitesimally Small Intervals
Smoother Curve
Area under the curve = 1.
Probabilities
• Can no longer read the probability of a
single event.
• In a continuous distribution, can only
measure the probability of a value falling
within some range
Probability Within a Range
Probability of a value falling within the range is
equal to the area under the curve.
Bad News
• To calculate the area under the curve we
would need to use calculus
• But not so bad news, others have done
the calculations and set up tables for us
• Applause
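To make "area under the curve" concrete, here is a rough sketch that approximates the area numerically instead of using calculus or a table. It assumes the curve is the bell-shaped normal curve with mean 0 and S.D. 1, and it uses the trapezoid rule as a stand-in for the exact integral. The function names are made up for this illustration.

```python
import math

def normal_pdf(x, mean=0.0, sd=1.0):
    """Height of the normal curve at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def area_under_curve(f, a, b, steps=10_000):
    """Approximate the area under f between a and b with thin trapezoids."""
    width = (b - a) / steps
    total = 0.5 * (f(a) + f(b))
    for i in range(1, steps):
        total += f(a + i * width)
    return total * width

# Probability of a value falling between -1 and +1 on this curve.
area = area_under_curve(normal_pdf, -1.0, 1.0)
print(f"{area:.3f}")
```

The thinner the slices, the closer the sum gets to the true area, which is the same idea as shrinking the histogram bars.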
Diversity of Continuous
Distributions
• Lots of different distributions
• Lots of different shaped curves
• Would need lots of different tables,
however….
The Most Important Distribution
Introducing the
Normal Distribution
“Bell-Shaped Curve”
What are its
characteristics?
Normal Distribution
• First described in 1733 by Abraham de Moivre.
• A lot of the relevant math was done by Carl
Friedrich Gauss, hence “Gaussian Curve”
Properties
• Symmetrical about the
mean
• Mean, Median & Mode
are all equal
• Asymptotic, height never
reaches zero.
• What’s the total area
under the curve?
Ranges & Probabilities
• 50% of all values fall above the mean &
50% below it.
• All probabilities depend on how far the
values lie from the mean
• Distance measured in number of standard
deviations from the mean
Probabilities related to S.D.
One S.D. on either side of the mean
Area = 0.68
Other Distances
• 1 S.D. on either side of the mean includes
68% of the cases
• 2 S.D. on either side of the mean includes
95% of the cases
• 3 S.D. on either side of the mean includes
99.7% of the cases
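These three percentages can be checked without a printed table. The sketch below assumes Python 3.8 or later, where the standard library provides `statistics.NormalDist`; the helper name `within` is only for illustration.

```python
from statistics import NormalDist

snd = NormalDist(mu=0.0, sigma=1.0)  # mean 0, S.D. 1

def within(k):
    """Share of cases lying within k S.D. on either side of the mean."""
    return snd.cdf(k) - snd.cdf(-k)

for k in (1, 2, 3):
    print(f"{k} S.D.: {within(k):.3f}")
```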
Many Different Normal Distributions
• Determined by their mean and standard
deviation
Mean gives location. Standard Deviation gives
shape – more or less dispersed.
Proportions remain Same
• Relationships between probability and
standard deviation are the same in all
Normal Distributions
• However in order to use the tables
provided, we have to convert to the
“Standard Normal Distribution”
The Standard Normal Distribution
Mean = 0.
Standard Deviation = 1.
Z-values
• Converts values in any normal distribution
to the standard normal distribution.
• It’s a way to express the distance from the
mean in units of S.D.
• Z = (X – X̄)/S.D., where X̄ is the mean
Compare this to 18 eggs.
How many dozen?
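A small sketch of the same idea in code: dividing by the unit size, whether the unit is a dozen or one S.D. The function name `z_score` is an assumption for this example, and the numbers 225, 200 and 20 are borrowed from the cholesterol exercise later in the slides.

```python
def z_score(x, mean, sd):
    """Distance of x from the mean, measured in units of S.D."""
    return (x - mean) / sd

# 18 eggs expressed in dozens: divide by the size of the unit.
dozens = 18 / 12
print(dozens)

# A value of 225 when the mean is 200 and the S.D. is 20.
print(z_score(225, mean=200, sd=20))
```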
From Z find Probabilities
Use Table A-3. Gives areas in the upper tail of the S.N.D.
What is the area above Z = 1.28? Go to the Table.
Go to 1.2 in Left-hand column & across to 0.08
A = 0.10.
The probability that a value will fall above Z = 1.28 is 10%
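For readers without the table at hand, the lookup can be approximated in code. This sketch assumes Python 3.8 or later (`statistics.NormalDist`); `upper_tail` is a hypothetical helper name that plays the role of the table.

```python
from statistics import NormalDist

def upper_tail(z):
    """Area above z under the standard normal curve (mean 0, S.D. 1)."""
    return 1.0 - NormalDist().cdf(z)

# Area above Z = 1.28; the table rounds this to 0.10.
print(f"{upper_tail(1.28):.3f}")
```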
S.N.D.
mean = 0. S.D. = 1
Test It
Let’s look up the ones we already know.
Range = 1 S.D. on either side of the mean
Z = 1. Find 1.0 in the left-hand column
Go across to 0.00
Reads 0.159. So area in the tail is 0.159.
What’s the area between Z = 1 and the
mean?
Always draw the N.D.
A = .159
If Area above z = 1 is 0.159, what is the area between Z and the
mean?
A = 0.500 - 0.159 = 0.341
We need to add an equal area on the other side of the mean.
Total shaded area = 0.682
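The two steps just described (subtract the tail from 0.500, then double) can be written as one small function. This is only a sketch of the arithmetic; `central_area` is an illustrative name, and its input is the tail area read from the table.

```python
def central_area(tail_area):
    """Area within Z on both sides of the mean, given the upper-tail area."""
    one_side = 0.500 - tail_area   # area between the mean and Z
    return 2 * one_side            # same area again on the other side

print(f"{central_area(0.159):.3f}")  # tail area for Z = 1
```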
You Try It
• What is the probability
that a value will fall within
2 s.d. of the mean?
• Draw the N.D
• Look up area that
corresponds to Z = 2.
• A = 0.023
• Find the area between
mean & Z = 2.
• 0.500 – 0.023 = 0.477
• Double it. A = 0.954
Try the Reverse
• I want to find the value above which 10%
of the population falls.
• This time, area = 0.100
• Look in body of table for 0.100
• Read across and up. Z = 1.28
• Would have to use the formula for Z in
reverse in order to get the value for X
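The reverse lookup can also be sketched in code. This assumes Python 3.8 or later, whose `statistics.NormalDist` offers `inv_cdf`; since `inv_cdf` works with the area below a value, the upper-tail area is first subtracted from 1. The helper name is hypothetical.

```python
from statistics import NormalDist

def z_for_upper_tail(area):
    """Return the Z above which the given share of the population falls."""
    return NormalDist().inv_cdf(1.0 - area)

z = z_for_upper_tail(0.10)
print(f"{z:.2f}")
```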
Finding X
Z = (X – X̄)/S.D.
1.28 = (X – X̄)/S.D.
1.28 * S.D. + X̄ = X
To convert to X, have to know mean & S.D.
Example
• Weights of 40-yr old women are normally
distributed with a mean of 150 and an S.D.
of 10.
• What is the value above which the highest
10% of weights falls?
• X = 1.28 * 10 + 150 = 162.8
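As a check on the arithmetic, here is a minimal sketch of the Z formula used in reverse, with the mean of 150 and S.D. of 10 from this example. The function name `x_from_z` is an assumption for illustration.

```python
def x_from_z(z, mean, sd):
    """Convert a Z-value back to the original scale: X = Z * S.D. + mean."""
    return z * sd + mean

cutoff = x_from_z(1.28, mean=150, sd=10)
print(f"{cutoff:.1f}")  # weight above which the highest 10% fall
```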
Application
• Studying a progressive neurological
disorder. At autopsy, we weigh the brains.
Find the wts are normally distributed with a
mean of 1100 grams and an S.D. of 100 g.
• Find the probability that one of the brains
weighs less than 800 g.
Draw the N.D.
800
1100
Z = (800 – 1100)/100 = -3
P(X<800) = Area = 0.001
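The same calculation can be sketched in code, assuming Python 3.8 or later (`statistics.NormalDist`) and using the mean of 1100 g and S.D. of 100 g given above. The lower tail is read directly from the cumulative distribution; to three decimals it is 0.001.

```python
from statistics import NormalDist

brain_weights = NormalDist(mu=1100, sigma=100)  # grams

z = (800 - 1100) / 100          # distance from the mean in S.D.
p_below = brain_weights.cdf(800)  # area below 800 g

print(z)
print(f"{p_below:.4f}")
```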
The End
For Now
More Ranges
• The cholesterol levels for a certain
population are approximately normally
distributed with a mean of 200 mg/100 ml
& an S.D. of 20 mg/100 ml.
• Find the probabilities for an individual
picked at random to have cholesterol
levels in the following ranges
Mean = 200 mg/100ml
S.D. = 20 mg/100 ml
A. Between 180 & 200
B. Greater than 225
C. Between 190 & 210
Mean = 200 mg/100ml
S.D. = 20 mg/100 ml
A. Between 180 & 200
• Z1 = 0. Z2 = (180 – 200)/20 = -1
So the area is from the mean to one S.D. below it.
If it was both sides, would be 0.68. Since
only one side, half of that = 0.34.
P = 0.34.
Mean = 200 mg/100ml
S.D. = 20 mg/100 ml
B. Greater than 225
• Z = (225 – 200)/20 = 1.25
• Look it up. Area = 0.106
• P(X>225) = 0.106
Mean = 200 mg/100ml
S.D. = 20 mg/100 ml
C. Between 190 & 210
• Z1 = (190 – 200)/20 = -0.5. Look up Z = 0.5:
area = 0.309. But that is the tail. What is the area
from Z = -0.5 to the mean? 0.500 – 0.309 = 0.191
• Z2 = 0.5. Symmetrical. So Z2 to the
mean is also 0.191.
• P = 2 times 0.191 = 0.382
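All three ranges can be cross-checked in a few lines. This is a sketch assuming Python 3.8 or later (`statistics.NormalDist`), with the mean of 200 and S.D. of 20 from the exercise. The results differ from the table-based answers only in the third decimal place, because the table values are rounded.

```python
from statistics import NormalDist

chol = NormalDist(mu=200, sigma=20)  # mg/100 ml

# A. Between 180 and 200: area from one S.D. below the mean up to the mean.
p_a = chol.cdf(200) - chol.cdf(180)

# B. Greater than 225: upper-tail area above Z = 1.25.
p_b = 1.0 - chol.cdf(225)

# C. Between 190 and 210: half an S.D. on either side of the mean.
p_c = chol.cdf(210) - chol.cdf(190)

for label, p in (("A", p_a), ("B", p_b), ("C", p_c)):
    print(f"{label}: {p:.3f}")
```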