Transcript Chapter 6

Chapter 6
The Normal Distribution and Other
Continuous Distributions
6.1: Continuous Probability
Distributions
• Continuous Random Variables
– If X is a continuous RV, then P(X=a) = 0,
where “a” is any individual unique value
– Because X has ∞ individual unique values
– P(a ≤ X ≤ b) = “something nonzero” where
“a” to “b” represents an interval
• Normal is most important continuous
probability distribution.
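The point-versus-interval distinction above can be checked numerically. This is a minimal sketch using Python's standard-library `statistics.NormalDist`; the choice of a normal model and the values 100 and 15 are illustrative assumptions, not taken from the slides.

```python
from statistics import NormalDist

# Hypothetical continuous RV: normal with mean 100 and standard deviation 15.
X = NormalDist(mu=100, sigma=15)

def interval_prob(dist, a, b):
    """P(a <= X <= b), computed as a difference of cumulative probabilities."""
    return dist.cdf(b) - dist.cdf(a)

# A single value is an interval of zero width, so its probability is 0.
p_point = interval_prob(X, 110, 110)

# A genuine interval gets a nonzero probability.
p_interval = interval_prob(X, 85, 115)

print(p_point)
print(round(p_interval, 4))
```

The interval from 85 to 115 is one standard deviation either side of the assumed mean, so the second value lands near the familiar 68%.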
6.2: Normal Distribution
• Also known as “Gaussian Distribution”
• Works close enough for a lot of continuous
RVs.
• Works close enough for a few discrete RVs.
• Necessary for our inferential statistics.
• Bell-shaped and symmetric.
• All measures of central tendency are equal.
• In theory, X is continuous and unbounded.
Normal RV
• Probabilities for discrete RV were given
by a probability distribution function.
• Probabilities for continuous RV are given
by a probability DENSITY function
(pdf).
• Normal pdf requires you to know two
parameters to find probabilities: μ and σ.
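For reference, the normal density is conventionally written as below, with μ the mean and σ the standard deviation of X. The symbol f(X) is the usual textbook notation and is an assumption about how the text itself writes it.

```latex
f(X) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{1}{2}\left(\frac{X-\mu}{\sigma}\right)^{2}}
```

Both parameters appear explicitly, which is why every different pair of μ and σ gives a different curve.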
Finding Normal Probabilities
• Equation 6.1: fun but not useful.
• Like to have a table for each combination of μ
and σ.
– Can’t.
• Generate 1 table that can be used by everyone.
– Get everyone to convert or transform data so
that it works with that one table!
– Transform X into Z
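The "one table for everyone" idea can be sketched as follows: transform X into Z with Z = (X - μ)/σ, then look the Z value up in the standard normal distribution. Here `statistics.NormalDist` stands in for the printed table, and the values 50, 8, and 62 are hypothetical, chosen only for illustration.

```python
from statistics import NormalDist

# The single shared "table": the standard normal (mean 0, standard deviation 1).
Z = NormalDist(mu=0, sigma=1)

def to_z(x, mu, sigma):
    """Transform a value of X into a Z score."""
    return (x - mu) / sigma

# Hypothetical parameters and value for X.
mu, sigma = 50, 8
x = 62

z = to_z(x, mu, sigma)                    # (62 - 50) / 8 = 1.5
p_from_table = Z.cdf(z)                   # P(Z < 1.5) from the shared table
p_direct = NormalDist(mu, sigma).cdf(x)   # P(X < 62) without transforming

print(z, round(p_from_table, 4), round(p_direct, 4))
```

The two probabilities agree, which is the reason a single standardized table is enough for any combination of μ and σ.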
6.3: Evaluating Normality
• The assumption of Normality is
made all the time: sometimes
correctly so, and sometimes
incorrectly so.
• Said another way: not all
continuous random variables are
normally distributed.
Checking Normality
• Text discusses two ways in this section
(other ways discussed in Stat 2!)
1. Compare what you know about the
data to what you know about the
normal distribution.
2. Construct a normal probability plot.
Comparing actual data to theory
• Central tendency: actual data mean,
median, and mode should be similar.
• Variability:
– Is the interquartile range about equal
to 1.33*the standard deviation?
– Is the range about equal to 6 times
the standard deviation?
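The central tendency and variability checks above can be sketched in a few lines. The sample below is invented for illustration. Quartile conventions also differ between textbooks and software, so the interquartile range from `statistics.quantiles` may not match a hand calculation exactly.

```python
from statistics import mean, median, mode, quantiles, stdev

# Hypothetical sample, invented for illustration.
data = [61, 64, 66, 67, 68, 69, 70, 70, 71, 72, 73, 74, 76, 79]

# Central tendency: these three should be similar for normal-looking data.
center = (mean(data), median(data), mode(data))

# Variability: compare the IQR and the range with the standard deviation.
s = stdev(data)
q1, _, q3 = quantiles(data, n=4)   # default "exclusive" quartile method
iqr = q3 - q1
data_range = max(data) - min(data)

iqr_ratio = iqr / s            # compare with 1.33
range_ratio = data_range / s   # compare with 6

print(center)
print(round(iqr_ratio, 2), round(range_ratio, 2))
```

In a small sample like this one the range usually falls well short of 6 standard deviations, so the range check is more informative for large data sets.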
Comparing actual data to theory (cont.)
• Shape:
– plot the data and check for symmetry.
– check to determine if the Empirical
Rule applies.
• Sometimes samples are small--is the
data non-normal or do you have a
non-representative sample?
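One way to check whether the Empirical Rule applies is to count the share of observations within 1, 2, and 3 standard deviations of the mean and compare with roughly 68%, 95%, and 99.7%. The sample and the helper name `share_within` below are hypothetical.

```python
from statistics import mean, stdev

# Hypothetical sample, invented for illustration.
data = [61, 64, 66, 67, 68, 69, 70, 70, 71, 72, 73, 74, 76, 79]

def share_within(values, k):
    """Fraction of values within k sample standard deviations of the sample mean."""
    m, s = mean(values), stdev(values)
    return sum(abs(v - m) <= k * s for v in values) / len(values)

# Empirical Rule benchmarks for a normal distribution.
benchmarks = [(1, 0.68), (2, 0.95), (3, 0.997)]
for k, expected in benchmarks:
    print(k, round(share_within(data, k), 3), expected)
```

With only 14 observations each share can move only in steps of about 7 percentage points, which illustrates why a small sample makes the comparison rough.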
Normal Probability Plot
• Best left to software.
• The straighter the line, the better the
sample approximates a normal
distribution.
• Systematic deviation from a straight
line indicates non-normality.
Plot Construction
• Order the data
• Use inverse normal scores
transformation to find the
standardized normal quantile for each
data point.
° P(Z < O_i) = i/(n+1), where i is the rank of the
data point and n is the sample size
° i.e. solve for O_i for the 1st data point, the
2nd data point, etc.
Plot Construction (cont.)
• Plot the data points:
– actual values on the Y axis
– Standardized Normal Quantiles on the X
axis
• A roughly straight line is consistent with normality.
• A clearly non-straight line indicates non-normality.
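The construction steps can be sketched as follows: order the data, solve P(Z < O_i) = i/(n+1) for each rank i using the inverse of the standard normal CDF, and pair each quantile with its data value. The function name `normal_plot_points` and the sample are hypothetical. Drawing the plot itself is left to whatever plotting software is at hand.

```python
from statistics import NormalDist

def normal_plot_points(values):
    """Return (standardized normal quantile, data value) pairs, smallest first."""
    ordered = sorted(values)
    n = len(ordered)
    std_normal = NormalDist()  # mean 0, standard deviation 1
    # Solve P(Z < O_i) = i / (n + 1) for O_i, for i = 1, ..., n.
    scores = [std_normal.inv_cdf(i / (n + 1)) for i in range(1, n + 1)]
    return list(zip(scores, ordered))

# Hypothetical sample, invented for illustration.
data = [72, 61, 70, 68, 79, 66, 74, 70, 64]
points = normal_plot_points(data)

for score, value in points:
    print(f"{score:6.2f}  {value}")
```

Each pair gives one plotted point, with the first element on the X axis (standardized normal quantile) and the second on the Y axis (actual value).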