Normal Distribution: Reinforcement & Application

QM 2113 -- Fall 2003
Statistics for Decision Making
Normal Distribution: Reinforcement &
Applications
Instructor: John Seydel, Ph.D.
Student Objectives
Discuss the characteristics of normally
distributed random variables
Calculate probabilities for normal random
variables
Apply normal distribution concepts to practical
problems
Recognize other common probability
distributions
Summarize the concept of sampling
distributions
Administrative Items
Grading: haven’t gotten very far
Homework for today: not being collected
Next week: Exam 2
Now, a quiz . . .
Quiz #1
Put your name in the upper right corner of
the quiz
Answer Problems 1 & 2 on the back side of
the quiz
You may refer to your homework
But not to your text, notes, neighbor, . . .
Normal probability table: on screen (ask if
you need it adjusted)
Normal Distribution Review
Description of the distribution
Basic concepts
Determining probabilities
Don’t forget the sketch
Some Quick Exercises:
Mechanics
Let x ~ N(34,3) as with the mpg problem
Determine
Tail probabilities: F(30), which is the same as P(x ≤ 30); P(x > 40)
Tail complements: P(x > 30); P(x < 40)
Other: P(32 < x < 33); P(30 < x < 35); P(20 < x < 30)
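The quick exercises can be checked without a table. The sketch below uses Python's standard-library NormalDist and assumes that N(34,3) means a mean of 34 and a standard deviation of 3 (the slide does not say whether 3 is the standard deviation or the variance). The names `between` and `results` are illustrative only.

```python
from statistics import NormalDist

# Assumption: N(34, 3) is read as mean 34, standard deviation 3.
x = NormalDist(mu=34, sigma=3)

def between(lo, hi):
    """P(lo < x < hi) as a difference of two cumulative probabilities."""
    return x.cdf(hi) - x.cdf(lo)

results = {
    "P(x <= 30)": x.cdf(30),          # lower tail, F(30)
    "P(x > 40)": 1 - x.cdf(40),       # upper tail
    "P(x > 30)": 1 - x.cdf(30),       # complement of the lower tail
    "P(x < 40)": x.cdf(40),           # complement of the upper tail
    "P(32 < x < 33)": between(32, 33),
    "P(30 < x < 35)": between(30, 35),
    "P(20 < x < 30)": between(20, 30),
}

for label, p in results.items():
    print(f"{label:16s} = {p:.4f}")
```

Results may differ from table answers in the last digit, because a table lookup rounds z to two decimal places first.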
Questions About the Homework?
Data analysis (Web Analytics case)
Univariate
Bivariate (quantitative variables, qualitative variables)
Normal distribution problems
Basic mechanics
Applications (e.g., Handout)
Understanding the concepts
Midterm exam from spring
Other Distributions
Not everything is normally distributed
Consider data in claimdat.xls
Uniform distribution
Exponential distribution
Normal distribution
Should be able to recognize shapes
Also: be familiar with basic
characteristics
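One way to get a feel for the three shapes is to compare the mean and median of simulated data. The sketch below does not use claimdat.xls; it generates hypothetical data, and the parameters (uniform on 0 to 200, exponential with mean 100, normal with mean 100 and standard deviation 20) are assumptions chosen only for illustration.

```python
import random
from statistics import mean, median

random.seed(1)  # arbitrary seed so the run is repeatable
n = 10_000

# Hypothetical data sets, one per distribution shape.
samples = {
    "uniform": [random.uniform(0, 200) for _ in range(n)],
    "exponential": [random.expovariate(1 / 100) for _ in range(n)],
    "normal": [random.gauss(100, 20) for _ in range(n)],
}

for name, data in samples.items():
    # Mean close to median suggests symmetry; mean well above median suggests right skew.
    print(f"{name:12s} mean = {mean(data):7.2f}  median = {median(data):7.2f}")
```

The uniform and normal samples should show mean and median close together (symmetric shapes), while the exponential sample should show a mean well above the median (skewed right).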
Keep In Mind
Probability = proportion of area under the
normal curve
What we get when we use tables is always the
area between the mean and z standard
deviations from the mean
Because of symmetry
P(x > μ) = P(x < μ) = 0.5000
Tables show probabilities rounded to 4 decimal
places
If z < -3.89 then table probability ≈ 0.5000
If z > 3.89 then table probability ≈ 0.5000
Theoretically, P(x = a) = 0
P(30 ≤ x ≤ 35) = P(30 < x < 35)
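The table convention described on this slide (area between the mean and z) can be mimicked in a few lines. This is a sketch only: `table_area` and `lower_tail` are hypothetical helper names, and the standard normal comes from Python's standard-library NormalDist.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

def table_area(z):
    """Area between the mean and z standard deviations from the mean."""
    return abs(Z.cdf(z) - 0.5)

def lower_tail(z):
    """P(Z < z), rebuilt from the table area and the 0.5000 on each side of the mean."""
    if z >= 0:
        return 0.5 + table_area(z)
    return 0.5 - table_area(z)

for z in (-4.0, -1.0, 1.0, 4.0):
    print(f"z = {z:5.2f}  table area = {table_area(z):.4f}  P(Z < z) = {lower_tail(z):.4f}")
```

Because of symmetry the table area is the same for z and -z; the sketch is what tells you whether to add it to, or subtract it from, 0.5000.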
Now, An Application to Sampling
Descriptive numerical measures calculated
from the entire population are called
parameters.
Numeric data: μ and σ
Categorical data: p (proportion)
Corresponding measures for a sample are
called statistics.
Numeric data: x-bar and s
Categorical data: p-hat
A Demonstration
Draw a sample of 50 observations
x ~ N(100,20)
Calculate the average
Note that x-bar doesn’t equal μ
Repeat multiple times
Average the averages
Look at the distribution of the averages
Take a look also at the variances and
standard deviations
Now consider x ~ Exponential(100)
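The demonstration can be repeated at home with a short simulation. The sketch below makes several assumptions not stated on the slide: N(100,20) is read as mean 100 and standard deviation 20, Exponential(100) is read as an exponential distribution with mean 100, and 2000 repeated samples is an arbitrary choice. The helper name `sample_means` is illustrative.

```python
import random
from statistics import mean, stdev

random.seed(2113)  # arbitrary seed so the run is repeatable

N_OBS = 50         # observations per sample, as on the slide
N_SAMPLES = 2000   # number of repeated samples (an assumption)

def sample_means(draw):
    """Return the averages of N_SAMPLES samples, each made of N_OBS draws."""
    return [mean(draw() for _ in range(N_OBS)) for _ in range(N_SAMPLES)]

# Assumption: mean 100, standard deviation 20.
normal_means = sample_means(lambda: random.gauss(100, 20))
# Assumption: exponential with mean 100, i.e. rate 1/100 (its std dev is also 100).
expo_means = sample_means(lambda: random.expovariate(1 / 100))

for name, means, sigma in [("normal", normal_means, 20),
                           ("exponential", expo_means, 100)]:
    print(name)
    print("  average of the averages:", round(mean(means), 2))
    print("  std dev of the averages:", round(stdev(means), 2))
    print("  sigma / sqrt(n):        ", round(sigma / N_OBS ** 0.5, 2))
```

In both cases the average of the averages should land near 100, and the spread of the averages should be far smaller than the spread of the individual observations.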
Sampling Distributions
Quantitative data
Expected value for x-bar is the population or
process average (i.e., μ)
Expected variation in x-bar from one sample
average to another is known as the standard
error of the mean and is equal to σ/√n
Distribution of x-bar is approximately normal
when n is large enough (Central Limit Theorem)
Qualitative data: we’ll get to this another time
An Example
Supposedly, WNB executive salaries equal
industry on average (μ = $80,000)
But sample results were
x-bar = $68,270
s = $18,599
If truly μ = $80,000
Assume for now that σ = s = $18,599
What is P(x-bar < 68270)?
What is P(x-bar < 68270 or x-bar > 91730)?
Some Answers
Given assumptions about μ and σ
Standard error:
σ/√n = 18599/√15 ≈ 4802
An x-bar value of 68270 is -2.44 standard errors
from the supposed population average
Table probability = 0.4927
Thus P(x-bar < 68270) = 0.5000 - 0.4927 = 0.0073, or about 0.7%
And P(x-bar < 68270 or x-bar > 91730) = 2 × 0.0073 = 0.0146, or about 1.5%
Now, consider how this might be put to use in
addressing the claim
Bring action against WNB (false claim?)
What’s the probability of doing so in error?
Maybe a confidence interval estimate could
be helpful . . .
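The arithmetic for the WNB example can be reproduced as a check. The sketch below uses the slide's figures (μ = 80,000, σ assumed equal to s = 18,599, n = 15) and Python's standard-library NormalDist in place of the table; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu = 80_000      # claimed population mean
sigma = 18_599   # assumed equal to the sample standard deviation
n = 15           # sample size used in the standard-error calculation
x_bar = 68_270   # observed sample mean

std_error = sigma / sqrt(n)        # standard error of the mean
z = (x_bar - mu) / std_error       # distance from mu in standard errors

lower = NormalDist().cdf(z)        # P(x-bar < 68270)
two_sided = 2 * lower              # adds P(x-bar > 91730) by symmetry

print(f"standard error         = {std_error:.0f}")
print(f"z                      = {z:.2f}")
print(f"P(x-bar < 68270)       = {lower:.4f}")
print(f"two-tailed probability = {two_sided:.4f}")
```

The two-tailed figure works because 91730 is exactly as far above 80,000 as 68270 is below it.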
Putting Sampling Theory to Work
We need to make decisions based on
characteristics of a process or population
But it’s not feasible to measure the entire
population or process; instead we do
sampling
Therefore, we need to make conclusions
about those characteristics based upon
limited sets of observations (samples)
These conclusions are inferences applying
knowledge of sampling theory
Summary of Objectives
Discuss the characteristics of normally
distributed random variables
Calculate probabilities for normal
random variables
Apply normal distribution concepts to
practical problems
Recognize other common probability
distributions
Summarize the concept of sampling
distributions
Appendix
Random Variables
[Diagram labels: Population or Process; Random Variable (x); Parameters (μ, σ)]
Sampling
[Diagram labels: Population; Sample; Statistic; Parameter]
Schematic View
[Diagram labels: Statistics; Numeric Data (Informal, Summary Measures, Inferential Analyses); Categorical Data (Informal, Summary Measures, Inferential Analyses)]
Probability is what allows the linkage between descriptive and inferential
analyses
If Something’s Normally Distributed
It’s described by
μ (the population/process average)
σ (the population/process standard deviation)
Histogram is symmetric
Thus no skew (average = median)
So P(x < μ) = P(x > μ) = . . . ?
Shape of histogram can be described by
$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-(x-\mu)^2 / (2\sigma^2)}$
We determine probabilities based upon
distance from the mean (i.e., the number of
standard deviations)
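The density formula and the "number of standard deviations" idea can be written out directly. This is a sketch: `normal_pdf` and `z_score` are hypothetical helper names, and the values mean 34 and standard deviation 3 are borrowed from the mpg exercise purely for illustration.

```python
from math import exp, pi, sqrt
from statistics import NormalDist

def normal_pdf(x, mu, sigma):
    """Height of the normal curve at x for mean mu and standard deviation sigma."""
    return exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * sqrt(2 * pi))

def z_score(x, mu, sigma):
    """Distance of x from the mean, measured in standard deviations."""
    return (x - mu) / sigma

mu, sigma = 34, 3  # illustrative values only

for x in (28, 31, 34, 37, 40):
    print(f"x = {x}  z = {z_score(x, mu, sigma):+.2f}  f(x) = {normal_pdf(x, mu, sigma):.5f}")

# Cross-check against the standard library's own density.
print(abs(normal_pdf(31, mu, sigma) - NormalDist(mu, sigma).pdf(31)) < 1e-12)
```

The curve height is the same at equal distances on either side of the mean, which is the symmetry the slide relies on.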
A Sketch is Essential!
Use to identify regions of concern
Enables putting together results of
calculations, lookups, etc.
Doesn’t need to be perfect; just needs
to indicate relative positioning
Make it large enough to work with;
needs annotation (probabilities,
comments, etc.)