Introduction to Inference (Hypothesis Testing)
Business Statistics
QM 2113 - Spring 2002
Introduction to
Inference:
Hypothesis Testing
Student Objectives
Review concepts of sampling
distributions
List and distinguish between the two
types of inference
Summarize hypothesis testing
procedures
Conduct hypothesis tests concerning
population/process averages
Understand how to use tables for the t
distribution
Recall: Parameters
versus Statistics
Descriptive numerical measures
calculated from the entire population
are called parameters.
– Quantitative data: μ and σ
– Qualitative data: π (proportion)
Corresponding measures for a sample
are called statistics.
– Quantitative data: x-bar and s
– Qualitative data: p
The Sampling Process
Population or Process
Sample
Statistic
Parameter
Sampling Distributions
Quantitative data
– Expected value for x-bar is the population or
process average (i.e., μ)
– Expected variation in x-bar from one sample
average to another is
• Known as the standard error of the mean
• Equal to σ/√n
– Distribution of x-bar is approx normal (CLT)
Qualitative data
– E(p) is π
– Standard error is √(π(1 − π)/n)
– Distribution of p is approx normal (CLT)
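The two standard-error formulas on this slide can be sketched in a few lines of Python. This is only an illustration: the function names are made up for this example, `sigma` stands for the population standard deviation, `pi` for the population proportion, and the input values are hypothetical.

```python
import math

def standard_error_mean(sigma: float, n: int) -> float:
    """Standard error of x-bar: sigma / sqrt(n)."""
    return sigma / math.sqrt(n)

def standard_error_proportion(pi: float, n: int) -> float:
    """Standard error of the sample proportion: sqrt(pi * (1 - pi) / n)."""
    return math.sqrt(pi * (1 - pi) / n)

# Hypothetical inputs, chosen only so the arithmetic is easy to follow
se_mean = standard_error_mean(sigma=20.0, n=100)     # 20 / 10
se_prop = standard_error_proportion(pi=0.5, n=100)   # sqrt(0.25 / 100)
print(se_mean, se_prop)
```

Both functions shrink as n grows, which is why larger samples give sample averages and proportions that vary less from one sample to the next.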
A Review Example from
the Homework
Supposedly, WNB executive salaries
equal industry on average (μ = 80,000)
But sample results were
– x-bar = $68,270
– s = $18,599
If truly μ = 80,000
– Assume for now that σ = s = 18599
– What is P(x-bar < 68270)?
– What is P(x-bar < 68270 or x-bar > 91730) ?
Some Answers
Given assumptions about μ and σ
– Standard error:
σ/√n = 18599/√15 ≈ 4800
– An x-bar value of 68270 is -2.44 standard
errors from the supposed population average
• Table probability = 0.4927
• Thus P(x-bar < 68270) = 0.5000 − 0.4927 = 0.0073, or about 0.7%
• And P(x-bar < 68270 or x-bar > 91730) = 2 × 0.0073, or about 1.5%
Now, consider how this might be put to
use in addressing the claim
– Bring action against WNB (false claim?)
– What’s the probability of doing so in error?
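The arithmetic in this example can be checked with a short Python sketch. It uses `statistics.NormalDist` from the standard library in place of the printed normal table, and it keeps the slide's assumption that the population standard deviation equals the sample value of 18,599 with n = 15.

```python
import math
from statistics import NormalDist

mu_0 = 80_000    # claimed industry average
x_bar = 68_270   # observed sample average
s = 18_599       # sample standard deviation, assumed equal to the population value
n = 15           # sample size used in the standard error calculation

se = s / math.sqrt(n)               # standard error of the mean, roughly 4800
z = (x_bar - mu_0) / se             # distance from the claim in standard errors
lower_tail = NormalDist().cdf(z)    # P(x-bar < 68270) if the claim is true
two_tail = 2 * lower_tail           # 91730 is the same distance above 80000

print(round(se), round(z, 2), round(lower_tail, 4), round(two_tail, 4))
```

The lower-tail probability comes out near 0.007 and the two-tail probability near 0.015, in line with the table-based answers.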
Putting Sampling Theory
to Work
We need to make decisions based on
characteristics of a process or
population
But it’s not feasible to measure the
entire population or process; instead we
do sampling
Therefore, we need to make conclusions
about those characteristics based upon
limited sets of observations (samples)
These conclusions are inferences
applying knowledge of sampling theory
Two Types of
Statistical Inference
Hypothesis testing
– Starts with a hypothesis (i.e., claim,
assumption, standard, etc.) about a population
parameter (μ, π, σ, β1, distribution, . . . )
– Sample results are compared with the
hypothesis
– Based upon how likely the observed results
are, given the hypothesis, a conclusion is made
Estimation: a population parameter is
concluded to be equal to a sample
result, give or take a margin of error,
which is based upon a desired level of
confidence
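As a rough illustration of estimation, the sketch below computes a sample result "give or take a margin of error" for the WNB salary data from the review example. The 95% confidence level is an assumption made here, not a value from the slides, and the normal distribution is used with s standing in for the population standard deviation.

```python
import math
from statistics import NormalDist

x_bar, s, n = 68_270, 18_599, 15   # WNB sample results from the review example
confidence = 0.95                  # assumed level of confidence (not from the slides)

# z value that leaves (1 - confidence) / 2 in each tail
z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
margin = z * s / math.sqrt(n)      # margin of error = z times the standard error
low, high = x_bar - margin, x_bar + margin

print(f"{x_bar:,} give or take {margin:,.0f}: ({low:,.0f}, {high:,.0f})")
```

A higher desired level of confidence gives a larger z and therefore a wider margin of error.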
Hypothesis Testing
Start by defining hypotheses
– Null (H0):
• What we’ll believe until proven otherwise
• We state this first when we're testing whether
something has changed
– Alternate (HA):
• Opposite of H0
• If we’re trying to prove something, we state it as HA
and start with this, not the null
Then state willingness to make a wrong
conclusion (α)
Determine the decision rule (DR)
Gather data and compare results to DR
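The four steps can be sketched as a single function. This is a minimal example, assuming a lower-tail test on a mean (HA: the average is below the claimed value), a known population standard deviation, and a hypothetical risk level of 0.05; the function name is invented for illustration.

```python
import math
from statistics import NormalDist

def lower_tail_z_test(x_bar, mu_0, sigma, n, alpha):
    """Test H0: mu >= mu_0 against HA: mu < mu_0."""
    # Steps 1 and 2 happen before any data are examined:
    # the hypotheses are fixed above and alpha is passed in.
    # Step 3: decision rule, reject H0 if z falls below the critical value.
    z_crit = NormalDist().inv_cdf(alpha)
    # Step 4: gather data, compute the test statistic, compare to the rule.
    z = (x_bar - mu_0) / (sigma / math.sqrt(n))
    return z, z_crit, z < z_crit

# WNB salary data from the review example; alpha = 0.05 is an assumption
z, z_crit, reject = lower_tail_z_test(68_270, 80_000, 18_599, 15, alpha=0.05)
print(round(z, 2), round(z_crit, 3), reject)
```

Here the claim being "proved" (salaries are below the industry average) sits in HA, and H0 is rejected only if the sample average falls far enough below 80,000.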
The Logic Involved
Suppose someone makes a statement and
you wonder about whether or not it’s true
You typically do some research and get
some evidence
If the evidence contradicts the statement
but not by much, you typically let it slide
(but you’re not necessarily convinced)
However, if the evidence is overwhelming,
you’re convinced and you take action
This is hypothesis testing!
Statistics helps us to determine what is
“overwhelming”
Errors in Hypothesis
Testing
Type I: rejecting a true H0
Type II: accepting a false H0
Probabilities
α = P(Type I)
β = P(Type II)
Power = P(Rejecting false H0) = 1 − β
Controlling risks
– Decision rule controls α
– Sample size controls β
Worst error: Type III (solving the
wrong problem)!
Hence, be sure H0 and HA are correct
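To see how sample size controls the Type II risk, the sketch below computes that risk for a two-tailed test on a mean. It assumes a normal sampling distribution with known standard deviation; the "true" average of 70,000 and the risk level of 0.05 are hypothetical values chosen only for illustration.

```python
import math
from statistics import NormalDist

def type_ii_error(mu_0, mu_true, sigma, n, alpha):
    """P(failing to reject H0: mu = mu_0) when the average is really mu_true."""
    se = sigma / math.sqrt(n)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    # Range of x-bar values for which H0 is NOT rejected
    lower, upper = mu_0 - z_crit * se, mu_0 + z_crit * se
    # Sampling distribution of x-bar under the true average
    sampling_dist = NormalDist(mu_true, se)
    return sampling_dist.cdf(upper) - sampling_dist.cdf(lower)

# Hypothetical true average of 70,000; other figures from the WNB example
beta_15 = type_ii_error(80_000, 70_000, 18_599, n=15, alpha=0.05)
beta_60 = type_ii_error(80_000, 70_000, 18_599, n=60, alpha=0.05)
print(f"n=15: beta={beta_15:.3f}, power={1 - beta_15:.3f}")
print(f"n=60: beta={beta_60:.3f}, power={1 - beta_60:.3f}")
```

With the decision rule's risk level held fixed, quadrupling the sample size makes the Type II probability much smaller, so power rises.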
Stating the Decision
Rule
First, note that no analysis should take
place before DR is in place!
Can state any of three ways
– Critical value of observed statistic (x-bar)
– Critical value of test statistic (z)
– Critical value of likelihood of observed result
(p-value)
Generally, test statistics are used when
results are generated manually and p-values
are used when results are determined via computer
Always indicate on sketch of distribution
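The three ways of stating the decision rule always lead to the same conclusion. The sketch below shows this for a two-tailed test using the WNB figures from the review example; the risk level of 0.05 is an assumed value, and the population standard deviation is again taken to equal s.

```python
import math
from statistics import NormalDist

mu_0, sigma, n, alpha = 80_000, 18_599, 15, 0.05   # alpha is an assumed value
x_bar = 68_270
se = sigma / math.sqrt(n)
std_normal = NormalDist()
z_crit = std_normal.inv_cdf(1 - alpha / 2)

# Form 1: critical values of the observed statistic (x-bar)
x_lo, x_hi = mu_0 - z_crit * se, mu_0 + z_crit * se
reject_by_xbar = x_bar < x_lo or x_bar > x_hi

# Form 2: critical value of the test statistic (z)
z = (x_bar - mu_0) / se
reject_by_z = abs(z) > z_crit

# Form 3: critical value of the likelihood of the observed result (p-value)
p_value = 2 * std_normal.cdf(-abs(z))
reject_by_p = p_value < alpha

print(reject_by_xbar, reject_by_z, reject_by_p)
```

All three flags agree because each form is the same cutoff expressed on a different scale: dollars, standard errors, or probability.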
Some Exercises
Addressing the Mean
Don’t forget to sketch distributions!
Large sample (CLT applies)
– One tail hypothesis (#8-3)
– Two tail hypothesis (#8-8)
Small sample (introducing the t
distribution)
– One tail hypothesis (#8-5)
– Note: we’re really always using the t
distribution
• Applies whenever s is used to estimate the standard
error
• It just becomes obvious when sample sizes are small
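Using a t table amounts to a lookup by degrees of freedom (n - 1) and tail area. The sketch below hard-codes a few rows copied from a standard t table and reuses the WNB figures as a small-sample illustration; the two-tailed risk level of 0.05 and the helper name `t_critical` are assumptions made for this example.

```python
import math

# Upper-tail critical values from a standard t table: {df: {tail area: t}}
T_TABLE = {
    5:  {0.05: 2.015, 0.025: 2.571},
    10: {0.05: 1.812, 0.025: 2.228},
    14: {0.05: 1.761, 0.025: 2.145},
    30: {0.05: 1.697, 0.025: 2.042},
}

def t_critical(n: int, upper_tail_area: float) -> float:
    """Look up the critical t value for a sample of size n (df = n - 1)."""
    return T_TABLE[n - 1][upper_tail_area]

# WNB salary data; s estimates the standard error, so t applies
x_bar, mu_0, s, n = 68_270, 80_000, 18_599, 15
t = (x_bar - mu_0) / (s / math.sqrt(n))
t_crit = t_critical(n, 0.025)     # two-tailed test, 0.025 in each tail
reject = abs(t) > t_crit

print(round(t, 2), t_crit, reject)
```

The critical values in the table shrink toward the normal values (1.645 and 1.960) as the degrees of freedom grow, which is why the difference only becomes obvious for small samples.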
Homework
Section 8-1:
– Reread
– Rework exercises
Read Section 8-3
Work exercises: 28, 29, 30, 34