Introduction to Inference (Hypothesis Testing)

Business Statistics
QM 2113 - Spring 2002
Introduction to Inference: Hypothesis Testing
Student Objectives
■ Review concepts of sampling distributions
■ List and distinguish between the two types of inference
■ Summarize hypothesis testing procedures
■ Conduct hypothesis tests concerning population/process averages
■ Understand how to use tables for the t distribution
Recall: Parameters versus Statistics
■ Descriptive numerical measures calculated from the entire population are called parameters.
– Quantitative data: μ and σ
– Qualitative data: π (proportion)
■ Corresponding measures for a sample are called statistics.
– Quantitative data: x-bar and s
– Qualitative data: p
The Sampling Process
[Diagram linking Population or Process, Sample, Statistic, and Parameter]
Sampling Distributions
■ Quantitative data
– Expected value for x-bar is the population or process average (i.e., μ)
– Expected variation in x-bar from one sample average to another is
• Known as the standard error of the mean
• Equal to σ/√n
– Distribution of x-bar is approx. normal (CLT)
■ Qualitative data
– E(p) is π
– Standard error is √(π(1 - π)/n)
– Distribution of p is approx. normal (CLT)
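The two standard-error formulas above can be sketched in a few lines of Python. This is only an illustration: the function names and the input values are hypothetical and do not come from the course materials.

```python
import math

def standard_error_mean(sigma: float, n: int) -> float:
    """Standard error of x-bar: population std. dev. divided by sqrt(n)."""
    return sigma / math.sqrt(n)

def standard_error_proportion(pi: float, n: int) -> float:
    """Standard error of the sample proportion: sqrt(pi * (1 - pi) / n)."""
    return math.sqrt(pi * (1.0 - pi) / n)

# Hypothetical inputs, chosen only so the arithmetic is easy to follow.
se_mean = standard_error_mean(sigma=20.0, n=100)      # 20 / 10
se_prop = standard_error_proportion(pi=0.5, n=100)    # sqrt(0.25 / 100)
print(se_mean, se_prop)
```

Note that quadrupling the sample size only halves either standard error, because n enters through its square root.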
A Review Example from the Homework
■ Supposedly, WNB executive salaries equal the industry's on average (μ = 80,000)
■ But sample results (n = 15) were
– x-bar = $68,270
– s = $18,599
■ If truly μ = 80,000
– Assume for now that σ = s = 18,599
– What is P(x-bar < 68,270)?
– What is P(x-bar < 68,270 or x-bar > 91,730)?
Some Answers
■ Given assumptions about μ and σ
– Standard error: σ/√n = 18,599/√15 ≈ 4,800
– An x-bar value of 68,270 is -2.44 standard errors from the supposed population average
• Table probability = 0.4927
• Thus P(x-bar < 68,270) = 0.5000 - 0.4927 = 0.0073, or about 0.7%
• And P(x-bar < 68,270 or x-bar > 91,730) = 2 × 0.0073 = 0.0146, or about 1.5%
■ Now, consider how this might be put to use in addressing the claim
– Bring action against WNB (false claim?)
– What’s the probability of doing so in error?
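The table lookup above can be cross-checked with Python's standard library. The sketch below uses `statistics.NormalDist` in place of the printed normal table, and, like the slide, it treats the sample standard deviation as if it were the population value. The variable names are illustrative.

```python
import math
from statistics import NormalDist

mu_0 = 80_000     # claimed industry average
x_bar = 68_270    # observed sample average
s = 18_599        # sample std. dev., used here as a stand-in for sigma
n = 15            # sample size

std_error = s / math.sqrt(n)
z = (x_bar - mu_0) / std_error   # distance from the claim, in standard errors

std_normal = NormalDist()        # mean 0, standard deviation 1
p_lower = std_normal.cdf(z)      # P(x-bar < 68,270) if the claim is true
p_two_tail = 2 * p_lower         # 91,730 is equally far above 80,000

print(f"standard error = {std_error:,.0f}")
print(f"z = {z:.2f}")
print(f"P(x-bar < 68,270) = {p_lower:.4f}")
print(f"P(either tail)    = {p_two_tail:.4f}")
```

Small differences from the hand calculation are expected, since the table requires rounding z to two decimals first.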
Putting Sampling Theory to Work
■ We need to make decisions based on characteristics of a process or population
■ But it’s not feasible to measure the entire population or process; instead we sample
■ Therefore, we need to draw conclusions about those characteristics based upon limited sets of observations (samples)
■ These conclusions are inferences, made by applying knowledge of sampling theory
The Sampling Process
[Diagram linking Population or Process, Sample, Statistic, and Parameter]
Two Types of Statistical Inference
■ Hypothesis testing
– Starts with a hypothesis (i.e., claim, assumption, standard, etc.) about a population parameter (μ, π, σ, β1, distribution, ...)
– Sample results are compared with the hypothesis
– Based upon how likely the observed results are, given the hypothesis, a conclusion is made
■ Estimation: a population parameter is concluded to be equal to a sample result, give or take a margin of error, which is based upon a desired level of confidence
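To make "give or take a margin of error" concrete, here is a minimal sketch of an interval estimate for a mean. It assumes a normal-based margin with the population standard deviation treated as known, and a 95% confidence level. Both are assumptions for illustration, and the function name is hypothetical. The inputs reuse the WNB salary figures from the review example.

```python
import math
from statistics import NormalDist

def z_confidence_interval(x_bar, sigma, n, confidence=0.95):
    """Return (low, high): sample mean plus or minus a normal-based margin."""
    # z multiplier that leaves (1 - confidence) / 2 in each tail
    z_star = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z_star * sigma / math.sqrt(n)
    return x_bar - margin, x_bar + margin

# WNB figures from the review example; 95% is an assumed confidence level.
low, high = z_confidence_interval(x_bar=68_270, sigma=18_599, n=15)
print(f"estimate: {low:,.0f} to {high:,.0f}")
```

A higher confidence level widens the interval, and a larger sample narrows it.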
Hypothesis Testing
■ Start by defining hypotheses
– Null (H0):
• What we’ll believe until proven otherwise
• We state this first if we’re seeing whether something has changed
– Alternate (HA):
• Opposite of H0
• If we’re trying to prove something, we state it as HA and start with this, not the null
■ Then state willingness to make a wrong conclusion (α)
■ Determine the decision rule (DR)
■ Gather data and compare results to DR
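The steps above can be sketched as a single function for a test about a mean. This is a simplified illustration, not the course's prescribed procedure: it uses a z statistic with the standard deviation treated as known, and the function name, the `tail` argument, and the default risk level of 0.05 are all assumptions.

```python
import math
from statistics import NormalDist

def z_test_mean(x_bar, mu_0, sigma, n, alpha=0.05, tail="two"):
    """Test H0: mu = mu_0. tail is 'two', 'lower' (HA: mu < mu_0),
    or 'upper' (HA: mu > mu_0). Returns (z, z_crit, reject)."""
    std_normal = NormalDist()

    # Decision rule, fixed from alpha alone before the data are examined.
    if tail == "two":
        z_crit = std_normal.inv_cdf(1 - alpha / 2)
    elif tail in ("lower", "upper"):
        z_crit = std_normal.inv_cdf(1 - alpha)
    else:
        raise ValueError("tail must be 'two', 'lower', or 'upper'")

    # Gather data: express the sample result as a test statistic.
    z = (x_bar - mu_0) / (sigma / math.sqrt(n))

    # Compare the result to the decision rule.
    if tail == "two":
        reject = abs(z) > z_crit
    elif tail == "lower":
        reject = z < -z_crit
    else:
        reject = z > z_crit
    return z, z_crit, reject

# WNB review example, with an assumed two-tail test at alpha = 0.05.
z, z_crit, reject = z_test_mean(68_270, 80_000, 18_599, 15, alpha=0.05, tail="two")
print(f"z = {z:.2f}, critical z = {z_crit:.2f}, reject H0: {reject}")
```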
The Logic Involved
■ Suppose someone makes a statement and you wonder about whether or not it’s true
■ You typically do some research and get some evidence
■ If the evidence contradicts the statement but not by much, you typically let it slide (but you’re not necessarily convinced)
■ However, if the evidence is overwhelming, you’re convinced and you take action
■ This is hypothesis testing!
■ Statistics helps us to determine what is “overwhelming”
Errors in Hypothesis Testing
■ Type I: rejecting a true H0
■ Type II: accepting (failing to reject) a false H0
■ Probabilities
– α = P(Type I)
– β = P(Type II)
– Power = P(rejecting a false H0) = 1 - β (no error when H0 is false)
■ Controlling risks
– Decision rule controls α
– Sample size controls β
■ Worst error: Type III (solving the wrong problem)!
■ Hence, be sure H0 and HA are correct
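To see how sample size controls the Type II risk once the decision rule has fixed the Type I risk, the sketch below computes the Type II probability and power for a lower-tail z test. The "true" average of 70,000, the 0.05 risk level, and the function name are hypothetical values chosen for illustration. Only the 80,000 claim and the 18,599 standard deviation come from the review example.

```python
import math
from statistics import NormalDist

def beta_lower_tail(mu_0, mu_true, sigma, n, alpha):
    """P(Type II) for H0: mu = mu_0 vs HA: mu < mu_0 when mu is really mu_true."""
    se = sigma / math.sqrt(n)
    # Decision rule: reject H0 when x-bar falls below this critical value.
    critical_x_bar = mu_0 - NormalDist().inv_cdf(1 - alpha) * se
    # Type II error: x-bar lands at or above the critical value
    # even though the true average is mu_true.
    return 1 - NormalDist(mu_true, se).cdf(critical_x_bar)

# Hypothetical scenario: the true average is 70,000, alpha is held at 0.05.
betas = {n: beta_lower_tail(80_000, 70_000, 18_599, n, alpha=0.05)
         for n in (15, 30, 60)}
for n, beta in betas.items():
    print(f"n = {n:2d}: beta = {beta:.3f}, power = {1 - beta:.3f}")
```

With the Type I risk held constant, the Type II probability shrinks as n grows, which is the sense in which sample size controls it.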
Stating the Decision Rule
■ First, note that no analysis should take place before the DR is in place!
■ Can state it any of three ways
– Critical value of observed statistic (x-bar)
– Critical value of test statistic (z)
– Critical value of likelihood of observed result (p-value)
■ Generally, test statistics are used when results are generated manually, and p-values are used when results are determined via computer
■ Always indicate the DR on a sketch of the distribution
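The three ways of stating the decision rule are equivalent, which a short sketch can demonstrate on the WNB review example. The two-tail form and the risk level of 0.05 are assumptions made for illustration, and the standard deviation is treated as known.

```python
import math
from statistics import NormalDist

mu_0, sigma, n = 80_000, 18_599, 15   # claim and figures from the review example
alpha = 0.05                          # assumed risk level
x_bar = 68_270                        # observed sample average

se = sigma / math.sqrt(n)
std_normal = NormalDist()
z_crit = std_normal.inv_cdf(1 - alpha / 2)

# 1. Critical values of the observed statistic (x-bar)
lower_x = mu_0 - z_crit * se
upper_x = mu_0 + z_crit * se
reject_by_x_bar = x_bar < lower_x or x_bar > upper_x

# 2. Critical value of the test statistic (z)
z = (x_bar - mu_0) / se
reject_by_z = abs(z) > z_crit

# 3. Critical value of the likelihood of the observed result (p-value)
p_value = 2 * (1 - std_normal.cdf(abs(z)))
reject_by_p = p_value < alpha

print(f"reject if x-bar < {lower_x:,.0f} or x-bar > {upper_x:,.0f}: {reject_by_x_bar}")
print(f"reject if |z| > {z_crit:.2f}: {reject_by_z}")
print(f"reject if p-value < {alpha}: {reject_by_p}")
```

All three comparisons lead to the same conclusion because each is a rescaling of the same distance between the sample result and the hypothesized value.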
Some Exercises Addressing the Mean
■ Don’t forget to sketch distributions!
■ Large sample (CLT applies)
– One-tail hypothesis (#8-3)
– Two-tail hypothesis (#8-8)
■ Small sample (introducing the t distribution)
– One-tail hypothesis (#8-5)
– Note: we’re really always using the t distribution
• Applies whenever s is used to estimate the standard error
• It just becomes obvious when sample sizes are small
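To illustrate why the t distribution matters for small samples, the sketch below reruns the WNB review example (n = 15) with t in place of z. Python's standard library has no t distribution, so the helper functions here are a hypothetical stand-in for the printed t table: they integrate the t density numerically with Simpson's rule.

```python
import math
from statistics import NormalDist

def t_pdf(t, df):
    """Density of Student's t with df degrees of freedom."""
    log_coef = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    coef = math.exp(log_coef) / math.sqrt(df * math.pi)
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)

def t_cdf(t, df, steps=2000):
    """P(T <= t), using Simpson's rule on [0, |t|]; steps must be even."""
    a = abs(t)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3            # area between 0 and |t|
    return 0.5 + area if t >= 0 else 0.5 - area

x_bar, mu_0, s, n = 68_270, 80_000, 18_599, 15
t_stat = (x_bar - mu_0) / (s / math.sqrt(n))   # same arithmetic as z
df = n - 1

p_two_tail_t = 2 * t_cdf(-abs(t_stat), df)
p_two_tail_z = 2 * NormalDist().cdf(-abs(t_stat))
print(f"t = {t_stat:.2f} with {df} degrees of freedom")
print(f"two-tail p-value using t: {p_two_tail_t:.4f}")
print(f"two-tail p-value using z: {p_two_tail_z:.4f}")
```

The t-based p-value is larger because the t distribution has heavier tails. The gap narrows as the degrees of freedom increase, which is why the difference only becomes obvious for small samples.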
Homework
■ Section 8-1:
– Reread
– Rework exercises
■ Read Section 8-3
■ Work exercises: 28, 29, 30, 34