Hypothesis Testing - NYU Stern School of Business
Statistics & Data Analysis
Course Number: B01.1305
Course Section: 31
Meeting Time: Wednesday 6-8:50 pm
Hypothesis Testing
Class Outline
Review of midterm exam
Hypothesis Testing
One-sample tests
Two-sample tests
P-values
Relationship with Confidence Intervals
Professor S. D. Balkin -- July 1, 2002
Review of Last Class
Statistical Inference
Point Estimation
Confidence Intervals
Reminder: Statistical Inference
Problem of Inferential Statistics:
• Make inferences about one or more population parameters based on
observable sample data
Forms of Inference:
• Point estimation: single best guess regarding a population parameter
• Interval estimation: Specifies a reasonable range for the value of the
parameter
• Hypothesis testing: Isolating a particular possible value for the
parameter and testing if this value is plausible given the available data
Point Estimators
Computing a single statistic from the sample data to estimate
a population parameter
Choosing a point estimator:
• What is the shape of the distribution?
• Do you suspect outliers exist?
• Plausible choices:
  • Mean
  • Median
  • Mode
  • Trimmed Mean
Confidence Intervals
Specification of a “probable range” for a parameter
Used to understand how statistics may vary from sample
to sample
States explicit allowance for random sampling error (not
selection biases)
We have 95% confidence that the population parameter
falls within the bounds of the interval
Or…the interval is the result of a process that in the long
run has a 95% probability of being correct
Hypothesis Testing
Chapter 8
Overview
A research hypothesis typically states that there is a real change, a real
difference, or real effect in the underlying population or process. The
opposite, null hypothesis, then states that there is no real change,
difference, or effect
The basic strategy of hypothesis testing is to try to support a research
hypothesis by showing that the sample results are highly unlikely,
assuming the null hypothesis, and more likely, assuming the research
hypothesis
The strategy can be implemented in equivalent ways: by creating a
formal rejection region, by obtaining a p-value, or by checking
whether the null hypothesis value falls within a confidence interval
There are risks of false positive and false negative errors
Tests of a mean usually are based on the t-distribution
Tests of a proportion may be done by using a normal approximation
Overview
Very often sample data will suggest that something relevant is happening
in the underlying population
• A sample of potential customers may show that a higher proportion prefer a
new brand to the existing one
• A sampling of telephone response time by reservation clerks may show an
increase in average customer waiting time
• A sample of the service times may indicate customers are receiving poorer
service than the company thinks it is providing
The question is whether the apparent effect in the sample is an
indication of something happening in the underlying population or if
the apparent effect is merely a fluke
What is Hypothesis Testing
Method for checking whether an apparent result from a
sample could possibly be due to randomness
Checks on how strong the evidence is
Are sample data reflecting a real effect or random fluke?
Results of a hypothesis test indicate how good the
evidence is, not how important the result is
Motivating Case Study #1
FCC has been receiving complaints from customers ordering
new telephone service
Big telecommunications company tells the FCC that the
average time a new customer has to wait for new service
installation is 72 hours (excluding weekends) with a standard
deviation of 24 hours
The FCC randomly samples 100 new customers from the
telecom company and asks how long each had to wait for
new service installation
Testing Hypotheses
Research Hypothesis, or Alternative Hypothesis, is what the researcher is
trying to prove
• Denoted: Ha
Null Hypothesis is the denial of the research hypothesis. It is
what the researcher is trying to disprove
• Denoted: H0
Hypothesis Testing Components
Define research hypothesis direction:
• One-sided (< or >)
• Two-sided (≠)
Strategy is to attempt to support the research hypothesis by
contradicting the null hypothesis
• The null hypothesis is contradicted if, assuming it is true, the
sample data are highly unlikely, and they are more likely given the
research hypothesis
Test Statistic: Summary of the sample data
Basic Logic
1. Assume that H0: μ = 72 is true;
2. Calculate the value of the test statistic
Sample mean, proportion, etc.
3. If this value is highly unlikely, reject H0 and support Ha
We can use the sampling distribution to determine what
values of the test statistic are sufficiently unlikely given the
null hypothesis
Rejection Region
Specification of the rejection region must recognize the possibility of
error
• Type I Error: Rejecting the null hypothesis when in fact it is true
• In establishing a rejection region, we must specify the maximum tolerable
probability of this type of error (denoted α)
• Type II Error: Failing to reject the null hypothesis when in fact it is false
(beyond scope)
Rejection region can be based on sampling distribution of the
sample statistic
• Remember, we want to reject the null hypothesis if the value of the test
statistic is highly unlikely assuming H0 is true
• Can use the tails of a normal distribution
Rejection Region
[Figure: rejection region, distribution centered at μ = 72]
Rejection Region (cont)
To determine whether or not to reject the null hypothesis, we
can compute the number of standard errors the sample
statistic lies above the assumed population mean
This is done by computing a z-statistic for the sample mean:
$$ z = \frac{\bar{Y} - \mu_0}{\sigma / \sqrt{n}} $$
Rejection Region (cont)
For α = 0.05, reject H0: μ = 72 if the observed value of
Ȳ is more than 1.645 σ_Ȳ above μ = 72.
For α = 0.05, reject H0: μ = 72 if the computed
z statistic is greater than 1.645
[Figure: normal curve centered at μ = 72 with the right-tail rejection region (α = 0.05) beginning at μ + 3.948]
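As a rough check on the cutoff above, the following Python sketch recomputes it from the case-study numbers (μ0 = 72, σ = 24, n = 100, α = 0.05). The variable names are illustrative, and the standard-library NormalDist is used here in place of a z table.

```python
from math import sqrt
from statistics import NormalDist

mu0 = 72.0      # null-hypothesis mean (hours), from the case study
sigma = 24.0    # standard deviation stated by the telecom company
n = 100         # FCC sample size
alpha = 0.05    # maximum tolerable probability of a Type I error

std_error = sigma / sqrt(n)                   # standard error of the sample mean: 2.4
z_alpha = NormalDist().inv_cdf(1 - alpha)     # right-tail critical value, about 1.645
cutoff = mu0 + z_alpha * std_error            # reject H0 when the sample mean exceeds this

print(f"z_alpha = {z_alpha:.3f}, margin = {z_alpha * std_error:.3f}, cutoff = {cutoff:.3f}")
```

The margin comes out near 3.948 hours, so sample means above roughly 75.95 hours fall in the rejection region.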
Example
The FCC sample of 100 randomly selected new
service customers resulted in a mean of 80 hours.
Set up the hypothesis test
Calculate the test statistic
Interpret the hypothesis
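One way to work this example is sketched below. It assumes the one-sided setup H0: μ = 72 against Ha: μ > 72 and α = 0.05, carried over from the rejection-region slides; the function name z_statistic is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def z_statistic(sample_mean, mu0, sigma, n):
    """Number of standard errors the sample mean lies from mu0."""
    return (sample_mean - mu0) / (sigma / sqrt(n))

# Case-study values: claimed mean 72, sigma 24, n = 100, observed mean 80.
z_fcc = z_statistic(80.0, 72.0, 24.0, 100)     # 8 / 2.4, about 3.33
z_crit_fcc = NormalDist().inv_cdf(0.95)        # about 1.645 (assumed alpha = 0.05)
reject_fcc = z_fcc > z_crit_fcc                # one-sided, right-tail test

print(f"z = {z_fcc:.2f}, critical value = {z_crit_fcc:.3f}, reject H0: {reject_fcc}")
```

Under these assumptions the observed mean lies more than three standard errors above 72, so the data do not support the company's claim.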
Example
A researcher claims that the amount of time urban preschool
children age 3-5 watch television has a mean of 22.6 hours
and a standard deviation of 6.1 hours.
A market research firm believes this is too low
The television habits of a random sample of 60 urban
preschool children are measured and resulted in the following
• Sample mean: 25.2
Should the researcher’s claim be rejected at an α value of
0.01?
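A possible worked solution, assuming the one-sided alternative Ha: μ > 22.6 (the firm believes the claimed mean is too low). Variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu0_tv, sigma_tv = 22.6, 6.1     # researcher's claimed mean and standard deviation
n_tv, mean_tv = 60, 25.2         # sample size and sample mean
alpha_tv = 0.01

z_tv = (mean_tv - mu0_tv) / (sigma_tv / sqrt(n_tv))   # about 3.30
z_alpha_tv = NormalDist().inv_cdf(1 - alpha_tv)       # about 2.326
reject_tv = z_tv > z_alpha_tv                          # right-tail rejection region

print(f"z = {z_tv:.2f}, z_alpha = {z_alpha_tv:.3f}, reject H0: {reject_tv}")
```

Since z exceeds the critical value, the claim of a 22.6-hour mean would be rejected at α = 0.01 under this setup.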
Summary for Z Test with Known σ
H0: μ = μ0
Ha: 1. μ > μ0
    2. μ < μ0
    3. μ ≠ μ0
Test Statistic:
$$ z = \frac{\bar{Y} - \mu_0}{\sigma / \sqrt{n}} $$
Rejection Region:
1. z > z_α
2. z < −z_α
3. z > z_{α/2} or z < −z_{α/2}
Example
A researcher claims that the amount of time urban preschool children age
3-5 watch television has a mean of 22.6 hours and a standard deviation of
6.1 hours.
A market research firm believes this is incorrect, but does not know in
which direction
The television habits of a random sample of 60 urban preschool children
are measured and resulted in the following
• Sample mean: 25.2
Should the researcher’s claim be rejected at an α value of 0.01?
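With no direction specified, the test becomes two-sided (Ha: μ ≠ 22.6) and α is split between the two tails. A minimal sketch with illustrative variable names:

```python
from math import sqrt
from statistics import NormalDist

std_normal = NormalDist()
alpha_two = 0.01

z_two = (25.2 - 22.6) / (6.1 / sqrt(60))               # same statistic as before, about 3.30
z_half_alpha = std_normal.inv_cdf(1 - alpha_two / 2)   # about 2.576
reject_two = abs(z_two) > z_half_alpha                 # reject in either tail
p_two = 2 * (1 - std_normal.cdf(abs(z_two)))           # two-tailed p-value

print(f"|z| = {abs(z_two):.2f}, z_alpha/2 = {z_half_alpha:.3f}, "
      f"reject H0: {reject_two}, p-value = {p_two:.5f}")
```

The critical value is larger than in the one-sided case, but the statistic still falls in the rejection region.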
Z-values Worth Remembering
z0.05 = 1.645
z0.025 = 1.96
z0.01 = 2.326
z0.005 = 2.576
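These values can be reproduced rather than memorized. The sketch below uses the inverse CDF of the standard normal from Python's standard library; the dictionary name z_values is illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist()   # mean 0, standard deviation 1

# z_alpha cuts off a right-tail area of alpha under the standard normal curve.
z_values = {alpha: std_normal.inv_cdf(1 - alpha)
            for alpha in (0.05, 0.025, 0.01, 0.005)}

for alpha, z in z_values.items():
    print(f"z_{alpha} = {z:.3f}")
```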
P-Value
Probability of a test statistic value equal to or more extreme
than the actual observed value
Recall basic strategy
• Hope to support the research hypothesis and reject the null hypothesis
by showing that the data are highly unlikely assuming that the null
hypothesis is true
• As the test statistic gets farther into the rejection region, the data
become more unlikely, hence the weight of evidence against the null
hypothesis becomes more conclusive and the p-value becomes smaller
P-Value (cont)
Small p-values indicate strong, conclusive evidence for
rejecting the null hypothesis
Computation is straightforward in our z-test example:
One-tailed p-value = P(Z > z)
Compute the p-value for our telecom example
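A sketch of that computation, assuming the right-tailed test H0: μ = 72 against Ha: μ > 72 with the case-study values (σ = 24, n = 100, sample mean 80):

```python
from math import sqrt
from statistics import NormalDist

z_telecom = (80.0 - 72.0) / (24.0 / sqrt(100))    # about 3.33
p_telecom = 1 - NormalDist().cdf(z_telecom)       # right-tail area beyond z

print(f"z = {z_telecom:.2f}, one-tailed p-value = {p_telecom:.5f}")
```

The p-value is on the order of 0.0004, well below any of the usual α levels.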
P-Value (cont)
P-value is also referred to as attained level of significance
• Results of a test are said to be statistically significant at the specified
p-value
Statistically significant says the difference between what is
observed and what is assumed correct is most likely not due
to random variation
It DOES NOT MEAN the difference is important!
It DOES NOT tell you that the difference is meaningful from
business perspective (practical significance)
With a large enough sample size, any difference can become
statistically significant
P-Value for a z Test
The p-value is the probability, assuming that
the null hypothesis is true, of obtaining a test
statistic at least as extreme as the observed value.
Ha: μ > μ0, p-value = P(z > z_actual)
Ha: μ < μ0, p-value = P(z < z_actual)
Ha: μ ≠ μ0, p-value = 2 P(z > |z_actual|)
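The three cases can be collected into one small helper. This is only a sketch: the function name z_test_p_value and the labels "greater", "less", and "two-sided" are illustrative choices, not notation from the slides.

```python
from statistics import NormalDist

def z_test_p_value(z_actual, alternative):
    """p-value of a z test for the given form of the research hypothesis."""
    std_normal = NormalDist()
    if alternative == "greater":        # Ha: mu > mu0
        return 1 - std_normal.cdf(z_actual)
    if alternative == "less":           # Ha: mu < mu0
        return std_normal.cdf(z_actual)
    if alternative == "two-sided":      # Ha: mu != mu0
        return 2 * (1 - std_normal.cdf(abs(z_actual)))
    raise ValueError(f"unknown alternative: {alternative!r}")

for alt in ("greater", "less", "two-sided"):
    print(alt, round(z_test_p_value(1.96, alt), 4))
```

For z = 1.96 the two-sided p-value is close to 0.05, which matches z0.025 = 1.96.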
Hypothesis Testing with the t Distribution
Population standard deviation is rarely known
Basic ideas of hypothesis testing are not changed, we simply
switch sampling distributions
$$ t = \frac{\bar{Y} - \mu_0}{s / \sqrt{n}} \sim t_{n-1} $$
(t distribution with n − 1 degrees of freedom)
T Test for Hypotheses about μ
H0: μ = μ0
Ha: 1. μ > μ0
    2. μ < μ0
    3. μ ≠ μ0
Test Statistic:
$$ t = \frac{\bar{Y} - \mu_0}{s / \sqrt{n}} $$
Rejection Region:
1. t > t_α
2. t < −t_α
3. |t| > t_{α/2}
where t_α cuts off a right-tail area of α in a t distribution
with n − 1 degrees of freedom.
Example
Airline institutes a ‘snake system’ waiting line at its counters to try to
reduce the average waiting time
Mean waiting time under specific conditions with the previous system was
6.1.
A sample of 14 waiting times is taken
• Sample mean: 5.043
• Standard deviation: 2.266
Test the null hypothesis of no change against an appropriate research
hypothesis using α = 0.10.
• Calculate the rejection region
• Calculate the t-statistic
• Perform and interpret the hypothesis test
• Calculate the associated p-value
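A sketch of one way through these steps, assuming the research hypothesis Ha: μ < 6.1 (the new system is meant to reduce waiting time). Python's standard library has no t distribution, so the critical values below are copied from a standard t table for 13 degrees of freedom; that table lookup is an assumption of this sketch.

```python
from math import sqrt

mu0_air = 6.1                                  # mean waiting time under the previous system
n_air, mean_air, s_air = 14, 5.043, 2.266      # sample size, sample mean, sample std. deviation

T_010_DF13 = 1.350    # t table: right-tail area 0.10, 13 df
T_005_DF13 = 1.771    # t table: right-tail area 0.05, 13 df

t_air = (mean_air - mu0_air) / (s_air / sqrt(n_air))   # about -1.75
reject_air = t_air < -T_010_DF13                        # left-tail rejection region

# Bracket the one-tailed p-value between the two tabulated tail areas.
p_between_005_and_010 = -T_005_DF13 < t_air < -T_010_DF13

print(f"t = {t_air:.3f}, reject H0 at alpha = 0.10: {reject_air}")
print(f"p-value between 0.05 and 0.10: {p_between_005_and_010}")
```

Under this setup the null hypothesis of no change is rejected at α = 0.10, although it would not be rejected at α = 0.05.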
Example
Performance based benefits are a way of giving employees more of
a stake in their work
A study was conducted to find out how managers of 343 firms view
the effectiveness of various kinds of employee relations programs
Each rated the effect of employee stock ownership on product quality
using a scale from –2 (large negative effect) to 2 (large positive
effect).
• Sample Mean: 0.35
• Standard Error: 0.14
Do managers view employee stock ownership as a worthwhile
technique?
• Create a 95% confidence interval for the population parameter
• Perform a hypothesis test that the population mean isn’t equal to zero
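A possible sketch of both parts. With 343 firms the t distribution has 342 degrees of freedom and is very close to the normal, so this sketch substitutes normal quantiles for t quantiles; that substitution is an assumption made here, not something stated on the slide.

```python
from statistics import NormalDist

std_normal = NormalDist()
mean_rating, se_rating = 0.35, 0.14      # sample mean and standard error from the study

z_975 = std_normal.inv_cdf(0.975)                     # about 1.96
ci_low = mean_rating - z_975 * se_rating              # about 0.076
ci_high = mean_rating + z_975 * se_rating             # about 0.624

stat_rating = (mean_rating - 0.0) / se_rating         # test of H0: mu = 0, about 2.5
p_rating = 2 * (1 - std_normal.cdf(abs(stat_rating))) # two-sided p-value, about 0.012
reject_rating = abs(stat_rating) > z_975

print(f"95% CI: ({ci_low:.3f}, {ci_high:.3f})")
print(f"statistic = {stat_rating:.2f}, p-value = {p_rating:.4f}, reject H0: {reject_rating}")
```

Zero lies outside the interval and the test rejects H0 at the 5% level, so the two approaches agree.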
Example
To help your restaurant marketing campaign target the
right age levels, you want to find out if there is a
statistically significant difference, on the average,
between the age of your customers and the age of the
general population in town, which is 43.1 years.
A random sample of 50 customers shows an average of
33.6 years with a standard deviation of 16.2 years
Perform a two-sided test at the 1% significance level
What is the p-value?
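A sketch of this test. The critical value for 49 degrees of freedom is taken from a t table (an assumption of the sketch), and the p-value shown uses a normal approximation, so it is only a rough figure; the exact t-based p-value would be somewhat larger but still far below 0.01.

```python
from math import sqrt
from statistics import NormalDist

mu0_age = 43.1                                # mean age of the general population in town
n_age, mean_age, s_age = 50, 33.6, 16.2       # customer sample

T_0005_DF49 = 2.680   # t table: right-tail area 0.005, 49 df (two-sided alpha = 0.01)

t_age = (mean_age - mu0_age) / (s_age / sqrt(n_age))     # about -4.15
reject_age = abs(t_age) > T_0005_DF49                     # two-sided rejection region
p_age_approx = 2 * (1 - NormalDist().cdf(abs(t_age)))     # normal approximation only

print(f"t = {t_age:.3f}, reject H0 at the 1% level: {reject_age}")
print(f"approximate two-sided p-value = {p_age_approx:.6f}")
```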
t-Test Assumptions
Hypothesis tests allow for random variation, but not for bias
Measurements are statistically independent
Underlying population distribution should be symmetric
• Skewness affects p-value
Hypothesis Testing a Proportion
We can also perform hypothesis tests for proportions /
percentages by using a normal approximation to the binomial
distribution
$$ z = \frac{y - n\pi_0}{\sqrt{n\pi_0(1 - \pi_0)}} $$
where y is the number of successes
$$ z = \frac{\hat{\pi} - \pi_0}{\sqrt{\pi_0(1 - \pi_0)/n}} $$
where π̂ is the proportion of successes
Testing a Population Proportion
H0: π = π0
Ha: 1. π > π0
    2. π < π0
    3. π ≠ π0
Test Statistic:
$$ z = \frac{y - n\pi_0}{\sqrt{n\pi_0(1 - \pi_0)}} $$
Rejection Region:
1. z > z_α
2. z < −z_α
3. z > z_{α/2} or z < −z_{α/2}
Note: π0 is the null-hypothesis value of the population proportion.
Example
A company figures out that the launch of their new product will
only be successful if more than 23% of consumers try the
product
Based on a pilot study of 205 consumers, you expect
44.1% of consumers to try it
How sure are you that the percentage of people who will try
the new product is above the break-even point of 23%?
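One way to answer, sketched with the proportion form of the z statistic and assuming the one-sided hypotheses H0: π = 0.23 against Ha: π > 0.23. Variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

pi0 = 0.23        # break-even proportion under the null hypothesis
n_pilot = 205     # pilot study size
pi_hat = 0.441    # observed proportion who would try the product

# The standard error uses the null value pi0, as in the normal approximation.
se_pi = sqrt(pi0 * (1 - pi0) / n_pilot)        # about 0.0294
z_pi = (pi_hat - pi0) / se_pi                  # about 7.2
p_pi = 1 - NormalDist().cdf(z_pi)              # right-tail p-value, extremely small

print(f"z = {z_pi:.2f}, one-tailed p-value = {p_pi:.2e}")
```

A z statistic this far into the right tail leaves very little doubt, under these assumptions, that the true proportion is above 23%.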
Using A Confidence Interval
Construct a confidence interval (say at 95% confidence) in the usual way
If μ0 is outside the interval, it is not a reasonable value for the population
parameter and you reject the null hypothesis in favor of the research hypothesis
Why does this work?
• Confidence interval says that the probability that the population parameter is in
the random confidence interval is 0.95.
• If the null hypothesis were true, then the probability that μ0 is in the interval is
also 95%
• When the null is true, you will make the correct decision in 95% of all cases
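The equivalence can be checked numerically. The sketch below, with a hypothetical helper two_sided_z_decisions, makes the two-sided decision both ways for the telecom numbers (sample mean 80, σ = 24, n = 100) over several candidate null values; the candidate values are chosen here purely for illustration.

```python
from math import sqrt
from statistics import NormalDist

def two_sided_z_decisions(sample_mean, mu0, sigma, n, alpha=0.05):
    """Return (reject_by_test, reject_by_interval) for a two-sided z test."""
    std_error = sigma / sqrt(n)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    # Decision 1: compare |z| with the critical value.
    reject_by_test = abs((sample_mean - mu0) / std_error) > z_crit
    # Decision 2: check whether mu0 falls outside the confidence interval.
    lower = sample_mean - z_crit * std_error
    upper = sample_mean + z_crit * std_error
    reject_by_interval = not (lower <= mu0 <= upper)
    return reject_by_test, reject_by_interval

for candidate_mu0 in (72.0, 76.0, 80.0, 84.0, 85.0):
    by_test, by_interval = two_sided_z_decisions(80.0, candidate_mu0, 24.0, 100)
    print(candidate_mu0, by_test, by_interval)
```

Both decisions agree for every candidate value, because the interval and the rejection region are built from the same standard error and critical value.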
R Tutorial on Hypothesis Testing
Testing Two Samples
Can test whether two samples are significantly different or
not, on the average
• Unpaired test: test whether two independent columns of numbers are
different
• Paired test: test whether two columns of numbers are different when
there is a natural pairing between them
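For reference, commonly used test statistics for the two cases are written out below. The slides do not say which unpaired form is intended, so the unequal-variance form shown is an assumption; the symbols d̄ and s_d (mean and standard deviation of the within-pair differences) and the subscripts 1 and 2 for the two samples are introduced here for illustration.

```latex
% Paired test: work with the n within-pair differences d_i (n - 1 degrees of freedom)
t_{\text{paired}} = \frac{\bar{d}}{s_d / \sqrt{n}}

% Unpaired test (unequal-variance form): two independent samples of sizes n_1 and n_2
t_{\text{unpaired}} = \frac{\bar{Y}_1 - \bar{Y}_2}{\sqrt{s_1^2 / n_1 + s_2^2 / n_2}}
```

The paired statistic is the one-sample t statistic applied to the differences, with a null value of zero.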
R Tutorial on Two Sample Hypothesis Testing
Next Time…
Regression Analysis