Confidence Interval Estimation for the Mean


Sampling and Statistical
Analysis for Decision Making
A. A. Elimam
College of Business
San Francisco State University
Chapter Topics
• Sampling: Design and Methods
• Estimation:
• Confidence Interval Estimation for the Mean
(σ Known)
• Confidence Interval Estimation for the Mean
(σ Unknown)
• Confidence Interval Estimation for the
Proportion
Chapter Topics
• The Situation of Finite Populations
• Student’s t distribution
• Sample Size Estimation
• Hypothesis Testing
• Significance Levels
• ANOVA
Statistical Sampling
• Sampling: Valuable tool
• Population:
• Too large to deal with effectively or practically
• Impossible or too expensive to obtain all data
• Collect sample data to draw conclusions
about unknown population
Sample design
• Representative Samples of the population
• Sampling Plan: Approach to obtain samples
• Sampling Plan: States
• Objectives
• Target population
• Population frame
• Method of sampling
• Data collection procedure
• Statistical analysis tools
Objectives
• Estimate population parameters such as a
mean, proportion or standard deviation
• Identify if significant difference exists
between two populations
Population Frame
• List of all members of the target population
Sampling Methods
• Subjective Sampling:
• Judgment: select the sample (best customers)
• Convenience: ease of sampling
• Probabilistic Sampling:
• Simple Random Sampling
• With Replacement
• Without Replacement
Sampling Methods
• Systematic Sampling:
• Selects items periodically from population.
• First item randomly selected - may produce bias
• Example: pick one sample every 7 days
• Stratified Sampling:
• Populations divided into natural strata
• Allocates proper proportion of samples to each stratum
• Each stratum weighted by its size; cost or significance of
certain strata might suggest different allocation
• Example: sampling of political districts - wards
Sampling Methods
• Cluster Sampling:
• Populations divided into clusters then random sample each
• Items within each cluster become members of the sample
• Example: segment customers for each geographical location
• Sampling Using Excel:
• Population listed in spreadsheet
• Periodic
• Random
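The random and periodic selection described above can also be sketched outside Excel. The following is a minimal Python illustration, not the deck's own method; the population frame of customer IDs 1 to 100 and the function names are hypothetical.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n items without replacement; every item is equally likely."""
    rng = random.Random(seed)
    return rng.sample(population, n)

def systematic_sample(population, step, seed=None):
    """Pick a random start among the first `step` items, then take every step-th item."""
    rng = random.Random(seed)
    start = rng.randrange(step)
    return population[start::step]

# Hypothetical population frame: customer IDs 1..100
frame = list(range(1, 101))

print(simple_random_sample(frame, 10, seed=1))  # random selection
print(systematic_sample(frame, 7, seed=1))      # periodic selection, every 7th item
```

Fixing the seed only makes the illustration repeatable; in practice the seed would be left unset.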
Sampling Methods: Selection
• Systematic Sampling:
• Population is large – considerable effort to randomly select
• Stratified Sampling:
• Items in each stratum homogeneous - Low variances
• Relatively smaller sample size than simple random sampling
• Cluster Sampling:
• Items in each cluster are heterogeneous
• Clusters are representative of the entire Population
• Requires larger sample
Sampling Errors
• Sample does not represent target population
(e. g. selecting inappropriate sampling method)
• Inherent error: a sample is only a subset of the population
• Depends on size of Sample relative to population
• Accuracy of estimates
• Trade-off: cost/time versus accuracy
Sampling From Finite Populations
• Finite population sampled without replacement (R)
• Statistical theory assumes samples are selected with R
• When n < .05 N, the difference is insignificant
• Otherwise a correction factor is needed
• Standard error of the mean
$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} \sqrt{\frac{N - n}{N - 1}}$
Statistical Analysis of Sample Data
• Estimation of population parameters (PP)
• Development of confidence intervals for PP
• Probability that the interval correctly estimates
true population parameter
• Means to compare alternative decisions/process
(comparing transmission production processes)
• Hypothesis testing: validate differences among PP
Estimation Process
• Population: mean μ is unknown
• Random sample: mean $\bar{X}$ = 50
• Conclusion: "I am 95% confident that μ is between 40 and 60."
Population Parameters Estimated
Population Parameter    Point Estimate
Mean, μ                 $\bar{X}$
Proportion, p           p_s
Variance, σ²            S²
Std. Dev., σ            S
Confidence Interval Estimation
• Provides Range of Values
• Based on Observations from Sample
• Gives Information about Closeness
to Unknown Population Parameter
• Stated in terms of Probability
• Never 100% Sure
Elements of Confidence Interval Estimation
A Probability That the Population Parameter
Falls Somewhere Within the Interval.
[Diagram: the confidence interval runs from the lower confidence limit to the upper confidence limit, with the sample statistic between them.]
Example of Confidence Interval Estimation
Example: 90 % CI for the mean is 10 ± 2.
Point Estimate = 10
Margin of Error = 2
CI = [8,12]
Level of Confidence = 1 - α = 0.9
Probability that true PP is not in this CI = 0.1
Confidence Limits for Population Mean
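Parameter = Statistic ± Its Error
$\mu = \bar{X} \pm \text{Error}$
$\text{Error} = \bar{X} - \mu$ or $\mu - \bar{X}$
$Z = \frac{\bar{X} - \mu}{\sigma_{\bar{X}}} = \frac{\text{Error}}{\sigma_{\bar{X}}}$
$\text{Error} = Z \sigma_{\bar{X}}$
$\mu = \bar{X} \pm Z \sigma_{\bar{X}}$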
Confidence Intervals
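$\bar{X} \pm Z \sigma_{\bar{X}} = \bar{X} \pm Z \frac{\sigma}{\sqrt{n}}$
[Figure: sampling distribution of $\bar{X}$, centered at μ]
• 90% of samples: $\mu - 1.645\,\sigma_{\bar{X}}$ to $\mu + 1.645\,\sigma_{\bar{X}}$
• 95% of samples: $\mu - 1.96\,\sigma_{\bar{X}}$ to $\mu + 1.96\,\sigma_{\bar{X}}$
• 99% of samples: $\mu - 2.58\,\sigma_{\bar{X}}$ to $\mu + 2.58\,\sigma_{\bar{X}}$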
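The multipliers 1.645, 1.96 and 2.58 are two-sided critical values of the standard normal distribution. A minimal sketch of how they can be recovered with Python's standard library; the helper name z_critical is an assumption made for illustration.

```python
from statistics import NormalDist

def z_critical(confidence):
    """Two-sided critical value Z such that the central area equals `confidence`."""
    alpha = 1.0 - confidence
    # Leave alpha/2 in each tail, so look up the (1 - alpha/2) quantile.
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)

for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence: Z = {z_critical(level):.3f}")
```

The 99% value comes out near 2.576, which the slides round to 2.58.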
Level of Confidence
• Probability that the unknown population parameter falls within the interval
• Denoted 100(1 - α)% = level of confidence, e.g. 90%, 95%, 99%
• α is the probability that the parameter is not within the interval
Intervals & Level of Confidence
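[Figure: sampling distribution of the mean, centered at $\mu_{\bar{X}} = \mu$ with standard error $\sigma_{\bar{X}}$; central area 1 - α, area α/2 in each tail]
• Intervals extend from $\bar{X} - Z\sigma_{\bar{X}}$ to $\bar{X} + Z\sigma_{\bar{X}}$
• 100(1 - α)% of intervals contain μ; 100α% do not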
Confidence Intervals
Factors Affecting Interval Width
• Data Variation: measured by σ
• Sample Size: $\sigma_{\bar{X}} = \sigma / \sqrt{n}$
• Level of Confidence: (1 - α)
Intervals extend from $\bar{X} - Z\sigma_{\bar{X}}$ to $\bar{X} + Z\sigma_{\bar{X}}$
Confidence Interval Estimates
• Confidence Intervals
• Mean: σ Known, σ Unknown
• Proportion
• Finite Population
Confidence Intervals (σ Known)
• Assumptions
• Population Standard Deviation is Known
• Population is Normally Distributed
• If Not Normal, use large samples
• Confidence Interval Estimate
$\bar{X} - Z_{\alpha/2} \frac{\sigma}{\sqrt{n}} \le \mu \le \bar{X} + Z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$
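As a rough illustration of the σ-known interval, the sketch below evaluates the formula in Python. The inputs (sample mean 50, σ = 10, n = 25) are hypothetical numbers chosen for the example, and z_interval is an assumed helper name.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(xbar, sigma, n, confidence=0.95):
    """Confidence interval for the mean when the population sigma is known."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # Z_{alpha/2}
    margin = z * sigma / sqrt(n)                        # Z * sigma / sqrt(n)
    return xbar - margin, xbar + margin

# Hypothetical inputs: sample mean 50, known sigma 10, sample size 25
low, high = z_interval(50, 10, 25, confidence=0.95)
print(f"{low:.2f} <= mu <= {high:.2f}")
```

Raising the confidence level or σ widens the interval, while a larger n narrows it.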
Confidence Intervals (σ Unknown)
• Assumptions
• Population Standard Deviation is Unknown
• Population Must Be Normally Distributed
• Use Student’s t Distribution
• Confidence Interval Estimate
$\bar{X} - t_{\alpha/2,\,n-1} \frac{S}{\sqrt{n}} \le \mu \le \bar{X} + t_{\alpha/2,\,n-1} \frac{S}{\sqrt{n}}$
Student’s t Distribution
• Shape similar to Normal Distribution
• Different t distributions based on df
• Has a larger variance than Normal
• Larger sample size: t approaches Normal
• At n = 120 - virtually the same
• For any sample size, when the population is normal and σ is estimated by S, the standardized sample mean follows Student’s t
• For unknown σ and when in doubt use t
Student’s t Distribution
[Figure: standard normal (Z) curve compared with t curves for df = 13 and df = 5, all centered at 0]
• Bell-Shaped
• Symmetric
• ‘Fatter’ Tails
Degrees of Freedom (df)
• Number of Observations that Are Free to Vary
After Sample Mean Has Been Calculated
• Example: Mean of 3 Numbers Is 2
• X1 = 1 (or Any Number)
• X2 = 2 (or Any Number)
• X3 = 3 (Cannot Vary)
• degrees of freedom = n - 1 = 3 - 1 = 2
Student’s t Table
Assume: n = 3, df = n - 1 = 2, α = .10, α/2 = .05
Upper Tail Area
df    .25      .10      .05
1     1.000    3.078    6.314
2     0.817    1.886    2.920
3     0.765    1.638    2.353
t Values
With df = 2 and an upper tail area of .05, t = 2.920
Example: Interval Estimation σ Unknown
A random sample of n = 25 has $\bar{X}$ = 50 and
S = 8. Set up a 95% confidence interval
estimate for μ.
$\bar{X} - t_{\alpha/2,\,n-1} \frac{S}{\sqrt{n}} \le \mu \le \bar{X} + t_{\alpha/2,\,n-1} \frac{S}{\sqrt{n}}$
$50 - 2.0639 \cdot \frac{8}{\sqrt{25}} \le \mu \le 50 + 2.0639 \cdot \frac{8}{\sqrt{25}}$
$46.70 \le \mu \le 53.30$
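The arithmetic in this example can be checked with a few lines of Python. This is a minimal sketch: the critical value 2.0639 is passed in as given on the slide (t with 24 degrees of freedom, upper tail area .025) rather than computed, and t_interval is an assumed helper name.

```python
from math import sqrt

def t_interval(xbar, s, n, t_crit):
    """Confidence interval for the mean when sigma is unknown.

    t_crit is t_{alpha/2, n-1}, taken here from a t table.
    """
    margin = t_crit * s / sqrt(n)
    return xbar - margin, xbar + margin

# Values from the example: n = 25, sample mean 50, S = 8, t = 2.0639
low, high = t_interval(50, 8, 25, 2.0639)
print(f"{low:.3f} <= mu <= {high:.3f}")
```

The margin of error works out to 2.0639 × 8 / 5, or about 3.30 on either side of the sample mean.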
Example: Tracway Transmission
Sample of n = 30, $\bar{X}$ = 289.6, S = 45.4. Find a 99% CI for μ, the
mean of each transmission system process. Therefore α =
.01 and α/2 = .005
$t_{\alpha/2,\,n-1} = t_{.005,\,29} = 2.7564$
$\bar{X} - t_{\alpha/2,\,n-1} \frac{S}{\sqrt{n}} \le \mu \le \bar{X} + t_{\alpha/2,\,n-1} \frac{S}{\sqrt{n}}$
$289.6 - 2.7564 \cdot \frac{45.4}{\sqrt{30}} \le \mu \le 289.6 + 2.7564 \cdot \frac{45.4}{\sqrt{30}}$
$266.75 \le \mu \le 312.45$
Estimation for Finite Populations
• Assumptions
• Sample Is Large Relative to Population
• n / N > .05
• Use Finite Population Correction Factor
• Confidence Interval (Mean, σ Unknown)
$\bar{X} - t_{\alpha/2,\,n-1} \frac{S}{\sqrt{n}} \sqrt{\frac{N-n}{N-1}} \le \mu \le \bar{X} + t_{\alpha/2,\,n-1} \frac{S}{\sqrt{n}} \sqrt{\frac{N-n}{N-1}}$
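To see how much the correction factor tightens an interval, here is a minimal Python sketch. The population size N = 200 is a hypothetical value, combined with illustrative sample figures (n = 25, mean 50, S = 8, t = 2.0639); the function names are assumptions.

```python
from math import sqrt

def fpc(N, n):
    """Finite population correction factor sqrt((N - n) / (N - 1))."""
    return sqrt((N - n) / (N - 1))

def t_interval_finite(xbar, s, n, N, t_crit):
    """t-based interval for the mean with the finite population correction applied."""
    margin = t_crit * s / sqrt(n) * fpc(N, n)
    return xbar - margin, xbar + margin

# Hypothetical: population of N = 200, so n / N = 0.125 > .05
N, n = 200, 25
low, high = t_interval_finite(50, 8, n, N, t_crit=2.0639)
print(f"correction factor = {fpc(N, n):.4f}")
print(f"{low:.2f} <= mu <= {high:.2f}")
```

Because the factor is below 1 whenever n > 1, the corrected interval is never wider than the uncorrected one.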
Confidence Interval Estimate Proportion
• Assumptions
• Two Categorical Outcomes
• Population Follows Binomial Distribution
• Normal Approximation Can Be Used if n·p ≥ 5 & n·(1 - p) ≥ 5
• Confidence Interval Estimate
$p_s - Z_{\alpha/2} \sqrt{\frac{p_s(1 - p_s)}{n}} \le p \le p_s + Z_{\alpha/2} \sqrt{\frac{p_s(1 - p_s)}{n}}$
Example: Estimating Proportion
A random sample of 1000 Voters showed
51% voted for Candidate A. Set up a 90%
confidence interval estimate for p.
$p_s - Z_{\alpha/2} \sqrt{\frac{p_s(1 - p_s)}{n}} \le p \le p_s + Z_{\alpha/2} \sqrt{\frac{p_s(1 - p_s)}{n}}$
$.51 - 1.645 \sqrt{\frac{.51(1 - .51)}{1000}} \le p \le .51 + 1.645 \sqrt{\frac{.51(1 - .51)}{1000}}$
$.484 \le p \le .536$
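The voter example can be reproduced with a short Python sketch. The helper name proportion_interval is an assumption; the check on n·p and n·(1 - p) uses the sample proportion in place of the unknown p.

```python
from math import sqrt
from statistics import NormalDist

def proportion_interval(p_s, n, confidence=0.90):
    """Normal-approximation confidence interval for a population proportion."""
    # The approximation is only used when both expected counts are at least 5.
    if n * p_s < 5 or n * (1 - p_s) < 5:
        raise ValueError("normal approximation not appropriate")
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(p_s * (1 - p_s) / n)
    return p_s - margin, p_s + margin

# Values from the example: 1000 voters, 51% for Candidate A, 90% confidence
low, high = proportion_interval(0.51, 1000, confidence=0.90)
print(f"{low:.3f} <= p <= {high:.3f}")
```

Since the lower limit falls below .50, this sample alone does not establish a majority for Candidate A at the 90% level.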
Sample Size
Too Big:
• Requires too many resources
Too Small:
• Won’t do the job
Example: Sample Size for Mean
What sample size is needed to be 90%
confident of being correct within ± 5? A
pilot study suggested that the standard
deviation is 45.
$n = \frac{Z^2 \sigma^2}{\text{Error}^2} = \frac{1.645^2 \cdot 45^2}{5^2} = 219.2 \approx 220$
Round Up
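The same calculation, including the round-up step, as a minimal Python sketch; sample_size_mean is an assumed helper name, and Z = 1.645 is the rounded 90% value used on the slide.

```python
from math import ceil

def sample_size_mean(z, sigma, error):
    """Smallest n with z * sigma / sqrt(n) <= error, i.e. n = (z * sigma / error)^2 rounded up."""
    return ceil((z * sigma / error) ** 2)

# Values from the example: 90% confidence (Z = 1.645), sigma = 45, error = 5
n = sample_size_mean(1.645, 45, 5)
print(n)
```

Rounding up rather than to the nearest integer keeps the margin of error at or below the target.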
Example: Sample Size for Proportion
What sample size is needed to be within ± 5% (.05) with
90% confidence? Out of a population of 1,000,
we randomly selected 100 of which 30 were
defective.
$n = \frac{Z^2 p(1 - p)}{\text{error}^2} = \frac{1.645^2 (.30)(.70)}{.05^2} = 227.3 \approx 228$
Round Up
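A minimal Python sketch of this calculation, with the pilot proportion p = 30/100 = .30 taken from the example; sample_size_proportion is an assumed helper name. The second call, using p = .5, is an added illustration of the most conservative planning value and is not part of the slide.

```python
from math import ceil

def sample_size_proportion(z, p, error):
    """n = z^2 * p * (1 - p) / error^2, rounded up to the next whole unit."""
    return ceil(z ** 2 * p * (1 - p) / error ** 2)

# Values from the example: Z = 1.645, pilot proportion .30, error .05
n_pilot = sample_size_proportion(1.645, 0.30, 0.05)

# Illustration only: p = .5 maximizes p(1 - p), giving the largest required n
n_worst = sample_size_proportion(1.645, 0.50, 0.05)

print(n_pilot, n_worst)
```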
Hypothesis Testing
• Draw inferences about two contrasting
propositions (hypotheses)
• Determine whether two means are equal:
1. Formulate the hypotheses to test
2. Select a level of significance
3. Determine a decision rule as a basis for the
conclusion
4. Collect data and calculate a test statistic
5. Apply the decision rule to draw a conclusion
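The five steps can be sketched for one simple case: a two-tailed Z test of whether two means are equal, assuming both population standard deviations are known. The test choice, the function name and all sample figures below are hypothetical, chosen only to show the steps in order.

```python
from math import sqrt
from statistics import NormalDist

def two_sample_z_test(xbar1, sigma1, n1, xbar2, sigma2, n2, alpha=0.05):
    """Two-tailed Z test of H0: mu1 = mu2 against H1: mu1 != mu2 (sigmas assumed known)."""
    std_normal = NormalDist()
    # Steps 1-2: hypotheses are fixed above; alpha is the level of significance.
    # Step 3: decision rule, reject H0 if |z| exceeds the critical value Z_{alpha/2}.
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # Step 4: test statistic from the sample data.
    std_error = sqrt(sigma1 ** 2 / n1 + sigma2 ** 2 / n2)
    z = (xbar1 - xbar2) / std_error
    p_value = 2 * (1 - std_normal.cdf(abs(z)))
    # Step 5: apply the decision rule.
    return z, p_value, abs(z) > z_crit

# Hypothetical samples: means 52 and 48, sigma = 8 and n = 40 in both groups
z, p_value, reject = two_sample_z_test(52, 8, 40, 48, 8, 40, alpha=0.05)
print(f"z = {z:.3f}, p-value = {p_value:.4f}, reject H0: {reject}")
```

With these made-up figures H0 is rejected at the .05 level but would not be rejected at .01, which shows how the conclusion depends on the significance level chosen in step 2.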
Hypothesis Formulation
• Null hypothesis: H0 representing status quo
• Alternative hypothesis: H1
• Assumes that H0 is true
• Sample evidence is obtained to determine
whether H1 is more likely to be true
Significance Level
Test \ True state    H0 True         H0 False
Accept H0            Correct         Type II Error
Reject H0            Type I Error    Correct
• Probability of making Type I error, α = level of significance
• Confidence Coefficient = 1 - α
• Probability of making Type II error = β
• Power of the test = 1 - β
Decision Rules
• Sampling Distribution: Normal or t distribution
• Rejection Region
• Non Rejection Region
• Two-tailed test: α/2 in each tail
• One-tailed test: α in one tail
• P-Values
Hypothesis Testing: Cases
• Two-Sample Means
• F-Test for Variances
• Proportions
• ANOVA: Differences of several means
• Chi-square for independence
Chapter Summary
• Sampling: Design and Methods
• Estimation:
• Confidence Interval Estimation for Mean
(σ Known)
• Confidence Interval Estimation for Mean
(σ Unknown)
• Confidence Interval Estimation for Proportion
Chapter Summary
• Finite Populations
• Student’s t distribution
• Sample Size Estimation
• Hypothesis Testing
• Significance Levels: Type I/II errors
• ANOVA