Estimation Procedures

Healey Ch. 7 (1e) or Ch. 6 (2/3e)
Estimation Procedures
Using the Sampling
Distribution to Construct
Confidence Intervals
Outline:

The logic of estimation

How to construct and interpret confidence
interval estimates for:


Sample means
Sample Proportions
The Logic Behind Estimation

In estimation procedures, statistics calculated
from random samples are used to estimate
the value of population parameters.

Example:

If we know that 42% of a random sample drawn
from a city vote Liberal, we can estimate the
percentage of all city residents who vote Liberal.
Logic (cont.)


Information from
samples is used to
estimate information
about the population.
Statistics are used to
estimate parameters.
(Diagram: POPULATION and PARAMETER paired with SAMPLE and STATISTIC.)
Logic (cont.)


Sampling Distribution is the
link between sample and
population.
The value of the parameters
is unknown but
characteristics of the
Sampling Distribution are
defined by theorems.
(Diagram: POPULATION, SAMPLING DISTRIBUTION, SAMPLE.)
Two Estimation Procedures

1. A point estimate is a sample statistic
used to estimate a population value:


The London Free Press reports that “42% of a
sample of randomly selected city residents voted
Liberal.”
2. Confidence intervals (for means or
proportions) consist of a range of values:

“…between 38% and 46% of city residents voted
Liberal.”
Bias and Efficiency

Bias:


An estimator of a mean (or a proportion) is
unbiased if the mean of its sampling distribution is
equal to the population mean (or proportion).
Efficiency:


The smaller the standard error (the standard deviation
of the sampling distribution), the more closely the
sample outcomes are clustered about the mean of the
sampling distribution.
This is known as efficiency.
Sample Size and Efficiency

Standard error of the sampling distribution:

$\sigma_{\bar{X}} = \dfrac{s}{\sqrt{n-1}}$

Looking at the formula, we can see that as the sample
size n increases, the standard error ($\sigma_{\bar{X}}$)
decreases. The larger n is, the more efficient the
estimate will be: on average, estimates from larger
samples fall closer to the true population mean.
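
The effect of sample size can be checked numerically. The following is a minimal sketch, not part of the original slides: the function name `standard_error` and the sample sizes are illustrative choices, and s = 3 is borrowed from the television example later in these notes.

```python
import math


def standard_error(s: float, n: int) -> float:
    """Standard error of the mean, computed as s / sqrt(n - 1)."""
    if n < 2:
        raise ValueError("n must be at least 2")
    return s / math.sqrt(n - 1)


# Same s, increasing sample sizes (illustrative values only).
for n in (26, 101, 401, 1601):
    print(f"n = {n:5d}  standard error = {standard_error(3.0, n):.3f}")
```

Each time n - 1 is quadrupled, the standard error is cut in half, so gains in efficiency become more expensive as the sample grows.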
Confidence Levels



Our level of confidence has to be converted
into a Z-score that we will then use in our
formula to find the confidence interval.
A 95% confidence level means that we accept a
.05 probability of being wrong (alpha (α) = .05).
This probability (the area under the curve) is
divided evenly between the upper and lower tails
of the distribution (.025 in each tail).
Confidence Levels (cont.)
When α = .05…
…then .025 of the area lies in each tail of the distribution.
The .95 in the middle section is our confidence level.
The cut-off between our confidence level and +/- .025 is
represented by a Z-value of +/- 1.96.
Z-values for Various Alpha Levels

Confidence Level    α       α/2      Z-score
90%                 .10     .0500    +/-1.65
95%                 .05     .0250    +/-1.96
99%                 .01     .0050    +/-2.58
99.9%               .001    .0005    +/-3.29

(Note: Z-scores are found in Appendix A using
the area for α/2)
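
As an alternative to looking values up in Appendix A, the same Z-scores can be computed from the standard normal distribution. This is a small sketch using Python's standard library; the function name `z_for_confidence` is an illustrative choice. Note that the exact value for 90% is about 1.645, which the table rounds to 1.65.

```python
from statistics import NormalDist


def z_for_confidence(level: float) -> float:
    """Two-tailed critical Z for a confidence level such as 0.95."""
    alpha = 1.0 - level
    # Put alpha/2 in each tail and find the Z that leaves alpha/2 above it.
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


for level in (0.90, 0.95, 0.99, 0.999):
    z = z_for_confidence(level)
    print(f"{level:.1%}  alpha = {1 - level:.3f}  Z = +/-{z:.3f}")
```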
Confidence Intervals For Means
Procedure:
1. Set the alpha (the probability that the interval will
be wrong). Note that the symbol for alpha is α.

Setting alpha equal to 0.05, a 95% confidence level, means
the researcher is willing to be wrong 5% of the time.

2. Find the Z-value associated with alpha.

If alpha is equal to 0.05, we would place half (0.025) of this
probability in the lower tail and half in the upper tail of the
distribution.

3. Substitute values into formula and solve.
Formula:

$\text{c.i.} = \bar{X} \pm Z\left(\dfrac{s}{\sqrt{n-1}}\right)$

Example: Confidence Intervals For Means

Question:

For a random sample of 178 Canadian
households, average television viewing
time was 6 hours/day with s = 3. What
would be your estimate of the population
mean viewing time at the 95%
confidence level (alpha (α) = .05)?
Example: Confidence Intervals For Means


Z-score for 95% confidence level (α = .05) is +/-1.96
Substitute all information into formula and solve:

$\text{c.i.} = \bar{X} \pm Z\left(\dfrac{s}{\sqrt{n-1}}\right)$

= 6.0 ±1.96(3/√177)
= 6.0 ±1.96(3/13.30)
= 6.0 ±1.96(.23)
= 6.0 ± .44
Example (cont.)

We can estimate that households in this community
average 6.0 ± .44 hours of TV watching each day.

Another way to state the interval:
5.56 ≤ μ ≤ 6.44
Interpretation:
We estimate, with 95% confidence, that the population
mean for TV watching is greater than or equal to
5.56 and less than or equal to 6.44.
(This interval has a .05 chance of being wrong.)
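
The arithmetic for this example can be reproduced in a few lines. This is a minimal sketch; the function name `mean_confidence_interval` and its parameter names are illustrative, and the default Z of 1.96 assumes the 95% confidence level.

```python
import math


def mean_confidence_interval(xbar, s, n, z=1.96):
    """Return (lower, upper) for xbar +/- z * s / sqrt(n - 1)."""
    margin = z * s / math.sqrt(n - 1)
    return xbar - margin, xbar + margin


# Values from the television viewing example: n = 178, mean = 6, s = 3.
low, high = mean_confidence_interval(6.0, 3.0, 178)
print(f"{low:.2f} <= mu <= {high:.2f}")
```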
Example (cont.)

In other words:

Even if the statistic is as much as ±1.96
standard deviations from the mean of the
sampling distribution the confidence interval
will still include the value of μ.

Only rarely (5 times out of 100) will the
interval not include μ.
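
The "5 times out of 100" claim can be illustrated by simulation. The sketch below rests on assumptions that are not in the original slides: it invents a normally distributed population with μ = 6 and σ = 3, draws many random samples of size 178, and counts how often the 95% interval contains μ. The function name `coverage`, the number of trials, and the seed are all arbitrary.

```python
import math
import random
import statistics


def coverage(mu=6.0, sigma=3.0, n=178, trials=2000, z=1.96, seed=1):
    """Share of simulated 95% intervals that contain the true mean mu."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # Hypothetical population: normal with mean mu and s.d. sigma.
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = statistics.fmean(sample)
        s = statistics.pstdev(sample)  # divisor n, paired with sqrt(n - 1)
        margin = z * s / math.sqrt(n - 1)
        if xbar - margin <= mu <= xbar + margin:
            hits += 1
    return hits / trials


print(f"share of intervals containing mu: {coverage():.3f}")
```

The share should land near .95, with some run-to-run variation depending on the seed and the number of trials.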
Confidence Intervals For Proportions

Procedure:
Set alpha = .05.
Find the associated Z score.
Substitute the sample information into formula:

$\text{c.i.} = P_s \pm Z\sqrt{\dfrac{P_u(1-P_u)}{n}}$

Note:
$P_s$ = sample proportion
$P_u$ (when the population proportion is not
known) is set to .50
Example: Confidence Intervals For
Proportions

Question:

If 42% of a random sample of 764 people from
an Ontario city vote Liberal, what % of the entire
city vote Liberal?

Hint: Don’t forget to change the % to a
proportion.
Example for Proportions (cont.)

$\text{c.i.} = P_s \pm Z\sqrt{\dfrac{P_u(1-P_u)}{n}}$

= .42 ±1.96 (√(.25/764))
= .42 ±1.96 (√.00033)
= .42 ±1.96 (.018)
= .42 ±.04
Confidence Intervals For Proportions


Changing back to %, we estimate that 42% ± 4% of
the city residents vote Liberal.
Another way to state the interval:
38% ≤ Pu ≤ 46%
Interpretation: We estimate that the population value
is greater than or equal to 38% and less than or
equal to 46% for city residents who vote Liberal.
(This interval has a .05 chance of being wrong.)
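
The same calculation can be sketched in code. The function name `proportion_confidence_interval` is illustrative, and the population proportion is set to .50 as in the procedure above. The unrounded margin is about .035, which the worked example rounds to .04.

```python
import math


def proportion_confidence_interval(p_s, n, z=1.96, p_u=0.5):
    """Return (lower, upper) for p_s +/- z * sqrt(p_u * (1 - p_u) / n)."""
    margin = z * math.sqrt(p_u * (1 - p_u) / n)
    return p_s - margin, p_s + margin


# Values from the Liberal vote example: 42% of a sample of 764.
low, high = proportion_confidence_interval(0.42, 764)
print(f"{low:.1%} <= Pu <= {high:.1%}")
```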
EKOS* Research Poll (from CBC article)




N = 2934
Level of confidence: 19/20 or 95%
Alpha = .05 (z = +/- 1.96)
Ekos results:


Conservatives 30%, NDP 29%, Liberals 27%
Formula (for proportions):

$\text{c.i.} = P_s \pm Z\sqrt{\dfrac{P_u(1-P_u)}{n}}$

* EKOS reports a margin of error of +/- 1.8%. Can you confirm this?
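
One way to check the reported margin of error is to compute the Z term of the proportions formula directly. This sketch assumes the population proportion is set to .50, as in the earlier procedure; the variable names are illustrative.

```python
import math

n = 2934     # sample size reported for the poll
z = 1.96     # Z-score for the 95% confidence level
p_u = 0.5    # population proportion set to .50 when unknown

margin_of_error = z * math.sqrt(p_u * (1 - p_u) / n)
print(f"margin of error: +/- {margin_of_error:.1%}")
```

Under these assumptions the result agrees with the +/- 1.8% figure.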
Calculating Sample Sizes
(note: Formula 6.4 and 6.5 in 2/3 edition)
Sample sizes (cont)


These formulae can be used to estimate the
minimum required sample size for means or
proportions.
Where…..




n = minimum required sample size
Z = determined by your alpha level
σ or Pu = population standard deviation (use s
if unknown) or population proportion
ME = margin of error (in +/- actual units of
your desired estimate)
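
A sketch of how these quantities combine, under a stated assumption: the code uses the standard textbook forms n = Z²σ²/ME² for means and n = Z²·Pu(1 − Pu)/ME² for proportions, which match the definitions above but should be checked against Formulas 6.4 and 6.5 in the text. The function names and the input values are hypothetical.

```python
import math


def sample_size_for_mean(sigma, me, z=1.96):
    """Minimum n for a mean: (z * sigma / me) squared, rounded up."""
    return math.ceil((z * sigma / me) ** 2)


def sample_size_for_proportion(me, z=1.96, p_u=0.5):
    """Minimum n for a proportion: z^2 * p_u * (1 - p_u) / me^2, rounded up."""
    return math.ceil(z ** 2 * p_u * (1 - p_u) / me ** 2)


# Hypothetical targets: +/- 0.5 hours for a mean with sigma = 3,
# and +/- 3 percentage points for a proportion.
print(sample_size_for_mean(sigma=3.0, me=0.5))
print(sample_size_for_proportion(me=0.03))
```

Rounding up matters here because n is a minimum: rounding down would give a margin of error slightly wider than requested.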
Practice Questions:
Healey 1st Cdn #7.5, 7.7, 7.9
Healey 2/3 Cdn #6.5, 6.7, 6.9
