Lecture 2a - San Jose State University
MET 136 Statistical Climatology - Lecture 11
Confidence Intervals
Dr. Marty Leach
San Jose State University
Reading:
Gonick Chapter 7
Sampling
We previously studied how samples of
large populations were distributed.
Now, we’ll look at one sample, and study
what we can determine from this alone.
Confidence Intervals
Are used extensively in science
Used in election polls (watch it!)
Example:
The average global air temperature near the
Earth's surface increased 0.74 ± 0.18 °C (1.33 ±
0.32 °F) during the 100 years ending in
2005. (IPCC 2007)
Example 1
Election Numbers
http://www.surveyusa.com/client/PollPrint.aspx?g=252060cf-f1d3-49bc-80ed24d0c9122b49&d=0
Let’s look at the numbers
Poll
Surveyed 661 likely voters
n = 661
Randomly selected
Result:
$\hat{p} = 0.53$
Standard deviation of normal
To determine the accuracy of this probability, we
need to calculate the standard deviation:
$\sigma_{\hat{p}} = \sqrt{\frac{p(1 - p)}{n}}$
The only problem: we don't know the true probability, p.
Standard Error
The only thing we can do is substitute the sampled probability, $\hat{p}$,
for the unknown p. The result is called the standard error:
$SE(\hat{p}) = \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$
Standard Error
So now we can estimate the confidence interval at
the 95% level
$0.95 = \Pr(-1.96 \le Z \le 1.96)$
$0.95 = \Pr\left(\hat{p} - 1.96\,SE(\hat{p}) \le p \le \hat{p} + 1.96\,SE(\hat{p})\right)$
This says that for 95% of samples, the interval between these two values
will contain the true probability p.
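The step from the statement about Z to the statement about p is not spelled out on the slide. A sketch of the intermediate algebra, assuming the usual large-sample approximation that the standardized sample proportion $Z = (\hat{p} - p)/SE(\hat{p})$ is roughly standard normal:

```latex
\begin{align*}
0.95 &= \Pr\left(-1.96 \le \frac{\hat{p} - p}{SE(\hat{p})} \le 1.96\right) \\
     &= \Pr\left(-1.96\,SE(\hat{p}) \le \hat{p} - p \le 1.96\,SE(\hat{p})\right) \\
     &= \Pr\left(\hat{p} - 1.96\,SE(\hat{p}) \le p \le \hat{p} + 1.96\,SE(\hat{p})\right)
\end{align*}
```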
Calculate confidence interval
Let’s calculate the 95% confidence interval for the
presidential poll in CA.
n = 661
$\hat{p} = 0.53$
$SE(\hat{p}) = \sqrt{\frac{0.53(0.47)}{661}} \approx 0.019$
So now p is within the range:
0.53±1.96*0.019
p=0.53 ± 0.038
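The arithmetic above can be checked with a few lines of Python. This is a minimal sketch, not part of the original lecture; the function name `proportion_ci` is made up for illustration, and it uses the same normal approximation (z = 1.96) as the slide.

```python
import math

def proportion_ci(p_hat, n, z=1.96):
    """Return (standard error, margin of error, lower, upper) for a sample proportion.

    Hypothetical helper: uses SE = sqrt(p_hat * (1 - p_hat) / n) and the
    normal critical value z, as on the slide.
    """
    se = math.sqrt(p_hat * (1.0 - p_hat) / n)
    margin = z * se
    return se, margin, p_hat - margin, p_hat + margin

# Poll numbers from the slide: p_hat = 0.53, n = 661
se, margin, lower, upper = proportion_ci(0.53, 661)
print(f"SE = {se:.3f}, margin = {margin:.3f}, CI = [{lower:.3f}, {upper:.3f}]")
# prints: SE = 0.019, margin = 0.038, CI = [0.492, 0.568]
```

Note that the margin of 0.038 comes from the unrounded standard error (about 0.0194); multiplying 1.96 by the rounded 0.019 gives 0.037.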
Interpretation
So what does this mean?
p=0.53 ± 0.038
0.492 ≤ p ≤ 0.568
Slight oversight…
Obama: 53%, McCain: 43%, Undecided/other: 4%
Figure: 20 samples with n=1000, assuming a true value of p=0.5. Shown are the 95%
confidence intervals. On average, 1 in 20 will not cover 0.5.
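The experiment in the figure can be imitated with a small simulation. The sketch below is an illustration under stated assumptions, not the code behind the original figure: each respondent is modeled as an independent draw with probability p = 0.5, the function name `coverage_experiment` is hypothetical, and the seeds are arbitrary.

```python
import math
import random

def coverage_experiment(p_true=0.5, n=1000, n_samples=20, z=1.96, seed=0):
    """Draw n_samples polls of size n; count how many 95% intervals cover p_true."""
    rng = random.Random(seed)
    covered = 0
    for _ in range(n_samples):
        # Simulate one poll: each respondent says "yes" with probability p_true
        successes = sum(rng.random() < p_true for _ in range(n))
        p_hat = successes / n
        se = math.sqrt(p_hat * (1.0 - p_hat) / n)
        if p_hat - z * se <= p_true <= p_hat + z * se:
            covered += 1
    return covered

# One run of 20 polls, as in the figure
covered20 = coverage_experiment(n_samples=20)
print(f"{covered20} of 20 intervals cover p = 0.5")

# Many polls, to estimate the long-run coverage fraction
long_run = coverage_experiment(n_samples=2000, seed=1) / 2000
print(f"long-run coverage: {long_run:.3f}")
```

Any single batch of 20 may have zero, one, or several misses; only the long-run fraction is expected to settle near 0.95.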
Improve the results
Suppose we want more confidence, say 99%.
What can we do?
Widen the confidence interval
Increase the sample size
Example
Redo the confidence interval at the 99% level
Result:
0.53±2.58*0.019
p=0.53 ± 0.049
0.481 ≤ p ≤ 0.579
But now our margin of error is larger (e.g., I’m
100% confident the probability will be between 0
and 1!).
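To see how the margin of error grows with the confidence level, the critical values can be computed rather than read from a table. The sketch below uses `statistics.NormalDist` from the Python standard library (3.8+); the helper name `z_critical` is an assumption for illustration.

```python
import math
from statistics import NormalDist

def z_critical(confidence):
    """Two-sided normal critical value, e.g. about 1.96 for confidence = 0.95."""
    alpha = 1.0 - confidence
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)

# Standard error for the poll on the slides (p_hat = 0.53, n = 661)
se = math.sqrt(0.53 * 0.47 / 661)

margins = {}
for confidence in (0.90, 0.95, 0.99):
    z = z_critical(confidence)
    margins[confidence] = z * se
    print(f"{confidence:.0%}: z = {z:.3f}, margin = {margins[confidence]:.3f}")
```

With unrounded inputs the 99% margin comes out at about 0.050; the 0.049 on the slide follows from multiplying the rounded values 2.58 and 0.019.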
Sample Size
But what if we are not happy that our error
has gone up? The other way to keep the error
down and the confidence high is to increase
the sample size.
$n = \frac{Z_{\alpha/2}^2 \, p^*(1 - p^*)}{E^2}$
Where Z is from the normal table (pg 84), p* is the
estimate of the probability and E is the margin of
error.
Example
So now calculate the sample size required to
produce a margin of error of 0.01 and a 99%
confidence level.
Result:
More than 16,000 respondents!
Limits to polling…
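The sample-size calculation can be sketched as follows. The slide does not say which estimate p* was used, so two choices are shown as assumptions: the poll value 0.53 and the conservative worst case 0.5. The function name `required_sample_size` is hypothetical, and Z = 2.58 is the rounded 99% value used earlier in the lecture.

```python
import math

def required_sample_size(margin, z, p_star=0.5):
    """Smallest whole n with z * sqrt(p_star * (1 - p_star) / n) <= margin."""
    return math.ceil(z ** 2 * p_star * (1.0 - p_star) / margin ** 2)

# Margin of error E = 0.01 at the 99% level (Z = 2.58)
n_poll = required_sample_size(0.01, 2.58, p_star=0.53)   # assumption: p* from the poll
n_worst = required_sample_size(0.01, 2.58)               # assumption: worst case p* = 0.5
print(f"p* = 0.53: n = {n_poll}")
print(f"p* = 0.50: n = {n_worst}")
```

Either choice lands above 16,000, consistent with the result on the slide. Because n scales with 1/E², halving the margin of error quadruples the required sample.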
Confidence intervals for the mean
Now, we’ll look at confidence intervals for the
mean, not the probability.
$\bar{x} \pm z_{\alpha/2}\, SE(\bar{x})$
$\bar{x} \pm 1.96 \frac{s}{\sqrt{n}}$
Standard Error
The standard error of the mean is defined as:
$SE(\bar{x}) = \frac{s}{\sqrt{n}}$
where s is the sample standard deviation.
Example
Suppose that you calculate the average
winter low temperature in Silicon Valley
during the last 25 years to be 41.5 °F and the
standard deviation is 3.2 °F.
Compute the 95% confidence interval for the
mean temperature.
If temperatures below 40 °F are required for
fruit to start growing in the valley, would you
expect this to happen in a typical winter?
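One way to work the first part of this exercise is sketched below, using the normal critical value 1.96 from the preceding slides. The function name `mean_ci` is an assumption for illustration.

```python
import math

def mean_ci(mean, s, n, critical=1.96):
    """Return (standard error, lower, upper) for a confidence interval on the mean."""
    se = s / math.sqrt(n)          # standard error of the mean
    margin = critical * se
    return se, mean - margin, mean + margin

# Values from the example: 25 winters, mean 41.5 F, standard deviation 3.2 F
se, lower, upper = mean_ci(41.5, 3.2, 25)
print(f"SE = {se:.2f} F, 95% CI = [{lower:.2f}, {upper:.2f}] F")
# prints: SE = 0.64 F, 95% CI = [40.25, 42.75] F
```

The interval describes the uncertainty in the long-term mean, not the spread of individual winters, which is worth keeping in mind when answering the question about 40 °F.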
Student’s t
We’ve discussed that as the sample size increases,
the distribution approaches a normal distribution.
We can quantify this using the degrees of freedom.
If you have $x_1, x_2, \ldots, x_n$ data points, then you have
n-1 degrees of freedom.
So, we can choose a t-distribution for n-1 degrees
of freedom.
t-distribution
Mean using a t-distribution
So, using a t-distribution, the confidence interval for the mean is given by:
$\bar{x} \pm t_{\alpha/2}\, SE(\bar{x})$
where $t_{\alpha/2}$ is the critical value of the t-distribution
with n − 1 degrees of freedom.
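As an illustration only (the slides do not work this case), the t-based interval can be applied to the earlier Silicon Valley temperature example with n = 25, so 24 degrees of freedom. The critical value 2.064 is the standard table entry for 24 degrees of freedom at the two-sided 95% level; it is hard-coded here as an assumption rather than computed, since the Python standard library has no t-distribution.

```python
import math

# Assumption: two-sided 95% critical value for 24 degrees of freedom,
# taken from a standard t-table (not computed here).
T_CRIT_95_DF24 = 2.064
Z_CRIT_95 = 1.96

n, mean, s = 25, 41.5, 3.2          # values from the earlier temperature example
se = s / math.sqrt(n)               # standard error of the mean

t_margin = T_CRIT_95_DF24 * se
z_margin = Z_CRIT_95 * se

print(f"z interval: [{mean - z_margin:.2f}, {mean + z_margin:.2f}] F")
print(f"t interval: [{mean - t_margin:.2f}, {mean + t_margin:.2f}] F")
# prints: z interval: [40.25, 42.75] F
# prints: t interval: [40.18, 42.82] F
```

The t interval is slightly wider, reflecting the extra uncertainty from estimating the standard deviation with a small sample; the gap shrinks as n grows.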
Notation:
t-distribution table