Analytical Methods I
Chapter 9: Introduction to Inference
1
Thumbtack Activity
Toss your thumbtack in the air and record
whether it lands either point up (U) or point
down (D). Do this 25 times (n=25).
Calculate p-hat.
Repeat the above process two more times,
for a total of three estimates.
Record your p-hat on a separate post-it note.
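The activity can also be mimicked in software. This is a minimal Python sketch, not part of the original activity. The point-up probability of 0.6 is an invented value (a real thumbtack's p is unknown), and the function names are illustrative only.

```python
import random

ASSUMED_P_UP = 0.6  # hypothetical value; the true p for a real thumbtack is unknown


def toss_thumbtack(n, p_up, rng):
    """Simulate n tosses and return a list of 'U' (point up) or 'D' (point down)."""
    return ["U" if rng.random() < p_up else "D" for _ in range(n)]


def p_hat(outcomes):
    """Sample proportion of tosses that landed point up."""
    return outcomes.count("U") / len(outcomes)


rng = random.Random(1)  # fixed seed so the run is repeatable

# Three samples of n=25 tosses each, giving three estimates of p.
estimates = [p_hat(toss_thumbtack(25, ASSUMED_P_UP, rng)) for _ in range(3)]
print(estimates)
```

Each run of 25 tosses generally gives a different p-hat, which is the point of recording three of them.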
2
We’ve just begun a sampling
distribution.
Strictly speaking, a sampling distribution is:
A theoretical distribution of the values of a statistic
(in our case, the proportion) in all possible
samples of the same size (n=25 here) from the
same population.
Sampling Variability:
The value of a statistic varies from sample to sample in repeated random sampling.
We do not expect to get the same exact value for
the statistic for each sample!
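We cannot draw all possible samples, but a simulation with many samples approximates the sampling distribution. The sketch below assumes a population proportion of p = 0.6 (an invented value) and compares the spread of the simulated p-hat values with the standard formula sqrt(p(1-p)/n), which these slides do not state.

```python
import math
import random
import statistics


def simulate_p_hats(p, n, num_samples, seed=0):
    """Draw num_samples samples of size n and return the p-hat from each one."""
    rng = random.Random(seed)
    return [
        sum(rng.random() < p for _ in range(n)) / n
        for _ in range(num_samples)
    ]


p, n = 0.6, 25  # p is a hypothetical population proportion
p_hats = simulate_p_hats(p, n, num_samples=10_000)

simulated_mean = statistics.mean(p_hats)
simulated_sd = statistics.pstdev(p_hats)
theoretical_sd = math.sqrt(p * (1 - p) / n)  # standard deviation of p-hat

print("mean of p-hats:", round(simulated_mean, 3))
print("sd of p-hats:  ", round(simulated_sd, 3))
print("sqrt(p(1-p)/n):", round(theoretical_sd, 3))
```

The individual p-hat values differ from sample to sample, but they center near p and their spread is close to the theoretical value.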
3
Definitions
Parameter:
A number that describes the population of interest.
Rarely do we know its value, because we do not (typically)
have all values of all individuals from a population.
We use µ and σ for the mean and standard deviation of a
population.
We use p and σp for proportions.
Statistic:
A number that describes a sample. We often use a statistic
to estimate an unknown parameter.
We use x-bar and s for the mean and standard deviation of
a sample.
We use p-hat and σp-hat for proportions.
4
Sampling Distribution
The sampling distribution answers the
question, “What would happen if we
repeated the sample or experiment
many times?”
Formal statistical inference is based on the
sampling distribution of statistics.
5
Inference
Inference is the statistical process by
which we use information collected
from a sample to infer something about
the population of interest.
Two main types of inference:
Interval estimation (Section 9.1)
Tests of significance (Section 9.2)
6
Constructing Confidence Intervals
Back to the thumbtack activity …
Interpretation of 95% C.I.:
If the sampling distribution is approximately
normal, then the 68-95-99.7 rule tells us that
about 95% of all p-hat values will be within two
standard deviations of p (upon repeated
samplings). If p-hat is within two standard
deviations of p, then p is within two standard
deviations of p-hat. So for about 95% of samples, the
interval from p-hat minus two standard deviations to
p-hat plus two standard deviations will contain the
true population parameter p.
7
Internet Demonstration, C.I.
http://bcs.whfreeman.com/yates/pages/bcsmain.asp?s=00020&n=99000&i=99020.01&v=category&o=&ns=0&uid=0&rau=0
8
Interpretation of 95% CI (Commit to memory!)
95% of all confidence intervals
constructed in the same manner will
contain the true population parameter.
5% of the time they will not.
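This interpretation can be checked by simulation: build many intervals the same way and count how often they contain p. The sketch below is an illustration under assumptions, not the textbook's procedure. It uses an invented p = 0.6, estimates the standard deviation from each sample as sqrt(p-hat(1-p-hat)/n), and uses the "two standard deviations" rule from the thumbtack discussion.

```python
import math
import random


def two_sd_interval(p_hat, n):
    """Interval p-hat +/- 2 estimated standard deviations of p-hat."""
    sd = math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - 2 * sd, p_hat + 2 * sd


def coverage(p, n, num_intervals, seed=0):
    """Fraction of simulated intervals that contain the true proportion p."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(num_intervals):
        p_hat = sum(rng.random() < p for _ in range(n)) / n
        low, high = two_sd_interval(p_hat, n)
        if low <= p <= high:
            hits += 1
    return hits / num_intervals


# p = 0.6 is a hypothetical value chosen for the simulation.
print(coverage(p=0.6, n=25, num_intervals=10_000))
```

With a sample as small as n=25 the normal approximation is rough, so the fraction should come out near 0.95 rather than exactly 0.95.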
9
10
p. 492
11
12
Finding a 95% C.I.
13
Practice
See example 9.3, p. 495
Exercises 9.1-9.4, p. 495
14
Creating the C.I.
Estimate +/- Margin of error
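As a concrete illustration of "estimate +/- margin of error", here is a short Python sketch for a proportion. The data (15 point-up landings in 25 tosses) are made up, and the margin of error assumes the usual large-sample form z* times sqrt(p-hat(1-p-hat)/n) with z* = 1.96 for 95% confidence.

```python
import math


def margin_of_error_proportion(p_hat, n, z_star=1.96):
    """Margin of error for a proportion: z* times the estimated sd of p-hat."""
    return z_star * math.sqrt(p_hat * (1 - p_hat) / n)


def confidence_interval(estimate, margin_of_error):
    """General form: estimate +/- margin of error."""
    return estimate - margin_of_error, estimate + margin_of_error


# Hypothetical data: 15 of 25 tosses landed point up.
p_hat = 15 / 25
moe = margin_of_error_proportion(p_hat, n=25)
low, high = confidence_interval(p_hat, moe)
print(f"{p_hat:.2f} +/- {moe:.3f} gives ({low:.3f}, {high:.3f})")
```

The same `confidence_interval` function applies to any estimate once its margin of error is known; only the margin-of-error formula changes with the statistic.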
15
Another practice problem
9.5, p. 496
16
p. 496
17
Finding a confidence interval, general form
18
Figure 9.5, p. 502
19
20
Practice
9.9 and 9.10, p. 505
21
Confidence intervals with the calculator
22
9.2 Significance Testing
An evolutionary psychologist at Harvard University
claims that 80% (p=0.80) of American adults
believe in the theory of evolution. To test his claim,
he takes an SRS of 1,120 adults. Here are the
results:
851 said “Yes” when asked, “Do you believe in the theory of
evolution?”
What is the proportion who said yes?
Is this enough evidence to say that the proportion of adults
who believe in the theory of evolution is different
from 0.80?
23
Example, cont.
This requires a significance test:
Hypotheses:
Ho: p=0.80
Ha: p≠0.80
Let’s use our calculators to conduct the
appropriate test:
5: 1-PropZTest
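Without a calculator, the same test can be sketched in Python. This is a minimal version, assuming the usual large-sample statistic z = (p-hat - p0) / sqrt(p0(1-p0)/n) and a two-sided P-value from the standard normal distribution. The function names are illustrative.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative probability P(Z <= z)."""
    return 0.5 * math.erfc(-z / math.sqrt(2))


def one_prop_z_test(successes, n, p0):
    """Two-sided one-proportion z test of Ho: p = p0 against Ha: p != p0."""
    p_hat = successes / n
    sd_under_h0 = math.sqrt(p0 * (1 - p0) / n)  # uses p0, not p-hat
    z = (p_hat - p0) / sd_under_h0
    p_value = 2 * normal_cdf(-abs(z))  # area in both tails
    return p_hat, z, p_value


# Data from the example: 851 "Yes" answers in an SRS of 1,120 adults.
p_hat, z, p_value = one_prop_z_test(successes=851, n=1120, p0=0.80)
print(f"p-hat = {p_hat:.4f}, z = {z:.2f}, P-value = {p_value:.5f}")
```

A small P-value means a sample proportion this far from 0.80 would rarely occur if the claim p=0.80 were true.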
24
Example Results
P-value
25
p. 516
26
Hypotheses
Alternative hypothesis Ha:
Can be one-sided (Ha: p> some number or p< some number)
or two-sided (Ha: p≠ some number)
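The choice of Ha decides which tail area counts as the P-value. The sketch below shows that choice for a z statistic, assuming a standard normal reference distribution. The value z = -1.5 and the labels "less", "greater" and "two-sided" are invented for illustration.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative probability P(Z <= z)."""
    return 0.5 * math.erfc(-z / math.sqrt(2))


def p_value(z, alternative):
    """P-value for a z statistic under the chosen alternative hypothesis."""
    if alternative == "less":       # Ha: p < some number (lower tail)
        return normal_cdf(z)
    if alternative == "greater":    # Ha: p > some number (upper tail)
        return 1 - normal_cdf(z)
    if alternative == "two-sided":  # Ha: p != some number (both tails)
        return 2 * normal_cdf(-abs(z))
    raise ValueError("alternative must be 'less', 'greater', or 'two-sided'")


z = -1.5  # hypothetical test statistic
for alt in ("less", "greater", "two-sided"):
    print(alt, round(p_value(z, alt), 4))
```

For the same z, the two-sided P-value is twice the smaller of the two one-sided values.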
27
HW
9.24-9.26, p. 521
Reading: pp. 509-525
28
p. 519
29
Sampling Applet
http://www.ruf.rice.edu/~lane/stat_sim/sampling_dist/
30