Analytical Methods I

Chapter 9: Introduction to Inference
1
Thumbtack Activity

Toss your thumbtack in the air and record
whether it lands point up (U) or point
down (D). Do this 25 times (n=25).


Calculate p-hat, the proportion of your 25 tosses
that landed point up (U).
Repeat the above process two more times,
for a total of three estimates.

Record your p-hat on a separate post-it note.
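
For anyone without a thumbtack to hand, the activity can be imitated in code. This is only a sketch: the true chance of landing point up is unknown in the real activity, so the value `P_UP = 0.6` below is a made-up assumption, and the function names are illustrative rather than anything from the course.

```python
import random

def toss_thumbtack(n, p_up, rng):
    """Simulate n tosses; each lands 'U' with probability p_up, else 'D'."""
    return ["U" if rng.random() < p_up else "D" for _ in range(n)]

def p_hat(outcomes):
    """Sample proportion of tosses that landed point up."""
    return outcomes.count("U") / len(outcomes)

P_UP = 0.6               # hypothetical true proportion (an assumption)
rng = random.Random(9)   # fixed seed so the run is repeatable

# Three samples of n=25 tosses, one p-hat per sample, as in the activity.
estimates = [p_hat(toss_thumbtack(25, P_UP, rng)) for _ in range(3)]
print(estimates)
```

The three printed values will generally differ from one another even though the coin-like process behind them never changes.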
2
We’ve just begun a sampling
distribution.

Strictly speaking, a sampling distribution is:


A theoretical distribution of the values of a statistic
(in our case, the proportion) in all possible
samples of the same size (n=25 here) from the
same population.
Sampling Variability:


The value of a statistic varies from sample to sample in repeated random sampling.
We do not expect to get the same exact value for
the statistic for each sample!
3
Definitions

Parameter:



A number that describes the population of interest.
Rarely do we know its value, because we do not (typically)
have all values of all individuals from a population.
We use µ and σ for the mean and standard deviation of a
population.


We use p and σp for proportions.
Statistic:


A number that describes a sample. We often use a statistic
to estimate an unknown parameter.
We use x-bar and s for the mean and standard deviation of
a sample.

We use p-hat and σp-hat for proportions.
4
Sampling Distribution

The sampling distribution answers the
question, “What would happen if we
repeated the sample or experiment
many times?”

Formal statistical inference is based on the
sampling distribution of statistics.
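
The "repeat it many times" question can be explored by simulation. The sketch below draws 10,000 samples of n=25 from a population with an assumed proportion p=0.6 (a made-up value) and compares the spread of the resulting p-hat values with the standard formula sqrt(p(1-p)/n), which is not stated on the slides but is the usual standard deviation of a sample proportion.

```python
import math
import random
import statistics

def simulate_p_hats(n, p, reps, seed=0):
    """Return reps sample proportions, each from n independent yes/no draws."""
    rng = random.Random(seed)
    return [sum(rng.random() < p for _ in range(n)) / n for _ in range(reps)]

P, N = 0.6, 25   # hypothetical population proportion and sample size
p_hats = simulate_p_hats(n=N, p=P, reps=10_000)

mean_p_hat = statistics.mean(p_hats)      # centre of the simulated distribution
sd_p_hat = statistics.pstdev(p_hats)      # spread of the simulated distribution
theory_sd = math.sqrt(P * (1 - P) / N)    # sqrt(p(1-p)/n)

print(f"mean of p-hats: {mean_p_hat:.3f}")
print(f"sd of p-hats:   {sd_p_hat:.3f} (formula gives {theory_sd:.3f})")
```

A simulation like this only approximates the sampling distribution, which by definition covers all possible samples, but with many repetitions the centre and spread settle close to the theoretical values.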
5
Inference


Inference is the statistical process by
which we use information collected
from a sample to infer something about
the population of interest.
Two main types of inference:


Interval estimation (Section 9.1)
Tests of significance (Section 9.2)
6
Constructing Confidence Intervals


Back to the thumbtack activity …
Interpretation of 95% C.I.:

If the sampling distribution is approximately
normal, then the 68-95-99.7 rule tells us that
about 95% of all p-hat values will be within two
standard deviations of p (upon repeated
samplings). If p-hat is within two standard
deviations of p, then p is within two standard
deviations of p-hat. So about 95% of the time, the
confidence interval will contain the true population
parameter p.
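
The key step in this argument is that "p-hat is within two standard deviations of p" and "p is within two standard deviations of p-hat" are the same statement, because distance is symmetric. Written out, using $\sigma_{\hat{p}}$ (notation assumed here) for the standard deviation of the sampling distribution of p-hat:

```latex
\[
|\hat{p} - p| \le 2\sigma_{\hat{p}}
\;\Longleftrightarrow\;
\hat{p} - 2\sigma_{\hat{p}} \;\le\; p \;\le\; \hat{p} + 2\sigma_{\hat{p}},
\qquad\text{so}\qquad
P\!\left(\hat{p} - 2\sigma_{\hat{p}} \le p \le \hat{p} + 2\sigma_{\hat{p}}\right) \approx 0.95 .
\]
```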
7
Internet Demonstration, C.I.

http://bcs.whfreeman.com/yates/pages/bcsmain.asp?s=00020&n=99000&i=99020.01&v=category&o=&ns=0&uid=0&rau=0
8
Interpretation of 95% CI (Commit to memory!)

95% of all confidence intervals
constructed in the same manner will
contain the true population parameter.

5% of the time they will not.
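
This long-run reading can be checked by simulation. The sketch below assumes a true proportion of 0.6 and samples of n=25 (both made-up values), builds an interval of p-hat plus or minus two estimated standard deviations from each sample, and counts how often the interval contains the true value. Estimating the standard deviation from p-hat itself is an assumption of this sketch.

```python
import math
import random

def interval_covers(p, n, rng):
    """Draw one sample of size n and report whether p-hat +/- 2 SE contains p."""
    successes = sum(rng.random() < p for _ in range(n))
    p_hat = successes / n
    margin = 2 * math.sqrt(p_hat * (1 - p_hat) / n)  # two estimated std. deviations
    return p_hat - margin <= p <= p_hat + margin

TRUE_P, N, REPS = 0.6, 25, 2000   # hypothetical settings
rng = random.Random(1)            # fixed seed so the run is repeatable

coverage = sum(interval_covers(TRUE_P, N, rng) for _ in range(REPS)) / REPS
print(f"share of intervals containing p: {coverage:.3f}")
```

With a sample as small as 25 the normal approximation is rough, so the simulated share tends to land somewhat under 95%; larger samples bring it closer.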
9
10
p. 492
11
12
Finding a 95% C.I.
13
Practice


See example 9.3, p. 495
Exercises 9.1-9.4, p. 495
14
Creating the C.I.

Estimate +/- Margin of error
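
For a proportion, the recipe can be sketched as follows. The margin of error here is taken as two estimated standard deviations of p-hat, matching the "two standard deviations" reasoning used earlier; the function name and the 15-out-of-25 data are hypothetical, and 1.96 is the more precise multiplier usually used for 95% confidence.

```python
import math

def proportion_ci(successes, n, z_star=2.0):
    """Return (low, high) for estimate +/- margin of error for a proportion."""
    p_hat = successes / n                                   # the estimate
    margin = z_star * math.sqrt(p_hat * (1 - p_hat) / n)    # the margin of error
    return p_hat - margin, p_hat + margin

# Hypothetical thumbtack result: 15 of 25 tosses landed point up.
low, high = proportion_ci(15, 25)
print(f"{low:.3f} to {high:.3f}")   # prints "0.404 to 0.796"
```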
15
Another practice problem

9.5, p. 496
16
p. 496
17
Finding a confidence interval, general form
18
Figure 9.5, p. 502
19
20
Practice

9.9 and 9.10, p. 505
21
Confidence intervals with the calculator
22
9.2 Significance Testing

An evolutionary psychologist at Harvard University
claims that 80% (p=0.80) of American adults
believe in the theory of evolution. To test his claim,
he takes an SRS of 1,120 adults. Here are the
results:


851 said “Yes” when asked, “Do you believe in the theory of
evolution?”
What is the proportion who said yes?

Is this enough evidence to say that the proportion of adults
who believe in the theory of evolution is different
from 0.80?
23
Example, cont.


This requires a significance test:
Hypotheses:



H0: p = 0.80
Ha: p ≠ 0.80
Let’s use our calculators to conduct the
appropriate test:

5: 1-PropZTest
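
The same test can be worked by hand or in code. The sketch below uses the figures from the example (851 "Yes" out of 1,120, null value 0.80) and the usual one-proportion z statistic, with the standard deviation computed under the null hypothesis; the function name is illustrative, and a calculator may round differently.

```python
import math
from statistics import NormalDist

def one_prop_z_test(successes, n, p0):
    """Two-sided one-proportion z test of H0: p = p0."""
    p_hat = successes / n
    se0 = math.sqrt(p0 * (1 - p0) / n)         # std. deviation of p-hat if H0 is true
    z = (p_hat - p0) / se0
    p_value = 2 * NormalDist().cdf(-abs(z))    # area in both tails
    return p_hat, z, p_value

p_hat, z, p_value = one_prop_z_test(851, 1120, 0.80)
print(f"p-hat = {p_hat:.3f}, z = {z:.2f}, P-value = {p_value:.4f}")
```

Here p-hat is about 0.76 and z is about -3.36, which gives a P-value below 0.001: a sample proportion this far from 0.80 would be very unusual if the claim were true.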
24
Example Results
P-value
25
p. 516
26
Hypotheses
Alternative hypothesis Ha:
Can be one-sided (Ha: p > some number or Ha: p < some number)
or two-sided (Ha: p ≠ some number)
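
The choice of alternative decides which tail area counts as the P-value. A minimal sketch, assuming a z statistic has already been computed and using illustrative labels ("less", "greater", "two-sided") that are not from the slides:

```python
from statistics import NormalDist

def p_value(z, alternative):
    """P-value for a z statistic under a one- or two-sided alternative."""
    std_normal = NormalDist()            # standard normal: mean 0, sd 1
    if alternative == "less":            # Ha: p < some number
        return std_normal.cdf(z)
    if alternative == "greater":         # Ha: p > some number
        return 1 - std_normal.cdf(z)
    if alternative == "two-sided":       # Ha: p != some number
        return 2 * std_normal.cdf(-abs(z))
    raise ValueError(f"unknown alternative: {alternative!r}")

for alt in ("less", "greater", "two-sided"):
    print(alt, round(p_value(-1.5, alt), 4))
```

For the same z, the two-sided P-value is twice the smaller of the two one-sided values, so the alternative should be fixed before looking at the data.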
27
HW


9.24-9.26, p. 521
Reading: pp. 509-525
28
p. 519
29
Sampling Applet

http://www.ruf.rice.edu/~lane/stat_sim/sampling_dist/
30