Transcript day9

Introduction to Statistical
Inference
Chapter 11
Announcement: Read chapter 12
to page 299
Populations vs. Samples
• Population
– The complete set of individuals
• Characteristics are called parameters
• Sample
– A subset of the population
• Characteristics are called statistics.
– In most cases we cannot study all the
members of a population
Inferential Statistics
• Statistical Inference
– A series of procedures in which the data
obtained from samples are used to make
statements about some broader set of
circumstances.
Two different types of procedures
• Estimating population parameters
– Point estimation
• Using a sample statistic to estimate a population parameter
– Interval estimation
• Estimating a range of values likely to contain the population
parameter, based on how much a sample statistic varies when many
samples are repeatedly taken from a population.
• Hypothesis testing
– The comparison of sample results with a known or
hypothesized population parameter
These procedures share a
fundamental concept
• Sampling distribution
– A theoretical distribution of the possible
values of a sample statistic if an infinite
number of same-sized samples were taken
from a population.
Example of the sampling
distribution of a discrete variable
[Figure: Binomial sampling distribution of an unbiased coin tossed 10 times. x-axis: number of heads in 10 tosses (0 to 10); y-axis: p(x), from 0 to 0.3.]
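The probabilities in a chart like this one can be computed directly from the binomial formula. The sketch below is an illustration only; the function name binomial_pmf and the variable names are assumptions, not something taken from the slides or the text.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n_tosses = 10   # tosses per sample, as in the slide title
p_heads = 0.5   # an unbiased coin

# p(x) for every possible number of heads, 0 through 10
pmf = [binomial_pmf(k, n_tosses, p_heads) for k in range(n_tosses + 1)]

for k, prob in enumerate(pmf):
    print(f"{k:2d} heads: p(x) = {prob:.4f}")
```

The distribution is symmetric around 5 heads, which is the most likely outcome at 252/1024, or about 0.246.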
Continuous Distributions
• Interval or ratio level data
– Weight, height, achievement, etc.
• JellyBlubbers!!!
Histogram of the Jellyblubber
population
Repeated sampling of the
Jellyblubber population (n = 3)
Repeated sampling of the
Jellyblubber population (n = 5)
Repeated sampling of the
Jellyblubber population (n = 10)
Repeated sampling of the
Jellyblubber population (n = 40)
For more on this concept
• Visit
– http://www.ruf.rice.edu/~lane/stat_sim/sampling_dist/index.html
Central Limit Theorem
• Proposition 1:
– The mean of the sampling
distribution will equal the
mean of the population.
$\mu_{\bar{x}} = \mu$
• Proposition 2:
– The sampling distribution of
means will be approximately
normal regardless of the
shape of the population.
• Proposition 3:
– The standard deviation
(standard error) equals the
standard deviation of the
population divided by the
square root of the sample
size. (see 11.5 in text)
$\sigma_{\bar{x}} = \sigma / \sqrt{N}$
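Propositions 1 and 3 can be checked by simulation. The sketch below repeatedly samples a made-up skewed population at the sample sizes used in the Jellyblubber slides (3, 5, 10, 40). The population here is hypothetical (exponential values with a mean near 9), not the actual Jellyblubber data, and the helper name sample_means is an assumption for illustration.

```python
import random
import statistics
from math import sqrt

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical skewed population (NOT the Jellyblubber data).
population = [random.expovariate(1 / 9) for _ in range(10_000)]
mu = statistics.fmean(population)       # population mean
sigma = statistics.pstdev(population)   # population standard deviation

def sample_means(pop, n, n_samples=5_000):
    """Means of n_samples random samples of size n, drawn with replacement."""
    return [statistics.fmean(random.choices(pop, k=n)) for _ in range(n_samples)]

results = {}
for n in (3, 5, 10, 40):
    means = sample_means(population, n)
    results[n] = (
        statistics.fmean(means),    # mean of the sample means
        statistics.pstdev(means),   # observed standard error
        sigma / sqrt(n),            # standard error predicted by Proposition 3
    )

print(f"population mean = {mu:.2f}, population sd = {sigma:.2f}")
for n, (mean_of_means, observed_se, predicted_se) in results.items():
    print(f"n = {n:2d}: mean of means = {mean_of_means:.2f}, "
          f"observed s.e. = {observed_se:.2f}, sigma/sqrt(n) = {predicted_se:.2f}")
```

The mean of the sample means stays close to the population mean at every sample size, while the spread of the sample means shrinks as n grows. Proposition 2 is not tested here; plotting a histogram of the means for each n would show the shape becoming more nearly normal.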
Application of the sampling
distribution
• Sampling error
– The difference between the sample mean and the population
mean.
• Assumed to be due to random error.
• From the jellyblubber experience we know that a
sampling distribution of means will be approximately
normally distributed with
$\mu_{\bar{x}} = \mu$ and $\sigma_{\bar{x}} = \sigma / \sqrt{N}$
Standard Error of the Mean and
Confidence Intervals
• We can estimate how
much variability there
is among potential
sample means by
calculating the
standard error of the
mean.
$\text{s.e.} = \sigma_{\bar{x}} = \sigma / \sqrt{N}$
Confidence Intervals
• With our Jellyblubbers
– One random sample (n = 3)
• Mean = 9
– Therefore:
• 68% CI = 9 ± 1(3.54)
• 95% CI = 9 ± 1.96(3.54)
• 99% CI = 9 ± 2.58(3.54)
$\text{s.e.} = \sigma_{\bar{x}} = 6.132 / \sqrt{3} = 3.54$
Confidence Intervals
• With our Jellyblubbers
– One random sample (n = 30)
• Mean = 8.90
– Therefore:
• 68% CI = 8.90 ± 1(1.11)
• 95% CI = 8.90 ± 1.96(1.11)
• 99% CI = 8.90 ± 2.58(1.11)
$\text{s.e.} = \sigma_{\bar{x}} = 6.132 / \sqrt{30} = 1.11$
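The interval arithmetic on the two Confidence Intervals slides can be carried out as follows. This is a minimal sketch that takes 6.132 as the population standard deviation, as the slides appear to do; the function names standard_error and confidence_interval are illustrative assumptions.

```python
from math import sqrt

SIGMA = 6.132  # standard deviation used on the slides

# z multipliers the slides use for each confidence level
Z = {68: 1.00, 95: 1.96, 99: 2.58}

def standard_error(sigma, n):
    """Standard error of the mean: sigma divided by the square root of n."""
    return sigma / sqrt(n)

def confidence_interval(mean, sigma, n, level):
    """Lower and upper bounds of the CI: mean plus or minus z times s.e."""
    margin = Z[level] * standard_error(sigma, n)
    return mean - margin, mean + margin

# The two samples from the slides: (sample mean, sample size)
for mean, n in ((9.0, 3), (8.90, 30)):
    print(f"n = {n}: s.e. = {standard_error(SIGMA, n):.3f}")
    for level in (68, 95, 99):
        lower, upper = confidence_interval(mean, SIGMA, n, level)
        print(f"  {level}% CI: {lower:.2f} to {upper:.2f}")
```

With ten times as many observations, the intervals around the n = 30 mean are much narrower than those around the n = 3 mean. Direct computation for n = 30 gives a standard error of about 1.12, slightly above the 1.11 shown on the slide.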
Hypothesis Testing (see handout)
1. State the research question.
2. State the statistical hypothesis.
3. Set decision rule.
4. Calculate the test statistic.
5. Decide if result is significant.
6. Interpret result as it relates to your research question.
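The six steps can be sketched as a one-sample z test. The handout is not reproduced here, so this is only one possible illustration. It reuses the n = 30 sample (mean 8.90) and the standard deviation 6.132 from the earlier slides. The hypothesized population mean of 10 is a made-up value for the example, and the function names are assumptions.

```python
from math import erf, sqrt

def normal_cdf(z):
    """Cumulative probability of the standard normal distribution at z."""
    return 0.5 * (1 + erf(z / sqrt(2)))

def one_sample_z_test(sample_mean, mu0, sigma, n, alpha=0.05):
    """Two-tailed z test of a sample mean against a hypothesized mean mu0."""
    se = sigma / sqrt(n)                       # standard error of the mean
    z = (sample_mean - mu0) / se               # step 4: test statistic
    p_value = 2 * (1 - normal_cdf(abs(z)))     # two-tailed probability
    return z, p_value, p_value < alpha         # step 5: significant or not

# Step 1: does the population mean differ from 10? (10 is HYPOTHETICAL)
# Step 2: H0: mu = 10 versus H1: mu != 10
# Step 3: decision rule: reject H0 if p < .05 (equivalently |z| > 1.96)
z, p_value, significant = one_sample_z_test(8.90, mu0=10.0, sigma=6.132, n=30)

# Step 6: interpret the result in terms of the research question
print(f"z = {z:.2f}, p = {p_value:.3f}")
print("Reject H0" if significant else "Fail to reject H0")
```

Under these assumed values the sample mean lies within about one standard error of the hypothesized mean, so the result is not significant at the .05 level.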