Psyc 235: Introduction to Statistics
http://www.psych.uiuc.edu/~jrfinley/p235/
DON’T FORGET TO SIGN IN FOR CREDIT!
Announcements (1of2)
• Early Informal Feedback
https://webtools.uiuc.edu/formBuilder/Secure?id=9748379
Open until Sat March 15th
• Special Lecture Thurs March 13th:
Conditional Probability (incl. Law of Total
Prob., Bayes’ Theorem)
Mandatory for invited students
Anyone can come
No OH; Go to lab for Qs/help.
Announcements (2of2)
• Target Dates: STAY ON TARGET!
You should be finishing the Distributions slice
VoDs “5. Normal Calculations,” “17. Binomial
Distributions,” and “18. The Sample Mean and
Control Charts”
• Quiz 3: Thurs-Fri March 13th-14th
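[Figure: a Population; a Sample (size = n) drawn from it, with sample statistic $\bar{X}$ (a random variable!); and the Sampling Distribution (of the mean), whose std. dev. is the “Standard Error”, $\sigma_{\bar{X}} = \sigma/\sqrt{n}$.]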
Shape of the Sampling
Distribution?
• If population distribution is normal:
Sampling distribution is normal (for any n)
• If sample size (n) is large:
Sampling distribution approaches normal
Central Limit Theorem
• As sample size (n) increases:
Sampling distribution becomes more normal
Variance (and thus std. dev.) decreases
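A minimal simulation sketch of the points above, assuming a skewed (exponential) population with mean 1 and std. dev. 1. The population choice, the function name `sample_mean_sd`, and the sample counts are illustrative assumptions, not part of the lecture.

```python
import random
import statistics


def sample_mean_sd(n, num_samples=2000, seed=0):
    """Std. dev. of the means of many random samples of size n.

    Assumed population: exponential with rate 1 (mean 1, std. dev. 1),
    chosen only because it is clearly non-normal.
    """
    rng = random.Random(seed)
    means = []
    for _ in range(num_samples):
        sample = [rng.expovariate(1.0) for _ in range(n)]
        means.append(statistics.fmean(sample))
    return statistics.stdev(means)


# The spread of the sample means should shrink roughly like sigma / sqrt(n).
for n in (4, 16, 64):
    print(n, round(sample_mean_sd(n), 3), "theory:", round(1 / n ** 0.5, 3))
```

Plotting a histogram of the simulated means for each n would also show the shape becoming more normal as n grows.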
Great, Normal Distributions!
• Can now calculate probabilities like: $P(\bar{X} > x)$
• Just convert values of interest to z scores (standard normal distribution):
$z = \dfrac{x - \mu}{\sigma / \sqrt{n}}$
• And then look up probabilities for that z score in
ALEKS (calculator)
• Or vice versa…
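As a rough sketch of the lookup step, Python's standard library can stand in for the ALEKS calculator. The function names and the numbers used (population mean 100, std. dev. 15, n = 25) are hypothetical, chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist


def prob_sample_mean_above(x, mu, sigma, n):
    """P(sample mean > x), assuming a normal sampling distribution."""
    z = (x - mu) / (sigma / sqrt(n))   # convert to a z score
    return 1 - NormalDist().cdf(z)     # upper-tail area of the standard normal


def z_for_upper_tail(p):
    """The 'vice versa' direction: the z score with upper-tail area p."""
    return NormalDist().inv_cdf(1 - p)


# Hypothetical example: mu = 100, sigma = 15, n = 25, so the standard error is 3.
print(prob_sample_mean_above(106, mu=100, sigma=15, n=25))  # z = 2.0
print(z_for_upper_tail(0.025))
```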
So far…
• We’ve been doing things like:
Given a certain population, what’s prob of
getting a sample statistic above/below a
certain value?
Population--->Sample
• How can we shift to …
Using our Sample to reason about the
POPULATION?
Sample--->Population
INFERENTIAL STATISTICS!
• Estimating a population parameter (e.g.,
the mean of the pop.: $\mu$)
• How to do it:
Take a random sample from the pop.
Calculate sample statistic (e.g., the mean of
the sample: $\bar{X}$)
That’s your estimate.
• Class dismissed.
No, wait!
• The sample statistic $\bar{X}$ is a point estimate of
the population parameter $\mu$.
• It could be off,
by a little, or by a lot!
[Figure: Population; Sample (size = n) with mean $\bar{X}$; Sampling Distribution (of the mean).]
We only have one
sample statistic.
And we don’t know
where in the sampling distribution it falls.
Interval Estimate
• Point estimate (sample statistic) gives us
no idea of how close we might be to the
true population parameter.
• We want to be able to specify some
interval around our point estimate that will
have a high prob. of containing the true
pop parameter.
Confidence Interval
• An interval around the sample statistic that
would capture the true population
parameter a certain percent of the time
(e.g., 95%) in the long run.
(i.e., over all samples of the same size, from
the same population)
Note: True Population
Parameter is constant!
[Figure: the sampling distribution of $\bar{X}$, with a 90% confidence interval drawn around each of several possible sample means.]
This is the mean
from one sample.
Let’s put a 90%
Confidence Interval
around it.
Note that this particular
interval captures
the true mean!
Let’s consider other possible samples
(of the SAME SIZE)
The mean
from another
possible sample.
This one captures
the true mean too.
So does this one.
This one too.
And this one.
Yep.
This interval misses
the true mean!
But this one’s alright.
A 90% Confidence
Interval means that
for 90% of all
possible samples
(of the same size),
that interval around
the sample statistic
will capture the true
population parameter
(e.g., mean).
Only sample statistics
in the outer 10% of
the sampling
distribution have
confidence intervals
that “miss” the true
population parameter.
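The long-run reading of a 90% confidence interval can be checked with a small simulation. This is only a sketch: the normal population (mean 50, std. dev. 10), the sample size, and the function name `coverage` are assumptions made for illustration.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean


def coverage(mu=50.0, sigma=10.0, n=25, conf=0.90, num_samples=2000, seed=1):
    """Fraction of samples whose confidence interval captures the true mean."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)  # critical value z_{alpha/2}
    margin = z * sigma / sqrt(n)                  # same margin for every sample
    hits = 0
    for _ in range(num_samples):
        xbar = fmean([rng.gauss(mu, sigma) for _ in range(n)])
        if xbar - margin <= mu <= xbar + margin:
            hits += 1
    return hits / num_samples


# Should land near 0.90; the intervals that miss belong to sample means
# in the outer 10% of the sampling distribution.
print(coverage())
```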
But, remember…
All that we have is our sample (size = n, mean $\bar{X}$).
Still, a Confidence Interval is more useful
in estimating the population parameter
than is a mere point estimate alone.
So, how do we make ‘em?
CONFIDENCE INTERVAL
$100(1 - \alpha)$% confidence interval for a population parameter
P( C. I. encloses true population parameter ) = $1 - \alpha$
Note: $\alpha$ = P(Confidence Interval misses true population parameter )
“Proportion of times such a CI misses the population parameter”
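Point estimate ± (critical value) · (std. dev. of point estimate)
• Point estimate: the sample statistic (ex: $\bar{X}$)
• Critical value: $z_{\alpha/2}$ or $t_{\alpha/2}$
• Std. dev. of point estimate: standard deviation of the sampling distribution (aka “Standard Error”)
• Margin of Error: (critical value) · (std. dev. of point estimate)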
Decision Tree
for Confidence Intervals
Population Standard Deviation known?
  Yes (Standard normal distribution):
    Pop. Distribution normal?
      Yes: Critical Score is a z-score
      No: n large? (CLT)
        Yes: z-score
        No: Can’t do it
  No (t distribution):
    Pop. Distribution normal?
      Yes: Critical Score is a t-score
      No: n large? (CLT)
        Yes: t-score
        No: Can’t do it
Note: ALEKS…
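The decision tree can be written as a small function. This is a sketch only: the function name and the boolean parameters are hypothetical, and what counts as "large" n is left to the caller.

```python
def critical_value_type(sigma_known, pop_normal, n_large):
    """Return "z", "t", or None ("can't do it") following the decision tree.

    sigma_known: is the population standard deviation known?
    pop_normal:  is the population distribution normal?
    n_large:     is the sample large enough for the CLT to apply?
    """
    # Without a normal population or a large sample, the sampling
    # distribution of the mean is not guaranteed to be (close to) normal.
    if not (pop_normal or n_large):
        return None
    return "z" if sigma_known else "t"


print(critical_value_type(sigma_known=True, pop_normal=False, n_large=True))
print(critical_value_type(sigma_known=False, pop_normal=True, n_large=False))
print(critical_value_type(sigma_known=False, pop_normal=False, n_large=False))
```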
C.I. using Standard Normal
Distribution
For the Population Mean
When $\sigma$ known.
First, choose an $\alpha$ level.
For ex., α=.05 gives us a 95% confidence interval.
Point estimate ± (critical value) · (std. dev. of point estimate)
$\bar{X} \pm z_{\alpha/2} \cdot \dfrac{\sigma}{\sqrt{n}}$
Margin of Error: $z_{\alpha/2} \cdot \dfrac{\sigma}{\sqrt{n}}$
Lookup value for $z_{\alpha/2}$
(ALEKS calculator,
Z tables)
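Handy Zs
(Thanks, Standard Normal Distribution!)
• if $\alpha = .10$ (90% Confidence): upper tail area $\alpha/2 = .05$, critical value $z_{\alpha/2} = 1.645$
• if $\alpha = .05$ (95% Confidence): upper tail area $\alpha/2 = .025$, critical value $z_{\alpha/2} = 1.960$
• if $\alpha = .01$ (99% Confidence): upper tail area $\alpha/2 = .005$, critical value $z_{\alpha/2} = 2.576$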
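These critical values can be reproduced without a Z table. The sketch below uses `statistics.NormalDist` from Python's standard library; the function name `z_critical` is an illustrative choice.

```python
from statistics import NormalDist


def z_critical(alpha):
    """Critical value z_{alpha/2}: the z score with upper-tail area alpha/2."""
    return NormalDist().inv_cdf(1 - alpha / 2)


for alpha in (0.10, 0.05, 0.01):
    confidence = round(100 * (1 - alpha))
    print(f"alpha={alpha:.2f}  {confidence}% confidence  z={z_critical(alpha):.3f}")
```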
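C.I. using Standard Normal
Distribution
For the Population Mean
When $\sigma$ known.
$\bar{X} \pm z_{\alpha/2} \cdot \dfrac{\sigma}{\sqrt{n}}$ is a $1 - \alpha$ confidence interval of $\mu$
(Margin of Error: $z_{\alpha/2} \cdot \dfrac{\sigma}{\sqrt{n}}$)
Remember: $\bar{X}$ is a random variable.
Furthermore, in that case,
$P\left(\bar{X} - z_{\alpha/2}\dfrac{\sigma}{\sqrt{n}} \le \mu \le \bar{X} + z_{\alpha/2}\dfrac{\sigma}{\sqrt{n}}\right) = 1 - \alpha$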
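A worked sketch of the interval for the case where the population std. dev. is known. The numbers (sample mean 100, std. dev. 15, n = 36) and the function name are made up for illustration.

```python
from math import sqrt
from statistics import NormalDist


def z_confidence_interval(xbar, sigma, n, alpha=0.05):
    """(1 - alpha) confidence interval for the population mean, sigma known."""
    z = NormalDist().inv_cdf(1 - alpha / 2)  # critical value z_{alpha/2}
    margin = z * sigma / sqrt(n)             # margin of error
    return xbar - margin, xbar + margin


# Hypothetical data: standard error = 15 / sqrt(36) = 2.5
lower, upper = z_confidence_interval(xbar=100, sigma=15, n=36, alpha=0.05)
print(round(lower, 2), round(upper, 2))
```

Calling the function again with a different sample mean shifts the interval; the population mean it is trying to capture stays fixed.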
C.I. using t Distribution
For the Population Mean
When $\sigma$ unknown!
We use the standard deviation from our sample (s)
to estimate the population std. dev. ($\sigma$).
$s = \sqrt{\dfrac{\sum_i (x_i - \bar{x})^2}{n - 1}}$
The “n-1” is an adjustment that
makes $s^2$ an unbiased estimator
of the population variance.
$\bar{X} \pm t_{\alpha/2} \cdot \dfrac{s}{\sqrt{n}}$
Margin of Error: $t_{\alpha/2} \cdot \dfrac{s}{\sqrt{n}}$
Critical value taken from a t distribution, not standard normal.
The goodness of our estimate of $\sigma$ will depend on our sample size (n).
So the exact shape of any given t distribution
depends on degrees of
freedom (which is derived from sample size: n-1, here).
Fortunately, we can still just LOOK UP the critical values…
(just need to additionally plug in degrees of freedom)
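A sketch of the t-based interval. Python's standard library has no t quantile function, so this example assumes the critical value is looked up from a t table; the few 95% entries in `T_CRIT_95` are standard table values. The data and all names are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

# Two-sided 95% critical values t_{.025}, keyed by degrees of freedom,
# copied from a standard t table (only a few entries, for illustration).
T_CRIT_95 = {4: 2.776, 9: 2.262, 14: 2.145, 19: 2.093, 29: 2.045}


def t_confidence_interval_95(sample):
    """95% confidence interval for the population mean, sigma unknown."""
    n = len(sample)
    xbar = mean(sample)
    s = stdev(sample)          # sample std. dev., uses the n - 1 divisor
    t_crit = T_CRIT_95[n - 1]  # look up by degrees of freedom = n - 1
    margin = t_crit * s / sqrt(n)
    return xbar - margin, xbar + margin


# Hypothetical sample of n = 10 measurements (so df = 9).
data = [4.1, 5.3, 4.8, 5.9, 5.0, 4.4, 5.6, 5.2, 4.7, 5.0]
print(t_confidence_interval_95(data))
```

With only 10 observations the t critical value (2.262) is noticeably larger than the corresponding z value (1.960), so the interval is wider than a z-based one would be.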
Behavior of C.I.
• As Confidence ($1-\alpha$) goes UP
Intervals get WIDER
(ex: 90% vs 99%)
• As Population Std. Dev. ($\sigma$) goes UP
Intervals get WIDER
• As Sample Size (n) goes UP
Intervals get NARROWER
Std dev of sampling
distribution of the mean: $\sigma/\sqrt{n}$
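The three behaviors follow from the width of a z-based interval, which is twice the margin of error. A quick numeric check, with a hypothetical function name and arbitrary inputs:

```python
from math import sqrt
from statistics import NormalDist


def ci_width(sigma, n, conf):
    """Full width of a z-based confidence interval: 2 * z_{alpha/2} * sigma / sqrt(n)."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    return 2 * z * sigma / sqrt(n)


print("90% vs 99%:     ", ci_width(10, 25, 0.90), ci_width(10, 25, 0.99))
print("sigma 10 vs 20: ", ci_width(10, 25, 0.95), ci_width(20, 25, 0.95))
print("n 25 vs 100:    ", ci_width(10, 25, 0.95), ci_width(10, 100, 0.95))
```

Because n sits under a square root, quadrupling the sample size only halves the width.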
C. I. for Differences
(e.g., of Population Means)
• Same approach.
• Key is:
Treat the DIFFERENCE between sample
means as a single random variable, with its
own sampling distribution & everything.
$\bar{X}_1 - \bar{X}_2$
The difference between population means is a
constant (unknown to us).
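A sketch of the same approach applied to a difference of means. The slides do not give the standard error formula; this example assumes two independent samples with known population std. devs., in which case the std. dev. of the difference is $\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}$. All names and numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def diff_confidence_interval(xbar1, xbar2, sigma1, sigma2, n1, n2, alpha=0.05):
    """(1 - alpha) CI for mu1 - mu2.

    Assumes independent samples and known population std. devs.
    (an assumption made for this sketch, not stated in the slides).
    """
    z = NormalDist().inv_cdf(1 - alpha / 2)
    diff = xbar1 - xbar2                          # the single random variable
    se = sqrt(sigma1 ** 2 / n1 + sigma2 ** 2 / n2)  # its standard error
    return diff - z * se, diff + z * se


# Hypothetical example: standard error = sqrt(225/50 + 225/50) = 3
print(diff_confidence_interval(105, 100, 15, 15, 50, 50))
```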
Remember
• Early Informal Feedback
• Special Lecture Thursday
No OH; Go to lab for Qs/help.
• Stay on target
Finish Distributions
VoDs
• Quiz 3