Psyc 235: Introduction to Statistics


Psyc 235:
Introduction to Statistics
http://www.psych.uiuc.edu/~jrfinley/p235/
DON’T FORGET TO SIGN IN FOR CREDIT!
Announcements (1of2)
• Early Informal Feedback
▪ https://webtools.uiuc.edu/formBuilder/Secure?id=9748379
▪ Open until Sat March 15th
• Special Lecture Thurs March 13th: Conditional Probability (incl. Law of Total Prob., Bayes’ Theorem)
▪ Mandatory for invited students
▪ Anyone can come
▪ No OH; Go to lab for Qs/help.
Announcements (2of2)
• Target Dates: STAY ON TARGET!
▪ You should be finishing the Distributions slice
▪ VoD “5. Normal Calculations,” “17. Binomial Distributions,” and “18. The Sample Mean and Control Charts”
• Quiz 3: Thurs-Fri March 13th-14th
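[Diagram: Population ---> Sampling Distribution (of the mean) ---> Sample]
Population: mean μ, std. dev. σ
Sampling Distribution (of the mean): mean $\mu_{\bar{X}} = \mu$, std. dev. $\sigma_{\bar{X}} = \sigma / \sqrt{n}$ (“Standard Error”)
Sample: size = n
X̄: sample statistic (a random variable!)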
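The idea that the sample mean is itself a random variable, with its own distribution, can be checked with a small simulation. This is only a sketch: the population values (mean 100, std. dev. 15), the sample size, and the function name `simulate_sample_means` are made up for illustration and do not come from the slides.

```python
import random
import statistics

def simulate_sample_means(mu, sigma, n, num_samples, seed=0):
    """Draw num_samples samples of size n from a normal population
    and return the list of their sample means."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))
        for _ in range(num_samples)
    ]

# Hypothetical population: mu = 100, sigma = 15; samples of size n = 25.
means = simulate_sample_means(mu=100, sigma=15, n=25, num_samples=5000)

# The mean of the sample means should land close to mu (100).
print("mean of sample means:", round(statistics.fmean(means), 2))
# Their std. dev. should land close to sigma / sqrt(n) = 15 / 5 = 3.
print("std. dev. of sample means:", round(statistics.stdev(means), 2))
```

Rerunning with a larger n should shrink the second number, which is the standard error at work.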
Shape of the Sampling
Distribution?
• If population distribution is normal:
 Sampling distribution is normal (for any n)
• If sample size (n) is large:
 Sampling distribution approaches normal
Central Limit Theorem
• As sample size (n) increases:
 Sampling distribution becomes more normal
 Variance (and thus std. dev.) decreases
Great, Normal Distributions!
• Can now calculate probabilities like:
• Just convert values of interest to z scores (standard normal distribution):
$z = \frac{x - \mu}{\sigma}$
• And then look up probabilities for that z score in
ALEKS (calculator)
• Or vice versa…
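As a rough illustration of the two directions (value to probability, and probability back to z), here is a sketch using Python's standard-library `NormalDist` in place of the ALEKS calculator. The population numbers and the helper name `z_score` are assumptions chosen for the example.

```python
from math import sqrt
from statistics import NormalDist

std_normal = NormalDist()  # standard normal: mean 0, std. dev. 1

def z_score(x, mean, std_dev):
    """Convert a value to a z score."""
    return (x - mean) / std_dev

# Hypothetical setup: population mu = 100, sigma = 15, sample size n = 25.
mu, sigma, n = 100, 15, 25
standard_error = sigma / sqrt(n)  # std. dev. of the sample mean: 3.0

# Probability that the sample mean comes out above 106.
z = z_score(106, mu, standard_error)
p_above = 1 - std_normal.cdf(z)
print(z, round(p_above, 4))  # 2.0 0.0228

# Vice versa: which z score cuts off the upper 5%?
z_upper_5 = std_normal.inv_cdf(0.95)
print(round(z_upper_5, 3))  # 1.645
```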

So far…
• We’ve been doing things like:
 Given a certain population, what’s prob of
getting a sample statistic above/below a
certain value?
 Population--->Sample
• How can we shift to …
 Using our Sample to reason about the
POPULATION?
 Sample--->Population
INFERENTIAL STATISTICS!
• Estimating a population parameter (e.g., the mean of the pop.: μ)
• How to do it:
▪ Take a random sample from the pop.
▪ Calculate sample statistic (e.g., the mean of the sample: X̄)
▪ That’s your estimate.
• Class dismissed.

No, wait!
• The sample statistic X̄ is a point estimate of the population parameter μ.
• It could be off, by a little, or by a lot!
[Diagram: Population, Sampling Distribution (of the mean), and one Sample of size n with its mean X̄]
We only have one
sample statistic.
And we don’t know
where in here it falls.
Interval Estimate
• Point estimate (sample statistic) gives us
no idea of how close we might be to the
true population parameter.
• We want to be able to specify some
interval around our point estimate that will
have a high prob. of containing the true
pop parameter.
Confidence Interval
• An interval around the sample statistic that
would capture the true population
parameter a certain percent of the time
(e.g., 95%) in the long run.
 (i.e., over all samples of the same size, from
the same population)
Note: True Population Parameter (μ) is constant!
This is the mean (X̄) from one sample.
Let’s put a 90% Confidence Interval around it.
Note that this particular interval captures the true mean!
Let’s consider other possible samples (of the SAME SIZE)
The mean from another possible sample.
This one captures the true mean too.
So does this one.
This one too.
And this one.
Yep.
…
This interval misses the true mean!
But this one’s alright.
A 90% Confidence
Interval means that
for 90% of all
possible samples
(of the same size),
that interval around
the sample statistic
will capture the true
population parameter
(e.g., mean).


Only sample statistics
in the outer 10% of
the sampling
distribution have
confidence intervals
that “miss” the true
population parameter.
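The "long run" reading of a 90% confidence interval can be checked by simulation. The sketch below assumes a normal population with a known std. dev., and the specific numbers and the function name `coverage_rate` are hypothetical. It builds an interval around each of many sample means and counts how often the interval captures the true mean.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean

def coverage_rate(mu, sigma, n, confidence, num_samples, seed=1):
    """Fraction of simulated samples whose interval captures the true mean."""
    rng = random.Random(seed)
    # Critical z value for the chosen confidence level.
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_crit * sigma / sqrt(n)
    hits = 0
    for _ in range(num_samples):
        x_bar = fmean(rng.gauss(mu, sigma) for _ in range(n))
        if x_bar - margin <= mu <= x_bar + margin:
            hits += 1
    return hits / num_samples

# Hypothetical population: mu = 50, sigma = 10; every sample has size n = 20.
rate = coverage_rate(mu=50, sigma=10, n=20, confidence=0.90, num_samples=10000)
print(rate)  # should come out close to 0.90
```

The true mean never moves in this simulation; only the intervals do, which matches the note that the population parameter is constant.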
But, remember…
All that we have is our sample.
Sample
size = n
X
Still, a Confidence Interval is more useful
in estimating the population parameter
than is a mere point estimate alone.
So, how do we make ‘em?
CONFIDENCE INTERVAL
100(1 − α)% confidence interval for a population parameter
P( C.I. encloses true population parameter ) = 1 − α
Note: α = P( Confidence Interval misses true population parameter )
“Proportion of times such a CI misses the population parameter”
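Margin of Error
Point estimate ± critical value · Std. dev. of point estimate
(the part after the ± is the Margin of Error)
▪ Point estimate: the sample statistic (ex: X̄)
▪ Critical value: $z_{\alpha/2}$ or $t_{\alpha/2}$
▪ Std. dev. of point estimate: standard deviation of sampling distribution (aka “Standard Error”)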
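Decision Tree
for Confidence Intervals
Population Standard Deviation known?
• Yes (Standard normal distribution)
▪ Pop. Distribution normal? Yes ---> Critical Score: z-score
▪ Pop. Distribution normal? No ---> n large? (CLT) Yes ---> Critical Score: z-score
▪ Pop. Distribution normal? No ---> n large? (CLT) No ---> Can’t do it
• No (t distribution)
▪ Pop. Distribution normal? Yes ---> Critical Score: t-score
▪ Pop. Distribution normal? No ---> n large? (CLT) Yes ---> Critical Score: t-score
▪ Pop. Distribution normal? No ---> n large? (CLT) No ---> Can’t do it
Note: ALEKS…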
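The decision tree can be written out as a small function. This is a sketch only: the function name `critical_score_type` is made up, and "n large" is passed in as a yes/no answer because the slides do not give a cutoff.

```python
def critical_score_type(sigma_known, population_normal, n_large):
    """Walk the decision tree and return which critical score to use."""
    if sigma_known:
        # Population std. dev. known: standard normal distribution branch.
        if population_normal or n_large:
            return "z-score"
        return "can't do it"
    # Population std. dev. unknown: t distribution branch.
    if population_normal or n_large:
        return "t-score"
    return "can't do it"

print(critical_score_type(sigma_known=True, population_normal=False, n_large=True))   # z-score
print(critical_score_type(sigma_known=False, population_normal=True, n_large=False))  # t-score
print(critical_score_type(sigma_known=False, population_normal=False, n_large=False)) # can't do it
```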
C.I. using Standard Normal
Distribution
For the Population Mean μ
When σ known.
First, choose an α level.
For ex., α=.05 gives us a 95% confidence interval.
General form: Point estimate ± critical value · Std. dev. of point estimate
For the mean:
$\bar{X} \pm z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}}$
Margin of Error: $z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}}$
$z_{\alpha/2}$: Lookup value (ALEKS calculator, Z tables)
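A worked example may help here. The sketch below plugs made-up numbers (sample mean 72, population std. dev. 12, n = 36) into the formula, using Python's standard-library `NormalDist` as the lookup instead of ALEKS or a Z table. The function name `z_confidence_interval` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def z_confidence_interval(x_bar, sigma, n, alpha=0.05):
    """Confidence interval for the population mean when sigma is known."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # critical value z_(alpha/2)
    margin = z_crit * sigma / sqrt(n)             # margin of error
    return x_bar - margin, x_bar + margin

# Hypothetical data: sample mean 72, known sigma 12, sample size 36.
low, high = z_confidence_interval(x_bar=72.0, sigma=12.0, n=36, alpha=0.05)
print(round(low, 2), round(high, 2))  # 68.08 75.92
```

Here the standard error is 12 / 6 = 2, so the margin of error is about 1.96 · 2 = 3.92 on either side of the sample mean.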
Handy Zs
(Thanks, Standard Normal Distribution!)
if α = .10 (90% Confidence): upper .05 critical value, $z_{\alpha/2} = 1.645$
if α = .05 (95% Confidence): upper .025 critical value, $z_{\alpha/2} = 1.960$
if α = .01 (99% Confidence): upper .005 critical value, $z_{\alpha/2} = 2.576$
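These three critical values can be reproduced rather than memorized. A minimal sketch, assuming Python's standard-library `NormalDist` as the lookup tool (the helper name `z_critical` is made up):

```python
from statistics import NormalDist

def z_critical(alpha):
    """z score that leaves alpha/2 in the upper tail of the standard normal."""
    return NormalDist().inv_cdf(1 - alpha / 2)

for alpha in (0.10, 0.05, 0.01):
    confidence = round(100 * (1 - alpha))
    print(f"alpha = {alpha:.2f}, {confidence}% confidence: z = {z_critical(alpha):.3f}")
# alpha = 0.10, 90% confidence: z = 1.645
# alpha = 0.05, 95% confidence: z = 1.960
# alpha = 0.01, 99% confidence: z = 2.576
```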

C.I. using Standard Normal
Distribution
For the Population Mean μ
When σ known.
$\bar{X} \pm z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}}$ is a $1 - \alpha$ confidence interval of μ
(the part after the ± is the Margin of Error)
Remember: X̄ is a random variable
Furthermore, in that case,
$P\left( \bar{X} - z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}} \le \mu \le \bar{X} + z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}} \right) = 1 - \alpha$
C.I. using t Distribution
For the Population Mean μ
When σ unknown!
We use the standard deviation from our sample (s) to estimate the population std. dev. (σ).
$s = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n - 1}}$
The “n-1” is an adjustment that makes s² an unbiased estimator of the population variance (σ²).
$\bar{X} \pm t_{\alpha/2} \cdot \frac{s}{\sqrt{n}}$
(the part after the ± is the Margin of Error)
Critical value taken from a t distribution, not standard normal.
The goodness of our estimate of σ will depend on our sample size (n).
So the exact shape of any given t distribution depends on degrees of freedom (which is derived from sample size: n-1, here).
Fortunately, we can still just LOOK UP the critical values…
(just need to additionally plug in degrees of freedom)
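Here is a rough worked example of the t-based interval. The ten scores are invented, the function name `t_confidence_interval` is hypothetical, and the critical value 2.262 is the standard table entry for 95% confidence with 9 degrees of freedom, typed in by hand because Python's standard library has no t-distribution lookup.

```python
from math import sqrt
import statistics

def t_confidence_interval(sample, t_crit):
    """Confidence interval for the population mean when sigma is unknown.

    t_crit is the looked-up critical value t_(alpha/2) for n - 1 degrees of freedom.
    """
    n = len(sample)
    x_bar = statistics.fmean(sample)
    s = statistics.stdev(sample)       # sample std. dev. (divides by n - 1)
    margin = t_crit * s / sqrt(n)      # margin of error
    return x_bar - margin, x_bar + margin

# Hypothetical sample of n = 10 scores, so degrees of freedom = 9.
scores = [12, 15, 11, 14, 13, 16, 12, 15, 14, 13]

# Looked-up value: upper .025 of the t distribution with 9 degrees of freedom.
T_CRIT_95_DF9 = 2.262

low, high = t_confidence_interval(scores, T_CRIT_95_DF9)
print(round(low, 2), round(high, 2))  # 12.37 14.63
```

With the same data, the z value 1.960 would give a narrower interval; the wider t interval reflects the extra uncertainty from estimating σ with s.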
Behavior of C.I.
• As Confidence (1-α) goes UP
▪ Intervals get WIDER
▪ (ex: 90% vs 99%)
• As Population Std. Dev. (σ) goes UP
▪ Intervals get WIDER
• As Sample Size (n) goes UP
▪ Intervals get NARROWER
▪ Std dev of sampling distribution of the mean: $\sigma / \sqrt{n}$
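All three behaviors follow from the width of the interval, which is twice the margin of error. A small sketch for the σ-known case, with made-up numbers and a hypothetical helper `interval_width`:

```python
from math import sqrt
from statistics import NormalDist

def interval_width(confidence, sigma, n):
    """Full width of a z-based confidence interval (2 x margin of error)."""
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return 2 * z_crit * sigma / sqrt(n)

baseline = interval_width(confidence=0.90, sigma=10, n=25)
more_confidence = interval_width(confidence=0.99, sigma=10, n=25)
more_spread = interval_width(confidence=0.90, sigma=20, n=25)
bigger_sample = interval_width(confidence=0.90, sigma=10, n=100)

print(round(baseline, 2))         # starting point
print(round(more_confidence, 2))  # wider: confidence went up
print(round(more_spread, 2))      # wider: sigma went up
print(round(bigger_sample, 2))    # narrower: n went up
```

Because n sits under a square root, quadrupling the sample size only halves the width.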
C. I. for Differences
(e.g., of Population Means)
• Same approach.
• Key is:
▪ Treat the DIFFERENCE between sample means ($\bar{X}_1 - \bar{X}_2$) as a single random variable, with its own sampling distribution & everything.
▪ The difference between population means ($\mu_1 - \mu_2$) is a constant (unknown to us).
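To show what "same approach" might look like, here is a sketch for one specific case: two independent samples with both population std. devs. known. The slides do not spell out this case, so the standard-error formula used (the square root of σ₁²/n₁ + σ₂²/n₂), the numbers, and the function name `diff_confidence_interval` are all assumptions for illustration.

```python
from math import sqrt
from statistics import NormalDist

def diff_confidence_interval(x_bar1, x_bar2, sigma1, sigma2, n1, n2, alpha=0.05):
    """z-based interval for mu1 - mu2, assuming independent samples
    and known population std. devs."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    diff = x_bar1 - x_bar2                              # point estimate
    std_error = sqrt(sigma1**2 / n1 + sigma2**2 / n2)   # std. dev. of the difference
    margin = z_crit * std_error
    return diff - margin, diff + margin

# Hypothetical groups: means 80 and 75, both sigmas 10, both sample sizes 50.
low, high = diff_confidence_interval(80, 75, 10, 10, 50, 50, alpha=0.05)
print(round(low, 2), round(high, 2))  # 1.08 8.92
```

The structure is the same as before: point estimate ± critical value · std. dev. of the point estimate.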

Remember
• Early Informal Feedback
• Special Lecture Thursday
 No OH; Go to lab for Qs/help.
• Stay on target
 Finish Distributions
 VoDs
• Quiz 3


