Social Science Reasoning Using Statistics


Statistics for the Social Sciences
Psychology 340
Spring 2010
Making estimations
PSY 340
Statistics for the
Social Sciences
Statistical analysis follows design
• Are you looking for a difference between groups?
• Are you estimating the mean (or a mean difference)?
• Are you looking for a relationship between two variables?
Estimation
• So far we’ve been dealing with situations where we know
the population mean. However, most of the time we don’t
know it.
• Two kinds of estimation (μ = ?)
– Point estimates
• A single score, e.g., “the mean is 85”
• Disadvantage: little confidence in the estimate
– Interval estimates
• A range of scores, e.g., “the mean is somewhere between 81 and 89”
• Advantage: confidence in the estimate
• Both kinds of estimates use the same basic procedure
– The formula is a variation of the test statistic formula (so far we know the z-score)
$z_{\bar{X}} = \frac{\bar{X} - \mu_{\bar{X}}}{\sigma_{\bar{X}}}$
$z_{\bar{X}}(\sigma_{\bar{X}}) = \bar{X} - \mu_{\bar{X}}$
$\mu_{\bar{X}} = \bar{X} \pm z_{\bar{X}}(\sigma_{\bar{X}})$
Why center the estimate on the sample mean $\bar{X}$?
1) It is often the only piece of evidence that we have, so it is our best guess.
2) Most sample means will be pretty close to the population mean, so we have a good chance that our sample mean is close.
Margin of error: the $z_{\bar{X}}(\sigma_{\bar{X}})$ term, made up of
1) A test statistic value (e.g., a z-score)
2) The standard error (the difference that you’d expect by chance)
– Step 1: You begin by making a reasonable estimation of what the z (or t) value should be for your estimate.
• For a point estimate, you want what? z (or t) = 0, right in the middle
• For an interval, your values will depend on how confident you want to be in your estimate
– What do I mean by “confident”?
» 90% confidence means that 90% of confidence interval estimates of this sample size will include the actual population mean
– Step 2: You take your “reasonable” estimate for your test statistic, put it into the formula, and solve for the unknown population parameter.
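The two steps can be sketched in a few lines of Python. This is only an illustration: the function name `z_estimate` and the numbers passed to it are assumptions chosen for the example, not part of the slides.

```python
import math

def z_estimate(sample_mean, sigma, n, z=0.0):
    """Return (lower, upper) for mu = sample_mean ± z * sigma / sqrt(n).

    Hypothetical helper for illustration; assumes the population sigma is known.
    """
    standard_error = sigma / math.sqrt(n)  # difference expected by chance
    margin = z * standard_error            # margin of error
    return sample_mean - margin, sample_mean + margin

# Step 1: pick z (0 for a point estimate, a larger value for an interval).
# Step 2: plug it into the formula and solve for the population mean.
point = z_estimate(85, 5, 25, z=0.0)
interval = z_estimate(85, 5, 25, z=1.96)
print("point estimate:", point)
print("interval estimate:", interval)
```

With z = 0 the lower and upper bounds coincide at the sample mean, which is why the point estimate is simply the sample mean.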
Estimates with z-scores
Make a point estimate of the population mean given a sample with $\bar{X} = 85$, n = 25, and a population σ = 5.
$\mu_{\bar{X}} = \bar{X} \pm z_{\bar{X}}(\sigma_{\bar{X}}) = 85 \pm (0)\left(\frac{5}{\sqrt{25}}\right) = 85$
So the point estimate is the sample mean
Make an interval estimate with 95% confidence of the population mean given a sample with $\bar{X} = 85$, n = 25, and a population σ = 5.
What two z-scores do 95% of the data lie between?
[Figure: normal curve with z from -2 to 2 on the horizontal axis and the middle 95% marked]
From the table: z = 1.96 leaves a proportion of .0250 (2.5%) in each tail, so 95% of the data lie between z = -1.96 and z = +1.96.
$\mu_{\bar{X}} = \bar{X} \pm z_{\bar{X}}(\sigma_{\bar{X}}) = 85 \pm (1.96)\left(\frac{5}{\sqrt{25}}\right) = 85 \pm 1.96$
So the confidence interval is: 83.04 to 86.96, or 85 ± 1.96
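The table lookup for the critical z can also be reproduced in code. The sketch below uses Python's standard-library `statistics.NormalDist`; the helper name `z_crit` is an assumption made for this example. Printed tables usually round, so the exact 90% value (about 1.645) often appears as 1.65.

```python
from statistics import NormalDist

def z_crit(confidence):
    """z value that leaves (1 - confidence) / 2 in each tail of the standard normal."""
    tail = (1.0 - confidence) / 2.0
    return NormalDist().inv_cdf(1.0 - tail)

for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence: z = ±{z_crit(level):.3f}")
```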
Make an interval estimate with 90% confidence of the population mean given a sample with $\bar{X} = 85$, n = 25, and a population σ = 5.
What two z-scores do 90% of the data lie between?
From the table: z = 1.65 leaves a proportion of about .0500 (5%) in each tail, so 90% of the data lie between z = -1.65 and z = +1.65.
$\mu_{\bar{X}} = \bar{X} \pm z_{\bar{X}}(\sigma_{\bar{X}}) = 85 \pm (1.65)\left(\frac{5}{\sqrt{25}}\right) = 85 \pm 1.65$
So the confidence interval is: 83.35 to 86.65, or 85 ± 1.65
Make an interval estimate with 90% confidence of the population mean given a sample with $\bar{X} = 85$, n = 4, and a population σ = 5.
What two z-scores do 90% of the data lie between?
From the table: z = 1.65 leaves a proportion of about .0500 (5%) in each tail.
$\mu_{\bar{X}} = \bar{X} \pm z_{\bar{X}}(\sigma_{\bar{X}}) = 85 \pm (1.65)\left(\frac{5}{\sqrt{4}}\right) = 85 \pm 4.13$
So the confidence interval is: 80.88 to 89.13, or 85 ± 4.13
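Comparing the n = 25 and n = 4 examples shows that a smaller sample widens the interval at the same confidence level. A minimal sketch of that relationship follows; the helper name `margin_of_error` and the extra sample size n = 100 are assumptions added for illustration.

```python
import math

def margin_of_error(z, sigma, n):
    """Margin of error z * sigma / sqrt(n) for a known population sigma."""
    return z * sigma / math.sqrt(n)

# Same 90% z value (1.65) and sigma = 5; only the sample size changes.
for n in (4, 25, 100):
    m = margin_of_error(1.65, 5, n)
    print(f"n = {n:3d}: 85 ± {m:.3f}")
```

Because n sits under a square root, quadrupling the sample size only halves the margin of error.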
Estimation in other designs
Estimating the mean of the population from one sample, but we don’t know the σ
Confidence interval: $\mu_{\bar{X}} = \bar{X} \pm (t_{crit})(s_{\bar{X}})$
How do we find $t_{crit}$? Use the t-table
How do we find $s_{\bar{X}}$? It is the difference expected by chance (the estimated standard error): $s_{\bar{X}} = \frac{s}{\sqrt{n}}$
Estimates with t-scores
Confidence intervals always involve ± a margin of error
This is similar to a two-tailed test, so in the t-table, always use the “proportion in two tails” heading, and select the α-level corresponding to (1 - Confidence level)
What is the $t_{crit}$ needed for a 95% confidence interval?
95% in the middle and 2.5% in each tail, so two tails with 2.5% + 2.5% = 5%, or α = 0.05; look in that column
Proportion in one tail     0.10    0.05    0.025   0.01    0.005
Proportion in two tails    0.20    0.10    0.05    0.02    0.01
df
 :                          :       :       :       :       :
 5                         1.476   2.015   2.571   3.365   4.032
 6                         1.440   1.943   2.447   3.143   3.707
 :                          :       :       :       :       :
[Figure: t distribution with the middle 95% marked and 2.5% in each tail]
Make an interval estimate with 95% confidence of the population mean given a sample with $\bar{X} = 85$, n = 25, and a sample s = 5.
What two critical t-scores do 95% of the data lie between?
df = n - 1 = 25 - 1 = 24
From the table: $t_{crit} = \pm 2.064$ (proportion in two tails = 0.05, df = 24)
Proportion in one tail     0.10    0.05    0.025   0.01    0.005
Proportion in two tails    0.20    0.10    0.05    0.02    0.01
df
 :                          :       :       :       :       :
 24                        1.318   1.711   2.064   2.492   2.797
 25                        1.316   1.708   2.060   2.485   2.787
 :                          :       :       :       :       :
$\mu_{\bar{X}} = \bar{X} \pm t_{crit}(s_{\bar{X}}) = 85 \pm (2.064)\left(\frac{5}{\sqrt{25}}\right) = 85 \pm 2.064$
So the 95% confidence interval is: 82.94 to 87.06, or 85 ± 2.064
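The t-based interval follows the same pattern as the z-based one, with the table lookup replacing the z value. The sketch below hard-codes the two-tailed α = 0.05 column from the t-table excerpts in these slides. The dictionary and function names are assumptions for illustration, and a real analysis would need the full table or a statistics library.

```python
import math

# Two-tailed alpha = 0.05 critical values, copied from the t-table excerpts
# in the slides (only these four df rows are available here).
T_CRIT_05_TWO_TAILED = {5: 2.571, 6: 2.447, 24: 2.064, 25: 2.060}

def t_interval(sample_mean, s, n, t_table=T_CRIT_05_TWO_TAILED):
    """95% confidence interval for the mean when sigma is unknown."""
    df = n - 1
    t_crit = t_table[df]           # raises KeyError if df is not in the excerpt
    est_se = s / math.sqrt(n)      # estimated standard error
    margin = t_crit * est_se
    return sample_mean - margin, sample_mean + margin

low, high = t_interval(85, 5, 25)
print(f"95% CI: {low:.2f} to {high:.2f}")
```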
Estimation in other designs
Estimating the difference between two population means based on two related samples
Confidence interval: $\mu_D = \bar{D} \pm (t_{crit})(s_{\bar{D}})$
Diff. expected by chance: $s_{\bar{D}} = \frac{s_D}{\sqrt{n_D}}$
Estimation in other designs
Estimating the difference between two population means based on two independent samples
Confidence interval: $\mu_A - \mu_B = (\bar{X}_A - \bar{X}_B) \pm (t_{crit})(s_{\bar{X}_A - \bar{X}_B})$
Diff. expected by chance: $s_{\bar{X}_A - \bar{X}_B} = \sqrt{\frac{s_P^2}{n_A} + \frac{s_P^2}{n_B}}$
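A sketch of the independent-samples interval is given below. The slides do not show how the pooled variance $s_P^2$ is computed, so the code assumes the usual definition (each sample variance weighted by its degrees of freedom). The function names and the data are hypothetical, and the critical value 2.101 is the standard two-tailed α = 0.05 entry for df = 18 rather than a value from the slides.

```python
import math

def pooled_variance(var_a, n_a, var_b, n_b):
    """Assumed definition: sample variances weighted by their degrees of freedom."""
    return ((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)

def independent_interval(mean_a, var_a, n_a, mean_b, var_b, n_b, t_crit):
    """Confidence interval for mu_A - mu_B from two independent samples."""
    sp2 = pooled_variance(var_a, n_a, var_b, n_b)
    est_se = math.sqrt(sp2 / n_a + sp2 / n_b)  # difference expected by chance
    diff = mean_a - mean_b
    margin = t_crit * est_se
    return diff - margin, diff + margin

# Hypothetical data: two groups of 10, df = 10 + 10 - 2 = 18.
low, high = independent_interval(20.0, 16.0, 10, 15.0, 16.0, 10, t_crit=2.101)
print(f"95% CI for the difference: {low:.2f} to {high:.2f}")
```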
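Estimation Summary
One sample, σ known
  Estimation: $\mu_{\bar{X}} = \bar{X} \pm z_{\bar{X}}(\sigma_{\bar{X}})$
  Standard error: $\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$
One sample, σ unknown
  Estimation: $\mu_{\bar{X}} = \bar{X} \pm (t_{crit})(s_{\bar{X}})$
  (Estimated) standard error: $s_{\bar{X}} = \frac{s}{\sqrt{n}}$
Two related samples, σ unknown
  Estimation: $\mu_D = \bar{D} \pm (t_{crit})(s_{\bar{D}})$
  (Estimated) standard error: $s_{\bar{D}} = \frac{s_D}{\sqrt{n_D}}$
Two independent samples, σ unknown
  Estimation: $\mu_A - \mu_B = (\bar{X}_A - \bar{X}_B) \pm (t_{crit})(s_{\bar{X}_A - \bar{X}_B})$
  (Estimated) standard error: $s_{\bar{X}_A - \bar{X}_B} = \sqrt{\frac{s_P^2}{n_A} + \frac{s_P^2}{n_B}}$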
Statistical analysis follows design
• Questions to answer:
• Are you looking for a difference, or estimating a mean?
• Do you know the pop. SD (σ)?
• How many samples of scores?
• How many scores per participant?
• If 2 groups of scores, are the groups independent or related?
• Are the predictions specific enough for a 1-tailed test?
Design Summary
One sample, σ known, 1 score per subject
  Test: one-sample z, $z_{\bar{X}} = \frac{\bar{X} - \mu_{\bar{X}}}{\sigma_{\bar{X}}}$
  Standard error: $\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$
  Estimation: $\mu_{\bar{X}} = \bar{X} \pm z_{\bar{X}}(\sigma_{\bar{X}})$
One sample, σ unknown, 1 score per subject
  Test: one-sample t, $t = \frac{\bar{X} - \mu_{\bar{X}}}{s_{\bar{X}}}$
  Standard error: $s_{\bar{X}} = \frac{s}{\sqrt{n}}$
  Estimation: $\mu_{\bar{X}} = \bar{X} \pm (t_{crit})(s_{\bar{X}})$
2 related samples, σ unknown, 1 score per subject (or 1 sample, 2 scores per subject, σ unknown)
  Test: related-samples t, $t = \frac{\bar{D} - \mu_D}{s_{\bar{D}}}$
  Standard error: $s_{\bar{D}} = \frac{s_D}{\sqrt{n_D}}$
  Estimation: $\mu_D = \bar{D} \pm (t_{crit})(s_{\bar{D}})$
Two independent samples, σ unknown, 1 score per subject
  Test: independent-samples t, $t = \frac{(\bar{X}_A - \bar{X}_B) - (\mu_A - \mu_B)}{s_{\bar{X}_A - \bar{X}_B}}$
  Standard error: $s_{\bar{X}_A - \bar{X}_B} = \sqrt{\frac{s_P^2}{n_A} + \frac{s_P^2}{n_B}}$
  Estimation: $\mu_A - \mu_B = (\bar{X}_A - \bar{X}_B) \pm (t_{crit})(s_{\bar{X}_A - \bar{X}_B})$
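The design questions can be read as a small decision procedure that maps the answers to one of the four analyses in the summary. A minimal sketch follows; the function name and arguments are assumptions for illustration. The 1-tailed question is left out because it changes the critical value, not the choice of statistic.

```python
def choose_analysis(sigma_known, n_samples, scores_per_participant=1, related=False):
    """Map design answers to one of the four analyses in the design summary."""
    if n_samples == 1 and scores_per_participant == 1:
        return "one-sample z" if sigma_known else "one-sample t"
    # Two scores per participant, or two related samples, are analyzed as differences.
    if (n_samples == 1 and scores_per_participant == 2) or (n_samples == 2 and related):
        return "related-samples t"
    if n_samples == 2 and not related:
        return "independent-samples t"
    raise ValueError("design not covered by the summary table")

print(choose_analysis(sigma_known=True, n_samples=1))
print(choose_analysis(sigma_known=False, n_samples=1, scores_per_participant=2))
print(choose_analysis(sigma_known=False, n_samples=2, related=False))
```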
Researchers used a sample of n = 16 adults. Each person’s mood was rated while smiling and frowning. It was predicted that moods would be rated as more positive if smiling than frowning. Results showed $M_{smile} = 7$ and $M_{frown} = 4.5$. Are the groups different?
Related samples t: $t = \frac{\bar{D} - \mu_D}{s_{\bar{D}}}$
Researcher measures reaction time for n = 36 participants. Each is then given a medicine and reaction time is measured again. For this sample, the average difference was 24 ms, with a SD of 8. With 95% confidence estimate the population mean difference.
Related samples CI: $\mu_D = \bar{D} \pm (t_{crit})(s_{\bar{D}})$
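One way to work the reaction-time problem numerically is sketched below. The critical value is an assumption: df = 36 - 1 = 35 does not appear in the table excerpts in these slides, so the code uses 2.030, the approximate two-tailed α = 0.05 value for df = 35 from a fuller t-table. A table without a df = 35 row would give a slightly different value. The function name is also hypothetical.

```python
import math

def related_interval(mean_diff, sd_diff, n_pairs, t_crit):
    """Confidence interval for the population mean difference (related samples)."""
    est_se = sd_diff / math.sqrt(n_pairs)  # s_D / sqrt(n_D)
    margin = t_crit * est_se
    return mean_diff - margin, mean_diff + margin

# t_crit = 2.030 is an assumed table value for df = 35, two tails, alpha = 0.05.
low, high = related_interval(24, 8, 36, t_crit=2.030)
print(f"{low:.2f} ms to {high:.2f} ms")
```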
A teacher is evaluating the effectiveness of a new way of presenting material to students. A sample of 16 students is presented the material in the new way and then given a test on that material; they have a mean of 87. How do they compare to past classes with a mean of 82 and SD = 3?
1 sample z: $z_{\bar{X}} = \frac{\bar{X} - \mu_{\bar{X}}}{\sigma_{\bar{X}}}$
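The one-sample z statistic for the teaching example can be computed as follows. This sketch assumes that the past classes' mean of 82 and SD of 3 are treated as the population μ and σ, which is what the choice of a one-sample z implies. The function name is hypothetical.

```python
import math

def one_sample_z(sample_mean, mu, sigma, n):
    """z = (sample mean - mu) / (sigma / sqrt(n))."""
    standard_error = sigma / math.sqrt(n)
    return (sample_mean - mu) / standard_error

z = one_sample_z(87, 82, 3, 16)
print(f"z = {z:.2f}")
```

The resulting z would then be compared with the critical value for the chosen α level and number of tails.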