Social Science Reasoning Using Statistics

Statistics for the Social Sciences
Psychology 340
Spring 2010
Probability & The Normal Distribution
PSY 340
Statistics for the
Social Sciences
Reminders
• Quiz 2 due Thursday
• Homework 3 due Tues, Feb 2
• Exam 1 Thurs Feb. 11
Basics of Probability
$\text{Probability} = \frac{\text{possible successful outcomes}}{\text{all possible outcomes}}$
• Probability
– Expected relative frequency of a particular outcome
• Outcome
– The result of an experiment
Flipping a coin example
What is the probability of getting a “heads”?
n = 1 flip
$\text{Probability} = \frac{\text{one outcome classified as heads}}{\text{total of two outcomes}} = \frac{1}{2} = 0.5$
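The single-flip ratio can be checked with a short Python sketch. The helper name `probability` and the outcome lists are illustrative assumptions, not part of the slides, and the sketch assumes a fair coin so that every outcome is equally likely.

```python
from fractions import Fraction

def probability(successful_outcomes, all_outcomes):
    """Successful outcomes divided by all possible outcomes.

    Assumes every outcome is equally likely (a fair coin here).
    """
    return Fraction(len(successful_outcomes), len(all_outcomes))

# One flip of a fair coin: two equally likely outcomes.
all_outcomes = ["H", "T"]
heads = [outcome for outcome in all_outcomes if outcome == "H"]

p_heads = probability(heads, all_outcomes)
print(p_heads, float(p_heads))  # 1/2 0.5
```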
Flipping a coin example
n = 2 flips
Outcome   Number of heads
HH        2
HT        1
TH        1
TT        0
What is the probability of getting two “heads”?
One “two heads” outcome / four total outcomes = 1/4 = 0.25
This situation is known as the binomial.
# of outcomes = $2^n$
Flipping a coin example
n = 2 flips
Outcome   Number of heads
HH        2
HT        1
TH        1
TT        0
What is the probability of getting “at least one heads”?
Three “at least one heads” outcomes / four total outcomes = 3/4 = 0.75
Flipping a coin example
n = 3 flips
$2^n = 2^3 = 8$ total outcomes
Outcome   Number of heads
HHH       3
HHT       2
HTH       2
HTT       1
THH       2
THT       1
TTH       1
TTT       0
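The eight outcomes for three flips can be enumerated rather than written out by hand. This is a minimal sketch using `itertools.product`; the variable names are illustrative, not from the slides.

```python
from itertools import product

n = 3  # number of flips

# Every sequence of H/T of length n; there are 2**n of them.
outcomes = ["".join(flips) for flips in product("HT", repeat=n)]

# Count the heads in each outcome.
heads_per_outcome = {outcome: outcome.count("H") for outcome in outcomes}

for outcome, heads in heads_per_outcome.items():
    print(outcome, heads)
print(len(outcomes), "total outcomes")
```

Changing `n` shows how quickly the number of outcomes grows, since it doubles with every extra flip.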
Flipping a coin example
Distribution of possible outcomes (n = 3 flips)
X (number of heads)   f (frequency)   p (probability)
3                     1               .125
2                     3               .375
1                     3               .375
0                     1               .125
[Figure: histogram of probability (y-axis, 0 to .4) by number of heads (x-axis, 0 to 3); bar heights .125, .375, .375, .125]
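The frequency column counts how many of the $2^n$ outcomes contain exactly X heads, which is the binomial coefficient "n choose X". The sketch below rebuilds the X, f, p table on that basis, assuming a fair coin; the name `table` is illustrative.

```python
from math import comb

n = 3              # number of flips
total = 2 ** n     # number of equally likely outcomes

# One (X, f, p) row per possible number of heads, largest first.
table = [(x, comb(n, x), comb(n, x) / total) for x in range(n, -1, -1)]

print("X  f  p")
for x, f, p in table:
    print(x, f, p)
```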
Flipping a coin example
Distribution of possible outcomes (n = 3 flips)
[Figure: histogram of probability by number of heads; bar heights .125, .375, .375, .125 for 0, 1, 2, 3 heads]
Can make predictions about likelihood of outcomes based on this distribution.
What’s the probability of flipping three heads in a row?
p = 0.125
Flipping a coin example
Distribution of possible outcomes (n = 3 flips): p = .125, .375, .375, .125 for 0, 1, 2, 3 heads
Can make predictions about likelihood of outcomes based on this distribution.
What’s the probability of flipping at least two heads in three tosses?
p = 0.375 + 0.125 = 0.50
Flipping a coin example
Distribution of possible outcomes (n = 3 flips): p = .125, .375, .375, .125 for 0, 1, 2, 3 heads
Can make predictions about likelihood of outcomes based on this distribution.
What’s the probability of flipping all heads or all tails in three tosses?
p = 0.125 + 0.125 = 0.25
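Each of these questions is answered by adding the probabilities of the bars that satisfy the condition. A small sketch of that idea follows; the helper `prob` and the lambda conditions are illustrative assumptions.

```python
# Distribution of the number of heads in n = 3 fair coin flips.
distribution = {0: 0.125, 1: 0.375, 2: 0.375, 3: 0.125}

def prob(event):
    """Sum p over every number of heads that satisfies `event`."""
    return sum(p for heads, p in distribution.items() if event(heads))

p_three_heads = prob(lambda h: h == 3)      # three heads in a row
p_at_least_two = prob(lambda h: h >= 2)     # at least two heads
p_all_same = prob(lambda h: h in (0, 3))    # all heads or all tails

print(p_three_heads, p_at_least_two, p_all_same)  # 0.125 0.5 0.25
```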
Hypothesis testing
Distribution of possible outcomes (of a particular sample size, n)
Can make predictions about likelihood of outcomes based on this distribution.
• In hypothesis testing, we compare our observed samples with the distribution of possible samples (transformed into standardized distributions)
• This distribution of possible outcomes is often Normally Distributed
The Normal Distribution
• The distribution of days before and after due date (bin width = 4 days).
[Figure: histogram of days before and after due date; x-axis from -14 to 14]
The Normal Distribution
• Normal distribution
The Normal Distribution
• Normal distribution is a commonly found distribution that is symmetrical and unimodal.
– Not all unimodal, symmetrical curves are Normal, so be careful with your descriptions
• It is defined by the following equation:
$f(X) = \frac{1}{\sqrt{2\pi\sigma^2}} \, e^{-(X-\mu)^2 / 2\sigma^2}$
[Figure: normal curve; x-axis in Z-scores from -3 to 3]
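To see what the Normal equation produces, the sketch below evaluates the density $\frac{1}{\sqrt{2\pi\sigma^2}} e^{-(X-\mu)^2/2\sigma^2}$ at whole Z-scores. The function name `normal_pdf` and the defaults of mean 0 and standard deviation 1 are assumptions made for illustration.

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Height of the Normal curve at x for mean mu and standard deviation sigma."""
    coefficient = 1.0 / math.sqrt(2.0 * math.pi * sigma ** 2)
    exponent = -((x - mu) ** 2) / (2.0 * sigma ** 2)
    return coefficient * math.exp(exponent)

# Curve height at Z = -3 ... 3: highest at the mean, symmetrical around it.
for z in range(-3, 4):
    print(z, round(normal_pdf(z), 4))
```

The printed heights fall off equally on both sides of 0, which is the symmetrical, unimodal shape described on the slide.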
Estimating Probabilities in a Normal Distribution
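Same logic as before
50%-34%-14% rule
[Figure: histogram of probability by number of heads (bar heights .125, .375, .375, .125 for 0, 1, 2, 3 heads) shown with a normal curve whose x-axis runs in Z-scores from -3 to 3, with 50% of the area marked]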
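Estimating Probabilities in a Normal Distribution
50%-34%-14% rule
• 50% of scores fall on each side of the mean
• 34.13% fall between the mean and Z = 1
• 13.59% fall between Z = 1 and Z = 2
[Figure: normal curve with these areas marked; x-axis in Z-scores from -3 to 3]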
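Estimating Probabilities in a Normal Distribution
Similar to the 68%-95%-99% rule
[Figure: normal curve, x-axis in Z-scores from -3 to 3; areas from left to right: 2.28%, 13.59%, 34.13%, 34.13%, 13.59%, 2.28%; 68% of scores fall between Z = -1 and Z = 1]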
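Estimating Probabilities in a Normal Distribution
Similar to the 68%-95%-99% rule
[Figure: normal curve, x-axis in Z-scores from -3 to 3; areas from left to right: 2.28%, 13.59%, 34.13%, 34.13%, 13.59%, 2.28%; 95% of scores fall between Z = -2 and Z = 2]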
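The percentages in these rules are areas under the standard Normal curve, so they can be recomputed. The sketch below uses `statistics.NormalDist` from the Python standard library; the helper `area_between` is an illustrative name.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard Normal: mean 0, standard deviation 1

def area_between(low_z, high_z):
    """Proportion of scores with Z between low_z and high_z."""
    return std_normal.cdf(high_z) - std_normal.cdf(low_z)

print(round(100 * area_between(0, 1), 2))        # 34.13
print(round(100 * area_between(1, 2), 2))        # 13.59
print(round(100 * (1 - std_normal.cdf(2)), 2))   # 2.28
print(round(100 * area_between(-1, 1), 1))       # 68.3
print(round(100 * area_between(-2, 2), 1))       # 95.4
```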
The Unit Normal Table
Understand your table
• The normal distribution is often transformed into z-scores.
• Gives the precise proportion of scores (in z-scores) above or below a given score in a Normal distribution
• There are many ways that this table gets organized
• Learn to understand what is in the table
• What do the numbers represent?
[Figure: table skeleton listing z from 0 through 1.00 to 2.31 and 2.32, beside a normal curve with z marked relative to 0]
The Unit Normal Table
Understand your table
Z       Prop in body   Prop in tail   Prop btwn mean and z
0.00    .5000          .5000          .0000
0.01    .5040          .4960          .0040
0.02    .5080          .4920          .0080
:       :              :              :
1.0     .8413          .1587          .3413
:       :              :              :
1.3     .9032          .0968          .4032
:       :              :              :
4.00    .99997         .00003         .49997
[Figure: normal curve marking the proportion from the left side of the distribution up to z and the proportion in the tail beyond z]
• The normal distribution is often transformed into z-scores.
– Contains the proportions of a Normal distribution
– Prop in body: proportion between the z-score and the left side of the distribution
– Prop in tail: proportion in the tail to the right of the corresponding z-score
– Prop btwn mean and z: proportion between the z-score and the mean
• Note: This means that this table lists only positive Z scores
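The three proportions in a table row are related: the tail is 1 minus the body, and the mean-to-z proportion is the body minus .5. A minimal sketch that rebuilds a few rows with `statistics.NormalDist` is below; `table_row` is a hypothetical helper name, and the cumulative distribution function stands in for the printed table.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard Normal: mean 0, standard deviation 1

def table_row(z):
    """Return (body, tail, mean_to_z) for a positive z-score."""
    body = std_normal.cdf(z)   # proportion from the left side up to z
    tail = 1.0 - body          # proportion in the tail beyond z
    mean_to_z = body - 0.5     # proportion between the mean (0) and z
    return body, tail, mean_to_z

print("Z     body    tail    mean to z")
for z in (0.00, 0.01, 0.02, 1.0, 1.3):
    body, tail, mean_to_z = table_row(z)
    print(f"{z:4.2f}  {body:.4f}  {tail:.4f}  {mean_to_z:.4f}")
```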
The Unit Normal Table
Understand your table
z      .00      .01
0      0.5000   0.4960
:      :        :
1.0    0.1587   0.1562
:      :        :
2.3    0.0107   0.0104
2.4    0.0082   0.0080
:      :        :
[Figure: normal curve with the tail beyond z shaded]
• The normal distribution is often transformed into z-scores.
– Contains the proportions in the tail to the right of corresponding z-scores of a Normal distribution
• This means that the table lists only positive Z scores
• The different columns give the second decimal place of the z-score
The unit normal table I have provided online (see ‘statistical tables’ link at top of labs)
The Unit Normal Table
Understand your table
z      Mean to Z   In tail
0      0.0000      0.5000
:      :           :
1.00   0.3413      0.1587
:      :           :
2.31   0.4896      0.0104
2.32   0.4898      0.0102
:      :           :
[Figure: normal curve marking the area from the mean to z and the tail beyond z]
• The normal distribution is often transformed into z-scores.
– Contains the proportions
– Mean to Z: proportion between the z-score and the mean
– In tail: proportion in the tail to the right of corresponding z-scores of a Normal distribution
• Note: This means that this table lists only positive Z scores
The Unit Normal Table
Understand your table
z       .00      .01
-3.4    0.0003   0.0003
-3.3    0.0005   0.0005
:       :        :
0       0.5000   0.5040
:       :        :
1.0     0.8413   0.8438
:       :        :
3.3     0.9995   0.9995
3.4     0.9997   0.9997
[Figure: normal curve with the area from the left side of the distribution up to z shaded]
• The normal distribution is often transformed into z-scores.
– Contains the proportions to the left of corresponding z-scores of a Normal distribution
• This table lists both positive and negative Z scores
Another common way the unit normal table is presented in textbooks
Using the Unit Normal Table
[Table: unit normal table with columns Z, Prop in body, Prop in tail, Prop btwn mean and z, as shown earlier]
• Steps for figuring the percentage below a particular raw or Z score:
1. Convert raw score to Z score (if necessary)
$Z = \frac{X - M}{SD}$
2. Draw normal curve, mark where the Z score falls on it, and shade in the area for which you are finding the percentage
3. Make rough estimate of shaded area’s percentage (using 50%-34%-14% rule)
Using the Unit Normal Table
[Table: unit normal table with columns Z, Prop in body, Prop in tail, Prop btwn mean and z, as shown earlier]
• Steps for figuring the percentage below a particular raw or Z score:
4. Find exact percentage using unit normal table
– Use your sketch and understanding of the table
5. Check the exact percentage is within the range of the estimate from Step 3
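The five steps can be sketched in Python as follows. This is only an illustration under stated assumptions: `statistics.NormalDist` stands in for the printed table in Step 4, the drawing in Step 2 is skipped, the helper names are hypothetical, and the example score (65 with mean 50 and SD 10) is made up rather than taken from the slides.

```python
import math
from statistics import NormalDist

# Rough percent of scores below each whole Z score, from the 50%-34%-14% rule.
ROUGH_PERCENT_BELOW = {-2: 2, -1: 16, 0: 50, 1: 84, 2: 98}

def rough_range(z):
    """Step 3: bracket the answer between the neighbouring whole Z scores."""
    lower = 0 if z < -2 else ROUGH_PERCENT_BELOW[min(2, math.floor(z))]
    upper = 100 if z > 2 else ROUGH_PERCENT_BELOW[max(-2, math.ceil(z))]
    return lower, upper

def percent_below(x, mean, sd):
    z = (x - mean) / sd                       # Step 1: raw score to Z score
    lower, upper = rough_range(z)             # Steps 2-3: rough estimate
    exact = 100 * NormalDist().cdf(z)         # Step 4: stands in for the table
    # Step 5: allow one percentage point of slack for the rounded rule.
    consistent = lower - 1 <= exact <= upper + 1
    return z, (lower, upper), exact, consistent

z, estimate, exact, ok = percent_below(65, mean=50, sd=10)
print(z, estimate, round(exact, 2), ok)
```

The rough range is deliberately wide; its job is to catch mistakes such as reading the tail column when the body column was needed.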
SAT Example problems
• The population parameters for the SAT are: μ = 500, σ = 100, and it is Normally distributed
Suppose that you got a 630 on the SAT. What percent of the people who take the SAT get your score or worse?
$z = \frac{X - \mu}{\sigma} = \frac{630 - 500}{100} = 1.3$
From the table: the proportion in the tail for z = 1.3 is .0968
That’s 9.68% above this score
So 90.32% got your score or worse
[Figure: normal curve with the area below z = 1.3 shaded; x-axis marked -2, -1, μ, 1, 2]
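The SAT answer can be double-checked without the table. The sketch below assumes the slide's parameters (μ = 500, σ = 100) and uses `statistics.NormalDist` in place of the unit normal table; the variable names are illustrative.

```python
from statistics import NormalDist

sat = NormalDist(mu=500, sigma=100)  # population parameters from the slide
score = 630

z = (score - sat.mean) / sat.stdev   # z = (X - mu) / sigma
at_or_below = sat.cdf(score)         # proportion with this score or worse
above = 1 - at_or_below              # proportion in the tail above the score

print(z)                             # 1.3
print(round(100 * at_or_below, 2))   # 90.32
print(round(100 * above, 2))         # 9.68
```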
The Normal Distribution
• You can go in the other direction too
– Steps for figuring Z scores and raw scores from percentages (or proportions):
1. Draw normal curve, shade in approximate area for the percentage (using the 50%-34%-14% rule)
2. Make rough estimate of the Z score where the shaded area starts
3. Find the exact Z score using the unit normal table
- So now you’re looking for a percentage/proportion in the body of the table, and then looking to see what z-score it corresponds to
4. Check that your Z score is similar to the rough estimate from Step 2
5. If you want to find a raw score, change it from the Z score: $X = (Z)(SD) + M$
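Going from a percentage back to a score can be sketched the same way. Here `NormalDist.inv_cdf` stands in for searching the body of the table, the helper names are hypothetical, and the SAT parameters (mean 500, SD 100) are reused from the earlier example.

```python
from statistics import NormalDist

def z_for_percent_below(percent):
    """Step 3: Z score with the given percentage of scores below it."""
    return NormalDist().inv_cdf(percent / 100)

def raw_score(z, mean, sd):
    """Step 5: change a Z score back into a raw score."""
    return mean + z * sd

z = z_for_percent_below(90.32)
print(round(z, 2))                              # 1.3
print(round(raw_score(z, mean=500, sd=100)))    # 630
```

Note that `inv_cdf` expects a proportion strictly between 0 and 1, so percentages of exactly 0 or 100 are not valid inputs.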
Testing Hypotheses
• Looking ahead:
– Core logic of hypothesis testing
• Considers the probability that the result of a study could have come about if the experimental procedure had no effect
How do we determine this?
• “Studies” typically look not at single scores, but rather samples of scores. So we need to think about the probability of getting samples with particular characteristics (means).
$\text{test statistic} = \frac{\text{observed difference}}{\text{difference expected by chance}}$
$Z = \frac{\bar{X} - \mu_{\bar{X}}}{\sigma_{\bar{X}}}$
The denominator is based on the standard error or an estimate of the standard error
• Next time:
– The distribution of sample means
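As a preview of the test statistic, here is a minimal sketch. It assumes the standard error of the mean takes the usual form σ/√n, which the slides leave for next time, and the sample values (mean 520, n = 25) are made up for illustration; only μ = 500 and σ = 100 come from the SAT example.

```python
import math

def z_for_sample_mean(sample_mean, pop_mean, pop_sd, n):
    """Observed difference divided by the difference expected by chance."""
    standard_error = pop_sd / math.sqrt(n)   # assumed form: sigma / sqrt(n)
    return (sample_mean - pop_mean) / standard_error

# Hypothetical sample: 25 test takers with a mean score of 520.
z = z_for_sample_mean(sample_mean=520, pop_mean=500, pop_sd=100, n=25)
print(z)  # 1.0
```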