Social Science Reasoning Using Statistics
Statistics for the Social Sciences
Psychology 340
Spring 2010
Probability &
The Normal Distribution
PSY 340
Statistics for the
Social Sciences
Reminders
• Quiz 2 due Thursday
• Homework 3 due Tues, Feb 2
• Exam 1 Thurs Feb. 11
Basics of Probability
$\text{Probability} = \frac{\text{possible successful outcomes}}{\text{all possible outcomes}}$
• Probability
– Expected relative frequency of a particular outcome
• Outcome
– The result of an experiment
Flipping a coin example
What is the probability of getting a “heads”?
n = 1 flip
$\text{Probability} = \frac{\text{possible successful outcomes}}{\text{all possible outcomes}} = \frac{\text{one outcome classified as heads}}{\text{total of two outcomes}} = \frac{1}{2} = 0.5$
Flipping a coin example
n = 2 flips
Outcome   Number of heads
HH        2
HT        1
TH        1
TT        0
What is the probability of getting two “heads”?
$\frac{\text{one “two heads” outcome}}{\text{four total outcomes}} = \frac{1}{4} = 0.25$
This situation is known as the binomial.
Number of outcomes = $2^n$
Flipping a coin example
n = 2 flips
Outcome   Number of heads
HH        2
HT        1
TH        1
TT        0
What is the probability of getting “at least one heads”?
$\frac{\text{three “at least one heads” outcomes}}{\text{four total outcomes}} = \frac{3}{4} = 0.75$
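The two-flip answers can be checked by listing every outcome and counting. The sketch below is only an illustration; the helper names `enumerate_flips` and `probability` are made up for this example and are not part of the course materials.

```python
from itertools import product

def enumerate_flips(n):
    """Return every equally likely sequence of n coin flips, e.g. 'HT'."""
    return ["".join(seq) for seq in product("HT", repeat=n)]

def probability(outcomes, is_success):
    """Possible successful outcomes divided by all possible outcomes."""
    successes = [o for o in outcomes if is_success(o)]
    return len(successes) / len(outcomes)

outcomes = enumerate_flips(2)  # ['HH', 'HT', 'TH', 'TT'], i.e. 2**2 outcomes
p_two_heads = probability(outcomes, lambda o: o.count("H") == 2)
p_at_least_one = probability(outcomes, lambda o: o.count("H") >= 1)

print(len(outcomes), p_two_heads, p_at_least_one)  # 4 0.25 0.75
```

Counting works here because a fair coin makes all sequences equally likely; that fairness is an assumption of the example.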
Flipping a coin example
n = 3 flips
$2^n = 2^3 = 8$ total outcomes
Outcome   Number of heads
HHH       3
HHT       2
HTH       2
HTT       1
THH       2
THT       1
TTH       1
TTT       0
Flipping a coin example
Distribution of possible outcomes (n = 3 flips)
[Figure: bar chart of probability (y-axis, .1 to .4) by number of heads (x-axis, 0 to 3); bar heights .125, .375, .375, .125]
X (number of heads)   f   p
3                     1   .125
2                     3   .375
1                     3   .375
0                     1   .125
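The frequency (f) and probability (p) columns for three flips can be rebuilt by tallying the number of heads in each of the eight outcomes. This is a minimal sketch assuming a fair coin; the variable names are chosen for the example only.

```python
from collections import Counter
from itertools import product

n = 3
outcomes = list(product("HT", repeat=n))  # 8 equally likely sequences

# f: how many sequences contain X heads
freq = Counter(seq.count("H") for seq in outcomes)

# p: relative frequency of each X, listed from 3 heads down to 0
distribution = {x: freq[x] / len(outcomes) for x in sorted(freq, reverse=True)}

for x, p in distribution.items():
    print(x, freq[x], p)
# 3 1 0.125
# 2 3 0.375
# 1 3 0.375
# 0 1 0.125
```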
Flipping a coin example
Distribution of possible outcomes (n = 3 flips)
[Figure: bar chart of probability by number of heads; .125, .375, .375, .125 for 0, 1, 2, 3 heads]
Can make predictions about likelihood of outcomes based on this distribution.
What’s the probability of flipping three heads in a row?
p = 0.125
Flipping a coin example
Distribution of possible outcomes (n = 3 flips)
[Figure: bar chart of probability by number of heads; .125, .375, .375, .125 for 0, 1, 2, 3 heads]
Can make predictions about likelihood of outcomes based on this distribution.
What’s the probability of flipping at least two heads in three tosses?
p = 0.375 + 0.125 = 0.50
Flipping a coin example
Distribution of possible outcomes (n = 3 flips)
[Figure: bar chart of probability by number of heads; .125, .375, .375, .125 for 0, 1, 2, 3 heads]
Can make predictions about likelihood of outcomes based on this distribution.
What’s the probability of flipping all heads or all tails in three tosses?
p = 0.125 + 0.125 = 0.25
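The three questions asked of the n = 3 distribution can also be answered without listing outcomes, using the standard binomial probability formula. The slides only name the binomial and do not state the formula, so treat this as a supplementary sketch; `binomial_p` is a hypothetical helper name.

```python
from math import comb

def binomial_p(k, n, p_heads=0.5):
    """Probability of exactly k heads in n independent flips."""
    return comb(n, k) * p_heads**k * (1 - p_heads) ** (n - k)

p_three_heads = binomial_p(3, 3)                      # three heads in a row
p_at_least_two = binomial_p(2, 3) + binomial_p(3, 3)  # two or three heads
p_all_same = binomial_p(3, 3) + binomial_p(0, 3)      # all heads or all tails

print(p_three_heads, p_at_least_two, p_all_same)  # 0.125 0.5 0.25
```

Adding the probabilities is valid because the events being combined (for example, exactly two heads and exactly three heads) cannot happen at the same time.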
Hypothesis testing
Distribution of possible outcomes
(of a particular sample size, n)
Can make predictions about
likelihood of outcomes based on
this distribution.
• In hypothesis testing, we
compare our observed samples
with the distribution of possible
samples (transformed into
standardized distributions)
• This distribution of possible
outcomes is often Normally
Distributed
The Normal Distribution
• The distribution of days before and after due date (bin
width = 4 days).
[Figure: histogram; x-axis: days before and after due date, with ticks at -14, 0, and 14]
The Normal Distribution
• Normal distribution
The Normal Distribution
• Normal distribution is a commonly found
distribution that is symmetrical and unimodal.
– Not all unimodal, symmetrical curves are Normal, so be careful
with your descriptions
• It is defined by the following equation:
$f(X) = \frac{1}{\sqrt{2\pi\sigma^2}} \, e^{-\frac{(X-\mu)^2}{2\sigma^2}}$
[Figure: normal curve; x-axis: Z-scores from -3 to 3]
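To see the Normal density equation in action, the sketch below evaluates the standard formula directly and compares it with Python's built-in `statistics.NormalDist`. The function name `normal_density` and the sample points are assumptions made for illustration.

```python
from math import exp, pi, sqrt
from statistics import NormalDist

def normal_density(x, mu=0.0, sigma=1.0):
    """Height of the Normal curve at x for mean mu and standard deviation sigma."""
    return (1.0 / sqrt(2.0 * pi * sigma**2)) * exp(-((x - mu) ** 2) / (2.0 * sigma**2))

# Compare the hand-written formula with the standard library at a few z-scores
for z in (-3, -2, -1, 0, 1, 2, 3):
    by_hand = normal_density(z)
    library = NormalDist(mu=0.0, sigma=1.0).pdf(z)
    print(z, round(by_hand, 4), round(library, 4))
```

The curve is highest at the mean and symmetric around it, which is why the values at z and -z match.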
Estimating Probabilities in a Normal
Distribution
Same logic as before
[Figure: the coin-flip bar chart of probability by number of heads (.125, .375, .375, .125 for 0 to 3 heads) alongside a normal curve over z-scores -3 to 3]
50%-34%-14% rule
[On the normal curve: 50% of the area is marked on one side of the mean]
Estimating Probabilities in a Normal
Distribution
50%-34%-14% rule
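[Figure: normal curve over z-scores -3 to 3, with areas labeled 50% (one side of the mean), 34.13% (between z = 0 and z = 1), and 13.59% (between z = 1 and z = 2)]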
Estimating Probabilities in a Normal
Distribution
Similar to the 68%-95%-99% rule
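[Figure: normal curve over z-scores -3 to 3; areas from left to right: 2.28%, 13.59%, 34.13%, 34.13%, 13.59%, 2.28%; the middle 68% (z = -1 to z = 1) is highlighted]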
Estimating Probabilities in a Normal
Distribution
Similar to the 68%-95%-99% rule
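[Figure: normal curve over z-scores -3 to 3; areas from left to right: 2.28%, 13.59%, 34.13%, 34.13%, 13.59%, 2.28%; the middle 95% (z = -2 to z = 2) is highlighted]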
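The percentages behind the 50%-34%-14% rule and the 68%-95% figures can be reproduced from the standard Normal cumulative distribution. The sketch below uses Python's `statistics.NormalDist`; the helper `area_between` is a name invented for this example.

```python
from statistics import NormalDist

unit_normal = NormalDist(mu=0.0, sigma=1.0)

def area_between(z_low, z_high):
    """Proportion of a Normal distribution lying between two z-scores."""
    return unit_normal.cdf(z_high) - unit_normal.cdf(z_low)

mean_to_1 = area_between(0, 1)         # about 34.13%
one_to_2 = area_between(1, 2)          # about 13.59%
beyond_2 = 1 - unit_normal.cdf(2)      # about 2.28%
within_1 = area_between(-1, 1)         # about 68%
within_2 = area_between(-2, 2)         # about 95%

for label, value in [("0 to 1", mean_to_1), ("1 to 2", one_to_2),
                     ("beyond 2", beyond_2), ("-1 to 1", within_1),
                     ("-2 to 2", within_2)]:
    print(f"{label}: {value:.2%}")
```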
The Unit Normal Table
Understand your table
[Table sketch: a z column with rows 0, :, 1.00, :, 2.31, 2.32; figure: normal curve with 0 and z marked on the axis]
• The normal distribution is often
transformed into z-scores.
• Gives the precise proportion of scores (in
z-scores) above or below a given score in
a Normal distribution
• There are many ways that this table gets
organized
• Learn to understand what is in the table
• What do the numbers represent?
The Unit Normal Table
Understand your table
Z      Prop. in body   Prop. in tail   Prop. between mean and z
0.00   .5000           .5000           .0000
0.01   .5040           .4960           .0040
0.02   .5080           .4920           .0080
:      :               :               :
1.0    .8413           .1587           .3413
:      :               :               :
1.3    .9032           .0968           .4032
:      :               :               :
4.00   .99997          .00003          .49997
[Figure: normal curve with 0 and z marked; the body is labeled “From the left side of the dist.” and the area beyond z is labeled “In tail”]
• The normal distribution is often transformed into z-scores.
– Contains the proportions of a Normal distribution
– Proportion between the z-score and left side of the distribution
– Proportion in the tail to the right of corresponding z-scores
– Proportion between the z-score and the mean
• Note: This means that this table lists only positive Z scores
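One way to check your reading of the three table columns is to recompute a row from the cumulative distribution. In this sketch, `table_row` is a hypothetical helper; it assumes a positive z, as in the table.

```python
from statistics import NormalDist

def table_row(z):
    """Return (proportion in body, proportion in tail, proportion between mean and z)
    for a positive z-score, rounded to four decimals like the printed table."""
    body = NormalDist().cdf(z)   # area from the left side of the distribution up to z
    tail = 1.0 - body            # area to the right of z
    mean_to_z = body - 0.5       # area between the mean (z = 0) and z
    return round(body, 4), round(tail, 4), round(mean_to_z, 4)

for z in (0.00, 0.01, 0.02, 1.0, 1.3):
    print(z, table_row(z))
```

Body and tail always add up to 1, and the body is always 0.5 larger than the mean-to-z proportion, so any one column is enough to recover the other two.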
The Unit Normal Table
Understand your table
z      .00      .01
0      0.5000   0.4960
:      :        :
1.0    0.1587   0.1562
:      :        :
2.3    0.0107   0.0104
2.4    0.0082   0.0080
:      :        :
[Figure: normal curve with 0 and z marked; the shaded area is labeled “In tail”]
• The normal distribution is often transformed into z-scores.
– Contains the proportions in the tail beyond the corresponding z-scores of a Normal distribution (to the right of a positive z; by symmetry, to the left of the matching negative z)
• This means that the table lists only positive Z scores
• The different columns give the second decimal place of the z-score
This is the unit normal table I have provided online (see ‘statistical tables’ link at top of labs)
The Unit Normal Table
Understand your table
z      Mean to Z   In tail
0      0.0000      0.5000
:      :           :
1.00   0.3413      0.1587
:      :           :
2.31   0.4896      0.0104
2.32   0.4898      0.0102
:      :           :
[Figure: normal curve with 0 and z marked; areas labeled “Mean to Z” and “In tail”]
• The normal distribution is often transformed into z-scores.
– Contains the proportions
– Proportion between the z-score and the mean
– Proportion in the tail beyond the corresponding z-scores of a Normal distribution (to the right of a positive z; by symmetry, to the left of the matching negative z)
• Note: This means that this table lists only positive Z scores
The Unit Normal Table
Understand your table
z      .00      .01
-3.4   0.0003   0.0003
-3.3   0.0005   0.0005
:      :        :
0      0.5000   0.5040
:      :        :
1.0    0.8413   0.8438
:      :        :
3.3    0.9995   0.9995
3.4    0.9997   0.9997
[Figure: normal curve with 0 and z marked; the shaded area is labeled “From the left side of the dist.”]
• The normal distribution is often transformed into z-scores.
– Contains the proportions to the left of corresponding z-scores of a Normal distribution
• This table lists both positive and negative Z scores
Another common way the unit normal table is presented in textbooks
Using the Unit Normal Table
[Table: the unit normal table shown earlier, with columns Z, proportion in body, proportion in tail, and proportion between mean and z]
• Steps for figuring the percentage below a particular raw or Z score:
1. Convert raw score to Z score (if necessary): $z = \frac{X - M}{SD}$
2. Draw the normal curve, mark where the Z score falls on it, and shade in the area for which you are finding the percentage
3. Make a rough estimate of the shaded area’s percentage (using the 50%-34%-14% rule)
Using the Unit Normal Table
[Table: the unit normal table shown earlier, with columns Z, proportion in body, proportion in tail, and proportion between mean and z]
• Steps for figuring the percentage below a particular raw or Z score (continued):
4. Find the exact percentage using the unit normal table
– Use your sketch and understanding of the table
5. Check that the exact percentage is within the range of the estimate from Step 3
SAT Example problems
• The population parameters for the SAT are:
μ = 500, σ = 100, and it is Normally distributed
Suppose that you got a 630 on the SAT. What percent of
the people who take the SAT get your score or worse?
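$z = \frac{X - \mu}{\sigma} = \frac{630 - 500}{100} = 1.3$
From the table: for z = 1.3, the proportion in the tail is .0968
That’s 9.68% above this score
So 90.32% got your score or worse
[Figure: normal curve with z-score axis ticks at -2, -1, 1, 2]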
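The SAT example can be double-checked in a few lines. This sketch uses Python's `statistics.NormalDist` in place of the printed table; the variable names are chosen for the example.

```python
from statistics import NormalDist

mu, sigma = 500, 100   # SAT population parameters from the example
score = 630

z = (score - mu) / sigma              # convert the raw score to a z-score
at_or_below = NormalDist().cdf(z)     # proportion in the body of the table
above = 1 - at_or_below               # proportion in the tail

print(f"z = {z:.1f}")                                 # z = 1.3
print(f"your score or worse: {at_or_below:.2%}")      # 90.32%
print(f"above your score: {above:.2%}")               # 9.68%
```

As a rough check against the 50%-34%-14% rule: z = 1.3 lies between 1 and 2, so the answer should fall between 84% and 98%, which it does.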
The Normal Distribution
• You can go in the other direction too
– Steps for figuring Z scores and raw scores from
percentages (or proportions):
1. Draw normal curve, shade in approximate area for the
percentage (using the 50%-34%-14% rule)
2. Make rough estimate of the Z score where the shaded area
starts
3. Find the exact Z score using the unit normal table
- So now you’re looking for a percentage/proportion in the body of the table,
and then looking to see what z-score it corresponds to
4. Check that your Z score is similar to the rough estimate from
Step 2
5. If you want to find a raw score, change it from the Z score
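Going from a proportion back to a score can be sketched the same way. Here `NormalDist.inv_cdf` stands in for searching the body of the table, and the raw score is recovered by reversing the z-score formula (X = μ + zσ). The SAT parameters are reused from the earlier example; the variable names are illustrative.

```python
from statistics import NormalDist

mu, sigma = 500, 100          # SAT parameters reused from the earlier example
proportion_below = 0.9032     # proportion in the body of the table

# Step 3: find the z-score whose body proportion matches
z = NormalDist().inv_cdf(proportion_below)

# Step 5: change the z-score back into a raw score
raw_score = mu + z * sigma

print(round(z, 2), round(raw_score))  # 1.3 630
```

The rough estimate from Step 2 still applies: about 90% below means the z-score must sit between 1 (84% below) and 2 (98% below).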
Testing Hypotheses
• Looking ahead:
– Core logic of hypothesis testing
• Considers the probability that the result of a study could have come about if the experimental procedure had no effect
• “Studies” typically look not at single scores, but rather samples of scores. So we need to think about the probability of getting samples with particular characteristics (means).
How do we determine this?
$\text{test statistic} = \frac{\text{observed difference}}{\text{difference expected by chance}}$
$Z = \frac{\bar{X} - \mu_{\bar{X}}}{\sigma_{\bar{X}}}$
– The difference expected by chance is based on the standard error or an estimate of the standard error
• Next time:
– The distribution of sample means