Transcript File

Research Methods & Statistics
Methods Is Like Whac-A-Mole…
a fun game for which there is no perfect solution
Whac-A-Mole
• If I’m conducting my own research, the goal is
to minimize the moles.
• If I’m considering the research of others, the
goal is to identify as many moles as I can.
Things to Keep in Mind about Statistics
Our focus should be conceptual, not computational.
Statistics are necessary to understand the meaning of a
set of numbers.
We need to demonstrate the importance of statistics
throughout the entire course, not just in the methods
unit.
Frequency Distributions
Putting scores in order adds meaning
Bar graphs (histograms) are visual
representations of frequency
distributions.
Score   Frequency   Grade   Share of class
40      4           A       45%
39      7
38      10
37      8
36      15
35      8           B       32%
34      8
33      8
32      7
31      4           C       16%
30      5
29      7
28
27                  D       5%
26      2
25      1
24      2
<24     1           F       1%
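A frequency distribution like the one above can be built from raw scores in a few lines. This is a minimal sketch in Python; the `scores` list is hypothetical and is not the class data from the slide.

```python
from collections import Counter

# Hypothetical raw scores in the order they were collected (not the slide's data)
scores = [38, 36, 40, 36, 35, 38, 36, 33, 38, 36]

# Count how often each score occurs
frequencies = Counter(scores)

# Putting the scores in order, highest first, makes the pattern visible;
# the row of # marks is a rough text version of a bar graph
for score in sorted(frequencies, reverse=True):
    count = frequencies[score]
    print(f"{score:>3} {count:>2} {'#' * count}")
```

Students can see the same unordered list turn into something meaningful once it is sorted and counted.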
What is the center of the distribution?
Measures of Central Tendency
Mode
--Most common = 4
Mean
--Arithmetic avg = 20/5 = 4
Median
--Middle score = 4
Quiz Scores
4
3
5
4
4
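The three measures for these quiz scores can be checked with a short script. This is a minimal sketch using Python's standard `statistics` module; the variable names are chosen here for illustration.

```python
import statistics

# Quiz scores from the slide
quiz_scores = [4, 3, 5, 4, 4]

mode = statistics.mode(quiz_scores)      # most common score
mean = statistics.mean(quiz_scores)      # sum of scores / number of scores = 20 / 5
median = statistics.median(quiz_scores)  # middle score once sorted: 3, 4, 4, 4, 5

print(mode, mean, median)
```

All three measures agree here because the scores are symmetric around 4, with no extreme values.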
Central Tendency: Mean vs. Median
1968 TOPPS Baseball Cards
Player             Price
Nolan Ryan         $1500
Billy Williams     $8
Luis Aparicio      $5
Harmon Killebrew   $5
Orlando Cepeda     $3.50
Maury Wills        $3.50
Jim Bunning        $3
Tony Conigliaro    $3
Tony Oliva         $3
Lou Piniella       $3
Mickey Lolich      $2.50
Elston Howard      $2.25
Jim Bouton         $2
Rocky Colavito     $2
Boog Powell        $2
Luis Tiant         $2
Tim McCarver       $1.75
Tug McGraw         $1.75
Joe Torre          $1.50
Rusty Staub        $1.25
Curt Flood         $1
With Ryan:     Median = $2.50   Mean = $74.14
Without Ryan:  Median = $2.38   Mean = $2.85
The median is a better measure of central tendency than
the mean when there are extreme scores.
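The baseball card figures can be reproduced to show how one extreme score moves the mean but barely touches the median. This is a minimal sketch using Python's standard `statistics` module; the prices are those on the slide, and the variable names are illustrative.

```python
import statistics

# Card prices in dollars from the slide; the first value is the Nolan Ryan card
prices = [1500, 8, 5, 5, 3.50, 3.50, 3, 3, 3, 3, 2.50,
          2.25, 2, 2, 2, 2, 1.75, 1.75, 1.50, 1.25, 1]

# Drop the single extreme score
without_ryan = prices[1:]

median_with = statistics.median(prices)
mean_with = statistics.mean(prices)
median_without = statistics.median(without_ryan)
mean_without = statistics.mean(without_ryan)

print(f"With Ryan:    median=${median_with:.2f}  mean=${mean_with:.2f}")
print(f"Without Ryan: median=${median_without:.2f}  mean=${mean_without:.2f}")
```

Removing one card changes the mean by more than $70 and the median by about 12 cents.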
How spread out are the data?
Measures of variation
Range
• The spread between the highest number & the lowest number.
• Only considers two numbers.
Standard deviation
Calculation Example for Standard Deviation
Punt Distance   Deviation from Mean   Deviation Squared
36              -4                    16
38              -2                    4
41              +1                    1
45              +5                    25
Sum = 160                             Sum = 46
Mean = 160/4 = 40 yds
Variance = 46/4 = 11.5
std. dev. = √Variance = √11.5 ≈ 3.4 yds
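The punt example can be worked step by step in code. This is a minimal sketch in Python. Note that the slide divides by the number of scores (4), which is the population form of the variance; a sample variance would divide by 3 instead.

```python
import math
import statistics

punt_distances = [36, 38, 41, 45]  # yards, from the slide

mean = sum(punt_distances) / len(punt_distances)     # 160 / 4 = 40
deviations = [d - mean for d in punt_distances]      # -4, -2, +1, +5
squared_devs = [dev ** 2 for dev in deviations]      # 16, 4, 1, 25
variance = sum(squared_devs) / len(punt_distances)   # 46 / 4 = 11.5
std_dev = math.sqrt(variance)                        # square root of the variance

# Python's population functions follow the same divide-by-N definition
library_std_dev = statistics.pstdev(punt_distances)

print(f"variance = {variance}, std. dev. = {std_dev:.1f} yds")
```

Showing the intermediate lists keeps the focus conceptual: the standard deviation is a typical distance of scores from the mean.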
Properties of the Normal Curve
In a large, normally distributed data set
• 68% of scores will be within 1 SD of the mean.
• 95% of scores will be within 2 SDs of the mean.
• 99.7% of scores will be within 3 SDs of the mean.
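The three percentages can be checked against the normal curve itself. This is a minimal sketch using `NormalDist` from Python's standard `statistics` module (Python 3.8 or later); the helper name `share_within` is made up for this example.

```python
from statistics import NormalDist

# Standard normal curve: mean 0, standard deviation 1
standard_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Proportion of a normal distribution that lies within k SDs of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} SD: {share_within(k):.1%}")
```

The computed shares are about 68.3%, 95.4%, and 99.7%, so the figures on the slide are rounded versions of these.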
Properties of the Normal Curve
Marilyn vos Savant: claimed IQ of 228.
Is it more meaningful to express her IQ as points
above average or as standard deviations above
average?
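To compare the two ways of expressing the score, it can be converted to standard deviations above the mean (a z-score). This is a minimal sketch in Python. The mean of 100 and SD of 15 are assumptions based on the common IQ scale; the slide does not state them, and the test behind the claimed score may have used a different scale.

```python
def z_score(score, mean, sd):
    """Number of standard deviations a score lies above (+) or below (-) the mean."""
    return (score - mean) / sd

claimed_iq = 228

# Assumed scale, not given on the slide: mean 100, SD 15
assumed_mean = 100
assumed_sd = 15

points_above = claimed_iq - assumed_mean
z = z_score(claimed_iq, assumed_mean, assumed_sd)

print(f"{points_above} points above average, or about {z:.1f} SDs above average")
```

Under these assumptions the score is more than 8 SDs above the mean, far beyond the 3 SDs that contain 99.7% of scores. That is why the SD version tells students more than the raw point difference.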
Correlation
• A measure of the strength of the relationship
between two variables.
• Can be positive or negative.
• Useful for making predictions.
• You can calculate correlations with Excel or
Google Docs.
Correlation
What does a correlation look like?
Scatterplots
Positive Correlation
Negative Correlation
Correlation
No Correlation
Correlation
How do you express a correlation numerically?
The Correlation Coefficient
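The correlation coefficient (Pearson's r) ranges from -1 to +1: the sign gives the direction of the relationship and the size gives its strength. This is a minimal sketch in Python that computes r by hand; the TV and grade numbers are hypothetical, made up only to illustrate a negative correlation.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How much x and y vary together
    co_variation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # How much each variable varies on its own
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return co_variation / sqrt(ss_x * ss_y)

# Hypothetical data: daily hours of TV and course grade for five students
hours_tv = [1, 2, 3, 4, 5]
grades = [90, 85, 80, 70, 65]

r = pearson_r(hours_tv, grades)
print(f"r = {r:.2f}")
```

In a spreadsheet, the built-in `CORREL` function gives the same result.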
Correlation
A strong correlation is not enough to establish a
cause and effect relationship.
Example: There is a correlation between TV
watching and grades.
Do you think it’s positive, or negative?
From this, what do we know about cause-and-effect?
Correlation
Even correlations that are clearly not cause-and-effect relationships can be used for prediction.
Ex: College entrance exams and freshman GPA.
Ex: Shoe size and vocabulary size in elementary
school children.
Ex: Ice cream sales and the rate of violent crimes.
Statistical Significance
• A measure of the
likelihood that a result is
caused by chance.
• In an experiment, we want that likelihood to be
low so we can conclude a cause-and-effect
relationship exists between the IV and the DV.
Statistical Significance
• Several statistics (e.g., chi square, t-test) can be
used to calculate statistical significance, but our
students don’t need to know these.
• They do need to know how to interpret the
results of these tests—the p value.
Statistical Significance
• P value is an estimate of the probability that a
result was caused by chance.
• In an experiment, it’s the likelihood that the
difference between the experimental and control
conditions as measured by the DV was caused
by chance.
• We want this difference to be caused by our
manipulation—the IV—not by chance.
Statistical Significance
• To say that the results of an experiment are
statistically significant means that there is a
small likelihood that the results were caused by
chance; that is, a high likelihood they were
caused by the IV.
• The threshold for statistical significance is no
more than a 5% likelihood the results were
caused by chance.
• We express this: p ≤ .05
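One way to make "caused by chance" concrete is to reshuffle the scores between groups and count how often chance alone produces a difference as large as the one observed. This is a minimal sketch in Python of an exact permutation test. That is a different technique from the chi square and t-test named above, chosen here because it needs no formulas. The scores are hypothetical.

```python
from itertools import combinations

def permutation_p_value(group_a, group_b):
    """Exact two-sided permutation test on the difference between group means."""
    pooled = group_a + group_b
    n_a = len(group_a)
    observed = abs(sum(group_a) / n_a - sum(group_b) / len(group_b))

    as_extreme = 0
    total = 0
    # Try every possible way of relabeling the pooled scores into two groups
    for chosen in combinations(range(len(pooled)), n_a):
        a = [pooled[i] for i in chosen]
        b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        diff = abs(sum(a) / len(a) - sum(b) / len(b))
        # Small tolerance guards against floating-point rounding
        if diff >= observed - 1e-12:
            as_extreme += 1
        total += 1
    return as_extreme / total

# Hypothetical DV scores for the two conditions
control = [70, 72, 68, 75, 71]
experimental = [78, 82, 80, 85, 79]

p = permutation_p_value(experimental, control)
print(f"p = {p:.4f}; statistically significant at .05: {p <= 0.05}")
```

With these made-up scores, only 2 of the 252 possible relabelings produce a difference this large, so p falls well below .05.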
Experimentation
The purpose of an experiment is to establish a
cause-and-effect relationship.
Experiments are the only research method that
can establish cause-and-effect.
Experimental Design Terms
• Hypothesis
• Operational definitions
• Participant selection
• IV & DV
• Experimental & control groups
• Confounding variables
• Random assignment
• Placebo control
• Double blind procedure
• Statistical significance (p value)
• Replication
Importance of Operational Definitions
Students are more likely to smile for their senior
pictures if they have a friendly photographer.
IV? Photographer friendliness
DV? Smiling
Operational definitions are needed for both of these
variables. To illustrate the importance of this, have
students determine how many of the students on the
following slide are smiling.
How many smiles?
Importance of Operational Definitions
If we want our students to be critical consumers
of research, we need to teach them to always
ask how research variables were operationalized
(“What do they mean by ‘best school’?”).
Research cannot be replicated without
operational definitions.
Replication
Research results that haven’t been replicated
are termed “preliminary.”
We cannot conclude cause-and-effect from
preliminary results because the p value can’t be
reduced to 0.
Random Sampling vs. Random Assignment
Random Sampling
• To select participants from population
without bias
• Allows you to generalize results from the
research to the population
Random Assignment
• To divide participants into groups
• Controls confounding variables—an
experimental requirement to “balance” the
effects of individual differences
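The two procedures answer different questions: who gets into the study, and which group each participant ends up in. This is a minimal sketch in Python using the standard `random` module. The population of 1,000 student ID numbers, the sample of 40, and the fixed seed are all hypothetical choices for illustration.

```python
import random

rng = random.Random(42)  # fixed seed so the sketch gives the same result each run

# Hypothetical population: 1,000 student ID numbers
population = list(range(1, 1001))

# Random sampling: every member of the population has an equal chance of selection
participants = rng.sample(population, 40)

# Random assignment: shuffle the participants, then split them into two groups
shuffled = participants[:]
rng.shuffle(shuffled)
experimental_group = shuffled[:20]
control_group = shuffled[20:]

print(len(experimental_group), len(control_group))
```

A study can use one procedure without the other. Random assignment of a convenience sample still supports a cause-and-effect conclusion, but the results may not generalize to the population.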
Importance of Replication
In 1970, Linus Pauling conducted a famous
experiment indicating that vitamin C prevents
colds. Over a dozen replication attempts failed.
IV: Expt. Gp. = Vit C; Cntrl. Gp. = Placebo
DV: Expt. Gp. = 45% fewer colds
Overconfidence
Have students answer the questions on the following
slide by writing a small number and a large number such
that they are at least 98% certain that the correct answer
is in between. Provide answers and by show of hands
determine how many students get each wrong. It’s
usually about half—far more than the 2% you’d expect if
they were 98% certain of their correctness.
The obvious best strategy is to answer with extreme
numbers, but few students do because of
overconfidence. They are too certain that they can
narrow it down.
98% Certainty
1. The area of the US in square miles?
2. The population of Australia in 2007?
3. American battle deaths in the Spanish-American
War?
4. Female psychiatrists in the US in 2005?
5. Operating nuclear plants worldwide in 2007?
98% Certainty
1. Area of US: 3.6 million sq. miles
2. Australian pop.: 20.4 million
3. Battle deaths: 385
4. Female psychiatrists: 13,079
5. Nuclear plants: 435
Difficulties with Surveys
Have one side of your class put their heads down while
the other side writes answers to the two questions on
the next slide. Then, have the other side answer the
questions on the second slide.
Students who are primed with the 500-mile question
give much smaller estimates of the length of the river
than students who are primed with the 3,000-mile
question.
Students know that the wording of a question can
influence the answers to that question, but they are
shocked to learn that the way a first question is worded
can influence answers to a second question.
Survey Demonstration
1. Is the Mississippi River longer or shorter
than 500 miles?
2. How many miles long is it?
Survey Demonstration
1. Is the Mississippi River longer or shorter
than 3,000 miles?
2. How many miles long is it?