Research Design and Analysis
RESEARCH DESIGN AND ANALYSIS
Jan B. Engelmann, Ph.D.
Department of Psychiatry and Behavioral Sciences
Emory University School of Medicine
Contact: [email protected]
A note on the course
Your grade will be composed of:
Participation in classroom discussions and contribution to the experimental design for our experiment (20%).
Quizzes (20%). There will be two of those, one each on Tuesday and Wednesday.
A very brief 3-page paper on the experiment we are going to conduct in class. The page limit applies to text only; you can add as many pages as you like for figures.
Course webpage:
http://web.me.com/jan.engelmann/jbe/MBRS-RISE.html
Let’s jump right in: an experiment
Does the effect of drugs of abuse depend on context?
Morphine is commonly used for pain treatment.
Repeated administration causes tolerance.
In animals, the analgesic effects of drugs can be tested using the plantar test:
Application of a heat source to the paw.
Measurement of the rat’s reaction time until paw withdrawal.
Reaction time is slowed after morphine administration.
∆ = RTdrug – RTnodrug = analgesic effect of the drug.
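The ∆RT measure can be computed directly from paired withdrawal latencies. A minimal sketch with hypothetical values (none of these numbers come from a real experiment):

```python
# Analgesic effect as the change in paw-withdrawal latency:
# delta = RT(drug) - RT(no drug). Positive values mean slower
# withdrawal, i.e. a stronger analgesic effect.

def analgesic_effect(rt_drug, rt_nodrug):
    """Return per-animal delta RT (seconds) for paired measurements."""
    return [d - n for d, n in zip(rt_drug, rt_nodrug)]

# Hypothetical latencies (seconds) for three rats.
rt_nodrug = [1.8, 2.1, 1.9]   # baseline, no drug
rt_drug = [5.6, 6.0, 5.2]     # after morphine

delta = analgesic_effect(rt_drug, rt_nodrug)
print(delta)
```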
The effects of acute morphine
[Figure: Seconds till paw withdrawal after the first administration of drug, for saline, morphine 3 mg/kg, and morphine 6 mg/kg. Hypothetical data.]
The effects of chronic morphine
[Figure: Seconds till paw withdrawal after 2 weeks of daily morphine, for saline, morphine 3 mg/kg, and morphine 6 mg/kg. Hypothetical data.]
Why does tolerance develop?
One factor important in the development of tolerance is
context (Siegel, 1975).
Cues associated with morphine administration
(conditioned stimuli or CS) elicit a compensatory
response that counteracts drug effects.
Associative tolerance is a compensatory response.
Key brain structure: amygdala.
Hypothesis: If an animal that has developed tolerance
to morphine in one context receives a dose of morphine
in an unfamiliar context, the effects of morphine should
be amplified.
Increased paw withdrawal latencies in novel compared to
familiar context.
Associative tolerance
Mitchell, Nature Neuroscience, 2000
Why is this important?
Based on these findings a theory of drug overdose
was developed:
Heroin is a common drug of abuse in humans.
Heroin is a derivative of morphine (both are opioids).
Heroin addicts show tolerance to the drug’s effects and
compensate by administering larger doses.
Associative compensatory brain mechanisms at work.
Most heroin overdoses occur in novel contexts.
This is when the “standard” dose becomes fatal.
The Scientific Method
Identify a problem / a question we want to answer.
Is the current flu epidemic different from the typical flu?
Does smoking lead to cancer?
Does studying lead to better grades?
Are brains of homosexuals different from heterosexuals?
Formulate a hypothesis.
What is a hypothesis?
Collect and analyze empirical data to test our hypothesis.
How would we go about doing that?
That depends on the question …
Hypotheses
A scientific hypothesis is a testable and falsifiable question or prediction that is designed to explain a phenomenon.
It is the starting point for research design.
Scientific hypotheses must be testable and falsifiable:
Infants prefer beautiful faces over average faces.
Drinking alcohol / smoking marijuana impairs driving.
The dopamine system is involved in reward processing.
Some counter examples
Untestable hypotheses:
There are many other parallel universes with which we cannot have any contact.
75 million years ago, Xenu, an alien ruler of the Galactic Confederacy, brought billions of people to earth in a spacecraft.
How about this one?
http://www.youtube.com/watch?v=cc_wjp262RY
Scientific Theory
A collection of related facts that were derived from
hypothesis testing using the scientific method.
Usually, evidence collected from a number of experiments leads to the development of a theory.
E.g. associative morphine tolerance.
A theory forms a coherent explanation for a larger phenomenon.
E.g. the emotional and the cognitive brain (Joseph LeDoux).
Inductive vs. Deductive Reasoning
Inductive reasoning:
Generalizing from a few observations in the development of theory.
This process requires empirical data.
Typically employed in psychology and the behavioral and social sciences, which lack unified theories.
Deductive reasoning:
The use of existing theories to make predictions about how an unknown phenomenon is likely to operate.
This can then be tested using empirical methods.
Typically employed in the natural sciences, e.g. physics.
A scientific way to reason inductively
Statistics:
1. A piece of information that is presented in numerical form.
A statistic: the mean age of women in this class.
2. A set of procedures and rules for managing and analyzing data; also known as data analysis.
3. Arithmetic or algebraic manipulations applied to data, e.g. the mean.
Importantly:
Statistical methods are tightly related to the questions we
ask and the experimental methods we use.
To understand that we need to review some concepts …
Variables
A variable is a factor that can be measured and whose
value can change, e.g. from person to person.
Contrast: a constant is a number whose value does not change, e.g. π = 3.1416.
But we also manipulate variables when we design and
conduct experiments.
The variables we manipulate are called independent
variables:
What variables were manipulated in the morphine tolerance
experiments?
Our treatment conditions are the different levels of the
independent variable.
E.g. different levels of a drug, different mood manipulations,
different amounts of money paid.
Independent vs. Dependent Variables
When we analyze an experiment, we work with
variables we recorded.
These are called dependent variables:
They constitute our data: what we record/observe.
E.g. reaction time, errors on a task, scores on a test, etc.
We can then investigate the effects of the independent variable on the dependent variable:
The effect of our treatment on the variable/behavior of interest.
Variables
There are different types of variables that experimenters work with:
Continuous variables: interval/ratio scales.
Categorical variables: nominal scale.
[Figure: Scatter plot of GPA and mean DA efflux against mean hours of studying/day, with fitted line y = 0.3037x + 1.8757, R² = 0.6736.]
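The fitted line in the scatter plot (y = 0.3037x + 1.8757) is an ordinary least-squares regression. A minimal sketch of how such a slope and intercept are computed, using made-up hours/GPA values rather than the actual data behind the figure:

```python
def least_squares(xs, ys):
    """Ordinary least-squares fit y = a*x + b.

    slope a = cov(x, y) / var(x); intercept b = mean(y) - a * mean(x).
    """
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var = sum((x - mx) ** 2 for x in xs)
    a = cov / var
    b = my - a * mx
    return a, b

# Hypothetical data: hours of studying per day vs. GPA.
hours = [0, 1, 2, 3, 4, 5, 6]
gpa = [1.9, 2.3, 2.4, 2.9, 3.0, 3.4, 3.7]

slope, intercept = least_squares(hours, gpa)
print(f"y = {slope:.4f}x + {intercept:.4f}")
```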
7
Scales of measurement
Data can be qualitative and quantitative.
Qualitative = descriptive.
Quantitative = numerical.
Nominal scales:
Qualitative differences are expressed as numbers.
Recoding a qualitative variable into numeric values allows summary information in statistical software.
These values are quantitatively meaningless; they are simply a means to distinguish between categories.
E.g. females = 1, males = 0.
Others: race, ethnicity, religion, etc.
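The recoding step can be sketched in a few lines; the labels and codes are just the slide's example:

```python
# Nominal scale: numbers only label categories, they carry no magnitude.
codes = {"female": 1, "male": 0}

participants = ["female", "male", "female", "female"]
recoded = [codes[p] for p in participants]
print(recoded)  # [1, 0, 1, 1]

# The mean of these codes happens to give the proportion of females,
# but statements like "female > male" are meaningless on this scale.
```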
Scales of measurement
Ordinal scales:
Rank or order observations based on whether they are greater than or less than other observations.
No information about the distance between data points is provided.
E.g. Phelps ranked first in the 200m freestyle at the 2008 Olympics in Beijing.
And many other events…
Improvement over nominal scales: we can identify whether a data point is > or < another data point.
Scales of measurement
Quantitative scales - Interval and Ratio scales:
Most precise measurements, as the exact distance between 2 data points can
be quantified.
Interval scales do not have a true zero point.
E.g. Temperature: temperatures of –x degrees Fahrenheit are still meaningful.
However, the absence of a zero point does not allow us to talk about ratios,
e.g. some observation being 4 times greater than another.
30 degrees is not half as hot as 60 degrees.
Someone with a BDI score of 15 is not half as depressed as someone with a score of 30.
Ratio scales do have a true zero point.
Zero point indicates a true absence of information.
0 miles/hour means there is no movement.
This allows researchers to use ratios to describe the relationship between 2
data points.
120 miles/hour is twice as fast as 60 miles/hour.
300 pounds is twice as heavy as 150 pounds.
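The interval/ratio distinction can be checked numerically. A small sketch: speed (ratio scale) supports ratio statements, while Fahrenheit (interval scale) does not, because the ratio changes when the arbitrary zero point changes:

```python
# Ratio scale: a true zero point, so ratios are meaningful.
speed_ratio = 120 / 60
print(speed_ratio)  # 120 mph really is twice as fast as 60 mph

# Interval scale: no true zero, so ratios depend on the arbitrary unit.
def f_to_c(f):
    """Convert degrees Fahrenheit to degrees Celsius."""
    return (f - 32) * 5 / 9

ratio_f = 60 / 30                  # "twice as hot" in Fahrenheit ...
ratio_c = f_to_c(60) / f_to_c(30)  # ... but a different ratio in Celsius
print(ratio_f, ratio_c)
```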
PART II
Basic concepts in experimental design
and analysis
Statistics and data analysis are only tools, a means to
answer our questions about the world.
Experimental design goes hand in hand with statistics.
If we miss important factors during the design process of our experiment:
E.g. a confound: an uncontrolled variable that systematically and unknowingly affects our data and prevents a clear interpretation.
We may be able to control for it statistically, but
that is never as good as controlling for it at the outset of the experiment.
So, what exactly is an experiment?
Experiments
An experiment introduces intentional change into
some situation so that reactions to this change can
be systematically measured.
As experimenters we manipulate variables of
interest to see whether our manipulation has any
effect on behavior.
E.g. does the administration of a drug (Ritalin) cause changes in our ability to perform on exams?
This could be studied empirically – can you tell me how?
Populations and samples
First, we need participants.
These need to be randomly sampled from our population of interest.
What are samples, and what is a population?
Why random sampling?
Then we need to decide on our experimental design:
Within vs. between subjects design.
In a between subjects design we assign half of our participants to each treatment condition:
Treatment level 1 is our control condition: administration of a placebo.
Treatment level 2 is our experimental condition: administration of Ritalin.
Then we measure the behavior of interest (performance on an exam).
Population
A population is a complete set of data possessing
some observable characteristic.
Developmental psychologists may study populations of children 5 years or younger.
Gerontologists may study populations of older adults ages 70 and above.
Addiction researchers may study cocaine addicts.
Clinical psychologists may study people with anxiety disorders or depression.
Population refers to the data points produced by
these groups.
Samples
In research, we rely on samples to say something
about the larger population.
[Figure: Samples drawn from a population of water droplets.]
A sample is a subset of the population bearing the same
characteristic as the population of interest.
We need to collect data from a sample, because it is often not
feasible to test the entire population.
The sample therefore has to be representative of the population.
This allows us to draw conclusions about the population based on
our experimental results.
But, how do we know that we obtained a representative
sample?
Statistical procedures help us answer whether a sample is
representative of a population.
Do sample parameters reflect population parameters?
Population parameters and sample statistics
Population parameters are values summarizing a
measurable characteristic of a population.
The characteristic of interest for our experiment.
E.g. average size of all water droplets.
They are constants.
Sample statistics are used to estimate population
parameters:
A summary value based on some measurable characteristic
of the sample.
Values of sample statistics can vary from sample to sample.
E.g. Repeatedly conducting the same experiment with different
participants will lead to different results.
Sample statistics can vary
[Figure: Several samples drawn from the same population yield sample statistics 1, 2, and 3, which vary around the single population parameter.]
Sample Stat 1 ≠ Sample Stat 2 ≠ Sample Stat 3
Sampling error
This means that using a sample statistic, instead of the
population parameter introduces error.
Sampling error is the difference between our sample
statistic and the true population parameter.
Our measurements are somewhat imprecise.
There are experimental methods to reduce/ minimize
sampling error.
1. Use the biggest samples you can possibly get in your studies.
Larger samples are naturally more representative of the population
and exhibit smaller sampling error.
Simple random sampling
2. Randomly sampling from the population reduces
sampling error and creates a more representative
sample.
Simple random sampling from a population is a
process that gives each member of the population the
same opportunity (an equal chance) of being part of
the subset included in our experiment.
An example: IQ.
Would it be valid to estimate the average IQ for the entire US population from a sample of college students?
It would be highly biased.
Random sampling example 2
We want to estimate the average level of sexual
activity of a population of high school students at
high school X.
We go into the gym and happen to run into a ninth grade gym class.
We survey the entire class for their level of sexual activity.
Is this valid?
The data would greatly underestimate the average
value that would be expected from a truly random
sample.
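Two earlier points, that random sampling creates representative samples and that larger samples show smaller sampling error, can be illustrated with a short simulation. This is a sketch with a made-up population of IQ-like scores, not real data:

```python
import random
import statistics

random.seed(42)  # for reproducibility

# A made-up "population" of 10,000 IQ-like scores (mean 100, SD 15).
population = [random.gauss(100, 15) for _ in range(10_000)]
mu = statistics.mean(population)  # the population parameter

# Simple random sampling: random.sample gives every member of the
# population an equal chance of being included in the subset.
for n in (10, 100, 1000):
    sample = random.sample(population, n)
    error = abs(statistics.mean(sample) - mu)  # sampling error
    print(f"n = {n:4d}, sampling error = {error:.2f}")
```

On any single run a small sample can get lucky, but over many repetitions the error reliably shrinks as n grows.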
A side note
Simple random sampling is difficult to achieve in
reality.
The majority of published studies in psychology and
related fields relies on 18-22 year old college
students.
What is more, most participants are drawn from introductory psychology classes with a research requirement.
What does this mean?
A convenience sample.
Researchers relying on convenience samples use a
procedure called random assignment to establish
equivalent groups before the experiment.
Random assignment
Between subjects experiments have groups:
Group A = control.
Group B = experimental treatment.
We have 40 participants that want to take part in our
study (20 males and 20 females).
We want to have an equal number of females in each
group.
We want participants to be assigned to each group at
random.
This substantially reduces the possibility that our groups differ in a characteristic that could influence our dependent measure.
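The assignment scheme above can be sketched as follows; the participant labels are hypothetical, and shuffling within each sex separately guarantees 10 females and 10 males per group:

```python
import random

random.seed(0)  # for reproducibility

males = [f"M{i}" for i in range(20)]
females = [f"F{i}" for i in range(20)]

def split_randomly(stratum):
    """Shuffle one stratum and split it in half between the two groups."""
    shuffled = stratum[:]
    random.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

m_a, m_b = split_randomly(males)
f_a, f_b = split_randomly(females)

group_a = m_a + f_a  # Group A, control: placebo
group_b = m_b + f_b  # Group B, experimental: Ritalin

print(len(group_a), len(group_b))  # 20 participants per group
```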
Descriptive and Inferential Statistics
Descriptive Statistics:
Describing a set of data / a sample. What do the data say?
Mean and variability.
Mean length of time taken to withdraw paw from heat
source.
Variability of change in PWL after morphine administration
across animals.
GPA, Crime rates, drug use, etc.
Inferential Statistics:
Inferring characteristics of populations from those of samples
based on comparisons of descriptive statistics using
probability theory.
The likelihood of our observations given that our Null Hypothesis is
true.
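The descriptive side can be sketched with Python's standard statistics module; the paw-withdrawal latencies below are hypothetical:

```python
import statistics

# Hypothetical paw-withdrawal latencies (seconds) after morphine.
latencies = [5.2, 5.9, 4.8, 6.1, 5.5, 5.0]

mean = statistics.mean(latencies)   # central tendency
sd = statistics.stdev(latencies)    # variability across animals

print(f"mean = {mean:.2f} s, sd = {sd:.2f} s")
```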
Operational definitions
Operational definitions render hypothetical variables
into concrete operations that can be manipulated or
measured empirically.
Goal: to make concepts and terms used in research
more objective and quantifiable.
Example operational definitions of dependent
variables commonly used in memory research:
Memory recall is typically defined as the ability to freely produce items previously learned;
Recognition is the ability to distinguish old from new
items.
Reliability
Reliability refers to a measure’s consistency across
time.
For instance:
If you administered a personality questionnaire to the same person twice, with a gap of 1 month in between, would you expect to get the same score?
No, but if your inventory is reliable, scores should not vary much.
Any measure will introduce some measurement error
(∆ = true score – observed score),
but a reliable one will show less error.
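Test-retest reliability is commonly quantified as the correlation between the two administrations. A minimal sketch with made-up questionnaire scores (this is just the Pearson correlation formula, not any particular inventory's scoring):

```python
import statistics

def pearson_r(xs, ys):
    """Pearson correlation between two sets of scores."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# Hypothetical scores for 6 people, tested one month apart.
time1 = [12, 18, 25, 9, 30, 21]
time2 = [14, 17, 23, 11, 28, 22]

r = pearson_r(time1, time2)
print(f"test-retest reliability r = {r:.2f}")
```

Values of r close to 1 indicate a reliable measure; scores that jump around between administrations drag r down.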
Validity
Does a measure actually measure what it is supposed to?
Validity is the degree to which an observation / a
measurement corresponds to the construct that was
supposed to be observed.
Some example topics from recent neuroscience papers:
Love, fairness, altruism, dread.
How would you measure these?
There are various types of validity that are of concern to
researchers:
Construct validity;
Convergent validity;
Discriminant validity;
Internal validity;
External validity.
Construct validity
Construct validity examines how well a variable’s
operational definition reflects the actual nature and
meaning of the theoretical variable.
Intelligence:
There are many aspects of intelligence. Do intelligence tests measure intelligence correctly?
Verbal comprehension, math skills, pattern analysis, memory.
Intelligence should be a reflection of how well we can function in our environment.
Emotional intelligence, social intelligence, the ability to adapt to novel situations.
Convergent and discriminant validity
Convergent validity relates a novel measure to
already established measures.
Correlation with already established measures is an indication that it measures a similar aspect of human behavior.
Discriminant validity is a reflection of how unrelated
a measure is to other measures.
Some measures are expected to be unrelated to other measures.
E.g. would you expect intelligence to be related to
depression, aggression, or anxiety?
Internal and external validity
Internal validity is the degree to which the effect of
the independent variable on the dependent variable
is unambiguous and not influenced by other factors.
E.g. confounding variables that systematically vary with our measure.
Ice cream consumption and murder rate are highly correlated.
Both increase in hot weather.
External validity is the degree to which research
findings can be generalized to other people, other
places, other times, etc.
Do studies conducted in the US generalize to other cultures?
ETHICS
Ethics – just a quick note
Ethical treatment of participants.
Our primary concern is the safety of research participants.
We do this by examining the risks and benefits associated with participation.
Informed consent. We need to inform our participants of all the procedures that they will undergo when participating in our experiments.
The participant needs to have time to ask questions and understand all the risks and benefits involved in participation.
Participants need to be informed of the right to withdraw at any time.
Ensure privacy and confidentiality of the data we collect.
MRI data, measures of depression, IQ, etc.
LAB SECTION
Your creativity is needed now!
Potential research topics for this class
1. The effect of mood induction on memory.
2. The effect of personality on the efficacy of mood
induction techniques.
Mood induction:
Inducing a mood is possible via various methods:
E.g. viewing sad vs. happy movie clips.
Recall of sad vs. happy autobiographical memory episodes.
1. Memory.
Test memory recall of sad vs. happy items.
Hypothesis: People in sad mood recall sad items better than
happy items and vice versa.
2. Personality.
Are some people more resilient to mood induction than others?