Exploring Relationships - Gail Johnson's Research Demystified
Data Analysis: Exploring
Relationships
Research Methods for Public
Administrators
Dr. Gail Johnson
Dr. G. Johnson,
www.ResearchDemystified.org
Relationships
The most interesting data analysis often involves
looking at relationships.
Do children who attend a pre-school program have
higher academic performance than children who do
not?
Do some neighborhoods in the city have less crime than
others?
Is there a relationship between the wealth of a
neighborhood and satisfaction with city services?
Does sunspot activity affect stock market activity?
Does human activity cause climate change?
NASA: Charts Temperature
Anomalies Over Time
Source: http://www.nasa.gov/topics/earth/features/temp-analysis-2009.html
NASA: Temperature
Anomalies Over Time
This is a time series analysis.
The program calculates trends in temperature
anomalies: not absolute temperatures, but
changes relative to the average temperature for the
same month during the period 1951-1980.
Find this article at:
http://www.nasa.gov/topics/earth/features/temp-analysis-2009.html
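A minimal sketch of what "anomaly relative to a baseline" means in practice. This is not NASA's program: the function names and the January readings below are hypothetical, invented only to show the arithmetic of comparing each observation with the 1951-1980 average for the same month.

```python
from collections import defaultdict

def monthly_baseline(records, start=1951, end=1980):
    """Average temperature for each calendar month over the baseline years."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for year, month, temp in records:
        if start <= year <= end:
            sums[month] += temp
            counts[month] += 1
    return {month: sums[month] / counts[month] for month in sums}

def anomalies(records, baseline):
    """Each observation minus the baseline average for the same month."""
    return [
        (year, month, temp - baseline[month])
        for year, month, temp in records
        if month in baseline
    ]

# Hypothetical January readings in degrees C (illustrative values only).
records = [(1951, 1, 10.0), (1965, 1, 10.5), (1980, 1, 9.5), (2009, 1, 10.75)]
baseline = monthly_baseline(records)
result = anomalies(records, baseline)
print(result[-1])  # the 2009 reading expressed as a change from the baseline
```

The point of working with anomalies is that a reading is judged against what is normal for that month, so a warm January is not confused with an ordinary July.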
NASA: Temperature
Anomalies Over Time
NASA article states: “Climate scientists
agree that rising levels of carbon dioxide
and other greenhouse gases trap incoming
heat near the surface of the Earth and are
the key factors causing the rise in
temperatures since 1880, but these gases are
not the only factors that can impact global
temperatures.”
NASA: Temperature
Anomalies Over Time
“There's a contradiction between the results
shown here and popular perceptions about
climate trends," Hansen said. "In the last
decade, global warming has not stopped."
Tough Questions About
Climate Change
To what extent does human activity
contribute to increased CO2 levels?
To what extent do increased CO2 levels
affect the global climate?
Is there an association? Is there a causal
relationship?
Remember:
association is necessary but not
sufficient proof of a causal relationship.
Climate Change:
Tough Questions
What does science tell us?
Does it present specifics about measurement,
models, assumptions and limitations of the
research?
Does it provide the ranges for their estimates?
What is the political spin?
Who are the interest groups lining up on either
side and what is their financial stake in winning
the policy argument?
Tough Questions About
Climate Change
Policy Analysis: What are the costs of doing nothing
versus taking various policy actions?
Exactly what did they measure—and what specific
assumptions did they use to arrive at their costs?
Remember: cost-benefit analysis can include direct
monetary costs and benefits, opportunity costs—but
also social costs and benefits that are not easily reduced
to dollars
Remember: who pays and who gains are not necessarily
the same
Environment and debt: we gain, future generations
pay
Current Controversy:
Climate Change
My View: sophisticated users should be wary of
an advocate for a policy position who uses a
message of fear supported by “facts” that are, at
best, rough estimates.
None of us can pinpoint exactly how much we will
spend on food next year.
How can anyone make exact predictions about when
we will reach the point of no return for earth’s survival
or state that we will spend $40 trillion on climate
change “prevention” by 2099 (as some pundit stated on
TV)?
Current Controversy:
Climate Change
Best estimates are inherently open to debate and
criticism.
To treat them as solid and indisputable fact is to be
seduced by political spin.
In my view, there are many reasons besides climate
change to switch to renewable energy and sustainable
manufacturing anyway.
Common sense tells us that at some point, we will run out of
non-renewable energy even if we can’t predict the exact date
Do no harm is a good ethical value, whether in
conducting research or living one’s life.
Current Controversy:
Climate Change
Nor am I convinced the economy will be destroyed by that
transition.
Money is fluid: if we move it from oil to solar, the money will still
circulate in the economy. New jobs will be created as old jobs
disappear, new entrepreneurs will emerge as old ones fade.
It is economic change.
However, businesses and employees in oil-dependent
industries will feel the brunt of that transition.
What is the likely harm, and if it is likely to be great, are there policy
approaches to minimize harm during the transition?
Change is a constant reality of life, as is choice.
Two Types of Relationships
Association: two variables that appear to be related but are
not causally connected
Associations can identify “risk factors”; they are like a
canary in a mineshaft
Risk factors: early alcohol use by youth is associated with
illegal drug use but it is an indicator of risk rather than the
causal factor
Useful in identifying youth likely to experience difficulties
later and who may benefit from early intervention programs
Association is necessary but not sufficient to
demonstrate a cause-effect relationship
Two Types of Relationships
Correlation: two variables appear to be in a linear
relationship. This may suggest a causal relationship, but 4
conditions must be met to claim a cause-effect
relationship:
Logical theory
Time-order
Co-variation
Elimination of rival explanations
Review of Basic Concepts:
Independent and Dependent Variables
When seeking to determine whether there is a cause-effect
relationship between two variables, researchers need to
identify the independent and dependent variables
The context plays a role so you should expect the
researchers to be clear about their rationale for deciding
which one is the independent variable and which is the
dependent variable
Their choices should make sense
But sometimes it is hard to determine which is causal:
sometimes all we can tell is that there is a relationship
Example: Happiness and TV
A study reported a relationship between
self-reported levels of happiness and the
amount of TV watched per day
Unhappy people were more likely to watch
more hours of TV than happy people
Does unhappiness cause people to watch more
TV or does watching more TV cause people to
experience unhappiness?
Review of Concepts:
Dependent Variable (DV)
Variable the researchers want to explain
The variable expected to change because of changes
in the independent variable
In program evaluation: the outcome measure
Review of Concepts:
Independent Variable (IV)
Variable which occurred first and which the
researchers believe explains a change in the
dependent variable
It could be a specific treatment in an experiment
It could be a characteristic (like age, gender, etc)
used as a statistical control
In program evaluation: the program
Data Analysis Techniques:
Working With 2 Variables
Descriptive statistics
Cross-tabulations (Crosstabs)
Also called contingency tables
Comparison of means
Data Analysis Toolkit: Crosstabs
Used when working with nominal and
ordinal data
Can be used with interval/ratio data that
have been categorized into ordinal data
Using Crosstabs
The question:
Are boys more likely to take hands-on classes than
girls?
We are testing whether there is a difference based on
gender.
Put another way: is type of class associated with gender?
The independent variable is gender.
The dependent variable is the types of classes:
Hands-on.
Traditional.
Setting up the Analysis
The analysis always looks at the different
categories of the independent variable (boys and
girls) and at the percent distribution of the
dependent variable (type of class) within each category.
The percent distribution of the dependent variable
always totals 100%. (See Table 13.1)
It shows the percent of boys in each of the two classes.
It shows the percent of girls in each of the two classes.
Crosstab Table 13.1
                       Boys    Girls
Hands-on Classes       45%     35%
Traditional Classes    55%     65%
Total                  100%    100%
Interpreting the Crosstabs Table
Boys are somewhat more likely (45%) to
take the hands-on classes as compared to
girls (35%).
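A minimal sketch of how the percentages in a crosstab like Table 13.1 come out of raw data. The function name and the head counts (100 boys, 100 girls) are assumptions made for illustration; only the resulting percentages match the table.

```python
from collections import Counter

def crosstab_percents(rows):
    """rows: list of (independent_value, dependent_value) pairs.

    Returns {independent_value: {dependent_value: percent}}. Percents are
    computed within each category of the independent variable, so each
    category totals 100.
    """
    cell_counts = Counter(rows)
    category_totals = Counter(iv for iv, _ in rows)
    table = {}
    for (iv, dv), n in cell_counts.items():
        table.setdefault(iv, {})[dv] = 100 * n / category_totals[iv]
    return table

# Hypothetical students: counts chosen to reproduce the table's percentages.
rows = (
    [("Boys", "Hands-on")] * 45 + [("Boys", "Traditional")] * 55
    + [("Girls", "Hands-on")] * 35 + [("Girls", "Traditional")] * 65
)
table = crosstab_percents(rows)
for gender, distribution in table.items():
    print(gender, distribution, "total:", sum(distribution.values()))
```

Dividing by the total for each gender, rather than by the grand total, is what makes the boys' column and the girls' column directly comparable.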
Table 13.2 Gender and Attitude
About the Death Penalty
            Men       Women
Favor       80%       68%
Oppose      20%       32%
Total       100%      100%
            n=644     n=747
Gender: Attitude
About the Death Penalty
Gender is the independent variable
Attitude about the death penalty is the
dependent variable
Analyzing the table:
Of the 644 men who responded to the survey
question, 80 percent favor the death penalty as
compared to 68 percent of the 747 women.
Does Gender Make a Difference in
Attitude About the Death Penalty?
Interpreting crosstab table 13.2
The short answer: While the majority of both
men and women favor the death penalty, a
greater percent of the men reported favoring it
(80 percent) as compared to women (68
percent).
Crosstab Tables
Takeaway lesson
When testing for a relationship using crosstabs,
the independent variable can be placed in either
the column or in the row.
Remember: it is essential that the percent
distribution of each category of the independent
variable total 100%.
Gail’s Analysis Guidelines
Determine which variable is the independent
variable and which is the dependent variable.
The percentages for each category of the
independent variable will add to 100%.
The total number of respondents for each category
of the independent variable is shown since this is
the basis for the percentage calculations.
This can be helpful if you are analyzing survey data and
some questions have fewer respondents because a large
number took an “exit.”
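A small sketch of why the n for each category matters: with it, a reader can work back from a reported percent to an approximate count. The helper name is hypothetical, and the counts it returns are approximations because the reported percentages in Table 13.2 were already rounded.

```python
def approx_count(percent, n):
    """Approximate number of respondents behind a reported whole-number percent."""
    return round(percent * n / 100)

# Percentages and n values as reported in Table 13.2.
men_favor = approx_count(80, 644)
women_favor = approx_count(68, 747)
print(men_favor, women_favor)
```

Without the n, the same 80 percent could describe 8 of 10 respondents or 800 of 1,000, which are very different bases for a conclusion.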
Gail’s Analysis Guidelines
Data is presented in percents (or percents
and counts but not just the counts).
For survey data: round percentages to the
nearest whole number because it makes the
data easier to remember and avoids giving a
false sense of precision.
Rounding rule: less than .5, round down. If .5
or higher, round up.
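A minimal sketch of the rounding rule in code. It is worth spelling out because Python's built-in round() rounds exact halves to the nearest even number, which does not match the rule above. The function name round_percent is hypothetical; the decimal module calls are standard library.

```python
from decimal import Decimal, ROUND_HALF_UP

def round_percent(value):
    """Round a non-negative percent to a whole number, with .5 always rounding up."""
    return int(Decimal(str(value)).quantize(Decimal("1"), rounding=ROUND_HALF_UP))

print(round_percent(44.4))  # less than .5, so it rounds down
print(round_percent(44.5))  # .5 or higher, so it rounds up
print(round(44.5))          # built-in round() uses round-half-to-even instead
```

If percentages are rounded in a spreadsheet or statistics package instead, it is worth checking which rule that tool applies.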
Adding Complexity:
“Controlling For Stuff”
Controlling for a Third Variable
Two variables could be associated but not causally
connected. A third factor may be causing both:
Classic example: As ice cream consumption rises,
drowning rises.
Does ice cream consumption cause drownings?
No. The hidden variable here is summertime. Both ice
cream consumption and drownings increase in the
summertime.
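A minimal sketch of the ice cream example with numbers. All of the figures below are invented for illustration and were deliberately constructed so that the pattern is easy to see: across the whole year the two series move together, but within a single season the correlation is zero.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

# Hypothetical monthly figures: (ice cream sales, drownings), grouped by season.
summer = [(90, 10), (100, 9), (110, 10), (100, 11)]
winter = [(20, 2), (30, 1), (40, 2), (30, 3)]

def r_for(months):
    sales, drownings = zip(*months)
    return pearson_r(sales, drownings)

r_all = r_for(summer + winter)   # ignoring the season
r_summer = r_for(summer)         # controlling for season: summer only
r_winter = r_for(winter)         # controlling for season: winter only
print(round(r_all, 2), r_summer, r_winter)
```

Once season is held constant, ice cream sales tell us nothing about drownings in this made-up data, which is what we would expect if summertime drives both.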
Controlling For A Third Variable
http://pewsocialtrends.org/pubs/750/new-economics-of-marriage#prc-jump
Controlling For A Third
Variable
This analysis looks at education
(whether or not they are a college
graduate) and marriage,
controlling for gender.
The data, shown as a bar graph, show
the percent married in 1970 and
2007 based on whether or not they had a
college degree.
Controlling For A Third
Variable
There are noticeable declines in marriage,
but the declines are sharper for those without
a college education
Those without a college degree were less
likely to be married in 2007 as compared to
1970.
There was also a decline in percent married
for those with college degrees.
This was true for both men and women.
Discussion
What might explain why those without a
college degree experienced a larger
decline in marriage?
Are there possible economic
implications, and if so, what might they
be?
State Employee Survey:
“Controlling For Stuff”
Results from a state-wide survey can hide
important information
Perhaps there are differences between the
agencies; some might be better managed
than others
State Employee Survey:
“Controlling For Stuff”
Perhaps there are differences in satisfaction with
specific management practices based on age or
gender?
Perhaps there are differences in views about
diversity practices based on gender, race or sexual
orientation?
Thinking about your organization: what might
explain differences in employee satisfaction?
Data Analysis Toolkit:
Comparison of Means
Do men earn more than women?
Dependent variable: income
Independent variable: gender
Mean income
Men      $37,685
Women    $34,566
Does Some Other Variable Explain
Income Differences?
Is salary inequality based on gender?
A rival explanation might be education, so
we need to control for it. There might be another variable,
say education level, that is really the
factor that explains (is related to, is correlated
with) the difference in men's and women’s income.
We can “control for” a third variable by having the
computer analyze the mean income of men and
women for each level of education.
Controlling for a 3rd Variable
Average Salaries for Men and Women, Controlling for Education
           Low Education    Medium Education    High Education
Men        $25,000          $35,000             $75,000
Women      $25,000          $35,000             $75,000
What Happened?
In this fake example, the relationship
between gender and income disappears.
It is very clear that education is the big
causal factor.
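A minimal sketch of how a gender gap in overall mean income can vanish once education is controlled. The salaries per education level come from the fake example; the head counts for each group are hypothetical, chosen so that more men than women fall in the high-education group, and the resulting overall means are therefore not the ones reported earlier.

```python
from collections import defaultdict
from statistics import mean

# Salaries by education level, as in the fake example.
SALARY = {"Low": 25_000, "Medium": 35_000, "High": 75_000}

# Hypothetical head counts per (gender, education level).
counts = {
    ("Men", "Low"): 2, ("Men", "Medium"): 4, ("Men", "High"): 4,
    ("Women", "Low"): 4, ("Women", "Medium"): 4, ("Women", "High"): 2,
}
# One (gender, education, salary) record per person.
people = [(g, e, SALARY[e]) for (g, e), n in counts.items() for _ in range(n)]

def mean_salary_by(records, key):
    """Mean salary for each group, where key picks the grouping field(s)."""
    groups = defaultdict(list)
    for record in records:
        groups[key(record)].append(record[2])
    return {group: mean(salaries) for group, salaries in groups.items()}

overall = mean_salary_by(people, key=lambda r: r[0])             # gender only
controlled = mean_salary_by(people, key=lambda r: (r[0], r[1]))  # gender within education
print(overall)
print(controlled)
```

The overall means differ only because the two groups have a different mix of education levels; within each level the means are identical.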
Real Data: Gender and Grade
Level, Controlling for Education
The federal civil service is organized by
grade levels, going from 1-15
The top level, Senior Executives, is
grades 16-18.
Grades 9 and above are typically held by
people in the professional occupations—
including supervisors and managers.
Real Data: Average Grade Level,
Controlling for Education
Overall Women
Without 4-year
college degree
Bachelor’s
10.86
11.27
11.46
12.10
12.45
11.79
12.65
Doctorate
13.40
13.20
13.43
Professional
13.62
13.44
13.67
Master’s
11.08
Men
11.94
Source: Merit Systems Protection Board survey, 1991-1992.
Interpretation?
As educational levels increase, so do grade
levels for both men and women.
One conclusion: education appears to matter.
But it is also true that women’s grade levels
are lower than men’s at the same level of
education.
Note: Decimal Places Used
Note: in this analysis, decimal places reveal useful
information that would be lost if rounded.
Rounding does not make sense when working
with real numbers that have a very limited range
(1-18, for example).
When analyzing a limited range of real numbers
using means, you want to preserve the decimal
places.
This is also true with grade point averages and
faculty evaluations (rating scales of 1-5 or 1-10)
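A small sketch of what rounding would hide, using the Doctorate means from the grade-level data above. The values are read on the assumption that the columns run Overall, Women, Men, which matches the interpretation that women's grade levels are lower than men's at the same level of education.

```python
# Doctorate row, assuming the column order Overall, Women, Men.
doctorate_means = {"Women": 13.20, "Men": 13.43}

rounded = {group: round(value) for group, value in doctorate_means.items()}
gap = round(doctorate_means["Men"] - doctorate_means["Women"], 2)

print(rounded)  # both groups collapse to the same whole number
print(gap)      # the difference that the decimal places preserve
```

On a scale that only runs from 1 to 18, a difference of a fraction of a grade is the finding, so whole-number rounding erases it.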
Research Design:
Correlation with Statistical Controls
Earlier I presented a design called “correlation with
statistical controls”. This is how it is done.
Controlling for other variables is a way to try to
eliminate possible rival explanations
Does gender explain differences in salaries or are
differences explained by differences in education?
If the dataset has other variables, such as years of
experience or breaks in employment, these can be used
as control variables to test for rival explanations besides
discrimination as the cause of salary differences.
“Controlling for Stuff”—Part
Of Planning
The researchers need to consider possible
control variables when they are developing
their research design and data collection
tools.
If they think age, education, or race might be
important, they need to build that into their data
collection.
If they think traffic will vary by weather conditions,
day of the week or time of the day, they need to
collect that data.
Takeaway Lesson
Always Ask: Has their purported relationship met
all the criteria for asserting causality?
Relationships are difficult to measure and it is
hard to demonstrate cause-effect relationships
A single variable rarely causes anything in itself—the
world that public administrators deal with is more
complex
Exercise healthy skepticism whenever anyone asserts
they know the single cause of a complex phenomenon
Creative Commons
This PowerPoint is meant to be used and
shared with attribution
Please provide feedback
If you make changes, please share freely
and send me a copy of changes:
[email protected]
Visit www.creativecommons.org for more
information