Three Basic Principles of Social Science Research
Yu Xie
University of Michigan
Conceptual versus Technical
Knowledge
Technical knowledge is important once you
understand how to conduct empirical research.
More often than not, sociologists don’t know how
to conceptualize a research problem that is
empirically testable.
The most difficult, and the most important, part
of methodological training in sociology is
conceptual rather than technical.
Be a thinker, not merely a technician.
Inspired by Otis Dudley Duncan
“But sociology is not like physics. Nothing
but physics is like physics, because any
understanding of the world that is like the
physicist’s understanding becomes part of
physics…”
(Otis Dudley Duncan. 1984. Notes on Social
Measurement. p.169)
Definition of Terms
By “social science research,” I mean
quantitative social science research.
By “basic principles,” I mean general
concepts that can be used in actual
research, not generalizations from
research.
Lessons from the History of Science
This field is mostly dominated by the
history of physical science.
Plato has had a long-lasting influence on
science and Western philosophy in
general.
“The safest general characterization of the
European philosophical tradition is that it
consists in a series of footnotes to Plato.”
(Whitehead, quoted in Mayr 1982, p.38)
What Made Plato so Important in
the History of Science?
The separation between the “world of being” and
the “world of becoming.”
The scientist’s (philosopher’s) task is to go
beyond observables (world of becoming) to gain
understanding of the world of being. [I.e., need
for abstract thinking]
True knowledge lies in universal, unchanging
laws, not in concrete objects.
Laws are assumed to exist, created rationally by
the Creator. Thus the word “discovery.” This is
the teleological aspect of science.
Plato’s Typological Thinking
Plato’s account of variation: poor replicas of the
world of being.
The world of being consists of discontinuous,
abstract ideas (or forms).
Great success story of following Plato’s
typological thinking in physics.
It also resolved the potential conflict between
science and religion (sufficient, physical, or
immediate causes versus “final causes”).
Examples: Copernicus, Galileo, and Newton.
Deviations
According to typological thinking, deviations are
nothing more than undesirable aberrations.
We attain true knowledge after getting rid of
apparent deviations through abstract thinking.
Bernoulli’s law of large numbers and Laplace’s
central limit theorem provided the mathematical
solution to the measurement of uncertainty.
Remove uncertainty through repeated
observations and assess uncertainty through a
probability (normal) distribution.
Deviation is “error,” undesirable but manageable
with repeated observations (due to “statistical
compensation”)
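The idea of “statistical compensation” can be illustrated with a small simulation. This is only a sketch: the true value (343.0), the noise level (5.0), and the function name are hypothetical choices made for illustration, and the errors are assumed to be independent and normally distributed.

```python
import random
from statistics import fmean, pstdev

def mean_of_measurements(rng, true_value, noise_sd, n):
    """Average of n noisy measurements of one fixed true value."""
    return fmean(true_value + rng.gauss(0, noise_sd) for _ in range(n))

rng = random.Random(0)
true_value, noise_sd = 343.0, 5.0  # hypothetical quantity and error size

# For each number of repeated observations, repeat the whole measurement
# exercise 500 times and record how much the resulting means scatter.
spread = {}
for n in (1, 25, 400):
    means = [mean_of_measurements(rng, true_value, noise_sd, n) for _ in range(500)]
    spread[n] = pstdev(means)

print(spread)
```

Under these assumptions the scatter of the mean shrinks roughly in proportion to one over the square root of the number of observations, which is the sense in which the “error” is manageable.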
Difficulties in Social Science
and Quetelet’s Solution
Plato’s typological thinking never worked well for
the study of humans.
There is simply too much variation and
uncertainty.
Measurement theory provided a possible
solution: attaining reliable measurements in the
social world.
Quetelet's social physics was premised on the
“average man,” which seems to satisfy Plato’s
criteria as “true knowledge.”
Quetelet’s Social Physics
Measurement theory, when applied to social
phenomena, became the “law of accidental causes”
because they also follow normal distributions.
“The law of accidental causes is a general law that
applies to individuals as well as peoples and that rules
our moral and intellectual qualities no less than our
physical qualities.” (Kruger 1987, p.76)
He paid attention to variations in averages, such as by
nation, location, age, and race.
Regularities in averages => constant causes => laws.
Moral standard: “The average man…would represent
all that is great, good, or beautiful.” (Stigler, 1986, p.171)
Darwin’s Population Thinking
Variation is reality, not undesirable error on the
part of the observer.
What is important is the individual, not the type.
Offspring of the same parents are different from
each other.
Variation is inheritable from generation to
generation.
Variation is fundamental to natural selection:
abundant genetic variation is produced in every
generation, but only relatively few individuals
survive and reproduce.
Population Thinking and Statistics
In typological thinking, deviations from the mean
are nothing but “errors”, with the mean
approaching the true cause. (Example:
measurement of the speed of sound.)
In population thinking, deviations are the reality
of substantive importance; the mean is a
property of a population.
Distinction between “mean” and “average” by
Jevons, and that between “mean of
observations” and “mean of statistics” by
Edgeworth. (Duncan, 1984, p. 108)
Galton and Social Science
Francis Galton, Darwin’s cousin, introduced
population thinking to social science.
To Galton, the value of averages is limited.
“Individual differences… were almost the only
thing of interest.” (Hilts, 1973, p.221)
Thus, Quetelet’s social physics is of little value.
Scientific inquiry should focus on variations and
covariations.
What is Unique about Variability
in Social Science?
More variability, since the unit of analysis is not the
individual, but an individual’s act at a given time.
Variability for human behavior does not
necessarily have a physical agent and is
(largely) not inheritable.
Humans can and do change surrounding
conditions that affect them.
“Men make their own history, but they do not make it
as they please” -- Karl Marx.
Humans are rational in the sense that they may
base behaviors on anticipated consequences.
Past events, even those due to chance, affect
future events (path-dependency).
First Principle
Variability is the very essence of social
science research.
“Variability Principle.”
Second Principle
Social grouping reduces such variability.
“Social Grouping Principle.”
What is a Social Group?
I do not take a stand between a nominalist view and a
realist view.
Social grouping is meaningful only in terms of a social
outcome.
Thus, social grouping may have different meanings when
applied to different social outcomes.
Social grouping reduces variability in a social outcome.
The greater the reduction, the more significant the social grouping.
There is always within-group variation: variability not
explained by social grouping.
There is a tradeoff between parsimony (of social grouping)
and accuracy (reduced variability): a more detailed
grouping scheme results in a larger reduction of variability.
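The tradeoff above can be sketched numerically. The following is a minimal illustration, not an analysis of real data: the grouping variables (`educ`, `region`), their effect sizes, and the function names are all hypothetical, and within-group variability is measured as the average squared deviation from the group mean.

```python
import random
from collections import defaultdict
from statistics import fmean

def within_group_variance(outcomes, labels):
    """Average squared deviation of each outcome from its own group mean."""
    groups = defaultdict(list)
    for y, g in zip(outcomes, labels):
        groups[g].append(y)
    means = {g: fmean(ys) for g, ys in groups.items()}
    return fmean((y - means[g]) ** 2 for y, g in zip(outcomes, labels))

def simulate(n=5000, seed=0):
    """Simulated outcome that depends on two hypothetical grouping variables."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        educ = rng.randrange(4)    # hypothetical: 4 education levels
        region = rng.randrange(3)  # hypothetical: 3 regions
        y = 2.0 * educ + 1.0 * region + rng.gauss(0, 1)
        rows.append((y, educ, region))
    return rows

rows = simulate()
ys = [r[0] for r in rows]
total = within_group_variance(ys, [0] * len(ys))               # no grouping at all
coarse = within_group_variance(ys, [r[1] for r in rows])       # education only
fine = within_group_variance(ys, [(r[1], r[2]) for r in rows])  # education x region

print(total, coarse, fine)
```

The finer scheme leaves less unexplained variability but needs twelve groups instead of four, and some within-group variation remains under either scheme.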
Third Principle
Patterns of population variability vary with
social context, which is often defined by
time and space.
“Social Context Principle”
Different “Regimes” of Variability
Social contexts are different from social groups in that
the former are self-contained social systems with natural
boundaries, for example by time and space.
Patterns of individual variability may be governed by
“relationships” between individuals that are not reducible
to individuals’ attributes.
Patterns of individual variability may be governed by
macro-level conditions such as “social structure,”
“political structure,” or “culture,” which may be
discontinuous and fixed.
Collective action may lead to changes in macro-level
conditions and human relationships, which are major sources of
social change. [Premise of Marxism.]
A Detailed Look: Implications for
Regression Analysis with Survey Data
Setup:
A population with N individuals.
There is an outcome of interest, say Y, that is
measured on the real line.
There is an independent variable of interest,
say D. For simplicity, let us assume that D is
a binary “treatment,” D=1 (T), D=0 (C). This is
the simplest case. Let us call it the “canonical
case.”
Canonical Case Examined
What is the causal effect of treatment D?
It is the counterfactual effect for the ith individual:
Y_i^T - Y_i^C
However, we observe either
Y_i^T when D_i = 1 or
Y_i^C when D_i = 0.
Conclusion: it is not possible to identify individual-level causal effects without assumptions.
At Another Extreme
If we impose the strong, unrealistic assumption
that all individuals are homogeneous (an
assumption often made in physical science),
then we have
Y_i^T = Y^T ; Y_i^C = Y^C
We need only two observations to identify the causal
effect:
Y^T when D = 1 and
Y^C when D = 0.
Implication: it is population variability
that makes “scientific sampling” necessary.
Now Consider the Usual Case
Population is divided into two subpopulations: P1 if
D_i = 1, P0 if D_i = 0.
Use the following notation:
q = proportion of P0 in P
E(Y_1^T) = E(Y^T | D=1) , E(Y_1^C) = E(Y^C | D=1)
E(Y_0^T) = E(Y^T | D=0) , E(Y_0^C) = E(Y^C | D=0)
By the total expectation rule:
E(Y^T - Y^C) = E(Y_1^T - Y_1^C)(1-q) + E(Y_0^T - Y_0^C)q
= E(Y_1^T - Y_0^C) - E(Y_1^C - Y_0^C) - (d_1 - d_0)q,
where d_1 = E(Y_1^T - Y_1^C), d_0 = E(Y_0^T - Y_0^C).
Or:
E(Y_1^T - Y_0^C) = E(Y^T - Y^C) + E(Y_1^C - Y_0^C) + (d_1 - d_0)q.
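The intermediate steps of this decomposition, which the slide skips, can be written out as follows. This is a reconstruction from the definitions above, with the subpopulation index written as a subscript and the treatment state as a superscript; it uses only the quantities q, d_1, and d_0 already defined.

```latex
\begin{aligned}
E(Y^T - Y^C) &= (1-q)\,d_1 + q\,d_0 \\
             &= d_1 - (d_1 - d_0)\,q \\
             &= E(Y_1^T - Y_1^C) - (d_1 - d_0)\,q \\
             &= E(Y_1^T - Y_0^C) - E(Y_1^C - Y_0^C) - (d_1 - d_0)\,q .
\end{aligned}
```

The last line adds and subtracts E(Y_0^C), the mean untreated outcome in P0, which is what turns the unobservable E(Y_1^T - Y_1^C) into the observable contrast E(Y_1^T - Y_0^C) plus a correction term.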
In Other Words
The “simple” estimator E(Y_1^T - Y_0^C) contains two
sources of bias:
The average difference between P1 and P0 in
the absence of treatment (“heterogeneity
bias”).
The difference in the average treatment effect
between P1 and P0 (“endogeneity bias”).
Both sources of bias average to zero under
randomized assignment.
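A simulation in which both potential outcomes are known for everyone makes the two bias terms visible. This is a sketch under stated assumptions: the self-selection rule, the distributions, and the function name `decompose` are hypothetical, and in real data only one potential outcome per individual is observed, so none of these terms could be computed directly.

```python
import random
from statistics import fmean

def decompose(yc, yt, d):
    """Return (naive, ate, baseline_bias, effect_bias) from full potential outcomes."""
    p1 = [i for i, di in enumerate(d) if di == 1]
    p0 = [i for i, di in enumerate(d) if di == 0]
    q = len(p0) / len(d)  # proportion of P0 in P
    naive = fmean(yt[i] for i in p1) - fmean(yc[i] for i in p0)
    ate = fmean(t - c for t, c in zip(yt, yc))
    # Difference between P1 and P0 in the absence of treatment.
    baseline_bias = fmean(yc[i] for i in p1) - fmean(yc[i] for i in p0)
    # Difference in average treatment effect between P1 and P0, weighted by q.
    d1 = fmean(yt[i] - yc[i] for i in p1)
    d0 = fmean(yt[i] - yc[i] for i in p0)
    effect_bias = (d1 - d0) * q
    return naive, ate, baseline_bias, effect_bias

rng = random.Random(1)
n = 20000
yc = [rng.gauss(0, 1) for _ in range(n)]          # outcome without treatment
gain = [1.0 + rng.gauss(0, 1) for _ in range(n)]  # individual treatment effect
yt = [c + g for c, g in zip(yc, gain)]            # outcome with treatment

# Assumed self-selection rule: take the treatment when the treated outcome is high.
d_selected = [1 if c + g > 1.0 else 0 for c, g in zip(yc, gain)]
# Randomized assignment: a coin flip unrelated to the potential outcomes.
d_random = [rng.randrange(2) for _ in range(n)]

selected = decompose(yc, yt, d_selected)
randomized = decompose(yc, yt, d_random)
print(selected)
print(randomized)
```

In each case the first element equals the sum of the other three. Under the assumed self-selection rule both bias terms are clearly positive; under the coin flip both are close to zero, so the simple estimator is close to the average effect.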
In Regression Language
Y_i = a + d_i D_i + e_i
Two types of variability:
Heterogeneity bias: e_i. If corr(e, D) ≠ 0 =>
heterogeneity bias.
Endogeneity bias: d_i. If corr(d, D) ≠ 0 =>
endogeneity bias.
Comments
Comment 1: D is a random variable containing
heterogeneity.
Comment 2: Heterogeneity bias may result from “omitted
variable biases.”
Comment 3: Endogeneity bias may result from rational
“anticipatory behavior.”
Comment 4: Endogeneity means that variability in Y
could be enhanced (or reduced) by treatment D.
Comment 5: This model is not estimable. It needs to be
“constrained” by assumptions.
Common assumptions:
corr(e, D) = 0
d_i = d.
A General Lesson
In reviewing Manski's book, I stated (1996,
AJS):
“When observed data are thin, it takes
strong assumptions to yield sharp results.
There is no free information in statistics.
Either you collect it, or you assume it.”
Using Social Grouping to Control
for Heterogeneity
Assuming no endogeneity bias (which is more
difficult to handle).
Social grouping always reduces variability =>
more within-group homogeneity.
We may assume that meaningful heterogeneity
can be captured by social grouping.
Assumption of conditional independence:
e ⊥ D | X
Change regression to: Y_i = a + d D_i + b’X_i + e_i
Estimable (via OLS or ML)
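A small simulation shows what conditioning on X buys when the conditional independence assumption actually holds. This is a sketch only: the data-generating values (a = 1, d = 2, b = 3), the selection rule, and the hand-written `ols` helper are assumptions for illustration, chosen so that the example needs nothing beyond the standard library.

```python
import random

def ols(rows, y):
    """Least-squares coefficients via the normal equations (X'X)b = X'y."""
    k = len(rows[0])
    # Build the augmented matrix [X'X | X'y].
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(k):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [v - factor * w for v, w in zip(a[r], a[col])]
    return [a[i][k] / a[i][i] for i in range(k)]

rng = random.Random(2)
n = 5000
design_short, design_long, y = [], [], []
for _ in range(n):
    x = rng.gauss(0, 1)                       # pre-treatment covariate
    d = 1 if x + rng.gauss(0, 1) > 0 else 0   # treatment depends on x only
    y.append(1.0 + 2.0 * d + 3.0 * x + rng.gauss(0, 1))
    design_short.append([1.0, d])             # Y on D alone
    design_long.append([1.0, d, x])           # Y on D and X

d_short = ols(design_short, y)[1]
d_long = ols(design_long, y)[1]
print(round(d_short, 2), round(d_long, 2))
```

With X omitted, the coefficient on D absorbs part of the effect of X and lands well above the assumed value of 2; with X included it is close to 2. The correction works here only because selection into D was built to depend on X alone.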
Comments
Comment 1: For X to do this, it needs to be correlated
with D (“correlation condition”) and affect Y (“relevance
condition”).
Comment 2: X should be pre-treatment, determining both
D and Y structurally.
Comment 3: There are other research designs:
Propensity score.
Instrumental variable estimation.
Fixed effects model.
Heckman-type selection models (which also handle
endogeneity bias).
Quasi-experiments.
They all depend on strong, untestable assumptions.
Accounting for Heterogeneous
Responses
More difficult to handle: a degrees-of-freedom
problem.
Possible with nested data, assuming that patterns
of relationships are homogeneous (or following a
distribution) within social contexts (by time or
space).
d_k is allowed to vary across social contexts k (k = 1, …, K),
but is fixed within k. For example:
Y_ik = a_k + d_k D_ik + e_ik
a_k = l + f z_k + m_k
d_k = g_1 + s z_k + n_k
Application of the Social Context Principle.
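The nested structure can be illustrated with a simple two-step estimate: first estimate the treatment effect separately within each context, then regress those estimates on the context-level variable z. This is a sketch, not a full multilevel (mixed-model) fit; the parameter values, sample sizes, and the helper `simple_ols` are hypothetical, and treatment is assumed to be unrelated to the errors within each context.

```python
import random
from statistics import fmean

def simple_ols(x, y):
    """Intercept and slope of y regressed on a single variable x."""
    mx, my = fmean(x), fmean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sxy / sxx
    return my - slope * mx, slope

rng = random.Random(3)
n_contexts, n_per_context = 200, 200
g_true, s_true = 1.0, 0.5  # hypothetical second-level parameters for d_k

z_values, d_hat = [], []
for k in range(n_contexts):
    z = rng.gauss(0, 1)                               # context-level variable
    a_k = 0.3 * z + rng.gauss(0, 0.5)                 # context-specific intercept
    d_k = g_true + s_true * z + rng.gauss(0, 0.2)     # context-specific effect
    treated, control = [], []
    for i in range(n_per_context):
        d = i % 2  # alternate assignment: both arms non-empty, no selection
        y = a_k + d_k * d + rng.gauss(0, 1)
        (treated if d else control).append(y)
    z_values.append(z)
    # Step 1: within-context effect (the OLS slope on a binary D is a mean difference).
    d_hat.append(fmean(treated) - fmean(control))

# Step 2: how the context-level effects vary with z.
g_hat, s_hat = simple_ols(z_values, d_hat)
print(round(g_hat, 2), round(s_hat, 2))
```

The individual-level effects remain unidentified; what is recovered is how the average effect shifts across contexts, which is possible only because the effect is assumed fixed within each context.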
Comments
Comment 1: It is possible to impose a parametric
assumption on individual-level di, but the results are
dependent on the assumption. (Bayesian approach.)
Comment 2: Nested (or hierarchical) structure could be
used for studying time variation or spatial variation. A
key assumption: there are common features that are
shared by different elements in a common social context.
Comment 3: If variations across social contexts are
systematic, they can be modeled (i.e., multi-level
models, hierarchical linear models, random-coefficient
models, and growth-curve models). If such variations
are left as observed (or saturated), we have the fixed-effects model.
Concluding Remarks
(1) Sampling is important, since we can only
discuss aggregated properties of a population.
(2) Descriptive studies are informative and
arguably the only achievable thing we can do,
without strong assumptions.
(3) Randomized experiments can’t solve our
problems entirely, because it’s hard to
generalize experimental results to the
population.
(4) Statistics, while imperfect, are the only tools
in social science to characterize heterogeneity.
I.e., counterfactual effects are inherently not estimable at
the individual level.
Concluding Remarks (Continued)
(5) Statistical results are meaningful only when
interpreted in reference to a population of interest.
(6) Causality is always probabilistic.
(7) The distinction between the “effects of causes”,
and the “causes of effects.” (Identification problem).
(8) There is an asymmetry between causes and
consequences.
I.e., aggregate results are essentially weighted.
Effects are always populational attributes
Causes may not be social – thus non-populational.
(9) Theory is important because we always need to
make assumptions in statistical analysis (e.g., econ).