Carl-Erik Särndal - Quality on Statistics 2010
The Probability Sampling Tradition
in a period of crisis
Q2010 Keynote speech
Carl-Erik Särndal
Université de Montréal
The Probability Sampling Tradition
has governed surveys at National Statistical
Institutes (NSI:s) for decades
Breaking a tradition : Not easy …
Background
The merits of probability sampling, also known
as scientific sampling, are put in question by
severe imperfections : non-sampling errors,
economic pressures etc.
The problem not new – but more and more
compelling
Background
The probability sampling process
• is expensive (through follow-ups);
• its theoretical merits are compromised
(by nonresponse, etc.)
• “a few extra %” amount to very little
• alternative data collection methods exist
Yet probability sampling continues to be
practiced. Wasteful ? Can we do without
probability sampling?
My view
is a (Canadian) theoretician’s view
on (official) statistics production
To what extent guided by (statistical science)
theory ?
Theory as a basis for science (knowledge)
Something we admire:
Being able to predict facts about the world we
live in by theoretical arguments and deduction
This is the predictive power of science
In statistics: Want precise statements, backed by
convincing theory, of level of unemployment,
of industrial production, and so on
Theory as a basis for science
Gérard Jorland : How is it possible that one can
predict, merely by theoretical deductions,
the existence of a new planet, or a new
chemical element, or a new elementary
particle?
Based only on a calculus, on a set of
mathematical equations ... remarkable
achievement of the human mind.
Famous example: Planet Neptune was “found”
by mathematical prediction by Le Verrier 1846,
then empirically observed by Galle, at the
position given by Le Verrier
Many other examples come from physics,
astronomy, chemistry
A hypothesis to test:
The sciences are predictive to the extent that they
are mathematically formulated.
But that hypothesis is rejected : Today,
Economics is highly mathematical and
theoretical, but such arguments did not predict
the current economic crisis, for example.
The contrast
Physics:
Predictive power of formal theory very high
Economics:
Predictive power of formal theory low
So “science formulated mathematically” does
not guarantee “predictive power of theory”
Why then are Physics and Economics different?
Both are theoretical (mathematical) .
Contrasts
Physics : the objects (planets, elementary
particles, and so on) are inanimate ;
predictive power very high
Economics : the objects and the participants
(human beings) are unpredictable, relationships
highly complex;
predictive power very low
Theory as a guide in statistics production
Our ambition :
Create knowledge (predictions) about our
world through statistical surveys .
To what extent is this activity supported by
theory ? To what extent scientific ?
Legitimate questions !
Some NSI:s take pride in “scientific principles”.
Sampling = Limiting attention to a small subset
To what extent scientific ?
We accept without hesitation that observing only
n = 1,000 (or a few thousand) is enough,
provided the sample is “scientific”
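Why n = 1,000 is usually considered enough can be illustrated with the textbook margin of error for a proportion. This is a minimal sketch, assuming simple random sampling, a large population (no finite population correction), the normal approximation, and no non-sampling errors; the function name margin_of_error and the worst-case value p = 0.5 are illustrative choices, not part of the talk.

```python
import math

def margin_of_error(p: float, n: int, z: float = 1.96) -> float:
    """Approximate 95% margin of error for an estimated proportion.

    Assumes simple random sampling from a large population and ignores
    every non-sampling error (nonresponse, measurement error, ...).
    """
    return z * math.sqrt(p * (1.0 - p) / n)

# Worst case p = 0.5 gives the widest margin for a given n.
moe_1000 = margin_of_error(0.5, 1000)
moe_4000 = margin_of_error(0.5, 4000)

print(f"n = 1000: +/- {100 * moe_1000:.1f} percentage points")
print(f"n = 4000: +/- {100 * moe_4000:.1f} percentage points")
```

Under these assumptions, quadrupling the sample size only halves the margin, which is why a few thousand observations are typically accepted as sufficient when the sample is a probability sample.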
What is a scientific sample ?
Roper Center, Univ. of Connecticut, says :
A scientific sample is a process in which
respondents are chosen randomly by one of
several methods.
The key component in the scientific sample is
that everyone within the designated group
(sample frame) has a chance of being selected.
We may add : Such a sample is also known as a
probability sample
It is not necessarily a representative sample
in the sense “all have the same probability”.
scientific sample
probability sample
representative sample
Around these terms, unfortunate ambiguity and
confusion reign in the literature and in conversation
Ask, and you get a variety of responses
Sampling = Limiting attention to a small subset
Two contrasting examples:
Sampling trees in a forest - to predict volume
Sampling human beings in a country - to
predict (assess) unemployment, or health
conditions, or expenditures
Estimating volume of wood from a sample of trees
With classical probability sampling theory,
we get not only a figure for the total volume of
wood in the forest, but also a statement of its
margin of error, free of any assumptions.
We can determine exactly the accuracy we want.
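The forest case can be sketched as follows. This is a minimal illustration, assuming simple random sampling without replacement and full measurement of every sampled tree; the forest itself is simulated (the lognormal volumes, N = 10,000 and n = 400 are hypothetical), and the factor 1.96 relies on the usual large-sample normal approximation. No model for the tree volumes is needed: the randomness comes only from the sample selection.

```python
import math
import random
import statistics

def estimate_total_srs(sample, N, z=1.96):
    """Design-based estimate of a population total under simple random
    sampling without replacement, with an approximate 95% margin of error."""
    n = len(sample)
    mean = statistics.fmean(sample)
    s2 = statistics.variance(sample)           # sample variance (divisor n - 1)
    total_hat = N * mean                        # expansion estimator of the total
    var_hat = N**2 * (1 - n / N) * s2 / n       # includes finite population correction
    return total_hat, z * math.sqrt(var_hat)

random.seed(2010)
N = 10_000
# Hypothetical forest: volume of each tree in cubic metres.
forest = [random.lognormvariate(0.0, 0.5) for _ in range(N)]
sample = random.sample(forest, 400)             # SRS without replacement

total_hat, moe = estimate_total_srs(sample, N)
true_total = sum(forest)
print(f"estimated total: {total_hat:.0f} +/- {moe:.0f} (true total: {true_total:.0f})")
```

Because the margin of error depends on n through a known formula, the sample size can be chosen in advance to reach a desired accuracy, which is the sense in which the accuracy can be determined.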
Estimating unemployment from a sample of people
We get a figure from the Labour Force Survey (LFS),
but we cannot quantify its margin of error. There is no
objective declaration of numerical quality
because the following go unmeasured : nonresponse error,
measurement error, frame error, recording and
data handling error, and so on
The contrast
Trees are inanimate objects, like planets
Human beings, they are precisely that, human,
inconsistent, emotional, prone to error
The contrast
Trees : Predictive power of probability sampling
theory very high – objects do not “cause trouble”
People : Predictive power of sampling theory very
low - the survey is complex; human beings are
involved
A large scale statistical investigation (survey) :
“Unpredictable people are involved at so many
points of this incredibly complex process”
so we will never have a theory that will allow
precise measurement of total survey error
(Stanley McCarthy 2001)
Producing numbers is (relatively) easy ;
by comparison, stating their accuracy is difficult
Article by Platek and Särndal :
Can a statistician deliver ?
J. Official Statistics vol. 17 (2001), pp. 1 – 127
with 16 discussions
and a rejoinder by the authors
Can a statistician fulfill the promise (to society) ?
Upon rereading : Have we advanced any, in 10
years ?
The title : Can a statistician deliver ?
“Statistician” may denote
the head of a National Statistical Institute (NSI)
or
a person expert in the subject (labour market, or
health issues, or manufacturing industry, etc.)
or
a person trained in statistical science
(methodologist)
As expected, feelings conveyed were of two
kinds:
high ranking NSI officials: “Keep the ship
sailing”, despite difficult times
academics and researchers: Regret the absence
of a more solid (theoretical) base for (national)
statistics production
Three themes are prominent in the 16
discussions (summarized in the authors’
rejoinder) :
The role of theory
The scientific and professional credo of the NSI
The concept of quality in regard to the NSI’s
activity
The uncertain future of the NSI
I. Fellegi (Statistics Canada) on survival of the
NSI. “Survival beyond quality” depends on
• Respect for respondents, and
• Credibility of information; Accuracy is an
important part, but so are Relevance,
Transparency & others
The uncertain future of the NSI
I. Fellegi :
A life and death question for the NSI is
credibility :
Information that is not believed will not be used,
and the NSI has no function any more.
Can the NSI count on future high co-operation
and truthful response ? More and more doubtful.
Believing numerical information
We have no objective measures of “margin of
error”
But what about the Total Survey Error model ?
(US Bureau of the Census, around 1950)
It recognizes total error as a sum of a number of
components.
Can we not use these equations, this theory ?
Believing information
The Total Survey Error model
• helped us to focus on specific components of
total error
• disappointed us by failing to provide routine
measures for the numerical quality of
published statistics.
Believing information
Discussants of Can a statistician deliver ?
deliver “a death sentence” on the TSE model :
“Unattainable and unrealistic ideal”
“Utopian project”
“Unrealistic utopian dream”
Theory is there, but it does not work
Some say: We choose not to use it
In question are the notions of “probability” and
“probable error”
Statistics Canada Quality Guidelines (1998)
describes Survey Methodology as :
“A collection of practices, backed by some
theory and empirical evaluation, among which
practitioners have to make sensible choices in
the context of a particular application”
A patchwork of theories, one for questionnaire
design, one for motivating response, one for data
handling and editing, one for imputation, one for
estimation in small areas, and so on
Fragmentation …
European Statistics Code of Practice (2005)
Sound methodology must underpin quality statistics.
This requires adequate tools, procedures and
expertise. The overall methodological framework of
the statistical authority follows European and other
international standards, guidelines, and good
practices ... Survey designs, sample selections, and
sample weights are well based and regularly
reviewed, revised or updated …
(Emphasis is mine.) A “be-good” encouragement;
what about “scientific underpinnings” ?
The stark reality
“Good practice” is the guide, not theory .
Numerical quality is not assured .
Large errors probably not infrequent; most go
undetected .
So what ? - Other important professions are also
guided by a bunch of “good practices”
The NSI’s situation
Its work is guided by “a collection of practices
supported by some theory” plus requirement to
keep response burden low
With this frail and fragmented base, the NSI
must produce reliable Official Statistics, for
the good of the nation, a solid basis for policy
decisions
Not an enviable situation and a threat to NSI’s
existence…
The Probability Sampling Tradition (born in the 1930s)
created the concept of Nonresponse Rate :
“the selected objects” (the probability sample)
as compared with
“the data delivering objects” (the respondents)
We measure, steadfastly, sometimes misguidedly,
the size ratio of those two sets
Our obsession with the Nonresponse Rate
When NR rate was 2%, nobody worried
When NR rate is now around 50%, we worry
• Intuitively because the non-responding may be
systematically related to target variable values
• Probabilistically because “making the
observation” (getting the response) has an
unknown probability; the theory capsizes
The believers in Probability Sampling regret
that the theory cannot cope
The non-believers : Why worry about the NR
rate ? Just collect some reasonably good data
from a reasonably representative set of objects.
Our obsession with Nonresponse Rates
Why not (in the manner of some private survey
institutes) just get data from “a reasonably
representative set of co-operative objects”, and not
bother with this stifling concept of the Nonresponse
Rate ?
It is time that NSI:s deliver a strong endorsement of
the Probability Sampling Tradition – if this is what
they really believe in; otherwise, act accordingly
Our obsession with Nonresponse Rates
NR rate itself is a poor indicator of NR bias,
of “accuracy of estimates”
See e.g. Groves (2006), Schouten (2009),
Särndal and Lundström (2008)
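That the NR rate says little about NR bias can be illustrated with a small simulation. This is a sketch under stated assumptions, not a method from the cited papers: the population, the 8% unemployment level and the response probabilities are all hypothetical, and the whole population is approached (no sampling) so that only the effect of nonresponse is visible.

```python
import random
import statistics

random.seed(1)
N = 50_000
# Hypothetical population: y = 1 if unemployed (about 8%), else 0.
y = [1 if random.random() < 0.08 else 0 for _ in range(N)]
true_mean = statistics.fmean(y)

def respondent_mean(values, p_resp):
    """Return (mean among respondents, response rate); p_resp maps a
    value of y to that person's response probability."""
    resp = [v for v in values if random.random() < p_resp(v)]
    return statistics.fmean(resp), len(resp) / len(values)

# Mechanism A: response is unrelated to y; everyone responds with probability 0.5.
mean_a, rate_a = respondent_mean(y, lambda v: 0.5)

# Mechanism B: the unemployed respond less often (0.27 vs 0.52),
# chosen so that the overall response rate is again about 50%.
mean_b, rate_b = respondent_mean(y, lambda v: 0.27 if v == 1 else 0.52)

print(f"true unemployment rate: {true_mean:.3f}")
print(f"A: response rate {rate_a:.2f}, respondent mean {mean_a:.3f}")
print(f"B: response rate {rate_b:.2f}, respondent mean {mean_b:.3f}")
```

Both mechanisms give roughly the same response rate, yet only mechanism B distorts the estimate, because there the response probability depends on the target variable. In practice that dependence is unknown, so the rate alone cannot tell the two situations apart.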
Conclusions
What options remain for the NSI today, to show their
superior capacity to produce “serious numbers”
amidst a deluge of “junk information” ?
The underpinnings may be just “a collection of
practices”, but still, the NSI is the model of
statistical competence in the nation - and it must
demonstrate this !
Media criticism of the NSI sometimes harsh.
Conclusions
The NSI’s delicate balancing act
vis-à-vis
The national government : fulfill the mandate
The world of theory and learning : show
“scientific credibility”
The other (private) producers of statistics :
tough competition
The supra-agency (EuroStat) : dictates
Conclusions
A fact is that the quality component accuracy
cannot be measured (probabilistically).
Yet this is what users want desperately to have
measured.
When important numbers are proven wrong (by
users), trust in the NSI suffers
Other numbers may be wrong, but go unnoticed
- and may not matter much .
Conclusions
The Probability Sampling (Scientific Sampling)
tradition,
is a reflection of an idyllic past;
we are now in 2010, not 1950
On what grounds is it still defensible, in our
time?
It is a challenge to the NSI, and to the academics
(the theoreticians), to provide the answers
Conclusions
The NSI vis-à-vis the scientific world : a
sometimes hesitant relationship:
Most NSI:s have a scientific (academic) advisory
board
NSI:s look to the learned world for support and
acceptance
NSI:s own investment in research may
(understandably) be limited.
Implementing new theory into the NSI's
production has met with obstacles
Conclusions
Relationship of the NSI to the world of learning;
an empirical investigation, see
Risto Lehtonen and Carl-Erik Särndal :
Research and Development in Official
Statistics and Scientific Co-operation with
Universities: A Follow-Up Study ,
J. Official Statistics (2010)
Conclusions
Debate article :
S. Lundström and C.E. Särndal (2010):
The devastating consequences of nonresponse :
Probability sampling in question at Statistics
Sweden . (In Swedish; internal report).
Credit goes to Statistics Sweden for their courage
to debate a sensitive issue.