Alternative approaches


Working with “loose” Theories of Change
Alternative approaches to exploring and testing complex causal models of development interventions
Rick Davies @ DFID EvD, 21/10/2015
Some proposals, informed by…
• ITAD’s macro-evaluation of empowerment and accountability projects
• Analysis of data on CSCF projects for Triple Line
• Portfolio analysis exercises with Comic Relief
• Literature on QCA applied to evaluation (Befani & others)
• Literature on predictive analytics, a form of data mining
• Current experience with the development of an Excel application to operationalise related tools
Common features of the approaches
1. Data, on what has already happened
2. A view of causation that has a reasonable fit with the complexity of the world as we see it
3. Methods, to analyse this data in the light of this view
Why bother with alternatives?
• It would be useful to be able to triangulate findings by analysing the same data with different methods that share a consistent view of causality
• The different approaches have different strengths and weaknesses, providing a wider range of possible uses overall
1. What sort of data?
• Rows = cases, e.g. projects
• Columns =
  • Attributes of projects and their context, and
  • Outcomes of these projects
• Cells =
  • Presence/absence of these attributes, and/or
  • Scales, e.g. achievement ratings, and/or
  • Numeric values, e.g. costs
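As a minimal illustration of this layout (hypothetical project names and attribute values, not the CSCF data), such a data set can be held as a simple table:

```python
# A minimal, hypothetical cases-by-attributes data set: rows are projects,
# columns are binary attributes/context features plus an observed outcome.
import pandas as pd

data = pd.DataFrame(
    {
        "local_partner":     [1, 1, 0, 1, 0],  # attribute present (1) / absent (0)
        "capacity_building": [1, 0, 0, 1, 1],
        "rural_context":     [0, 1, 1, 1, 0],
        "budget_over_100k":  [1, 1, 0, 0, 1],
        "outcome_achieved":  [1, 1, 0, 1, 0],  # the outcome to be predicted
    },
    index=["Project A", "Project B", "Project C", "Project D", "Project E"],
)
print(data)
```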
Examples
[Figure: example data set, with one sub-set of columns chosen for analysis. Data from the Civil Society Challenge Fund, managed by Triple Line & Crown Agents.]
2. What sort of view of causality?
• Conjunctural causes
  • Many events are caused by combinations of factors, rather than by single factors.
• Multiple conjunctural causes (equifinality)
  • Events can arise as a result of different conjunctions of conditions*.
• Asymmetric causes
  • The causes of an event’s absence may not simply be the absence of the conditions that cause it, but the occurrence of other additional conditions which complicate, block or deflect change.
Multiple types of causal conditions
Causes are not simply present or absent, strong or weak, as in a statistically significant correlation.
Individual causal conditions can be
• Necessary but insufficient causes
• Sufficient but unnecessary
• Necessary and sufficient
• Neither necessary nor sufficient
There are other kinds as well, describing their role within configurations of conditions, e.g.
• INUS: an Insufficient but Necessary part of a configuration that is itself Unnecessary but Sufficient
• SUIN: a Sufficient but Unnecessary part of a configuration that is Insufficient but Necessary
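To make the necessary/sufficient distinction concrete, here is a minimal sketch (hypothetical binary data) that checks each property for a single condition across a set of cases:

```python
# Check whether a binary condition is necessary and/or sufficient for a
# binary outcome across a set of cases (hypothetical data for illustration).

cases = [
    # (condition present?, outcome present?)
    (1, 1), (1, 1), (0, 1), (1, 0), (0, 0),
]

def is_sufficient(cases):
    # Sufficient: whenever the condition is present, the outcome is present.
    return all(outcome == 1 for cond, outcome in cases if cond == 1)

def is_necessary(cases):
    # Necessary: whenever the outcome is present, the condition is present.
    return all(cond == 1 for cond, outcome in cases if outcome == 1)

print("sufficient:", is_sufficient(cases))  # False: (1, 0) is a counter-example
print("necessary:",  is_necessary(cases))   # False: (0, 1) is a counter-example
```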
3. Methods combining…
• Cross-case analysis (“Quant”)
  B. Pattern finding: attributes associated with outcomes
• Within-case knowledge and analysis (“Qual”)
  A. Defining the boundaries of the cross-case analysis
    • Based on “loose” theories of what is happening
  C. Investigating the results of cross-case analysis
    • Looking for causal mechanisms within individual cases
A. Defining the boundaries
• The problem: too many attributes vs cases
  • Predictive analytics: the “curse of dimensionality”
  • QCA: the problem of “limited diversity”
• Solutions (both can be used)
  • Feature selection algorithms, used in data mining (see the sketch below)
    • E.g. minimise correlations between attributes
    • Genetic algorithm built into an Excel app.
  • Choices of sub-sets of attributes informed by “loose” theory
    • E.g. two-step or nested analyses in QCA
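A minimal sketch of the first kind of solution, correlation-based feature selection (illustrative threshold and column names; not the EvalC3 or data-mining implementations):

```python
# Correlation-based feature selection sketch: drop one attribute from each
# highly correlated pair, keeping a smaller, less redundant attribute set.
# Assumes a pandas DataFrame `data` of binary attributes plus an outcome column.
import pandas as pd

def select_features(data: pd.DataFrame, outcome: str = "outcome_achieved",
                    max_corr: float = 0.8) -> list[str]:
    attributes = [c for c in data.columns if c != outcome]
    corr = data[attributes].corr().abs()
    keep = []
    for col in attributes:
        # Keep the attribute only if it is not strongly correlated with one already kept.
        if all(corr.loc[col, kept] < max_corr for kept in keep):
            keep.append(col)
    return keep

# Example usage with the small illustrative data set defined earlier:
# print(select_features(data))
```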
A range of loose Theories of Change
Categories of project attributes (52>7)
Types of project outcomes
B. The pattern finding challenge
• A huge number of possibilities remain even after using loose theories to focus our attention
• A simple data set with 15 project attributes has 2^15 = 32,768 possible combinations
  • Which of these will provide the best predictors of the outcomes we are interested in?
  • How do we search this space to find out?
  • How do we evaluate the results?
6+ types of search strategies
1. Theory-led hypothesis testing
  • Prior research and theory increase the chances of useful findings
  • But narrow in focus, because testability requires specifics
  • High likelihood of missing unexpected solutions
  • But can be speeded up with a simple Excel app*
2. Exhaustive search (see the sketch below)
  • Every possibility is tested, but time consuming
  • Feasible with many small data sets / binomial data
  • Can be speeded up with R or an Excel app
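A minimal sketch of an exhaustive search (assuming the small binary cases-by-attributes table sketched earlier): every combination of attributes is scored as a conjunctive “all attributes present” prediction rule and ranked by accuracy.

```python
# Exhaustive search sketch: test every combination of attributes as a
# conjunctive prediction rule ("outcome present when all these attributes
# are present") and rank the combinations by prediction accuracy.
from itertools import combinations
import pandas as pd

def exhaustive_search(data: pd.DataFrame, outcome: str = "outcome_achieved"):
    attributes = [c for c in data.columns if c != outcome]
    results = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(attributes, size):
            predicted = (data[list(combo)] == 1).all(axis=1).astype(int)
            accuracy = (predicted == data[outcome]).mean()
            results.append((combo, accuracy))
    # Highest accuracy first; with 15 attributes this is 2**15 - 1 candidate rules.
    return sorted(results, key=lambda r: r[1], reverse=True)

# Example: print the five best-performing attribute combinations.
# for combo, acc in exhaustive_search(data)[:5]:
#     print(combo, round(acc, 2))
```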
Algorithmic searches
3. QCA (Quine-McCluskey) *
  • Origins in political science, now also used for evaluation
  • Technically demanding method to master and to communicate
  • Specialist software available (fs/QCA, etc.)
4. Decision Tree algorithms*** (see the sketch below)
  • Origins in data mining, more widely used, but not yet in evaluations
  • Easy to read visual display of results
  • Open source software available (RapidMiner)
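For illustration, a decision tree can be fitted to a binary cases-by-attributes table with scikit-learn; this is a stand-in for the RapidMiner workflow mentioned above, with hypothetical column names:

```python
# Decision tree sketch: fit a small classification tree to binary project
# attributes and print the learned rules as an indented text tree.
import pandas as pd
from sklearn.tree import DecisionTreeClassifier, export_text

data = pd.DataFrame(
    {
        "local_partner":     [1, 1, 0, 1, 0, 0, 1, 0],
        "capacity_building": [1, 0, 0, 1, 1, 0, 1, 1],
        "rural_context":     [0, 1, 1, 1, 0, 0, 1, 1],
        "outcome_achieved":  [1, 1, 0, 1, 0, 0, 1, 0],
    }
)

features = ["local_partner", "capacity_building", "rural_context"]
tree = DecisionTreeClassifier(max_depth=3, random_state=0)
tree.fit(data[features], data["outcome_achieved"])

# Text version of the tree: each branch tests whether an attribute is
# present (> 0.5) or absent, and each leaf predicts the outcome class.
print(export_text(tree, feature_names=features))
```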
Example of QCA results display
Outcome 1: Users use ICTs to report rural water supply functionality to the local government
(A*B*C*D*E)+(A*B*c*D*E)+(a*B*c*d*e) = Outcome present
(a*b*c*d*E)+(A*B*c*d*E) = Outcome absent
Source: Testing the Waters: A Qualitative Comparative Analysis of the Factors Affecting Success in Rendering Water Services Sustainable Based on ICT Reporting, Itad, 2015.
The model uses 5 of the 9 attributes.
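In this notation upper case means a condition is present, lower case means it is absent, * is logical AND and + is logical OR. A minimal sketch of checking whether a case satisfies such an expression (hypothetical condition values):

```python
# Evaluate QCA-style configurations: upper case = condition present,
# lower case = condition absent, "*" = AND, "+" = OR.

def matches(configuration: str, case: dict[str, bool]) -> bool:
    """True if the case satisfies at least one term in the expression."""
    for term in configuration.replace("(", "").replace(")", "").split("+"):
        literals = term.split("*")
        if all(case[lit.upper()] == lit.isupper() for lit in literals):
            return True
    return False

# Hypothetical case: conditions A, B, D, E present, C absent.
case = {"A": True, "B": True, "C": False, "D": True, "E": True}
print(matches("(A*B*C*D*E)+(A*B*c*D*E)+(a*B*c*d*e)", case))  # True, via A*B*c*D*E
```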
Decision Tree results display
Nodes = project attributes
Branch values = attribute present or absent
Leaves = associated outcomes (higher effectiveness)
1.0 = predicted present; 2.0 = predicted absent
Blue = observed present; Red = observed absent
Data from the Civil Society Challenge Fund, managed by Triple Line.
Algorithmic searches
5. Genetic algorithms**
• Widely used to solve business and engineering optimisation problems, but not at all in evaluations so far
• Available as a free add-in to Excel. Easy to use.
• Can triangulate results from QCA and/or Decision Trees
• Can discover new/alternate solutions
  • E.g. 3 rather than 5 configurations in the Water Aid data, using 2 vs 5 conditions
Solver Excel add-in
OBJECTIVE: the value that needs to be maximised, e.g. accuracy of prediction
VARIABLES: the values that can be varied, e.g. project attributes to be present or absent
CONSTRAINTS: on how project attributes can be varied
SOLUTIONS: for 60 x 20 data sets, usually found within 1 minute
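As a rough illustration of the same search in code (a simple genetic algorithm rather than the Excel Solver add-in; population size, mutation rate and helper names are assumptions), the genome is a subset of attributes and fitness is the accuracy of the “all attributes present” prediction rule:

```python
# Simple genetic algorithm sketch: evolve a subset of attributes whose
# "all present" rule best predicts the outcome (fitness = accuracy).
import random
import pandas as pd

def fitness(data: pd.DataFrame, genome: list[int], attributes: list[str],
            outcome: str = "outcome_achieved") -> float:
    chosen = [a for a, bit in zip(attributes, genome) if bit]
    if not chosen:
        return 0.0
    predicted = (data[chosen] == 1).all(axis=1).astype(int)
    return (predicted == data[outcome]).mean()

def evolve(data: pd.DataFrame, outcome: str = "outcome_achieved",
           pop_size: int = 30, generations: int = 50, mutation_rate: float = 0.1):
    attributes = [c for c in data.columns if c != outcome]
    population = [[random.randint(0, 1) for _ in attributes] for _ in range(pop_size)]
    for _ in range(generations):
        scored = sorted(population,
                        key=lambda g: fitness(data, g, attributes, outcome),
                        reverse=True)
        parents = scored[: pop_size // 2]                 # selection: keep the fitter half
        children = []
        while len(children) < pop_size - len(parents):
            a, b = random.sample(parents, 2)
            cut = random.randrange(1, len(attributes))    # single-point crossover
            child = a[:cut] + b[cut:]
            child = [1 - bit if random.random() < mutation_rate else bit
                     for bit in child]                    # mutation
            children.append(child)
        population = parents + children
    best = max(population, key=lambda g: fitness(data, g, attributes, outcome))
    return [a for a, bit in zip(attributes, best) if bit]

# Example usage with the small illustrative data set defined earlier:
# print(evolve(data))
```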
6. Ethnographic and participatory methods
• Ethnographic Decision Tree Modelling (Christina Gladwin, 1989)
  • Time-consuming development process
  • Performance is measurable and independently verifiable
• Hierarchical card sorting (Davies, 1993)**
  • Much quicker to use
  • Results in the form of a readable tree structure
  • Performance is measurable and independently verifiable
• Both tap into existing, often informal and semi-tacit, knowledge and make it more explicit and testable
Ethnographic Decision Tree example
[Figure: a model of the factors affecting farmers’ decisions on whether or not to harvest thatching materials.]
Hierarchical Card Sort example
[Figure: Comic Relief CYPAR project portfolio, 2015. Classification and prediction of their relative achievements.]
Assessing model performance
• Results of models produced by all methods can be collated and compared using a Confusion Matrix (i.e. a simple truth table)
  [Figure: blank confusion matrix; the number of cases fitting into each result category is inserted into each cell.]
• The numbers within the matrix form the basis of a menu of different performance measures (see Wikipedia)
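As a minimal sketch (hypothetical predicted and observed values), the four confusion matrix counts can be tallied directly from paired predictions and observations:

```python
# Build confusion matrix counts (TP, FP, FN, TN) from paired binary
# predictions and observations (hypothetical values for illustration).

predicted = [1, 1, 1, 0, 0, 1, 0, 0]
observed  = [1, 0, 1, 1, 0, 1, 0, 0]

tp = sum(p == 1 and o == 1 for p, o in zip(predicted, observed))  # true positives
fp = sum(p == 1 and o == 0 for p, o in zip(predicted, observed))  # false positives
fn = sum(p == 0 and o == 1 for p, o in zip(predicted, observed))  # false negatives
tn = sum(p == 0 and o == 0 for p, o in zip(predicted, observed))  # true negatives

print(f"TP={tp} FP={fp} FN={fn} TN={tn}")  # TP=3 FP=1 FN=1 TN=3
```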
Performance measure examples
• Support
  • The % of cases that a prediction rule applies to
• Prevalence
  • The % of cases which have the outcome present
• Accuracy
  • The % of cases that are True Positives and True Negatives
• Lift
  • The incidence of True Positives compared to the Prevalence of the outcome
• And many others… (a worked sketch follows this list)
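A worked sketch of these measures, computed from confusion matrix counts as the definitions above read to me (an illustration, not the EvalC3 definitions):

```python
# Performance measures computed from confusion matrix counts, following the
# verbal definitions above (support, prevalence, accuracy, lift).

def measures(tp: int, fp: int, fn: int, tn: int) -> dict[str, float]:
    total = tp + fp + fn + tn
    support = (tp + fp) / total           # % of cases the prediction rule applies to
    prevalence = (tp + fn) / total        # % of cases with the outcome present
    accuracy = (tp + tn) / total          # % of cases correctly classified
    lift = (tp / (tp + fp)) / prevalence  # precision relative to prevalence
    return {"support": support, "prevalence": prevalence,
            "accuracy": accuracy, "lift": lift}

# Example with the Comic Relief counts shown later in these slides:
print(measures(tp=14, fp=4, fn=5, tn=4))
```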
Simplicity also matters
• Defined as
  • The number of prediction rules needed to account for all outcomes, and
  • The number of attributes within all of those rules
• Why bother?
  • Simple rules would be easier to put to use in project selection or design
  • Simpler rules tend to have wider applicability
Algorithms augment but don’t replace
• Algorithms don’t always produce a unique best answer, because of
  • Limitations of the algorithm, e.g. QCA, GA
  • The nature of the data (e.g. many attributes vs cases)
  • The fact that more than one attribute can be an accurate predictor of the outcome
• Manual tweaking of predictive models helps to
  • Find simpler but equally well-performing models
  • Explore “adjacent possible” models that also do well
  • Identify the relative importance of parts of the model
C. Investigating the results of cross-case analysis
• Two steps
  • Case selection: a subject of increased interest
    • Gerring, J., Cojocaru, L., 2015. Case-Selection: A Diversity of Methods and Criteria.
    • Needs to be transparent and replicable, otherwise there is a risk of confirmation bias
  • Within-case analysis
    • Process tracing is the most cited method
    • (No further discussion here)
One case selection strategy
• Current experiment via ITAD’s macro-evaluation of Empowerment & Accountability
• Use Hamming distance to measure case similarity (see the sketch below)
  • Attributes of project A: 000110100
  • Attributes of project B: 011101100 → 5 commonalities (Hamming distance = 4)
• Two kinds of similarity measures
  • Similarity of any two cases
  • Average similarity of one case with all others
  • The case with the highest average similarity = the “modal” case
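A minimal sketch of this similarity measure and of picking the “modal” case (plain Python, hypothetical case profiles):

```python
# Hamming-based similarity sketch: count shared attribute values between
# binary case profiles, then pick the case with the highest average
# similarity to all others (the "modal" case).

cases = {
    "Project A": "000110100",
    "Project B": "011101100",
    "Project C": "000100100",
    "Project D": "111101101",
}

def similarity(a: str, b: str) -> int:
    # Number of positions with the same value (profile length minus Hamming distance).
    return sum(x == y for x, y in zip(a, b))

def modal_case(cases: dict[str, str]) -> str:
    def avg_similarity(name: str) -> float:
        others = [similarity(cases[name], profile)
                  for other, profile in cases.items() if other != name]
        return sum(others) / len(others)
    return max(cases, key=avg_similarity)

print(similarity(cases["Project A"], cases["Project B"]))  # 5 commonalities
print(modal_case(cases))
```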
Use the Confusion Matrix to find and compare 3 types of modal cases
• Select a modal True Positive case
  • To find any likely causal mechanisms connecting the conditions that make up the configuration
• Select a modal False Positive case
  • Given the presence of the same configuration of conditions, one would expect the SAME mechanism to be present BUT some other factors blocking it from working, i.e. from delivering the outcome
• Select a modal False Negative case
  • Given the absence of the same configuration of conditions, one would NOT expect the same causal mechanism to be present
In Summary: When to use what?
• Ethnographic / participatory methods
  • When you want to understand/explore/test the theories of specific stakeholders
  • When there is no data set readily at hand
• QCA
  • When there is a reasonably well-developed Theory of Change
  • When the data set has sufficient diversity
In Summary: When to use what?
• Decision Tree algorithms
  • When easily communicable results are needed
• Genetic algorithms
  • When a quick exploration for alternate solutions is needed
• Exhaustive search
  • When certainty is needed that a solution is the best available
• Excel app (EvalC3)
  • When you want to tweak the results of all of the above
Where to use any of these?
• Where numerical data is hard to find
• Where experimental approaches are impractical
• Where causal complexity is likely to be high
  • Large, decentralised projects, with diverse interventions and contexts, e.g.
    • Grant-making programs
    • Participatory development programs
What to look out for
• How good is the underlying data?
  • Relevant attributes included?
  • Sufficient range of attribute values?
  • Diversity of cases?
    • Minimise redundant configurations
    • Maximise the proportion of all possible configurations covered
• Did stakeholders’ views inform the selection of attributes and outcomes for analysis?
What to look out for
• Transparency of process
  • Who participates, and how
  • What performance measures have been used
  • What search parameters have been set, e.g. with GA or Decision Trees
• Have the results been triangulated?
• Was the approach to within-case analysis systematic and transparent?
Some lessons so far?
1. “Successful models” are not distinct entities
  • The “adjacent possible” is always worth investigating
2. It’s not only success (TPs) that matters
  • Investigating False Positives and False Negatives will help improve existing models
3. Unambiguous success (no FPs or FNs) is rare
  • Find an acceptable level of accuracy and support
Surplus material hereafter
Software options
• QCA
  • See http://www.compasss.org/software.htm
  • 8 packages
• Decision Trees
  • RapidMiner Studio
• Genetic algorithms
  • Solver add-in to Excel
  • EvalC3 application under development
    • (Early adopters are welcome)
Testing the Waters: A Qualitative Comparative Analysis of the Factors Affecting Success in Rendering Water Services Sustainable Based on ICT Reporting, Itad, 2015.
The initial QCA result found 5 configurations, using 5 project attributes. But these could be simplified down to 3 configurations, using 2 project attributes.
Comic Relief results
Predicted in HCS+ model vs observed in project assessments:

                             Observed: More successful   Observed: Less successful
Predicted: More successful               14                          4
Predicted: Less successful                5                          4
Prevalence = (14+5)/(14+4+5+4) = 70%
Support = 14/(14+5) = 74%
Accuracy = (14+4)/(14+4+5+4) = 67%
Lift = (TP/(TP+FP)) / ((TP+FN)/(TP+FP+FN+TN)) = (14/(14+4)) / ((14+5)/(14+4+5+4)) ≈ 1.1
• “Prediction is very difficult, especially if it’s about the future.” Niels Bohr
• “Ninety per cent of problems have already been solved in some other field. You just have to find them.” Tony McCaffrey
More evidence, less poverty
Dean Karlan, Scientific American, October 2015
• “We must figure out what works and what does not”
But others are asking
• …what works, for whom, in what circumstances
And others should be asking
• …and how often, and how often is good enough?