Edmunds` presentation - Institute for Research and Innovation in

Workshop on Complex Systems Research
Initiative: An Introduction to Agent-Based
Modelling
Edmund Chattoe-Brown
Department of Sociology, University of Leicester, UK
http://www.simian.ac.uk
Thanks
• This research funded by the Economic and Social
Research Council of the UK (http://www.esrc.ac.uk)
as part of the National Centre for Research Methods
(http://www.ncrm.ac.uk).
• Thanks are due to Nigel Gilbert (SIMIAN Co-Director)
for the use of some training materials initially
developed primarily by him.
• Thanks to you all for inviting me!
• The usual disclaimers apply.
Plan of the workshop
• Mornings: Introductory lecture/discussion.
• The rest: Discussion, questions.
• Afternoon: Hands on, initially exploring existing
models then (?) programming.
• Generally: Your proposed research.
Plan for day 1
• The role of research methods in shaping what we
see. Examples of qualitative and quantitative
research and the need for a “third way”.
• A very brief interlude on social versus physical
science.
• A simple “running” example: The Schelling
segregation model. (Microcosm.)
• What should we learn from this example?
• Key concepts: Emergence, non-linearity, complexity.
• The distinctive methodology of ABSS/MAM and data.
Opening Thoughts
• “I suppose it is tempting, if the only tool you have is a
hammer, to treat everything as if it were a nail.” (The
Psychology of Science, Abraham Maslow)
• “When scientists and mathematicians fail to find
positive clues leading towards solutions of their
problems, they sometimes reverse their frontal
strategies and employ reductio ad absurdum, which
by a process of eliminating all the impossibles and
improbables, leaves a residue of least absurd, ergo
most plausible solutions, which may be reduced, by
physically testing to unequivocable answers.”
(Buckminster Fuller, foreword to Confessions of a
Trivialist, p. ix)
Overall goals
• To introduce a novel method (MAM/ABSS) for
understanding the social world using relevant
examples.
• To distinguish it clearly from some existing methods
and thus lay out a coherent research strategy arising
from it.
• To introduce (and provide hands on experience for)
a “typical” piece of software (NetLogo) for
implementing that method.
• To offer a “vision” of the future in research of this
kind.
What are we used to?
• I may well be talking to quite a diverse audience. I
shall try not to assume too much.
• I’ll start with sociology and we can take it from there. I
can’t always promise obvious relevance of examples
but this isn’t just laziness!
• The two main methods of representing theory in
sociology are narratives and equations.
• These are almost invariably associated with
qualitative (ethnographic) and quantitative (statistical)
analysis respectively.
• Other methods: Experiments/randomised control
trials, history, analysis of artifacts/documents,
monitoring … (Interesting!)
Example of narrative analysis
• “Turkish interviewees do not include themselves when they are evaluating
the status of ‘Turkish women’ in general. While referring to ‘Turkish
women’, most Turkish interviewees use the pronoun ‘they’:
• Turkish women are more home-oriented. I think that they are left in the
backstage because they do not have education, because they are not
given equal opportunities with men. (T3)
• One of the Turkish interviewees stated that it was difficult for her to answer
the questions related to her status ‘as a woman’, because:
• I don’t think of myself as a Turkish women, but as a Turkish person. I
mean I never think about what kind of role I have in the society as a
woman. (T1)
• Most Norwegian interviewees, on the other hand, identify with ‘Norwegian
women’ in general, and they refer to ‘Norwegian women’ as ‘we’:
• I think that in a way Norwegian women, that is we, at least have our rights
on paper. We have equal rights for education and we have good welfare
arrangements … (N1)” (Sümer, Acta Sociologica, 1998, 41(1), p. 122)
Narrative analysis pros and cons
• As rich as you want it to be.
• Crosses levels of analysis (self reports on decision
making).
• Limited at some unknown and fuzzy barrier with
psychology.
• Real dangers of subjectivity (should be “regulated” by
the method though).
• The price of that richness is that incompleteness,
ambiguity and inconsistency can exist within the
narrative and be hard to spot.
• TANSTAAFL: Rich but “expensive” to collect and
analyse, especially with observational data.
• Can it generalise?
Example of quantitative analysis
• “The most important empirical findings of this study can be
summarized as follows:
• … there is a moderate tendency for individuals with higher
service class origins to be more likely than others to enrol in PhD
programmes.
• …
• The estimated effect of class drops to zero when controlling for
parents’ education and employment in research or higher
education.
• The overall implication of these findings is that the transition
from graduate to doctoral studies is influenced by social origins
to a considerable degree. Thus, the notion that such effects
disappear at transitions at higher educational levels - due either
to changes over the life course or to differential social selection -
is not supported.” (Mastekaasa, Acta Sociologica, 2006, 49(4),
pp. 448-449.)
Quantitative analysis pros and cons
• Can’t be too rich to “solve” or “fit”.
• Mostly completely explicit (though some
methodological background may be tacit, i.e.
assumptions about distributions of data), thus
avoiding ambiguity, incompleteness and
inconsistency.
• Can it particularise?
• Hits data collection and analysis problem of
“atomisation”: “50 cases per variable” rule of
thumb in simple regression.
Aside: No theory, no data, no logo
• Example: Educational success.
• Girls and boys go through a school system, get
grades/qualifications and reach different levels.
• They may start biologically different, be
socialised differently, form different peer groups,
be selected differently into schools or subjects,
be treated differently by teachers, develop
different interests and motivations, be offered
different resources, choose differently and so on.
• All these processes unfold in parallel, in diverse
combinations for diverse individuals.
What does that mean for methods?
• If individuals are unique, we can all give up (but
there are reasons not to be so pessimistic).
• We often disagree (fruitlessly?) on where social
regularities lie: Attributes versus practices.
• Clearly gender is associated with educational
success through all these processes but the
notion of causality is much harder to apply. Why
would there be “big” patterns to find?
• Ethnography can subject tiny parts of sequences
to detailed examination (and practices should
generalise) but cannot look at the whole.
Stepping back: Levels of description
• A micro level, where individual action occurs in
an “environment”.
• A macro level (environment), which shapes and
is shaped by the micro level.
• The eminent sociologist James S. Coleman
argues that in order to explain properly, a theory
must link one level by a process description to
another. (Mechanism/middle range sociology.)
• There are grounds for arguing that, although they
may appear (or claim) to, neither statistical nor
ethnographic accounts actually do this.
Aside: Physical and social systems
• Physical systems cannot give accounts of
themselves nor respond adaptively to their
“environment”.
• They “follow” the same “laws of nature” that we
try to deduce from them. (Atoms in gas.)
• Regularities in social systems cannot be of this
kind because of reflection and adaptation.
• The unique (but fuzzy edged) domain of social
action arises from the almost unique ability of
humans to make rich models of their world
(including social science models). Marx?
Cashing this out: Segregation model
• Agents live on a square grid so each has maximum 8 neighbours.
• There are two “types” of agents (red and green) and some grid
spaces are vacant. Initially agents/vacancies distributed randomly.
• All agents decide what to do in the same very simple way.
• Each agent has a preferred proportion (PP) of neighbours of its own
kind (a PP of 0.5 means you want at least half your neighbours to be your
own kind, but you would accept all of them, i.e. PP is a minimum).
• If an agent is in a position that satisfies its PP then it does nothing;
otherwise it moves to a vacancy chosen at random.
• A time period is defined (arbitrarily) as the time it takes for each
agent (chosen in random order to avoid non robust patterns) to “take
a turn at” deciding and possibly moving.
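The rules above can be sketched in a few lines of code. This is a minimal Python sketch rather than the NetLogo model used in the workshop; the grid size, vacancy rate, number of time periods, the non-wrapping edges and the treatment of agents with no neighbours are all assumptions made for illustration.

```python
import random

def neighbours(grid, r, c):
    """Types of the agents in the (up to 8) cells around (r, c); edges do not wrap."""
    n = len(grid)
    found = []
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if (dr, dc) != (0, 0) and 0 <= r + dr < n and 0 <= c + dc < n:
                occupant = grid[r + dr][c + dc]
                if occupant is not None:
                    found.append(occupant)
    return found

def satisfied(grid, r, c, pp):
    """True if at least a share pp of the agent's neighbours are its own kind."""
    near = neighbours(grid, r, c)
    if not near:  # assumption: an agent with no neighbours is content
        return True
    same = sum(1 for t in near if t == grid[r][c])
    return same / len(near) >= pp

def mean_similarity(grid):
    """Average share of like neighbours, over agents that have neighbours."""
    n = len(grid)
    shares = []
    for r in range(n):
        for c in range(n):
            near = neighbours(grid, r, c)
            if grid[r][c] is not None and near:
                shares.append(sum(1 for t in near if t == grid[r][c]) / len(near))
    return sum(shares) / len(shares)

def make_world(size, vacancy_rate, rng):
    """Create the world: red and green agents and vacancies (None) at random."""
    return [[None if rng.random() < vacancy_rate else rng.choice(["red", "green"])
             for _ in range(size)] for _ in range(size)]

def tick(grid, pp, rng):
    """One time period: each agent, in random order, decides and possibly moves."""
    n = len(grid)
    agents = [(r, c) for r in range(n) for c in range(n) if grid[r][c] is not None]
    rng.shuffle(agents)
    moved = 0
    for r, c in agents:
        if satisfied(grid, r, c, pp):
            continue
        vacancies = [(i, j) for i in range(n) for j in range(n) if grid[i][j] is None]
        if not vacancies:
            continue
        i, j = rng.choice(vacancies)
        grid[i][j], grid[r][c] = grid[r][c], None
        moved += 1
    return moved

def run(pp, size=20, vacancy_rate=0.1, ticks=30, seed=1):
    """Return mean similarity before and after running the model."""
    rng = random.Random(seed)
    grid = make_world(size, vacancy_rate, rng)
    before = mean_similarity(grid)
    for _ in range(ticks):
        if tick(grid, pp, rng) == 0:  # nobody wants to move: stop early
            break
    return before, mean_similarity(grid)
```

Calling `run(0.5)` returns the average share of like neighbours before and after; comparing the two numbers for different values of `pp` is a crude numerical stand-in for looking at the clusters on screen.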
Marker
• I’m going to show you exactly how the computer
does this before too long.
• In a nutshell, the description amounts to:
Create the world.
Do some things to each agent and repeat.
Initial random state
Clustering
Aside: This is a NetLogo “world window”.
Two questions
• What is the smallest PP (i.e. a number
between 0 and 1) that will produce clusters?
• What happens when the PP is 1?
Answers
• About 0.3.
• No clusters form.
• Revisit 1: Had you “seen” the cluster data
generated by PP=0.3, might you (if of a
particular political or sociological persuasion)
have attributed xenophobia to the system?
• Reflection: Is PP=0.1 behaviourally
indistinguishable in cross section from PP=1?
Problem?
Why and so what?
• Because PP is a minimum, people are always happy
“inside” a cluster of their own kind.
• If a cluster is “full” (no internal vacancies) then it
cannot be disrupted.
• Whether clusters form depends on whether their
shape is compatible with the PP for each “edge
agent”. (No “sharp corners” possible: Minimum size?)
• When PP is 1, no shape of the cluster edge is
compatible with the satisfaction of edge agents so the
cluster cannot form.
• An aggregate entity (the cluster) thus becomes a
structuring principle for individuals.
Simple individuals/complex system
[Figure: “Individual Desires and Collective Outcomes”. x-axis: % Similar Wanted (Individual); y-axis: % Similar Achieved (Social); plotted series: % similar and % unhappy.]
Counter-intuitive macro (social) results from simple micro interactions. A non-linear (and complex) system.
A vision: To be revisited/expanded
• Simulation is a “macroscope” (or
“complexoscope”) because it allows us to “see”
complexity in a way that is similar to the way that
a microscope allows us to see very small things.
• The explicit process specification (that should
mirror real social processes) shows us why
existing methods have difficulty linking micro and
macro levels. The “process” in a statistical model
is just the equation system linking variables. In
qualitative research there may be no such
process. (The reasons why are interesting and
puzzling.)
Connection 1: Data and methods
• This is a patently unrealistic model: Identical decisions,
random movement, no housing market, no schools or jobs
to attend to. (I chose it deliberately!)
• How, broadly, would it be made more realistic?
• Using qualitative methods to study neighbourhoods,
perceptions and decision processes.
• Using quantitative methods to compare (in some sense)
the simulated clusters with some real ones. Does this look
anything like residential patterns by ethnicity in Toronto?
How like? (I’ll return to this.)
• Existing research methods are used in ways that are
clearly different but certainly not unrecognisable.
Connection 2: Explanation
• It is the simulation that links the interplay of
situated micro processes (choosing agents with
neighbours) with macroscopic patterns (clusters).
• A social theory is thus represented neither as a
narrative nor as a set of equations but as a computer
programme. (Coleman is happy!)
• The rigour of quantitative research is retained
(complete specification) but the behaviour only
needs to be “generated” not “solved” or “fitted” so
can be of arbitrary sophistication. (I’ll show this.)
• If we can “generate” something then we have
explained it. (Methodology hazard!)
Connection 3: Complexity concepts
• Complexity: “Rich” patterns (here, non-linearity for
example) do not need to come from “rich” agents
or “rich” interactions. They can arise from simple
interactions between simple agents. World view?
• Emergence: The need to use categories at one
level of description that do not make sense at
another. (You cannot have a one agent cluster or
a one car traffic jam.)
• Non-linearity: We cannot assume things we often
do assume (large effects imply large causes,
similar effects have “close” causes).
Informal thoughts on methodology
• These will be made more rigorous later.
• Generally, don’t use MAM/ABSS to “explain” a straight
line. The idea of “over fitting” (and Occam’s Razor)
applies but we can’t formalise it as we can in statistics.
• We need to worry about “how many” simulations can
match a given real system. This is our “leap of faith”.
• We discover this by general experience (clustering,
Power Law, S-shaped innovation curve) and address it
by “bar raising” and choice of research question/model.
• Some of these issues arise not from weaknesses in the
methodology itself but from the fact it is still being
established. (Equating poor methodology and poor
practice is a defence mechanism against novelty.)
Similarity in the Schelling model
• A two (three?) state system.
• Hollow versus full clusters, direct red/green interfaces versus
vacancy buffers. (Vacancy chains idea.)
• Exact match to Toronto?
• Cluster sizes of correct distribution (but no location stability
across runs?)
• Cluster “shapes” correct?
• “There are clusters”: Actually pretty weak.
• Now consider 3 types: Separated versus concentric clusters. The
latter is much more discriminating.
• Or, what is internal structure of clusters with regard to PP? (Most
tolerant at edges?)
• Naïve (but useful?) notion: Ratio of possible world states to
states compatible with your theory as measure of “power”.
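The naïve “power” ratio in the last bullet can be made concrete on a toy world. The sketch below is an illustration only: it assumes a one-dimensional ring of 8 cells with two types and no vacancies (not the two-dimensional Schelling grid), and the two “theories” are hypothetical stand-ins for “there are clusters” and a more demanding claim about the pattern.

```python
from itertools import product

def transitions(ring):
    """Number of places on the ring where neighbouring cells differ in type."""
    n = len(ring)
    return sum(1 for i in range(n) if ring[i] != ring[(i + 1) % n])

def has_some_cluster(ring):
    """Weak theory: at least one pair of like neighbours exists."""
    return transitions(ring) < len(ring)

def two_blocks(ring):
    """Stronger theory: exactly one red block and one green block."""
    return transitions(ring) == 2

def power(n, compatible):
    """Return (possible states, compatible states, ratio) for a ring of n cells."""
    states = list(product("RG", repeat=n))
    allowed = sum(1 for s in states if compatible(s))
    return len(states), allowed, len(states) / allowed

if __name__ == "__main__":
    for theory in (has_some_cluster, two_blocks):
        total, allowed, ratio = power(8, theory)
        print(theory.__name__, total, allowed, round(ratio, 2))
```

On this toy ring, 254 of the 256 possible states contain “some cluster” (ratio about 1.01), while only 56 are fully separated into two blocks (ratio about 4.6). This is the sense in which “there are clusters” is a weak test.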
Richness in the Schelling model
• Emphasis (so far) on spatial pattern.
• What about “biography” or “history” of agents?
• What are effects of in and out migration to produce
a dynamic rather than static equilibrium?
(Convergence as an “artefact” or a finding?)
• What are the distributions of any heterogeneous
parameters (PP for example) with respect to
clusters?
• Very loose idea: Can we “fit” on some comparisons
of real and simulated data and then “explain” on
others? (Hazard warning: We don’t know how
orthogonal different “aspects” - like biographies and
clusters - are.)
A speculation
• Some research methods must be “radical
innovations” (rather than just “more of the
same”).
• If MAM/ABSS is such an RI, what follows?
Humility needed!
• Possible evidence of MAM/ABSS as an RI is its
ability/requirement to reuse existing data and
draw attention to novel data previously ignored.
• But this casts doubt on the “origins” of the
Schelling model: “If I were you, I wouldn’t start
from here at all”.
How to start with MAM/ABSS?
• Think of it just like research design.
• What (one sentence?) are we trying to do/explain?
How does phosphorus “move” around Lake
Simcoe? How best can we make it do something
different?
• Why did we pick this method? (In some sense the
reasons have already been given, but we need to defend
against the claims of existing methods, i.e. don’t
“explain a straight line”.)
• What is known? (TANSTAAFL again: Not just in
all the relevant domains but in MAM/ABSS too!)
Why do all this work?
• Read a few articles at random.
• Make a set of weakly grounded assumptions.
• Build a model: Throw in a few more invisible
assumptions so it can’t be replicated.
• Play with the model, get “results” and publish in
an enclave simulation journal.
• Defend your arbitrary assumptions to the death
against others with equally arbitrary ones. Avoid
collecting data to decide.
• Be ignored by domain experts. Ignore them.
• Wait for MAM/ABSS to become a footnote in
social science history.
Plan for day 2
• Going from informal methodology to formal. How
to turn these general guidelines into a plan for a
research project.
• What relevant parts of NetLogo do we need to
know about and (broadly) how do they work?
Developing the vision 1
• MAM/ABSS is new, which offers huge opportunities for
innovation and originality. I can “offer you” whole
social science disciplines with barely any models.
• The price we pay is that we cannot yet fall back on a
widely agreed “normal science”.
• We have to raise our own standards “from within”
without wrecking the community.
• We have to “try harder” to convince the rest of the
world.
• We have to manage the “us or them” boundary
especially carefully.
• You may decide (quite reasonably) that you want to
“come back later”.
Developing the vision 2
• It can be done: We are looking for “win win” ideas.
• Social networks have an enormous number of potential
characterisations courtesy of existing Social Network
Analysis. If you can “generate” simulated networks that
look like real networks according to many of these
(potentially orthogonal) characterisations, you are really
on to something.
• First win: Tools (Which n measures of social networks
make the most effective index for similarity?)
• Second win: Perspective (What can we learn if we treat
existing SNA data as a sample rather than a bunch of
“ethnographically unique” case studies?)
Formalising the methodology 1
The “Gilbert and Troitzsch Box”
Formalising the methodology 2
• Choice of target: Clear research question. Avoid
TOE: Theory of Everything. (Geographer example
from Borges.)
• Choice of target: Research question, theory (or
theories), process in unknown environment, model.
• Process of abstraction: Start from key “stylised
facts” in domain. (Class example. Citation test.)
• Process of abstraction: Not all abstractions are
equally “harmful”. (“The assumptions you don’t
realise you are making are the ones that will do you
in”. Compare existing methods? More later.)
Formalising the methodology 3
• Similarity: Already raised. How high can you go?
Transparency and replication? Do it yourself like
Darwin? (Commenting code.)
• Identification of novel data requirements may
reintroduce the really strong falsification test that
is so appealing in physical science (Einstein and
Mercury perihelion, position of Pluto conditional
on theory of gravitation being true). Can’t do this
with statistics because model fitting requires all
data “up front”.
• More on methodology later as needed.
MAM/ABSS abstraction example
• A return to the Turkish/Norwegian women.
• Is it significant that I sometimes say “we” and sometimes “they?”
• Perhaps groups behave in certain ways and I either wish to
behave in that way or in some other.
• Suppose there are a number of “roles” that prescribe actions in
different social settings.
• I may choose a role (self interest?) but will I be accepted in it?
(White Rastafarians.) It depends how I behave. Maybe the role I
am “put in” most often shapes how I see the world and how it
satisfies me. (Role strain? Roles are two sided.)
• A dynamic between behaviours, roles and interests? How does it
unfold? Do roles mutate?
• Reality check: what do we need to know here? Very broadly, how
people behave, how they think they ought to behave and how
they feel about it. “Killer” app? Women and work?
Back to no theory, no data, no logo
• Phosphorus and Lake Simcoe: A blanket apology in advance.
• Surprising how often we are “following stuff” around a system whether the
stuff is dirty syringe needles, phosphorus or gazelles.
• Goal is water phosphorus levels not much above the natural “carrying
capacity”, a huge reduction over a short period. “Instant attention” of policy
makers? What else are people doing with this? (NHS epidemics example.)
• Where does the phosphorus come from, how does it move or “stick” and
what removes it from the area of study?
• Set of “phosphorus actors” and “P actions”: Fertilisation by golf courses and
farmers, waste water run off from residential areas, sewage works,
manufacturing, other. Levels on rivers and open water are the key
measurement points. Abstract by not modelling dog walkers … Padding?
• Exogenous processes: Air pollution from other regions, outflow from the
study region, natural “leaching” from some patches perhaps? An
“accounting” approach based on overview of existing knowledge?
NOTNODNOL 2
• Physical processes: Can phosphorus be absorbed or naturally converted
at some locations up to some level? How does it behave in ponds and
lakes? Does it “coat” patches? (Relatively easy to “split” raindrops?)
• What “can” we do? “PhosLok?”, taxes and subsidies, dredging/scrubbing,
prohibitions and enforcement, “giving up” on some rivers and making them
“sewers”, relocation, drains and changes to wastewater management. Out
of the box thinking (Perhaps motivated by the simulation itself: What if we
could move this lake?) and collecting the union of suggestions from
stakeholders and feeding them back iteratively.
• Back to Buckminster Fuller: Are there solutions that appear to cost
impossibly much or are simply unacceptable to all but one stakeholder? Do
some proposed solutions simply appear not to work? Interesting question:
How much does it matter if the model is “wrong” in comparing the relative
costs of different strategies?
• How “social” a model is this? Do we need to model how the local community
forms advice networks or shops for groceries or just how they allocate crops
to fields and decide when to feed their lawns? (This is why having a clear
research question matters: Does this move phosphorus?)
Getting started
• We now have a pretty good NOTNODNOL blueprint
for our literature review (and phosphorus is a pretty
good search term!)
• We also have some notion of what kind of team
“leaders” we might need (hydrologist/chemist, some
sort of social scientist/community studies person and
modeller). Models as common language.
• We are looking for physical models, problem regions,
management strategies, relevant social science on
behaviour change in particular groups. (Don’t close in
too soon though: What other water run off product
problems are there?)
The world
• Patches and attributes: Altitude, water held,
surface water on patch to flow away, even cloud
saturation above? (Don’t dismiss “kludges”.)
• Rules of “transfer”: Water downhill, surface water
by patch permeability, surface water by patch
surface, clouds by (exogenous?) wind direction.
• I don’t know how much of this is “known” or how
existing models “translate” to this level of
abstraction. I know some atmosphere models do
do this!
• Some aspects (water flow) are likely to be good
approximations (pooling) at low cost.
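As a rough illustration of the “water downhill” rule and of pooling, here is a minimal Python sketch (not the NetLogo Grand Canyon model). The elevation values are made up, and the rule that a drop always moves to its lowest neighbouring patch until none is lower is an assumption about how such a transfer rule might be written.

```python
from collections import Counter

# Hypothetical altitudes for a 4 x 4 world; lower numbers are lower ground.
ELEVATION = [
    [9, 8, 7, 8],
    [8, 6, 5, 7],
    [7, 5, 2, 6],
    [8, 6, 4, 5],
]

def lowest_neighbour(elevation, r, c):
    """Lowest of the 8 surrounding patches if it is below (r, c), else None."""
    rows, cols = len(elevation), len(elevation[0])
    best = None
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            rr, cc = r + dr, c + dc
            if (dr, dc) == (0, 0) or not (0 <= rr < rows and 0 <= cc < cols):
                continue
            if elevation[rr][cc] < elevation[r][c]:
                if best is None or elevation[rr][cc] < elevation[best[0]][best[1]]:
                    best = (rr, cc)
    return best

def follow_raindrop(elevation, start):
    """Path of a drop that keeps moving downhill until no neighbour is lower."""
    path = [start]
    while True:
        nxt = lowest_neighbour(elevation, *path[-1])
        if nxt is None:
            return path
        path.append(nxt)

def pooling(elevation):
    """Count how many drops (one per patch) end up on each resting patch."""
    ends = Counter()
    for r in range(len(elevation)):
        for c in range(len(elevation[0])):
            ends[follow_raindrop(elevation, (r, c))[-1]] += 1
    return ends
```

With these made-up altitudes, `follow_raindrop(ELEVATION, (0, 0))` gives the path `[(0, 0), (1, 1), (2, 2)]` and every drop pools on patch `(2, 2)`. Permeability, surface type and wind would be further rules layered on top of this one.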
Relevant NetLogo
• Earth Sciences (Grand Canyon).
• file-open "realplacedata.txt" (This file is just a list
of altitudes extracted from other data. Not an NL
issue.)
• let patch-elevations file-read
• file-close
• Note: The patch-elevations variable comes
directly from creation of elevation variable in
patches-own. NL does the mapping for you.
• See also how this programme makes buttons for
“tracking” raindrops work.
Aside
• Back to the “us and them problem”.
• Except with policy makers/funders “in charge”
(who have to be handled with tact), it is not
enough to say that a phenomenon exists to
require a model redesign. This is “death by
detail”.
• There must be data and reasonable grounds
(perhaps from other studies) for thinking that the
effect “matters”. Clear research designs are also
defensible.
• This is far from trivial. (Example of SNA and
large scale survey data.)
What about “brains?”
• Schelling agents had decision processes based
on observation but they didn’t have “memories”
or “practices” to draw on in alternative situations.
• Mostly, agent brains are represented as sets of
“if then” rules, partly for interpretability and partly
for data access. (Other possibilities exist if
needed like “learning systems”.)
• Like most programming languages NL has “data
structures”. For example, lists representing the x,
y co-ordinates of my “required” daily activities.
• Example: Social Science (El Farol).
Doing things to lists
• set foo (list (random 10) (random 10) 7 2)
• set foo (list (list 0 0 0) (list 1 1 1))
• set foo but-first foo [Also but-last: Past
behaviours being forgotten.]
• if empty? foo [ do-thing ]
• set foo filter [? < 3] [1 2 4 5 6 8 2] (From NetLogo 6: filter [x -> x < 3] [1 2 4 5 6 8 2].)
• set foo fput 2 [3 4 5]
• set bar (item 2 [2 3 4 5]) (Note, starts from 0.)
• set foo (replace-item 2 [2 3 4] 15)
• Look at NetLogo Dictionary in Help.
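For readers who know Python better than NetLogo, the list operations above have close equivalents. This is a side-by-side sketch, not NetLogo code; the variable names other than foo and bar are chosen only for illustration.

```python
import random

# set foo (list (random 10) (random 10) 7 2)
foo = [random.randrange(10), random.randrange(10), 7, 2]

# set foo (list (list 0 0 0) (list 1 1 1))
nested = [[0, 0, 0], [1, 1, 1]]

# but-first and but-last drop one item from either end
without_first = nested[1:]
without_last = nested[:-1]

# if empty? foo [ do-thing ]
def is_empty(items):
    return len(items) == 0

# filter: keep only the items smaller than 3
small = [x for x in [1, 2, 4, 5, 6, 8, 2] if x < 3]

# fput 2 [3 4 5]: put an item on the front
with_front = [2] + [3, 4, 5]

# item 2 [2 3 4 5]: positions start from 0 in both languages
bar = [2, 3, 4, 5][2]

# replace-item 2 [2 3 4] 15: build a new list, leaving the original alone
original = [2, 3, 4]
replaced = original[:2] + [15] + original[3:]
```

As in NetLogo, each of these operations produces a new list rather than changing the old one in place.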
Aside
• Strings look rather like lists but can mix words,
numbers and punctuation.
• A nifty trick (as in LISP) is to “execute” strings as
NL code. (The run and runresult primitives do this;
read-from-string only reads literal values.) Suppose
you want an agent to act by if … then … rules. If you
put these in procedures they are “hard coded” for
each run. If what you actually want is for agents to
be able to change their set of practices (borrowing
from others or deleting failed rules), store the rules
as strings (probably a list of strings) and execute
them one at a time in each situation.
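The idea of rules held as data rather than hard coded can be sketched as follows. This is a Python illustration, not NetLogo: the "condition -> action" rule format, the Agent class and the example rules are all hypothetical, and Python's eval stands in for executing a string as code (it should not be used on strings from untrusted sources).

```python
class Agent:
    """An agent whose 'brain' is a list of rule strings it can change at run time."""

    def __init__(self, rules):
        self.rules = list(rules)  # each rule looks like "condition -> action"

    def act(self, situation):
        """Return the action of the first rule whose condition holds."""
        for rule in self.rules:
            condition, action = (part.strip() for part in rule.split("->"))
            # Evaluate the condition string against the current situation only.
            if eval(condition, {"__builtins__": {}}, dict(situation)):
                return action
        return "do-nothing"

    def borrow_from(self, other):
        """Copy any rules the other agent has that this agent lacks."""
        for rule in other.rules:
            if rule not in self.rules:
                self.rules.append(rule)

    def forget(self, rule):
        """Delete a rule, for example one that has failed."""
        self.rules.remove(rule)


if __name__ == "__main__":
    farmer = Agent(["rain > 5 -> stay-indoors"])
    neighbour = Agent(["soil_p < 3 -> fertilise"])
    today = {"rain": 1, "soil_p": 2}
    print(farmer.act(today))          # no rule fires
    farmer.borrow_from(neighbour)
    print(farmer.act(today))          # the borrowed rule now fires
```

Because the rules are ordinary strings in a list, borrowing and forgetting are just list operations, which is the point of not putting the rules in procedures.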
Communication
• Once agents have “brains”, communication and
imitation fall out very naturally.
• Examples: Reputation in the Prisoner’s Dilemma,
the Gilbert and Troitzsch “shopping agents”.
• Warning! Don’t let your model develop feature
creep. This is not a model of how we diffuse
better practices in communities. We only want to
know what happens if we change the distribution
of behaviours.
• Is a farmer just a “ghostly presence” floating over
a farm?
A Problem
• How to make systematic use of past data?
• If someone else read what you read, how similar
would their model be? Defensibility?
• Idea of inductive coding in qualitative research:
What do papers “talk about” and “how much?”
(More tomorrow?)
• Sources: Literature reviews as a first cut, the
“raw literature” once you have some structuring
ideas (NOTNODNOL models are useful here),
experts, stakeholder interviews, social science
“common sense”.
“Version 0”
• The simplest model you can think of that
addresses the problem, contains all the “boxes”
(key processes) and actually works. Small?
• Now we can say more about “safe” abstractions
and forward development. Having a fixed “lay
down” for phosphorus across all patches is
almost certainly wrong but, within the
development framework, is trivial to fix in version
1. By contrast, a network free model would be
awful to “fix” in the next version (assuming
networks were needed, they may not be here).
Uses of version 0
• Learn the skills.
• Build the team and get conversation going.
• Get wider input.
• Show potentially interested partners/funders.
• Scope additional data requirements. (Sensitivity
analysis “starting from where you are”.)
• Example: Do I need “real” weather? Suppose I have a
“certain amount” of rain to distribute. How much (and
in what way) does it matter if I distribute it randomly, in
“lumps” or in “lumps by altitude?” You can “fake” this
before deciding whether to “invest”.
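The weather example can be “faked” with three small functions that hand out the same total amount of rain in different ways. This is a Python sketch under stated assumptions: the function names, the equal-sized drops and lumps, and the use of altitude as a simple sampling weight are illustrative choices, not part of any existing model.

```python
import random

def rain_random(n_patches, total, drops, rng):
    """Scatter equal-sized drops over patches chosen uniformly at random."""
    rain = [0.0] * n_patches
    for _ in range(drops):
        rain[rng.randrange(n_patches)] += total / drops
    return rain

def rain_lumps(n_patches, total, lumps, rng):
    """Put all the rain on a few distinct, randomly chosen patches."""
    rain = [0.0] * n_patches
    for p in rng.sample(range(n_patches), lumps):
        rain[p] += total / lumps
    return rain

def rain_lumps_by_altitude(altitudes, total, lumps, rng):
    """As rain_lumps, but higher patches are more likely to be hit (with replacement)."""
    rain = [0.0] * len(altitudes)
    for p in rng.choices(range(len(altitudes)), weights=altitudes, k=lumps):
        rain[p] += total / lumps
    return rain

if __name__ == "__main__":
    rng = random.Random(0)
    altitudes = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]  # hypothetical patch altitudes
    print(rain_random(10, 100.0, 50, rng))
    print(rain_lumps(10, 100.0, 5, rng))
    print(rain_lumps_by_altitude(altitudes, 100.0, 5, rng))
```

Running version 0 with each of the three in turn, and comparing the outputs you care about, gives a first indication of whether “real” weather data is worth the investment.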
Throwing it back
• From “here”, what are the problems you
envisage with SimCoeSim (or other models like
homeless epidemiology and urban land use?)
• How to do specific things in programming?
• How to represent certain processes or abstract
them?
Plan for day 3
• Wrapping up the methodological outline.
• Some “short takes” on the state of the art in
various respects.
• Some passing reflections on “large scale”
research across disciplines.
• Avoiding bad practice in MAM/ABSS. Getting
“through” to publication or effective policy advice.
More methodology: Parameters
• Avoid unmeasurable parameters generally. (But
allow for considerable creativity in research
methods: Firefighter example.)
• Not all measurable parameters need yet be
measured for scientific status. Unproblematically
measurable is better.
• “Quality” of models depends on progressive and
iterative refinement of values by “significance”.
• Too many unanchored parameters make a model
capable of anything. Searching the parameter
space from scratch is impossibly time consuming
unless you start from plenty of “best guesses”.
More on parameters
• Keep a parameter log with current value and
rationale. Look for “the weakest link” and
delegate. Some “parameters” are innocuous like
divisions of colour scales.
• Hypothesis: In principle, a model should have no
tunable parameters except those susceptible to
policy. Here, the journey is more important than
the destination.
• Defensibility is important here. Critics of the
model can’t just say a parameter value is
“wrong”. They have to say what is better, why
and, ideally, how much it matters.
Is this achievable?
• Abdou, Mohamed and Gilbert, Nigel (2009)
Modelling the Emergence and Dynamics of
Workplace Segregation, Mind and Society, 8, pp.
173-191.
• We are starting to be able to point to a few
examples with “gold standard” methodology but it
has taken a while for what is needed to have
become clear.
Elegance
• An elegant model is one that has a favourable ratio of
parameters to explained phenomena. The more
phenomena with the fewer parameters the better. You
know it when you see it: Read models too!
• Example: Farmers choose crops ex ante. Crop totals
(plus exogenous “weather” plus practice if needed)
determine a “fragment” of market price ex post (a big
lump will be the rest of Canada or even the world
market) and this feeds back to regulate cropping
decisions and potentially also farm failure/growth.
These two (measurable?) parameters are all it takes
to “close” the economic system neatly as a first
approximation. (Though we need to know farmer
goals too.)
The Bacharach Conjecture
• A model that makes minimally sensible
assumptions in all processes will outperform one
with excellent assumptions in a few processes
and plainly silly ones elsewhere. (Perfect
competition anyone?)
• Chattoe-Brown’s Lemma: It will also outperform
models with “missing” processes relevant to a
given research question.
• The combination of systematic literature
reviewing and consensual development of
“version 0” is intended to generate models on the
“right” side of this conjecture/lemma.
Exogeneity
• Many social and physical processes are internal
to the target with bi-directional links.
• Some things affect the system but are not
significantly affected by it relative to the research
question. (For example, rainfall may in fact be
affected by surface evaporation but, in a model of
phosphorus transport in a local region, rainfall
can reasonably be treated as exogenous as
long as surface evaporation is only a small effect
and plays no part in phosphorus transport
itself.)
Version control
• Have a rolling list of versions and decide which
amendments should go in which version.
• Don’t have a rolling programme. You’ll need finished
versions to “publish”, and without them the project will
get into a mess with “spaghetti code”.
• You can’t really “hack” if the code needs to be passed
around within the team.
• Establish a work plan and procedure for transferring
tasks between team members, i.e. a complete
specification of (agreed) upgrades to the next version
for the modeller. Exchange summary reports between,
say, physical and social science “groups”.
Setting up for model testing
• Because MAM/ABSS are complex and data hungry, you don’t get
“a lot” of opportunities to test them. This makes these tests very
important and they need to be planned for in advance.
• Version 0 is supposed to “capture” existing knowledge and is
judged on ability to reproduce stylised facts.
• Can you achieve “hold backs” in various ways? (Historical data,
subsets of phosphorus sampling “stations”.)
• If you have parameter values you can’t set even approximately,
can you tune the model to fit one “output dimension” and then
use others as hold backs? (Clusters versus biographies?)
• This is another reason why careful choice of parameters is
important. Inability to set parameters with principled values
(however imperfect) is a potential “waste” of testing opportunities
which are much more “expensive” than data.
• Think seriously about safeguards such as withholding the hold back
data from whoever tunes the parameters.
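One way to set up a “hold back” is to partition the sampling stations before any tuning starts. The Python sketch below is a minimal illustration under stated assumptions: the station identifiers, the 30% hold-back fraction and the mean absolute error measure are hypothetical choices, not something the project has specified.

```python
import random

def split_hold_back(stations, hold_back_fraction=0.3, seed=42):
    """Randomly partition station IDs into (tuning, hold_back) lists."""
    rng = random.Random(seed)
    shuffled = list(stations)
    rng.shuffle(shuffled)
    # Always hold back at least one station.
    n_hold = max(1, round(len(shuffled) * hold_back_fraction))
    return sorted(shuffled[n_hold:]), sorted(shuffled[:n_hold])

def mean_abs_error(simulated, observed):
    """Average absolute gap between simulated and observed values per station."""
    return sum(abs(simulated[k] - observed[k]) for k in observed) / len(observed)

if __name__ == "__main__":
    tuning, hold_back = split_hold_back(range(10))
    print("tune the model on stations:", tuning)
    print("test it (once!) on stations:", hold_back)
```

Whoever tunes the parameters would only ever see the `tuning` stations. The `hold_back` list and its data stay with someone else until the version is “finished”.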
Iterative development: Recap
• Conditional on a “finished” version, which parameter values are
most significant to the ability of the model to “match” real data?
(Very dangerous to “muddle” exploration of parameters with
changes to program functionality.)
• What is the confidence we have in these crucial parameter
values?
• How should we allocate “research effort” (and of what kind) to
most increase this confidence? Out of the box thinking? Quick and
dirty laboratory experiments? More literature reviewing?
• “Reverse” sensitivity analysis. If we could halve the uncertainty on
this parameter value, would this get us into a single phase of
system behaviour?
• “Selling” research ideas to others? Meta-analysis? Larger samples
for existing findings? (Close friendship example.)
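The “reverse” sensitivity question can be sketched as follows: sweep a parameter across its current uncertainty interval, record which qualitative phases of behaviour appear, then repeat with the interval halved. In this Python sketch `toy_model` is a hypothetical stand-in for a full simulation run, and its threshold of 0.3 is invented purely for illustration.

```python
def phases_in_interval(model, centre, half_width, steps=21):
    """Return the set of behaviour labels seen across the uncertainty interval."""
    values = [centre - half_width + 2 * half_width * i / (steps - 1)
              for i in range(steps)]
    return {model(v) for v in values}

def toy_model(similar_wanted):
    # Hypothetical stand-in for a real model run: reports a qualitative "phase".
    # The 0.3 threshold is an assumption made up for this example.
    return "segregated" if similar_wanted >= 0.3 else "mixed"

if __name__ == "__main__":
    # Current uncertainty: the parameter lies somewhere in [0.2, 0.8].
    print(phases_in_interval(toy_model, centre=0.5, half_width=0.3))
    # Halved uncertainty: the parameter lies somewhere in [0.35, 0.65].
    print(phases_in_interval(toy_model, centre=0.5, half_width=0.15))
```

If halving the interval reduces the result to a single phase, research effort spent on that parameter is likely to pay off. If both phases remain, the effort may be better spent elsewhere.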
Systematic reviewing 1
• If it is correct that MAM/ABSS is a distinctive
method that can “reuse” old data, how do we
actually do this?
• Example of social capital: Reading surveys and
overviews identifies several key dimensions that
are more or less central. (Everyone agrees on
networks but rather few on bureaucracy.)
• What is the minimal version 0 model that can
integrate these dimensions in an “elegant” way?
• Given this framework/V0M, what does each
specific paper or book contribute?
Systematic reviewing 2
• Start with a narrative (usually an interview, here a journal article).
• “Chop it up” into theoretically significant sections, i.e. here the
respondent is talking about “danger” or “respect”.
• Refine the “codes” (categories in the narrative) and produce
“memos” (any ideas created by the data: “This sounds like
patriarchy”, “Get a better tape recorder”).
• Repeat sequentially with more narratives. This should make the
codes robust, show their ubiquity, suggest relations between them
and “falsify” memos against a sample for their “proper” use
(organising principle versus “nice idea”).
• The model thus reflects the “most agreed” aspects of a
phenomenon in a warrantable way.
• Matters slightly complicated by endogenous evolution of fields.
(Choice and refusal example.)
Developing middle range theory
• Several people have asked me about “generic”
agent architectures. Some exist but aren’t that
useful because there is no such thing as a
“generic social behaviour”.
• However, there are areas where disciplinary
boundaries or the simple “difficulty” of theorising
have created recognisable gaps in what is
needed for certain kinds of MAM/ABSS.
• For example, models tend to be spatial, social or
relational but rarely two and never (to my
knowledge) all three.
PACT models
• In reality, as busy academics, we fully recognise
the social world as consisting primarily of places
and people at specified times.
• Who we meet where defines specific relations
(colleagues) and underpins the “generation” of
different social ties: Which of your colleagues
would you also consider “a friend” to invite home
to meet your family or drink with on a non work
day? How did they get that way?
Version 0
• Each agent has a “time plan”, simply a sequence of
places to “be at” at each point in the day.
• While there, they might meet anyone else who is there
too (geography structures networks). There is a very
low rate of “random” meeting (bus stops).
• Version 1 allows some voluntary activities (let’s all
meet in bar x at 5), easily adding “weekends”.
• Version 2 allows remote communication and deliberate
adaptation by how you “get on” with different people.
(A “good party” will be reproduced. People search
networks and “recommend”.)
• Can make use of distinctive data like “oral histories” of
friendships and time diaries. This will be pretty useful.
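A minimal Python sketch of version 0 might look like this. Each agent carries a “time plan” (one place per time slot) and may meet anyone who is at the same place in the same slot. The agent names, the places and the `meet_prob` parameter are illustrative assumptions, and the low rate of “random” meeting (bus stops) is left out to keep the sketch short.

```python
import random
from collections import defaultdict
from itertools import combinations

def run_day(time_plans, meet_prob=0.5, seed=0):
    """time_plans maps each agent to a list of places, one per time slot.
    Returns a dict counting meetings between pairs of agents."""
    rng = random.Random(seed)
    meetings = defaultdict(int)
    n_slots = len(next(iter(time_plans.values())))
    for slot in range(n_slots):
        # Group agents by where their plan puts them in this slot.
        present = defaultdict(list)
        for agent, plan in time_plans.items():
            present[plan[slot]].append(agent)
        # Anyone sharing a place may meet (geography structures networks).
        for agents in present.values():
            for a, b in combinations(sorted(agents), 2):
                if rng.random() < meet_prob:
                    meetings[(a, b)] += 1
    return dict(meetings)

if __name__ == "__main__":
    plans = {
        "ann": ["home_a", "office", "bar"],
        "bob": ["home_b", "office", "bar"],
        "cat": ["home_c", "lab", "bar"],
    }
    print(run_day(plans, meet_prob=1.0))
```

Time diaries would supply the `time_plans`, while “oral histories” of friendships could serve as data against which the resulting meeting counts are compared.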
Other theory needed
• Real dynamic decision making and thus “real”
communication (co-evolution of problem
representation and solutions rather than “hard coded”
shared representation).
• Behavioural underpinnings of realistic network
dynamics. (Neighbour greetings example.)
• Effective but economical representations of
organisations and hierarchies and their impact.
(Organisations as networks of “vacancies”.)
• Coherent agent representations of norms based on
empirical data in well defined domains.
• Models that can cope with (and explore how agents
cope with) “genuine novelty”.
Reconfiguring/“freeing” methods
• How do we create/convince “theory building” or
“gap filling” ethnographers?
• How do we refocus at least some statistical
analysis on “measures of similarity” between
complex objects (rather than straight lines!)?
• Is it easier to teach programmers sociology or
sociologists to program?
• Why are methods so often tied to disciplines and
“flavoured” accordingly? (Experimental cultures
example.) Can we “free” methods?
4 “No Nos” (and “Yes Yes” variants)
• Models that are of interest only to the designer and his/her
friends. [Models that cast light on unsolved problems in a
particular domain.]
• Models that don’t even capture the stylised facts of knowledge to
date. (Econophysics?) [Models which systematically encapsulate
what is known “to date” in an elegant way.]
• Models that even the designer doesn’t fully understand. [Models
the designer can explain to non-modellers in a way that
provokes intelligent responses.]
• Flaky or “do anything” models. [Models that generate a
systematic programme of data synthesis and/or collection.
Models that make robust predictions that are relatively
insensitive to parameter changes within the known uncertainty.]
Quick thoughts on publication
• Avoid the “no nos”.
• Stand your ground on wrong headed criticisms.
• “Borrow” and accumulate effective responses
(examples) for the “standard objections”. (“Et tu?
defence” example.)
• Judge your target/audience and don’t write
“boiler plate” you don’t need.
• For “print”, figure out how to get your outputs into
graphs and tables.
• Think seriously about code and sample runs on
the web. (Many journals now endorse this.)
What haven’t I talked about?
• The floor is yours!
Now read on
• Ahrweiler, P., Pyka, A. and Gilbert, N. 2004. Simulating knowledge dynamics in innovation networks. In
R. Leombruni and M. Richiardi (eds.) Industry and labor dynamics: The agent-based computational
economics approach. Singapore: World Scientific Press.
• Chattoe, E. 2006. Using simulation to develop testable functionalist explanations: A case study of
church survival, British Journal of Sociology, 57(3), September, pp. 379-397.
• Chattoe, E. and Hamill, H. 2005. It's Not Who You Know - It's What You Know About People You Don't
Know That Counts: Extending the Analysis of Crime Groups as Social Networks, British Journal of
Criminology, 45(6), November, pp. 860-876.
• Chattoe-Brown, E. 2009. The Social Transmission of Choice: A Simulation with Applications to
Hegemonic Discourse, Mind and Society, 8(2), December, pp. 193-207.
• Epstein, J. M. and Axtell, R. 1996. Growing artificial societies: Social science from the bottom up.
Washington, DC and Cambridge, MA: Brookings Institution Press and MIT Press.
• Gilbert, N. and Troitzsch, K. G. 2005. Simulation for the Social Scientist, second edition. Buckingham:
Open University Press. [Red cover edition. Important: Don’t get the blue cover first edition by mistake.
Examples not in NL!]
• Gilbert, N. 2007. Agent based models. Quantitative Applications in the Social Sciences 153. London:
Sage.
• Gilbert, N. 2007. A generic model of collectivities, Cybernetics and Systems, 38(7), September, pp.
695-706.
• Ramanath, A. M. and Gilbert, N. 2004. The design of participatory agent-based social simulations.
Journal of Artificial Societies and Social Simulation, 7(4).
Resources
• JASSS, Journal of Artificial Societies and Social
Simulation <http://www.soc.surrey.ac.uk/JASSS/>.
• SIMIAN <http://www.simian.ac.uk>.
• NetLogo <http://ccl.northwestern.edu/netlogo/>.
• simsoc email distribution list
<http://www.jiscmail.ac.uk>.
• ESSA, European Social Simulation Association
<http://www.essa.eu.org>.
• NAACSOS, North American Association for
Computational Social and Organization Sciences
<http://www.casos.cs.cmu.edu/naacsos/>.
• CSSS, Computational Social Science Society <tbc!>.
Welcome to NetLogo
• Click on the NL icon.
• Go to File > Models Library.
• Select and click on Social Science to show
options.
• Select Segregation and then hit the Open button
bottom right.
Things to observe
• Sliders.
• Buttons.
• Plots.
• World window.
• Interface, Information and Procedures buttons.
• Speed slider.
Experiments with Schelling
• Try a population of 1000 with similar wanted (SW) at 50%.
• Now try a population of 2000.
• What happens with a population of 2500? Why?
• Now try SW at 75% with populations at 2000 and 2500.
What are the differences in observable behaviour?
• What is the difference (for 2000 agents) between the
behaviour of the system when SW is 75% and when it
is 40%?
• How many different “dimensions” of the system can
vary based on different combinations of two
parameters?
• Can you get the system to do anything else that is
interesting?
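For anyone who wants to see the mechanism outside NetLogo, here is a rough Python sketch of a Schelling-style model with the same two parameters (population and similar wanted). It is not the code of the NetLogo library model: the grid size, the wrapping edges, the random relocation rule and the stopping conditions are all simplifying assumptions.

```python
import random

def neighbour_colours(grid, size, r, c):
    """Colours of occupied cells among the 8 surrounding cells (edges wrap)."""
    out = []
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if (dr, dc) != (0, 0):
                cell = grid[(r + dr) % size][(c + dc) % size]
                if cell is not None:
                    out.append(cell)
    return out

def unhappy_cells(grid, size, similar_wanted):
    """Positions of agents whose share of like neighbours is below similar_wanted."""
    result = []
    for r in range(size):
        for c in range(size):
            if grid[r][c] is None:
                continue
            near = neighbour_colours(grid, size, r, c)
            if near and sum(n == grid[r][c] for n in near) / len(near) < similar_wanted:
                result.append((r, c))
    return result

def run_schelling(size=20, population=300, similar_wanted=0.5,
                  max_ticks=200, seed=0):
    """Return (ticks used, number of agents still unhappy)."""
    rng = random.Random(seed)
    cells = [(r, c) for r in range(size) for c in range(size)]
    rng.shuffle(cells)
    grid = [[None] * size for _ in range(size)]
    for i, (r, c) in enumerate(cells[:population]):
        grid[r][c] = i % 2  # two colours, roughly half each
    empty = cells[population:]
    for tick in range(max_ticks):
        unhappy = unhappy_cells(grid, size, similar_wanted)
        if not unhappy or not empty:
            return tick, len(unhappy)  # settled, or nowhere left to move
        rng.shuffle(unhappy)
        for r, c in unhappy:
            # Move the unhappy agent to a randomly chosen empty cell.
            i = rng.randrange(len(empty))
            nr, nc = empty[i]
            grid[nr][nc], grid[r][c] = grid[r][c], None
            empty[i] = (r, c)
    return max_ticks, len(unhappy_cells(grid, size, similar_wanted))

if __name__ == "__main__":
    for sw in (0.4, 0.5, 0.75):
        print(sw, run_schelling(similar_wanted=sw))
```

Varying `population` towards `size * size` and `similar_wanted` towards 0.75 gives a rough analogue of the experiments above, though the numbers will not match NetLogo's.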
Other suggestive models
• Earth Science: Erosion.
• Earth Science: Grand Canyon.
• Social Science: Wealth Distribution.
• Social Science: Team Assembly.