- Lorentz Center

Stochastic Diffusion
Processes: communication,
search and cognition
“Nothing seems more possible to me than that people someday will come to the definite opinion that
there is no copy in the ... nervous system which corresponds to a particular thought, or a particular idea,
or memory.”
Ludwig Wittgenstein, 1948:
Last writings on the Philosophy of Psychology, Volume 1.
Mark Bishop
Goldsmiths, University of London
Background
• The talk is a synthesis of recent papers by Bishop
(2009) and Nasuto, Bishop & De Meyer (2009):
• Bishop, J.M. (2009), A Cognitive Computing fallacy? Cognition,
computations and panpsychism, Cognitive Computation 1:3, pp. 221-233.
• Nasuto, S.J., Bishop, J.M. & De Meyer (2009), Communicating
neurons: a connectionist spiking neuron implementation of stochastic
diffusion search, Neurocomputing 72, pp. 704-712.
Acknowledgements:
• A number of RAs, graduate/project students worked with me to
establish the foundations of SDP; in this talk in particular I draw on
results from: Paul Beattie, Darren Myatt, Mohammad Majid, Daniel
Jones, Tom Morey, Matt Warriner & Nicoletta Nicolaou.
Computations as cognition
• In this talk I claim a ubiquitous computational metaphor lies at the heart
of cognitive science in [at least] three modes:
• (1) Explicitly: cognition is ‘computations on symbols’
• GOFAI (‘[physical] symbol systems’); functionalism (philosophy
of mind); cognitivism (psychology); language of thought
(philosophy; linguistics).
• (2) Implicitly: cognition as ‘computations on sub-symbols’
• Connectionism (sub-symbolic AI; psychology; linguistics); the
digital connectionist theory of mind (philosophy of mind).
• (3) Descriptively: cognitive modelling via computational
simulations
• Hodgkin–Huxley mathematical models of neuron action
potentials (computational neuroscience; psychology).
Overview
a) For each of the three identified modes of cognitive science I will highlight one
or more well known critiques that motivate a change from the hegemony of
the computational cognitive metaphor.
b) I will subsequently suggest a new cognitive metaphor; one grounded on
‘interactions and communication’.
c) And I will conclude by outlining NESTER - a novel connectionist architecture
based on Stochastic Diffusion Processes - that may escape [at least some of
the] classical critiques of computational cognition.
• NB. This is not to suggest that we throw the baby (computational
modelling) out with the bath water (the generic computational metaphor);
simply that a new metaphor of communication may shed a new and
useful light on areas of cognitive science hitherto obfuscated by the fog
of mere computations.
1. Symbolic cognition
• Cognition involves discrete, internal mental states (representations or
symbols) whose manipulation can be described in terms of rules or
algorithms:
• Good old-fashioned cognitive psychology; computations on
representations:
• Cognitive states are computational relations to computational mental
representations that have content.
• Cognitive processes - changes in cognitive states - are computational
operations on computational mental representations that have
content.
• Good old-fashioned AI; computation on symbols:
• Newell & Simon’s Physical Symbol System (PSS) hypothesis:
“Any intelligent machine is at its core a PSS ... a PSS has the
necessary and sufficient means for general intelligent action”.
Some critiques of symbolic
cognition
• Gödelian:
• Lucas - with [theoretical] knowledge of the Gödel formula of any mathematical
system, a human is always greater than any given computational system.
• Penrose - computations cannot capture all of human [mathematical]
understanding.
• Searlian:
• The Chinese room argument - syntax is not sufficient for semantics.
• Computation as an ‘observer relative’ phenomenon:
• Searle - “For any program there is some sufficiently complex object such that
there is some description of the object under which it is implementing the
program”; e.g. Searle’s wall as an instantiation of the ‘Wordstar’ program.
• Putnam - a rock implements every input-less FSA.
• Bishop - a non-repeating digital counter (or, pace Putnam, any ‘open physical
system’ such as a rock) implements any program with known-input over a finite
time period.
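The Putnam/Bishop style of argument above can be sketched in a few lines. This is a minimal illustration and not code from the cited papers: the parity automaton, the function names `run_fsa` and `observer_mapping`, and the use of Python integers as the ‘counter’ are all assumptions made for the example. It shows that once the input is known and the time period is finite, an observer can always write down a mapping under which the successive (never repeating) counter states ‘are’ the successive states of the automaton.

```python
def run_fsa(transitions, start, inputs):
    """Return the state trace of a finite state automaton on a known input."""
    trace = [start]
    for symbol in inputs:
        trace.append(transitions[(trace[-1], symbol)])
    return trace


def observer_mapping(fsa_trace, counter_start=0):
    """Pair each distinct counter value with the FSA state at the same time step.

    Because the counter never repeats a value, this mapping is always well defined.
    """
    counter_trace = list(range(counter_start, counter_start + len(fsa_trace)))
    return dict(zip(counter_trace, fsa_trace)), counter_trace


# Hypothetical example automaton: tracks the parity of the number of 1s seen.
PARITY = {
    ("even", 0): "even", ("even", 1): "odd",
    ("odd", 0): "odd", ("odd", 1): "even",
}

fsa_trace = run_fsa(PARITY, "even", [1, 0, 1, 1])
mapping, counter_trace = observer_mapping(fsa_trace)

# Read through the mapping, the counter's evolution reproduces the FSA run.
interpreted = [mapping[c] for c in counter_trace]
print(interpreted == fsa_trace)
```

The mapping is constructed after the fact by the observer, which is exactly why the argument treats computation as ‘observer relative’.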
2. Sub-symbolic cognition
• In connectionist systems networks of learnt (tuned) feature detectors
cause functionally specified cognitive effects; knowledge is defined as
vectors in R^n.
• A learning algorithm (e.g. back propagation) traces a
trajectory of network parameters through a Euclidean space, R^n.
• Over time network parameters learn/evolve to perform desired
mappings over pairs of real valued input/output vectors.
• Strengths of classical connectionism include:
• Its application to many engineering problems requiring flexible A.I.
• Its use as a metaphor for both high level and low level cognitive
processes.
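The idea that learning traces a trajectory of parameters through R^n can be made concrete with a very small sketch. It is an assumption-laden toy rather than anything from the talk: a single linear unit trained with the delta rule (plain gradient descent, not full back propagation), with a made-up data set, learning rate and function name `train_linear_unit`. Here n = 2 and each recorded `(w, b)` pair is one point on the trajectory.

```python
def train_linear_unit(samples, lr=0.1, epochs=200):
    """Train y = w*x + b with the delta rule and record the parameter trajectory."""
    w, b = 0.0, 0.0
    trajectory = [(w, b)]  # successive points in parameter space R^2
    for _ in range(epochs):
        for x, y in samples:
            err = (w * x + b) - y  # prediction error on this sample
            w -= lr * err * x      # gradient step for the weight
            b -= lr * err          # gradient step for the bias
            trajectory.append((w, b))
    return trajectory


# Hypothetical input/output pairs generated by y = 2x + 1.
samples = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]
trajectory = train_linear_unit(samples)
w_final, b_final = trajectory[-1]
print(round(w_final, 3), round(b_final, 3))
```

What the network ‘knows’ at any moment is just its current position `(w, b)` in this space, which is the sense of ‘knowledge defined as vectors’ above.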
Critiques of sub-symbolic cognition
(a) Van de Velde: type / token
knowledge
• Standard connectionist models
most naturally represent
knowledge as ‘types’ or ‘classes’,
(book, computer, chair etc).
• A restriction Van de Velde
recognised as, “... a fundamental
cause of many problems when
modelling symbolic processes by
connectionist networks.”
Critiques of sub-symbolic cognition
(b) Dinsmore: only arity zero predicates
• Conventional connectionist
networks can represent knowledge
as tokens, however such tokens
are always materially and spatially
defined as neuronal activations in
the network.
• Either each node represents a
specific feature or knowledge
is distributed across
activations of groups of nodes.
• Dinsmore suggests this form of
representation is limited to
‘arity-zero predicates’ and that this is too
strong a restriction to model
general, real-world knowledge.
Critiques of sub-symbolic cognition
(c) Abbott: implausible use of inhibition
• e.g. In many ANNs lateral inhibition has
been extensively used to:
• ... perform ‘winner take all’ (Grossberg);
• … normalise signals and/or prevent
saturation (Douglas);
• … define topological structure
(Kohonen).
• However Abbott suggests there is a “lack of
evidence for widespread inhibitory neuronal
mechanisms in the cortex”.
3. Computational modelling
• All matter - from the simplest particles to the most complex living
organisms - undergoes physical processes which are not usually given any
special computational interpretations.
• For example, although we can describe the operation of a spring, as it
extends under moderate force, by Hooke’s law, we don’t say that the
spring computes, according to Hooke’s law, how much it should deform.
• However, when it comes to nervous systems the situation changes
abruptly.
• Since the publication of the Hodgkin-Huxley equations in 1952 single
neuron behaviour has been extensively modelled computationally.
• Subsequently in neuroscience it has been assumed that neurons possess
special computational capabilities (e.g. this neuron computes x; where x
may be gradients, edges, motion etc) which are not attributed to other,
more complex, biological substances (e.g. DNA).
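For reference, the two descriptions contrasted above can be written side by side. These are the standard textbook forms, not equations taken from the slides; the symbol names (k, x, C_m, I_ext, the conductances and reversal potentials) are the conventional ones and are assumptions as far as this talk is concerned.

```latex
% Hooke's law: restoring force F of a spring of stiffness k at extension x
F = -k\,x

% Hodgkin-Huxley membrane equation for the membrane potential V
C_m \frac{dV}{dt} = I_{\mathrm{ext}}
  - \bar{g}_{\mathrm{Na}}\, m^{3} h\,(V - E_{\mathrm{Na}})
  - \bar{g}_{\mathrm{K}}\, n^{4}\,(V - E_{\mathrm{K}})
  - \bar{g}_{L}\,(V - E_{L})

% Each gating variable x \in \{m, h, n\} follows first-order kinetics
\frac{dx}{dt} = \alpha_x(V)\,(1 - x) - \beta_x(V)\,x
```

Both are differential or algebraic descriptions of a physical process; the point of the slide is that only the second is routinely read as the system itself ‘computing’ something.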
3) Critiques [of the
hegemony] of computational
modelling (i)
• Attributing ‘computational
capabilities’ to individual neurons is an
anthropomorphic viewpoint, because
computation is an intentional notion and
assumes the existence of some ‘demon’ that is
able to interpret it.
• Thus, the very assumption of
‘computational capabilities’ of real
neurons leads to a homuncular theory
of mind.
3) Critiques of computational
modelling (ii)
• Discoveries in neuroscience since the development of the Hodgkin-Huxley
model reveal complex neuronal behaviour and suggest that the
mathematical characterisation of single neurons via non-linear ordinary
differential equations does not capture the information processing
complexity of real neurons:
• In particular it has been hypothesised (e.g. by Koch and Barlow and
Granger amongst others) that a neuron can select input contingent on its
spatial location on the dendritic tree or its temporal structure.
• Furthermore, there is strong evidence that real neurons operate on richer
information than provided by a single real number (mean firing rate) and
therefore that the full gamut of their operation cannot be adequately
described in a standard Euclidean setting.
• Instead of modelling the neuron as a logical or numerical function,
perhaps it could be better described by an alternative metaphor?
An alternative metaphor:
communication and interaction
• Communication as process; two definitions from the
dictionary:
• “relating to the imparting or transmission of
something”, (OED).
• “something imparted, interchanged, or transmitted”,
(Dictionary.com).
• In this sense communication is a process of interaction
that occurs between agent and umwelt;
• Umwelt being the outer world, environment or
reality, as it affects the agent; as such it may contain
other agents.
• Thus, contra computation, communication as process is:
• an observer independent, objective property of
agent-environment systems;
• a potentially more powerful metaphor than
algorithms.
Swarm Intelligence (SI)
• In the last two decades there has been a shift in A.I. research
away from the classical modes of equating intelligence either
with mere symbol manipulation or with simple connectionist systems ...
• ... and away from the view that mind is merely
equivalent to brain - a private internal process - hence
de-emphasising the autonomy of the individual thinker
and instead emphasising the collective nature of many
intelligent processes.
• Swarm Intelligence emphasises the social nature of some
cognitive processes and draws inspiration from many natural
collective systems that solve complex problems in search
and optimisation.
Characteristics of swarm
intelligence systems
• Swarm Intelligence systems are typically made up of a population
of simple agents interacting locally with one another and with their
environment.
• Swarm Intelligence agents typically follow very simple rules:
• There is no centralised control structure dictating how
individual agents should react and behave;
• instead local interactions between agents lead to the
emergence of [seemingly] intelligent global behaviour.
• Natural examples of Swarm Intelligence include ant colonies, bird
flocking, animal herding, bacterial growth, and fish schooling ...
• ... even, as we shall see, workshop delegates seeking a good
place to eat in an unfamiliar town!
The Restaurant Game
• A group of delegates arrive in a foreign town for an extended workshop on
the ‘Philosophy of the Information and Computer Sciences’ and need to
find a good place to eat.
• A ‘good’ place to eat is the restaurant where most delegates are likely
to choose a meal they deem ‘GOOD’.
• An individual delegate’s response to a randomly selected meal from a
restaurant menu {GOOD or BAD} is termed a ‘partial hypothesis
evaluation’; it provides partial evidence on the restaurant’s overall
quality.
• The ‘search space’ (each delegate’s hypothesis space) is the set of all
restaurants in the town.
• A naive exhaustive search by all the delegates for the best restaurant is
impractical as there will be too many (restaurant : dish) combinations to
evaluate over the duration of the workshop.
A simple metaphor* for a stochastic
diffusion search to find a ‘good’
restaurant
• EACH DELEGATE:
1. Opens ‘Yellow Pages’ and selects a restaurant to visit at random, so defining
the agent’s initial restaurant hypothesis.
2. Partial hypothesis evaluation: at dinner the delegate selects a meal from
the menu at random and subsequently decides if it was ‘GOOD’ or ‘BAD’.
3. Utilising Passive recruitment: the next morning at breakfast …
4. IF <last night’s meal was ‘GOOD’>
5. THEN maintain restaurant hypothesis and GOTO (2)
6. ELSE IF <last night’s meal was ‘BAD’> THEN communicate with a random
colleague:
7. IF <colleague’s meal was ‘GOOD’>
8. THEN adopt colleague’s restaurant hypothesis and GOTO (2)
9. ELSE GOTO (1).
* The ‘Restaurant Game’ is offered as an illustration of SDS diffusion and partial evaluation mechanisms only; the restaurant game is not fully isomorphic to SDS in some pathological cases.
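The numbered steps above translate fairly directly into a simulation. What follows is a minimal sketch under stated assumptions, not the implementation behind the SDS papers: the function name `restaurant_game`, the per-restaurant probabilities in `quality`, the population size and the number of nights are all invented for illustration, and every delegate updates synchronously on the basis of the previous night’s meals.

```python
import random


def restaurant_game(quality, n_delegates=100, n_nights=200, seed=0):
    """Simulate the Restaurant Game.

    quality[r] is the (made-up) probability that a randomly chosen meal at
    restaurant r is judged GOOD. Returns each delegate's final restaurant.
    """
    rng = random.Random(seed)
    n_restaurants = len(quality)
    # Step 1: every delegate picks an initial restaurant at random.
    hypothesis = [rng.randrange(n_restaurants) for _ in range(n_delegates)]
    for _ in range(n_nights):
        # Step 2: partial hypothesis evaluation (one random meal each).
        good = [rng.random() < quality[h] for h in hypothesis]
        # Steps 3-9: passive recruitment at breakfast, using last night only.
        updated = list(hypothesis)
        for i in range(n_delegates):
            if good[i]:
                continue  # GOOD meal: keep the current restaurant
            colleague = rng.randrange(n_delegates)  # may pick oneself; harmless here
            if good[colleague]:
                updated[i] = hypothesis[colleague]  # adopt colleague's restaurant
            else:
                updated[i] = rng.randrange(n_restaurants)  # back to the Yellow Pages
        hypothesis = updated
    return hypothesis


# Hypothetical town: nine poor restaurants and one good one (index 9).
final = restaurant_game([0.1] * 9 + [0.9])
print("delegates at restaurant 9:", final.count(9), "of", len(final))
```

No delegate ever samples a whole menu, and nobody tallies votes centrally; the cluster at the better restaurant forms purely through one-to-one conversations at breakfast.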
NESTER: a connectionist
framework to perform SDS
• Retina and Memory cells:
• Correspond to search space and
target.
• Temporally encode what a feature is
and where it is via Inter Spike
Intervals (ISIs).
• Matching cells:
• Correspond to a population of SDS
agents.
• Periodically broadcast their hypothesis
to other matching cells, encoded via an
ISI.
• All cell types operate independently and
asynchronously.
NESTER implements SDS
• In our 2009 paper (Nasuto, Bishop & De
Meyer) we demonstrate that in its operation
NESTER instantiates Stochastic
Diffusion Search.
• Hence, over time, the hypotheses
maintained by a dynamic population of
matching cells will cluster around the
best fit of the target on the retina
[search space].
• Hence synchronisation of matching
cell hypothesis-signals (encoded via
ISIs) indicates convergence onto the
‘best fit’ location of the target on the
retina.
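To make ‘convergence onto the best fit of the target’ concrete, here is a minimal sketch of a generic Stochastic Diffusion Search over a string, with the string standing in for the retina. It is not the NESTER spiking implementation: agents here are plain list entries updated in lock-step rather than asynchronous cells exchanging ISIs, and the function name `sds_search`, the example text and all parameter values are assumptions.

```python
import random
from collections import Counter


def sds_search(search_space, target, n_agents=100, n_iters=200, seed=0):
    """Return (position, cluster_size) of the largest agent cluster."""
    rng = random.Random(seed)
    n_pos = len(search_space) - len(target) + 1
    # Each agent's hypothesis is a candidate start position of the target.
    hyp = [rng.randrange(n_pos) for _ in range(n_agents)]
    active = [False] * n_agents
    for _ in range(n_iters):
        # Test phase: each agent checks ONE randomly chosen feature (character).
        for i in range(n_agents):
            k = rng.randrange(len(target))
            active[i] = search_space[hyp[i] + k] == target[k]
        # Diffusion phase: inactive agents poll one random agent.
        for i in range(n_agents):
            if active[i]:
                continue
            j = rng.randrange(n_agents)
            if active[j]:
                hyp[i] = hyp[j]               # copy an active agent's hypothesis
            else:
                hyp[i] = rng.randrange(n_pos)  # otherwise re-seed at random
    return Counter(hyp).most_common(1)[0]


# Hypothetical 'retina' and target.
retina = "the quick brown fox jumps over the lazy dog"
position, cluster_size = sds_search(retina, "jumps")
print(position, cluster_size)
```

The answer is read off from where the agents have gathered, not from any single agent; in NESTER the analogous read-out is the synchronisation of the matching cells’ hypothesis-signals.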
Knowledge representation in
NESTER
• Each NESTER matching cell processes bi-variate information
as an ISI encoding a ‘feature’ value and an ‘identifier’ value.
• A ‘feature’ value:
• Temporal encoding of the value of a target ‘feature’.
• An ‘identifier’ value:
• Temporal encoding of the relative position of the
feature (either on the retina or in the target).
• Hence in NESTER knowledge is not restricted to arity
zero predicates and knowledge is naturally processed as
‘tokens’ not ‘types’.
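A toy example may help show what it means for one signal to carry a (feature, identifier) pair in its timing. The scheme below is purely hypothetical and is not the encoding used in NESTER: it assumes three spikes whose two successive intervals carry the two values, and the constants `BASE_MS` and `STEP_MS` and the functions `encode` and `decode` are invented for illustration.

```python
from typing import List, Tuple

BASE_MS = 1.0  # hypothetical minimum gap between spikes
STEP_MS = 0.5  # hypothetical extra delay per unit of encoded value


def encode(feature: int, identifier: int, t0: float = 0.0) -> List[float]:
    """Return three spike times; the two intervals carry the two values."""
    t1 = t0 + BASE_MS + STEP_MS * feature      # first ISI encodes the feature
    t2 = t1 + BASE_MS + STEP_MS * identifier   # second ISI encodes the identifier
    return [t0, t1, t2]


def decode(spikes: List[float]) -> Tuple[int, int]:
    """Recover (feature, identifier) from the inter spike intervals."""
    t0, t1, t2 = spikes
    feature = round((t1 - t0 - BASE_MS) / STEP_MS)
    identifier = round((t2 - t1 - BASE_MS) / STEP_MS)
    return feature, identifier


# 'Feature 3 at relative position 7' is a token: a value bound to a place.
spikes = encode(3, 7)
print(spikes, decode(spikes))
```

Because the pair travels in the timing of the signal rather than in which cell fires, the same cell can carry ‘feature 3 at position 7’ at one moment and a different pair the next, which is the sense in which the representation is a two-place token and not an arity-zero type.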
Non-spatial binding of
semantic knowledge in
NESTER
• Unlike conventional connectionist systems, in NESTER knowledge is
not physically bound to specific matching cells, as the activity of
individual cells dynamically fluctuates over time.
• Hence in an individual matching cell (or specific groups of cells),
activity has no fixed semantic interpretation.
• Instead, by a process of communication and interaction, a network
of NESTER matching cells naturally self-organises in response to
environmental stimuli.
• On convergence, temporal stability in the search space is
reflected by collective temporal stability in a pattern of activity
across matching cells.
• Such a cluster is dynamic in nature, yet stable, analogous to, “a
forest whose contours do not change but whose individual trees
do”, (Arthur).
Stochastic Diffusion Processes: cognition as communication
• In this talk I have criticised the ubiquity of the computational metaphor in
Cognitive Science.
• I have introduced the ‘Restaurant Game’ as a metaphor for a simple
Stochastic Diffusion Search (SDS) and subsequently described NESTER, as
a spiking neuron connectionist implementation of SDS.
• In conclusion I suggest that NESTER is a potentially interesting cognitive
architecture as it:
• is not vulnerable to [at least some of] the standard critiques of
computational connectionism;
• and is most naturally understood in terms of [the metaphors of] interaction
and communication.
• For SDS demo see: <http://doc.gold.ac.uk/~map01mm/SDSSim/>.
• For SDP repository see
<http://www.doc.gold.ac.uk/~mas02mb/sdp/index.htm>.
Some investigations
employing Stochastic
Diffusion Processes
A unification mechanism for Baars’ ‘global workspace’ and Dennett’s ‘multiple drafts’ (Nasuto); a
solution to the binding problem (Nasuto); a model for multi-stable visual attention (Nasuto); models of
visual attention (Summers); a novel metaphor for cognitive processing (Nasuto, Bishop et al.);
parameter estimation / 3D computer vision (Bishop; Myatt); resource allocation (Majid); sequence
detection (Jones); lip tracking (Grech-Cini & McKee); eye tracking (Bishop & Torr); mobile robot
localisation (Beattie et al.); site selection for wireless networks (Hurley & Whitaker); speech
recognition (Nicolaou); methods for automated object placement in virtual scenes (Cant &
Langensiepen); feature tracking in Atmospheric Motion Vectors (Hernandez-Carrascal & Nasuto);
system for hybridized efficient genetic algorithms to solve bi-objective optimization problems with
application to network computing (US PATENT 60/941,600); automatic reconstruction of 3D dendritic
structure from optical light microscopy serial stacks (Nasuto); physically inspired artificial learning
models (Ruta & Gabrys); cellular automata and immunity amplified stochastic diffusion search
(Coulter & Ehlers); hybrid control system for collectives of evolvable nanorobots and microrobots
([US PATENT AG06F1900FI] Solomon Research); individual customers influence on the operation of
virtual power plants (Britta [MVV Energie]); stochastic diffusion search for real-time web search
(Hameed); swarm intelligence systems for transportation engineering: principles and applications
(Teodorovic); stochastic diffusion search and voting methods (Nircan); swarming behaviour in
wagering gaming machines ([PATENT WO 2009005578 20090108] ); noise, cost and speed-accuracy trade-offs: decision-making in a decentralized system (Marshall, Dornhaus, Franks &
Kovacs); computational molecular biology (Jones); moon rover localisation (Hari & Thiyagarajan);
testing and evaluation of the effectiveness of the stochastic search and optimization algorithms
developed in a dynamic military systems environment ([US Military Research Call] ); swarm
intelligence stability based on stochastic diffusion search (Abbas, Mudathir, Rao & Rao); stochastic
programming of computer agents and system of systems designs (US Military).