Modelling Language Evolution Lecture 3: Evolving Syntax
Modelling Language Evolution
Lecture 3: Evolving Syntax
Simon Kirby
University of Edinburgh
Language Evolution & Computation Research Unit
Evolving the ability to learn syntax
(Batali 1994)
A “standard” recurrent network does not seem to be
able to learn syntax without some help
Elman provides this “help” via incremental memory
The network comes pre-configured to help it learn syntax
i.e., our model of an individual is born with a working
memory that grows over time
Does this correspond to an innate prespecification
for language learning?
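Elman's incremental-memory scheme can be sketched as blanking the recurrent context every few steps, with the window growing across training phases. This is an illustrative stand-in (the `TinySRN` class and the phase schedule are assumptions, not Elman's actual setup):

```python
import numpy as np

class TinySRN:
    """Minimal stand-in for a simple recurrent network (illustrative only)."""
    def __init__(self, n_hidden=4):
        self.context = np.zeros(n_hidden)  # recurrent "working memory"
        self.blank_count = 0               # how often memory was wiped

    def step(self, x, target):
        # A real SRN would do a forward pass and weight update here;
        # we only update the context state.
        self.context = np.tanh(self.context + x)

def train_with_incremental_memory(net, inputs, phases=(3, 4, 5)):
    # Blank the context every `window` steps; the window grows each phase,
    # mimicking a working memory that expands during development.
    for window in phases:
        for t, x in enumerate(inputs):
            if t % window == 0:
                net.context[:] = 0.0
                net.blank_count += 1
            net.step(x, None)
```

Early phases thus force the network to learn from short windows only, before longer dependencies become visible.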
Where do innate abilities come from?
If an organism has some innate predisposition…
… and that predisposition is functional, how do we
explain it?
Darwinian natural selection seems appropriate.
Could we model natural selection?
Can we evolve a syntax learner (as opposed to
building one by hand)?
What things about a network could be
innate?
Many features of networks could be thought of as
innately determined…
The length of time before context units are blanked
The shape of the activation function
The number of nodes in the hidden layer
…
Batali suggests: the initial connection weights.
Normally, these are random – but what if they were
specified by genes?
How to model an organism
GENOTYPE
0.2 -1.3 0.05 0.9 -0.5 0.001 0.1
development
PHENOTYPE
The model has genes, which are expressed as a
phenotype.
The phenotype is simply the initial state of a network
(before learning).
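The genotype-to-phenotype mapping can be sketched as reading the gene string directly off as the network's initial weights. The layer sizes below are illustrative, not Batali's actual architecture:

```python
import numpy as np

def develop(genotype, n_in=2, n_hidden=3):
    """Decode a flat gene vector into initial SRN weight matrices.
    Development here is trivial: genes ARE the initial weights."""
    genes = np.asarray(genotype, dtype=float)
    n_ih = n_in * n_hidden          # input -> hidden weights
    n_hh = n_hidden * n_hidden      # context -> hidden (recurrent) weights
    assert genes.size >= n_ih + n_hh, "genotype too short for this net"
    W_ih = genes[:n_ih].reshape(n_in, n_hidden)
    W_hh = genes[n_ih:n_ih + n_hh].reshape(n_hidden, n_hidden)
    return W_ih, W_hh
```

Learning then modifies these weights; resetting the phenotype just means re-running `develop` on the unchanged genotype.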
How to evolve organisms
Crucial aspects of evolution:
A population of organisms (with varying phenotypes)
A task which they are trying to succeed at
A measure of how fit they are at this task
A way of selecting the fittest
A way of allowing the genes of the fittest to survive
A mechanism for introducing variation into the gene pool
Various techniques to model all of this (e.g., Genetic
Algorithms, Artificial Life etc.)
Batali’s model of evolution
1. Each organism (or agent) has its weights set by
genes
2. The agents are then trained on some language
3. The agents’ error is used to assign fitness
4. Only the top third of the population is kept
5. The top third have their weights reset to what their
genes specify
6. Each agent “gives birth” to two new agents with
approximately the same genes (i.e., genes are
mutated)
7. Go to step 1.
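The generation loop above can be sketched as follows. This is a minimal sketch, not Batali's code: the genome length, mutation size, and the stand-in fitness function are assumptions (in the real model, fitness comes from the network's error after training):

```python
import random

def evolve(fitness, genome_len=20, pop_size=24, generations=150,
           mut_sd=0.1, seed=0):
    """Batali-style evolution: rank by fitness, keep the top third,
    and each survivor 'gives birth' to two mutated offspring."""
    rng = random.Random(seed)
    pop = [[rng.gauss(0.0, 1.0) for _ in range(genome_len)]
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)   # higher fitness is better
        survivors = pop[:pop_size // 3]       # keep the top third
        pop = [g[:] for g in survivors]       # survivors stay in the pool
        for parent in survivors:
            for _ in range(2):                # two offspring each
                child = [w + rng.gauss(0.0, mut_sd) for w in parent]
                pop.append(child)             # 8 + 16 = 24 again
    return max(pop, key=fitness)
```

With a population of 24, keeping 8 survivors plus 16 offspring restores the population size each generation, matching steps 4-6.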
The language task
One of the simplest languages that involves
embedding is aⁿbⁿ:
ab
aabb
aaaaaaaabbbbbbbb
*aaaaaaaaabbbbbbbb
What machinery would you need to recognise
strings from this language?
Minimally – a simple counter
Can an SRN with random initial weights learn this
language?
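The "simple counter" machinery can be sketched as a one-counter recogniser. This is an illustrative stand-in for the minimal machinery, not part of Batali's model:

```python
def recognise_anbn(s):
    """Accept strings of the form a^n b^n (n >= 1) using a single counter."""
    count = 0
    seen_b = False
    for ch in s:
        if ch == 'a':
            if seen_b:            # an 'a' after a 'b' is illegal
                return False
            count += 1
        elif ch == 'b':
            seen_b = True
            count -= 1
            if count < 0:         # more b's than a's so far
                return False
        else:
            return False          # symbol outside the alphabet
    return seen_b and count == 0  # counts must balance exactly
```

Note that a finite-state machine cannot do this for unbounded n; the counter is exactly the extra memory the SRN must learn to implement.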
Performance of a trained
(but non-evolved) net
Networks fail to learn to count (although some
aspects of the language are learnt).
Evolving a better network
Batali used a population of 24 nets (initially with
genes specifying random weights)
Evolved using a fitness function based on ability
at aⁿbⁿ after training
After 150 generations, the networks were better at
learning the task
They evolved initial weight settings that made
learning syntax possible
Evolved network performance
(Figure: plot of network output activations, with traces labelled a, b, sp, rec)
Issues that remain…
What is learning doing?
If language is always the same, the networks could
eventually end up with the whole thing innate (and
not need learning at all!)
What would happen if the networks were trained on
a class of languages?
Initial weights are a different type of innateness than
Elman’s. Can Batali also explain the critical period?
Is evolution just the same as learning?
We can think of a fitness landscape just like an error surface
What are the differences?
Does evolution do gradient descent?
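The parallel can be made concrete on a one-dimensional landscape: gradient descent follows the analytic slope of the error surface, while mutation-plus-selection descends the same surface without ever computing a gradient. A minimal sketch (the landscape and parameter values are illustrative):

```python
import random

def error(x):
    return (x - 3.0) ** 2          # a one-dimensional "fitness landscape"

def gradient_descent(x=0.0, lr=0.1, steps=100):
    # Learning: follow the analytic gradient of the error downhill.
    for _ in range(steps):
        x -= lr * 2.0 * (x - 3.0)
    return x

def mutate_and_select(x=0.0, sd=0.1, steps=2000, seed=1):
    # Evolution (a (1+1)-style scheme): propose a random mutant and
    # keep it only if it lowers the error; no gradient is used.
    rng = random.Random(seed)
    for _ in range(steps):
        child = x + rng.gauss(0.0, sd)
        if error(child) < error(x):
            x = child
    return x
```

Both end up near the minimum at x = 3, but only selection works when the surface has no usable gradient; that difference is one answer to the question above.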