A Connectionist Simulation of the Empirical Acquisition of Grammatical Relations
William C. Morris, Jeffrey Elman

Prepared by: Katarzyna Gorczyca and Izabela Wnęk
Introduction

- Many accounts of L1A (first language acquisition) assume that grammatical relations and linking rules are innate and universal.
- The main aim of our presentation is to set out quite the opposite approach: grammatical relations are learnt in a bottom-up fashion during the lg acquisition process.
- The proposal is based on two observations:
  - early production of childhood speech is formulaic and becomes systematic in a progressive fashion
  - grammatical relations are family-resemblance categories and are too complex to be described by a single parameter
This hypothesis was tested by connectionists (Elman) with a Simple Recurrent Network (SRN).
SRN:
- learns to map from sentences to semantic roles
- develops new representations of the subject in its hidden layer
- makes generalisations and undergeneralisations similar to those made by children
Innateness vs bottom-up learning
Grammatical relations (subject, object) are a problem for the lg acquisition system
(semantics rests on world knowledge, whereas syntax is abstract)
- One approach to learning syntax: grammatical relations are relegated to the innate endowment that the child is born with
- a single parameter with a binary value, accusative or ergative, is taken to be sufficient to account for the various grammatical systems
BUT: cross-linguistically there are no strictly accusative or strictly ergative lgs
Connectionists' proposal:

Abstractions such as subject emerge in two steps:
1. rote learning of particular constructions
2. merging of the separately learnt constructions (mini-grammars)
The experiment to be presented shows:
a neural net trained on the task of assigning semantic roles to sentence constituents can acquire grammatical relations
- it associates particular subjecthood properties with the appropriate verb arguments
- it manages (to a certain extent) to abstract this nominal from its semantic context
Shape of grammatical relations
Lg acquisition theories claim that lgs are either:

ACCUSATIVE
The subject is the agent of the action, e.g. Max hit Larry and ran away.
(it is Max that ran away; the nominal Max controls clause coordination)

ERGATIVE
The subject is the patient of the action, e.g. Max hit Larry and ran away.
(it is Larry that ran away; the nominal Larry controls clause coordination; Larry was hit by Max and ran away)
BUT!
The issue is not merely the
identity of the subject.
The issue is: what properties
the various grammatical
relations control.
Exemplary properties that can be associated with the subject cross-linguistically:
- addressee of imperatives: Idalia, listen to us!
- control of reflexivisation: Beata enjoys herself.
- control of coordination: Laura pinched Żaneta and smiled.
The grammatical relations of various lgs control various combinations of these (and other) properties.
This is what we mean by the "SHAPE" of grammatical relations.
Example:
English: a highly syntactically accusative lg
(most of the properties are controlled by the subject)
Dyirbal: a highly syntactically ergative lg
(most of the properties are controlled by the "ergative subject" or "pivot")
Kapampangan: a split lg
(neither highly ergative nor highly accusative in syntax)

For a lg acquisition process to be UNIVERSAL, it must be able to accommodate a variety of lg types.
- Simply settling on the identity of the subject is not sufficient.
- Rather, the various control patterns ("shapes") must be accommodated.
The SRN can learn a variety of shapes.
A connectionist simulation

Testing whether a network could build abstract relationships corresponding to "subjects" and "objects"
There is no innate knowledge of lg in the network (no grammatical relations, no features facilitating word displacement, etc.)
Main assumptions:
1. The system can process sequential data
2. It tries to map sequences of words to semantic roles
EXPERIMENT
1. The SRN takes in sentences with various patterns.
2. At each time step, a word or a full stop is presented.
3. After each sentence, an input representing "reset" is presented to zero out the outputs.
4. The output patterns represent semantic roles in a slot-based representation.
5. The input vocabulary consists of 56 words (25 verbs, 25 nouns, 6 function words).
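The procedure above can be sketched as a forward pass through an Elman-style SRN. This is only an illustrative sketch, not the authors' implementation: the slides give the 56-word vocabulary, but the two extra input units (full stop, reset), the hidden-layer size, the output size, the random weights, and the token indices used at the end are all assumptions. The network here is untrained, so the "reset" input does not actually zero the outputs; in the experiment that behaviour would have to be learnt from all-zero targets.

```python
import math
import random

# Only VOCAB_SIZE comes from the slides; every other size is an assumption.
VOCAB_SIZE = 56
PERIOD = VOCAB_SIZE        # assumed index of the full-stop input unit
RESET = VOCAB_SIZE + 1     # assumed index of the "reset" input unit
N_IN = VOCAB_SIZE + 2
N_HIDDEN = 20              # hypothetical hidden-layer size
N_OUT = 8                  # hypothetical number of role-slot output units


def one_hot(index, size=N_IN):
    """Return a localist input vector with a single active unit."""
    vec = [0.0] * size
    vec[index] = 1.0
    return vec


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def rand_matrix(rows, cols, rng):
    """Small random weights; a stand-in for trained weights."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vec):
    return [sum(w * v for w, v in zip(row, vec)) for row in matrix]


class SimpleRecurrentNetwork:
    """Elman-style net: the hidden layer also sees its own previous state."""

    def __init__(self, seed=0):
        rng = random.Random(seed)
        self.w_in = rand_matrix(N_HIDDEN, N_IN, rng)
        self.w_ctx = rand_matrix(N_HIDDEN, N_HIDDEN, rng)
        self.w_out = rand_matrix(N_OUT, N_HIDDEN, rng)
        self.context = [0.0] * N_HIDDEN

    def step(self, token_index):
        """Present one token and return the output activations."""
        x = one_hot(token_index)
        net = [a + b for a, b in zip(matvec(self.w_in, x),
                                     matvec(self.w_ctx, self.context))]
        hidden = [sigmoid(n) for n in net]
        self.context = hidden  # copied back as context for the next step
        return [sigmoid(o) for o in matvec(self.w_out, hidden)]


srn = SimpleRecurrentNetwork()
# Hypothetical indices 0 and 30 stand in for a noun and a verb.
outputs = [srn.step(i) for i in (0, 30, PERIOD, RESET)]
```

Each call to `step` corresponds to one time step in the list above: one word, full stop, or reset is presented, and the output layer is read off immediately.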
Network architecture

The SRN was taught to assign the proper noun identifiers to the appropriate roles for a number of sentence structures.
Types of sentences:
1. simple declarative intransitives, e.g.
Sandy jumped (agent role); Sandy fell (patient role)
2. simple declarative transitives, e.g.
Sandy kissed him (ag. & pt.); Sandy saw him.
3. simple declarative passives, e.g.
Sandy was kissed (pt.)
4. questions, e.g.
Who did Sandy kiss? (ag. & pt., object questioned)
Who kissed Sandy? (ag. & pt., subject questioned)
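The sentence types listed above can be pictured as input/target pairs for the role-assignment task. The sketch below is a guess at one possible layout, not the encoding used in the study: the dictionary format, the `slot_vector` helper, the restriction to two slots, and the treatment of "who" as a slot filler are all assumptions. Only role assignments stated on the slide are filled in, and punctuation tokens are left out.

```python
# Each example pairs a word sequence with the role slots to be filled.
# Roles are taken from the slide; the data layout itself is hypothetical.
EXAMPLES = [
    {"tokens": ["Sandy", "jumped"], "roles": {"agent": "Sandy"}},
    {"tokens": ["Sandy", "fell"], "roles": {"patient": "Sandy"}},
    {"tokens": ["Sandy", "kissed", "him"],
     "roles": {"agent": "Sandy", "patient": "him"}},
    {"tokens": ["Sandy", "was", "kissed"], "roles": {"patient": "Sandy"}},
    {"tokens": ["who", "did", "Sandy", "kiss"],
     "roles": {"agent": "Sandy", "patient": "who"}},
    {"tokens": ["who", "kissed", "Sandy"],
     "roles": {"agent": "who", "patient": "Sandy"}},
]

ROLE_SLOTS = ("agent", "patient")   # assumed subset of the role slots
NOUNS = ["Sandy", "him", "who"]     # assumed noun identifiers


def slot_vector(roles, nouns=NOUNS):
    """Build a slot-based target: one block of units per role slot,
    and within each block one unit per noun identifier."""
    vec = []
    for slot in ROLE_SLOTS:
        filler = roles.get(slot)  # None when the slot is empty
        vec.extend(1.0 if noun == filler else 0.0 for noun in nouns)
    return vec


targets = [slot_vector(example["roles"]) for example in EXAMPLES]
```

With this layout an empty slot is simply a block of zeros, so "Sandy jumped" and "Sandy fell" differ only in which block carries the unit for Sandy.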
Generalisation test

The test involved two systematic gaps, i.e. two types of sentences not present in training:
- passive sentences with experiential verbs, e.g. Dominika was seen by Max.
- questions on embedded subjects in transitive clauses with experiential verbs, e.g. Who did Marta persuade to see Lidka?
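One way to picture the two systematic gaps is as a filter that withholds certain construction and verb-class combinations from the training set. This is a minimal sketch under stated assumptions: the tags `construction` and `verb_class`, the label "action" for kiss, and the `split_corpus` helper are hypothetical and do not come from the study.

```python
# Hypothetical tagging: every sentence carries its construction type and
# the class of its verb. The example sentences are taken from the slides.
CORPUS = [
    {"text": "Sandy was kissed",
     "construction": "passive", "verb_class": "action"},
    {"text": "Dominika was seen by Max",
     "construction": "passive", "verb_class": "experiential"},
    {"text": "Sandy saw him",
     "construction": "transitive", "verb_class": "experiential"},
    {"text": "Who did Marta persuade to see Lidka",
     "construction": "embedded-subject question",
     "verb_class": "experiential"},
]

# The two combinations withheld from training.
GAPS = {
    ("passive", "experiential"),
    ("embedded-subject question", "experiential"),
}


def split_corpus(corpus, gaps=GAPS):
    """Separate sentences that fall into a gap from the training material."""
    train, held_out = [], []
    for sentence in corpus:
        key = (sentence["construction"], sentence["verb_class"])
        (held_out if key in gaps else train).append(sentence)
    return train, held_out


train, held_out = split_corpus(CORPUS)
```

The held-out sentences are then used only for testing, so any correct role assignment on them has to come from generalisation over the constructions that were seen.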
RESULTS:
The SRN (as the connectionists expected) reacted to the two gaps in different ways:
1. It didn't cope with the passive construction.
2. It bridged the questioned-embedded-subject gap through a "conspiracy of construction"
(it was provided with sufficiently varied constructions to cope with this gap successfully).
The same was observed in the case of child L1A.

How did the network represent subjects internally (in the hidden layer)?
1. Each verb-construction combination has a specific place where the subject is encoded.
2. Agents and patients are stored separately because they can appear together, whereas experiencers are stored very close to agents since they never appear together.
CONCLUSIONS:

- The most abstract aspects of lg are learnable.
- The network's ability to abstract from semantics is shown by its ability to partially bridge the artificial gap in the training set (questioned embedded subjects of experiential verbs).
- The SRN was able to define the position of the subject in terms of a semantically abstract entity.