Human simulations of vocabulary learning
Gillette, Gleitman, Gleitman, Lederer
Presentation, Syntax-Psycholinguistics Interface
Y-Lan BOUREAU
1
Outline
Some background
The problem to be solved
Fact: nouns are acquired before verbs
Existing theory
Gillette et al.’s hypothesis
Simulation experiment
Learning from observation
Learning from linguistic hints
Discussion
2
The problem of language learning
Children learn language from scratch
Traditional hypothesis:
Children hear adults speak
They notice that "cat" is uttered most frequently when there is a cat around
They infer that "cat" means 'cat'
But babies' vocabulary does not reflect input frequencies:
there are many more nouns than verbs in babies' vocabulary
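The traditional hypothesis above can be sketched as simple co-occurrence counting. This is only an illustration: the function name `learn_word_meanings` and the toy scenes are assumptions made for the example, not material from Gillette et al.

```python
from collections import Counter, defaultdict


def learn_word_meanings(scenes):
    """Guess a referent for each word from word/world co-occurrence.

    scenes is a list of (words_heard, objects_present) pairs.
    Both the data format and the function are hypothetical.
    """
    cooccurrence = defaultdict(Counter)
    for words_heard, objects_present in scenes:
        for word in words_heard:
            # Count each object once per scene in which the word is heard.
            cooccurrence[word].update(set(objects_present))
    # For each word, pick the object that was present most often.
    return {
        word: counts.most_common(1)[0][0]
        for word, counts in cooccurrence.items()
        if counts
    }


# Toy input (invented for illustration only).
scenes = [
    (["cat"], ["cat", "sofa"]),
    (["cat"], ["cat", "ball"]),
    (["ball"], ["ball", "cat"]),
    (["ball"], ["ball", "sofa"]),
]
guesses = learn_word_meanings(scenes)
```

Such a learner tracks input frequencies directly, which is exactly what babies' vocabulary does not seem to do.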
3
Nouns are learnt earlier
A conceptual hypothesis:
Verbs are conceptually more difficult,
so they cannot be learnt until babies have adequate conceptual knowledge
Alternative hypothesis: information requirements
Learning verbs requires some syntax to be already acquired (e.g. "I know that Mommy is coming")
4
Pairing word to world
Three sources of information:
Nonlinguistic evidence (e.g. Mommy says "cat" when the cat is there)
Linguistic evidence:
• Co-occurrence of semantically related words in sentences (e.g. food names usually appear with verbs like "eat")
• Syntactic structures in which words occur (e.g. a verb with one subject and two complements is likely to be of the "give" kind)
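The syntactic clue above can be pictured as a lookup from the number of complements to a likely verb kind. This is a minimal sketch: the `FRAME_HINTS` table, the `"NP"`/`"VERB"` token names and the one- and zero-complement entries are illustrative assumptions; only the "give" case comes from the slide.

```python
# Hypothetical mapping from number of complements to a verb kind.
FRAME_HINTS = {
    0: "intransitive kind (illustrative assumption)",
    1: "transitive kind (illustrative assumption)",
    2: "'give' kind (the example from the slide)",
}


def guess_verb_kind(frame):
    """Guess a verb kind from a frame such as ["NP", "VERB", "NP", "NP"].

    Every "NP" after the single "VERB" slot counts as a complement.
    """
    verb_index = frame.index("VERB")
    complements = sum(1 for slot in frame[verb_index + 1:] if slot == "NP")
    return FRAME_HINTS.get(complements, "unknown kind")


# One subject and two complements.
kind = guess_verb_kind(["NP", "VERB", "NP", "NP"])
```

The frame narrows the candidates to a class of verbs rather than to a single verb, which is why it is listed alongside the other sources of evidence.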
5
Hypothesis
Hypothesis: the baby
(1) acquires a small stock of nouns by word-to-world pairing
(2) uses that stock of nouns as a scaffold for constructing representations of the linguistic input that will support a more efficient learning procedure
Support: changes in vocabulary size correlate with the appearance of multiword speech
6
A simulation experiment
Principle:
Adult learners
• (so conceptual limitations are no longer an issue)
Task: guess the most frequently used nouns or verbs
Observational clues: video clips
Linguistic clues: co-occurring words, syntactic frame
7
First experiment
Video clips only
Adults try to guess 24 nouns and 24 verbs
8
Results, Part I : Nouns win
Nouns are guessed far more accurately than verbs:
[Figure: bar chart of mean % correct final identification in experiment 1 (axis 0 to 50), nouns vs. verbs]
9
Imageability rules
The clues provided are exclusively visual
The nouns in the set (e.g. elephant, plane, bag) are much more "imageable" than the verbs (e.g. think, know, wait)
[Figure: bar chart of imageability (axis 0 to 8), nouns vs. verbs]
10
Results, part II
[Figure: mean % correct final identification (axis 0 to 50) for nouns and verbs, with an arrow labelled "Correcting with imageability"]
11
Conclusion of experiment I
The one relevant factor seems to be imageability
Not that surprising: from a video, you learn imageable things; a thing that is not imageable would be hard to picture!
12
Linguistic clues vs. Observational clues
All nouns removed.
6 conditions:
1: video clips (with a beep in place of the verb)
2: alphabetical lists of nouns
3: 1+2 (video clips + alphabetical lists)
4: syntactic frames in which all words are nonsense words
5: sentences in which only the verb is a nonsense word
6: 1+5 (video clips + sentences)
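One way to read the six conditions is as combinations of three kinds of cue. The encoding below is a sketch under that assumption: the cue names `"video"`, `"nouns"` and `"syntax"` and the helper `cues_changed` are hypothetical, and treating condition 5 as "nouns plus syntax" is an interpretation of the slide, not something it states.

```python
# Hypothetical encoding of the six conditions as sets of available cues.
CONDITIONS = {
    1: frozenset({"video"}),
    2: frozenset({"nouns"}),
    3: frozenset({"video", "nouns"}),
    4: frozenset({"syntax"}),
    5: frozenset({"nouns", "syntax"}),
    6: frozenset({"video", "nouns", "syntax"}),
}


def cues_changed(before, after):
    """Return (cues gained, cues lost) when moving between two conditions."""
    gained = CONDITIONS[after] - CONDITIONS[before]
    lost = CONDITIONS[before] - CONDITIONS[after]
    return gained, lost


# Moving from condition 3 to condition 4 swaps video and nouns for syntax.
gained, lost = cues_changed(3, 4)
```

Under this reading, condition 4 is the only one that isolates syntax, and condition 6 is the only one that offers all three cues at once.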
13
Results
[Figure: bar chart of mean % correct identification (axis 0 to 100) across the six conditions: video clips; lists of nouns; video clips + lists of nouns; syntactic frames; sentences; videos + sentences. Annotations on the chart: "No nouns provided!", "Nouns reintroduced", "No more visual information!", "Visuals reintroduced".]
16
Linguistic clues vs. Observational clues
Remarkably, there is a leap between conditions 3 and 4, whereas the reverse could have been expected!
Interestingly, the verbs that were best learnt in the observational conditions show a decrease between conditions 3 and 4
17
Discussion
Verbs: complementary distributions (the 12 verbs never learnt with visual clues = the 12 best learnt with linguistic clues)
This distribution corresponds to the "imageability" criterion:
Quite logically, you can learn visually only what is visually representable
Verbs that rely on higher-level linguistic representations have to wait until those can be constructed
18
Discussion : general scheme
First, imageable words are learnt on a word-to-world pairing basis
Those imageable words are mostly nouns
That would explain why nouns make up most of young infants' vocabulary
Second, this first set of words allows new words to be learnt on a sentence-to-world pairing basis
Thus conceptual words can be learnt as well
19
Some reservations
Argument structure is the same across languages (logical requirements), but:
Adults already know the words, so they could try to guess the verbs by exhaustive search over all the information given (e.g. the best performance is for "look", probably because "look" is used with "at")
20
The End
21