Statistical Frequency in Word Segmentation
Words don’t come with nice clean
boundaries between them
• Where are the word boundaries?
Question: How do children work out
where the word boundaries are?
There are several potential clues:
- Pauses (although this is dubious)
- Intonation (this too is dubious)
- Statistical regularities
Statistical Regularities
• English words very rarely begin with [dw],
• English words never begin with [bn],
• English words never begin with [lb],
• Etc.
• So if the child hears these sequences, the
child hypothesizes the sequence occurred
in the middle or at the end of the word.
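The inference above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: letters stand in for phones, the set of illegal onsets holds just the two "never" cases named above, and the input string "tabnik" is a made-up example, not a stimulus from any study.

```python
# Onsets that, per the list above, never begin a word.
ILLEGAL_ONSETS = {"bn", "lb"}

def possible_word_starts(phones, illegal_onsets=ILLEGAL_ONSETS):
    """Return the positions where a word could begin.

    A position is ruled out when the two phones starting there
    form an onset that never begins a word.
    """
    starts = []
    for i in range(len(phones)):
        onset = phones[i:i + 2]
        if onset not in illegal_onsets:
            starts.append(i)
    return starts

# Hypothetical input: position 2 begins "bn", so no word can start there.
print(possible_word_starts("tabnik"))
```

The function only rules positions out; it does not say where the boundaries actually are, which is why further cues are needed.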
Statistical Regularities
• Voiceless stops that begin words are
almost always aspirated,
• Voiced segments that end words are often
de-voiced,
• Various other phonological processes may
occur, e.g., word-final frication, etc.
• So these are phonological clues that may
help segment the speech stream.
Problem
• In order for children to be able to make
use of these cues, they must be able to
track the frequency of such items in the
speech, otherwise it is a useless cue.
• So if the child is not able to track the
frequency of [bn] at the beginning of
words, what use is this strategy?
Statistical Tracking
• Very recent work suggests that children do in
fact have the capacity to track statistical
frequencies of certain elements in their
environment.
• Major researchers: Jenny Saffran (Wisconsin),
Rebecca Gomez (Arizona), Elissa Newport
(Rochester), Richard Aslin (Rochester), LouAnn
Gerken (Arizona), Gary Marcus (NYU), etc.
The Experiment - Overview
• Create a synthesized string of syllables that
occur with particular frequencies (can’t use English…).
• Expose the children to this string of syllables for
~20 minutes.
• Test children to see if they have a preference for
the highly frequent syllable sets or the rare
syllable sets.
• If children show a preference (no matter what
direction that preference is in), then children are
sensitive to frequencies of syllables in the input.
Sample Stimulus
Their language consisted of:
• Four consonants (p,t,b,d)
• Three vowels (a, i, u)
• Which when combined created 12
syllables (pa, ti, bu, da, etc.).
• These then created six words:
• babupu, bupada, dutaba, patubi, pidabu,
and tutibu
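Syllables of the six words:
babupu: ba bu pu
bupada: bu pa da
dutaba: du ta ba
patubi: pa tu bi
pidabu: pi da bu
tutibu: tu ti bu

Syllable frequencies across the six words:
ba 2    bi 1    bu 4
pa 2    pi 1    pu 1
ta 1    ti 1    tu 2
da 2    di 0    du 1

Syllable-pair frequencies within the six words:
babu 1    bupu 1    bupa 1    pada 1
duta 1    taba 1    patu 1    tubi 1
pida 1    dabu 1    tuti 1    tibu 1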
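The frequency counts for this stimulus set can be reproduced with a short Python sketch. It assumes every syllable is exactly one consonant letter plus one vowel letter, which holds for the six words listed; the helper name `syllabify` is introduced here for illustration only.

```python
from collections import Counter

WORDS = ["babupu", "bupada", "dutaba", "patubi", "pidabu", "tutibu"]

def syllabify(word):
    """Split a word into two-letter consonant-vowel syllables."""
    return [word[i:i + 2] for i in range(0, len(word), 2)]

syllable_counts = Counter()
pair_counts = Counter()
for word in WORDS:
    syls = syllabify(word)
    syllable_counts.update(syls)
    # Adjacent syllable pairs inside the word, e.g. "babu" and "bupu".
    pair_counts.update(a + b for a, b in zip(syls, syls[1:]))

# All 12 possible syllables, including any that never occur.
all_syllables = [c + v for c in "bptd" for v in "aiu"]
for syl in all_syllables:
    print(syl, syllable_counts[syl])
```

These counts are taken over the six word types, not over the exposure stream, so they say how often a syllable appears in the lexicon rather than in what the listener hears.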
Transitional Probabilities
• The chances of a word containing bu are
much greater than the chances of a word
containing di.
• Transitional probabilities quantify this.
• The Transitional Probability of xy is:
frequency of xy / frequency of x
Transitional Probabilities
• So for the word babupu, the transitional
probability of babu is calculated as follows:
Frequency of babu / Frequency of ba = 1/2 = 0.5
Frequency of bupu / Frequency of bu = 1/4 = 0.25
Overall transitional probability of the word babupu =
(0.5 + 0.25) / 2 = 0.375
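The same arithmetic can be checked for every word with a small Python sketch. It assumes the frequencies are counted over the six word types, as in the calculation above; the function names `transitional_probability` and `word_tp` are illustrative, not taken from the study.

```python
from collections import Counter

WORDS = ["babupu", "bupada", "dutaba", "patubi", "pidabu", "tutibu"]

def syllabify(word):
    """Split a word into two-letter consonant-vowel syllables."""
    return [word[i:i + 2] for i in range(0, len(word), 2)]

syllable_counts = Counter()
pair_counts = Counter()
for word in WORDS:
    syls = syllabify(word)
    syllable_counts.update(syls)
    pair_counts.update(zip(syls, syls[1:]))

def transitional_probability(x, y):
    """Frequency of the pair xy divided by the frequency of x."""
    return pair_counts[(x, y)] / syllable_counts[x]

def word_tp(word):
    """Mean transitional probability over a word's adjacent syllable pairs."""
    syls = syllabify(word)
    tps = [transitional_probability(x, y) for x, y in zip(syls, syls[1:])]
    return sum(tps) / len(tps)

for word in WORDS:
    print(word, word_tp(word))
```

For babupu this gives (0.5 + 0.25) / 2 = 0.375, matching the worked example.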
What’s the point?
• Transitional probability was manipulated
so that:
• The transitional probability was high within
a word, but low across a word boundary.
This is what a word IS in real life.
ba bu pu | bu pa da | du ta ba
Transitional probability between successive syllables:
ba→bu High, bu→pu High, pu→bu Low (word boundary)
bu→pa High, pa→da High, da→du Low (word boundary)
du→ta High, ta→ba High
• 300 tokens of each of the six words were
randomly concatenated.
• All word boundaries were removed.
• This left 4536 continuous syllables, which
were read by a speech synthesizer.
• The synthesizer produced a monotone stream
of syllables at a rate of 216 syllables per
minute.
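A rough simulation of the stream construction is sketched below in Python. Several things are assumptions: the shuffle is plain random ordering with an arbitrary seed, and the function name `build_stream` is made up for illustration. With 300 tokens per word the sketch yields 5400 syllables rather than the 4536 quoted above (4536 equals 216 syllables per minute times 21 minutes), so treat the token count as an adjustable parameter.

```python
import random
from collections import Counter

WORDS = ["babupu", "bupada", "dutaba", "patubi", "pidabu", "tutibu"]
TOKENS_PER_WORD = 300        # figure quoted above; adjustable assumption
SYLLABLES_PER_MINUTE = 216   # synthesizer rate quoted above

def syllabify(word):
    """Split a word into two-letter consonant-vowel syllables."""
    return [word[i:i + 2] for i in range(0, len(word), 2)]

def build_stream(words, tokens_per_word, seed=0):
    """Randomly concatenate word tokens and drop the word boundaries."""
    rng = random.Random(seed)  # the seed is arbitrary, for repeatability
    tokens = [w for w in words for _ in range(tokens_per_word)]
    rng.shuffle(tokens)
    return [syl for w in tokens for syl in syllabify(w)]

stream = build_stream(WORDS, TOKENS_PER_WORD)
syllable_counts = Counter(stream)
pair_counts = Counter(zip(stream, stream[1:]))

def tp(x, y):
    """Transitional probability of y following x in the stream."""
    return pair_counts[(x, y)] / syllable_counts[x]

print(len(stream), "syllables,", len(stream) / SYLLABLES_PER_MINUTE, "minutes")
print("within word   du->ta:", tp("du", "ta"))
print("within word   bu->pu:", tp("bu", "pu"))
print("across words  pu->bu:", tp("pu", "bu"))
```

In the stream, a within-word transition such as du→ta stays at 1.0, while a transition that only arises across a word boundary, such as pu→bu, depends on which word happens to come next and so comes out much lower.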
Procedures
• Subjects consisted of 24 undergraduate
students.
• Subjects were told to listen to ‘nonsense’
language.
• The task was to figure out where words
begin and end.
• After 3 blocks of 7 minutes of exposure to
the language, subjects were tested.
Test Procedure
• Subjects heard two tri-syllabic strings, e.g.,
bu-pa-da (real word) and pi-da-bu (not a real word)
• Which sounds more like a word from this
nonsense language?
• 36 trials in the test.
Results
• The mean score for all subjects was 27.2
correct out of 36 (about 76%), where chance
is 18. A t-test shows this to be significantly
different from chance.
• Conclusion: adults are able to recognize
what is a word and what is not a word
based purely on statistical frequency.
Additional finding:
• the three words with the most common
syllables in them were easiest to
recognize.
• the three words with the least common
syllables in them were hardest to
recognize.
But can kids do this too?
• Answer appears to be Yes.
• Saffran et al. (1996) used essentially the
same stimuli with 8-month-old children.
• Used four words instead of six.
• Children were exposed for only 2 minutes
(not 21 minutes).
Methodology
• Head-turn procedure
[Diagram: the child, a light, and speakers]
Results
• Children looked significantly longer at the
speaker from which novel words were
being produced.
• Why is this? Why wouldn’t they look
longer at the speaker from which familiar
words are being produced?
Bottom Line
• Children have the ability to track
transitional probabilities of sounds on the
basis of very little exposure.
• This is therefore how words are parsed.
Tool against Nativism…?
• This has recently been the most prominent
weapon against the idea that children use
innate knowledge to acquire language.
• If children are using such sophisticated
skills to segment words, why can’t they
use similar (non-linguistic) skills to learn
syntax?
But it isn’t so simple
• Marcus et al. (1999) trained children on
sentences of the following sort:
• la – ta – la
• ga – na – ga
• da – ba – da
• x–y–x
And tested them on:
• wo – fe – wo
• gi – tu – gi
• po – zi – po
Namely, words with:
-new syllables, but
-the same structure (x-y-x)
And…
• wo – fe – fe
• gi – tu – tu
• po – zi – zi
Namely, words with:
-new syllables, and
-new structure (x-y-y)
Results
• Children appear to recognize the
difference between these sets of stimuli.
• Children are therefore tracking structure
and not just simple statistics.
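The contrast between the two test sets can be made concrete with a short Python sketch. It is only an illustration of what "same structure, new syllables" means; the function name `pattern` is hypothetical, and nothing here is a claim about how children actually compute the pattern.

```python
def pattern(syllables):
    """Label each distinct syllable x, y, z in order of first appearance."""
    labels = {}
    out = []
    for syl in syllables:
        if syl not in labels:
            labels[syl] = "xyz"[len(labels)]  # supports up to three distinct syllables
        out.append(labels[syl])
    return "-".join(out)

training = [["la", "ta", "la"], ["ga", "na", "ga"], ["da", "ba", "da"]]
same_structure = [["wo", "fe", "wo"], ["gi", "tu", "gi"], ["po", "zi", "po"]]
new_structure = [["wo", "fe", "fe"], ["gi", "tu", "tu"], ["po", "zi", "zi"]]

trained_syllables = {syl for item in training for syl in item}

for item in same_structure + new_structure:
    shares_syllables = any(syl in trained_syllables for syl in item)
    print("-".join(item), pattern(item), "shares syllables:", shares_syllables)
```

None of the test items share a syllable with the training items, so syllable-level frequencies cannot separate the two test sets; only the abstract pattern (x-y-x versus x-y-y) does.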
Questions to ask yourself:
• Why would statistical tracking be useful to
linguists?
As a tool to explain language acquisition.
• Does statistical tracking explain how
children acquire language?
No, only certain aspects of it.
• What aspects of language can we track?
So far, it appears only phonologically
related things can be tracked like this (not
meaning-related things).
Most Important Questions
• Is this useful for ALL languages on Earth?
It appears that statistical tracking is only useful
for auditory stimuli, not visual…ASL?
• Are humans the only creatures that can do this?
(I hope so, otherwise other animals should have
language too…)
No. Vervet and tamarin monkeys have been
shown to have essentially the same abilities that
humans do.
So what do we really know?
• Kids have spectacular abilities to track
statistics.
• But so do adults (so why can’t adults learn
languages as well as kids?)
• But so do monkeys (so why can’t monkeys
learn language as well as humans?)
• This ability appears to be limited to
statistics in auditory perception.