Velar palatalization in Russian and artificial grammar.

Vsevolod Kapatsinski
Indiana University
http://mypage.indiana.edu/~vkapatsi/
Rule reliability and productivity
Velar palatalization in Russian and
artificial grammar
Laboratory Phonology XI
30 June – 2 July 2008
Work supported by NIH Training Grant DC-00012
and NIH Research Grant DC-00111
The puzzle of productivity loss
• Morphophonemic rules can lose
productivity while having no exceptions in
the lexicon
• How does this happen? If there are a lot of
examples supporting a rule, why would it
fail?
Case study:
Velar palatalization in Russian
k → tʃ
g → ʒ
/ _ -i (verbal stem extension)
/ _ -ek/-ik (nominal diminutive)
/ _ -ok (nominal diminutive)
Exceptionless in the lexicon (Levikova 2003, Sheveleva 1974).
Fully productive before -ek and -ok, but only partially productive before -i and -ik.
Why?
Hypothesis
• Rules are extracted from the lexicon
• Rules compete for inputs
• Competition is resolved by relative reliability
• Reliability = number of inputs that undergo the rule divided by the number of inputs that could undergo the rule (Albright and Hayes 2003, Pierrehumbert 2006)
For [] → ed: number of verbs that take -ed / number of verbs in English
Rule-Based Learner
(Albright and Hayes 2003)
• Takes in a lexicon of pairs of morphologically related words
blok, blotʃi-
sok, sotʃi-
sobak, sobatʃi-
zavtrak, zavtraka-
• Generalizes rules from it and weights them by reliability
k → tʃi / o_ (1.0)
k → tʃi / V[+back;-high]_ (0.75)
[] → a / ak_ (0.5)
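The reliabilities on this slide can be recomputed from the four-word toy lexicon. The following is a minimal sketch, not the Albright and Hayes learner itself: the rules are written by hand rather than induced, and V[+back;-high] is approximated as "o" or "a", the only vowels that precede k in the toy lexicon. The function and variable names are illustrative.

```python
# Toy lexicon from the slide: (base, suffixed form) pairs.
LEXICON = [
    ("blok", "blotʃi"),
    ("sok", "sotʃi"),
    ("sobak", "sobatʃi"),
    ("zavtrak", "zavtraka"),
]


def palatalize_i(base):
    """k -> tʃi: replace the final k with tʃ and add -i."""
    return base[:-1] + "tʃi"


def add_a(base):
    """[] -> a: add -a with no stem change."""
    return base + "a"


# Each rule: (context test on the base, change applied to the base).
# Assumption: V[+back;-high] is approximated by the endings "ok" and "ak".
RULES = {
    "k -> tʃi / o_": (lambda b: b.endswith("ok"), palatalize_i),
    "k -> tʃi / V[+back;-high]_": (lambda b: b.endswith(("ok", "ak")), palatalize_i),
    "[] -> a / ak_": (lambda b: b.endswith("ak"), add_a),
}


def reliability(in_scope, change, lexicon):
    """Inputs that undergo the rule divided by inputs that could undergo it."""
    scope = [(base, form) for base, form in lexicon if in_scope(base)]
    hits = [base for base, form in scope if change(base) == form]
    return len(hits) / len(scope) if scope else 0.0


RELIABILITIES = {
    name: reliability(test, change, LEXICON)
    for name, (test, change) in RULES.items()
}

for name, value in RELIABILITIES.items():
    print(f"{name}: {value:.2f}")
```

Under these assumptions the three values come out as 1.0, 0.75, and 0.5, matching the slide.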
• For each distinct output that an input can become, there will be one rule that is more reliable than the other rules producing that output from that input
bok → botʃi
k → tʃi / o_ (1.0)
k → tʃi / V[+back;-high]_ (0.75)
• The probability of an output given an input is the reliability of the most reliable applicable rule producing that output, divided by the sum of the reliabilities of the most reliable rules leading to each distinct output
bok → botʃi: 1/(1+0.5) = 67%
bok → boka: 0.5/(1+0.5) = 33%
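The two steps described for bok (keep the most reliable rule per output, then normalize) can be sketched as follows. This is an illustrative reading of the slide, with the reliabilities entered by hand exactly as shown there; `output_probabilities` is a hypothetical helper name.

```python
def output_probabilities(applicable_rules):
    """Turn (output, reliability) pairs for one input into output probabilities."""
    if not applicable_rules:
        raise ValueError("no applicable rules")
    # Step 1: for each distinct output, keep only its most reliable rule.
    best = {}
    for output, rel in applicable_rules:
        best[output] = max(rel, best.get(output, 0.0))
    # Step 2: divide each winner's reliability by the sum over all winners.
    total = sum(best.values())
    return {output: rel / total for output, rel in best.items()}


# Rules applicable to the novel input "bok", with reliabilities from the slide.
BOK_RULES = [
    ("botʃi", 1.0),   # k -> tʃi / o_
    ("botʃi", 0.75),  # k -> tʃi / V[+back;-high]_
    ("boka", 0.5),    # [] -> a, reliability as given on the slide
]

BOK = output_probabilities(BOK_RULES)
for output, p in BOK.items():
    print(f"bok -> {output}: {p:.0%}")
```

The less reliable of the two palatalizing rules (0.75) drops out in step 1, which is why the denominator is 1 + 0.5 rather than 1 + 0.75 + 0.5.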
blok, blotisok, sotilak, latizavtrak, zavtraka-
k  ti / o_ (1.0)
k  ti / V[+back;-high]__(0.75)
[]  a / ak_ (0.5)
bak 
bati 0.75/(0.75+0.5) = 60%
baka 0.5/(0.5+0.75) = 40%
*baki  palatalization never fails before -i
7
blok, blotisok, sotisobak, sobatizavtrak, zavtraka-
plat
kos
trub
var
ver
sol
voz
sor
ar
platikositrubivariverisolivozisoriari-
k  ti / o_ (1.0)
k  ti / V[+back;-high]__(0.75)
[]  i / C_ (0.69)
[]  a / ak_ (0.5)
bak 
bati 0.75/(0.75+0.5+0.69) = 39%
baka 0.5/(0.5+0.75+0.69) = 26%
8
baki 0.69/(0.5+0.75+0.69) = 36%  palatalization fails
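To see why the non-velar stems matter, one can vary how many of them take -i and watch the predicted share of unpalatalized baki change. The sketch below assumes the four velar-final toy stems and the reliabilities 0.75 and 0.5 from the slide, and treats [] → i / C_ as applying to every consonant-final base; the function name and the sweep values other than 9 are illustrative.

```python
def bak_probabilities(n_nonvelar_i, n_velar=4):
    """Predicted outputs for novel 'bak' given n non-velar stems that take -i."""
    # [] -> i / C_ could apply to every consonant-final base, but it only
    # yields the attested form for the non-velar stems (assumption: the
    # velar-final toy stems all palatalize or take -a instead).
    rel_plain_i = n_nonvelar_i / (n_nonvelar_i + n_velar)
    best = {"batʃi": 0.75, "baka": 0.5, "baki": rel_plain_i}
    total = sum(best.values())
    return {output: rel / total for output, rel in best.items()}


for n in (0, 3, 9, 30):
    probs = bak_probabilities(n)
    print(n, {output: round(p, 2) for output, p in probs.items()})
```

With no non-velar stems taking -i, baki has probability zero and palatalization never fails. With the nine stems on the slide, the reliability of [] → i / C_ is 9/13 ≈ 0.69 and baki reaches about 36%, even though no velar-final word in the lexicon ever fails to palatalize.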
[Figure, repeated for -ek/-ok and for -i/-ik. Labels: stored words derived from a non-velar input and bearing -i; stored words derived from a velar-final input and bearing -i; new inputs that end in a velar and take -i; -i is preceded by an alveopalatal in the output; -i is preceded by a velar in the output.]
Testing the hypothesis
• Borrowings from English in online communication
– Inputs:
• Take all verbs and nouns that end in /k/ or /g/ from the British
National Corpus, e.g., lock
• Plus a sample of verbs and nouns ending in other stops (for nouns,
matched preceding vowel proportions)
– Outputs:
• Choose suffix
– For a verb: -i, -a, or -ova
– For a noun: -ik, -ek, or -ok
56 velar-final, 140 non-velar-final
20 velar-final, 40 non-velar-final
• Choose whether to change the stem
– For a verb: lokatj, lokovatj, lotʃitj, lokitj
– For a noun: lotʃok, lokok, lotʃek, lokek, lotʃik, lokik
– Count:
• Submit the possible outputs to Google
• Rate of velar palatalization failure: lokitj / (lotʃitj + lokitj)
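The failure rate defined above is a simple proportion over the two -i forms of a borrowed stem. A minimal sketch, with made-up hit counts standing in for the search results (the talk does not report per-item counts):

```python
def failure_rate(unpalatalized_hits, palatalized_hits):
    """Share of -i forms that keep the velar, e.g. lokitj / (lotʃitj + lokitj)."""
    total = unpalatalized_hits + palatalized_hits
    if total == 0:
        return None  # neither form attested, so the rate is undefined
    return unpalatalized_hits / total


# Hypothetical counts for one stem, for illustration only.
EXAMPLE_RATE = failure_rate(unpalatalized_hits=30, palatalized_hits=70)
print(EXAMPLE_RATE)
```

Returning None for unattested stems keeps them out of any later average instead of counting them as zero failure.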
Results:
Stem extensions
[Figure: likelihood of taking -i by the final consonant of the base (velar-final, labial-final, coronal-final).]
Velars favor -a over -i, while -i is favored elsewhere.
Results:
Stem extensions
Mean: 44%
Velar palatalization is likely to fail before -i despite being exceptionless, and -i is favored by non-velar-final inputs.
Results: Diminutives
[Figure: three panels with means of 1%, 35%, and 0%.]
• -ik is favored by non-velars
• -ok and -ek are favored by velars
• Velar palatalization fails only before -ik
Results: Diminutives
[Figure: panels by suffix (-ek, -ik, -ok) and stem-final consonant (g; k; p, b, t, d), with means of 10%, 0%, 1%, 0%, 35%, and 100%.]
Evidence from artificial grammar
• Issue: which way does the causation run?
• Speakers avoid using -i after velars because velar palatalization is unproductive before -i
OR
• Velar palatalization is unproductive before -i because -i is mostly used after non-velars
Evidence from artificial grammar
• Native English speakers exposed to two artificial languages:
Language                   BLUE        RED
{k;g} → {tʃ;dʒ}i           100% (30)   100% (30)
{t;d;p;b} → {t;d;p;b}i     25% (8)     75% (24)
{t;d;p;b} → {t;d;p;b}a     75% (24)    25% (8)
(Counts in parentheses are the numbers shown alongside each percentage.)
Paradigm
(Bybee and Newman 1995)
• The subject repeats the singular-plural pair
• The subject says the plural
Results
[Figure: BLUE vs. RED, difference marked ***.]
As expected, -i is more productive with non-velars in the Red language.
Results
[Figure: rate of velar palatalization, BLUE vs. RED, difference marked *.]
The rate of velar palatalization is lower in the Red language than in the Blue language.
Prediction confirmed.
Results
[Figure: relationship marked ***.]
The more productive -i is with non-velar-final inputs for a subject, the less productive velar palatalization is for the same subject.
Constraining the model:
Processing stages
• Two-stage model:
– Stage I: -i vs. -a
– Stage II: g → ʒ vs. ‘do nothing’
• One-stage model:
– g → ʒi vs.
– g → ga vs.
– C → Ci
Context effects
Mean: 44%
Velar palatalization is likely to fail before -i despite being exceptionless
Explaining context effects
• Context effects are due to differences in the relative reliabilities of specific velar-changing rules
Suffix and stem change chosen together (one stage):
g → ʒi / V[+back;-high]_ (.475)
g → ʒi / V[-high]_ (.350)
g → ʒi / V_ (.272)
g → ʒi / [+voice]_ (.195)
Stem change chosen after the suffix (two stages):
g → ʒ / V[+back;-high]_i (1.0)
g → ʒ / V[-high]_i (1.0)
g → ʒ / V_i (1.0)
g → ʒ / [+voice]_i (1.0)
No-change competitors: [] → i / C[+voiced]_ and [] → [] / C[+voiced]_i (reliabilities shown: .756 and .232)
log: .475 (one stage) or 1.0 (two stages), against the no-change competitor
ping: .195 (one stage) or 1.0 (two stages), against the no-change competitor
Suppose that the decision on whether to change the stem is made in the context of an already chosen suffix (-i).
In this context, all velar-changing rules are completely reliable (they are exceptionless).
Thus, relative reliability predicts context effects only if the suffix and the stem change are chosen simultaneously.
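The argument can be checked numerically without committing to a value for the competing no-change rule. The sketch below assumes a pairwise competition between one velar-changing rule and one no-change rule, takes .475 (log) and .195 (ping) as the one-stage reliabilities and 1.0 as the two-stage reliability as read from the slide, and leaves the competitor's reliability as a free parameter; all names are illustrative.

```python
def p_change(rel_change, rel_no_change):
    """Probability of the stem change in a two-way competition."""
    return rel_change / (rel_change + rel_no_change)


# Reliability of the best velar-changing rule for each borrowed stem.
ONE_STAGE = {"log": 0.475, "ping": 0.195}  # suffix and stem change chosen together
TWO_STAGE = {"log": 1.0, "ping": 1.0}      # stem change chosen after -i is fixed


def context_effect(reliabilities, rel_no_change):
    """Difference in predicted palatalization between log and ping."""
    return (p_change(reliabilities["log"], rel_no_change)
            - p_change(reliabilities["ping"], rel_no_change))


# Hypothetical competitor reliabilities, swept to show the pattern is general.
for competitor in (0.2, 0.5, 0.8):
    print(competitor,
          round(context_effect(ONE_STAGE, competitor), 3),
          round(context_effect(TWO_STAGE, competitor), 3))
```

For any positive competitor reliability, the one-stage setup predicts more palatalization for log than for ping, while the two-stage setup predicts no difference, since both stems then have a fully reliable velar-changing rule.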
Constraining the model:
Decision rule
• Rule-Based Learner relies on a stochastic decision between competing rules
• The speaker cannot simply choose the most reliable rule every time
– The most reliable rule in both the Blue language and the Red language is the palatalizing one, so under a deterministic choice the two languages should not differ
– Albright and Hayes (2003)
• Novel verbs that are similar to many regular English verbs are more likely to take the regular past tense than novel verbs that are similar to neither regular nor irregular English verbs
• The regular rule is the most reliable one in both cases
• Under a deterministic choice, the two classes of words should not differ
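The contrast between the two decision rules can be illustrated with the toy reliabilities for bak from earlier in the talk (0.75, 0.69, 0.5). This is a sketch only: the stochastic rule is implemented as sampling in proportion to reliability, which is one way to realize the normalization described for the Rule-Based Learner, and the seed and sample size are arbitrary.

```python
import random

# Most reliable rule per output for the novel input "bak" (toy values).
BAK_RULES = {"batʃi": 0.75, "baki": 0.69, "baka": 0.5}


def deterministic_choice(best):
    """Always pick the output backed by the single most reliable rule."""
    return max(best, key=best.get)


def stochastic_choice(best, rng):
    """Pick an output with probability proportional to its rule's reliability."""
    outputs = list(best)
    weights = [best[output] for output in outputs]
    return rng.choices(outputs, weights=weights, k=1)[0]


rng = random.Random(0)  # fixed seed so the run is repeatable
SAMPLES = [stochastic_choice(BAK_RULES, rng) for _ in range(10_000)]
SHARE_BAKI = SAMPLES.count("baki") / len(SAMPLES)

print(deterministic_choice(BAK_RULES))
print(round(SHARE_BAKI, 2))
```

The deterministic rule returns batʃi every time, however close the competing reliabilities are, so it cannot produce a partial failure rate. The stochastic rule yields unpalatalized baki in roughly a third of the samples, and that share moves as the reliability of the competing rule changes.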
Summary
If
• Rules compete
• The outcome of competition is influenced by reliability (Albright and Hayes 2003, Pierrehumbert 2006)
• Known words are retrieved from the lexicon, not generated by the grammar
Then
• An exceptionless rule can lose productivity while remaining exceptionless if the triggering affix comes to be used mostly with segments that cannot undergo the rule.
To account for the present results,
• Competition between rules must be resolved stochastically.
• The suffix and the stem shape must be chosen during a single decision stage.
References
Albright, A., and B. Hayes. 2003. Rules vs. analogy in English past tenses: A computational / experimental study. Cognition, 90, 119-61.
Bybee, J., and J. Newman. 1995. Are stem changes as natural as affixes? Linguistics, 33, 633-54.
Kapatsinski, V. M. 2005. Characteristics of a rule-based default are dissociable: Evidence against the Dual Mechanism Model. In S. Franks, F. Y. Gladney, and M. Tasseva-Kurktchieva (eds), Formal Approaches to Slavic Linguistics 13: The South Carolina Meeting, 136-46. Ann Arbor, MI: Michigan Slavic Publications.
Levikova, S. I. 2003. Bol’shoj slovar’ molodezhnogo slenga. [The big dictionary of youth slang]. Moscow: Fair-Press.
Pierrehumbert, J. B. 2006. The statistical basis of an unnatural alternation. In L. Goldstein, D. H. Whalen, and C. Best (eds), Laboratory Phonology VIII: Varieties of Phonological Competence, 81-107. Berlin: Mouton de Gruyter.
Sheveleva, M. S. 1974. Obratnyj slovar’ russkogo jazyka. [Reverse dictionary of Russian]. Moscow: Sovetskaja Enciklopedija.