Transcript ppt

CMSC 723 / LING 645: Intro to
Computational Linguistics
September 22, 2004: Dorr
Supplement: PC-Kimmo Tutorial
Prof. Bonnie J. Dorr
Dr. Christof Monz
TA: Adam Lee
Building Automata in Kimmo:
English Epenthesis
Chomsky & Halle: +s → es / X__, else s; where X = {s, ch, s, sh, x}
How do we implement this? X+s ==> Xes; else Y+s ==> Ys
What does this look like? Let S = {x, s, z}
S
+:0, =
S
+:e
s
Note 1: S = S:S, s = s:s, etc.
Note 2 : No way to get out of
last state except for `s’
Building Automata in Kimmo:
English Epenthesis
Chomsky & Halle: +s → es / X__, else s; where X = {s, ch, s, sh, x}
How do we implement this? X+s ==> Xes; else Y+s ==> Ys
What does this look like? Let S = {x, s, z}
S
+:0, =
S
+:e
+:e
+:0
Note 1: S = S:S, s = s:s, etc.
=, S
Note 2 : No way to get out of
last state except for `s’
s
Note 3: No way to get out of
intermed state on `s’
Building Automata in Kimmo:
English Epenthesis
This takes care of x, s, z, but what about ch, sh?
+:0, =
S
S
+:e
+:0
=, S
s
+:e
Building Automata in Kimmo:
English Epenthesis
+:0, =
This takes care of x, s, z, but what about ch, sh, ss?
Add two new states. (And now we need to add numbers!)
+:e
s
s,h,S
s
S
S
+:e
c
s,h,S +:0
+:e
c
=, S
s
Building Automata in Kimmo:
English Epenthesis
Add numbers to states! Problem: Now that we have
introduced c, s, h, we need to
worry about these in other states!
+:e
4
s
s,h,S
+:0, =
s
S
S
+:e
5
1
3
c
s,h,S +:0
+:e
2
6
c
=, S
s
Building Automata in Kimmo:
English Epenthesis
Add numbers to states! Problem: Now that we have
introduced c, s, h, we need to
worry about these in other states!
+:e
4
s
s,h,S
+:0, =, h
s
s
S
Sc
+:e
5
1
3
c
c
s,h,S
+:e
2
h
+:0
6
c
=, S, c, h
s
Building Automata in Kimmo:
English Epenthesis
Add numbers to states! Problem: In fact, we have to worry
about all feasible pairs in every
state.
+:e
4
s
s,h,S
+:0, =, h
=s
s
=
S
Sc
+:e
5
1
3
c
Note: If a feasible
c
=
s,h,S
+:e
pair is missing
2
h
+:0
between 2 states,
6
it is assumed that
c
pair is not
possible, e.g., we
cannot go from 6
to 1 on s:s
=, S, c, h
s
Building Automata in Kimmo:
English Epenthesis
Unfortunately, life can get even more complicated (because of restriction below):
there can be interaction with other automata. For example, Epenthesis interacts
with Y-replacement, so we need a feasible pair for y:i and +:0 in Epenthesis!
s
=s
+:0, =, h
Note: If a feasible
pair is missing
between 2 states,
it is assumed that
pair is not
possible, e.g., we
cannot go from 6
to 1 on s:s
1
h
4
s,h,S
=
Sc
c
=
+:e
s
S
+:e
3
c
5
s,h,S
2
+:e
+:0
6
c
=, S, c, h
s
Building Automata in Kimmo:
English Epenthesis
Here is the actual automaton matrix used in simple-english.aut
RULE "Epenthesis" 6 9
c h s S y + + = _
s
= s
+:0, =, h
1
4
c
=
h
s,h,S
s
S
=
S c
+:e
3
c
5
s,h,S
2
c h s S i e 0 = _
+:e
+:e
+:0
c
=, S, c, h
s
6
1:
2 1 4 3 3 0 1 1 1
2:
2 3 3 3 3 0 1 1 1
3:
2 1 3 3 3 5 6 1 1
4:
2 3 3 3 3 5 6 1 1
5.
0 0 1 0 0 0 0 0 0
6:
1 1 0 1 1 5 6 1 1
Hint: This is the sort of thing
you’ll need for German find+t
What about long-distance
dependencies?
 How do we handle Buch and Bücher?
 Trick: Use maybe-umlaut, coupled with special
marker in the suffix
 Root form in lexicon: b|ch Continuation class (or
alternation) is called /NOUN-NEUT-ADD-ER
 In lexicon for NOUN-NEUT-ADD-ER, put ending
+&er, where & indicates that the root may have a
character that should be umlauted.
 In the Maybe-Umlaut automaton, search for
sequence “| .. +&e” and change to “* .. +&e”
Building a Lexicon in Kimmo:
Simple English Example
 ALTERNATION /Root Root






ALTERNATION /N N
ALTERNATION /End End
LEXICON INITIAL
0 /Root "(“
LEXICON N
0 /C1 "(cat n) (person p3) (number sg)“
+s /C2 "(cat n) (person p3) (number pl)“
+y /A "(cat a)“
LEXICON Root
spy /N
spy /V
LEXICON End
0 # “)”
END
Demo of PC-Kimmo
 cd kimmo
 pckimmo
 load rules simple-english.aut
 load lexicon simple-english.dic
 generate try+s
 recognize tries
 set tracing on
 recognize tries