Tectogrammatical - Institute of Formal and Applied Linguistics
PDT: Tectogrammatical Representation
Jan Hajič
Institute of Formal and Applied Linguistics
School of Computer Science
Faculty of Mathematics and Physics
Charles University, Prague
Czech Republic
March 5, 2008
Companions Semantic Representation and Dialog
Interfacing Workshop - Tectogrammatics
Tectogrammatical Annotation (tlayer)
Underlying (deep) syntax
4 sublayers (integrated):
dependency structure, (detailed) functors
valency annotation
topic/focus and deep word order
coreference (mostly grammatical only)
all the rest (“grammatemes”):
detailed functors, underlying gender, number, ...
Total: 39 attributes (vs. 5 at m-layer, 2 at a-layer)
Analytical vs. Tectogrammatical representation
[Figure: analytical and tectogrammatical trees of the same sentence (TR: sublayer 1 only shown). Callouts: underlying verb + tense; deep function; elided Actor in; another ellipsis...; prepositions out]
Layer 3: Tectogrammatical
Underlying (deep) syntax
4 sublayers:
dependency structure, (detailed) functors
topic/focus and deep word order
coreference (mostly grammatical only)
all the rest (grammatemes):
detailed functors
underlying gender, number, ...
Tectogrammatical Functors
“Actants” (syntactic): ACT, PAT, EFF, ADDR, ORIG
modify: verbs, nouns, adjectives
cannot repeat in a clause, usually obligatory
Free modifications (~ 50), semantically defined
can repeat; optional, sometimes obligatory
Ex.: LOC, DIR1, ...; TWHEN, TTILL, ...; RSTR; BEN, ATT, ACMP, INTT, MANN; MAT, APP; ID, DPHR, ...
Special: Coordination, Rhematizers, Foreign phrases, ...
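The rule that actants cannot repeat while free modifications can is easy to sketch as a check over the dependents of one governing word. This is only an illustration, not PDT tooling: the function name `repeated_actants` and the plain list-of-functors input are assumptions made for the example, and coordinated dependents are ignored.

```python
from collections import Counter

# The five actant functors named on the slide; every other functor is
# treated here as a free modification, which is allowed to repeat.
ACTANTS = {"ACT", "PAT", "EFF", "ADDR", "ORIG"}

def repeated_actants(child_functors):
    """Return the sorted actant functors that occur more than once
    among the dependents of a single governing word."""
    counts = Counter(f for f in child_functors if f in ACTANTS)
    return sorted(f for f, n in counts.items() if n > 1)

# "he gave Mary a book": one ACT, one ADDR, one PAT, so nothing repeats.
print(repeated_actants(["ACT", "ADDR", "PAT"]))               # []
# Two LOC modifications are fine, two PAT actants are reported.
print(repeated_actants(["ACT", "PAT", "LOC", "LOC", "PAT"]))  # ['PAT']
```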
Tectogrammatical Example
Analytical verb form: he would be allowed to be enrolled
Collapsed
Additional attributes (grammatemes): conditional + “allow”
Tectogrammatical Example
Predicate with copula (state)
you were fired
Tectogrammatical Example
Passive construction (action)
(The) book has been translated [by Mr. X]
Disappeared
Added
Tectogrammatical Example
Object
he gave Mary a book
Obj goes into ACT, PAT, ADDR, EFF or ORIG based on governor’s valency frame
Tectogrammatical Example
Relative clause (embedded)
the woman, who had a French accent, was very pretty
Tectogrammatical Example
Incomplete phrases
Peter works well, but Paul badly
Added
Deep Word Order, Topic/Focus
Example: Baker bakes rolls. vs. Baker_IC bakes rolls.
[Figure: analytical dep. tree]
Deep Word Order, Topic/Focus
Deep word order:
from “old” information to the “new” one (left-to-right) at every level (head included)
projectivity by definition (almost...)
i.e., partial level-based order -> total d.w.o.
Topic/focus/contrastive topic
attribute of every node (t, f, c)
restricted by d.w.o. and other constraints
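One way to read "partial level-based order -> total d.w.o." is as a recursive flattening: if every node orders itself among its own dependents, emitting each subtree as one contiguous block gives a total order that is projective by construction. The sketch below illustrates that reading only. The names `TNode`, `linearize` and `deepord`, the left/right representation, and numbering from 1 are all assumptions, and lemmas stand in for node ids, so repeated lemmas would collide in `deepord`.

```python
from dataclasses import dataclass, field

@dataclass
class TNode:
    lemma: str
    # Dependents placed before / after the head at this level, each list
    # already ordered from "old" to "new" information.
    left: list = field(default_factory=list)
    right: list = field(default_factory=list)

def linearize(node):
    """Flatten the per-level orders into one total order.

    Each subtree is emitted as a contiguous block, so no dependency
    edge can cross another one (the order is projective)."""
    out = []
    for child in node.left:
        out.extend(linearize(child))
    out.append(node.lemma)
    for child in node.right:
        out.extend(linearize(child))
    return out

def deepord(node):
    """Number the nodes by their position in the total order."""
    return {lemma: i for i, lemma in enumerate(linearize(node), start=1)}

tree = TNode("bake", left=[TNode("baker")], right=[TNode("roll")])
print(linearize(tree))  # ['baker', 'bake', 'roll']
print(deepord(tree))    # {'baker': 1, 'bake': 2, 'roll': 3}
```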
Coreference
(intro only: see Silvie’s part)
Grammatical (easy)
relative clauses
which, who
Peter and Paul, who ...
control
infinitival constructions
John promised to go ...
reflexive pronouns
{him,her,them}self(-ves)
Mary saw herself in ...
Tree for the control example:
promise PRED
  John ACT
  go PAT
    he ACT
    home DIR3
Coreference
Textual
Ex.: Peter moved to Iowa after he finished his PhD.
move PRED
  Peter ACT
  Iowa DIR1
  finish TWHEN
    he ACT
    PhD PAT
      he APP
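Coreference can be pictured as a reference from one node to another node's id. The sketch below encodes the textual example this way. The node ids, the dictionary layout and the `antecedent` helper are hypothetical. The keys `coref_gram.rf` and `coref_text.rf` follow the attribute names listed later in this deck, and the two links from "he" to "Peter" are my reading of the example sentence.

```python
# Nodes keyed by hypothetical ids; only the fields needed here are shown.
nodes = {
    "t1": {"t_lemma": "move", "functor": "PRED"},
    "t2": {"t_lemma": "Peter", "functor": "ACT"},
    "t3": {"t_lemma": "Iowa", "functor": "DIR1"},
    "t4": {"t_lemma": "finish", "functor": "TWHEN"},
    "t5": {"t_lemma": "he", "functor": "ACT", "coref_text.rf": "t2"},
    "t6": {"t_lemma": "PhD", "functor": "PAT"},
    "t7": {"t_lemma": "he", "functor": "APP", "coref_text.rf": "t2"},
}

def antecedent(node_id, nodes):
    """Follow grammatical or textual links until a node without a link
    is reached, and return that node's id."""
    seen = set()
    while node_id not in seen:
        seen.add(node_id)
        node = nodes[node_id]
        target = node.get("coref_gram.rf") or node.get("coref_text.rf")
        if target is None:
            return node_id
        node_id = target
    raise ValueError("coreference cycle")

print(nodes[antecedent("t5", nodes)]["t_lemma"])  # Peter
print(nodes[antecedent("t7", nodes)]["t_lemma"])  # Peter
```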
“Grammatemes”
Detailed functors (subfunctors)
only for some functors:
TWHEN: before/after
LOC: next-to, behind, in-front-of, ...
also: ACMP, BEN, CPR, DIR1, DIR2, DIR3, EXT
Lexical (underlying)
number (SG/PL), tense, modality, degree of comparison, ...
strictly only where necessary (agreement!)
Tectogrammatical attributes I
node typing: complex, coap, qcomplex, root, atom, ...
functor, subfunctor
TWHEN: TWHEN.basic, TWHEN.before
is_member, is_generated, is_parenthesis, is_dsp_root, is_state, quot_type, ...
grammatemes (16): aspect, degcmp, deontmod, sempos, tense, indeftype, politeness, person, ...
Tectogrammatical attributes II
topic/focus: tfa, deepord
valency: t_lemma, val_frame.rf
bookkeeping: id
coref_gram.rf, coref_text.rf, compl.rf: reference to TR node, type of coreference
sentmod
Linking to analytical layer
a.lex.rf (“main” anal. node), a.aux.rf (others)
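To make the attribute list more concrete, here is a sketch of a node record holding a small subset of these attributes, with a helper that collects the analytical nodes a t-node points to. It is not the PDT data format: the class `TNodeAttrs`, the underscore spellings of the dotted attribute names, and the node ids are all assumptions for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class TNodeAttrs:
    id: str
    t_lemma: str
    functor: str
    tfa: Optional[str] = None            # 't', 'f' or 'c'
    deepord: Optional[int] = None
    val_frame_rf: Optional[str] = None   # stands for val_frame.rf
    a_lex_rf: Optional[str] = None       # stands for a.lex.rf
    a_aux_rf: list = field(default_factory=list)  # stands for a.aux.rf
    is_generated: bool = False

def surface_nodes(t):
    """Ids of all analytical nodes this t-node stands for: the "main"
    node first, then the auxiliary ones (e.g. a preposition)."""
    lex = [t.a_lex_rf] if t.a_lex_rf else []
    return lex + list(t.a_aux_rf)

# A noun whose preposition is absorbed into the same t-node.
tygr = TNodeAttrs(id="t3", t_lemma="tygr", functor="PAT", tfa="f",
                  a_lex_rf="a5", a_aux_rf=["a4"])
# An elided participant added at the t-layer has no surface counterpart.
elided = TNodeAttrs(id="t4", t_lemma="he", functor="ACT", is_generated=True)

print(surface_nodes(tygr))    # ['a5', 'a4']
print(surface_nodes(elided))  # []
```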
Fully Annotated Sentence
He spends his days sketching passers-by, or trying to.
Definition of Valency
Ability (“desire”) of words (verbs, nouns, adjectives) to combine themselves with other units of meaning
Properties of valency:
Specific for every word meaning (in general)
leave: sb left sth for sb vs. sb left from somewhere
same as in PropBank leave.02 vs. leave.01
Typically strongly correlates with surface form
morphological case (~ ending), preposition+case, ...
Semantic constraints are very dangerous
Structure of Valency
word (lemma): vyměnit (to replace)
word sense group 1: vyměnit1
valency frame (slot1 slot2 slot3): ACT PAT EFF
surface expression: Nom. Acc. za+Acc.
word sense group 2: vyměnit2
...
PDT VALLEX (Cz), EngVallex (En)
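The lemma / sense group / frame / slot hierarchy maps naturally onto nested records. The sketch below is one possible in-memory layout, not the actual PDT-VALLEX or EngVallex file format. The names `Slot`, `Frame`, `lexicon` and `surface_form` are assumptions, and only the single frame shown on the slide is filled in.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Slot:
    functor: str   # e.g. ACT, PAT, EFF
    form: str      # surface expression, e.g. "Nom." or "za+Acc."

@dataclass(frozen=True)
class Frame:
    frame_id: str
    slots: tuple

lexicon = {
    "vyměnit": [
        Frame("vyměnit1", (Slot("ACT", "Nom."),
                           Slot("PAT", "Acc."),
                           Slot("EFF", "za+Acc."))),
        # vyměnit2 and any further sense groups are elided on the slide.
    ],
}

def surface_form(lemma, frame_id, functor):
    """Look up how a slot of a given frame is expressed on the surface;
    returns None if the lemma, frame or slot is not in the lexicon."""
    for frame in lexicon.get(lemma, []):
        if frame.frame_id == frame_id:
            for slot in frame.slots:
                if slot.functor == functor:
                    return slot.form
    return None

print(surface_form("vyměnit", "vyměnit1", "EFF"))  # za+Acc.
```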
PDT-VALLEX Entry
dosáhnout: “to reach”, “to get [sb to do sth]”
browser/user-formatted example:
Corpus <-> Valency Lexicon
Corpus:
Sentence 2035:
Sentence 15345:
Sentence 51042:
Lexicon:
ENTRY: uzavřít
vf1: ACT(.1) CPHR({smlouva}.4)
ex.: u. dohodu (close a contract)
vf2: ACT(.1) PAT(.4)
ex.: u. pokoj (close a room, house)
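The linear frame notation in the entry can be split into (functor, form) pairs with a small parser, which makes it possible to ask which frames of an entry fit the functors seen on a verb occurrence. The sketch below is a simplification: the regular expression, `parse_frame` and `frames_with` are my own names, the form string is left unparsed, and matching by functor set alone is an assumption for illustration, not a description of how corpus occurrences were actually linked to frames.

```python
import re

# A functor name (optionally ending in a digit, e.g. DIR1) followed by a
# parenthesised form description that contains no nested parentheses.
SLOT_RE = re.compile(r"([A-Z]+\d?)\(([^()]*)\)")

def parse_frame(text):
    """Split a linear frame such as 'ACT(.1) PAT(.4)' into
    (functor, form) pairs; the form string is kept as written."""
    return SLOT_RE.findall(text)

entry = {
    "vf1": parse_frame("ACT(.1) CPHR({smlouva}.4)"),
    "vf2": parse_frame("ACT(.1) PAT(.4)"),
}

def frames_with(entry, functors):
    """Ids of the frames whose slots carry exactly the given functors."""
    wanted = sorted(functors)
    return [fid for fid, slots in entry.items()
            if sorted(f for f, _ in slots) == wanted]

print(entry["vf1"])                         # [('ACT', '.1'), ('CPHR', '{smlouva}.4')]
print(frames_with(entry, ["ACT", "PAT"]))   # ['vf2']
print(frames_with(entry, ["CPHR", "ACT"]))  # ['vf1']
```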
Valency & Form: Constraints
Tree structure:
n1
n2
n3
n4
(Sets of) Constraints:
n1: lemma=uvažovat mode=active
n2: case=Nom afun=Sb
n3: lemma=o afun=AuxP
n4: case=Loc afun=Obj
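Checking such constraint sets against an analytical tree amounts to a small tree-pattern match. The sketch below assumes a tree shape the flattened slide does not show explicitly: n1 governs n2 and n3, and n3 (the preposition) governs n4. The dictionary layout, the functions `satisfies` and `match`, and the test sentence "Petr uvažuje o změně" are all invented for this illustration. Child matching is greedy (first fit), which is enough for small patterns like this one.

```python
def satisfies(node, constraints):
    """True if the node carries every required attribute value."""
    return all(node.get(k) == v for k, v in constraints.items())

def match(node, pattern):
    """Match a pattern tree against a node: the node must satisfy the
    pattern's constraints, and each pattern child must be matched by a
    different child of the node (greedy, first fit)."""
    if not satisfies(node, pattern["constraints"]):
        return False
    remaining = list(node.get("children", []))
    for sub in pattern.get("children", []):
        for child in remaining:
            if match(child, sub):
                remaining.remove(child)
                break
        else:
            return False
    return True

# Constraint sets n1..n4 from the slide, under the assumed tree shape.
n4 = {"constraints": {"case": "Loc", "afun": "Obj"}}
n3 = {"constraints": {"lemma": "o", "afun": "AuxP"}, "children": [n4]}
n2 = {"constraints": {"case": "Nom", "afun": "Sb"}}
n1 = {"constraints": {"lemma": "uvažovat", "mode": "active"},
      "children": [n2, n3]}

# Invented sentence: "Petr uvažuje o změně" (Petr is thinking about a change).
sentence = {
    "lemma": "uvažovat", "mode": "active",
    "children": [
        {"lemma": "Petr", "case": "Nom", "afun": "Sb"},
        {"lemma": "o", "afun": "AuxP",
         "children": [{"lemma": "změna", "case": "Loc", "afun": "Obj"}]},
    ],
}

print(match(sentence, n1))  # True
```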
Example: Valency & Form
1:2
relative clause
to_say: ACT EFF
verb: lemma=say mode=active
ACT: afun=Sb case=Nom
EFF: afun=AuxC lemma=that, governing afun=Obj POS=verb
linear representation: EFF(that[.v])
Valency and Text Generation
Using valency for...
...getting the correct (lemma, tag) of verb arguments
Example:
VALLEX entry: starat (se) ACT(.1) PAT(o.[.4])
Tectogrammatical tree:
starat_se PRED “to take care of”
  Martin ACT
  tygr PAT “tiger”
Generated (lemma, tag) pairs:
Martin ....1..........
se ...............
starat V..............
o ...............
tygr ....4..........
Surface sentence: Martin se stará o tygry. “Martin takes care of tigers.”
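The step from a valency slot to (lemma, tag) pairs can be sketched as follows. The tag template has 15 positions with the case digit in the fifth, as in the example tags above. Everything else is an assumption made for illustration: the frame is decoded by hand from the entry rather than parsed, and `case_tag`, `FRAME` and `argument_tokens` are hypothetical names. Word order, the reflexive "se" and the verb's own tag are left out.

```python
def case_tag(case):
    """15-position tag with only the case position (the 5th) filled in."""
    return "." * 4 + case + "." * 10

# Decoded by hand from 'starat (se) ACT(.1) PAT(o.[.4])':
# functor -> (preposition lemma or None, case digit)
FRAME = {"ACT": (None, "1"), "PAT": ("o", "4")}

def argument_tokens(functor, lemma, frame=FRAME):
    """(lemma, tag) pairs needed to realise one verb argument."""
    prep, case = frame[functor]
    tokens = []
    if prep is not None:
        # The preposition gets a fully underspecified tag in this sketch.
        tokens.append((prep, "." * 15))
    tokens.append((lemma, case_tag(case)))
    return tokens

print(argument_tokens("ACT", "Martin"))
print(argument_tokens("PAT", "tygr"))
```

A morphological generator would then turn each (lemma, tag) pair into a word form, for instance the accusative form "tygry" for the PAT argument in the sentence above.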