Tectogrammatical - Institute of Formal and Applied Linguistics


PDT: Tectogrammatical Representation
Jan Hajič
Institute of Formal and Applied Linguistics
School of Computer Science
Faculty of Mathematics and Physics
Charles University, Prague
Czech Republic
March 5, 2008
Companions Semantic Representation and Dialog
Interfacing Workshop - Tectogrammatics
Tectogrammatical Annotation (t-layer)
Underlying (deep) syntax
4 sublayers (integrated):
  dependency structure, (detailed) functors
    valency annotation
  topic/focus and deep word order
  coreference (mostly grammatical only)
  all the rest (“grammatemes”):
    detailed functors, underlying gender, number, ...
Total
  39 attributes (vs. 5 at m-layer, 2 at a-layer)
Analytical vs. Tectogrammatical representation
[Figure: analytical and tectogrammatical trees, with callouts:]
  Underlying verb + tense
  Deep function
  Elided Actor in
  Another ellipsis...
  Prepositions out
(TR: sublayer 1 only shown)
Layer 3: Tectogrammatical
Underlying (deep) syntax
4 sublayers:
  dependency structure, (detailed) functors
  topic/focus and deep word order
  coreference (mostly grammatical only)
  all the rest (grammatemes):
    detailed functors
    underlying gender, number, ...
Tectogrammatical Functors
“Actants” (syntactic): ACT, PAT, EFF, ADDR, ORIG
  modify: verbs, nouns, adjectives
  cannot repeat in a clause, usually obligatory
Free modifications (semantic): ~ 50, semantically defined
  can repeat; optional, sometimes obligatory
  Ex.: LOC, DIR1, ...; TWHEN, TTILL, ...; RSTR; BEN, ATT, ACMP, INTT, MANN; MAT, APP; ID, DPHR, ...
Special
  Coordination, Rhematizers, Foreign phrases, ...
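
The split between actants and free modifications can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, the set of free modifications holds just the examples listed above (not the full inventory of about 50), and the repetition check covers only the "cannot repeat" property of actants.

```python
# Illustrative sketch; the names below are hypothetical, not part of any PDT tool.
from collections import Counter

ACTANTS = {"ACT", "PAT", "EFF", "ADDR", "ORIG"}

# Only the free modifications named on the slide; the full inventory has about 50.
FREE_MODIFICATIONS = {
    "LOC", "DIR1", "TWHEN", "TTILL", "RSTR", "BEN", "ATT",
    "ACMP", "INTT", "MANN", "MAT", "APP", "ID", "DPHR",
}


def functor_class(functor):
    """Classify a functor as 'actant', 'free modification' or 'other'."""
    if functor in ACTANTS:
        return "actant"
    if functor in FREE_MODIFICATIONS:
        return "free modification"
    return "other"


def repeated_actants(functors):
    """Return the actants that occur more than once among one head's dependents."""
    counts = Counter(f for f in functors if f in ACTANTS)
    return sorted(f for f, n in counts.items() if n > 1)


print(functor_class("ACT"))                           # actant
print(repeated_actants(["ACT", "PAT", "LOC", "LOC"]))  # [] (LOC may repeat)
print(repeated_actants(["ACT", "ACT", "TWHEN"]))       # ['ACT']
```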
Tectogrammatical Example
Analytical verb form:
  he would be allowed to be enrolled
Collapsed
Additional attributes (grammatemes): conditional + “allow”
Tectogrammatical Example

Predicate with copula (state)

you were fired
Tectogrammatical Example

Passive construction (action)

(The) book has been translated [by Mr. X]
Disappeared
Added
Tectogrammatical Example

Object

he gave Mary a book
Obj goes into ACT, PAT, ADDR, EFF or ORIG based on governor’s valency frame
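
A minimal sketch of this lookup, under stated assumptions: the frame for "give" below is a hypothetical toy entry keyed by morphological case (1 = nominative, 3 = dative, 4 = accusative, as a Czech rendering of the sentence would show), and the function name is made up for illustration. A real frame would come from a valency lexicon.

```python
# Hypothetical toy frame: functor -> required case (1 = Nom, 3 = Dat, 4 = Acc).
GIVE_FRAME = {"ACT": 1, "ADDR": 3, "PAT": 4}


def assign_functors(frame, dependents):
    """Map each Sb/Obj dependent to the functor whose slot has the same case.

    dependents is a list of (word, analytical function, case) triples.
    Dependents that fit no slot are mapped to None.
    """
    case_to_functor = {case: functor for functor, case in frame.items()}
    result = {}
    for word, afun, case in dependents:
        if afun in ("Sb", "Obj"):
            result[word] = case_to_functor.get(case)
    return result


# "he gave Mary a book", with cases assumed for illustration.
dependents = [("he", "Sb", 1), ("Mary", "Obj", 3), ("book", "Obj", 4)]
print(assign_functors(GIVE_FRAME, dependents))
# {'he': 'ACT', 'Mary': 'ADDR', 'book': 'PAT'}
```

Both objects carry the same analytical function (Obj); only the governor's frame tells them apart.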
Tectogrammatical Example

Relative clause (embedded)

the woman, who had a French accent, was very pretty
Tectogrammatical Example

Incomplete phrases

Peter works well, but Paul badly
Added
Deep Word Order, Topic/Focus
Example:
  Baker bakes rolls. vs. Baker(IC) bakes rolls. (IC: intonation center)
[Figure: analytical dep. tree]
Deep Word Order, Topic/Focus
Deep word order:
  from “old” information to the “new” one (left-to-right) at every level (head included)
  projectivity by definition (almost...)
    i.e., partial level-based order -> total d.w.o.
Topic/focus/contrastive topic
  attribute of every node (t, f, c)
  restricted by d.w.o. and other constraints
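
One way to picture these two attributes is the sketch below. It is a simplification under assumptions: the class and function names are hypothetical, the check treats "t" and "c" alike as "old" information, and it looks at a single level only (a head together with its direct dependents). The attribute names tfa and deepord follow the t-layer attribute names.

```python
from dataclasses import dataclass


@dataclass
class DwoNode:
    """Hypothetical minimal node: only what deep word order needs."""
    t_lemma: str
    tfa: str      # "t" (topic), "f" (focus) or "c" (contrastive topic)
    deepord: int  # position in the deep word order


def deep_order(nodes):
    """Lemmas sorted by deep word order."""
    return [n.t_lemma for n in sorted(nodes, key=lambda n: n.deepord)]


def old_before_new(level):
    """True if no focus node precedes a topic ("t" or "c") node on this level."""
    seen_focus = False
    for node in sorted(level, key=lambda n: n.deepord):
        if node.tfa == "f":
            seen_focus = True
        elif seen_focus:
            return False
    return True


# "Baker bakes rolls." with an assumed neutral annotation.
level = [DwoNode("bake", "f", 2), DwoNode("Baker", "t", 1), DwoNode("roll", "f", 3)]
print(deep_order(level))      # ['Baker', 'bake', 'roll']
print(old_before_new(level))  # True
```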
Coreference
(intro only: see Silvie’s part)
Grammatical (easy)
  relative clauses
    which, who
    Peter and Paul, who ...
  control
    infinitival constructions
    John promised to go ...
  reflexive pronouns
    {him,her,them}self(-ves)
    Mary saw herself in ...
Tree for the control example:
  promise PRED
    John ACT
    go PAT
      he ACT (coreferent with John)
      home DIR3
Coreference
Textual
Ex.: Peter moved to Iowa after he finished his PhD.
  move PRED
    Peter ACT
    Iowa DIR3
    finish TWHEN
      he ACT
      PhD PAT
        he APP
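
Textual coreference links like these can be followed as a chain. The sketch below is an assumption-laden illustration: the node ids are hypothetical, only the nodes relevant to the chain are listed, and linking "his" to the preceding "he" (rather than directly to "Peter") is a choice made for the example. The attribute name coref_text.rf is the t-layer attribute for textual coreference.

```python
# Hypothetical node ids; only the nodes needed for the coreference chain are listed.
NODES = {
    "t1": {"t_lemma": "move", "functor": "PRED"},
    "t2": {"t_lemma": "Peter", "functor": "ACT"},
    "t3": {"t_lemma": "finish", "functor": "TWHEN"},
    "t4": {"t_lemma": "he", "functor": "ACT", "coref_text.rf": "t2"},
    "t5": {"t_lemma": "PhD", "functor": "PAT"},
    "t6": {"t_lemma": "he", "functor": "APP", "coref_text.rf": "t4"},
}


def antecedent(nodes, node_id):
    """Follow coref_text.rf links to the end of the chain and return that lemma."""
    seen = set()
    while "coref_text.rf" in nodes[node_id] and node_id not in seen:
        seen.add(node_id)  # guards against accidental cycles in the links
        node_id = nodes[node_id]["coref_text.rf"]
    return nodes[node_id]["t_lemma"]


print(antecedent(NODES, "t6"))  # Peter ("his" -> "he" -> "Peter")
print(antecedent(NODES, "t1"))  # move (no link, the node is its own end point)
```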
“Grammatemes”
Detailed functors (subfunctors)
  only for some functors:
    TWHEN: before/after
    LOC: next-to, behind, in-front-of, ...
    also: ACMP, BEN, CPR, DIR1, DIR2, DIR3, EXT
Lexical (underlying)
  number (SG/PL), tense, modality, degree of comparison, ...
  strictly only where necessary (agreement!)
Tectogrammatical attributes I
node typing
  complex, coap, qcomplex, root, atom, ...
functor, subfunctor
  TWHEN: TWHEN.basic, TWHEN.before
is_member, is_generated, is_parenthesis, is_dsp_root, is_state, quot_type, ...
grammatemes (16):
  aspect, degcmp, deontmod, sempos, tense, indeftype, politeness, person, ...
Tectogrammatical attributes II
topic/focus:
  tfa, deepord
valency: t_lemma, val_frame.rf
bookkeeping: id
coref_gram.rf, coref_text.rf, compl.rf
  reference to TR node, type of coreference
sentmod
Linking to analytical layer
  a.lex.rf (“main” anal. node), a.aux.rf (others)
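
As a rough picture of how these attributes sit together on one node, here is a sketch of a record type. It is not the PDT data format: the class is hypothetical, dots in attribute names are replaced by underscores to make valid Python identifiers, only a subset of the attributes is included, and all ids in the example are invented.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class TNode:
    """Hypothetical container for a subset of t-layer node attributes."""
    id: str
    t_lemma: str
    functor: str
    nodetype: str = "complex"
    tfa: Optional[str] = None            # "t", "f" or "c"
    deepord: Optional[int] = None
    val_frame_rf: Optional[str] = None   # reference into the valency lexicon
    coref_gram_rf: List[str] = field(default_factory=list)
    coref_text_rf: List[str] = field(default_factory=list)
    a_lex_rf: Optional[str] = None       # "main" analytical node
    a_aux_rf: List[str] = field(default_factory=list)  # other analytical nodes
    is_generated: bool = False
    gram: Dict[str, str] = field(default_factory=dict)  # grammatemes

    def analytical_nodes(self):
        """Ids of all linked a-layer nodes: the main one first, then the others."""
        main = [self.a_lex_rf] if self.a_lex_rf else []
        return main + self.a_aux_rf


# Invented ids: a verb node linked to its main a-node and one auxiliary a-node.
verb = TNode(id="t-1", t_lemma="starat_se", functor="PRED",
             a_lex_rf="a-3", a_aux_rf=["a-2"])
# A generated node (e.g. an elided actor) has no counterpart on the a-layer.
elided = TNode(id="t-2", t_lemma="he", functor="ACT", is_generated=True)

print(verb.analytical_nodes())    # ['a-3', 'a-2']
print(elided.analytical_nodes())  # []
```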
Fully Annotated Sentence
He spends his days sketching passers-by, or trying to.
Definition of Valency
Ability (“desire”) of words (verbs, nouns, adjectives) to combine themselves with other units of meaning
Properties of valency:
  Specific for every word meaning (in general)
    leave: sb left sth for sb vs. sb left from somewhere
    same as in PropBank leave.02 vs. leave.01
  Typically strongly correlates with surface form
    morphological case (~ ending), preposition+case, ...
  Semantic constraints are very dangerous
Structure of Valency
word (lemma)
  word sense group 1
    valency frame: slot1 slot2 slot3
    surface expression
  word sense group 2
    ...
Example: vyměnit (to replace)
  vyměnit1
    ACT   PAT   EFF
    Nom.  Acc.  za+Acc.
  vyměnit2
    ...
PDT-VALLEX (Cz), EngVallex (En)
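
The lemma, sense group, frame and slot hierarchy can be mirrored in a small data structure. The sketch below is an assumption, not the actual PDT-VALLEX format: the class and function names are hypothetical, and only the one frame spelled out above is filled in.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Slot:
    functor: str  # e.g. ACT
    form: str     # surface expression, e.g. "Nom." or "za+Acc."


@dataclass
class Frame:
    frame_id: str
    slots: List[Slot]


# Hypothetical in-memory lexicon: lemma -> list of frames (one per sense group).
LEXICON = {
    "vyměnit": [
        Frame("vyměnit1", [Slot("ACT", "Nom."), Slot("PAT", "Acc."), Slot("EFF", "za+Acc.")]),
        # vyměnit2 and further sense groups are elided in the source and left out here.
    ],
}


def surface_form(lemma, frame_id, functor):
    """Return the surface expression of a slot, or None if it is not in the lexicon."""
    for frame in LEXICON.get(lemma, []):
        if frame.frame_id == frame_id:
            for slot in frame.slots:
                if slot.functor == functor:
                    return slot.form
    return None


print(surface_form("vyměnit", "vyměnit1", "EFF"))  # za+Acc.
```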
PDT-VALLEX Entry


dosáhnout: “to reach”, “to get [sb to do sth]”
browser/user-formatted example:
Corpus <-> Valency Lexicon
Corpus:
  Sentence 2035:
  Sentence 15345:
  Sentence 51042:
Lexicon:
  ENTRY: uzavřít
  vf1: ACT(.1) CPHR({smlouva}.4)
    ex.: u. dohodu (close a contract)
  vf2: ACT(.1) PAT(.4)
    ex.: u. pokoj (close a room, house)
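
The link from a corpus node to a lexicon frame can be sketched as a plain reference lookup. The following is a hedged illustration: the regular expression handles only the flat "FUNCTOR(form)" notation shown in the entry above, the function name is hypothetical, and the corpus node is invented. The attribute name val_frame.rf is the t-layer valency reference.

```python
import re

# One slot looks like FUNCTOR(form); the form is kept as an unparsed string.
SLOT_RE = re.compile(r"(\w+)\(([^)]*)\)")


def parse_frame(text):
    """Split a linear frame description into (functor, form) pairs."""
    return SLOT_RE.findall(text)


# The two frames of the entry "uzavřít" as given in the lexicon excerpt.
FRAMES = {
    "vf1": parse_frame("ACT(.1) CPHR({smlouva}.4)"),
    "vf2": parse_frame("ACT(.1) PAT(.4)"),
}

# Invented corpus node whose val_frame.rf points at the second frame.
node = {"t_lemma": "uzavřít", "val_frame.rf": "vf2"}

print(FRAMES["vf1"])                   # [('ACT', '.1'), ('CPHR', '{smlouva}.4')]
print(FRAMES[node["val_frame.rf"]])    # [('ACT', '.1'), ('PAT', '.4')]
```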
Valency & Form: Constraints
Tree structure: n1, n2, n3, n4
(Sets of) Constraints:
  n1: lemma=uvažovat mode=active
  n2: case=Nom afun=Sb
  n3: lemma=o afun=AuxP
  n4: case=Loc afun=Obj
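
Checking such constraint sets against analytical nodes amounts to attribute matching. The sketch below makes several assumptions: the function names are hypothetical, the sentence "Petr uvažuje o knize" and its attribute values are supplied here for illustration, and the tree shape itself (which node governs which) is not checked, only the per-node attributes.

```python
CONSTRAINTS = {
    "n1": {"lemma": "uvažovat", "mode": "active"},
    "n2": {"case": "Nom", "afun": "Sb"},
    "n3": {"lemma": "o", "afun": "AuxP"},
    "n4": {"case": "Loc", "afun": "Obj"},
}


def satisfies(attrs, constraint):
    """True if the node has every attribute value the constraint asks for."""
    return all(attrs.get(key) == value for key, value in constraint.items())


def matches(nodes, constraints):
    """True if every constrained node is present and satisfies its constraint."""
    return all(
        name in nodes and satisfies(nodes[name], constraint)
        for name, constraint in constraints.items()
    )


# "Petr uvažuje o knize" (Petr is thinking about a book); values assumed for illustration.
sentence = {
    "n1": {"form": "uvažuje", "lemma": "uvažovat", "mode": "active"},
    "n2": {"form": "Petr", "lemma": "Petr", "case": "Nom", "afun": "Sb"},
    "n3": {"form": "o", "lemma": "o", "afun": "AuxP"},
    "n4": {"form": "knize", "lemma": "kniha", "case": "Loc", "afun": "Obj"},
}

print(matches(sentence, CONSTRAINTS))  # True
```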
Example: Valency & Form
to_say: ACT EFF
1:2 (relative clause)
  lemma=say mode=active
    afun=Sb case=Nom
    afun=AuxC lemma=that
      afun=Obj POS=verb
• linear representation: EFF(that[.v])
Valency and Text Generation
Using valency for...
  ...getting the correct (lemma, tag) of verb arguments
Example:
  VALLEX entry: starat (se) ACT(.1) PAT(o.[.4])
  Tectogrammatical tree:
    starat_se PRED “to take care of”
      Martin ACT
      tygr PAT “tiger”
  Generated (lemma, tag) pairs:
    Martin  ....1..........
    se      ...............
    starat  V..............
    o       ...............
    tygr    ....4..........
  Martin se stará o tygry. “Martin takes care of tigers.”
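
The step from a valency frame to (lemma, tag) pairs can be sketched as follows. This is a simplified illustration under assumptions: the frame is written out by hand as functor -> (preposition, case) instead of being parsed from the lexicon, the function names are hypothetical, the token order is fixed for this one example, and only two tag positions are filled (position 1 for part of speech, position 5 for case in the 15-position tag). Producing the inflected forms themselves would need a morphological generator, which is not shown.

```python
TAG_LENGTH = 15


def make_tag(pos=None, case=None):
    """Build a 15-position tag with '.' for every position left unspecified."""
    tag = ["."] * TAG_LENGTH
    if pos is not None:
        tag[0] = pos    # position 1: part of speech
    if case is not None:
        tag[4] = case   # position 5: case
    return "".join(tag)


def argument_tokens(lemma, preposition, case):
    """(lemma, tag) pairs for one argument: optional preposition, then the noun."""
    tokens = []
    if preposition is not None:
        tokens.append((preposition, make_tag()))
    tokens.append((lemma, make_tag(case=case)))
    return tokens


# Frame of "starat (se)" written out by hand: functor -> (preposition, case).
FRAME = {"ACT": (None, "1"), "PAT": ("o", "4")}
ARGS = {"ACT": "Martin", "PAT": "tygr"}

tokens = argument_tokens(ARGS["ACT"], *FRAME["ACT"])
tokens += [("se", make_tag()), ("starat", make_tag(pos="V"))]
tokens += argument_tokens(ARGS["PAT"], *FRAME["PAT"])

for lemma, tag in tokens:
    print(lemma, tag)
```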