Extracting Information from Participial Structures

Kata Gábor, Enikő Héja, Ágnes Mészáros
Research Institute for Linguistics, HAS
8th INTEX WORKSHOP, 2005
STRUCTURE
• IE system and its shortcoming: the problem of participles
• NPs and participles in Hungarian
• a possible enhancement of the IE system
• implementation in INTEX
IE system

• input text (1-2 sentences of short business news)
• shallow syntactic analysis
• pre-defined semantic patterns (event frames)
• output: event frames’ slots filled by the elements of the input text
• the event, its participants and circumstances are identified
Event frames
Az ABN Amro Bank egyesül a Kereskedelmi és Hitelbankkal.
ABN Amro Bank merges with Commercial and Credit Bank.
<event schema="owner_changed.fusion.6" roles_matched="3/3">
<rv role="member_company_1" pos="N" case="NOM" sem="company|institute">
<NP id="88" sem="company countable human institute">
<w id="0" class="DET" at="1-1" lex="az" case="NOM">Az</w>
<w id="2" class="UNKNOWN" at="2-2" lex="ABN">ABN</w>
<w id="4" class="UNKNOWN" at="3-3" lex="Amro">Amro</w>
<w id="6" class="N" at="4-4" lex="bank" case="NOM">Bank</w>
</NP>
</rv>
<rv role="_1" pos="V" lemma="egyesül">
<w id="8" class="V" at="5-5" lex="egyesül">egyesül</w>
</rv>
<rv role="member_company_2" pos="N" case="INS" sem="company|institute">
<NP id="118" sem="company countable institute">
<w id="13" class="DET" at="6-6" lex="a" case="NOM">a</w>
<w id="15" class="ONADJ" at="7-7" lex="kereskedelem"
case="NOM">Kereskedelmi</w>
<w id="17" class="CONJ" at="8-8" lex="és">és</w>
<w id="19" class="N" at="9-9" lex="hitelbank" case="INS">Hitelbankkal.</w>
</NP>
</rv>
</event>
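The slot constraints visible in the XML (pos, case, sem alternatives, lemma) suggest how a frame could be matched against analysed phrases. The following is only a minimal sketch under that reading, not the system's actual matcher; the names Phrase, Slot, fits and match_frame are hypothetical, and the first-fit strategy is an assumption.

```python
from dataclasses import dataclass

@dataclass
class Phrase:
    text: str
    pos: str                      # "N" or "V"
    case: str = ""                # e.g. "NOM", "INS"
    sem: frozenset = frozenset()  # semantic features of the phrase
    lemma: str = ""

@dataclass
class Slot:
    role: str
    pos: str
    case: str = ""                # empty string = no constraint
    sem: frozenset = frozenset()  # alternatives; empty = no constraint
    lemma: str = ""

def fits(slot, phrase):
    """A phrase fits a slot if every constraint the slot states is met."""
    if slot.pos != phrase.pos:
        return False
    if slot.case and slot.case != phrase.case:
        return False
    if slot.lemma and slot.lemma != phrase.lemma:
        return False
    if slot.sem and not (slot.sem & phrase.sem):
        return False
    return True

def match_frame(slots, phrases):
    """Fill each slot with the first unused phrase that fits (assumed strategy)."""
    filled, used = {}, set()
    for slot in slots:
        for i, phrase in enumerate(phrases):
            if i not in used and fits(slot, phrase):
                filled[slot.role] = phrase.text
                used.add(i)
                break
    return filled, f"{len(filled)}/{len(slots)}"

# Slot constraints copied from the owner_changed.fusion.6 example above.
company = frozenset({"company", "institute"})
FUSION = [
    Slot("member_company_1", "N", case="NOM", sem=company),
    Slot("_1", "V", lemma="egyesül"),
    Slot("member_company_2", "N", case="INS", sem=company),
]
phrases = [
    Phrase("Az ABN Amro Bank", "N", "NOM",
           frozenset({"company", "countable", "human", "institute"})),
    Phrase("egyesül", "V", lemma="egyesül"),
    Phrase("a Kereskedelmi és Hitelbankkal", "N", "INS",
           frozenset({"company", "countable", "institute"})),
]
filled, roles_matched = match_frame(FUSION, phrases)
```

Here roles_matched plays the part of the roles_matched attribute in the XML output.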
Mapping syntax to event frames
SYNTAX → EVENT FRAMES
verb → main event
arguments → participants
free modifiers → circumstances (time, location, manner...)
Mapping syntax to event frames
Problem: secondary information (cause or antecedent of the main event) is ‘hidden’ in participial structures:
[A befektetők által tegnap eladott részvények] megnövelték a tőzsde forgalmát.
[The shares sold yesterday by the investors] increased the turnover of the stock exchange.
a befektetők / the investors /
eladott / sold /
tegnap / yesterday /
részvények / shares /
A befektetők tegnap eladtak részvényeket.
The investors sold shares yesterday.
A solution



• a preprocessing module within the IE system which transforms participial structures into sentences with a finite predicate
• semantic frame matching may operate on transformed sentences
1st step: past participles within NPs
• the participle preserves the meaning of its base verb
• its arguments can be derived from the internal structure
of the NP
NPs in Hungarian 1.
NPs in Hungarian 2.
[Diagram of NP structure; labels: DET, ADV, NP+case, AP+case, N+Postp, V.INF, ..., Participles (past, present), modifiers, head Noun]
Participles in Hungarian

ADJ – Participle homonymy is a problem:

“mérsékelt PC-chip kereslet”
modest /~moderated/ demand for PC-chips
* Valaki mérsékelte a PC-chip keresletet.
* Somebody moderated the demand for PC-chips

“ragozott szóalakok”
inflected word forms
* Valaki ragozott szóalakokat.
* Somebody inflected word forms.

only participles can be transformed
Participle or Adjective?

syntactic tests:
• comparative
• ADV formation
• predicative use
• impossibility of preverb detachment
we need to decide in the context whether the given word form is an ADJ or a PART:
1. If at least one of the base verb’s complements is present, then it is a participle.
2. If at least one of the base verb’s complements / adjuncts / a preverb is present, then it is a participle.
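The contextual decision (a complement, adjunct or preverb of the base verb is present, so the form is a participle) can be sketched as a one-line check. This is an illustration only: the dependent labels and the function name classify_t_form are hypothetical, and the sketch assumes an earlier analysis step has already identified which elements inside the NP depend on the base verb.

```python
# Hypothetical dependent labels; the slides do not specify a tag set.
VERBAL_DEPENDENTS = {"complement", "adjunct", "preverb"}

def classify_t_form(dependents):
    """Return "PART" if any dependent of the base verb is present, else "ADJ"."""
    return "PART" if VERBAL_DEPENDENTS.intersection(dependents) else "ADJ"

# Dependents are hand-annotated here for the examples used in the slides.
examples = {
    "a befektetők által tegnap eladott részvények": ["complement", "adjunct"],
    "a fel nem újított házak": ["preverb"],
    "mérsékelt PC-chip kereslet": [],
}
labels = {np: classify_t_form(deps) for np, deps in examples.items()}
```

Only the NPs labelled PART would be passed on to the transformation step.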
Participle or Adjective?

TESTS:
• comparative: “mérsékeltebb kereslet”
more moderate demand
• predicative: “Ez a szóalak ragozott.”
This word form is inflected.
• ADV formation: mérsékelt → mérsékelten
moderate → moderately
• preverb detachment:
“a [fel nem újított] házak”
“the [re- not stored] houses” (=not restored)
* Ezek a házak [fel nem újítottak].
* These houses are [re- not stored].
THE GRAMMAR
- the correctness and informativeness of the resulting sentence depend on the correct identification of verbal arguments and modifiers within the NP
- then these elements are transformed according to their grammatical function
• past participles may be formed from both transitive and intransitive verbs
• if the base verb is intransitive, the head noun of the NP represents the subject of the base verb:
“az összedőlt épület” /the collapsed building/
• if the base verb is transitive, the head noun represents the direct object of the base verb:
“a bejelentett változások” /the changes announced/
→ transitivity needs to be coded
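As a small illustration of why transitivity has to be coded, the role of the head noun can be read off a lexicon entry for the base verb. The two-entry lexicon and the function name below are hypothetical stand-ins for the dictionary coding.

```python
# Hypothetical mini-lexicon: base verb lemma -> is the verb transitive?
TRANSITIVITY = {
    "összedől": False,  # collapse
    "bejelent": True,   # announce
}

def head_noun_role(base_verb):
    """Role of the NP's head noun relative to the participle's base verb."""
    return "object" if TRANSITIVITY[base_verb] else "subject"

role_building = head_noun_role("összedől")  # "az összedőlt épület"
role_changes = head_noun_role("bejelent")   # "a bejelentett változások"
```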
THE GRAMMAR

transformation rules are (enhanced) FSTs:
• they store relevant elements of the input NP in variables
• the output is made up of the content of these variables, but in an altered order, plus the function words needed in the sentence
• our DELAF dictionary codes:
  - transitivity properties of verbs (on the basis of a lexicon-grammar of verbal argument structures)
  - a +/- preverb feature showing whether the base verb has a preverb
Transformation Graphs 1.
Transitive Verbs
• transitive verbs without expressed subject (“somebody” insertion):
[transformation graph; node labels: Det, (V_compl), V_vmib, VMIB, N, N–t, Valaki, final period]
• transitive verbs with a subject with the PostP “által”:
[transformation graph; node labels: Det, Nsubj, által, (V_compl), V_vmib, VMIB, N, N–t, final period]
Transformation Graphs 2.
Intransitive Verbs
• head N becomes subject (patient)
[transformation graph; node labels: Det, (V_compl), V_vmib, VMIB, N, final period]
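The three transformation types (subject with “által”, “somebody” insertion, intransitive) can be sketched outside INTEX as plain reordering of stored NP parts. This is a rough sketch, not the graphs themselves: the function name and dictionary keys are hypothetical, the finite verb form and the accusative form are supplied by hand instead of being generated, and the output word order is an assumption modelled on the example sentence “A befektetők tegnap eladtak részvényeket.”

```python
def to_finite_sentence(np):
    """Reorder already-analysed NP parts into a sentence with a finite verb.

    np keys (all hypothetical): transitive, det, subj, compl, finite,
    head (nominative head noun), head_acc (head noun with accusative -t).
    """
    compl = np.get("compl", [])
    if not np["transitive"]:
        # intransitive base verb: the head noun becomes the subject
        words = [np["det"], np["head"], *compl, np["finite"]]
    else:
        # transitive base verb: the head noun becomes the direct object;
        # without an "által" subject, "valaki" (somebody) is inserted
        subject = [np["det"], np["subj"]] if np.get("subj") else ["valaki"]
        words = [*subject, *compl, np["finite"], np["head_acc"]]
    sentence = " ".join(words)
    return sentence[0].upper() + sentence[1:] + "."

# "A befektetők által tegnap eladott részvények"
with_subject = to_finite_sentence({
    "transitive": True, "det": "a", "subj": "befektetők",
    "compl": ["tegnap"], "finite": "eladtak", "head_acc": "részvényeket",
})
# "A tegnap eladott részvények" (no expressed subject)
somebody = to_finite_sentence({
    "transitive": True, "det": "a",
    "compl": ["tegnap"], "finite": "eladott", "head_acc": "részvényeket",
})
# "az összedőlt épület"
intransitive = to_finite_sentence({
    "transitive": False, "det": "az", "head": "épület", "finite": "összedőlt",
})
```

In a real pipeline the finite and accusative forms would have to come from morphological generation, which this sketch leaves out.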
Structure of the graphs
1 graph
3 subgraphs according to complement types: possessor / verbal complement + adjunct / nothing
each subgraph divided into two paths: transitive / intransitive verbs
Evaluation


• central aspect: to what extent does it augment the efficiency of the IE system?
• lack of information (recall value) is considered less important than incorrect information (precision)
• evaluated on a 231,000-word corpus of short business news
• 1259 hits → 898 qualified as informative
• precision: 64%
• further task: recall (requires a corpus with manually annotated participial structures)
THANK YOU FOR YOUR
ATTENTION!
{gkata, eheja, magnes}@corpus.nytud.hu