Two-Stage Constraint Based Hindi Parser

Download Report

Transcript Two-Stage Constraint Based Hindi Parser

Two-Stage Constraint
Based Hindi Parser
LTRC, IIIT Hyderabad
Brief Recap


Broad coverage parser
Dependency





Paninian framework
vibhakti-karaka correspondence
karaka frames (basic + transformation)
Source groups, demand groups
Constraints


Three basic constraints
Constraints as Integer programming equations
Parser

Two stage strategy


Stage I (Intra-clausal relations)



Appropriate constraints formed
Dependency relations marked
Relations such as k1, k2, k3, etc. for each verb
Stage II (Inter-clausal relations & conjunct relations)


Conjuncts, relative clauses, kriya mula, etc
In certain cases, separates syntax from semantics (eg.
kriya mula), in others, reduces the complexity.
Steps in Parsing
SENTENCE
Morph, POS tagging,
Chunking
Identify
Demand
Groups
STAGE - II
Load Frames
&
Transform
Find Candidates
YES
Apply
Constraints
& Solve
NO
Is Complex
Final Parse
Stage I: Types being handled

Simple Sentences (finite verbs)


Non-finite verbs







Clausal arguments
wA_huA
wA_hI
nA
kara
0_rahe, etc.
Copula
Genitive
Stage - II

Handles:




Conjuncts
 Subordinating & Coordinating
Relative clauses
Complex predicates
Basic constraints similar to Stage-I



Some additional constraints
New demand groups
New candidates
Steps (Stage II)
Output of
STAGE - I
Identify New
Demand
Groups
Load Frames
&
Transform
Find
Candidates
Repair
FINAL PARSE
Apply
Constraints
& Solve
Example – Relative Clause

vaha puswaka jo rAma ne
mohana ko
xI hE prasixXa hE
that book
which Ram ERG. Mohana DAT. gave is famous is
‘The book which Ram gave to Mohana is famous’
Output after Stage - I
_ROOT_
main
main
hE
xI
k1
k1
k2
puswaka
k4
vaha
rAma
mohana
jo
k1s
prasixXa
Identify the demand group

xiyA ‘give’

Main verb of the relative clause
Identify the demand group,
Load and Transform DF

jo ‘which’ transformation (special)

Transforms the demand frame of the main verb of the
relative clause
-------------------------------------------------------------------------------------------------------------arc-label
necessity
vibhakti
lextype
src-pos
arc-dir
oprt
-------------------------------------------------------------------------------------------------------------nmod__relc
m
any
n
r|l
p
insert
--------------------------------------------------------------------------------------------------------------
Karaka Frame
vaha
puswaka jo
rAma ne
mohana ko
xI
that book
which Ram ERG. Mohana DAT. gave
‘The book which Ram gave to Mohana is famous’
Main verb of
relative clause
prasixXa hE |
famous is
Transformed frame for xe after applying the jo trasformation
-------------------------------------------------------------------------------------------------------arc-label
necessity
vibhakti
lextype
src-pos arc-dir oprt
-------------------------------------------------------------------------------------------------------nmod__relc
m
any
n
r|l
p
insert
---------------------------------------------------------------------------------------------------------
New row
inserted after
transformation
Possible candidates
nmod__relc

vaha puswaka jo rAma ne mohana ko xI hE prasixXa hE |
Output after Stage - II
_ROOT_
main
hE
k1
k1s
vaha puswaka
prasixXa
nmod__relc
xiyA hE
k2
jo
k1
k4
mohana
rAma
Example II – Coordination

rAma Ora siwA kala
Aye |
Ram and Sita yesterday came
‘Ram and Sita came yesterday’
Output of Stage - I
_ROOT_
dummy
dummy
rAma
main
Aye
Ora
k1
siwA
k7t
kala
For Stage – II (Constraint
Graph)
_ROOT_
main
rAma
Aye
Ora
k1
ccof
ccof
siwA
k7t
kala
Candidate Arcs
_ROOT_
main
k1
rAma
Aye
Ora
k1
ccof
ccof
siwA
kala
Solution Graph
_ROOT_
main
k1
rAma
Aye
Ora
k7t
ccof
ccof
siwA
kala
Parse tree
_ROOT_
main
Aye
k7t
k1
Ora
ccof
rAma
kala
ccof
siwA
Output after Stage II
Finite Verb Coordination

rAma Gara gayA Ora vaha so
Ram
home went and
‘Ram went home and slept’
he
gayA |
sleep went
_ROOT_
main
gayA
k1
dummy
main
Ora
so
k1
k2
vaha
rAma
Gara
Output after Stage I
Karaka Frame - Ora
Finite
Ora
ccof
v_fin
Ora
ccof
v_fin
ccof
ccof
gayA
so
Finite Verb Coordination
(Parse Tree)
_ROOT_
main
Ora
ccof
gayA
k1
k2
rAma
Gara
ccof
so
k1
vaha
Output after Stage II
Relative Clause Coordination

rAma ne vaha puswaka KarIxI jo prasixXa hE Ora jo saswI hE
‘Ram purchased the book which is famous and which is cheap’
_ROOT_
main
main
KarIxI
hE
k1
k1
rAma
k2
jo
main
dummy
Ora
k1s
prasixXa
puswaka
Output after Stage I
hE
k1
k1s
jo
saswI
Karaka Frame - Ora
Relative Clause
n
puswaka
nmod__relc
nmod__relc
Ora
ccof
v_rel
Ora
ccof
v_rel
ccof
hE
ccof
hE
Relative Clause Coordination
(Parse Tree)
_ROOT_
main
KarIxI
k1
rAma
k2
puswaka
nmod__relc
Ora
ccof
ccof
hE
k1
jo
hE
k1s
prasixXa
k1
k1s
jo
saswI
Output after Stage II
Non-Finite Verb Coordination

rAma
Kelakara
Ora KAnA KAkara
Ram having played and food
having eaten sleep went
_ROOT_
dummy
Ora
so gayA
main
so
vmod vmod
k1
Kelakara KAkara rAma
k2
KAnA
Output after Stage I
Karaka Frame - Ora
Non-Finite
v_fin
so
Ora
Ora
ccof
v_nfin
ccof
v_nfin
ccof
Kelakara
ccof
KAkara
Non-Finite Verb Coordination
(Parse Tree)
_ROOT_
main
so
vmod
Ora
ccof
Kelakara
k1
rAma
ccof
KAkara
k2
KAnA
Output after Stage II
Nominal Coordination

rAma Ora siwA kala
Aye |
Ram
and Sita
yesterday came
‘Ram and Sita came yesterday’
_ROOT_
dummy
dummy
rAma
main
Aye
Ora
k1
siwA
Output after Stage I
k7t
kala
Karaka Frame - Ora
Nominal
Ora
Ora
ccof
ccof
n
n
ccof
rAma
ccof
siwA
Nominal Coordination (Parse
Tree)
_ROOT_
main
Aye
k7t
k1
Ora
ccof
rAma
kala
ccof
siwA
Output after Stage II
Example

rAma Ora siwA kala Aye |
_ROOT_
dummy
dummy
rAma
main
Aye
Ora
k1
siwA
k7t
kala
Steps (Stage II)
Output of
STAGE - I
Identify
Nodes
Identify New
Demand
Groups
Load Frames
&
Transform
Find
Candidates
Repair
FINAL PARSE
Apply
Constraints
& Solve
Constraint Graph Nodes
(Stage II)


Selected from the intermediate
parse tree (Stage I)
Set-I (demand nodes)
1.
2.
3.
4.
5.
Conjuncts
Nearest verbal ancestor of ‘jo’ (usually
just the parent)
_ROOT_
Children of _ROOT_ other than (1) and
(2).
Other nodes which are added due to
nodes in Set 2
Constraint Graph Nodes
(Stage II)

Set-II (source nodes)
1.
2.

Possible children and parents of
conjuncts
Possible heads of the relative clause.
Identification of nodes in Set-II will
generally trigger the repair.
Steps (Stage II)
Output of
STAGE - I
Identify
Nodes
Identify New
Demand
Groups
Load Frames
&
Transform
Find
Candidates
Repair
FINAL PARSE
Apply
Constraints
& Solve
Identify the demand group


Ora
Aye
Steps (Stage II)
Output of
STAGE - I
Identify
Nodes
Identify New
Demand
Groups
Load Frames
&
Transform
Find
Candidates
Repair
FINAL PARSE
Apply
Constraints
& Solve
General Principles

Repair/Revision
1.
Any node which becomes
a potential child in stage 2,
its arc to its existing parent
is open to revision

rAma Ora siwA kala
Aye
• Node 4 becomes potential child (of node
1)
• Its parent (node 2) is open to revision
General Principles

Repair/Revision after
parse of stage I
2.
Any node which becomes
a potential parent must
be re-looked at.

rAma Ora siwA kala
Aye
• Node 2 becomes potential parent (of 1)
• Its child (node 4) is open to revision
Algorithm

Identify nodes of the constraint graph




From Set 1, and
From Set 2
Remove all outgoing edges from _ROOT_.
Find possible candidates for demand
nodes present in Set 1 from Set 2





Parent candidate for finite verb
Parent and children for conjuncts
Children of _ROOT_
Convert the formed constraint graph into
integer programming (IP) problem.
Solve the IP equations to get the possible
solution parse.
An example
raama aura sitaa
kala
aaye
’Ram’
’and’ ’Sita’ ’yesterday’
Ram and Sita came yesterday
‘came’
_ROOT_
dummy

Output after stage I
dummy
rAma
main
Aye
Ora
k1
siwA
k7t
kala
Identify Nodes
_ROOT_
dummy

Set 1 nodes
main
dummy
rAma
Aye
Ora
k1
k7t
siwA
kala
_ROOT_

dummy
Set 1 and Set 2
dummy
rAma
main
Aye
Ora
k1
siwA
k7t
kala
Constraint Graph

New Constraint Graph


Ora, Aye and _ROOT_
are the demand groups
Note: ‘kala’ remains
attached to its parent
‘aaye’ (does not show
up in stage 2)
_ROOT_
ccof
k1
rAma
Ora
ccof
main
Aye
k1
siwA
Example
_ROOT_
main

Final Parse
Aye
k7t
k1
Ora
ccof
rAma
ccof
siwA
kala
Types of complex sentences

Relative clauses
Initial
 Final
 Medial


Conjuncts (Coordination)
Simple clause
 Relative clause
 Non-finite
 Nominal, adjectival, adverbial

Some other examples:



rAma ne vaha puswaka KarIxI jo saswI hE
Ora jo bAjZAra meM prasixXa hE |
samIra Ora aBay ne vaha puswaka KarIxI jo
saswI hE Ora jo bAjZAra meM prasixXa hE |
rAma Ora mohana ke xoswa kI baccI Aye |



Only baccI came, or
Both rAma and baccI came
Use of ‘gnp’ of the main verb, Aye vs. AI
THANKS!!