A Quick Tutorial on How it Works

BLUE (Boeing Language Understanding Engine): A Quick Tutorial on How it Works
Working Note 35
2009
Peter Clark
Phil Harrison
(Boeing Phantom Works)
BLUE
Each paragraph is broken up into sentences, then each sentence is
processed in turn.
For each sentence, BLUE has a pipelined architecture with 10
transformation steps, shown here for the input
“An object is thrown from a cliff.” (its output logic follows the list):
1. Preprocessing
2. Parsing
3. Syntactic logic generation
4. Reference resolution
5. Transforming Verbs to Relations
6. Word Sense Disambiguation
7. Semantic Role Labeling
8. Metonymy resolution
9. Question Annotation
10. Additional Processing
isa(object01,Object),
isa(cliff01,Cliff),
isa(throw01,Throw),
object(throw01,object01),
origin(throw01,cliff01).
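The pipelined architecture can be pictured as a list of functions applied in order, each taking the previous step's output. The sketch below is only an illustration of that control flow: the step names come from the list above, but the state dictionary, `make_placeholder_step`, and `run_pipeline` are hypothetical names, and the placeholder steps do no linguistic work.

```python
# The ten step names, in pipeline order (taken from the list above).
STEP_NAMES = [
    "preprocessing",
    "parsing",
    "syntactic logic generation",
    "reference resolution",
    "transforming verbs to relations",
    "word sense disambiguation",
    "semantic role labeling",
    "metonymy resolution",
    "question annotation",
    "additional processing",
]


def make_placeholder_step(name):
    """Return a stand-in step that only records that it ran (hypothetical)."""
    def step(state):
        return {**state, "trace": state["trace"] + [name]}
    return step


PIPELINE = [make_placeholder_step(name) for name in STEP_NAMES]


def run_pipeline(sentence, steps=PIPELINE):
    """Feed one sentence through every step in turn."""
    state = {"sentence": sentence, "trace": []}
    for step in steps:
        state = step(state)
    return state


def interpret_paragraph(sentences):
    """Each sentence of a paragraph is processed in turn."""
    return [run_pipeline(sentence) for sentence in sentences]
```

Replacing a placeholder with a real implementation leaves `run_pipeline` unchanged, which is the main appeal of a pipelined design.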
1. Preprocessing
• Replace math symbols with words
  → +, -, /, *, = become “plus”, “minus”, “divided by”, “times”, “is”
• Remove non-ASCII characters
• Replace chemical formulae with a dummy noun
  “NaCl is a chemical” → “formula1 is a chemical”
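A minimal sketch of these three preprocessing rules follows. It is not BLUE's code: the function name `preprocess` and the formula pattern are assumptions, and the pattern is deliberately crude (it treats any run of two or more element-like symbols as a formula, so all-caps words would also match, and every "-" is read as a minus sign).

```python
import re

# Symbol-to-word table from the slide.
MATH_WORDS = {"+": "plus", "-": "minus", "/": "divided by", "*": "times", "=": "is"}

# Assumption: a formula is two or more element-like symbols, e.g. NaCl, H2O.
FORMULA = re.compile(r"\b(?:[A-Z][a-z]?\d*){2,}\b")


def preprocess(sentence):
    """Return (cleaned sentence, mapping from dummy nouns to formulae)."""
    formulae = {}

    def replace_formula(match):
        name = f"formula{len(formulae) + 1}"
        formulae[name] = match.group(0)
        return name

    # Replace chemical formulae with dummy nouns first.
    text = FORMULA.sub(replace_formula, sentence)
    # Replace math symbols with words.
    text = re.sub(
        r"\s*([+\-/*=])\s*",
        lambda m: f" {MATH_WORDS[m.group(1)]} ",
        text,
    )
    # Remove non-ASCII characters.
    text = text.encode("ascii", "ignore").decode("ascii")
    # Normalize whitespace introduced by the substitutions.
    return " ".join(text.split()), formulae
```

Keeping the mapping from dummy noun to formula lets a later step recover the original formula once the sentence has been parsed.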
2. Parsing
“An object is thrown from a cliff”
              *S:-16*
      +----------+-----------+
    NP:-1                 VP:-12
  +---+---+         +--------+--------+
DET:0   N^:0      AUX:0             VP:-8
  |       |         |         +-------+-------+
 AN      N:0       IS       VP:0           *PP:-4*
          |                   |         +-----+-----+
       OBJECT                V:0       P:0        NP:-3
                              |         |       +---+---+
                           THROWN     FROM   DET:-2   N^:0
                                                |       |
                                                A      N:0
                                                        |
                                                      CLIFF
3. Syntactic Logic Generation
• Produce initial “syntactic logic”
  → Nouns, verbs, adjectives, adverbs become objects
  → Prepositions, verb-argument positions become relations
throw01:  input-word(throw01, ["throw",v])
          subject(throw01,object01)
          "from"(throw01,cliff01)
object01: input-word(object01, ["object",n])
          determiner(object01, "an")
cliff01:  input-word(cliff01, ["cliff",n])
          determiner(cliff01, "a")
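To make the mapping concrete, here is a small sketch that emits the same kind of triples. It assumes the parse has already been reduced to a verb, a subject noun phrase, and a list of prepositional phrases; that input format, the `instance` naming helper, and `syntactic_logic` are hypothetical and are not how BLUE walks the parse tree.

```python
def instance(word):
    """Name an instance for a word; assumes the first mention of each word."""
    return f"{word}01"


def syntactic_logic(verb, subject, prep_phrases):
    """Build syntactic-logic triples.

    verb:         the verb's root form, e.g. "throw"
    subject:      (noun, determiner), e.g. ("object", "an")
    prep_phrases: list of (preposition, (noun, determiner))
    """
    v = instance(verb)
    # The verb becomes an object, described by its input word.
    triples = [("input-word", v, (verb, "v"))]
    # Verb-argument positions become relations.
    triples.append(("subject", v, instance(subject[0])))
    # Prepositions become relations, named by the quoted preposition.
    for prep, (noun, _det) in prep_phrases:
        triples.append((f'"{prep}"', v, instance(noun)))
    # Nouns become objects with their input word and determiner.
    for noun, det in [subject] + [np for _prep, np in prep_phrases]:
        n = instance(noun)
        triples.append(("input-word", n, (noun, "n")))
        triples.append(("determiner", n, det))
    return triples
```

Calling `syntactic_logic("throw", ("object", "an"), [("from", ("cliff", "a"))])` yields the seven triples shown above for “An object is thrown from a cliff”.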
4. Reference resolution
• Reference: ties sentences together
  “A ball fell from a cliff. The ball weighs 10 N.”
• BLUE accumulates logic for each sentence in turn
• “The red ball”
  → Search for a previous object which is a red ball
  → If > 1, warn user and pick the most recent
  → If 0, assume a new object
• “The second red ball” → take 2nd matching object
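The lookup rules above can be sketched as a search over the objects mentioned so far. The data layout (a list of instance names with property sets, in order of mention) and the function name `resolve_reference` are assumptions made for illustration only.

```python
def resolve_reference(history, required, ordinal=None):
    """Find the referent of a definite noun phrase.

    history:  list of (instance, set of properties), in order of mention
    required: properties the referent must have, e.g. {"ball", "red"}
    ordinal:  2 for "the second ...", or None for a plain definite
    Returns (instance or None, warning message or None).
    None as the instance means "assume a new object".
    """
    matches = [name for name, props in history if required <= props]
    if ordinal is not None:
        # "The second red ball": take the ordinal-th matching object.
        if len(matches) >= ordinal:
            return matches[ordinal - 1], None
        return None, None
    if not matches:
        # No previous object matches: the caller creates a new one.
        return None, None
    if len(matches) > 1:
        # Ambiguous: warn and pick the most recent match.
        return matches[-1], f"{len(matches)} candidates; picked the most recent"
    return matches[0], None
```

Returning the warning rather than printing it leaves the choice of how to alert the user to the caller.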
5. Transforming verbs to relations
• Simple case: syntactic structure = semantic structure
• But more likely: they differ
• IF a semantic relation appears as a verb,
  → use the verb’s subject and sobject as args of the relation
;;; "A cell contains a nucleus"
subject(contain01,cell01)
sobject(contain01,nucleus01)
input-word(contain01, ["contain",v])
→
;;; "A cell contains a nucleus"
encloses(cell01,nucleus01)
• Special cases:
  → The verb’s subject and the object of its preposition are the args of the relation
    “The explosion resulted in a fire” → causes(explosion01,fire01)
  → “be” and “have” map to an underspecified relation
    “The cell has a nucleus” → “have”(cell01,nucleus01)
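A table-driven sketch of this step is given below. The two table entries mirror the examples above; the table format and the function name `verb_to_relation` are hypothetical, and a real rule set would be much larger.

```python
# verb -> (relation, role of first argument, role of second argument).
# Roles are syntactic argument names or prepositions from the syntactic logic.
VERB_RELATIONS = {
    "contain": ("encloses", "subject", "sobject"),
    "result": ("causes", "subject", "in"),  # "X resulted in Y"
}

UNDERSPECIFIED_VERBS = {"be", "have"}


def verb_to_relation(verb, args):
    """Return a relation triple for the verb, or None if it stays an event.

    args maps syntactic roles and prepositions to instances,
    e.g. {"subject": "cell01", "sobject": "nucleus01"}.
    """
    if verb in UNDERSPECIFIED_VERBS:
        # "be" and "have" map to an underspecified (quoted) relation.
        return (f'"{verb}"', args["subject"], args["sobject"])
    rule = VERB_RELATIONS.get(verb)
    if rule is None:
        return None
    relation, first_role, second_role = rule
    return (relation, args[first_role], args[second_role])
```

A verb with no table entry, such as "throw", returns `None` and keeps its own event instance.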
6. Word Sense Disambiguation
• Largely naïve (context-independent) WSD
  → Same word always maps to same concept
• If word maps to CLib concept, use that
  → If > 1 mapping, use a preference table to pick best
• Else climb WordNet from most likely WN sense to CLib concept
[Figure: the lexical term “object” is linked through a WordNet concept (word sense) to the goal, the CLib Ontology concept Physical-Object.]
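The lookup order described above can be sketched with small in-memory tables. Every table entry below is toy data invented for illustration (none is taken from CLib or WordNet, apart from the “object” to Physical-Object pairing on the slide), and `disambiguate` is a hypothetical name.

```python
# Toy word -> candidate CLib concepts (hypothetical entries).
CLIB_WORDS = {"object": ["Physical-Object"], "cell": ["Cell", "Battery"]}
# Toy preference table, used when a word has more than one CLib mapping.
PREFERENCE = {"cell": "Cell"}
# Toy stand-ins for WordNet: most likely sense per word, and hypernym links.
MOST_LIKELY_SENSE = {"boulder": "sense:boulder"}
HYPERNYM = {"sense:boulder": "sense:rock", "sense:rock": "sense:object"}
# Toy links from word senses to CLib concepts.
SENSE_TO_CLIB = {"sense:object": "Physical-Object"}


def disambiguate(word):
    """Context-independent WSD: the same word always gets the same concept."""
    concepts = CLIB_WORDS.get(word, [])
    if len(concepts) == 1:
        return concepts[0]
    if len(concepts) > 1:
        # More than one mapping: consult the preference table.
        return PREFERENCE.get(word, concepts[0])
    # No direct mapping: climb hypernyms from the most likely sense.
    sense = MOST_LIKELY_SENSE.get(word)
    while sense is not None:
        if sense in SENSE_TO_CLIB:
            return SENSE_TO_CLIB[sense]
        sense = HYPERNYM.get(sense)
    return None
```

Because the function never looks at the surrounding sentence, it is cheap and predictable, at the cost of getting minority senses wrong.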
7. Semantic Role Labeling
• Assign using a hand-built database of (~100) rules
;;; "The man sang for 1 hour"
subject(sing01,man01)
"for"(sing01,x01)
value(x01,[1,*hour])
→
;;; "The man sang for 1 hour"
agent(sing01,man01)
duration(sing01,x01)
value(x01,[1,*hour])

;;; "The man sang for a woman"
subject(sing01,man01)
"for"(sing01,woman01)
→
;;; "The man sang for a woman"
agent(sing01,man01)
beneficiary(sing01,woman01)
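The two examples above can be reproduced with a pair of rules, sketched below. This is a toy stand-in for the hand-built rule database: the tuple encoding and `label_roles` are assumptions, and the rules are far cruder than real ones (every subject becomes an agent, and any "for" filler carrying a value is treated as a duration without checking its unit).

```python
def label_roles(triples):
    """Rewrite syntactic relations into semantic roles.

    triples: list of (relation, arg1, arg2); prepositions are quoted,
    e.g. ('"for"', "sing01", "x01").
    """
    # Instances that carry a value triple are treated as quantities.
    quantities = {arg1 for rel, arg1, _ in triples if rel == "value"}
    labeled = []
    for rel, arg1, arg2 in triples:
        if rel == "subject":
            labeled.append(("agent", arg1, arg2))
        elif rel == '"for"':
            # A quantity filler reads as a duration, anything else as a beneficiary.
            role = "duration" if arg2 in quantities else "beneficiary"
            labeled.append((role, arg1, arg2))
        else:
            labeled.append((rel, arg1, arg2))
    return labeled
```

The same preposition yields different roles depending on the kind of filler, which is why the rules need to inspect more than the relation name.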
8. Metonymy Resolution
• Metonymy: a word is replaced with a closely related word
• The literal meaning is nonsensical
  → “John read Shakespeare”
  → “Erase the blackboard”
  → “Left lane must exit”
  → “Change the washing machine”
  → NOTE: nonsensical with respect to the target ontology
• 5 main types of metonymy fixed:
a. FORCE for EXERTION
"The force on the sled" → "The force of the exertion on the sled"
b. VECTOR-PROPERTY for VECTOR
"The direction of the move" → "The direction of the velocity of the move"
c. SUBSTANCE for STRUCTURAL-UNIT
"The oxidation number of NaCl"
→ "The oxidation number of the basic structural unit of NaCl"
d. OBJECT for EVENT
"The speed of the car is 10 km/h"
→ "The speed of the movement of the car is 10 km/h"
e. PLACE for OBJECT
"The cat sits on the mat" → "The cat sits at a location on the mat"
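One way to picture such a fix is as a type check followed by the insertion of an intermediate object. The sketch below handles only the OBJECT for EVENT case; the tables, the type names, the use of an `object` relation to link the inserted movement to the car, and `resolve_metonymy` are all assumptions, not BLUE's actual rules.

```python
# Expected type of the first argument of each property (hypothetical).
EXPECTED_TYPE = {"speed": "Event"}
# (actual type, expected type) -> (concept to insert, relation linking it back).
COERCIONS = {("Object", "Event"): ("Move", "object")}


def resolve_metonymy(triple, types):
    """Insert an intermediate object when the literal reading is ill-typed.

    triple: (property, instance, value), e.g. ("speed", "car01", "x01")
    types:  mapping from instance to its general type
    """
    prop, subject, value = triple
    expected = EXPECTED_TYPE.get(prop)
    actual = types.get(subject)
    if expected is None or actual == expected:
        return [triple]  # literal reading makes sense
    fix = COERCIONS.get((actual, expected))
    if fix is None:
        return [triple]  # no known repair; leave unchanged
    concept, link = fix
    new = f"{concept.lower()}01"  # assumes no earlier instance of this concept
    return [("isa", new, concept), (link, new, subject), (prop, new, value)]
```

For "The speed of the car", the speed is attached to a newly introduced movement of the car rather than to the car itself.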
9. Question Annotation
• Find-a-value questions: Extract a variable of interest
_Height23                  ; find the value (no wrapper)
(what-is-a _Elephant23)    ; find the definition
(what-is-the _Process23)   ; find the identity of
(how-many _Elephant23)     ; find the count
(how-much _Water23)        ; find the amount
(what-types _Cell23)       ; find the subclasses of the instance's class
• Clausal questions: Extract clauses to be queried about
;;; "Is it true that the big block is red?"
Assertional Triples:
size(block01,x01)
value(x01,[*big,Spatial-Entity])
value(x02,*red)
Query Triples:
color(block01,x02)
query-type(is-it-true-that-questionp,t)
• 5 types of clausal question:
  → Is it true
  → Is it false
  → Is it possible
  → Why
  → How
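Detecting the five clausal question types can be sketched as prefix matching on the question text. This is only a surface-level illustration: the type labels are illustrative strings (the slide shows only `is-it-true-that-questionp`), the function name is hypothetical, and the real system works on logic rather than raw strings.

```python
# (question prefix, illustrative type label) for the five clausal types.
CLAUSAL_PREFIXES = [
    ("is it true that", "is-it-true"),
    ("is it false that", "is-it-false"),
    ("is it possible that", "is-it-possible"),
    ("why", "why"),
    ("how", "how"),
]


def split_clausal_question(question):
    """Return (question type or None, the clause being queried about)."""
    text = question.strip().rstrip("?").strip()
    lowered = text.lower()
    # "How many" / "how much" are find-a-value questions, not clausal ones.
    if lowered.startswith(("how many ", "how much ")):
        return None, text
    for prefix, question_type in CLAUSAL_PREFIXES:
        if lowered.startswith(prefix + " "):
            return question_type, text[len(prefix):].strip()
    return None, text
```

The clause returned for a clausal question would then be interpreted like any other sentence, with its triples split into assertional and query triples.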
10. Additional Processing
• Occasional specific tweaks, e.g.,
;;; "Is it true that the reaction is an oxidation reaction?"
equal(reaction01,oxidation-reaction01)
→
;;; "Is it true that the reaction is an oxidation reaction?"
is-a(reaction01,oxidation-reaction01)