A Quick Tutorial on How it Works
BLUE (Boeing Language Understanding Engine): A Quick Tutorial on How it Works
Working Note 35
2009
Peter Clark
Phil Harrison
(Boeing Phantom Works)
BLUE
Each paragraph is broken up into sentences, then each sentence is
processed in turn.
For each sentence, BLUE has a pipelined architecture with 10
transformation steps:
1. Preprocessing
2. Parsing
3. Syntactic Logic Generation
4. Reference Resolution
5. Transforming Verbs to Relations
6. Word Sense Disambiguation
7. Semantic Role Labeling
8. Metonymy Resolution
9. Question Annotation
10. Additional Processing
Example: “An object is thrown from a cliff.” →
isa(object01,Object),
isa(cliff01,Cliff),
isa(throw01,Throw),
object(throw01,object01),
origin(throw01,cliff01).
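The pipelined architecture can be pictured as a chain of functions, each consuming the previous step's output. The following is only a minimal sketch in Python: the slides do not describe BLUE's implementation, so `Step`, `run_pipeline`, and the placeholder `tag` steps are hypothetical names that merely record the order of the 10 steps.

```python
from typing import Callable, List, Tuple

# A step takes the current representation and returns a transformed one.
# Here the representation is just a list of strings (an assumption);
# BLUE's real data structures are not described on the slides.
Step = Callable[[List[str]], List[str]]


def run_pipeline(sentence: str, steps: List[Tuple[str, Step]]) -> List[str]:
    """Feed one sentence through every step in order."""
    data = [sentence]
    for _name, step in steps:
        data = step(data)
    return data


def tag(label: str) -> Step:
    """Hypothetical placeholder step: only records that it ran."""
    return lambda data: data + [label]


STEP_NAMES = [
    "preprocessing", "parsing", "syntactic logic generation",
    "reference resolution", "transforming verbs to relations",
    "word sense disambiguation", "semantic role labeling",
    "metonymy resolution", "question annotation", "additional processing",
]
PIPELINE = [(name, tag(name)) for name in STEP_NAMES]

trace = run_pipeline("An object is thrown from a cliff.", PIPELINE)
```

In a fuller version each `tag(...)` would be swapped for a real transformation, with the list order fixing the order of execution.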
1. Preprocessing
Replace math symbols with words:
+, -, /, *, = become “plus”, “minus”, “divided by”, “times”, “is”
Remove non-ASCII characters
Replace chemical formulae with a dummy noun:
“NaCl is a chemical” → “formula1 is a chemical”
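The three preprocessing rules above can be sketched in a few lines of Python. This is an illustration, not BLUE's code: the symbol table comes from the slide, but the `FORMULA` regular expression is an assumed heuristic (it will also catch acronyms such as "DNA"), and the blanket replacement of "-" would mangle hyphenated words.

```python
import re

# Symbol-to-word table taken from the slide.
MATH_WORDS = {"+": "plus", "-": "minus", "/": "divided by", "*": "times", "=": "is"}

# Assumed heuristic: a chemical formula is a token made of two or more
# element-like chunks (capital letter, optional lowercase letter, optional count).
FORMULA = re.compile(r"\b(?:[A-Z][a-z]?\d*){2,}\b")


def preprocess(text: str) -> str:
    # 1. Replace math symbols with words, padded with spaces.
    for symbol, word in MATH_WORDS.items():
        text = text.replace(symbol, f" {word} ")
    # 2. Remove non-ASCII characters.
    text = text.encode("ascii", errors="ignore").decode("ascii")
    # 3. Replace each distinct formula with a numbered dummy noun.
    seen = {}

    def dummy(match):
        return seen.setdefault(match.group(0), f"formula{len(seen) + 1}")

    text = FORMULA.sub(dummy, text)
    # Collapse the extra whitespace introduced in step 1.
    return " ".join(text.split())
```

For example, `preprocess("NaCl is a chemical")` gives `"formula1 is a chemical"`, matching the slide.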
2. Parsing
“An object is thrown from a cliff”
                    *S:-16*
          +------------+------------+
        NP:-1                    VP:-12
      +---+---+             +-------+------+
    DET:0   N^:0          AUX:0          VP:-8
      |       |             |         +----+-----+
     AN      N:0            IS      VP:0      *PP:-4*
              |                       |      +----+---+
            OBJECT                   V:0    P:0     NP:-3
                                      |      |     +--+----+
                                   THROWN  FROM  DET:-2  N^:0
                                                   |       |
                                                   A      N:0
                                                           |
                                                         CLIFF
3. Syntactic Logic Generation
Produce initial “syntactic logic”
Nouns, verbs, adjectives, adverbs become objects
prepositions, verb-argument positions become relations
Parse tree (as in step 2) → syntactic logic:
throw01:  input-word(throw01, [“throw”,v])
          subject(throw01,object01)
          “from”(throw01,cliff01)
object01: input-word(object01, [“object”,n])
          determiner(object01, “an”)
cliff01:  input-word(cliff01, [“cliff”,n])
          determiner(cliff01, “a”)
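How content words become objects and prepositions and argument positions become relations can be sketched as follows. This is a toy illustration: the `clause` dictionary is a hypothetical, heavily simplified stand-in for the parse tree, and `syntactic_logic` is not a BLUE function.

```python
def syntactic_logic(clause):
    """Turn a simplified clause into (relation, instance, value) triples."""
    counts = {}
    triples = []

    def new_instance(word):
        # Name instances word01, word02, ... in order of creation.
        counts[word] = counts.get(word, 0) + 1
        return f"{word}{counts[word]:02d}"

    def noun_phrase(det, noun):
        inst = new_instance(noun)
        triples.append(("input-word", inst, (noun, "n")))
        triples.append(("determiner", inst, det))
        return inst

    # The verb becomes an object of its own.
    verb = new_instance(clause["verb"])
    triples.append(("input-word", verb, (clause["verb"], "v")))

    # Verb-argument positions become relations.
    subj = noun_phrase(*clause["subject"])
    triples.append(("subject", verb, subj))

    # Prepositions become (quoted) relations.
    for prep, np in clause["pps"]:
        pp_object = noun_phrase(*np)
        triples.append((f'"{prep}"', verb, pp_object))
    return triples


# Hypothetical simplified parse of "An object is thrown from a cliff".
clause = {"verb": "throw", "subject": ("an", "object"), "pps": [("from", ("a", "cliff"))]}
logic = syntactic_logic(clause)
```

The resulting triples mirror the slide's output, for example `("subject", "throw01", "object01")` and `('"from"', "throw01", "cliff01")`.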
4. Reference resolution
Reference ties sentences together:
“A ball fell from a cliff.”
“The ball weighs 10 N.”
BLUE accumulates logic for each sentence in turn.
“The red ball” → search for a previous object that is a red ball
If more than one matches, warn the user and pick the most recent
If none matches, assume a new object
“The second red ball” → take the 2nd matching object
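The lookup rules above can be sketched like this. It is a minimal illustration under stated assumptions: the `Entity` record, the `resolve` function, and matching by head noun plus a set of modifiers are all hypothetical simplifications, not BLUE's actual matching procedure.

```python
import warnings
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Entity:
    name: str                                          # e.g. "ball01"
    noun: str                                          # head noun, e.g. "ball"
    modifiers: Set[str] = field(default_factory=set)   # e.g. {"red"}


def resolve(noun: str, modifiers: Set[str], history: List[Entity],
            ordinal: Optional[int] = None) -> Optional[Entity]:
    """Resolve a definite reference against earlier entities (oldest first).

    Returns the matched entity, or None if a new object should be assumed.
    """
    matches = [e for e in history if e.noun == noun and modifiers <= e.modifiers]
    if ordinal is not None:
        # "The second red ball": take the n-th matching object.
        return matches[ordinal - 1] if len(matches) >= ordinal else None
    if not matches:
        return None  # no match: the caller creates a new object
    if len(matches) > 1:
        warnings.warn(f"ambiguous reference to '{noun}'; picking the most recent")
    return matches[-1]


history = [
    Entity("ball01", "ball", {"red"}),
    Entity("ball02", "ball", {"red"}),
    Entity("block01", "block", {"red"}),
]
```

With this history, "the red ball" resolves to `ball02` (with a warning), "the first red ball" to `ball01`, and "the cliff" to a new object.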
5. Transforming verbs to relations
Simple case: syntactic structure = semantic structure
But more likely: they differ
If a semantic relation appears as a verb,
use the verb’s subject and sobject (syntactic object) as the args of the relation:
;;; "A cell contains a nucleus"
subject(contain01,cell01)
sobject(contain01,nucleus01)
input-word(contain01, ["contain",v])
→
;;; "A cell contains a nucleus"
encloses(cell01,nucleus01)
Special cases:
the verb’s subject and prepositional object are the args of the relation
“The explosion resulted in a fire” → causes(explosion01,fire01)
“be” and “have” map to an underspecified relation
“The cell has a nucleus” → “have”(cell01,nucleus01)
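A sketch of this rewrite over triples is shown below. The three lookup tables hold only the entries from the slides; the table names, the triple layout, and the assumption that verbs arrive already lemmatized ("has" as "have") are illustrative choices.

```python
# Hypothetical lookup tables; only these entries come from the slides.
VERB_TO_RELATION = {"contain": "encloses"}             # args: subject, sobject
VERB_PREP_TO_RELATION = {("result", "in"): "causes"}   # args: subject, prepositional object
UNDERSPECIFIED = {"be", "have"}                        # kept as a quoted relation


def verbs_to_relations(triples):
    """Rewrite verb-centred triples as direct relations where a rule applies."""
    # Map each verb instance to its lemma, e.g. {"contain01": "contain"}.
    lemmas = {t[1]: t[2][0] for t in triples
              if t[0] == "input-word" and t[2][1] == "v"}
    # Triples not headed by a verb pass through unchanged.
    result = [t for t in triples if t[1] not in lemmas]
    for verb, lemma in lemmas.items():
        own = [t for t in triples if t[1] == verb]
        args = {rel: arg for rel, _, arg in own}
        subj, sobj = args.get("subject"), args.get("sobject")
        new = None
        if subj and sobj and lemma in VERB_TO_RELATION:
            new = (VERB_TO_RELATION[lemma], subj, sobj)
        elif subj and sobj and lemma in UNDERSPECIFIED:
            new = (f'"{lemma}"', subj, sobj)
        elif subj:
            for (v, prep), relation in VERB_PREP_TO_RELATION.items():
                if v == lemma and f'"{prep}"' in args:
                    new = (relation, subj, args[f'"{prep}"'])
        # Keep the verb's own triples when no rule fired.
        result.extend([new] if new else own)
    return result


cell = [("subject", "contain01", "cell01"),
        ("sobject", "contain01", "nucleus01"),
        ("input-word", "contain01", ("contain", "v"))]
```

Here `verbs_to_relations(cell)` collapses the three verb triples into the single relation `("encloses", "cell01", "nucleus01")`.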
6. Word Sense Disambiguation
Largely naïve (context-independent) WSD:
the same word always maps to the same concept
If the word maps to a CLib concept, use that
If there is more than one mapping, use a preference table to pick the best
Otherwise, climb WordNet from the most likely WordNet sense to a CLib concept
[Figure: the lexical term “object” linked to concepts (word senses) in WordNet and the CLib Ontology; concepts shown: Physical-Object, Goal.]
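The three-way decision above (direct mapping, preference table, WordNet climb) can be sketched with toy dictionaries. All table contents below are illustrative assumptions standing in for CLib and WordNet; none of them are BLUE's real data, and no WordNet library is used.

```python
# Toy stand-ins (assumptions): word -> candidate CLib concepts.
WORD_TO_CLIB = {"nucleus": ["Nucleus"], "cell": ["Cell", "Battery"]}
# Preference table for words with more than one mapping.
PREFERENCE = {"cell": "Cell"}
# Most likely WordNet sense per word, and a hypernym ("is a kind of") chain.
WN_FIRST_SENSE = {"cliff": "cliff.n.01"}
HYPERNYM = {
    "cliff.n.01": "geological_formation.n.01",
    "geological_formation.n.01": "object.n.01",
}
# WordNet senses that have a corresponding CLib concept.
SYNSET_TO_CLIB = {"object.n.01": "Physical-Object"}


def disambiguate(word):
    """Pick one concept per word, ignoring context."""
    candidates = WORD_TO_CLIB.get(word, [])
    if len(candidates) == 1:
        return candidates[0]
    if len(candidates) > 1:
        # Fall back to the first candidate if the table has no entry.
        return PREFERENCE.get(word, candidates[0])
    # No direct mapping: climb the hypernym chain until a mapped sense is found.
    synset = WN_FIRST_SENSE.get(word)
    while synset is not None:
        if synset in SYNSET_TO_CLIB:
            return SYNSET_TO_CLIB[synset]
        synset = HYPERNYM.get(synset)
    return None
```

Because the function never looks at the sentence, the same word always yields the same concept, which is what "context-independent" means here.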
7. Semantic Role Labeling
Assign using a hand-built database of (~100) rules
;;; "The man sang for 1 hour"
subject(sing01,man01)
"for"(sing01,x01)
value(x01,[1,*hour])
→
;;; "The man sang for 1 hour"
agent(sing01,man01)
duration(sing01,x01)
value(x01,[1,*hour])

;;; "The man sang for a woman"
subject(sing01,man01)
"for"(sing01,woman01)
→
;;; "The man sang for a woman"
agent(sing01,man01)
beneficiary(sing01,woman01)
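A rule database of this kind can be sketched as an ordered list of (syntactic relation, test, semantic role) entries, where the first matching rule wins. The three rules below are reverse-engineered from the two examples only; the real database has about 100 rules, and `TIME_UNITS`, `is_duration`, and `label_roles` are hypothetical names. In particular, mapping every subject to `agent` is an oversimplification.

```python
TIME_UNITS = {"*second", "*minute", "*hour", "*day", "*year"}  # assumed unit names


def is_duration(filler, triples):
    """True if the filler has a value measured in a time unit."""
    return any(rel == "value" and inst == filler
               and isinstance(val, list) and len(val) == 2 and val[1] in TIME_UNITS
               for rel, inst, val in triples)


def always(filler, triples):
    return True


# Ordered rules: the first rule whose relation and test both match is used.
RULES = [
    ("subject", always, "agent"),
    ('"for"', is_duration, "duration"),
    ('"for"', always, "beneficiary"),
]


def label_roles(triples):
    """Replace syntactic relations with semantic roles where a rule applies."""
    out = []
    for rel, head, filler in triples:
        for syn_rel, test, role in RULES:
            if rel == syn_rel and test(filler, triples):
                out.append((role, head, filler))
                break
        else:
            out.append((rel, head, filler))  # no rule matched: keep as is
    return out
```

Ordering the specific `duration` rule before the general `beneficiary` rule is what lets the same preposition "for" receive different roles in the two sentences.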
8. Metonymy Resolution
Metonymy: a word is replaced with a closely related word,
so the literal meaning is nonsensical:
“John read Shakespeare”
“Erase the blackboard”
“Left lane must exit”
“Change the washing machine”
NOTE: nonsensical with respect to the target ontology
BLUE fixes 5 main types of metonymy:
a. FORCE for EXERTION
"The force on the sled" → "The force of the exertion on the sled"
b. VECTOR-PROPERTY for VECTOR
"The direction of the move" → "The direction of the velocity of the move"
c. SUBSTANCE for STRUCTURAL-UNIT
"The oxidation number of NaCl"
→ "The oxidation number of the basic structural unit of NaCl"
d. OBJECT for EVENT
"The speed of the car is 10 km/h"
→ "The speed of the movement of the car is 10 km/h"
e. PLACE for OBJECT
"The cat sits on the mat" → "The cat sits at a location on the mat"
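One way to picture such a fix is as a type coercion: when a relation's argument has the wrong class for the target ontology, a bridging instance of the right class is inserted. The sketch below covers only type d (OBJECT for EVENT) and rests on assumptions: the `DOMAIN` and `COERCIONS` tables, the coarse "Object"/"Event" classes, and the choice of `Move` as the bridging event are illustrative, not BLUE's actual rules.

```python
# Assumed ontology constraint: the first argument of speed(...) must be an Event.
DOMAIN = {"speed": "Event"}
# Assumed coercion: (actual class, expected class) -> (bridging class, linking relation).
COERCIONS = {("Object", "Event"): ("Move", "object")}   # OBJECT for EVENT


def resolve_metonymy(triples, classes):
    """Insert a bridging instance wherever an argument violates DOMAIN."""
    classes = dict(classes)   # do not mutate the caller's mapping
    out, counter = [], 0
    for rel, a, b in triples:
        expected = DOMAIN.get(rel)
        actual = classes.get(a)
        fix = COERCIONS.get((actual, expected))
        if expected and actual != expected and fix:
            bridge_class, link = fix
            counter += 1
            bridge = f"{bridge_class.lower()}{counter:02d}"   # e.g. "move01"
            classes[bridge] = bridge_class
            out.append((link, bridge, a))   # object(move01, car01)
            out.append((rel, bridge, b))    # speed(move01, x01)
        else:
            out.append((rel, a, b))
    return out, classes


# "The speed of the car is 10 km/h"
car = [("speed", "car01", "x01"), ("value", "x01", [10, "*km/h"])]
fixed, new_classes = resolve_metonymy(car, {"car01": "Object"})
```

The result reads as "the speed of the movement of the car": `speed` now attaches to `move01`, which in turn has `car01` as its object.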
9. Question Annotation
Find-a-value questions: Extract a variable of interest
_Height23                   ; find the value (no wrapper)
(what-is-a _Elephant23)     ; find the definition
(what-is-the _Process23)    ; find the identity of
(how-many _Elephant23)      ; find the count
(how-much _Water23)         ; find the amount
(what-types _Cell23)        ; find the subclasses of the instance's class
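A downstream consumer of these annotations might unpack them as sketched below. The wrapper names and their meanings come from the slide; the `parse_annotation` function and its return shape are hypothetical.

```python
# Wrapper -> meaning, as listed on the slide.
QUESTION_WRAPPERS = {
    "what-is-a": "find the definition",
    "what-is-the": "find the identity of",
    "how-many": "find the count",
    "how-much": "find the amount",
    "what-types": "find the subclasses of the instance's class",
}


def parse_annotation(text):
    """Split an annotation into (wrapper, variable, meaning)."""
    text = text.strip()
    if text.startswith("(") and text.endswith(")"):
        wrapper, variable = text[1:-1].split()
        return wrapper, variable, QUESTION_WRAPPERS[wrapper]
    # A bare variable means "find the value" (no wrapper).
    return None, text, "find the value"
```

For instance, `parse_annotation("(how-many _Elephant23)")` yields the wrapper `how-many`, the variable `_Elephant23`, and the meaning "find the count".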
Clausal questions: Extract clauses to be queried about
;;; "Is it true that the big block is red?"
Assertional Triples:
size(block01,x01)
value(x01,[*big,Spatial-Entity])
value(x02,*red)
Query Triples:
color(block01,x02)
query-type(is-it-true-that-questionp,t)
5 types of clausal question:
Is it true
Is it false
Is it possible
Why
How
10. Additional Processing
Occasional specific tweaks, e.g.,
;;; "Is it true that the reaction is an oxidation reaction?"
equal(reaction01,oxidation-reaction01)
→
;;; "Is it true that the reaction is an oxidation reaction?"
is-a(reaction01,oxidation-reaction01)