
Answer Mining by Combining
Extraction Techniques with
Abductive Reasoning
Sanda Harabagiu, Dan Moldovan, Christine
Clark, Mitchell Bowden, John Williams and
Jeremy Bensley
LCC
TREC 2003 Question Answering Track
Abstract
• Information Extraction Technique:
– Axiomatic knowledge derived from WordNet for
justifying answers extracted from the AQUAINT text
collection
• CICERO LITE:
– Named entity recognizer
– Precisely recognizes a large set of entities ranging over an extended set of semantic categories
• Theorem Prover:
– Produces abductive justifications of the answers when it has access to the axiomatic transformations of the WordNet glosses
Introduction
• TREC-2003: Main task & Passage task
• Main task:
– Factoids
– Lists
– Definitions
• Main_task_score = ½ * factoid_score + ¼ * list_score + ¼ * definition_score
• Factoid questions:
– Seek short, fact-based answers
– Ex. ”What are pennies made of?”
• List questions:
– Requests a set of instances of specified types
– Ex. “What grapes are used in making wine?”
– The final answer set was created from the participants & assessors
– IR = #instances judged correct and distinct /
#answers in the final set
– IP = #instances judged correct and distinct /
#instances returned
– F = (2 * IP * IR) / (IP + IR)
• Definition questions:
– The assessor created a list of acceptable information nuggets, some of which were deemed essential
– NR (Nugget Recall) = #essential nuggets
returned in response / #essential nuggets
– NP (Nugget Precision)
• Allowance = 100 * #essential and acceptable
nuggets returned
• Length = total #non-white space characters in
answer strings
• Definition questions:
– NP = 1, if length < allowance
– NP = 1 – (length – allowance) / length,
otherwise
– F = (26 * NP * NR) / (25 * NP + NR)
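The list and definition scoring formulas above can be summarized in a short Python sketch (a minimal illustration, assuming the per-question counts are already available; the function names are illustrative and not part of the official evaluation software):

    def list_f_score(num_correct_distinct, num_in_final_set, num_returned):
        # F-measure for a list question (beta = 1)
        ir = num_correct_distinct / num_in_final_set   # instance recall (IR)
        ip = num_correct_distinct / num_returned       # instance precision (IP)
        return 0.0 if ip + ir == 0 else (2 * ip * ir) / (ip + ir)

    def definition_f_score(essential_returned, essential_total,
                           matched_nuggets, answer_length):
        # F-measure for a definition question (beta = 5, i.e. recall-heavy)
        nr = essential_returned / essential_total      # nugget recall (NR)
        allowance = 100 * matched_nuggets              # 100 chars per essential/acceptable nugget returned
        if answer_length < allowance:
            np = 1.0
        else:
            np = 1.0 - (answer_length - allowance) / answer_length
        return 0.0 if 25 * np + nr == 0 else (26 * np * nr) / (25 * np + nr)

    def main_task_score(factoid_score, list_score, definition_score):
        # Weighted combination used for the TREC-2003 main task
        return 0.5 * factoid_score + 0.25 * list_score + 0.25 * definition_score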
• TREC-2003:
– Factoids: 413
– Lists: 37
– Definition: 50
Answer Type                             Count
Answers to Factoid                        383
NIL-answers to Factoid                     30
Answer instances in List final set        549
Essential nuggets for Definition          207
Total nuggets for Definition              417
The Architecture of the QA System
Question Processing
• Factoid or List questions:
– Identify the expected answer type encoded as
• Semantic class recognized by CICERO LITE or
• In a hierarchy of semantic concepts using the
WordNet hierarchies for verbs and nouns
– Ex. “What American revolutionary general
turned over West Point to the British?”
• Expected answer type is PERSON due to the noun
general found in the hierarchy of humans in
WordNet
• Definition questions:
– Parsed for detecting the NPs and matched
against a set of patterns
– Ex. “What is Iqra?”
• Matched against the pattern <What is QuestionPhrase>
• Associated with the answer pattern <QuestionPhrase, which means Answer-Phrase>
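A minimal sketch of recognizing the expected answer type from the question's head noun via WordNet hypernyms; NLTK's WordNet interface is used here only for illustration, whereas the actual system relies on its own Answer Type Hierarchy and CICERO LITE:

    from nltk.corpus import wordnet as wn

    PERSON_ROOT = wn.synset('person.n.01')

    def is_person_concept(noun):
        # True if any noun sense of the word is a hyponym of person.n.01
        for synset in wn.synsets(noun, pos=wn.NOUN):
            if synset == PERSON_ROOT:
                return True
            if PERSON_ROOT in synset.closure(lambda s: s.hypernyms()):
                return True
        return False

    # "general" falls under the human hierarchy in WordNet, so the expected
    # answer type for "What American revolutionary general ..." is PERSON
    print(is_person_concept('general'))   # True
    print(is_person_concept('penny'))     # False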
Document Processing
• Retrieve relevant passages based on the
keywords provided by question processing
• Factoid questions:
– Ranks the candidate passages
• List questions:
– Ranks higher the passages having multiple occurrences of concepts of the expected answer type
• Definition questions:
– Allows multiple matches of keywords
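An illustrative sketch of this passage ranking idea (not the system's actual scoring function): keyword overlap, with a bonus for list questions when a passage contains multiple mentions of the expected answer type:

    def score_passage(passage_tokens, keywords, answer_type_mentions=0,
                      question_kind="factoid"):
        # Base score: how many question keywords appear in the passage
        score = sum(1 for k in keywords if k in passage_tokens)
        if question_kind == "list":
            # Passages with several instances of the expected type rank higher
            score += answer_type_mentions
        return score

    def rank_passages(passages, keywords, question_kind="factoid"):
        # passages: list of (tokens, answer_type_mentions) pairs
        return sorted(passages,
                      key=lambda p: score_passage(p[0], keywords, p[1], question_kind),
                      reverse=True)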
Answer Extraction
• Factoid:
– Answers first extracted based on the answer
phrase provided by CICERO LITE
– If the answer is not a named entity, it is justified abductively by using a theorem prover that makes use of axioms derived from WordNet
– Ex. “What apostle was crucified?”
• List:
– Extracted by using the ranked set of extracted answers
– Then determining a cutoff measure based on
the semantic similarity of answers
• Definition
– Relies on pattern matching
Extracting Answers for Factoid
Questions
• 289 correct answers
– 234: identified by CICERO LITE or recognized from the Answer Type Hierarchy
– 65: due to the theorem prover reported in Moldovan et al. 2003
• The role of the theorem prover is to boost precision by filtering out incorrect answers that are not supported by an abductive justification
• Ex. “What country does Greenland belong to?”
– Answered by “Greenland, which is a territory
of Denmark”
– The gloss of the synset of {territory, dominion,
province} is “a territorial possession controlled
by a ruling state”
• Ex. “What country does Greenland belong to?”
– The logical transformation for this gloss:
• control:v#1(e,x1,x2) & country:n#1(x1) &
ruling:a#1(x1) & possession:n#2(x2) &
territorial:a#1(x2)
– Refined expression:
• process:v#2(e,x1,x2) & COUNTRY:n#1(x1) &
ruling:a#1(x1) & territory:n#2(x2)
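A toy sketch of the abductive-justification step, under the assumption that gloss axioms and answer logic forms are available as predicate lists: the axiom derived from the gloss of {territory, dominion, province} connects the answer's territory predicate to the expected COUNTRY type, so the candidate answer is accepted. The matching logic below is illustrative, not the actual theorem prover:

    # Axiom derived from the WordNet gloss, following the predicates on the slide
    gloss_axiom = {
        "head": ("territory:n#2", "x2"),
        "body": [("control:v#1", "e", "x1", "x2"),
                 ("country:n#1", "x1"),
                 ("ruling:a#1", "x1")],
    }

    def justifies(expected_type, answer_predicates, axiom):
        # Accept the answer if the axiom applies to it and its body supplies
        # a predicate of the expected answer type
        applies = any(p[0] == axiom["head"][0] for p in answer_predicates)
        supplies_type = any(p[0].startswith(expected_type) for p in axiom["body"])
        return applies and supplies_type

    # "Greenland, which is a territory of Denmark": the territory predicate is
    # present, and the axiom supplies country:n#1(x1), justifying COUNTRY = Denmark
    answer_lf = [("territory:n#2", "x2"), ("of", "x2", "x1")]
    print(justifies("country", answer_lf, gloss_axiom))   # True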
Extracting Answers for Definition
Questions
• 50 definition questions evaluated
• 207 essential nuggets
• 417 total nuggets
• 485 answers extracted by this system
– Two runs: Exact answers & corresponding sentence-type answers
– Vital matches: 68 (exact) & 86 (sentence) from the 207 essential nuggets
– 110 (exact) & 114 (sentence) from the final set of 417 nuggets
• 38 patterns
– 23 patterns had at least one match for the tested
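An illustrative rendering of two of these surface patterns as Python regular expressions; the system's 38 patterns are not reproduced in the slides, so the exact forms below are assumptions:

    import re

    QUESTION_PATTERN = re.compile(r"^What is (?P<phrase>.+)\?$", re.IGNORECASE)
    ANSWER_PATTERN = "{phrase}, which means (?P<answer>[^,.;]+)"

    def extract_definition(question, text):
        # Match <What is QuestionPhrase>, then look for
        # <QuestionPhrase, which means Answer-Phrase> in the text
        q = QUESTION_PATTERN.match(question)
        if not q:
            return None
        phrase = re.escape(q.group("phrase"))
        a = re.search(ANSWER_PATTERN.format(phrase=phrase), text)
        return a.group("answer") if a else None

    # Toy input text, used only to exercise the patterns
    print(extract_definition("What is Iqra?",
                             "Iqra, which means read, is one example."))
    # -> 'read'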
Extracting Answers for List
Questions
• 37 list questions
• A threshold-based cutoff of the answers
extracted
• Decided on the threshold value by using
concept similarities between candidate
answers
• Given N list answers:
– First computes the similarity between the first and the last answer
– The similarity of a pair of answers considers a window of three noun or verb concepts to the left and to the right of the exact answer
– Separates the concepts into nouns and verbs before applying the similarity formula (the formula itself is shown on the slides)
– A rough sketch of this cutoff appears below
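A hedged sketch of the threshold-based cutoff, assuming each candidate answer is represented by the noun/verb concepts in a ±3 window around the exact answer; simple word overlap stands in here for the WordNet-based similarity formula on the slides:

    def window_concepts(tokens, answer_index, size=3):
        # Noun/verb concepts within `size` positions of the exact answer
        lo, hi = max(0, answer_index - size), answer_index + size + 1
        return set(tokens[lo:answer_index] + tokens[answer_index + 1:hi])

    def similarity(concepts_a, concepts_b):
        # Overlap-based stand-in for the concept similarity formula
        if not concepts_a or not concepts_b:
            return 0.0
        return len(concepts_a & concepts_b) / len(concepts_a | concepts_b)

    def cutoff(candidates):
        # candidates: ranked list of concept sets; returns how many answers to keep.
        # The similarity between the first and last answer serves as the threshold
        threshold = similarity(candidates[0], candidates[-1])
        keep = 1
        for cand in candidates[1:]:
            if similarity(candidates[0], cand) >= threshold:
                keep += 1
            else:
                break
        return keep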
Performance Evaluation
• Two different runs:
– Exact answers & whole sentence containing the
answer
Conclusion
• The second submission scored slightly higher than the first submission
• Definition questions received a higher score:
– Returning an entire sentence allowed more vital nuggets to be identified by the assessors
• Factoid questions in the main task scored slightly better than in the passage task
– A passage might have contained multiple concepts similar to the answer, and thus produced a vaguer evaluation context