lecture16_question_answering_askMSR
Question-Answering via the Web: the
AskMSR System
Note: these viewgraphs were originally developed by
Professor Nick Kushmerick, University College Dublin,
Ireland. These copies are intended only for review use in ICS 278.
Question-Answering
• Users want answers, not documents
[Diagram: Databases → Information Retrieval → Information Extraction → Question Answering → Intelligent Personal Electronic Librarian]
• Active research over the past few years, coordinated by the US government’s “TREC” competitions
• Recent intense interest from security services
(“What is Bin Laden’s bank account number?”)
Question-Answering on the Web
• Web = a potentially enormous “data set” for data mining
– e.g., >8 billion Web pages indexed by Google
• Example: AskMSR Web question answering system
– “answer mining”
• Users pose relatively simple questions
– E.g., “who killed Abraham Lincoln”?
• Simple parsing used to reformulate as a “template answer”
• Search engine results used to find answers (redundancy helps)
• System is surprisingly accurate (on simple questions)
• Key contributor to system success is massive data (rather than better algorithms)
– References:
• Dumais et al, 2002: Web question answering: is more always better?
In Proceedings of SIGIR'02
AskMSR
• Web Question Answering: Is More Always Better?
– Dumais, Banko, Brill, Lin, Ng (Microsoft, MIT, Berkeley)
• Q: “Where is the Louvre located?”
• Want “Paris” or “France” or “75058 Paris Cedex 01” or a map
• Don’t just want URLs
Adapted from: COMP-4016 ~ Computer Science Department ~ University College Dublin ~ www.cs.ucd.ie/staff/nick ~ © Nicholas Kushmerick 2002
“Traditional” approach (Straw man?)
• Traditional deep natural-language processing approach
– Full parse of documents and question
– Rich knowledge of vocabulary, cause/effect, common sense, enables
sophisticated semantic analysis
• E.g., in principle this answers the “who killed Lincoln?” question from a sentence like:
“The non-Canadian, non-Mexican president of a North American country whose initials are AL and who was killed by John Wilkes Booth died ten revolutions of the earth around the sun after 1855.”
AskMSR: Shallow approach
• Just ignore those documents, and look for ones
like this instead:
AskMSR: Details
[Diagram: system architecture with the five numbered processing steps, described on the following slides]
Step 1: Rewrite queries
• Intuition: The user’s question is often syntactically
quite close to sentences that contain the answer
– Where is the Louvre Museum located?
– The Louvre Museum is located in Paris
– Who created the character of Scrooge?
– Charles Dickens created the character of Scrooge.
Query rewriting
• Classify question into seven categories
– Who is/was/are/were…?
– When is/did/will/are/were…?
– Where is/are/were…?
a. Category-specific transformation rules
eg “For Where questions, move ‘is’ to all possible locations”:
“Where is the Louvre Museum located” becomes
“is the Louvre Museum located”
“the is Louvre Museum located”
“the Louvre is Museum located”
“the Louvre Museum is located”
“the Louvre Museum located is”
Nonsense, but who cares? It’s only a few more queries to Google.
(Paper does not give full details!)
b. Expected answer “Datatype” (eg, Date, Person, Location, …)
“When was the French Revolution?” → DATE
• Hand-crafted classification/rewrite/datatype rules
(Could they be automatically learned?)
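The classification and datatype rules are hand-crafted and not published in full; below is a minimal, illustrative sketch of what such a classifier might look like in Python, matching only on the leading question word. The specific patterns and the classify helper are assumptions, not the system’s actual rules.

```python
import re

# Illustrative stand-ins for the hand-crafted rules: classify a question by
# its leading wh-word and attach the expected answer datatype.
CATEGORY_RULES = [
    (re.compile(r"^who\s+(is|was|are|were)\b", re.I), "who", "Person"),
    (re.compile(r"^when\s+(is|was|did|will|are|were)\b", re.I), "when", "Date"),
    (re.compile(r"^where\s+(is|was|are|were)\b", re.I), "where", "Location"),
]

def classify(question):
    """Return (category, expected answer datatype), or (None, None) if no rule fires."""
    for pattern, category, datatype in CATEGORY_RULES:
        if pattern.match(question):
            return category, datatype
    return None, None

print(classify("When was the French Revolution?"))     # ('when', 'Date')
print(classify("Where is the Louvre Museum located?"))  # ('where', 'Location')
```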
Query Rewriting - weights
• One wrinkle: Some query rewrites are more
reliable than others
Where is the Louvre Museum located?
– +“the Louvre Museum is located”  (weight 5: if we get a match, it’s probably right)
– +Louvre +Museum +located  (weight 1: lots of non-answers could come back too)
(A sketch of generating these weighted rewrites follows below.)
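A small sketch of the “move ‘is’ to all possible positions” rewrite for Where-is questions, tagging each exact-phrase rewrite with weight 5 and the bag-of-words backoff with weight 1 as on this slide. The paper does not publish the full rule set, so the code is illustrative only.

```python
def rewrite_where_is(question):
    """Sketch of the "move 'is' to all positions" rewrite for a "Where is X ...?" question.
    Exact-phrase rewrites get weight 5; the bag-of-words backoff gets weight 1
    (values from the slide above). Not the system's actual hand-crafted rules."""
    words = question.rstrip("?").split()
    rest = words[2:]                      # drop the leading "Where is"
    rewrites = []
    for i in range(len(rest) + 1):        # insert "is" at every possible position
        phrase = " ".join(rest[:i] + ["is"] + rest[i:])
        rewrites.append(('+"' + phrase + '"', 5))
    # Backoff: plain AND-query over the content words; much less reliable.
    rewrites.append((" ".join("+" + w for w in rest if w.lower() != "the"), 1))
    return rewrites

for query, weight in rewrite_where_is("Where is the Louvre Museum located?"):
    print(weight, query)
# 5 +"is the Louvre Museum located"
# ...
# 5 +"the Louvre Museum is located"
# 5 +"the Louvre Museum located is"
# 1 +Louvre +Museum +located
```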
Step 2: Query search engine
• Throw all rewrites to a Web-wide search engine
• Retrieve top N answers (100?)
• For speed, rely just on search engine’s “snippets”,
not the full text of the actual document
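A sketch of this step, assuming a hypothetical search_snippets(query, top_n) wrapper around whichever search-engine API is available; the paper does not describe a specific interface.

```python
def collect_snippets(rewrites, search_snippets, top_n=100):
    """For each (query, weight) rewrite, fetch the top-N result snippets and
    remember which rewrite (and hence which reliability weight) produced them.
    `search_snippets(query, top_n)` is a hypothetical search-engine wrapper;
    only snippets are used, never the full documents."""
    snippets = []
    for query, weight in rewrites:
        for snippet in search_snippets(query, top_n):
            snippets.append((snippet, weight))
    return snippets
```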
Step 3: Mining N-Grams
• Unigram, bigram, trigram, … N-gram:
list of N adjacent terms in a sequence
• Eg, “Web Question Answering: Is More Always Better”
– Unigrams: Web, Question, Answering, Is, More, Always, Better
– Bigrams: Web Question, Question Answering, Answering Is, Is More, More
Always, Always Better
– Trigrams: Web Question Answering, Question Answering Is, Answering Is More, Is More Always, More Always Better
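A few lines of Python make the enumeration concrete (the crude regex tokenizer is an assumption; the paper does not specify tokenization):

```python
import re

def ngrams(text, n):
    """Enumerate the n-grams (n adjacent terms) of `text`."""
    tokens = re.findall(r"[A-Za-z0-9]+", text)   # crude tokenizer, drops punctuation
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

title = "Web Question Answering: Is More Always Better"
print(ngrams(title, 3))
# ['Web Question Answering', 'Question Answering Is', 'Answering Is More',
#  'Is More Always', 'More Always Better']
```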
Mining N-Grams
• Simple: Enumerate all N-grams (N=1,2,3 say) in all
retrieved snippets
• Use hash table and other fancy footwork to make this efficient
• Weight of an n-gram: occurrence count, each weighted by
“reliability” (weight) of rewrite that fetched the document
• Example: “Who created the character of Scrooge?”
– Dickens - 117
– Christmas Carol - 78
– Charles Dickens - 75
– Disney - 72
– Carl Banks - 54
– A Christmas - 41
– Christmas Carol - 45
– Uncle - 31
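A minimal sketch of the weighted counting: a plain counter stands in for the paper’s hash table, and the tokenizer is again an assumption.

```python
import re
from collections import Counter

def ngrams(text, n):
    toks = re.findall(r"[A-Za-z0-9]+", text)
    return [" ".join(toks[i:i + n]) for i in range(len(toks) - n + 1)]

def mine_ngrams(snippets_with_weights, max_n=3):
    """Count every 1-, 2-, and 3-gram over all snippets, adding the reliability
    weight of the rewrite that fetched each snippet instead of a plain
    occurrence count (a sketch of the hash-table counting step)."""
    scores = Counter()
    for snippet, weight in snippets_with_weights:
        for n in range(1, max_n + 1):
            for gram in ngrams(snippet, n):
                scores[gram] += weight
    return scores

# scores.most_common() might then start with entries like
# ('Dickens', 117), ('Christmas Carol', 78), ('Charles Dickens', 75), ...
```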
Step 4: Filtering N-Grams
• Each question type is associated with one or more “data-type filters” = regular expressions
– When… → Date
– Where… → Location
– What… / Who… → Person
• Boost score of n-grams that do match regexp
• Lower score of n-grams that don’t match regexp
• Details omitted from paper….
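Since the filters are not published, the sketch below uses two placeholder regular expressions and arbitrary boost/penalty factors purely to illustrate the mechanism.

```python
import re

# Hypothetical datatype filters; the paper omits the exact expressions.
DATATYPE_FILTERS = {
    "Date": re.compile(r"\b(1[0-9]{3}|20[0-9]{2})\b"),         # e.g. a 4-digit year
    "Person": re.compile(r"^([A-Z][a-z]+)(\s[A-Z][a-z]+)*$"),  # capitalized words
}

def filter_ngrams(scores, datatype, boost=2.0, penalty=0.5):
    """Boost n-grams that match the expected datatype's regex, demote the rest.
    The actual boost/penalty scheme is not given in the paper; these numbers
    are placeholders."""
    pattern = DATATYPE_FILTERS[datatype]
    return {gram: score * (boost if pattern.search(gram) else penalty)
            for gram, score in scores.items()}
```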
Step 5: Tiling the Answers
• Example: candidate n-grams “Charles Dickens” (score 20), “Dickens” (score 15), “Mr Charles” (score 10)
• Tile the highest-scoring n-gram: overlapping n-grams are merged and the old n-grams discarded, giving “Mr Charles Dickens” (score 45)
• Repeat, until no more overlap
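A greedy sketch of the tiling loop; the paper’s exact merging and scoring details differ, and “overlap” here simply means a shared word sequence at the boundary of two candidates.

```python
def overlap_merge(a, b):
    """Merge two candidates if one's end overlaps the other's start,
    e.g. "Mr Charles" + "Charles Dickens" -> "Mr Charles Dickens"; else None."""
    wa, wb = a.split(), b.split()
    for k in range(min(len(wa), len(wb)), 0, -1):
        if wa[-k:] == wb[:k]:
            return " ".join(wa + wb[k:])
        if wb[-k:] == wa[:k]:
            return " ".join(wb + wa[k:])
    return None

def tile(candidates):
    """Repeatedly take the highest-scoring n-gram, merge it with any overlapping
    candidate (summing their scores, discarding the originals), and stop when
    no overlaps remain.  Returns the best remaining answer string."""
    scores = dict(candidates)
    merged = True
    while merged and len(scores) > 1:
        merged = False
        best = max(scores, key=scores.get)
        for other in list(scores):
            combined = overlap_merge(best, other) if other != best else None
            if combined:
                scores[combined] = scores.pop(best) + scores.pop(other)
                merged = True
                break
    return max(scores, key=scores.get)

print(tile([("Charles Dickens", 20), ("Dickens", 15), ("Mr Charles", 10)]))
# prints 'Mr Charles Dickens'; its accumulated score is 45, as on the slide
```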
Experiments
• Used the TREC-9 standard query data set
• Standard performance metric: MRR (mean reciprocal rank)
– Systems give “top 5 answers”
– Score = 1/R, where R is rank of first right answer
– Rank 1: 1; rank 2: 0.5; rank 3: 0.33; rank 4: 0.25; rank 5: 0.2; rank 6+: 0
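For concreteness, a few lines computing MRR from the rank of the first correct answer for each question (None meaning no correct answer in the top 5):

```python
def mrr(first_correct_ranks):
    """Mean reciprocal rank over per-question ranks: each question contributes
    1/R for the rank R of its first correct answer within the top 5, and 0 if
    none of the five answers is correct (rank is None)."""
    def score(rank):
        return 1.0 / rank if rank is not None and rank <= 5 else 0.0
    return sum(score(r) for r in first_correct_ranks) / len(first_correct_ranks)

print(mrr([1, 3, None]))   # (1 + 0.33 + 0) / 3 ≈ 0.44
```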
Results
[summary]
• Standard TREC contest test-bed:
~1M documents; 900 questions
• E.g., “who is president of Bolivia”
• E.g., “what is the exchange rate between England and the
US”
• Technique doesn’t do too well (though would have
placed in top 9 of ~30 participants!)
– MRR = 0.262 (ie, right answer ranked about #4-#5)
– Why? Because it relies on the enormity of the Web!
• Using the Web as a whole, not just TREC’s 1M
documents… MRR = 0.42 (ie, on average, right
answer is ranked about #2-#3)
Example
• Question: what is the longest word in the English
language?
– Answer =
pneumonoultramicroscopicsilicovolcanokoniosis (!)
– Answers returned by AskMSR:
• 1: “1909 letters long”
• 2: the correct answer above
• 3: “screeched” (longest 1-syllable word in English)
Open Issues
• In many scenarios (eg, monitoring Bin Laden’s
email) we only have a small set of documents!
• Works best/only for “Trivial Pursuit”-style fact-based questions
• Limited/brittle repertoire of
– question categories
– answer data types/filters
– query rewriting rules