lecture16_question_answering_askMSR
Question-Answering via the Web: the
AskMSR System
Note: these viewgraphs were originally developed by
Professor Nick Kushmerick, University College Dublin,
Ireland. These copies are intended only for review use in ICS 278.
Question-Answering
• Users want answers, not documents
[Diagram: Databases → Information Retrieval → Information Extraction → Question Answering → Intelligent Personal Electronic Librarian]
• Active research over the past few years, coordinated by the US government’s “TREC” competitions
• Recent intense interest from security services
(“What is Bin Laden’s bank account number?”)
Question-Answering on the Web
• Web = a potentially enormous “data set” for data mining
– e.g., >8 billion Web pages indexed by Google
• Example: AskMSR Web question answering system
– “answer mining”
• Users pose relatively simple questions
– E.g., “who killed Abraham Lincoln”?
• Simple parsing used to reformulate as a “template answer”
• Search engine results used to find answers (redundancy helps)
• System is surprisingly accurate (on simple questions)
• Key contributor to system success is massive data (rather than better algorithms)
– References:
• Dumais et al, 2002: Web question answering: is more always better?
In Proceedings of SIGIR'02
AskMSR
• Web Question Answering: Is More Always Better?
– Dumais, Banko, Brill, Lin, Ng (Microsoft, MIT, Berkeley)
• Q: “Where is the Louvre located?”
• Want “Paris” or “France” or “75058 Paris Cedex 01” or a map
• Don’t just want URLs
Adapted from: COMP-4016 ~ Computer Science Department ~ University College Dublin ~ www.cs.ucd.ie/staff/nick ~ © Nicholas Kushmerick 2002
“Traditional” approach (Straw man?)
• Traditional deep natural-language processing approach
– Full parse of documents and question
– Rich knowledge of vocabulary, cause/effect, common sense, enables
sophisticated semantic analysis
• E.g., in principle this answers the “who killed Lincoln?” question from a sentence like:
“The non-Canadian, non-Mexican president of a North American country whose initials are AL and who was killed by John Wilkes Booth died ten revolutions of the earth around the sun after 1855.”
AskMSR: Shallow approach
• Just ignore those documents, and look for ones
like this instead:
AskMSR: Details
[Diagram: system architecture with the five numbered processing steps, described on the following slides]
Step 1: Rewrite queries
• Intuition: The user’s question is often syntactically
quite close to sentences that contain the answer
– Where is the Louvre Museum located?
– The Louvre Museum is located in Paris
– Who created the character of Scrooge?
– Charles Dickens created the character of Scrooge.
Query rewriting
• Classify question into seven categories
– Who is/was/are/were…?
– When is/did/will/are/were…?
– Where is/are/were…?
a. Category-specific transformation rules
eg “For Where questions, move ‘is’ to all possible locations”:
“Where is the Louvre Museum located” becomes
“is the Louvre Museum located”
“the is Louvre Museum located”
“the Louvre is Museum located”
“the Louvre Museum is located”
“the Louvre Museum located is”
Nonsense, but who cares? It’s only a few more queries to Google.
(Paper does not give full details!)
b. Expected answer “Datatype” (eg, Date, Person, Location, …)
“When was the French Revolution?” → DATE
• Hand-crafted classification/rewrite/datatype rules
(Could they be automatically learned?)
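The classification and datatype rules are hand-crafted and not published in full; below is a minimal, illustrative sketch of what such a classifier might look like in Python, matching only on the leading question word. The specific patterns and the classify helper are assumptions, not the system’s actual rules.

```python
import re

# Illustrative stand-ins for the hand-crafted rules: classify a question by
# its leading wh-word and attach the expected answer datatype.
CATEGORY_RULES = [
    (re.compile(r"^who\s+(is|was|are|were)\b", re.I), "who", "Person"),
    (re.compile(r"^when\s+(is|was|did|will|are|were)\b", re.I), "when", "Date"),
    (re.compile(r"^where\s+(is|was|are|were)\b", re.I), "where", "Location"),
]

def classify(question):
    """Return (category, expected answer datatype), or (None, None) if no rule fires."""
    for pattern, category, datatype in CATEGORY_RULES:
        if pattern.match(question):
            return category, datatype
    return None, None

print(classify("When was the French Revolution?"))     # ('when', 'Date')
print(classify("Where is the Louvre Museum located?"))  # ('where', 'Location')
```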
Query Rewriting - weights
• One wrinkle: Some query rewrites are more
reliable than others
Where is the Louvre Museum located?
– +“the Louvre Museum is located”  (weight 5: if we get a match, it’s probably right)
– +Louvre +Museum +located  (weight 1: lots of non-answers could come back too)
(A sketch of generating these weighted rewrites follows below.)
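A small sketch of the “move ‘is’ to all possible positions” rewrite for Where-is questions, tagging each exact-phrase rewrite with weight 5 and the bag-of-words backoff with weight 1 as on this slide. The paper does not publish the full rule set, so the code is illustrative only.

```python
def rewrite_where_is(question):
    """Sketch of the "move 'is' to all positions" rewrite for a "Where is X ...?" question.
    Exact-phrase rewrites get weight 5; the bag-of-words backoff gets weight 1
    (values from the slide above). Not the system's actual hand-crafted rules."""
    words = question.rstrip("?").split()
    rest = words[2:]                      # drop the leading "Where is"
    rewrites = []
    for i in range(len(rest) + 1):        # insert "is" at every possible position
        phrase = " ".join(rest[:i] + ["is"] + rest[i:])
        rewrites.append(('+"' + phrase + '"', 5))
    # Backoff: plain AND-query over the content words; much less reliable.
    rewrites.append((" ".join("+" + w for w in rest if w.lower() != "the"), 1))
    return rewrites

for query, weight in rewrite_where_is("Where is the Louvre Museum located?"):
    print(weight, query)
# 5 +"is the Louvre Museum located"
# ...
# 5 +"the Louvre Museum is located"
# 5 +"the Louvre Museum located is"
# 1 +Louvre +Museum +located
```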
Step 2: Query search engine
• Throw all rewrites to a Web-wide search engine
• Retrieve top N answers (100?)
• For speed, rely just on search engine’s “snippets”,
not the full text of the actual document
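A sketch of this step, assuming a hypothetical search_snippets(query, top_n) wrapper around whichever search-engine API is available; the paper does not describe a specific interface.

```python
def collect_snippets(rewrites, search_snippets, top_n=100):
    """For each (query, weight) rewrite, fetch the top-N result snippets and
    remember which rewrite (and hence which reliability weight) produced them.
    `search_snippets(query, top_n)` is a hypothetical search-engine wrapper;
    only snippets are used, never the full documents."""
    snippets = []
    for query, weight in rewrites:
        for snippet in search_snippets(query, top_n):
            snippets.append((snippet, weight))
    return snippets
```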
Step 3: Mining N-Grams
• Unigram, bigram, trigram, … N-gram:
list of N adjacent terms in a sequence
• Eg, “Web Question Answering: Is More Always Better”
– Unigrams: Web, Question, Answering, Is, More, Always, Better
– Bigrams: Web Question, Question Answering, Answering Is, Is More, More
Always, Always Better
– Trigrams: Web Question Answering, Question Answering Is, Answering Is More, Is More Always, More Always Better
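A few lines of Python make the enumeration concrete (the crude regex tokenizer is an assumption; the paper does not specify tokenization):

```python
import re

def ngrams(text, n):
    """Enumerate the n-grams (n adjacent terms) of `text`."""
    tokens = re.findall(r"[A-Za-z0-9]+", text)   # crude tokenizer, drops punctuation
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

title = "Web Question Answering: Is More Always Better"
print(ngrams(title, 3))
# ['Web Question Answering', 'Question Answering Is', 'Answering Is More',
#  'Is More Always', 'More Always Better']
```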
Mining N-Grams
• Simple: Enumerate all N-grams (N=1,2,3 say) in all
retrieved snippets
• Use hash table and other fancy footwork to make this efficient
• Weight of an n-gram: occurrence count, each weighted by
“reliability” (weight) of rewrite that fetched the document
• Example: “Who created the character of Scrooge?”
– Dickens - 117
– Christmas Carol - 78
– Charles Dickens - 75
– Disney - 72
– Carl Banks - 54
– A Christmas - 41
– Christmas Carol - 45
– Uncle - 31
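A minimal sketch of the weighted counting: a plain counter stands in for the paper’s hash table, and the tokenizer is again an assumption.

```python
import re
from collections import Counter

def ngrams(text, n):
    toks = re.findall(r"[A-Za-z0-9]+", text)
    return [" ".join(toks[i:i + n]) for i in range(len(toks) - n + 1)]

def mine_ngrams(snippets_with_weights, max_n=3):
    """Count every 1-, 2-, and 3-gram over all snippets, adding the reliability
    weight of the rewrite that fetched each snippet instead of a plain
    occurrence count (a sketch of the hash-table counting step)."""
    scores = Counter()
    for snippet, weight in snippets_with_weights:
        for n in range(1, max_n + 1):
            for gram in ngrams(snippet, n):
                scores[gram] += weight
    return scores

# scores.most_common() might then start with entries like
# ('Dickens', 117), ('Christmas Carol', 78), ('Charles Dickens', 75), ...
```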
Step 4: Filtering N-Grams
• Each question type is associated with one or more “data-type filters” = regular expressions
– When… → Date
– Where… → Location
– What… / Who… → Person
• Boost score of n-grams that do match regexp
• Lower score of n-grams that don’t match regexp
• Details omitted from paper….
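Since the filters are not published, the sketch below uses two placeholder regular expressions and arbitrary boost/penalty factors purely to illustrate the mechanism.

```python
import re

# Hypothetical datatype filters; the paper omits the exact expressions.
DATATYPE_FILTERS = {
    "Date": re.compile(r"\b(1[0-9]{3}|20[0-9]{2})\b"),         # e.g. a 4-digit year
    "Person": re.compile(r"^([A-Z][a-z]+)(\s[A-Z][a-z]+)*$"),  # capitalized words
}

def filter_ngrams(scores, datatype, boost=2.0, penalty=0.5):
    """Boost n-grams that match the expected datatype's regex, demote the rest.
    The actual boost/penalty scheme is not given in the paper; these numbers
    are placeholders."""
    pattern = DATATYPE_FILTERS[datatype]
    return {gram: score * (boost if pattern.search(gram) else penalty)
            for gram, score in scores.items()}
```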
Step 5: Tiling the Answers
• Example: candidate n-grams “Charles Dickens” (score 20), “Dickens” (score 15), “Mr Charles” (score 10)
• Tile the highest-scoring n-gram: overlapping n-grams are merged and the old n-grams discarded, giving “Mr Charles Dickens” (score 45)
• Repeat, until no more overlap
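A greedy sketch of the tiling loop; the paper’s exact merging and scoring details differ, and “overlap” here simply means a shared word sequence at the boundary of two candidates.

```python
def overlap_merge(a, b):
    """Merge two candidates if one's end overlaps the other's start,
    e.g. "Mr Charles" + "Charles Dickens" -> "Mr Charles Dickens"; else None."""
    wa, wb = a.split(), b.split()
    for k in range(min(len(wa), len(wb)), 0, -1):
        if wa[-k:] == wb[:k]:
            return " ".join(wa + wb[k:])
        if wb[-k:] == wa[:k]:
            return " ".join(wb + wa[k:])
    return None

def tile(candidates):
    """Repeatedly take the highest-scoring n-gram, merge it with any overlapping
    candidate (summing their scores, discarding the originals), and stop when
    no overlaps remain.  Returns the best remaining answer string."""
    scores = dict(candidates)
    merged = True
    while merged and len(scores) > 1:
        merged = False
        best = max(scores, key=scores.get)
        for other in list(scores):
            combined = overlap_merge(best, other) if other != best else None
            if combined:
                scores[combined] = scores.pop(best) + scores.pop(other)
                merged = True
                break
    return max(scores, key=scores.get)

print(tile([("Charles Dickens", 20), ("Dickens", 15), ("Mr Charles", 10)]))
# prints 'Mr Charles Dickens'; its accumulated score is 45, as on the slide
```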
Experiments
• Used the TREC-9 standard query data set
• Standard performance metric: MRR (mean reciprocal rank)
– Systems give “top 5 answers”
– Score = 1/R, where R is rank of first right answer
– Rank 1: 1; rank 2: 0.5; rank 3: 0.33; rank 4: 0.25; rank 5: 0.2; rank 6+: 0
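For concreteness, a few lines computing MRR from the rank of the first correct answer for each question (None meaning no correct answer in the top 5):

```python
def mrr(first_correct_ranks):
    """Mean reciprocal rank over per-question ranks: each question contributes
    1/R for the rank R of its first correct answer within the top 5, and 0 if
    none of the five answers is correct (rank is None)."""
    def score(rank):
        return 1.0 / rank if rank is not None and rank <= 5 else 0.0
    return sum(score(r) for r in first_correct_ranks) / len(first_correct_ranks)

print(mrr([1, 3, None]))   # (1 + 0.33 + 0) / 3 ≈ 0.44
```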
Results
[summary]
• Standard TREC contest test-bed:
~1M documents; 900 questions
• E.g., “who is president of Bolivia”
• E.g., “what is the exchange rate between England and the
US”
• Technique doesn’t do too well (though would have
placed in top 9 of ~30 participants!)
– MRR = 0.262 (ie, right answer ranked about #4-#5)
– Why? Because it relies on the enormity of the Web!
• Using the Web as a whole, not just TREC’s 1M
documents… MRR = 0.42 (ie, on average, right
answer is ranked about #2-#3)
Example
• Question: what is the longest word in the English
language?
– Answer =
pneumonoultramicroscopicsilicovolcanokoniosis (!)
– Answers returned by AskMSR:
• 1: “1909 letters long”
• 2: the correct answer above
• 3: “screeched” (longest 1-syllable word in English)
Open Issues
• In many scenarios (eg, monitoring Bin Laden’s
email) we only have a small set of documents!
• Works best/only for “Trivial Pursuit”-style fact-based questions
• Limited/brittle repertoire of
– question categories
– answer data types/filters
– query rewriting rules