Watson: The Jeopardy! Machine
Robin Sturm
Who is Watson?
This is Watson!
(His “face” at least)
This is also Watson
What is Watson?
• A computer developed by IBM to play
Jeopardy!
• Watson was IBM’s next “Grand Challenge”
after Deep Blue defeated Garry Kasparov in
a chess match in 1997.
Watson’s Challenge
• Winning Jeopardy was associated with
intelligence.
• Competing in Jeopardy required mastery of
natural language processing, something
computers had struggled with for a while.
• Jeopardy would go further than simple math
and require much more.
• At the same time, mastery of question
answering would reach far beyond Jeopardy.
Why Jeopardy?
• Jeopardy would be the platform to develop
question-answering abilities.
• Jeopardy questions are unlike many others:
their linguistic nuances, puns, and
allusions make them extremely complex.
• The game also incorporates timing and strategy.
What is Jeopardy?
• Trivia-based game show.
• 2 rounds of 30 questions each, divided into 6
categories
– 1 Daily Double in round 1, 2 in round 2
• Open-domain questions (any topic).
• Contestants answer by ringing in with a buzzer.
• Gain money for correct answers, lose money for
incorrect ones.
• A Final Jeopardy round, with wagering, ends the game.
AI required for Jeopardy
• Natural Language Processing: determining
what a question is asking for, what kind of
answer is required, and what the key words
are.
• Information Retrieval: searching through all
of the stored data and finding potential
answers.
Watson Development
• Piquant: IBM’s early question-answering
system. It performed well below the threshold
needed for Jeopardy.
• It later became Blue J as it was developed to
be better equipped to play Jeopardy.
• Watson was continuously updated and
improved to answer specific types of
questions better.
Progress Tracking
How would you answer this question?
ON SEPT. 1, 1715 LOUIS XIV DIED
IN THIS CITY, SITE OF A
FABULOUS PALACE HE BUILT.
Is it easier to answer the question if you
are choosing from a list of options?
• Paris, Athens, London, Versailles, Berlin, Milan,
Vienna.
How to solve the problem
Humans
• Breaking down question
– Typically a brief process for
humans
• Intuition
– You know it, or you don’t.
• Searching through memory.
– Think back to books,
magazines, classes…
• Analyze confidence.
– Is it a wild guess?
– Will incorrectness cost you
dearly?
Computer
• Breaking down question
– Complicated task, several
possibilities.
• No intuition
• Search through
documents.
– Parallel computing
– Several answer possibilities
• Analyze confidence.
– Which of the possible
answers is the best?
WATSON’S STEPS TO ANSWER A QUESTION…
Question Analysis
• Parse the question.
– Break down into parts of
speech
– Find the key words.
– Understand what the
question is asking for.
• Sends out queries.
– Several possible
interpretations of the
question.
– Each one aims to find its
own answer.
• Question classification
– Identify the type of question.
• Focus and LAT detection
– Find the blank to fill in (the focus, often
“this ___”) and its lexical answer type
(LAT), e.g. “city” (a sketch follows this list).
• Relation detection
• Decomposition
– Two-part questions that can be better
solved in parts.
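A rough illustration of focus and LAT detection, as mentioned above. This is a hypothetical sketch, not Watson’s actual code; the real system worked over a full syntactic parse, but a pattern-based pass captures the idea for the common “this ___” construction:

```python
import re

def find_focus_and_lat(clue: str):
    """Toy focus/LAT finder: Jeopardy clues often phrase the blank as
    'this <noun>', e.g. 'this city'. The noun is the lexical answer type."""
    match = re.search(r"\bthis\s+(\w+)", clue.lower())
    if match:
        lat = match.group(1)        # e.g. "city"
        return "this " + lat, lat   # (focus phrase, LAT)
    return None, None

clue = ("ON SEPT. 1, 1715 LOUIS XIV DIED IN THIS CITY, "
        "SITE OF A FABULOUS PALACE HE BUILT.")
print(find_focus_and_lat(clue))     # ('this city', 'city')
```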
Using Predictive Annotation
• Mark up question with category types.
• Works well for who, what, where, and when
questions (a minimal sketch follows).
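A minimal sketch of the predictive-annotation idea (the mapping below is illustrative, not the full scheme): the question word predicts an answer-type tag, and because the corpus was annotated with the same tags at indexing time, retrieval can require a matching tag in the passage:

```python
# Hypothetical wh-word -> answer-type mapping; the real systems used many
# more categories (Prager et al.'s "QA-Tokens").
WH_TO_TYPE = {"who": "PERSON", "where": "PLACE",
              "when": "DATE", "what": "THING"}

def predicted_answer_type(question: str) -> str:
    first_word = question.lower().split()[0]
    return WH_TO_TYPE.get(first_word, "UNKNOWN")

print(predicted_answer_type("Where did Louis XIV die?"))      # PLACE
print(predicted_answer_type("Who defeated Garry Kasparov?"))  # PERSON
```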
Hypothesis Generation
• Each interpretation of
the question is sent off
to find possible answers.
• Aims to propose many
possibilities.
– The more possibilities,
the better.
• Different algorithms
find different types of
answers.
• Several search
techniques (a toy
sketch follows)
– Document search
(keyword)
– Passage search
– Knowledge base search
(database)
• Candidate answers
– 85% of the time the
correct answer is in the
top 250 original
candidates.
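To make the flow concrete, here is a toy sketch of hypothesis generation; every function and result below is an illustrative stand-in, not one of Watson’s actual components. Several independent search strategies each propose candidates, and their union becomes the pool sent on for scoring:

```python
def document_search(query: str) -> set:
    return {"Versailles", "Paris"}      # stand-in: keyword search over documents

def passage_search(query: str) -> set:
    return {"Versailles", "Louis XIV"}  # stand-in: short supporting passages

def knowledge_base_search(query: str) -> set:
    return {"Versailles"}               # stand-in: structured database lookup

def generate_candidates(query: str) -> set:
    candidates = set()
    for search in (document_search, passage_search, knowledge_base_search):
        candidates |= search(query)     # in Watson these ran in parallel
    return candidates                   # keep everything; recall matters most here

print(generate_candidates("Louis XIV died in this city"))
```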
Hypothesis/Evidence Scoring
• Eliminates answers that
are obviously wrong
• Finds passages that may
support a certain
answer.
– Positive and negative
scoring based on content
and context of passages
• Algorithms in parallel
score all of the possible
answers.
• Several scoring
algorithms are used
(the first is sketched
below)
– Counting the number of
IDF-weighted terms in
common
– Measuring the length of
the longest similar
subsequences
– Measuring alignment of
logical forms
(grammatical
relationships)
– Geospatial and temporal
reasoning.
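The first scorer listed above is simple enough to sketch. A minimal version of IDF-weighted term overlap, using a toy three-document corpus: terms shared between the question and a passage count more when they are rare across the corpus.

```python
import math
from collections import Counter

corpus = [
    "louis xiv built the palace of versailles",
    "louis xiv died at versailles in 1715",
    "paris is the capital of france",
]
df = Counter(t for doc in corpus for t in set(doc.split()))  # document frequency
N = len(corpus)

def idf(term: str) -> float:
    return math.log((N + 1) / (df[term] + 1))   # smoothed inverse doc frequency

def overlap_score(question: str, passage: str) -> float:
    shared = set(question.split()) & set(passage.split())
    return sum(idf(t) for t in shared)          # rare shared terms score highest

q = "in 1715 louis xiv died in this city site of a fabulous palace he built"
for doc in corpus:
    print(f"{overlap_score(q, doc):5.2f}  {doc}")  # the death passage wins
```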
Final Merging and Ranking
• Incorporates experience from prior questions.
• Able to weight the algorithms it ran to
determine the significance of each piece of evidence.
• Calculates confidence in the possible answers
it came up with.
• A certain level of confidence is necessary to
answer the question (the threshold changes based
on the game); a sketch of the merging step follows.
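A hedged sketch of the merging step described above. The weights and scores are invented for illustration; the real system learned its weights from prior questions, and the logistic combination here is just one plausible form:

```python
import math

WEIGHTS = {"idf_overlap": 1.2, "passage_support": 0.8, "type_match": 2.0}
BIAS = -3.0   # invented values standing in for learned parameters

def confidence(scores: dict) -> float:
    z = BIAS + sum(WEIGHTS[k] * v for k, v in scores.items())
    return 1 / (1 + math.exp(-z))      # squash combined evidence into (0, 1)

candidates = {
    "Versailles": {"idf_overlap": 2.1, "passage_support": 1.5, "type_match": 1.0},
    "Paris":      {"idf_overlap": 1.3, "passage_support": 0.4, "type_match": 1.0},
}
best = max(candidates, key=lambda c: confidence(candidates[c]))
THRESHOLD = 0.5                        # shifted with the game state in practice
if confidence(candidates[best]) >= THRESHOLD:
    print(f"What is {best}? (confidence {confidence(candidates[best]):.2f})")
```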
DeepQA Approach
• “[A] Massively parallel
probabilistic evidence-based
architecture.”
• More than 100 techniques for
analyzing clues, finding
sources, generating
hypotheses, finding
and scoring evidence,
and merging and
ranking hypotheses.
• Principles
– Massive parallelism
– Many experts
– Pervasive confidence
estimation
– Integrate shallow and
deep knowledge
WHAT COMPRISES WATSON?
Computing Power Behind Watson
• 90 IBM Power 750 servers
– Each holding four 3.5 GHz POWER7 eight-core
processors, with four threads per core
• 2,880 POWER7 processor cores
(90 servers × 4 processors × 8 cores)
– Able to perform “massively parallel” tasks
• 16 terabytes of RAM
• Processes data at 500 gigabytes per second
• 80 teraFLOPS
The Data Inside Watson
• Roughly 4 terabytes of information.
– Entirely text-based, no pictures/audio/video.
• Included dictionaries, books, textbooks,
encyclopedias, news articles, and all of
Wikipedia.
• Some of the data was structured (databases),
but a lot was unstructured or semi-structured.
• Divided into clusters and tagged for
usefulness.
LEADING UP TO THE GAME
Watson developed its strategy through
sparring matches against past
Jeopardy players.
Daily Double Betting
• Betting strategy depends on the stage of the
game and the opponents’ scores (a toy heuristic follows).
• First round: catch up to opponents if behind; fairly
conservative if ahead.
• Second round: more aggressive, to pull ahead.
• End of second round: strategic bets to
maintain a lead (if any).
– In one sparring match Watson bet $100 when leading
$27,500 to $8,200 and $4,600.
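A toy heuristic in the spirit of the wagering strategy above; the cutoffs and amounts are invented, not Watson’s actual model:

```python
def daily_double_wager(my_score: int, best_opponent: int, round_no: int) -> int:
    if my_score > 2 * best_opponent:
        return 100                      # lead is safe; risk almost nothing
    if my_score < best_opponent:
        return min(my_score, best_opponent - my_score + 200)  # try to catch up
    if round_no == 2:
        return my_score // 2            # aggressive push to pull ahead
    return min(1000, my_score)          # fairly conservative in round 1

print(daily_double_wager(27_500, 8_200, 2))  # 100, echoing the sparring match above
print(daily_double_wager(5_000, 9_000, 1))   # 4200: a catch-up bet
```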
Learning Throughout the Game
• Sometimes Watson doesn’t fully grasp what
a category is asking for; it learns during the
game from the previous answers.
Final Jeopardy Betting
• Watson judges its score against the others’
to determine what wager is needed to win
(a sketch of the leader’s calculation follows).
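For the leader, the standard Final Jeopardy calculation is simple to sketch (illustrative, not Watson’s exact model): wager just enough that a correct answer beats the runner-up even if the runner-up doubles their score.

```python
def leader_wager(my_score: int, runner_up: int) -> int:
    needed = 2 * runner_up + 1 - my_score   # beat the runner-up's doubled score
    return max(0, min(needed, my_score))    # can't wager more than you have

print(leader_wager(23_000, 18_200))  # 13401: covers a doubled 18,200
print(leader_wager(40_000, 15_000))  # 0: the lead is already insurmountable
```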
Watson’s Answering Strategy
• A person won’t answer a question if they don’t
know enough; neither will Watson.
• Watson determines this through
confidence scoring.
Using the Buzzer
• Although Watson was quite quick, it was
sometimes not quick enough to hit the buzzer
first.
• Humans were able to anticipate when the
buzzer would be activated by listening to
the host read the clue.
• While humans and Watson both thought
about the question as it was asked, humans
process it differently.
• Exemplar categories: “Celebrities’ Middle
Names,” “Actors Who Direct.”
Buzz Threshold
• The programmers added a calculation to
Watson that incorporated confidence
into the decision of whether to buzz in.
• This took the state of the game into account
(sketched below).
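A hedged sketch of such a buzz decision; the baseline and adjustments are invented for illustration:

```python
def should_buzz(confidence: float, my_score: int, best_opponent: int,
                clues_left: int) -> bool:
    threshold = 0.50                        # baseline confidence needed to buzz
    if my_score < best_opponent and clues_left < 5:
        threshold -= 0.15                   # behind late in the game: take risks
    elif my_score > 2 * best_opponent:
        threshold += 0.10                   # safe lead: only buzz when quite sure
    return confidence >= threshold

print(should_buzz(0.40, 8_000, 20_000, 3))    # True: behind late, lower bar
print(should_buzz(0.55, 30_000, 12_000, 10))  # False: safe lead raises the bar
```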
Pop Culture
• Questions about books, history, or established
facts stay fairly constant over time.
• For pop culture questions (as well as current
events), Watson was updated with new
information.
• Ex: “Not to be confused with Lady Gaga is
Lady this, the country music group with the
CD ‘Need You Now’”
Issues between Jeopardy and IBM
• Jeopardy originally feared IBM would use
the show as a publicity stunt.
• IBM execs were concerned the clues would
be skewed in favor of the humans.
• After many practice rounds of ringing in
electronically, Watson was required to
physically press a button.
• There was also concern over a computer advantage.
THE GRAND CHALLENGE
Strengths and Weaknesses
Strengths
• No pressure.
– As a computer, did not feel
emotions.
– On game day when the
other players and even the
programmers were nervous,
Watson felt nothing.
• Precise algorithms.
• Played a smart strategy.
– Looked for Daily Doubles
– Bet smart
Weaknesses
• Lacked instinct.
– There’s nothing Watson can
automatically know off the
top of its head.
• Lacked the ability to anticipate
the buzzer.
– A strength champions had
developed through listening.
• Can’t learn from a
competitor’s mistake.
– Repeated incorrect responses.
Watson’s Stage Presence
• Voice composed of recordings by actor Jeff
Woodman
• Avatar designed by Joshua Davis
• Features IBM “Smarter Planet” logo and 42
threads circling the planet.
• Designed to reflect moods based on the game.
Watson’s Competition
Ken Jennings
Brad Rutter
The Game
First Match
• Made a few mistakes
during the normal rounds.
• Gave an incorrect Final
Jeopardy answer.
• Jennings finished with $4,800,
Rutter with $10,400, and
Watson with $35,734.
Second Match
• A fairly tight game.
• The humans pulled ahead in
a few categories by virtue of
timing.
• At one point Jennings was
leading by a couple thousand
dollars.
• Watson hit the last Daily
Double to pull ahead.
• Preserved its lead in Final
Jeopardy.
Final outcome: Watson won the exhibition with
$77,147 to Jennings’ $24,000 and Rutter’s $21,600.
Future of Watson
• Healthcare
– Use in diagnostics and treatment suggestion.
• Businesses
– Customer service
– Any situation in which it is necessary to answer
a question by parsing through information
• Could be fine-tuned to fit a field.
Bibliography
Brain, Marshall. "How was IBM’s Watson computer able to answer the questions on Jeopardy? How did the technology work? How might it be used?" HowStuffWorks. HowStuffWorks, 18 Feb. 2011. Web. 6 July 2012. <http://blogs.howstuffworks.com/2011/02/18/how-was-ibms-watson-computer-able-to-answer-the-questions-on-jeopardy-how-did-the-technology-work-how-might-it-be-used/>.
"The DeepQA Project." IBM. IBM, n.d. Web. 6 July 2012. <http://www.research.ibm.com/deepqa/deepqa.shtml>.
"FAQs." IBM. IBM, n.d. Web. 6 July 2012. <http://www.research.ibm.com/deepqa/faq.shtml>.
Ferrucci, David, et al. "Building Watson: An Overview of the DeepQA Project." AI Magazine Fall 2010: n. pag. AI Magazine. Web. 6 July 2012. <http://www.aaai.org/Magazine/Watson/watson.php>.
Hale, Mike. "Actors and Their Roles for $300, HAL? HAL!" New York Times [New York] 8 Feb. 2011: n. pag. Web. 6 July 2012. <http://www.nytimes.com/2011/02/09/arts/television/09nova.html?_r=1>.
IBM Watson Team. Interview. The Reddit Blog. reddit, 23 Feb. 2011. Web. 6 July 2012. <http://blog.reddit.com/2011/02/ibm-watson-research-team-answers-your.html>.
Jackson, Joab. "IBM Watson Vanquishes Human Jeopardy Foes." PCWorld. IDG Consumer & SMB, 16 Feb. 2011. Web. 6 July 2012. <http://www.pcworld.com/businesscenter/article/219893/ibm_watson_vanquishes_human_jeopardy_foes.html>.
Loftus, Jack, ed. Gizmodo. Gawker Media, 26 Apr. 2009. Web. 6 July 2012. <http://gizmodo.com/5228887/ibm-prepping-soul+crushing-watson-computer-to-compete-on-jeopardy>.
Markoff, John. "Computer Program to Take On ‘Jeopardy!’." New York Times [New York] 26 Apr. 2009: n. pag. Web. 6 July 2012. <http://www.nytimes.com/2009/04/27/technology/27jeopardy.html>.
Pearson, Tony. "Inside System Storage -- by Tony Pearson." developerWorks. IBM, 18 Feb. 2011. Web. 6 July 2012. <https://www.ibm.com/developerworks/mydeveloperworks/blogs/InsideSystemStorage/entry/ibm_watson_how_to_build_your_own_watson_jr_in_your_basement7?lang=en>.
Prager, John, et al. Question-answering by Predictive Annotation. Technical rept. New York: ACM, 2000. ACM Digital Library. Web. 6 July 2012. <http://dl.acm.org/citation.cfm?id=345574>.
Radev, Dragomir R., John Prager, and Valerie Samn. Ranking Suspected Answers to Natural Language Questions Using Predictive Annotation. Technical rept. N.p.: n.p., 2000. ACM Digital Library. Web. 6 July 2012. <http://dl.acm.org/citation.cfm?doid=974147.974168>.
"The Research Team." IBMWatson. IBM, n.d. Web. 6 July 2012. <http://www-03.ibm.com/innovation/us/watson/research-team/algorithms.html>.
"Show #6086 - Monday, February 14, 2011." J! Archive. N.p., n.d. Web. 6 July 2012. <http://www.j-archive.com/showgame.php?game_id=3575>.
Silverman, Matt. "Engineering Intelligence: Why IBM’s Jeopardy-Playing Computer Is so Important." Mashable Tech. Mashable, 11 Feb. 2011. Web. 6 July 2012. <http://mashable.com/2011/02/11/ibm-watson-jeopardy/>.
Singh, Tarandeep. "Artificial Intelligence Algorithm behind IBM Watson." Geeknizer. Geeknizer, 16 Feb. 2011. Web. 6 July 2012. <http://geeknizer.com/artificial-intelligence-algorithm-behind-ibm-watson/>.
"Watson (computer)." Wikipedia. Wikipedia. Web. 6 July 2012. <http://en.wikipedia.org/wiki/Watson_%28computer%29>.
Zimmer, Ben. "Is It Time to Welcome Our New Computer Overlords?" The Atlantic. Atlantic Monthly Group, 17 Feb. 2011. Web. 6 July 2012. <http://www.theatlantic.com/technology/archive/2011/02/is-it-time-to-welcome-our-new-computer-overlords/71388/>.