7. Decision Trees and Decision Rules
國立雲林科技大學
National Yunlin University of Science and Technology
Probabilistic Model for Definitional Question Answering
Graduate: Chen, Shao-Pei
Authors: Kyoung-Soo Han, Young-In Song, and Hae-Chang Rim
SIGIR
Intelligent Database Systems Lab
N.Y.U.S.T.
I. M.
Outline
Motivation
Objective
Methodology
Experimental Results
Conclusion
Motivation
It is difficult to determine which information is useful for answering a
definitional question such as “What is NASA?”.
A short passage cannot answer a definitional question, because a
definition needs several essential pieces of information about the target.
Objective
We propose a formal model for definitional QA that takes the
characteristics of definitional questions into account.
We model definitional QA from two points of view: topic and definition.
What is NASA?
S1: NASA is the agency responsible for the public space program of the USA.
S2: NASA was established in 1958.
S3: The headquarters of NASA is located in Washington, D.C.
S4: NASA announced the new annual budget.
S5: John who works for NASA gave a housewarming party yesterday.
S6: Ji-Sung Park is a famous football player from South Korea.
{S1,S2,S3,S4} are the topic sentences
{S1,S2,S3,S6} are the definitional sentences
{S1,S2,S3} is the answer to the question.
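In this example, the answer is exactly the set of sentences that are both topic sentences and definitional sentences. A minimal sketch of that relationship, using the sentence labels above (the variable names are illustrative only):

```python
# Sentence labels taken from the NASA example above.
topic_sentences = {"S1", "S2", "S3", "S4"}
definitional_sentences = {"S1", "S2", "S3", "S6"}

# A sentence belongs to the answer only if it is both on-topic and definitional.
answer = topic_sentences & definitional_sentences

print(sorted(answer))  # ['S1', 'S2', 'S3']
```

S4 is dropped because it is about NASA but not definitional, and S6 is dropped because it is definitional but not about NASA.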
Definitional question answering system based on the probabilistic model
Question: What is NASA?
Question target: NASA
Documents relevant to the question target are retrieved, and answer
candidates are extracted from the retrieved documents.
S1: NASA is the agency responsible for the public space program of the USA.
….
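The retrieval and candidate extraction step might be sketched as follows. This is a toy illustration only: the in-memory corpus, the substring match standing in for document retrieval, the naive sentence splitter, and the helper name `extract_candidates` are all assumptions, not the system's actual components.

```python
import re


def extract_candidates(target, documents):
    """Return all sentences of the documents that mention the target (toy version)."""
    candidates = []
    for doc in documents:
        # "Retrieval": keep only documents that mention the question target.
        if target.lower() not in doc.lower():
            continue
        # Candidate extraction: naive split after sentence-ending punctuation.
        for sentence in re.split(r"(?<=[.!?])\s+", doc.strip()):
            if sentence:
                candidates.append(sentence)
    return candidates


# Hypothetical two-document corpus built from the example sentences.
corpus = [
    "NASA was established in 1958. NASA announced the new annual budget.",
    "Ji-Sung Park is a famous football player from South Korea.",
]
candidates = extract_candidates("NASA", corpus)
print(candidates)
```

Every sentence of a retrieved document becomes a candidate here; deciding which candidates are topical and definitional is left to the probabilistic model.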
Methodology
General Language Model
Dirichlet smoothing
Topic Language Model
Definition Language Model
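The slide lists the model components without their formulas. As a hedged illustration of the Dirichlet smoothing component only, the sketch below uses the standard Dirichlet-smoothed unigram estimate, P(w|d) = (c(w,d) + μ·P(w|C)) / (|d| + μ), where c(w,d) is the count of w in d, |d| is the length of d, and P(w|C) is the background collection probability. The function name, the toy texts, and the value of μ are assumptions for illustration, not the authors' settings, and the sketch does not reproduce how the general, topic, and definition models are combined.

```python
from collections import Counter


def dirichlet_smoothed_prob(word, doc_tokens, collection_tokens, mu=2000.0):
    """Standard Dirichlet-smoothed unigram probability P(word | doc).

    mu is the smoothing parameter (an assumed value, not from the slides).
    """
    doc_counts = Counter(doc_tokens)
    coll_counts = Counter(collection_tokens)
    # Background (collection) probability of the word.
    p_coll = coll_counts[word] / len(collection_tokens)
    return (doc_counts[word] + mu * p_coll) / (len(doc_tokens) + mu)


# Toy data (hypothetical): one short "document" and a small background collection.
doc = "nasa is the agency responsible for the public space program".split()
collection = doc + "nasa announced the new annual budget".split()

p_seen = dirichlet_smoothed_prob("agency", doc, collection, mu=10.0)
p_unseen = dirichlet_smoothed_prob("budget", doc, collection, mu=10.0)
print(p_seen, p_unseen)
```

A word that never occurs in the document ("budget") still receives a non-zero probability from the background collection, which is the purpose of smoothing.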
Experimental Results
The external definitions have almost no noise, and the news articles
are generally less noisy than web pages.
The large difference in the term distribution explains why the system
that relies heavily on the definition type performs so well.
The result is slightly underestimated for the TREC 2004 questions
because our system does not consider other types of questions.
Conclusion
The proposed model can be easily extended to other descriptive QA.
In future work, we will estimate the probabilities of the language
models using more context.