Frame an IR Research Problem and Form Hypotheses
ChengXiang Zhai
Department of Computer Science
Graduate School of Library & Information Science
Institute for Genomic Biology, Statistics
University of Illinois, Urbana-Champaign
http://www-faculty.cs.uiuc.edu/~czhai, [email protected]
2008 © ChengXiang Zhai
Dragon Star Lecture at Beijing University, June 21-30, 2008
General Steps to Define a Research Problem
• Generate and Test
• Raise a question
• Novelty test: Figure out to what extent we know how to answer the question
  – There’s already an answer to it: Is the answer good enough?
    • Yes: not interesting, but can you make the question more challenging?
    • No: your research problem is how to get a better answer to the raised question
  – No obvious answer: you’ve got an interesting problem to work on
• Tractability test: Figure out whether the raised question can be answered
  – I can see a way to answer it or potentially answer it: you’ve got a solvable problem
  – I can’t easily see a way to answer it: Is it because the question is too hard or because you’ve not worked hard enough? Try to reframe the problem to make it easier
• Evaluation test: Can you obtain a data set and define measures to test solutions/answers?
  – Yes: you’ve got a clearly defined problem to work on
  – No: can you think of any way to indirectly test the solutions/answers? Can you reframe the problem to fit the data?
• Every time you reframe a problem, try to do all three tests again.
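The three tests above can be read as a small decision procedure. The following is only a sketch of that reading: the `Question` record, its field names, and the wording of the returned actions are all hypothetical and are not part of the lecture.

```python
from dataclasses import dataclass


@dataclass
class Question:
    """Answers to the three tests for one candidate research question."""
    text: str
    has_known_answer: bool          # novelty test
    known_answer_good_enough: bool  # only meaningful if has_known_answer
    can_see_solution_path: bool     # tractability test
    has_data_and_measures: bool     # evaluation test


def triage(q: Question) -> str:
    """Return the next action suggested by the three tests (hypothetical wording)."""
    # Novelty test: a good existing answer means the question is not interesting as is.
    if q.has_known_answer and q.known_answer_good_enough:
        return "reframe: make the question more challenging"
    # Tractability test: no visible path to an answer.
    if not q.can_see_solution_path:
        return "reframe: make the problem easier"
    # Evaluation test: no data set or measures to test answers.
    if not q.has_data_and_measures:
        return "reframe: find an indirect test or fit the problem to the data"
    if q.has_known_answer:
        return "work on it: get a better answer than the existing one"
    return "work on it: new, solvable, and testable problem"


if __name__ == "__main__":
    q = Question("Does feedback help short queries?", False, False, True, False)
    print(triage(q))
```

Any result that starts with "reframe" sends the question back through all three tests, which is the point of the last bullet.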
Rigorously Define Your Research Problem
• Exploratory: what is the scope of exploration? What is the goal of exploration? Can you rigorously answer these questions?
• Descriptive: what does it look like? How does it work? Can you formally define a principle?
• Evaluative: can you clearly state the assumptions about data collection? Can you rigorously define measures?
• Explanatory: how can you rigorously verify a cause?
• Predictive: can you rigorously define what prediction is to be made?
Frame a New Computation Task
• Define basic concepts
• Specify the input
• Specify the output
• Specify any preferences or constraints
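As one illustration of these four steps, the sketch below writes down a task specification for ad hoc retrieval in Python. Everything in it is an assumption made for the example: the `Document` and `Query` types, the `Ranker` signature, the particular constraints (at most `k` results, no duplicates, only known ids), and the toy term-overlap ranker, which is there only to show something that satisfies the specification.

```python
from dataclasses import dataclass
from typing import Callable, List


# Basic concepts (hypothetical names)
@dataclass(frozen=True)
class Document:
    doc_id: str
    text: str


@dataclass(frozen=True)
class Query:
    text: str


# Input: a query and a document collection. Output: a ranked list of doc ids.
Ranker = Callable[[Query, List[Document]], List[str]]


def satisfies_constraints(ranking: List[str], collection: List[Document], k: int) -> bool:
    """Constraints assumed here: at most k results, no duplicates, only known ids."""
    known = {d.doc_id for d in collection}
    return (
        len(ranking) <= k
        and len(set(ranking)) == len(ranking)
        and all(doc_id in known for doc_id in ranking)
    )


def overlap_ranker(query: Query, collection: List[Document]) -> List[str]:
    """Toy ranker: prefer documents sharing more distinct terms with the query."""
    q_terms = set(query.text.lower().split())

    def score(d: Document) -> int:
        return len(q_terms & set(d.text.lower().split()))

    ranked = sorted(collection, key=score, reverse=True)
    return [d.doc_id for d in ranked if score(d) > 0]


if __name__ == "__main__":
    docs = [
        Document("d1", "language models for retrieval"),
        Document("d2", "cooking pasta"),
        Document("d3", "retrieval evaluation with language models and smoothing"),
    ]
    ranking = overlap_ranker(Query("language models smoothing"), docs)
    print(ranking, satisfies_constraints(ranking, docs, k=10))
```

Writing the input, output, and constraints as types and a checker makes it easy to tell whether a proposed method actually addresses the task as framed.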
Map of IR Applications
[Figure: a map of IR applications. Recoverable labels:
Users: Kids, Lawyers, Scientists, Online Shoppers, Customer Service People, Peking Univ. community
Data: Web pages, News articles, Email messages, …, Organization docs, Literature, Legal docs/Patents, Medical records, Customer complaint letter/transcripts, Blog articles
Functions: Search, Browsing, Alert, Mining, Task/Decision support
Applications: “Google Kids”, Literature Assistant, Email management + automatic reply, Legal Info Systems, Local Web Service, Intranet Search, ?]
From a new application to a clearly defined research problem
• Try to picture a new system, thus clarify what new functionality is to be provided and what benefit you’ll bring to a user
• Among all the system modules, which are easy to build and which are challenging?
• Pick a challenge and try to formalize the challenge
  – What exactly would be the input?
  – What exactly would be the output?
• Is this challenge really a new challenge (not immediately clear how to solve it)?
  – Yes, your research problem is how to solve this new problem
  – No, it can be reduced to some known challenge: are existing methods sufficient?
    • Yes, not a good problem to work on
    • No, your research problem is how to extend/adapt existing methods to solve your new challenge
• Tuning the problem
Tuning the Problem
[Figure: a chart with axes “Level of Challenges” (Known to Unknown) and “Impact/Usefulness”. Labels on the chart: Make an easy problem harder; Make a hard problem easier; Increase impact (more general)]
Examples of Problem Formulation
• Risk minimization framework
• Study of smoothing
• Axiomatic retrieval framework
• Comparative Text Mining
• Contextual PLSA
• Opinion Integration
Form Research Hypotheses
• Typical hypotheses in IR:
  – Hypothesis about user characteristics (tested with user studies or user-log analysis, e.g., clickthrough bias)
  – Hypothesis about data characteristics (tested by fitting actual data, e.g., Zipf’s law)
  – Hypothesis about methods (tested with experiments):
    • Method A works (or doesn’t work) for task B under condition C by measure D (feasibility)
    • Method A performs better than method A’ for task B under condition C by measure D (comparative)
    • Introducing baselines naturally leads to hypotheses
• Carefully study existing literature to figure out where exactly you can make a new contribution (what do you want others to cite your work as?)
• The more specialized a hypothesis is, the more likely it’s new, but a narrow hypothesis has lower impact than a general one, so try to generalize as much as you can to increase impact
• But avoid over-generalizing (generalizations must be supported by your experiments)
• Tuning hypotheses (next lecture)
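A comparative hypothesis of the form "method A performs better than method A’ for task B under condition C by measure D" is usually checked on per-query scores. The sketch below shows one possible check, an exact paired sign-flip (randomization) test; the choice of this test, the function name, and the score values are assumptions made for illustration, not something the lecture prescribes.

```python
from itertools import product
from typing import Sequence


def paired_sign_flip_test(scores_a: Sequence[float], scores_b: Sequence[float]) -> float:
    """One-sided exact sign-flip test on paired per-query scores.

    Null hypothesis: the per-query differences (A minus A') are symmetric
    around zero. Returns the fraction of sign assignments whose summed
    difference is at least as large as the observed one.
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("scores must be paired query by query")
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    observed = sum(diffs)
    eps = 1e-12  # tolerance for floating-point rounding
    at_least = 0
    total = 0
    # Enumerates all 2^n sign assignments, so this is only practical for small n.
    for signs in product((1, -1), repeat=len(diffs)):
        total += 1
        if sum(s * d for s, d in zip(signs, diffs)) >= observed - eps:
            at_least += 1
    return at_least / total


if __name__ == "__main__":
    # Hypothetical average precision per query for method A and baseline A'.
    ap_a = [0.42, 0.55, 0.31, 0.60, 0.48]
    ap_b = [0.40, 0.50, 0.30, 0.52, 0.45]
    print(paired_sign_flip_test(ap_a, ap_b))  # 0.03125, i.e. 1/32
```

Here the task, condition, and measure are fixed by how the score lists were produced, so the p-value speaks only to that setting; for larger query sets one would sample sign assignments at random rather than enumerate them.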
Next Lecture (June 26): Test/Refine Hypotheses