guir-summer00 - University of California, Berkeley

Incorporating Metadata into Search UIs
Marti Hearst and Ame Elliott
GUIR
Summer 2000
Outline



What’s wrong with search?
The Simplicity / Flexibility tradeoff
Task-oriented specialization


Vortals
Metadata-based Previews
What’s right with (Web) search?


Easy to get to site home pages
Automatic category suggestion



Disambiguates terms
Isolates home pages of sites
Suggests related information
What’s wrong with (Web) search?





Too many results
Wrong meanings for words
Difficult to express complex ideas / goals
Doesn’t help on the sites themselves
Doesn’t answer questions





Find campsite availability at a particular park
Pros and cons of tamoxifen for cancer treatment
Find a perfect wooden chest for your niece’s bedroom
Find prior art for this patented idea
Find a good ophthalmologist in your area
The future of search:
A Dichotomy

Information Intensive



Business analysis
Scientific research
Planning & design

Quick lookup


Question answering
Location-based info


Restaurants
Local history
Next generation search interfaces

More specialized in terms of




Tasks
Collections
Interfaces
Improved Technologies



Question-answering
Categorization
Information previews
The Simplicity / Flexibility Tradeoff
[Figure labels: wizard, hyperlinks, text search]
Variations in Flexibility
[Figure: spreadsheet, standard GUI, wizard, hypertext, and standard search plotted by choice of operators/combinations vs. choice of input values]
Flexibility Differences

Standard GUIs




Many operations
Restricted order of operations
Task-centric
Completion matters

Hypertext




One operation (link)
Operation order unrestricted
Information-centric
No natural stopping point
Spreadsheets

Highly flexible



Several operators
Many orders to use & combine them in
What gets used? (Nardi 93)


Most people learn a very limited subset of operations and use these in stereotyped ways
Most groups depend on local experts
Standard Search


Few operators (less flexible)
Many many input values (more flexible)
How to Control Flexibility?
Focus on the task.
The Importance of the Task
Results from HCI suggest the importance of taking the task into account.
Searching patent databases → Proving non-infringement
Browsing newsgroups → Finding the denial-of-service hacker
Getting all satellite news → Anticipating the competition
The Importance of the Task


Example: How does Web page download time affect usability?
In one study (56kbit modem), Spool found:
Amazon: 36 sec/page (avg)
About.com: 8 sec/page (avg)
Users rated the sites:
Fastest: Amazon
Slowest: About.com
Why?
The Importance of the Task

Perceived speed


Strong correlation between perceived speed and whether the users felt they completed their task
Strong correlation between perceived speed and whether the users felt they always knew what to do next (scent)
How to Incorporate the Task?

Goal:


Look at a work practice. Restrict the search
method to support this task.
Two Mechanisms using Metadata


Restrict collection: Vortals
Restrict suggested next steps: Previews
Metadata types
GeoRegion + Time/Date + Topic + Role
Restrict the Collection: Vortals
As Web Grows, Search Degrades
Solution: Specialize the collection
(Vortal = Vertical Portal)
Reduces ambiguity of query word usage
Eliminates irrelevant information in advance
Allows for customization / personalization

Vortal Example: FindLaw

A vertical slice through legal text
[Diagram levels: WWW, Industry, Intranet, Desktop]
Cascading priority based on locality of information
Specific slice through the data: analyst vs. salesperson, or legal vs. medical
Slice again based on task, e.g., research vs. reporting
A simpler example (FindLaw)


Only one topic – law
Many different legal sources
Slicing by Topic Only


Generic search interface not enough
No support for legal tasks


Find prior art for patent infringement case
Find weaknesses in the application of intellectual property law in the 6th Circuit Court of Appeals
Rather than search as usual across an
intersection of metadata types …
Information Previews: where to go next
Task-Specific Preview Combinations
A Simple Example
Yahoo restaurant guide combines:



Region
Topic (restaurants) + Attributes (cuisine)
Related Information
Other attributes (ratings)
Other topics related in place and time (movies)

Yellow: geographic region
Green: restaurants & attributes
Red: related in place & time
Combining Information Types

Region > State > City > A&E
A&E: Film, Theatre, Music, Restaurants
Restaurants: California, Eclectic, Indian, French
Assumed task: looking for evening entertainment
Other Possible Combinations






Region + A&E
City + Restaurant + Movies
City + Weather
City + Education: Schools
Restaurants + Schools
…
Bookstore preview combinations



topic + related topics
topic + publications by same author
topic + books of same type but related topic
Pre-defined Sources



Decide in advance which collections to show results from
Places search results in context
Problem: the same metadata is used for all queries
Information previews

Use the metadata to show where to go next




More flexible than canned hyperlinks
Less complex than full search
Help users see and return to what happened previously
Reduces mental work


Recognition over recall
Suggest alternatives
The Importance of Informative Previews

Jared Spool’s studies (www.uie.com)

More clicks are OK if:
The “scent” of the target does not weaken
Users feel they are going toward, rather than away from, their target
The Importance of Informative Previews

How to indicate “scent”?




Information organization reflects tasks
Longer, more descriptive links
Show category subtopic information
Breadth vs. depth tradeoffs



CNN categories (more scrolling) vs. Yahoo’s (more clicking)
Menu studies
Larson & Czerwinski study: intermediate breadth vs. depth generally best
Problem with Previews

Standard approaches




Hand edited, predefined
Not tailored to task as it develops
Not personalized
Not dynamic
A new project: FLAMENCO
FLexible Access using MEtadata in Novel COmbinations

Main idea:


Preview and postview information
Determined dynamically and (semi)automatically, based on current task
Flamenco: Dynamic Previews

Medical example


Allow user to select metadata in any order
At each step, show different types of relevant metadata, based on prior steps and personal history; include # of documents


Previews restricted to only those metadata types that might be helpful
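One way to picture the previews described on this slide is as counts computed over the documents that match the selections made so far. The following is a minimal Python sketch under stated assumptions, not Flamenco's actual implementation: the toy documents, the facet names (`disease`, `drug`, `aspect`), and the functions `matching` and `preview_counts` are all hypothetical.

```python
from collections import Counter

# Hypothetical toy collection: each document carries a set of values per facet.
DOCS = [
    {"title": "Doc A", "disease": {"Asthma"}, "drug": {"Budesonide"}, "aspect": {"Admin & Dosage"}},
    {"title": "Doc B", "disease": {"Asthma"}, "drug": {"Prednisone"}, "aspect": {"Drug Effects"}},
    {"title": "Doc C", "disease": {"Asthma"}, "drug": {"Prednisone"}, "aspect": {"Admin & Dosage"}},
    {"title": "Doc D", "disease": {"Diabetes"}, "drug": {"Prednisone"}, "aspect": {"Risk Factors"}},
]

def matching(docs, selections):
    """Return the documents that carry every selected (facet, value) pair."""
    return [d for d in docs if all(v in d.get(f, set()) for f, v in selections)]

def preview_counts(docs, selections):
    """For each facet, count the values still reachable from the current
    result set, skipping values the user has already selected."""
    chosen = set(selections)
    counts = {}
    for d in matching(docs, selections):
        for facet, values in d.items():
            if facet == "title":
                continue
            for v in values:
                if (facet, v) not in chosen:
                    counts.setdefault(facet, Counter())[v] += 1
    return counts

# After the user selects Asthma, preview what can be chosen next.
previews = preview_counts(DOCS, [("disease", "Asthma")])
for facet, values in previews.items():
    for value, n in values.most_common():
        print(f"{facet}: {value} ({n})")
```

Because the counts are recomputed from the current result set, the sketch never offers a value that would lead to zero documents. Restricting previews by prior steps and personal history, as the slide proposes, would need an additional filter that this sketch leaves out.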
Asthma > Steroids
1. A steroid-induced acute psychosis in a child with asthma.
2. Management of steroid-dependent asthma with methotrexate.
Steroids
•Pregnanes
• Pregnadienes (5)
• Prednisone (5)
• Pregnenes
• Budesonide (4)
• Corticosterone (3)
Other Views
• Admin & Dosage (50)
• Drug Effects (20)
• Therapeutic Use (25)
• Risk Factors (4)
• More …
User Preferred
• Musculoskeletal (4)
•Drug Resistance (6)
•All Categories (99)
99 Documents: [Sort by author] [Sort by popularity] [Sort by Steroids] [Cluster]
1. Effect of short-course budesonide on the bone turnover of asthmatic children.
2. Effect of prednisone on response to influenza virus vaccine in asthmatic children.
…
Asthma > Steroids > Admin & Dosage
1. Dosage levels for asthmatic steroids: A survey.
Steroids
•Pregnanes
• Pregnadienes (3)
• Prednisone (5)
Related Categories
•Inhalators (40)
•Emotional Effects (25)
•Preferred Suppliers (30)
User Preferred
• Musculoskeletal (0)
•Drug Resistance (2)
•All Categories (50)
50 Documents: [Sort by author] [Sort by popularity] [Sort by Dosage] [Cluster]
1. Optimal dosage levels for prednisone in the treatment of childhood asthma.
2. …
Other paths: back up and go forward
Asthma > Steroids
Asthma > Steroids > Budesonide
Asthma > Steroids > Budesonide > Huang
Asthma > Huang > Budesonide
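The paths above can be read as ordered lists of selections over the same collection. The sketch below is a hypothetical illustration in Python, assuming each selection acts as a conjunctive filter on a set of metadata terms; the document ids, tags, and the functions `results`, `back_up`, and `breadcrumb` are invented for the example and are not taken from the Flamenco project.

```python
# Hypothetical records: each document is tagged with a set of metadata terms.
DOCS = {
    "doc1": {"Asthma", "Steroids", "Budesonide", "Huang"},
    "doc2": {"Asthma", "Steroids", "Budesonide"},
    "doc3": {"Asthma", "Steroids", "Prednisone", "Huang"},
    "doc4": {"Asthma", "Huang"},
}

def results(path):
    """Documents tagged with every term on the path (order is irrelevant)."""
    return {doc for doc, terms in DOCS.items() if set(path) <= terms}

def back_up(path, steps=1):
    """Drop the last `steps` selections; the earlier part of the trail is kept."""
    return path[:-steps] if steps else path

def breadcrumb(path):
    """Render a path the way the slides do."""
    return " > ".join(path)

path_a = ["Asthma", "Steroids", "Budesonide", "Huang"]
path_b = ["Asthma", "Huang", "Budesonide"]

# Back up two steps from path_a, then go forward along a different branch.
trail = back_up(path_a, 2) + ["Prednisone"]

print(breadcrumb(path_a), sorted(results(path_a)))
print(breadcrumb(path_b), sorted(results(path_b)))
print(breadcrumb(trail), sorted(results(trail)))
```

In this toy data every Budesonide document is also tagged Steroids, so the two paths reach the same document even though one skips the Steroids step. Under the conjunctive-filter assumption the order of selection changes only the breadcrumb, not the result set.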
Another Application
Finding images for design tasks
See Ame’s talk
Dynamic Metadata Previews

How different from Yahoo & Amazon?

Dynamically determine what to show next
Yahoo’s combos are predefined
Amazon’s are also predefined, and limited to taste and general topic only


A way to seamlessly integrate



Related topics
User preferences (personalization)
Context-sensitivity
Evaluation Methodology

Regression Test
Select a set of tasks
Use these throughout the evaluation
Start with a baseline system
Evaluate using the test tasks
Add a feature
Evaluate again
Compare to baseline
Only retain those changes that improve results
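The regression-test procedure on this slide can be sketched as a simple keep-if-better loop. This is an illustrative Python sketch under stated assumptions, not the project's evaluation code: the function `regression_evaluate`, the feature names, the task strings, and the scoring table are all hypothetical, and in a real study `score` would come from user measurements. The sketch also assumes each new feature is compared against the best system retained so far, which is one reading of "compare to baseline".

```python
def regression_evaluate(baseline, features, tasks, score):
    """Greedy keep-if-better loop.

    baseline : set of feature names in the starting system
    features : candidate features, tried one at a time
    tasks    : fixed task set, reused for every evaluation
    score    : callable(system, task) -> number, higher is better
    """
    def total(system):
        return sum(score(system, t) for t in tasks)

    system = set(baseline)
    best = total(system)          # evaluate the baseline on the test tasks
    kept = []
    for feature in features:
        candidate = system | {feature}
        result = total(candidate)  # evaluate again with the feature added
        if result > best:          # retain only changes that improve results
            system, best = candidate, result
            kept.append(feature)
    return system, kept

# Hypothetical stand-in for measured task performance.
TASKS = ["find dosage survey", "find an author's papers"]
GAIN = {"facet_counts": 2, "animated_logo": 0, "breadcrumbs": 1}

def toy_score(system, task):
    # Each feature contributes a fixed gain per task in this toy model.
    return sum(GAIN[f] for f in system)

final_system, kept = regression_evaluate(set(), list(GAIN), TASKS, toy_score)
print(kept)
```

A feature that leaves the score unchanged is dropped here, since the slide retains only changes that improve results.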

Project Goals


Develop a general understanding of how to usefully incorporate metadata into the search process
Develop a user-validated methodology that can be extended to other domains
Summary



Standard search is too flexible
Hyperlinks are too restrictive
Task-centric approaches



Task-specific collections
Flamenco: Showing next choices / previews
Issues


How to identify tasks?
Given lots of task-specific UIs, how to find the right one?
