Incorporating Metadata into Search
User Interfaces
Marti Hearst
UC Berkeley
Razorfish
Nov 3, 2000
Web Search is Working!
Survey finds high user satisfaction
Study by NPD Group
(a recent upswing; the decline was caused by an increase in the number of pages indexed)
Why? Queries are still short!
Average query length currently ~2.4 words (Doug Cook, Inktomi)
From
http://searchenginewatch.internet.com/reports/npd.html
My guess:
Web Search is Successful at Finding
Good Starting Points (home pages)
Evidence

Web search engines are heavily using




Link analysis
Page popularity
Interwoven categories
These all find dominant home pages
Consequences

Web search engines are providing source
selection!

A side note:
A digital library issue as well. DL’s make people do
this step explicitly. People don’t generally like this!

What happens at the site?

Follow hyperlinks or use site search
Following Hyperlinks


Works great when it is clear where to go
next
Frustrating when the desired directions
are indiscernible or unavailable
Site Search


This is not getting good reviews
Large, disorganized results sets
An Analogy
hypertext
text search
Analogy

Hypertext:



A fixed number of choices of where to go next;
A glance at the map tells you where you are;
But may not go where you want to go.


To get from Topeka to Santa Fe, may have to go through
Frostbite Falls
Site Search:


Can go anywhere;
But may get stuck, disoriented, in a crevasse!
Goal: An All-Terrain Vehicle

The best of both techniques



A vehicle that magically lays down track to
suggest choices of where you want to go next
based on what you’ve done so far and what
you are trying to do
The tracks follow the lay of the land and go
everywhere, but cross over the crevasses
The tracks allow you to back up easily
How to make an all-terrain vehicle?
Two ideas:
Focus on the task.
Use metadata explicitly.
The Importance of the Task
Results from HCI suggest the importance
of taking the task into account.



Searching patent databases vs. Proving non-infringement
Browsing newsgroups vs. Finding the denial-of-service hacker
Getting all satellite news vs. Anticipating the competition
The Importance of the Task:
Indirect Evidence

How does Web page download time affect
usability?

In one study, Spool found:
(56kbit modem)
Amazon: 36 sec/page (avg)
About.com: 8 sec/page (avg)

Users rated the sites:
Fastest: Amazon
Slowest: About.com
Why?
The Importance of the Task

Perceived speed


Strong correlation between perceived speed
and whether the users felt they completed
their task
Strong correlation between perceived speed
and whether the users felt they always knew
what to do next (scent).
Metadata
Metadata types
GeoRegion + Time/Date + Topic + Role
Content-based Metadata

Medical text
Anatomy, Disease, Chemicals, Procedures…
Architectural images
Location, Style, Materials, Period …
Recipes!
Cuisine, Ingredients, Season, Calories …
Example:

SOAR vs. epicurious
soar.berkeley.edu/recipes
www.epicurious.com
Epicurious Metadata Usage

Advantages





Creates combinations of metadata on the fly
Different metadata choices show the same information in
different ways
Previews show how many recipes will result
Easy to back up
Supports several task types



“Help me find a summer pasta” (ingredient type with event type),
“How can I use an avocado in a salad?” (ingredient type with dish type),
“How can I bake sea-bass?” (preparation type and ingredient type)
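A minimal sketch of the idea behind the advantages listed above: metadata values are combined on the fly, and a preview counts how many recipes each further choice would return. The recipe records, field names, and the functions select and preview are hypothetical, invented for illustration; this is not how Epicurious is implemented.

```python
from collections import Counter

# Hypothetical recipe records; field names and values are illustrative only.
RECIPES = [
    {"title": "Summer pasta salad", "ingredient": "pasta", "season": "summer", "dish": "salad"},
    {"title": "Avocado salad", "ingredient": "avocado", "season": "summer", "dish": "salad"},
    {"title": "Baked sea bass", "ingredient": "sea bass", "season": "winter", "dish": "main"},
    {"title": "Pasta with pesto", "ingredient": "pasta", "season": "summer", "dish": "main"},
]


def select(recipes, **chosen):
    """Keep the recipes that match every chosen metadata value."""
    return [r for r in recipes if all(r.get(k) == v for k, v in chosen.items())]


def preview(recipes, facet, **chosen):
    """Count how many matching recipes each value of `facet` would yield."""
    return Counter(r[facet] for r in select(recipes, **chosen))


# "Help me find a summer pasta": ingredient combined with season.
print([r["title"] for r in select(RECIPES, ingredient="pasta", season="summer")])

# Preview: how many summer recipes fall under each dish type?
print(preview(RECIPES, "dish", season="summer"))
```

Because the filters are plain keyword arguments, the same combination can be reached in any order, which is what makes backing up cheap.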
Epicurious Metadata Usage
Problem: lacks integration with search
What about Yahoo?

Routes through the metadata are




Predefined
Unstable (due to symbolic links)
Long (due to bad mixing of metadata)
Example: Where is Berkeley?
College and University > Colleges and Universities > United States > U >
University of California > Campuses > Berkeley
U.S. States > California > Cities > Berkeley > Education > College and
University > Public > UC Berkeley
Yahoo using metadata well
Yahoo restaurant guide combines:
Region
Topic (restaurants)
Related Information
Other attributes (cuisines)
Other topics related in place and time (movies)

Yellow: geographic region
Green: restaurants & attributes
Red: related in place & time
Combining Information Types
Region > State > City > A&E
A&E: Film, Theatre, Music, Restaurants
Restaurants: California, Eclectic, Indian, French
Assumed task: looking for evening entertainment
Other Possible Combinations






Region + A&E
City + Restaurant + Movies
City + Weather
City + Education: Schools
Restaurants + Schools
…
Bookstore preview combinations



topic + related topics
topic + publications by same author
topic + books of same type but related topic
Problems with Metadata Usage

Standard approaches





Paths are hand-edited, predefined
Not well-integrated with search
Not tailored to task as it develops
Not personalized
Not dynamic
A new project: FLAMENCO
FLexible Access using MEtadata in Novel COmbinations

Main ideas:



Make metadata an explicit part of the
interface, but in a highly-usable manner
Preview and postview choices
Determine views dynamically and
(semi-)automatically, using a task-based model
Flamenco: Dynamic Previews

Medical example
Allow user to select metadata in any order
At each step, show different types of relevant metadata,
based on prior steps and personal history;
include number of documents
Previews restricted to only those metadata
types that might be helpful
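One way to picture the preview step above, as a rough sketch only. The documents, the FACETS tuple, and the rule that a metadata type is "helpful" when it would split the current result set are all assumptions made for illustration. Personal history is left out, and nothing here is taken from the Flamenco system itself.

```python
from collections import Counter

# Hypothetical documents; categories echo the slides, assignments are invented.
DOCS = [
    {"disease": "Asthma", "drug": "Budesonide", "aspect": "Admin & Dosage"},
    {"disease": "Asthma", "drug": "Prednisone", "aspect": "Admin & Dosage"},
    {"disease": "Asthma", "drug": "Prednisone", "aspect": "Drug Effects"},
    {"disease": "Eczema", "drug": "Prednisone", "aspect": "Drug Effects"},
]
FACETS = ("disease", "drug", "aspect")  # assumed metadata types


def matching(docs, chosen):
    """Documents that agree with every metadata value chosen so far."""
    return [d for d in docs if all(d[f] == v for f, v in chosen.items())]


def previews(docs, chosen):
    """Per-value document counts for each metadata type worth showing next.

    Assumption: a type is worth showing only if it has more than one value
    among the current matches, since otherwise selecting it narrows nothing.
    """
    current = matching(docs, chosen)
    shown = {}
    for facet in FACETS:
        if facet in chosen:
            continue  # already selected in a prior step
        counts = Counter(d[facet] for d in current)
        if len(counts) > 1:
            shown[facet] = dict(counts)
    return shown


print(previews(DOCS, {"disease": "Asthma"}))
print(previews(DOCS, {"disease": "Asthma", "drug": "Budesonide"}))
```

The second call returns nothing to preview because only one document remains, so no further metadata choice would be helpful under this rule.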
Asthma > Steroids
1. A steroid-induced acute psychosis in a child with asthma.
2. Management of steroid-dependent asthma with methotrexate.
Steroids
• Pregnanes
• Pregnadienes (5)
• Prednisone (5)
• Pregnenes
• Budesonide (4)
• Corticosterone (3)
Other Views
• Admin & Dosage (50)
• Drug Effects (20)
• Therapeutic Use (25)
• Risk Factors (4)
• More …
User Preferred
• Musculoskeletal (4)
• Drug Resistance (6)
• All Categories (99)
99 Documents: [Sort by author] [Sort by popularity] [Sort by Steroids] [Cluster]
1. Effect of short-course budesonide on the bone turnover of asthmatic children.
2. Effect of prednisone on response to influenza virus vaccine in asthmatic children.
…
Asthma > Steroids > Admin & Dosage
1. Dosage levels for asthmatic steroids: A survey.
Steroids
• Pregnanes
• Pregnadienes (3)
• Prednisone (5)
Related Categories
• Inhalators (40)
• Emotional Effects (25)
• Preferred Suppliers (30)
User Preferred
• Musculoskeletal (0)
• Drug Resistance (2)
• All Categories (50)
50 Documents: [Sort by author] [Sort by popularity] [Sort by Dosage] [Cluster]
1. Optimal dosage levels for prednisone in the treatment of childhood asthma.
2. …
Other paths: back up and go forward
Asthma > Steroids
Asthma > Steroids > Budesonide
Asthma > Steroids > Budesonide > Huang
Asthma > Huang > Budesonide
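The paths above can be sketched as a breadcrumb trail that grows with each choice and shrinks when the user backs up. The Trail class, its method names, and the sample documents are hypothetical. The point of the sketch is only that choices made in a different order reach the same documents.

```python
class Trail:
    """A breadcrumb trail of (metadata type, value) choices over a document list."""

    def __init__(self, docs):
        self.docs = docs
        self.steps = []

    def forward(self, facet, value):
        """Add one more metadata choice to the end of the path."""
        self.steps.append((facet, value))
        return self

    def back(self):
        """Undo the most recent choice, if any."""
        if self.steps:
            self.steps.pop()
        return self

    def results(self):
        """Documents that satisfy every choice on the path."""
        return [d for d in self.docs if all(d[f] == v for f, v in self.steps)]

    def __str__(self):
        return " > ".join(value for _, value in self.steps)


# Hypothetical documents; the metadata assignments are invented for illustration.
DOCS = [
    {"id": 1, "disease": "Asthma", "drug_class": "Steroids", "drug": "Budesonide", "author": "Huang"},
    {"id": 2, "disease": "Asthma", "drug_class": "Steroids", "drug": "Budesonide", "author": "Lee"},
    {"id": 3, "disease": "Asthma", "drug_class": "Steroids", "drug": "Prednisone", "author": "Huang"},
]

long_path = (Trail(DOCS).forward("disease", "Asthma")
             .forward("drug_class", "Steroids")
             .forward("drug", "Budesonide")
             .forward("author", "Huang"))
short_path = (Trail(DOCS).forward("disease", "Asthma")
              .forward("author", "Huang")
              .forward("drug", "Budesonide"))

print(long_path, "|", short_path)
print(long_path.results() == short_path.results())  # same documents, different routes

long_path.back()  # back up one step
print(long_path, len(long_path.results()))
```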
Dynamic Metadata Previews

How different from Yahoo & Amazon?
Dynamically determine what to show next
Yahoo’s combos are predefined
Amazon’s are also predefined, and limited to taste
and general topic only
A way to seamlessly integrate
Related topics
User preferences (personalization)
Context-sensitivity
Evaluation Methodology

Regression Test
Select a set of tasks
Use these throughout the evaluation
Start with a baseline system
Evaluate using the test tasks
Add a feature
Evaluate again
Compare to baseline
Only retain those changes that improve results
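The regression-test procedure above can be sketched as a simple loop. Everything concrete here is an assumption: the feature names, the GAINS table, and the run_task stand-in, which in practice would be a measurement taken from real users on the fixed task set. The sketch also assumes that a retained change becomes the new baseline for the next comparison.

```python
def evaluate(features, tasks, run_task):
    """Average score of a system (a set of features) over the fixed task set."""
    return sum(run_task(features, task) for task in tasks) / len(tasks)


def regression_test(baseline, candidates, tasks, run_task):
    """Add candidate features one at a time, keeping only those that improve the score."""
    kept = set(baseline)
    best = evaluate(kept, tasks, run_task)
    for feature in candidates:
        trial = kept | {feature}
        score = evaluate(trial, tasks, run_task)
        if score > best:  # retain the change only if results improve
            kept, best = trial, score
    return kept, best


# Hypothetical stand-in for a user study: each feature shifts a base score.
GAINS = {"previews": 2, "clustering": -1, "history": 1}


def run_task(features, task):
    return 5 + sum(GAINS.get(f, 0) for f in features)


TASKS = ["find dosage survey", "find budesonide studies"]  # illustrative task names
kept, score = regression_test({"search"}, ["previews", "clustering", "history"], TASKS, run_task)
print(sorted(kept), score)
```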

Summary



Standard search is too flexible
Hyperlinks too restrictive
Flamenco:
Task-centric search interfaces
Integrate metadata with search
Dynamic previews
Easily retrace steps
Systematically determine what works for real users

Application to Image Search
Image Search



Content analysis is making strides
Rich hand-assigned metadata is available
But most search is based on




Keyword matching (alltheweb/lycos multimedia)
Image-component based querying (QBIC)
Overall similarity to sample image (Blobworld)
Combo of keyword and image component
Image Search: What is the task?

Illustrate my slides?



“Find a crevasse”
Keyword match works pretty well
Find inspiration for an
architectural design?


General similarity: maybe
But more control might be better
Blobworld
Architects’ Image Use

Work practices


Observations from personal design experience, and
surveys of designers
Common activities for image use



Browsing most common at early stages of design
Collage making, sketching, pinning up on walls
Cultural and social practice


Designers learn how to do this in schools
Ways of communicating with images vary with
organization
Slide by Ame Elliott
Current Problems Browsing
Images On-Line

Current on-line image collections offer few
advantages over paper collections
Lose paper’s ease of manipulation
Little gain in accessibility
Queries are textual and must be well-formed
Not appropriate for the early phases of design when
image browsing is critical
Image search engines don’t follow good UI
design in general
Poor support for search starting points, collection
visualization
Slide by Ame Elliott
Creating Browsing Scenarios

Rationale


Learned about search strategies and how architects
look for images now
3 design scenarios presented to 2 architects



“Add handicapped ramp to entrance of suburban
home”
“Design addition to children’s home in a Victorian
mansion”
“Design a maritime cultural center on the beach in
San Diego”
Slide by Ame Elliott
Results of Interviews

Were they believable professional problems?


Yes
How would they browse for images to help with
these tasks?



Gathered lists of terms
Learned some ideas about strategies
High degree of consistency between interviewees
Slide by Ame Elliott
How different from medical
example?



More open-ended
Easier to scan many images quickly
Terrain metaphor not used here



Not narrowing down a large set
Rather, always viewing more images
A mechanism for “steering” through the
metadata
SPIRO:
>40,000 art &
architecture images
Detailed metadata
SPIRO Query Form
SPIRO query on Subject: church
A Better Example



Greatbuildings.com
Hyperlinks metadata together
But a small collection


~1000 buildings
~4500 images total
www.greatbuildings.com
The Approach



Create an architecture that allows
experimentation with different approaches
Add functionality in a stepwise fashion
Architecture task:




Emphasize images over text
Use greatbuildings-style interface as a reasonable
baseline for comparison
Find out how much choice is too much
Find out whether explicit metadata is better than
implicit more-like-this
Summary



Standard search is too flexible
Hyperlinks too restrictive
Flamenco:
Task-centric search interfaces
Integrate metadata with search
Dynamic previews
Easily retrace steps
Systematically determine what works for real users
