Developing systems for full-text search in biomedicine.


Evidence for Showing Gene/Protein
Name Suggestions in Bioscience
Literature Search Interfaces
Anna Divoli, Marti A. Hearst, Michael A. Wooldridge
School of Information
University of California, Berkeley
08 Jan 2008
Pacific Symposium on Biocomputing
outline
• BioText search engine (in brief)
• Aims
• HCI principles (in brief)
• First study: biological information preferences
• Second study: gene/protein name expansion preferences
• Conclusions from studies
• Current and future work
biotext search engine
aims
• Determine whether or not bioscience literature searchers wish
to see related term suggestions, in particular, gene and protein
names
• Determine how to display term expansions to users
hci principles
• Design for the user, not for the designers or the system
• Needs assessment:
- who users are
- what their goals are
- what tasks they need to perform
• Task analysis:
- characterize what steps users need to take
- create scenarios of actual use
- decide which users and tasks to support
• Iterate between: designing & evaluating
- cycle: Design, Prototype, Evaluate
hci principles - cont.
• Make use of cognitive principles where available
• Important guidelines:
- Reduce memory load
- Speak the user’s language
- Provide helpful feedback
- Respect perceptual principles
• Prototypes:
- Get feedback on the design faster
- Experiment with alternative designs
- Fix problems before code is written
- Keep the design centered on the user
first study: biological information preferences
• Online surveys
• Questions on how they search the literature and what
information they would like a system to suggest
• 38 participants:
- 7 research institutions
- 22 graduate students, 6 postdocs, 5 faculty, and 5 others
- wide range of specialties: systems biology, bioinformatics,
genomics, biochemistry, cellular and evolutionary biology,
microbiology, physiology, ecology...
participants’ information
results
Rating scale: 1 (Do NOT want this), 2, 3 (Neutral), 4, 5 (REALLY want this)

Related Information Type                        Avg rating   # selecting 1 or 2
Gene’s Synonyms                                 4.4          2
Gene’s Synonyms refined by organism             4.0          2
Gene’s Homologs                                 3.7          5
Genes from same family: parents                 3.4          7
Genes from same family: children                3.6          4
Genes from same family: siblings                3.2          9
Genes this gene interacts with                  3.7          4
Diseases this gene is associated with           3.4          6
Chemicals/drugs this gene is associated with    3.2          8
Localization information for this gene          3.7          3
second study: gene/protein name expansion preferences
• Online surveys
• Evaluating 4 designs for gene/protein name suggestions
• 19 participants:
- 9 of whom also participated in the first study
- 4 graduate students, 7 postdocs, 3 faculty, and 5 others
- wide range of specialties: molecular toxicology, evolutionary
genomics, chromosome biology, plant reproductive biology,
cell signaling networks, computational biology…
design 1: baseline
design 2: links
design 3: checkboxes
design 4: categories
results
Design            Rated design 1st or 2nd    Average rating
                  #       %                  (1=low, 4=high)
3 (checkboxes)    15      79                 3.3
4 (categories)    10      53                 2.6
2 (links)         9       47                 2.5
1 (baseline)      0       0                  1.6
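The percentage column follows from the counts and the 19 participants of the second study. A minimal sketch of that arithmetic, with function and variable names chosen here for illustration only:

```python
# Counts of participants who rated each design 1st or 2nd (from the results table).
TOTAL_PARTICIPANTS = 19
top_two_counts = {
    "3 (checkboxes)": 15,
    "4 (categories)": 10,
    "2 (links)": 9,
    "1 (baseline)": 0,
}


def percent_of_participants(count, total=TOTAL_PARTICIPANTS):
    """Return count as a whole-number percentage of total."""
    return round(100 * count / total)


top_two_percent = {
    design: percent_of_participants(count)
    for design, count in top_two_counts.items()
}

for design, pct in top_two_percent.items():
    print(f"Design {design}: {pct}%")
```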
conclusions
• Strong desire for the search system to suggest information closely
related to gene/protein names.
• Some interest in less closely related information.
• Most participants want to see organism names in conjunction with
gene names.
• A majority of participants prefer to see term suggestions grouped by
type (synonyms, homologs, etc).
• Split in preference between single-click hyperlink interaction
(categories or single terms) and checkbox-style interaction.
• The majority of participants prefer to have the option to choose either
individual names or whole groups with one click.
• Split in preference between the system suggesting only names that it is
highly confident are related and also including names that it is less
confident about under a “show more” link.
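One way to picture the last two points together is a sketch that groups suggestions by type and holds back the less confident names for a “show more” link. This is an illustration under assumptions, not the BioText implementation: the confidence scores, the 0.8 threshold, the function name, and the placeholder names are all hypothetical.

```python
from collections import defaultdict


def split_suggestions(suggestions, threshold=0.8):
    """Group suggestions by type and split each group by confidence.

    suggestions: iterable of (name, kind, confidence) tuples.
    Returns {kind: {"shown": [...], "more": [...]}}, where "more" holds
    the names that would sit behind a "show more" link.
    """
    grouped = defaultdict(lambda: {"shown": [], "more": []})
    for name, kind, confidence in suggestions:
        bucket = "shown" if confidence >= threshold else "more"
        grouped[kind][bucket].append(name)
    return dict(grouped)


# Hypothetical placeholder data, not real gene names or scores.
example = [
    ("NAME_A", "synonyms", 0.95),
    ("NAME_B", "synonyms", 0.55),
    ("NAME_C", "homologs", 0.90),
]
result = split_suggestions(example)
```

Setting the threshold to 0 would show every name up front, which corresponds to the other side of the split in preference.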
in progress: biotext’s name suggestions
• (link to development site)
• (take screenshots when the interface is ready as a fall-back
plan)
future work
• We plan to assess presentation of other results of text
analysis, such as the entities corresponding to diseases,
pathways, gene interactions, localization information, function
information, and so on.
• Assess the usability of one feature at a time, see how
participants respond, and then test out other features
• Need to experiment with hybrid designs, e.g., checkboxes for
the individual terms and a link that immediately adds all terms in
the group and executes the query.
• Adding more information will require a delicate balancing act
between usefulness and clutter.
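The hybrid design mentioned above could be sketched roughly as follows: checked terms are collected one at a time, while an “add all” link selects a whole group and builds the query in one step. The OR-based query syntax, the function names, and the placeholder terms are assumptions made for illustration; the slides do not specify them.

```python
def build_query(original_term, selected_terms):
    """OR the original term with the selected suggestions, quoting each term."""
    terms = [original_term]
    for term in selected_terms:
        if term not in terms:  # skip duplicates
            terms.append(term)
    return " OR ".join(f'"{term}"' for term in terms)


def add_all_and_search(original_term, group_terms):
    """Mimic the 'add all' link: select every term in the group at once."""
    return build_query(original_term, group_terms)


# Hypothetical placeholder suggestions, grouped by type.
suggestions = {"synonyms": ["NAME_A", "NAME_B"], "homologs": ["NAME_C"]}

# Checkbox path: the user ticks one term, then runs the query.
q_checked = build_query("GENE_X", ["NAME_B"])

# Link path: one click adds the whole synonym group and builds the query.
q_all = add_all_and_search("GENE_X", suggestions["synonyms"])
```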
current study
• Evaluating the different views of the BioText search engine
• 16 participants (so far):
- 6 graduate students, 4 postdocs, 1 faculty, 5 other
• Results:
              Text search   Figure caption search   Table search
Frequently    11            7                       6
Sometimes     4             5                       3
Rarely        0             3                       4
Never         0             0                       2
Undecided     1             1                       1
acknowledgments
We are grateful to all the participants of our studies!
Supported by NSF DBI-0317510
BioText Search Engine available at: http://biosearch.berkeley.edu