Transcript Document

A HUBzero Extension for
Automated Tagging
Jim Mullen
Advanced Biomedical IT Core
Indiana University
September 6, 2013
My Work on Extension
I implemented the automated tagging
extension, but others came up with the idea
and contributed to the design, including Bill
Barnett, Michael Grobe and Anurag
Shankar.
Automated Tagging Extension
Goal. Support automated tagging of Indiana
CTSI (Clinical and Translational Sciences
Institute) Hub (http://indianactsi.org) pages
using the NCBO (National Center for
Biomedical Ontology) Annotator
Motivation. Tagging (assigning terms from a
controlled vocabulary/ontology to pages) can
be very helpful for site search and navigation,
but manual tagging is expensive.
NCBO Annotator
A web site that includes web services for
annotating text using various controlled
vocabularies and ontologies, such as
SNOMED and MeSH (Medical Subject
Headings).
NCBO Annotator Example
Text:
“Gene therapy vectors based on murine retroviruses have
now been in clinical trials for over 20 years. During that
time, a variety of novel vector pseudotypes were
developed in an effort to improve gene transfer.”
Ontology: MeSH
Terms/Tags:
Genes, Gene Therapy, Retroviridae, therapy, Time, Transfer (Psychology)
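To make the idea of annotating text against a controlled vocabulary concrete, here is a minimal sketch of literal phrase matching. It is not the NCBO Annotator's algorithm: the `VOCABULARY` table and the `annotate` function are hypothetical, and the phrase-to-term mappings are illustrative rather than taken from MeSH. The sketch does suggest how matching on surface words alone could produce a tag such as "Transfer (Psychology)" from the phrase "gene transfer".

```python
import re

# Hypothetical mini-vocabulary: surface phrase -> preferred term.
# These mappings are illustrative only; they are not taken from MeSH or NCBO.
VOCABULARY = {
    "gene": "Genes",
    "gene therapy": "Gene Therapy",
    "retroviruses": "Retroviridae",
    "therapy": "therapy",
    "time": "Time",
    "transfer": "Transfer (Psychology)",
}


def annotate(text, vocabulary=VOCABULARY):
    """Return the sorted terms whose phrase occurs in the text as whole words."""
    lowered = text.lower()
    found = set()
    for phrase, term in vocabulary.items():
        # \b anchors keep "time" from matching inside a longer word.
        if re.search(r"\b" + re.escape(phrase) + r"\b", lowered):
            found.add(term)
    return sorted(found)


text = (
    "Gene therapy vectors based on murine retroviruses have now been in "
    "clinical trials for over 20 years. During that time, a variety of novel "
    "vector pseudotypes were developed in an effort to improve gene transfer."
)
tags = annotate(text)
```

Because each phrase is matched without regard to context, the word "transfer" is tagged even though the text is about gene transfer, not psychology.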
Extension Overview
• The Indiana CTSI HUB was built using HUBzero
(http://hubzero.org), which was built on top of the
Joomla content management system.
• Extension works with Joomla (version 1.5) as well as
HUBzero
• Extension consists of:
o Plugin – conditionally tags pages when they are
accessed, and displays the tags on pages
o Component – provides user interface for search
and navigation and administrative interface for
configuration
Extension Overview (continued)
• User interface (front-end)
o Information/help page
o Multi-word auto-complete tag search
o Tag cloud
o Tag information page
• Admin interface (back-end)
o Configuration of extension
Tags on Pages
The extension adds tags to the bottom of pages (using a plugin).
Information/Help Page
You can create an article that users will be directed to when they click on the “What’s this?” link.
Tag Search
You can select the ontology to use for the search.
Auto-completion is provided for search terms.
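Auto-completion over multi-word terms could be approximated with a case-insensitive prefix match, as in the sketch below. The slides do not say how the extension matches terms, so the `complete` function, its `limit` parameter, and the sample term list are all assumptions made for illustration.

```python
def complete(prefix, terms, limit=10):
    """Return up to `limit` terms that start with `prefix`, ignoring case."""
    needle = prefix.strip().lower()
    if not needle:
        return []  # nothing typed yet, so suggest nothing
    matches = [term for term in terms if term.lower().startswith(needle)]
    return sorted(matches, key=str.lower)[:limit]


# Hypothetical list of terms already used as tags on the site.
terms = ["Genes", "Gene Therapy", "Retroviridae", "Time"]
suggestions = complete("gene t", terms)
```

Matching against the whole term, spaces included, lets a partly typed multi-word term such as "gene t" narrow the suggestions to "Gene Therapy".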
Tag Cloud
The size of a term is proportional to the number of pages that are tagged with it.
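One way to realize proportional sizing is to scale each term's font size by its page count relative to the most-used term. The function name `cloud_sizes`, the pixel limits, and the legibility floor `min_px` are assumptions for this sketch, not details of the extension.

```python
def cloud_sizes(page_counts, max_px=32.0, min_px=10.0):
    """Map each tag to a font size proportional to its page count.

    The most-used tag gets `max_px`; `min_px` is an assumed floor that
    keeps rarely used tags legible.
    """
    if not page_counts:
        return {}
    largest = max(page_counts.values())
    if largest <= 0:
        return {tag: min_px for tag in page_counts}
    sizes = {}
    for tag, count in page_counts.items():
        proportional = max_px * count / largest
        sizes[tag] = max(min_px, proportional)
    return sizes


# Hypothetical page counts per tag.
sizes = cloud_sizes({"Genes": 40, "Gene Therapy": 20, "Time": 5})
```

With these sample counts, "Gene Therapy" is drawn at half the size of "Genes", while "Time" is raised to the floor instead of being drawn at 4 pixels.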
Tag Information Page
The Tag Info page lists the pages that contain the specified tag.
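A lookup from tag to pages is commonly served by an inverted index. The sketch below shows the idea with an in-memory dictionary; the function `build_tag_index` and the sample paths are hypothetical, and the extension itself presumably stores this information in the site database.

```python
from collections import defaultdict


def build_tag_index(page_tags):
    """Invert a page -> tags mapping into tag -> sorted list of pages."""
    index = defaultdict(set)
    for page, tags in page_tags.items():
        for tag in tags:
            index[tag].add(page)
    return {tag: sorted(pages) for tag, pages in index.items()}


# Hypothetical pages and the tags assigned to them.
page_tags = {
    "/about": ["Genes", "Time"],
    "/research/vectors": ["Genes", "Gene Therapy"],
}
index = build_tag_index(page_tags)
pages_for_genes = index["Genes"]
```

The length of each page list is also the count that a tag cloud would need.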
Extension Installation
Upload a zip file using the HUBzero / Joomla admin interface.
Extension Configuration
After the component is installed, the component’s admin interface is used to configure it.
Component Configuration - Steps
1. Get and enter an NCBO API key
2. Download ontology information from
NCBO
3. Select the ontologies to use and a
primary/default ontology
4. Set tagging options
5. Turn tagging on
Component Configuration
Tagging Display Options
Extension Configuration
Tagging Update Options
• Turn on/off tag updates
• Limit IP addresses for tag updates
• Time limit before tag updates are made
• Pages to exclude from tagging
• Components to exclude from tagging
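The update options above could combine into a single decision made when a page is accessed. The following is a sketch under stated assumptions, not the extension's actual logic: all names are hypothetical, the IP limit is read as an allow-list (empty meaning no restriction), and the time limit is read as the minimum age of a page's existing tags before they are refreshed.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class UpdateOptions:
    enabled: bool = True
    allowed_ips: frozenset = frozenset()  # assumption: empty means any IP
    min_age: timedelta = timedelta(days=7)  # assumption: age before re-tagging
    excluded_pages: frozenset = frozenset()
    excluded_components: frozenset = frozenset()


def should_update_tags(opts, page, component, client_ip, last_tagged, now):
    """Decide whether this page access should trigger a tag update."""
    if not opts.enabled:
        return False
    if opts.allowed_ips and client_ip not in opts.allowed_ips:
        return False
    if page in opts.excluded_pages or component in opts.excluded_components:
        return False
    # A page that has never been tagged (last_tagged is None) is always due.
    if last_tagged is not None and now - last_tagged < opts.min_age:
        return False
    return True


opts = UpdateOptions(
    allowed_ips=frozenset({"10.0.0.5"}),
    excluded_components=frozenset({"com_login"}),
)
now = datetime(2013, 9, 6, 12, 0)
first_visit = should_update_tags(opts, "/about", "com_content", "10.0.0.5", None, now)
```

Checking the cheap conditions first means the comparatively slow annotator call is attempted only for pages that pass every filter.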
Conclusions
Pros
• Automatically tags pages
• Will work on all pages (not component-dependent)
• Works with Joomla as well as HUBzero
Cons
• Tagging dependent on NCBO annotator:
o Does not seem to be very intelligent
o Too slow to have real-time tagging
o Extension will break if NCBO changes their web services
o Limited to biomedical ontologies
It’s possible to change the extension’s annotator, so this extension
could be used as a basis for using or testing other annotators.
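Swapping annotators is easiest when the tagging code depends only on a small interface. The sketch below illustrates that design in Python; the slides do not describe how the extension is structured internally, so `Annotator`, `KeywordAnnotator`, and `tag_page` are hypothetical names, and the keyword matcher is only a stand-in for a real annotation service.

```python
from typing import List, Protocol


class Annotator(Protocol):
    """Anything that can turn text into a list of tags."""

    def annotate(self, text: str) -> List[str]: ...


class KeywordAnnotator:
    """Stand-in annotator that tags text with any listed keyword it contains."""

    def __init__(self, keywords):
        self.keywords = list(keywords)

    def annotate(self, text: str) -> List[str]:
        lowered = text.lower()
        return [k for k in self.keywords if k.lower() in lowered]


def tag_page(body: str, annotator: Annotator) -> List[str]:
    """Tag a page body using whichever annotator is supplied."""
    return sorted(set(annotator.annotate(body)))


tags = tag_page(
    "Gene therapy vectors in clinical trials",
    KeywordAnnotator(["gene therapy", "trials", "proteins"]),
)
```

Because `tag_page` never refers to a specific service, a different annotator can be tried by passing another object with an `annotate` method.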