Semantic Grid

A Semantic Grid Browser for the Life Sciences Applied to the Study of Infectious Diseases
SeaLife
Simon Jupp
SeaLife
Conception and realisation of a Semantic Grid Browser, which links the current Web to the emerging eScience infrastructure
• Partners: Manchester, Dresden, Edinburgh, London, Inria Sophia-Antipolis, Scionics
• Objectives:
– Many grids, few users: make Web servers and services accessible to end users
– Semantic Hyperlinks: use ontologies and background knowledge to map web contents to services
– Shopping cart: Service composition and enactment module
• Application: from cells, via tissue to patients
– Evidence-based medicine
– Patent and literature mining
– Molecular biology
• Implementations:
– COHSE
– GoPubMed
– CORESE
Objective
• We have a World Wide Web of data
• We have e-science and a grid of bioinformatics services
• We have text-mining tools, ontologies, web services and W3C standards
Evidence-based medicine
"Ribavirin with or without alpha interferon for chronic hepatitis C"
• Background knowledge: MeSH, Disease Ontology, SNOMED…
• UK-based resources:
– National Institute for Health and Clinical Excellence (NICE)
– National Electronic Library of Infection (NeLI)
– Health Protection Agency (HPA)
Molecular Biology
“Rabaptin-5 interacts with the small GTPase Rab5 and is an essential component of the fusion machinery for targeting endocytic vesicles to early endosomes”
• Background knowledge:
– Rabaptin-5 and Rab5 are proteins
– Endocytosis as a GO biological process
– Early endosome as a GO cellular component
• Resources:
– Get sequences, execute alignment service
– Add proteins to “shopping cart” Rab5
– PubMed query for relevant abstracts
A SeaLife browser
• Definition: A SeaLife browser is any web browser that can identify domain concepts in web documents via text mining or background knowledge, and that provides context-based links to related services and resources on the web or grid.
• Several exist: COHSE, GoPubMed, Magpie, PiggyBank, KIM, Concept Web Linker…
Implementations
• COHSE - Conceptual Open Hypermedia Service
– Dynamic linking system for WWW documents
– Uses background knowledge (ontologies) to identify domain concepts
– Service module for navigating to relevant documents on the Web
• GoPubMed
– Ontology-based search engine: query expansion and results filtering
– Supports What, Who, Where, When
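Ontology-based query expansion and results filtering can be illustrated with a minimal sketch. This is not GoPubMed's implementation: the `NARROWER` table is a hypothetical toy ontology, and `expand_query` and `filter_results` are illustrative names. The idea shown is only that a query for a concept is widened to its narrower terms, and that the same term set is then used to filter documents.

```python
# Hypothetical toy ontology: each term maps to its directly narrower terms.
NARROWER = {
    "endocytosis": ["receptor-mediated endocytosis", "phagocytosis"],
    "receptor-mediated endocytosis": [],
    "phagocytosis": [],
}


def expand_query(term, narrower):
    """Return the term followed by all transitively narrower terms (breadth-first)."""
    expanded = [term]
    queue = [term]
    while queue:
        current = queue.pop(0)
        for child in narrower.get(current, []):
            if child not in expanded:
                expanded.append(child)
                queue.append(child)
    return expanded


def filter_results(documents, concept, narrower):
    """Keep documents that mention the concept or any of its narrower terms."""
    terms = [t.lower() for t in expand_query(concept, narrower)]
    return [doc for doc in documents if any(t in doc.lower() for t in terms)]


if __name__ == "__main__":
    docs = [
        "Rab5 regulates phagocytosis in macrophages.",
        "Chest X-ray findings in tuberculosis.",
    ]
    print(expand_query("endocytosis", NARROWER))
    print(filter_results(docs, "endocytosis", NARROWER))
```

A real system would match on ontology identifiers rather than substrings; substring matching is used here only to keep the sketch short.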
Web Navigation
• The Semantic Web is still a Web to be used by humans
• Navigation is still an important aspect of information gathering on the Web
– Serendipitous information retrieval
• Problem
– A collection of linked nodes
– Links are typically embedded
– Hard coded
– Difficult to author
– Ownership
– Unary
– Legacy resources
– Offer little in the way of semantics
• Approach
– Exploit Semantic Web components to add links dynamically to documents
– Exploit knowledge structure to drive navigation
Web Navigation with COHSE
• Knowledge Service
– Text processor and background knowledge identify concepts in a page
• Resource Manager
– Finds link targets for concepts found in the page
• DLS
– Dynamically adds the links to the page and manages requests to the Resource Manager
• Can be run as a browser plugin or through a proxy
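The division of labour between the three components can be sketched roughly as follows. This is not COHSE code: the function names, the `LINK_TARGETS` table and the example.org URLs are all hypothetical, and the "knowledge service" here is a plain label match rather than a real text processor backed by an ontology.

```python
import re

# Hypothetical background knowledge: concept label -> candidate link targets.
LINK_TARGETS = {
    "hepatitis C": ["https://example.org/guidelines/hepatitis-c"],
    "ribavirin": ["https://example.org/drugs/ribavirin"],
}


def identify_concepts(text, vocabulary):
    """Knowledge service role: return vocabulary labels that occur in the text."""
    found = []
    for label in vocabulary:
        if re.search(r"\b" + re.escape(label) + r"\b", text, flags=re.IGNORECASE):
            found.append(label)
    return found


def find_targets(concept, link_targets):
    """Resource manager role: look up link targets for a concept."""
    return link_targets.get(concept, [])


def add_links(text, link_targets):
    """Link service role: wrap each recognised concept in a link to its first target."""
    concepts = [c for c in identify_concepts(text, link_targets)
                if find_targets(c, link_targets)]
    if not concepts:
        return text
    lookup = {c.lower(): c for c in concepts}
    # Longest labels first, so longer phrases win over shorter overlapping ones.
    alternatives = "|".join(re.escape(c) for c in sorted(concepts, key=len, reverse=True))
    pattern = re.compile(r"\b(" + alternatives + r")\b", flags=re.IGNORECASE)

    def link(match):
        concept = lookup[match.group(0).lower()]
        url = find_targets(concept, link_targets)[0]
        return f'<a href="{url}">{match.group(0)}</a>'

    # A single substitution pass avoids re-linking text inside inserted anchors.
    return pattern.sub(link, text)


if __name__ == "__main__":
    print(add_links("Ribavirin for chronic hepatitis C", LINK_TARGETS))
```

In a proxy deployment, `add_links` would be applied to each page on its way to the browser; as a plugin it would run on the loaded document instead.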
NeLI use case
• National Electronic Library of Infection, London, UK
– Evidence-based, quality-tagged resource for public and clinical health records
– Diverse set of users
• GPs, clinicians, molecular biologists, general public
– Many documents, few hyperlinks
• Can COHSE provide useful links to relevant external documents?
– Evaluation is underway
• Searching for guidelines on the use of "Ribavirin with or without alpha interferon for chronic hepatitis C"
– Clinicians need up-to-date, authoritative information
COHSE-NeLI Demo
http://www.cs.man.ac.uk/~sjupp/downloads/COHSE-NELI-2009-demo.mov
Background knowledge
• What semantics do we need for the background knowledge to drive navigation?
• Richer and more granular knowledge is better for navigation.
• The type of background knowledge varies with the type of user and the task at hand.
– E.g. nurses, doctors, medics, the general public
[Figure: biomedical ontologies (OBO) grouped by domain]
• Anatomy
– Mosquito gross anatomy
– Mouse adult gross anatomy
– Mouse gross anatomy and development
– C. elegans gross anatomy
– Arabidopsis gross anatomy
– Cereal plant gross anatomy
– Drosophila gross anatomy
– Dictyostelium discoideum anatomy
– Fungal gross anatomy (FAO)
– Plant structure
– Maize gross anatomy
– Medaka fish anatomy and development
– Zebrafish anatomy and development
– BRENDA tissue / enzyme source
• Proteins
– Protein covalent bond
– Protein domain
– UniProt taxonomy
• Sequence
– Sequence types and features
– Genetic context
• Pathways
– Pathway ontology
– Event (INOH pathway ontology)
– Systems Biology
– Protein-protein interaction
• Gene products
– Molecule role
– Molecular function
– Biological process
– Cellular component
• Transcript
– eVOC (Expressed Sequence Annotation for Humans)
• Cell type
• Development
– Arabidopsis development
– Cereal plant development
– Plant growth and developmental stage
– C. elegans development
– Drosophila development (FBdv)
– Human developmental anatomy, abstract version
– Human developmental anatomy, timed version
– Plasmodium life cycle
• Phenotype
– NCI Thesaurus
– Mouse pathology
– Human disease
– Cereal plant trait
– PATO (attribute and value)
– Mammalian phenotype
– Habronattus courtship
– Loggerhead nesting
– Animal natural history and life history
Knowledge representation
[Figure: concept graph around Tuberculosis]
• Concepts: Infectious Disease, TB, Bacteria, BCG vaccine, Tuberculosis, Chest X-ray, Isoniazid, Coughing, Lung, Mycobacterium bovis
• Relation labels: abbreviation, caused by, vaccine, is a, diagnosis/detection, symptom, drug, affects, similar to
Can’t make these close links with strict semantics!
SKOS conversions
[Figure: the Tuberculosis concept graph with its relations expressed in SKOS]
• Concepts: Infectious Disease, TB, Bacteria, BCG vaccine, Chest X-ray, Tuberculosis, Isoniazid, Lung, Coughing, Mycobacterium bovis
• Relation labels: skos:altLabel, skos:broader, skos:narrower, skos:related
• We need “something to do with” semantics for navigation
• SKOS provides a standard for common representation with “enough” semantics
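A minimal sketch of how such "something to do with" semantics could drive navigation, using plain tuples rather than an RDF library. The SKOS property names are standard, but the specific pairings in `TRIPLES` are assumptions made for illustration; the slide does not state which concept each relation connects. `navigation_targets` and `labels` are hypothetical helper names.

```python
# (subject, predicate, object) triples. The pairings below are illustrative
# assumptions, not a transcription of the original diagram.
TRIPLES = [
    ("Tuberculosis", "skos:altLabel", "TB"),
    ("Tuberculosis", "skos:broader", "Infectious Disease"),
    ("Tuberculosis", "skos:related", "Lung"),
    ("Tuberculosis", "skos:related", "Coughing"),
    ("Tuberculosis", "skos:related", "Isoniazid"),
]

# Semantic relations a browser could follow when suggesting links.
NAVIGABLE = {"skos:broader", "skos:narrower", "skos:related"}


def navigation_targets(concept, triples):
    """Return concepts one step away via broader/narrower/related, in either direction."""
    targets = []
    for subject, predicate, obj in triples:
        if predicate not in NAVIGABLE:
            continue
        if subject == concept and obj not in targets:
            targets.append(obj)
        elif obj == concept and subject not in targets:
            targets.append(subject)
    return targets


def labels(concept, triples):
    """Return the concept name plus any alternative labels, for matching in page text."""
    alternatives = [o for s, p, o in triples if s == concept and p == "skos:altLabel"]
    return [concept] + alternatives


if __name__ == "__main__":
    print(labels("Tuberculosis", TRIPLES))
    print(navigation_targets("Tuberculosis", TRIPLES))
```

The point of the sketch is that navigation needs no reasoning beyond following three loosely defined relations, which is why a lightweight vocabulary can be sufficient here.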
COHSE and e-science
• Enhancements to COHSE, working prototype available
– Addition of text-mining component
• Identifies genes, proteins and chemicals in text
• Query service repositories
– E.g. myExperiment, BioCatalogue, BioMOBY
– Execute services and workflows within the browser
• Edinburgh-developed shopping cart and argumentation services
– Shop online for your genes, proteins, sequences etc.
– Shop online for services and workflows
– All from within your web browser!
– But that’s the future…
Summary
• Range of Semantic Web browsers under development
• Semi-automated addition of semantic content to existing resources is the only viable option in many cases
• What are we waiting for?
– More background knowledge
– Semantic web service descriptions