Applications of the Semantic Web in a Global Pharmaceutical


Knowledge Management Issues in a Global Pharmaceutical R&D Environment
W3C Workshop on Semantic Web for Life Sciences
27-28 October 2004
Cambridge, Massachusetts USA
Ted Slater
Proteomics Center of Emphasis
Pfizer Global R&D Michigan
About Pfizer Global R&D

The industry’s largest R&D organization
>12,500 employees worldwide
Estimated R&D budget in 2004: $7.9 billion
Hundreds of research projects over 18 therapeutic areas
(Not really using Semantic Web technologies just now)
Issues with Global R&D
Geographical (time & distance)
Language (even if the language is the same!)
Cultural
Increased reliance on electronic communications

[Figure: local times shown for different locations: 5:00, 10:00, 2:00, 18:00, 4:00, 5:00]
What’s in a Name?

“Releasing TaqMan® Data” use case from John Wilbanks (17 Aug 2004)
GO annotation from a particular gene
TaqMan® data from an exon proximal to that gene
Annotating the TaqMan® data with GO annotation is not quite right
Different perceptions of concept “gene”
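One way to picture the problem above is to keep the gene, the exon, and the assay as separate entities, and to derive a GO term for the assay only through an explicit chain of relations, so the provenance stays visible. The following is a minimal sketch, not the approach described in the talk. All identifiers and relation names (`gene:G1`, `exon:E1`, `assay:T1`, `proximal_to`, and so on) are hypothetical placeholders.

```python
# Minimal sketch: triples as plain Python tuples (subject, predicate, object).
# Every identifier and predicate name here is a hypothetical placeholder.
triples = {
    ("gene:G1", "has_go_annotation", "GO:term_X"),
    ("exon:E1", "proximal_to", "gene:G1"),
    ("assay:T1", "measures", "exon:E1"),
}


def objects(subject, predicate):
    """Return all objects linked to `subject` by `predicate`."""
    return {o for s, p, o in triples if s == subject and p == predicate}


def go_terms_for_assay(assay):
    """Return (term, provenance) pairs rather than copying annotations.

    The assay is never annotated directly; each term carries the chain
    of relations it was reached through.
    """
    results = []
    for exon in sorted(objects(assay, "measures")):
        for gene in sorted(objects(exon, "proximal_to")):
            for term in sorted(objects(gene, "has_go_annotation")):
                results.append((term, f"inferred via {exon} proximal_to {gene}"))
    return results


if __name__ == "__main__":
    print(go_terms_for_assay("assay:T1"))
    # The assay itself has no direct GO annotation:
    print(objects("assay:T1", "has_go_annotation"))
```

Because the annotation is reached through `proximal_to` rather than asserted on the assay, a consumer with a stricter notion of “gene” can choose to reject the inference.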

Proteomics
Metabonomics
RNA Profiling
Current Tools Fall Short
100+ highly-specialized software tools in place for ’omics technologies
All query-centric
Single user
Low bandwidth
Ask a question, get a list

How to Drive a Biologist Crazy
gi|84939483
gi|39893845
gi|27394934
gi|18890092
gi|10192893
gi|11243007
gi|20119252
gi|19748300
gi|44308356
gi|50021874
gi|10003001
gi|27762947
gi|24537303
gi|27284958
gi|37373499
…
How to Add Insult to Injury
Current State of KM
Data Tombs
Metadata?
Experimental protocols
Model system descriptions
Statistical criteria for data analysis and acceptability
Others
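The kinds of metadata listed above could be carried alongside each dataset as a small structured record, so that a missing item is detectable by a program rather than discovered later by a person. This is only an illustrative sketch. The class name `ExperimentMetadata`, its field names, and the `missing_fields` helper are assumptions made for this example, not part of any system described in the talk.

```python
from dataclasses import dataclass, field


@dataclass
class ExperimentMetadata:
    """Hypothetical record mirroring the metadata items on the slide."""
    protocol: str               # experimental protocol
    model_system: str           # model system description
    statistical_criteria: str   # criteria for data analysis and acceptability
    other: dict = field(default_factory=dict)  # anything else ("Others")


def missing_fields(meta):
    """List the required fields that were left empty or blank."""
    required = ("protocol", "model_system", "statistical_criteria")
    return [name for name in required if not getattr(meta, name).strip()]


if __name__ == "__main__":
    record = ExperimentMetadata(
        protocol="placeholder protocol text",
        model_system="",
        statistical_criteria="placeholder criteria text",
    )
    print(missing_fields(record))
```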

[Figure: labels fan, wall, spear, snake, tree, rope, as in the parable of the blind men and the elephant]
Hypothesis Generation

Our domain is too big and complex to fit in our heads
Browsing and correlation can’t get us there
We need our machines to generate testable hypotheses for us based on our experimental results
We need knowledge about causation
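As a rough illustration of why causal knowledge matters here: if cause-and-effect relations are stored explicitly, a program can propose upstream candidates that would account for a set of observed changes. The sketch below is a toy under stated assumptions, not a method from the talk. The graph `CAUSES` and the node names `A` to `D` are hypothetical, and a real system would also need direction of change, evidence, and context.

```python
from collections import deque

# Hypothetical causal knowledge: each key is a cause, each value lists its effects.
CAUSES = {
    "A": ["B", "C"],
    "B": ["D"],
    "C": ["D"],
}


def downstream(node):
    """Return every node reachable from `node` by following causal edges."""
    seen = set()
    queue = deque([node])
    while queue:
        current = queue.popleft()
        for effect in CAUSES.get(current, []):
            if effect not in seen:
                seen.add(effect)
                queue.append(effect)
    return seen


def candidate_causes(observed):
    """Return nodes whose downstream effects cover all observed changes.

    Each returned node is a candidate hypothesis to test, not a conclusion.
    """
    observed = set(observed)
    return sorted(n for n in CAUSES if observed <= downstream(n))


if __name__ == "__main__":
    print(candidate_causes({"B", "D"}))
    print(candidate_causes({"D"}))
```

Correlation alone could not separate these cases: with only `D` observed, three candidates remain, while observing `B` as well narrows them to one.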

Clinical KM Needs

Aggregate and analyze:
Safety data
 Efficacy data
 Genomic data
 Healthcare data
 Performance data

Study metadata
 Staff and vendor performance
 Resource utilization

The Shape of Clinical Data
>2 GB each per Phase-2, -3, or -4 protocol, split over >100 different datasets, each with 20-300 columns
Metadata complex, hard to combine across studies
Sensitive data
Project teams can be reluctant to discuss with other groups (e.g. in discovery)
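To make the "hard to combine across studies" point concrete, one common pattern is to map each study's local column names onto a shared vocabulary before aggregating, and to report anything left unmapped. The sketch below assumes a hand-built mapping table. The study names, column names, and shared terms are all hypothetical.

```python
# Hypothetical mapping from (study, local column name) to a shared term.
COLUMN_TO_TERM = {
    ("study_A", "DOSE_MG"): "dose",
    ("study_B", "dose_amt"): "dose",
    ("study_A", "AE_TEXT"): "adverse_event",
    ("study_B", "adv_evt"): "adverse_event",
}


def harmonize(study, row):
    """Rename a row's columns to shared terms.

    Returns (harmonized_row, unmapped_columns) so that gaps in the
    mapping are reported instead of silently dropped.
    """
    harmonized = {}
    unmapped = []
    for column, value in row.items():
        term = COLUMN_TO_TERM.get((study, column))
        if term is None:
            unmapped.append(column)
        else:
            harmonized[term] = value
    return harmonized, unmapped


if __name__ == "__main__":
    print(harmonize("study_A", {"DOSE_MG": 10, "SITE": "x"}))
    print(harmonize("study_B", {"dose_amt": 20, "adv_evt": "none reported"}))
```

With more than 100 datasets per protocol and 20-300 columns each, maintaining such a table by hand quickly becomes impractical, which is part of the case for machine-usable ontologies.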
Clinical Columns







Dosage and dose response data
Product differentiation
Patient demographics
Concurrent medications
Lab data
Subject experience & adverse events
How fast does it work? How long does it last?
Other Areas

Legal
    “Patent searching is an art, not a science”
    New cases, statutes, policies
HR
Finance
Strategic Alliances
    PGRD has links with >250 partners in academia and industry
More
Summary
KM needs in discovery and clinical are complex, diverse, and sizeable
We need a knowledge architecture that can be used effectively by machines.
Ontologies
Software
Hardware

Acknowledgements
John Wilbanks (W3C)
 Enoch Huang (Pfizer)
 Eric Neumann (Aventis)
 Stephen Dobson (Pfizer)
 Mitch Brigell (Pfizer)
 Dave Lowenschuss (Pfizer)
 Ruth VanBogelen (Pfizer)
