Applications of the Semantic Web in a Global Pharmaceutical
Knowledge Management Issues in a Global Pharmaceutical R&D Environment
W3C Workshop on Semantic Web for Life Sciences
27-28 October 2004
Cambridge, Massachusetts USA
Ted Slater
Proteomics Center of Emphasis
Pfizer Global R&D Michigan
About Pfizer Global R&D
The industry’s largest R&D organization
>12,500 employees worldwide
Estimated R&D budget in 2004: $7.9 billion
Hundreds of research projects over 18 therapeutic areas
(Not really using Semantic Web technologies just now)
Issues with Global R&D
Geographical (time & distance)
Language (even if the language is the same!)
Cultural
Increased reliance on electronic communications
[Figure: times of day 5:00, 10:00, 2:00, 18:00, 4:00, 5:00]
What’s in a Name?
“Releasing TaqMan® Data” use case from John Wilbanks (17 Aug 2004)
GO annotation from a particular gene
TaqMan® data from an exon proximal to that gene
Annotating the TaqMan® data with GO annotation is not quite right
Different perceptions of concept “gene”
Proteomics
Metabonomics
RNA Profiling
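One way to picture the TaqMan® use case is to keep the gene, the exon, and the assay as separate entities and link them explicitly, instead of copying the gene's GO annotation onto the assay data. The sketch below is only an illustration under that assumption: the `Triple` type, the predicate names (`has_go_annotation`, `proximal_to`, `measures`), and every identifier are hypothetical placeholders, not real accessions or an established vocabulary.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


# Hypothetical placeholder identifiers, not real accessions or GO terms.
triples = [
    Triple("gene:G1", "has_go_annotation", "GO:term-placeholder"),
    Triple("exon:E1", "proximal_to", "gene:G1"),
    Triple("assay:T1", "measures", "exon:E1"),
]


def direct_annotations(entity, triples):
    """GO terms asserted directly on the given entity."""
    return [
        t.obj
        for t in triples
        if t.subject == entity and t.predicate == "has_go_annotation"
    ]


def inferred_annotations(assay, triples):
    """Follow assay -> exon -> gene and return each GO term with the path used."""
    results = []
    for measured in triples:
        if measured.subject != assay or measured.predicate != "measures":
            continue
        for link in triples:
            if link.subject == measured.obj and link.predicate == "proximal_to":
                for go_term in direct_annotations(link.obj, triples):
                    results.append((go_term, [assay, measured.obj, link.obj]))
    return results
```

With this layout the assay itself carries no GO annotation; a consumer that wants one has to walk the path and can see that it came through an exon that is merely proximal to the gene.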
Current Tools Fall Short
100+ highly-specialized software tools in place for ’omics technologies
All query-centric
Single user
Low bandwidth
Ask a question, get a list
How to Drive a Biologist Crazy
gi|84939483
gi|39893845
gi|27394934
gi|18890092
gi|10192893
gi|11243007
gi|20119252
gi|19748300
gi|44308356
gi|50021874
gi|10003001
gi|27762947
gi|24537303
gi|27284958
gi|37373499
…
How to Add Insult to Injury
Current State of KM
Data Tombs
Metadata?
Experimental protocols
Model system descriptions
Statistical criteria for data analysis and acceptability
Others
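The metadata items listed above could be captured as a structured record stored alongside each dataset, so that a machine can tell which datasets lack context. This is a minimal sketch, assuming free-text fields; the class name `ExperimentMetadata`, its field names, and the helper `missing_metadata` are hypothetical and not taken from any Pfizer system.

```python
from dataclasses import dataclass, field, fields
from typing import Optional


@dataclass
class ExperimentMetadata:
    # Field names mirror the slide's list; all are hypothetical free-text slots.
    experimental_protocol: Optional[str] = None
    model_system: Optional[str] = None
    statistical_criteria: Optional[str] = None
    other: dict = field(default_factory=dict)


def missing_metadata(record):
    """Names of the required fields that are empty or unset on this record."""
    return [
        f.name
        for f in fields(record)
        if f.name != "other" and not getattr(record, f.name)
    ]
```

A dataset whose record returns a non-empty list from `missing_metadata` is a candidate "data tomb": the data exist, but the context needed to reuse them does not.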
fan
wall
spear
snake
tree
rope
Hypothesis Generation
Our domain is too big and complex to fit in our heads
Browsing and correlation can’t get us there
We need our machines to generate testable hypotheses for us based on our experimental results
We need knowledge about causation
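To make the role of causal knowledge concrete, here is a toy sketch of hypothesis ranking: given signed cause-and-effect assertions and a set of observed changes, it scores each candidate upstream cause by how many observations it would explain. The edge list, the entity names `X`, `Y`, `A`, `B`, and the scoring rule are all assumptions made for illustration; the slides do not describe a specific algorithm.

```python
from collections import defaultdict

# Hypothetical causal assertions: (cause, effect, sign).
# sign +1 means "cause increases effect", -1 means "cause decreases effect".
CAUSAL_EDGES = [
    ("X", "A", +1),
    ("X", "B", -1),
    ("Y", "A", +1),
]


def rank_hypotheses(edges, observed):
    """Score (cause, direction) pairs: +1 per observation predicted correctly,
    -1 per observation predicted with the wrong sign."""
    scores = defaultdict(int)
    for cause, effect, sign in edges:
        if effect not in observed:
            continue
        for direction in (+1, -1):
            predicted = sign * direction
            scores[(cause, direction)] += 1 if predicted == observed[effect] else -1
    # Best score first; ties broken by the (cause, direction) key.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical experimental result: A went up, B went down.
observed_changes = {"A": +1, "B": -1}
ranking = rank_hypotheses(CAUSAL_EDGES, observed_changes)
```

The top-ranked entry is the hypothesis "X increased", since it accounts for both observed changes; that is a testable statement, which browsing a list of correlated items would not produce on its own.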
Clinical KM Needs
Aggregate and analyze:
Safety data
Efficacy data
Genomic data
Healthcare data
Performance data
Study metadata
Staff and vendor performance
Resource utilization
The Shape of Clinical Data
>2 GB each per Phase-2, -3, or -4 protocol, split over >100 different datasets, each with 20-300 columns
Metadata complex, hard to combine across studies
Sensitive data
Project teams can be reluctant to discuss with other groups (e.g. in discovery)
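The difficulty of combining metadata across studies can be illustrated with a small column-harmonization sketch: each study maps its own column names to a shared term before rows are pooled. The column names, the shared terms, and the sample rows are hypothetical; real clinical datasets would need far richer mappings (units, code lists, and access controls for sensitive data).

```python
def harmonize(rows, column_map):
    """Rename known columns to shared terms; report columns with no mapping."""
    harmonized, unmapped = [], set()
    for row in rows:
        out = {}
        for column, value in row.items():
            if column in column_map:
                out[column_map[column]] = value
            else:
                unmapped.add(column)
        harmonized.append(out)
    return harmonized, sorted(unmapped)


# Hypothetical per-study mappings onto hypothetical shared terms.
STUDY_A_MAP = {"AGE_YRS": "age_years", "SEX": "sex"}
STUDY_B_MAP = {"age": "age_years", "gender": "sex"}

# Hypothetical sample rows, for illustration only.
study_a = [{"AGE_YRS": 54, "SEX": "F"}]
study_b = [{"age": 61, "gender": "M", "site_code": "S1"}]

rows_a, unmapped_a = harmonize(study_a, STUDY_A_MAP)
rows_b, unmapped_b = harmonize(study_b, STUDY_B_MAP)
combined = rows_a + rows_b
```

Returning the unmapped columns separately keeps the gaps visible: with 20-300 columns per dataset, knowing what could not be aligned matters as much as what could.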
Clinical Columns
Dosage and dose response data
Product differentiation
Patient demographics
Concurrent medications
Lab data
Subject experience & adverse events
How fast does it work? How long does it last?
Other Areas
Legal
HR
Finance
Strategic Alliances
“Patent searching is an art, not a science”
New cases, statutes, policies
PGRD has links with >250 partners in academia and industry
More
Summary
KM needs in discovery and clinical are complex, diverse, and sizeable
We need a knowledge architecture that can be used effectively by machines.
Ontologies
Software
Hardware
Acknowledgements
John Wilbanks (W3C)
Enoch Huang (Pfizer)
Eric Neumann (Aventis)
Stephen Dobson (Pfizer)
Mitch Brigell (Pfizer)
Dave Lowenschuss (Pfizer)
Ruth VanBogelen (Pfizer)