neumann_SWLS
Semantic Web in
18-Oct-04
1
Charter outline
One page graphic
Purpose: Mission & Vision
Perimeter: Core competencies & activities
Impact
Engagement priority principles
Operating principles with customer
Appendix
KM Definition
(SWOT)
TBD
Organizational Design
KM Charter
18-Oct-04
2
Industry productivity vs. investment
[Chart: total R&D investment ($ billions) and NMEs by year, 1970 to 2002. Source: PhRMA & FDA 2003]
$897 million, including post-approval R&D costs, to develop a new prescription drug = 250% increase in a decade
Inflation-adjusted
Including failures
Tufts Center, May 2003: $802 million excluding post-approval R&D costs
Rising clinical trial costs; difficulty in recruiting patients
Expanding development programs
More chronic & degenerative diseases
Longer development times
Note: NMEs: ’00: 27, ’01: 24, ’02: 17. 23% NME obtain first approval, D. Kessler, H&Q
R&D Challenges in Drug Discovery
Increase productivity
Improve submissions and approvals
Reduce costs: Clinical and preclinical studies ~80% of total
Segmented patient populations
Complexity of the science and technologies
Capturing the innovation and value
Drug-hunting ability
Knowledge creation and transfer
Consortium & Alliances
Facing a Technology Gap in Drug Innovation
Need to utilize knowledge more effectively
Knowledge Networks within Pharma that need to be supported
Scientists and Researchers
Regulatory (FDA)
Industrial Operations
Research Alliances
Business Process and Management
Competitive and Market Information
Financial
How Can Scientists Work Together Better?
Biology
Geneticists
Pathologists
Molecular Biologists
Cytologists
ADME
Toxicologists
Clinicians
Informatics
Genomicists
Functional Genomicists
BioStatisticians
Bioinformaticists
Cheminformaticists
Dynamics Modelers
DB admins
Data Integration?
Chemistry
Compound Library Chemists
HT Screening
Medicinal Chemists
Synthetic Chemists
Molecular Modelers
Rational Designers
Information Interpretation
“The data clearly shows that the compound series has hERG issues that are exacerbated by its side groups”
“Which side groups?”
Sharing data is not sufficient for sharing insights
Simply annotating findings with TEXT does not solve how to locate such insights
Can researchers find different meaning in the same data?
Merge legacy data with newly generated data
Capture context!
A Major Unmet Challenge: Recognizing Information Interpretation
Seeing the data the same way…
ℐi { I(x) } ~ ℐj { I(x) }
How can one guarantee that scientist i interprets data I(x) the same way as scientist j does?
Social Participation
“[It] refers not just to local events of engagement in certain activities with
certain people, but to a more encompassing process of being active
participants in the practices of social communities and constructing
identities in relation to these communities…. Such participation shapes
not only what we do, but also who we are and how we interpret what we
do.”
- Etienne Wenger, 1999
The Negotiation of Meaning
As described by Friesen:
The meaning of any set of terms, and the significance and utility of any
taxonomy, according to Wenger, can be evaluated only in the context of a
community whose members are involved in similar activities and share
similar values. Wenger calls this process the "negotiation of
meaning:" The production of meanings "that extend, redirect, dismiss,
reinterpret, modify or confirm… the histories of meanings of which they
are a part." (Wenger, 1999; p. 53)
Example: Functional Genomics and Pathologists
Research Process and Knowledge Flow
Knowledge Networks
Expt Design
Data
Analysis
Interpretation
Decision
Action
Communities and Interoperability
Semantic interoperability is tied directly to communities of practice:
“Within a community or domain, relative homogeneity reduces interoperability challenges. Heterogeneity increases as one moves outside of a focal community/domain, and interoperability is likely [to be] more costly and difficult to achieve” (Moen, 2001)
Meanings encoded with an (XML) schema, for use within one community, are defined only implicitly.
Databases can only be used by those who define them.
Why a Semantic Web for Life Science Applications?
Improve Scientific Interactions and Exchanges
Data Integration AND Interpretation
Web-compatible strategies for information encoding and sharing
Sharing Best Practices – Knowledge discovery rules
Knowledge Agents: How can they accelerate science?
Framework for Next Generation of the Web
Knowledge Exchange within a Semantic Web
OWL (Web Ontology Language)
W3C Ontology Specification
Builds on Frames & Description Logic (OWL DL is a decidable fragment of 1st-order logic)
Extensible by members of any community
Structurally based on RDF
RDF (Resource Description Framework)
Basic XML Semantic Format that OWL is based upon
Allows users to merge and aggregate any set of related data and relational components
Refers to Ontologies specified in OWL
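A minimal sketch of the merge-and-aggregate point, assuming triples are modeled as plain Python tuples rather than through an RDF library. The identifiers (ex:cmp42, ex:hERG, and so on) and both source names are hypothetical placeholders, not taken from the talk.

```python
# Hypothetical triples; each is (subject, predicate, object).
chemistry_db = {
    ("ex:cmp42", "ex:inSeries", "ex:seriesXV"),
    ("ex:cmp42", "ex:hasSideChain", "ex:sideChain2"),
}
assay_db = {
    ("ex:cmp42", "ex:inhibits", "ex:hERG"),
    ("ex:cmp42", "ex:inSeries", "ex:seriesXV"),  # same fact recorded in both sources
}


def merge(*graphs):
    """Merge graphs by set union; identical triples collapse into one."""
    merged = set()
    for graph in graphs:
        merged |= graph
    return merged


def describe(graph, subject):
    """Return sorted (predicate, object) pairs known about one subject."""
    return sorted((p, o) for s, p, o in graph if s == subject)


merged = merge(chemistry_db, assay_db)
for predicate, obj in describe(merged, "ex:cmp42"):
    print(predicate, obj)
```

Because both sources name the compound with the same identifier, no join keys or schema mapping are needed in this toy case. Real RDF merging also has to handle blank nodes, which the sketch ignores.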
[Diagram labels: OWL, RDF, Defines, Structured]
Smarter, Searchable Annotations (Chemistry)
Free-Text
“The side chain on this compound improves GI transport significantly”
Text found only if compound already selected
Free-Text with Link
“As evidenced (PKID:392384), the side chain on this compound improves GI transport significantly”
Link can be used to find all compounds referencing it, but reason for link is unclear
RDF Statement
<side chain “#element=2”> <improves> <GI transport>
Search feasible for any side chain improving “GI transport”, or semantically related impact
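The contrast between free text and a structured statement can be sketched as follows, assuming annotations are stored as (subject, predicate, object) tuples. The compound names, the "intestinal absorption" term, and the broader-term table are hypothetical stand-ins for a real ontology.

```python
# Hypothetical annotations as (subject, predicate, object) triples.
annotations = [
    ("cmpA/sideChain2", "improves", "GI transport"),
    ("cmpB/sideChain1", "improves", "intestinal absorption"),
    ("cmpC/sideChain3", "reduces", "GI transport"),
]

# Hypothetical "narrower term -> broader term" links, standing in for an ontology.
broader = {"intestinal absorption": "GI transport"}


def is_related(term, target):
    """True if term equals target or reaches it through broader-term links."""
    seen = set()
    while term is not None and term not in seen:
        if term == target:
            return True
        seen.add(term)
        term = broader.get(term)
    return False


def find(predicate, target):
    """Subjects annotated with the predicate on the target or a related term."""
    return [s for s, p, o in annotations if p == predicate and is_related(o, target)]


hits = find("improves", "GI transport")
print(hits)
```

The query never needs a compound to be selected first, and the predicate states why the side chain and the effect are linked, which the bare PKID link does not.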
Smarter, Searchable Annotations (Proteins)
Free-Text
“The domain on this protein regulates catalytic activity significantly”
Text found only if protein already selected
Free-Text with Link
“As evidenced (PKID:8832), our compound series interacts with the catalytic site”
Link can be used to find all proteins referencing this link, but reason for link is unclear
RDF Statement
<domain “#element=2”> <interacts> <Cmp Series XV>
Search feasible for any protein domain interacting with “Compound Series XV”, or semantically related binding
Aggregation through Semantics (OWL)
PROTEIN
GENE
mRNA
CASCADE PATHWAY
MICROARRAY
EXPERIMENT
LOCALIZATION
BIO-PROCESS
INTERVENTION POINT
DISEASE
TARGET MODEL
DRUG
TREATMENT
Data Sources
Courtesy of BeyondGenomics
New Data Paradigm for Research
More than a collection of tables for Set-selection
[Diagram labels: Query, Upload, Results, Search, Aggregate]
Data can evolve with additions of attributes and properties as well as through new inferences
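One way to picture data that evolves through added properties and new inferences is a small forward-chaining sketch over triples. The facts, the predicate names, and the single subclass rule are illustrative assumptions, not the talk's data model or a full OWL reasoner.

```python
# Hypothetical facts; each is (subject, predicate, object).
facts = {
    ("kinaseX", "type", "Kinase"),
    ("Kinase", "subClassOf", "Enzyme"),
    ("Enzyme", "subClassOf", "Protein"),
}


def infer_types(triples):
    """Add (x, type, C2) whenever (x, type, C1) and (C1, subClassOf, C2) hold.

    Repeats until no new triple appears, so chains of subclasses are followed.
    """
    known = set(triples)
    while True:
        new = {
            (x, "type", c2)
            for x, p1, c1 in known if p1 == "type"
            for sub, p2, c2 in known if p2 == "subClassOf" and sub == c1
        } - known
        if not new:
            return known
        known |= new


# A new property is added later without altering any table definition.
facts.add(("kinaseX", "localizedIn", "cytoplasm"))

closed = infer_types(facts)
print(sorted(t for t in closed if t[0] == "kinaseX"))
```

Here the statements that kinaseX is an Enzyme and a Protein were never entered by anyone; they follow from the class hierarchy.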
New Sharing Paradigm for Research
Sharing discoveries in a Context
Semantic Communities
[Diagram labels: Vision; Local Ontology; Ontology; Projects; Platform Space; Disease Area; FGx; rDB; Annotated Literature; Extended Ontology; Central Referenced DB; Chem; Local DB’s]
Thank You
Science
January 24, 2003
Information vs. Knowledge
“Information is data that is endowed with relevance or purpose.
Converting data into information thus requires knowledge.
And knowledge, by definition, is specialized. (In fact, truly knowledgeable
people tend toward overspecialization, whatever their field, precisely because
there is always so much to know.)” - Peter Drucker, 1988
The conversion of data to information or knowledge is an interpretive process
that implies a sociological context:
“It entails personal involvement in and commitment to specific practices, and
Communities…
Encoding is not an isolated activity, defined by mechanical conciseness:
“All of this seems to suggest that the significance of words
and descriptions in metadata may not be so much a matter
of clear and unambiguous definition ... Instead, it is more a
matter of doing, acting, and belonging.”
- Norm Friesen, 2002
…and Meaning
Practice both defines and requires meaning:
“This focus on meaningfulness is… not primarily on the technicalities of
‘meaning.’ It is not on meaning as it sits locked up in dictionaries. It is not
just on meaning as a relation between a sign and a reference…. Practice
is about meaning as an experience of everyday life.”
- Etienne Wenger, 1999
Explicit vs. Tacit Knowledge
Negotiating Meaning helps take what is implicit and make it explicit,
tangible, and codifiable.
Context is essential for framing implicit knowledge
No knowledge is formally either tacit or explicit: when meaning is
negotiated so that interpretations and insights can be effectively shared
within a context, then what was tacit is now reified.
Communities, if defined appropriately through common semantics, can
capture any knowledge that is viewed as relevant and timely, thereby
making it functional.
Semantic Web for Life Sciences
What SWLS is:
W3C Discussion Forum for Scientists and Informaticists
Identifying critical needs and defining them as use cases
Help define the relation between information and (codified) knowledge
Effective formation and interaction of research communities
What SWLS isn’t:
Standards Group
SIG for Vendors
Closed Consortium for Industry
Semantic Web Life Science
W3C Workshop Oct 27, 28 Formation of Work Groups
Activities
Mailing List: [email protected]
ISMB 2005 (Detroit) – Semantic Web Track
Coordination with BioPAX, GeneOntology, UniProt, NCI, etc
SWLS Resources
Semantic Web: http://www.w3.org/2001/sw/
RDF: http://www.w3.org/rdf
SWLS: http://esw.w3.org/topic/SemanticWebForLifeSciences
DB wrapper: http://www.w3.org/2004/04/30-RDF-RDB-access/