Dynamic Cataloging of the Autism Phenome: Using Structured
An Ontology-Based Approach for
Computational Phenomics:
Application to Autism Spectrum
Disorder
Amar K. Das, MD, PhD
Departments of Medicine and
of Psychiatry and Behavioral Sciences
Stanford Center
for Biomedical Informatics Research
Outline
Motivations
NDAR project
Phenologue project
Future Directions
NCBO Webinar
October 7, 2009
Motivation
Psychiatric Genetics
Phenotyping
Terminology
Ontology
Logic
Hasler G, et al. Toward constructing an endophenotype strategy for
bipolar disorders. Biological Psychiatry (2006)
Represent findings and their links using structured knowledge
Phenomics
“A primary task for the new field of
phenomics will be to clarify what, in
practical terms, constitutes a
phenotype and then to delineate the
different phenotypic components that
compose the phenome.”
Freimer & Sabatti, Nature Genetics (2003)
OMIM
dbGaP
Mailman, M.D. Nature Genetics (2007)
PhenoWiki
Current Approaches
Lack of standardization
Lack of organization
Lack of computability
Autism DSM-IV Diagnosis
A total of six (or more) items from (1), (2), and (3), with at least
two from (1), and one each from (2) and (3)
(1) qualitative impairment in social interaction, as manifested by
at least two of the following:
a) marked impairments in the use of multiple nonverbal behaviors
such as eye-to-eye gaze, facial expression, body posture, and
gestures to regulate social interaction
b) failure to develop peer relationships appropriate to
developmental level
c) a lack of spontaneous seeking to share enjoyment, interests, or
achievements with other people, (e.g., by a lack of showing,
bringing, or pointing out objects of interest to other people)
d) lack of social or emotional reciprocity
Autism DSM-IV Diagnosis
(2) qualitative impairments in communication as manifested
by at least one of the following:
a) delay in, or total lack of, the development of spoken
language (not accompanied by an attempt to compensate
through alternative modes of communication such as
gesture or mime)
b) in individuals with adequate speech, marked impairment
in the ability to initiate or sustain a conversation with others
c) stereotyped and repetitive use of language or
idiosyncratic language
d) lack of varied, spontaneous make-believe play or social
imitative play appropriate to developmental level
Autism DSM-IV Diagnosis
(3) restricted repetitive and stereotyped patterns of
behavior, interests and activities, as manifested by at least one of
the following:
a) encompassing preoccupation with one or more stereotyped
and restricted patterns of interest that is abnormal either in
intensity or focus
b) apparently inflexible adherence to specific, nonfunctional
routines or rituals
c) stereotyped and repetitive motor mannerisms (e.g., hand or
finger flapping or twisting, or complex whole body movements)
d) persistent preoccupation with parts of objects
Delays or abnormal functioning in at least one of the following
areas, with onset prior to age 3 years: (1) social interaction, (2)
language as used in social communication, or (3) symbolic or
imaginative play
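The item-count rule that opens the criteria (six or more items in total, at least two from (1), and at least one each from (2) and (3)) can be written as a small check. This is only a sketch in Python: the function name and the use of letter codes for items are assumptions made for illustration, and the check covers the item count only, not the onset-before-age-3 condition.

```python
from typing import Set


def meets_dsm_iv_item_count(
    social: Set[str], communication: Set[str], repetitive: Set[str]
) -> bool:
    """Item-count rule only: six or more items in total, at least two
    from (1), and at least one each from (2) and (3).

    Each argument is the set of endorsed item letters ("a".."d") for one
    criterion group; the letter coding is a hypothetical convention.
    """
    total = len(social) + len(communication) + len(repetitive)
    return (
        total >= 6
        and len(social) >= 2
        and len(communication) >= 1
        and len(repetitive) >= 1
    )


# Three social items, two communication items, one repetitive-behavior item.
print(meets_dsm_iv_item_count({"a", "b", "d"}, {"b", "c"}, {"a"}))
```

A subject with many items overall but only one social-interaction item fails the check, which is the kind of structure a free-text diagnosis does not make computable.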
NDAR (ndar.nih.gov)
Goals of NDAR
Develop standards to promote meta-analyses and cross-site research data
comparisons
Provide researchers access to useful
software tools and infrastructure
Promote the sharing of research data
relevant to ASD
NIH Research Support in Autism
$100 million/year in funding
Investigator-initiated grants (R01s)
Special initiatives, e.g. RFA for genetics
Centers and networks
Training grants (to institutions and individuals)
New initiatives
Intramural Research Program on Autism
Autism Centers of Excellence (ACE)
National Database for Autism Research (NDAR)
ARRA stimulus program
BIRN Mediator
NDAR System
Clinical Assessments (OpenClinica)
Neuroimaging
Subject Tracking & Management
Image Analysis
Common Measures
Study Management
Image Processing
Genomics
BIRN Services & Resources
Security
Genomics data access
Portal
Image data access
Grid Computing
Collaboration
Data Integration
Query and Reporting
User Management
Data Integration Tools
Auditing
Data Storage Management
NDAR Codebook
Phenotypes in Psychiatry
‘The observable structural and functional
characteristics of an organism determined by
its genotype and modulated by its
environment’
Diagnostic component
Intermediate phenotype
Quantitative phenotype
Covariates
Example Query #1
Find all subjects who are verbal (ADIR
A14). Then look at their IQ (Cognitive
Total IQ > 70) and whether or not they
have seizures (Medical History Q10).
Also find out if they have an abnormal
MRI or any genetic abnormalities.
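A query of this shape can be sketched over in-memory records. The sketch below is in Python and is not the NDAR schema: the `SubjectRecord` class and all of its field names are hypothetical stand-ins for the instrument items the query mentions (ADI-R A14, Cognitive Total IQ, Medical History Q10), and the sample data is invented.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class SubjectRecord:
    subject_id: str
    is_verbal: bool                      # stand-in for ADI-R item A14
    total_iq: Optional[float]            # stand-in for Cognitive Total IQ
    has_seizures: Optional[bool]         # stand-in for Medical History Q10
    abnormal_mri: Optional[bool]
    genetic_abnormality: Optional[bool]


def example_query_1(records: List[SubjectRecord]) -> List[Dict[str, object]]:
    """Keep verbal subjects, then report IQ > 70, seizures, MRI, and genetics."""
    rows: List[Dict[str, object]] = []
    for r in records:
        if not r.is_verbal:
            continue
        rows.append({
            "subject_id": r.subject_id,
            # None means the IQ score is missing, not that it is below 70.
            "iq_above_70": None if r.total_iq is None else r.total_iq > 70,
            "has_seizures": r.has_seizures,
            "abnormal_mri": r.abnormal_mri,
            "genetic_abnormality": r.genetic_abnormality,
        })
    return rows


records = [
    SubjectRecord("S1", True, 85.0, False, None, True),
    SubjectRecord("S2", False, 60.0, True, True, False),
    SubjectRecord("S3", True, 65.0, True, False, None),
]
rows = example_query_1(records)
for row in rows:
    print(row)
```

Even this small version has to join clinical, imaging, and genetic fields on one subject, which is why the query depends on an integrated data model.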
Example Query #2
Use head circumference to categorize
macroencephaly. Then see if the
subjects differ in their ADOS, ADI-R,
cognitive, and language profiles, and
combine this with genetic data.
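The second query has two steps: derive a category from a measurement, then compare profiles across the resulting groups. A minimal Python sketch follows. The field names, the sample values, and the 97th-percentile cutoff are all assumptions for illustration; the slide does not say which cutoff the investigators would use.

```python
from statistics import mean
from typing import Dict, List, Optional

Subject = Dict[str, Optional[float]]


def split_by_head_circumference(
    subjects: List[Subject], cutoff_percentile: float = 97.0
) -> Dict[str, List[Subject]]:
    """Group subjects by whether head circumference exceeds the cutoff.

    The default cutoff is an assumed value, passed as a parameter so a
    different definition can be substituted.
    """
    groups: Dict[str, List[Subject]] = {"large_head": [], "other": []}
    for s in subjects:
        pct = s.get("hc_percentile")
        if pct is None:
            continue  # no measurement, so the subject cannot be categorized
        key = "large_head" if pct > cutoff_percentile else "other"
        groups[key].append(s)
    return groups


def group_mean(group: List[Subject], field: str) -> Optional[float]:
    """Mean of one profile score within a group, ignoring missing values."""
    values = [s[field] for s in group if s.get(field) is not None]
    return mean(values) if values else None


subjects: List[Subject] = [
    {"hc_percentile": 99.0, "ados_total": 18.0, "total_iq": 72.0},
    {"hc_percentile": 98.0, "ados_total": 14.0, "total_iq": 90.0},
    {"hc_percentile": 50.0, "ados_total": 12.0, "total_iq": 101.0},
]
groups = split_by_head_circumference(subjects)
print(group_mean(groups["large_head"], "ados_total"))
print(group_mean(groups["other"], "total_iq"))
```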
NDAR Project
Systematic Review
Ontology Development
Database Infrastructure
Systematic Review
“(ADI-R or ADOS or Vineland) and
(genes or genetics) and autism”
26/43 papers relevant
Mean # phenotypes 4.1, range 1-13
Three basic types (1:1, sum, cutoff score)
Tu, S. W. AMIA Annual Proceedings (2008)
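One reading of the three basic types is that a phenotype is either taken directly from a single instrument item (1:1), computed as a sum over several items, or obtained by comparing a score with a cutoff. The Python sketch below follows that reading. The item names and values are hypothetical and are not taken from the reviewed papers.

```python
from typing import Mapping, Sequence


def one_to_one(scores: Mapping[str, int], item: str) -> int:
    """1:1 type: the phenotype value is a single item's value."""
    return scores[item]


def summed(scores: Mapping[str, int], items: Sequence[str]) -> int:
    """Sum type: the phenotype value is the sum over a set of items."""
    return sum(scores[i] for i in items)


def exceeds_cutoff(value: float, cutoff: float) -> bool:
    """Cutoff type: the phenotype is present when a score exceeds a cutoff."""
    return value > cutoff


# Hypothetical item scores for one subject.
scores = {"item_a": 2, "item_b": 1, "item_c": 3}
domain_total = summed(scores, ["item_a", "item_b", "item_c"])
print(one_to_one(scores, "item_a"), domain_total, exceeds_cutoff(domain_total, 5))
```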
Systematic Review
Different terms
e.g., ‘age of first phrases’ and ‘age of onset
of phrase speech’
Different cutoff scores
e.g., ‘delayed word’
Different definitions
e.g., ‘regression’
e.g., use of different instruments
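The "different terms" problem is the simplest of the three to illustrate: two labels from the literature can be mapped to one canonical term before any comparison is made. The Python sketch below assumes a hand-built synonym table; the canonical label `age_of_first_phrases` is a made-up identifier, not a term from the autism ontology.

```python
from typing import Dict

# Hypothetical synonym table: both labels from the review map to one
# canonical identifier invented for this example.
CANONICAL_TERMS: Dict[str, str] = {
    "age of first phrases": "age_of_first_phrases",
    "age of onset of phrase speech": "age_of_first_phrases",
}


def normalize_term(raw: str) -> str:
    """Lowercase, collapse whitespace, then look up the canonical term.

    Unknown labels are returned in their cleaned form so that they can
    be reviewed rather than silently dropped.
    """
    key = " ".join(raw.lower().split())
    return CANONICAL_TERMS.get(key, key)


print(normalize_term("Age of onset of  phrase speech"))
print(normalize_term("age of first phrases"))
```

A lookup table handles synonyms only. Different cutoff scores and different definitions need explicit, computable definitions, which is the role the ontology and rules play later in the talk.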
Ontology
A taxonomy with multiple link types,
each with precise meaning
Clinical Research Study
Case Study
Clinical Trial Study
Controlled Case Study
Study Arms
Perspectives on ‘Ontology’
Philosophy: The study
of what entities and
what types of entities
exist in reality
Computer Science: A
schema that represents
a domain and is used to
reason about the
objects in that domain
and the relations
between them
Critical to the ‘Semantic Web’
Shared research and development plan to
Provide explicit semantic meaning to data and
knowledge shared on the Web
Bring structure to Web content
Advance the current state-of-the-art in Web
information retrieval, which is keyword searching
Distributed applications will be able to
process data and knowledge automatically
through the use of ontologies
OWL: Web Ontology Language
Advances current Semantic Web standards by
using ontologies to represent knowledge
OWL can be used to build ontologies of high-level
descriptions, based on three concepts:
Classes (e.g., Subject, Phenotype, Genotype)
Properties (e.g., isBearerOf, hasResults)
Individuals (e.g., “Macroencephaly”)
OWL: Web Ontology Language
Subject 011451 -- hasResult --> Genotype mutInRELN
Subject 011451 -- isBearerOf --> Phenotype Macroencephaly
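The classes, properties, and individuals in this example can be written down as subject-property-object triples. The Python sketch below uses plain tuples rather than an OWL library, so it illustrates the data shape only and performs no OWL reasoning. The identifier `Subject_011451` and the `rdf:type` strings are naming choices made for this example.

```python
from typing import List, Set, Tuple

Triple = Tuple[str, str, str]

# Class membership and property assertions from the diagram.
triples: Set[Triple] = {
    ("Subject_011451", "rdf:type", "Subject"),
    ("mutInRELN", "rdf:type", "Genotype"),
    ("Macroencephaly", "rdf:type", "Phenotype"),
    ("Subject_011451", "hasResult", "mutInRELN"),
    ("Subject_011451", "isBearerOf", "Macroencephaly"),
}


def objects(subject: str, prop: str) -> List[str]:
    """All objects linked to a subject through a given property."""
    return sorted(o for s, p, o in triples if s == subject and p == prop)


print(objects("Subject_011451", "hasResult"))
print(objects("Subject_011451", "isBearerOf"))
```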
BIRNLex
A controlled terminology for annotation of
BIRN data sources, focusing on imaging
data from human subjects and mouse
models
Terms cover neuroanatomy, molecular
species, behavioral and cognitive
processes, subject information,
experimental practice and design
Basic Formal Ontology
An upper ontology which can be used
to support the development of domain
ontologies used in scientific research
All concepts are subclasses of
Continuants: exist in full at any time at
which they exist at all
Occurrents: have temporal parts and
happen, unfold or develop through time
OBO Foundry
Ontologies should be orthogonal
Minimize overlap
Each distinct entity type (universal) should
only be represented once
Partition efforts in the OBO Foundry
rationally to help organize and
coordinate the ontology development
RELATION TO TIME / GRANULARITY
CONTINUANT, INDEPENDENT
Organ and organism: Organism (NCBI Taxonomy); Anatomical Entity (FMA, CARO)
Cell and cellular component: Cell (CL); Cellular Component (FMA, GO)
Molecule: Molecule (ChEBI, SO, RnaO, PrO)
CONTINUANT, DEPENDENT
Organ and organism: Organ Function (FMP, CPRO)
Cell and cellular component: Cellular Function (GO)
Molecule: Molecular Function (GO)
Phenotypic Quality (PaTO)
OCCURRENT
Organ and organism: Organism-Level Process (GO)
Cell and cellular component: Cellular Process (GO)
Molecule: Molecular Process (GO)
Chris Mungall, PATO
SWRL: Semantic Web Rule
Language
W3C specification for expressing
logical rules that can be formulated in
terms of OWL concepts
Rules in SWRL can be used to deduce
new knowledge about an existing OWL
ontology
Specification can be extended through
the use of built-ins
Example SWRL Rule: hasUncle
hasParent(?x, ?y) ^ hasBrother(?y, ?z)
→ hasUncle(?x, ?z)
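The hasUncle rule can be read as a join: wherever the parent in a hasParent fact is also the first argument of a hasBrother fact, a new hasUncle fact is derived. The Python sketch below mimics that single deduction over sets of pairs. It is not a SWRL engine, and the names in the sample facts are invented.

```python
from typing import Set, Tuple

Pair = Tuple[str, str]


def infer_uncles(has_parent: Set[Pair], has_brother: Set[Pair]) -> Set[Pair]:
    """hasParent(x, y) ^ hasBrother(y, z) -> hasUncle(x, z)."""
    return {
        (x, z)
        for (x, y) in has_parent
        for (y2, z) in has_brother
        if y == y2  # the shared variable ?y joins the two atoms
    }


has_parent = {("alice", "bob"), ("dave", "erin")}
has_brother = {("bob", "carl")}
print(infer_uncles(has_parent, has_brother))
```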
Example SWRL Rule: hasSister
Person(Amar) ^ hasSibling(Amar, ?s)
^ Woman(?s)
→ hasSister(Amar, ?s)
Example SWRL Rule: Child
Person(?p) ^ hasAge(?p,?age)
^ swrlb:lessThan(?age,17)
→ Child(?p)
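The Child rule shows a built-in used as a filter: `swrlb:lessThan` restricts the rule to persons whose age is below 17. A rough Python equivalent, assuming ages are held in a dictionary keyed by a person identifier (an assumption of this sketch), is:

```python
from typing import Dict, Set


def classify_children(ages: Dict[str, int], limit: int = 17) -> Set[str]:
    """Person(p) ^ hasAge(p, age) ^ lessThan(age, 17) -> Child(p)."""
    return {person for person, age in ages.items() if age < limit}


ages = {"p1": 5, "p2": 17, "p3": 16}
print(sorted(classify_children(ages)))
```

The comparison is strict, so a person aged exactly 17 is not classified as a Child, matching `lessThan` in the rule.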
Rule-Based Methods
Extensions to SWRL
Temporal
Query
Library of temporal built-ins
Extraction of results as a table
MakeSet
Support for set-based operations
Development Methods
Extensions to BIRNLex
Encoding of phenotypes
Querying of NDAR database
Autism Assessment Result
Figure 1. The representation of data collected through the ADI-2003 autism assessment instrument as part of the autism
ontology.
Phenotype Representation
Figure 2. The representation of the Status of age of words phenotype group as an OWL class partitioned by the possible statuses.
Phenotype Rule
ADI_2003_result(?assessment) ^
acqorlossoflang_aword(?assessment,?wordage) ^
swrlb:greaterThan(?wordage, 24) ^
subject_id(?assessment, ?subjectId) ^
orgtax:Human(?subject) ^
subject_id(?subject, ?subjectId)
→ birn_obo_ubo:bearer_of(?subject, Delayed_word)
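Read step by step, the rule takes each ADI_2003_result, keeps those whose `acqorlossoflang_aword` value is greater than 24, and links the assessment to a Human individual that has the same `subject_id`. The Python sketch below mirrors that join. The record layout and the sample values are assumptions; only the property names and the cutoff of 24 come from the rule. The unit of the age value is not stated on the slide (presumably months).

```python
from typing import Dict, List, Optional, Set


def delayed_word_bearers(
    adi_results: List[Dict[str, Optional[float]]],
    humans: Dict[str, float],
    cutoff: float = 24,
) -> Set[str]:
    """Return the human individuals that bear the Delayed_word phenotype.

    adi_results: one dict per assessment, holding 'subject_id' and
                 'acqorlossoflang_aword' (age of first words).
    humans:      maps each human individual to its subject_id.
    """
    late_ids = {
        a["subject_id"]
        for a in adi_results
        if a.get("acqorlossoflang_aword") is not None
        and a["acqorlossoflang_aword"] > cutoff
    }
    # The shared ?subjectId variable joins assessments to subjects.
    return {name for name, sid in humans.items() if sid in late_ids}


adi_results = [
    {"subject_id": 101, "acqorlossoflang_aword": 30},
    {"subject_id": 102, "acqorlossoflang_aword": 12},
    {"subject_id": 103, "acqorlossoflang_aword": None},
]
humans = {"human_A": 101, "human_B": 102, "human_C": 103}
print(delayed_word_bearers(adi_results, humans))
```

Making the cutoff a parameter reflects the systematic review's finding that papers define 'delayed word' with different cutoff scores.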
Phenotype Rules
Ontology-Driven Querying
Young, L. IEEE CBMS (2009)
Phenologue Project
Develop an ontology of endophenotypes that maps brain
connectivity, neural deficits, and genetic markers into a
subject domain theory
Develop logic-based methods to encode and classify
endophenotypes based on multi-scale measurements
Create tools to acquire new endophenotypes and annotate
phenotype-genotype findings in online resources such as
published literature
Develop query-elicitation methods that can evaluate
hypotheses about the subject domain theory of
endophenotypes using deductive inference
Phenologue Project
[Diagram labels: Query, Database, Catalog, Phenotype Definitions, New Associations, Analysis]
Rule Technologies
Rule paraphrasing
Rule elicitation
Rulebase visualization
Knowledge mining using rules
Rule Paraphrasing
Rule Elicitation
Rulebase Visualization
Computational Phenomics
Informatics methods to support
phenomics
Apply machine learning methods to discover
groups of rules with common semantics
Use natural language processing methods to
discover phenotype rules in published text
Semantic Similarity
Future Directions
Expand phenotype categories
Use natural language processing
methods to discover phenotype rules in
published text
Apply machine learning methods to
discover groups of rules with common
semantics
Summary
The development of a standardized,
organized, and computable set of
phenotype terms is central to etiologic
studies of complex disorders
The use of ontologies and rules to
model phenotypes is feasible and can
enable automated discovery of new
phenotype-genotype relationships
Acknowledgments
Stanford Group
Martin O’Connor
Saeed Hassanpour
Duriel Hardy
Ravi Shankar
Lakshika Tennakoon
Samson Tu
National Center for
Biomedical Ontology
Mark Musen
Daniel Rubin
NDAR/NIMH
Lynn Young
Matthew McAuliffe
Dan Hall
Lisa Gilotty
Biomedical Informatics
Research Network
Bill Bug
Maryann Martone