Phenotypes in the Mouse Genome Database: functional screens to

Ontologies and vocabularies
supporting data integration:
emphasis on mouse phenotypes
and disease model
Homozygous
Fasl<gld>/Fasl<gld>
The mouse generalized lymphoproliferative
disease (gld) mutation in the FAS ligand (TNF
superfamily, member 6) gene.
These mice model human Autoimmune
Lymphoproliferative Syndrome; ALPS, type IB
Control
C3H/HeJ
Janan T. Eppig
PATO Meeting, Dec. 2006
The genetic tools for mouse provide an
ideal platform for experimentation:
• Mammal: small, easy to breed and maintain, short lifespan
• Similar to human genetically & physiologically
…facilitating the use of the mouse as a model for human biology
by providing integrated access to data on the genetics, genomics,
and biology of the laboratory mouse.
Achondroplasia
Homozygous achondroplasia
mouse mutant and control
• short domed skull
• short-limbed dwarfism
• malocclusion
• bulging abdomen as adults
• respiratory problems
• shortened lifespan
www.informatics.jax.org
Objective
…make phenotype and disease model data robust and
accessible to researchers and computational biologists
• semantically consistent search methods
• integrated access to all phenotypic variation sources
(single-gene and genomic mutations, QTLs, strains)
• ability to query across sequence, orthology, expression,
function, phenotype, disease
• data on human disease correlation
• access to mouse models from various approaches
- Genetic
- Phenotypic
- Computational
Existing Wealth of Mouse
Phenotype Data in MGI
>16,800 phenotypic alleles representing
≈6,830 unique genes.
>71,000 annotations associating MP terms
to genotypes.
>6,550 phenotype records for 3,210 QTL.
>9,000 strains catalogued.
A few of the challenges
• alleles can produce pleiotropic phenotypic effects
• non-allelic mutations can produce indistinguishable phenotypes
• modifiers and epistasis can influence mutant phenotypes
• alleles of different genes can interact to produce unique phenotypes
• genetic background can greatly influence mutant phenotypes
• imprinted genes/alleles influence phenotype
• quantitative trait loci (QTLs) can contribute unequally to phenotypes
• genomic mutations can delete or disrupt multiple genes
• strains (“whole-genome”) have characteristic phenotypes
• complex genetically engineered and multiple mutation stocks are
often developed for disease models
• environmental influences and age can dramatically affect phenotype
Data Challenge
Mouse phenotype data from
• publications
• electronic submissions
• mutagenesis (ENU centers)
(≈ 300 new alleles; ≈ 700 publications per month on phenotypes)
New initiatives to knock out every gene in
the mouse in the next 5 years…
Need for efficiency, accuracy, full description of complex
observations, storage/analysis of individual and population data
Making semantic sense
Controlled vocabularies/nomenclatures
• Strains
• Genes
• Alleles (phenotypic or variant)
• Classes of genetic markers
• Types of mutations
• Types of assays
• Developmental stages
• Tissues
• Clone libraries
• ES cell lines
….. organized as lists or simple hierarchies
[Figure labels: Gene Symbols; Inbred Strain Names; Clone Library Names; Assay; Gene nomenclature; Specimen; Results]
Hbp1 (high mobility group box transcription factor 1)
gene expression differences in Kit<W-e>/Kit<W-e>
homozygotes vs wild-type
Semantics plus relationship data
Ontologies/structured vocabularies
• Gene Ontology (GO)
- Molecular function
- Biological process
- Cellular component
• Mouse Anatomy (MA)
- Embryonic
- Adult
• Mammalian Phenotype (MP)
• Sequence Ontology (SO)
….. organized as directed acyclic graphs (DAGs)
1. Gene Page
2. Phenotype Query
Summary: phenotype classes & human disease associated
3. MP Ontology
Summary: genotype, MP term, & ref
4. Disease vocabulary
5. Sequence (GBrowse)
Human/mouse disease relationships
Phenotype detail, including genotypes for mouse models of human diseases
Navigating the views of phenotypes & disease
Genotype = allele combinations carried in the
context of a specific genetic background (strain)
enlarged brain ventricles: L1cam<tm1Mtei>/Y
• 129/SvEv: none affected
• C57BL/6J: high percentage affected
postnatal death: Gnas<tm1Kel-pat>/Gnas<+>
• 129/Sv * C57BL/6J: most die by P2; all by P9
• 129/Sv * C57BL/6J * CD-1: most die by P9; 10-20% survive past P21
TMEV viral susceptibility: Cd8a<tm1Mak>/Cd8a<tm1Mak>
• C57BL/6: inflammation after infection resolves by 45 days; disease is absent by 10 mo.
• PL/J: viral infection persists
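The definition of a genotype as an allele combination plus a genetic background can be sketched as a small data structure. This is only an illustration, not MGI's schema: the class name, field names, and helper function are hypothetical, and angle brackets stand in for superscripted allele symbols. The two outcomes are the L1cam examples from the slide.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class Genotype:
    """Allele pair plus the strain background it is carried on (hypothetical model)."""
    allele_pair: Tuple[str, str]
    background: str


# Outcomes keyed by genotype; the values are the L1cam examples from the slide.
observations: Dict[Genotype, str] = {
    Genotype(("L1cam<tm1Mtei>", "Y"), "129/SvEv"): "none affected",
    Genotype(("L1cam<tm1Mtei>", "Y"), "C57BL/6J"): "high percentage affected",
}


def outcomes_for_alleles(allele_pair: Tuple[str, str]) -> Dict[str, str]:
    """Return background -> outcome for every genotype carrying this allele pair."""
    return {
        g.background: outcome
        for g, outcome in observations.items()
        if g.allele_pair == allele_pair
    }
```

Because the background is part of the key, the same allele pair on two strains is stored as two distinct genotypes, which is what lets the differing outcomes coexist.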
Mammalian
Phenotype
Ontology
• Structured as DAG
• Over 4,500 terms
covering physiological
systems, behavior,
development and
survival
• Available in browser
and OBO formats from
MGI ftp and OBO sites
• Each term linked to
all annotations to the
term or its children
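A minimal sketch of how "annotations to the term or its children" could be computed from an OBO-format file. Everything here is assumed for illustration: the term IDs (EX:…), term names, and genotype labels are invented, the parser handles only the `id`, `name`, and `is_a` tags, and this is not MGI's implementation. One term has two parents to show why the structure is a DAG rather than a tree.

```python
from collections import defaultdict

# Hypothetical OBO snippet; real MP term IDs and names are not used here.
OBO_TEXT = """\
[Term]
id: EX:0000001
name: phenotype root

[Term]
id: EX:0000002
name: skeleton phenotype
is_a: EX:0000001 ! phenotype root

[Term]
id: EX:0000003
name: vision phenotype
is_a: EX:0000001 ! phenotype root

[Term]
id: EX:0000004
name: eye socket phenotype
is_a: EX:0000002 ! skeleton phenotype
is_a: EX:0000003 ! vision phenotype
"""


def parse_obo(text):
    """Collect term names and a parent -> children map from [Term] stanzas."""
    names, children = {}, defaultdict(set)
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line == "[Term]":
            current = None
        elif line.startswith("id: "):
            current = line[4:]
        elif line.startswith("name: ") and current:
            names[current] = line[6:]
        elif line.startswith("is_a: ") and current:
            parent = line[6:].split("!")[0].strip()  # drop trailing comment
            children[parent].add(current)
    return names, children


def descendants(term, children):
    """All terms reachable from `term` by following child links."""
    seen, stack = set(), [term]
    while stack:
        for child in children.get(stack.pop(), ()):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


def genotypes_for(term, children, annotations):
    """Genotypes annotated to `term` or to any of its descendants."""
    terms = {term} | descendants(term, children)
    return sorted({g for t in terms for g in annotations.get(t, ())})


names, children = parse_obo(OBO_TEXT)
annotations = {  # hypothetical term -> genotype labels
    "EX:0000002": ["genotype C"],
    "EX:0000003": ["genotype B"],
    "EX:0000004": ["genotype A"],
}
```

With this data, a query on the skeleton term picks up genotype A through the shared child, and so does a query on the vision term.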
Summary
Results
• Genotypes that
are annotated
to a term or
children of the
term
• References
supporting
annotation
• Links to allele
detail pages
for full mutant
phenotype
Allele Detail
Page
• full phenotype
annotations (MP) for
each genotype
• specific detail for MP
terms
• each MP annotation
referenced
• human diseases for
which genotype is used
as a model
Genes
associated with
phenotypes
characteristic
of a disease in
human, mouse,
or both
Mouse model
genotypes
linked to
phenotype
details
osteopetrosis
Human-mouse disease relationships
OMIM terms: 6,113
Genotypes associated w/ OMIM: 1,847
OMIM associated w/ genotypes: 720
to Human Disease and
Mouse Model Page
Vocabularies in MGI
Definition
Vocabulary
Synonyms
MP:1956
Note
DAGs
Terms
Growth retardation
Genotype
EE
J:65322
Dilated renal tubules
IDA
J:62648
Postnatal lethality
TAS
J:65378
Respiratory failure
…
…
Annotations
Strain: AEJ
Alleles: bd/bd
Strain: C57BL/6
Alleles: Ppp1r3a<tm1Adpt>/Ppp1r3a<tm1Adpt>
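The relationship sketched on this slide, where an annotation ties a vocabulary term to a genotype with an evidence code and a reference, might be modeled roughly as follows. The record layout, the genotype labels, and the pairing of each term with an evidence code and J: number are assumptions read off the order of items on the slide; this is not the MGI schema.

```python
from dataclasses import dataclass
from typing import List

# Evidence codes that appear on the slide, treated here as a controlled vocabulary.
EVIDENCE_CODES = {"EE", "IDA", "TAS"}


@dataclass(frozen=True)
class Annotation:
    """One term-to-genotype annotation (hypothetical record layout)."""
    term: str
    genotype: str   # placeholder label, not a real genotype identifier
    evidence: str
    reference: str

    def __post_init__(self):
        # Reject values outside the controlled vocabulary.
        if self.evidence not in EVIDENCE_CODES:
            raise ValueError(f"unknown evidence code: {self.evidence}")


# Pairings follow the item order on the slide and are illustrative only.
annotations: List[Annotation] = [
    Annotation("Growth retardation", "genotype-1", "EE", "J:65322"),
    Annotation("Dilated renal tubules", "genotype-1", "IDA", "J:62648"),
    Annotation("Postnatal lethality", "genotype-2", "TAS", "J:65378"),
]


def terms_for_genotype(records: List[Annotation], genotype: str) -> List[str]:
    """Terms annotated to one genotype, in input order."""
    return [r.term for r in records if r.genotype == genotype]
```

Validating the evidence code at construction time is one way a generalized vocabulary structure keeps free text out of fields that are meant to be controlled.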
Making Mammalian Phenotype
Ontology Work
DAG
• accommodate bio-specific terms
• computationally useful
• human accessible
• practical for curation
• cross-reference to other ontologies
Terms in MP
MP term | Entity | Quality | Other Info
microphthalmia | eye | small size |
hydrocephaly | cerebrospinal fluid | increased | excessive
hydrocephaly | brain | large size | (dilated)
trauma
observed
brain
?
increased
increased
blood pressure
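The entity/quality breakdown of MP terms can be sketched as a lookup table. The structure and names below are hypothetical, and the column assignment for hydrocephaly is one reading of the flattened slide, so treat the rows as illustrative rather than as curated decompositions.

```python
from typing import Dict, List, NamedTuple, Optional


class EQ(NamedTuple):
    """One entity-quality statement, with optional extra information."""
    entity: str
    quality: str
    other: Optional[str] = None


# Rows as read from the slide; the hydrocephaly columns are an assumption.
MP_TO_EQ: Dict[str, List[EQ]] = {
    "microphthalmia": [EQ("eye", "small size")],
    "hydrocephaly": [
        EQ("cerebrospinal fluid", "increased", "excessive"),
        EQ("brain", "large size", "(dilated)"),
    ],
}


def terms_affecting(entity: str) -> List[str]:
    """MP terms whose decomposition mentions the given entity."""
    return sorted(
        term
        for term, statements in MP_TO_EQ.items()
        if any(eq.entity == entity for eq in statements)
    )
```

Once terms are decomposed this way, a query by anatomical entity no longer depends on the wording of the pre-composed MP term.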
Future MP Ontology Development
• New terms from ongoing curation process
• Collaborative community efforts
• identify new terms
• revise organization of existing terms within particular branches
• Recruit domain experts for systematic review
• Cross-ref and comparison to other relevant ontologies (GO, Anatomy, Cell Type, Mpath, etc.)
[Chart: MP Ontology Growth, 0 to 4,500 terms, 2003 to 2006]
Collaborators
…currently annotating with MP and
contributing to MP development
• Rat Genome Database (RGD)
• Mouse Mutagenesis Centers
• Human (NCBI/dbSNP)
• Online Mendelian Inheritance in Animals (OMIA)
…under discussion
• Teratology Society
• Animal Traits
Summary
• Structured vocabularies and ontologies support semantic
integration for the MGI system and promote broader integration of
mouse knowledge
• To meet community needs, practical implementations parallel
formal ontological development
• MGI has implemented a generalized structure for vocabularies and
ontologies
• The Mouse Genome Informatics group continues its strong interest
and participation in community bio-ontology efforts
Human FOXN1
forkhead box N1
T-CELL
IMMUNODEFICIENCY,
CONGENITAL ALOPECIA,
AND NAIL DYSTROPHY
Frank J, et al. Nature 398, 473 - 474 (1999)
Mouse Foxn1
Homozygous “nude” mouse. One of
eight known phenotypic mutations in
mouse (6 spontaneous; 2 engineered)
for the forkhead box N1 gene.