Transcript - Vancouver Bioinformatics User Group

InnateDB – Facilitating Systems Level Analyses of the
Mammalian Innate Immune Response
Lynn et al., Molecular Systems Biology 4:218
(www.nature.com/msb)
David Lynn M.Sc., Ph.D.,
Research Associate, Brinkman Lab., Department of Molecular Biology and Biochemistry, Simon Fraser University
&
Hancock Lab., Centre for Microbial Diseases & Immunity Research, University of British Columbia, Vancouver, Canada.
VanBug Sept. 11th 2008.
The Innate Immune Response

Traditionally, the immune response is divided into two branches:
the adaptive immune response and the innate immune response.

The innate immune response is the first line of defence against invading pathogens.
research.dfci.harvard.edu/innate/innate.html
The Innate Immune Response

In the last decade or so, there has been an explosion of interest in the innate immune
response.

It is now appreciated that most pathogens to which we are exposed are eliminated via the
innate immune response without necessarily requiring the activation of adaptive
immunity.

The importance of the innate immune response is being recognized in:
 the initiation of, and interplay with, the adaptive immune response.
 the mechanism by which vaccine adjuvants operate in boosting immunity.

It can also be a double-edged sword: if not tightly regulated, an overwhelming
immune response can lead to what is sometimes called a cytokine storm.

One such out-of-control response, sepsis, results in more than 200,000 deaths a year
in the United States alone (Angus et al., 2001).
Systems Biology Approaches to Investigating
the Innate Immune Response:

Progress has been made in understanding the innate immune response including the detailed
dissection of some of the critical signaling pathways involved.

Key pathways: Toll-like receptor (TLR) pathways; nucleotide-binding and oligomerization domain
(NOD)-like receptors (NLRs); retinoic acid-inducible gene I (RIG-I)-like receptors (RLRs); NF-κB
signalling, JAK-STAT, MAPK and complement pathways.

However, it is now becoming clear that the innate immune response does not involve simple linear
pathways but rather complex networks of pathways and interactions, negative feedback loops and
multifaceted transcriptional responses.

To better understand the complexities of the innate immune response and the cross-talk between
its components, complementary systems level analyses and more focused follow-up experimental
approaches are now needed.
Overview of InnateDB Project (www.innatedb.ca)

InnateDB is a molecular interaction and pathway database and analysis platform
that has been developed to facilitate systems level analyses of the complex
networks of pathways and interactions that govern the innate immune response,
the wider immune system and the entire mammalian interactome.

To date, more than 5,000 innate immunity relevant interactions have been
contextually annotated through the review of 1,400 plus publications.


Integrated into InnateDB are novel bioinformatics resources, including:

network visualization software

pathway & ontology over-representation analysis

orthologous interaction network construction

the ability to overlay user-supplied gene expression data in an intuitively displayed
molecular interaction network and pathway context.

These enable biologists without a computational background to explore their data in a
more systems-oriented, yet user-friendly, manner.
Infection models: Malaria (Stanford); Cerebral Malaria (IMR, Australia); Typhoid (Vietnam); Salmonella (Sanger & UBC); HIV (Univ. London); TB (AECM); Shigella (Pasteur).
Modulating innate immune response via natural HDPs and project-developed synthetic peptides.
~100 Mouse KOs (Sanger).
Integrating Gene Expression Data and Interaction Network Data from a Variety of Infection Models to Uncover Involvement of Common/Different Innate Immunity Pathways & their Central Regulators.
The Need for InnateDB & the Manual Curation of Innate
Immunity Relevant Molecular Interactions & Pathways.

Despite the enormous efforts of the major publicly available interaction and pathway
databases to provide as wide-ranging coverage as possible, it quickly became apparent that
the available resources provided poor coverage and detail of the molecular
interactions and pathways relevant to innate immunity.

This information is essential for the systems-oriented interpretation of large-scale
genomics data.

TLR4, one of the most important molecules in the innate immune response, has
relatively few molecular interactions annotated in the major publicly available
interaction DBs.

Five of these DBs combined contained annotated molecular interactions between TLR4
and just 11 other proteins.

Through a review of the literature we have curated, in detail, a further 16 unique
interactions, and provided annotation of nearly 60 different lines of evidence
supporting these interactions.

Relatively new pathways (the NLR and RLR pathways) were not annotated at all in the major pathway
databases.
Contextually Curating Innate Immunity-Relevant
Interactions

Manual curation of > 5,000 innate immunity-relevant
interactions (human and mouse).

These involve 1,700+ genes, from a review of 1,400+
unique publications.

We can often double the number of interactions annotated for a given
gene.

Pathways & interactions are curated with contextual
annotations (supporting publication; participant molecules;
the species; the interaction detection method;
the host system; the interaction type; the cell,
cell-line and tissue types, etc.).

Developed InnateDB submission system software
to allow submission of interaction annotation in an
ontology-controlled and MIMIx & PSI-MI 2.5
compliant manner.

Developed curator tool software to allow curators
to modify existing annotations.
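
As a rough illustration of what an ontology-controlled, contextually annotated record might look like, here is a minimal Python sketch. The class name, field names and the tiny vocabularies are all hypothetical; they are not the InnateDB schema and not actual PSI-MI terms, and the publication identifier is a placeholder.

```python
from dataclasses import dataclass

# Hypothetical controlled vocabularies for illustration only. A real submission
# system would draw these terms from ontologies such as PSI-MI.
SPECIES = {"Homo sapiens", "Mus musculus"}
DETECTION_METHODS = {"coimmunoprecipitation", "two hybrid", "pull down"}
INTERACTION_TYPES = {"physical association", "phosphorylation", "ubiquitination"}


@dataclass
class InteractionEvidence:
    """One line of evidence for an interaction, with its experimental context."""
    pubmed_id: str
    participants: tuple
    species: str
    detection_method: str
    interaction_type: str
    host_system: str = "unspecified"
    cell_type: str = "unspecified"
    tissue: str = "unspecified"

    def problems(self):
        """Return a list of validation problems (empty if the record passes)."""
        found = []
        if not self.pubmed_id:
            found.append("a supporting publication is required")
        if len(self.participants) < 2:
            found.append("an interaction needs at least two participants")
        if self.species not in SPECIES:
            found.append(f"species not in vocabulary: {self.species}")
        if self.detection_method not in DETECTION_METHODS:
            found.append(f"detection method not in vocabulary: {self.detection_method}")
        if self.interaction_type not in INTERACTION_TYPES:
            found.append(f"interaction type not in vocabulary: {self.interaction_type}")
        return found


record = InteractionEvidence(
    pubmed_id="PMID-PLACEHOLDER",  # placeholder, not a real reference
    participants=("TLR4", "MYD88"),
    species="Homo sapiens",
    detection_method="coimmunoprecipitation",
    interaction_type="physical association",
    cell_type="macrophage",
)
print(record.problems())
```

Rejecting free-text values at submission time is what keeps annotations from different curators comparable and searchable.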
Contextually Curating Innate Immunity-Relevant
Interactions

The curation team is focused on reviewing interactions not annotated in other interaction
databases.

Only interactions with published experimental evidence of a direct physical or
biochemical interaction are submitted to InnateDB.

Other types of evidence, such as co-localization, over-expression, microarray or other
inferences, are not deemed acceptable.

Defining what is an innate immunity relevant interaction is a difficult task, due to the
increasing understanding of the complexity of the response & the difficulty in drawing the
line between innate and adaptive immunity.

Systematic curation prioritizes molecules that are well-described members of key
innate immunity signalling pathways.

Curators then annotate experimentally verified interactions between these molecules and any
other molecule, regardless of whether the interacting molecule has any known role in
innate immunity.
 This quickly expands the simple linear view of innate immunity pathways into a
more comprehensive interaction network perspective.
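
The curation rules above can be sketched as a simple filter. This is only an illustration under stated assumptions: the evidence labels, the tuple layout and the gene names `GENE_X`, `GENE_Y` and `GENE_Z` are hypothetical, and the real curation process is a manual literature review rather than an automated step.

```python
# Evidence categories treated as acceptable, following the rule stated above.
# The label strings themselves are hypothetical.
ACCEPTED_EVIDENCE = {"direct physical", "biochemical"}


def select_for_curation(candidates, seed_genes):
    """Keep candidate interactions that involve at least one prioritized (seed)
    molecule and are backed by published, acceptable experimental evidence.

    Each candidate is (molecule_a, molecule_b, evidence_type, is_published).
    The partner molecule does not need any known innate immunity role.
    """
    selected = []
    for a, b, evidence_type, is_published in candidates:
        touches_seed = a in seed_genes or b in seed_genes
        if touches_seed and is_published and evidence_type in ACCEPTED_EVIDENCE:
            selected.append((a, b))
    return selected


seeds = {"TLR4", "MYD88"}
candidates = [
    ("TLR4", "MYD88", "direct physical", True),
    ("TLR4", "GENE_X", "biochemical", True),       # partner with no known immune role
    ("MYD88", "GENE_Y", "co-localization", True),  # rejected: evidence type
    ("GENE_X", "GENE_Z", "direct physical", True), # rejected: no seed molecule
]
print(select_for_curation(candidates, seeds))
```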
Going Beyond Innate Immunity – A Centralized Resource
for Interactions & Pathways

Aside from the well-known signalling pathways, a range of other disparate
processes, including apoptosis, ubiquitination, endocytosis, cell activation and
recruitment, are all required to mount an effective innate immune response.

Adding to this complexity, the borders between the innate and adaptive immune
responses are becoming increasingly blurred.

Furthermore, if we hope to identify new networks or pathways involved in innate
immunity, analyses must include genes and proteins that are, as yet, not known to
play specific roles in the innate immune response.

To address these issues, InnateDB also incorporates data on the entire human and
mouse interactomes.
Going Beyond Innate Immunity – An Integrative Biology
Resource

>105,000 human and mouse interactions
extracted & loaded from BIND, INTACT,
DIP, BIOGRID & MINT DBs.

Cross-referenced genes to > 2,500
pathways from KEGG, PID, BIOCARTA,
INOH, NetPath & Reactome DBs.
 This allows one to visualize and analyze the
interactions associated with a specific
pathway.
 It also supports pathway over-representation analysis (ORA).

Annotation from Ensembl provides details of
human & mouse genes, transcripts and
proteins.

UniProt, Entrez and Gene Ontology provide rich
protein & gene annotation.
Through manual curation & integration of existing data from
publicly available databases we can greatly expand innate
immunity relevant networks.
[Figure panels: TLR4 direct and secondary interactions annotated by the MINT database; TLR4 direct and secondary interactions annotated by InnateDB]
Direct and Secondary Interactions of TLR4 in InnateDB
(~20% of these interactions unique to InnateDB)
Robust Orthology & Gene Order Predictions – Facilitating
Comparative Analysis

The majority of mammalian interaction data
available in InnateDB and other interaction
databases refers primarily to human genes
and proteins.

To facilitate comparative network-based
analysis of the human, mouse and bovine
interactomes, detailed orthology predictions
have been integrated into InnateDB.

Orthology predictions are generated using an in-house method, Ortholuge, which provides
accurate predictions of orthology using a
phylogenetic distance-based approach.

Orthology predictions are further supported
through the development of a human and
mouse gene order and synteny browser.
Advanced Search for Genes & Proteins
Search using a wide variety of terms.
Construct more complex queries using Boolean operators.
Advanced Search for Interactions
Return direct interactors or their secondary
interacting neighbors.
Allows users to search for interactions in
particular cell/tissue types, of particular
interaction types, e.g. phosphorylation,
or detected by a particular experimental method, e.g. coimmunoprecipitation.
To reduce redundancy, interactions in
InnateDB that have the same
participants and interaction type are
grouped together by default. Choose
'No' to return all redundant
interactions separately.
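
The default grouping described above can be sketched as follows. This is an assumption-laden illustration, not InnateDB code: the record layout is hypothetical, and treating participants as an unordered set (so A-B and B-A collapse together) is an assumption.

```python
from collections import defaultdict


def group_redundant(records):
    """Group evidence records that share the same participants and interaction type."""
    groups = defaultdict(list)
    for rec in records:
        # frozenset ignores participant order (an assumption of this sketch).
        key = (frozenset(rec["participants"]), rec["interaction_type"])
        groups[key].append(rec)
    return dict(groups)


records = [
    {"participants": ("TLR4", "MYD88"), "interaction_type": "physical association", "source": "A"},
    {"participants": ("MYD88", "TLR4"), "interaction_type": "physical association", "source": "B"},
    {"participants": ("TLR4", "MYD88"), "interaction_type": "phosphorylation", "source": "C"},
]

grouped = group_redundant(records)
for (participants, interaction_type), evidence in grouped.items():
    print(sorted(participants), interaction_type, len(evidence))
```

Choosing 'No' in the interface would correspond to skipping this step and listing every record separately.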
Search for Particular Interactions or Genes that are in a Specific
Pathway
Select a pathway from the list or
search more than 2,500 pathways by
typing the pathway name in the box.
Pathways and the interactions in them are species specific.
By default, a list of molecular interactions is returned which is
restricted to interactions only between annotated members of the pathway. If
"No" is selected, a more comprehensive list of interactions is
returned, displaying interactions between pathway members and all
other molecules with which they interact.
Search Results: searching for genes of interest e.g. IRAK
View interactions involving
this gene and its encoded
proteins
Interaction Results Page.
View evidence supporting this
interaction & contextual
details e.g. cell type etc
There may be multiple
evidence references for an
interaction.
Network/Pathway Visualization
using Cerebral and Cytoscape
Barsky A, Gardy JL, Hancock REW, and Munzner T. (2007)
Cerebral: a Cytoscape plugin for layout of and interaction with biological networks using subcellular
localization annotation.
Bioinformatics 23(8):1040-2.
How a biologist thinks of a pathway ….
Pathway Visualization in Cytoscape
Pathway Visualization using Cerebral
(Bioinformatics 2007)
www.pathogenomics.ca/cerebral
Visualize Interactions in a subcellular localization-based layout using
the Cerebral plugin for Cytoscape.
Click here to visualize
interactions in Cerebral
You must have a recent
version of Java installed.
Network Visualization using Cerebral Plugin for Cytoscape
www.pathogenomics.ca/cerebral (Bioinformatics 2007)
InnateDB Web Start Version of Cerebral

Cerebral uses subcellular localization annotations to provide more biologically intuitive, pathway-like layouts of interaction networks.

Note: the subcellular localizations in Cerebral should only be used as a guide.


There are many proteins with no annotated subcellular localization and many others that
have multiple possible localizations (only one will be shown; nuclear, extracellular and
membrane localizations take precedence over cytoplasm if there are multiple).

Localizations are inferred from manual curation or Gene Ontology annotation in InnateDB,
either directly based on cellular compartment annotation or indirectly via functional
annotation, e.g. proteins annotated with the term "transcription factor activity" will be placed in the
nucleus.

Details of all annotated subcellular localizations are provided in the node attribute browser
tab.
Lines connecting nodes represent interactions. Dashed lines have only 1 supporting publication in
InnateDB. The thicker the line the more publications support the interaction.
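
The display rules on this slide can be sketched in a few lines of Python. This is not Cerebral's implementation. Three things are assumptions of the sketch: the relative order among nucleus, extracellular and membrane (the slide only says all three outrank cytoplasm), the use of functional annotation only when no compartment annotation exists, and the numeric line widths.

```python
# Assumed precedence order; only "cytoplasm last" comes from the slide.
PRECEDENCE = ("nucleus", "extracellular", "membrane", "cytoplasm")

# Indirect inference from functional annotation, using the slide's one example.
FUNCTION_HINTS = {"transcription factor activity": "nucleus"}


def display_localization(compartments, functions=()):
    """Pick the single localization to display for a protein."""
    candidates = set(compartments)
    if not candidates:
        # Fall back to functional annotation (an assumption of this sketch).
        candidates = {FUNCTION_HINTS[f] for f in functions if f in FUNCTION_HINTS}
    for localization in PRECEDENCE:
        if localization in candidates:
            return localization
    return "unknown"


def edge_style(publication_count):
    """Dashed line for a single supporting publication; thicker with more support."""
    if publication_count < 1:
        raise ValueError("an interaction needs at least one supporting publication")
    # Width values are arbitrary and capped so heavily cited edges stay readable.
    return {"dashed": publication_count == 1, "width": min(1 + publication_count, 6)}


print(display_localization(["cytoplasm", "membrane"]))
print(display_localization([], ["transcription factor activity"]))
print(edge_style(1), edge_style(4))
```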
Integrating Gene Expression Data in
a Molecular Interaction Network and
Pathway Context
InnateDB – Integrating Gene Expression Data in a Molecular Interaction
Network and Pathway Context
InnateDB Data Analysis Page – File Upload
Specify column in your file
containing external database IDs
you wish to use to cross-reference
to InnateDB.
Ensembl, Entrez Gene, RefSeq and
UniProt IDs are all accepted.
InnateDB Data Analysis Page – Return associated Pathways,
Genes, Interactors or Interactions
Choose Interactors to return list of
molecules interacting with genes in
uploaded list.
Choose Genes to return associated
gene annotation for genes in
uploaded list.
Useful for finding gene ontology
annotation, orthologs and other
annotation.
Choose Pathways to return
associated pathways and do
pathway over-representation
analysis
Integrate Gene Expression Data with Gene, Pathway and Interaction
Annotation in InnateDB
Pathway & Gene Ontology Over-representation Analysis

Several different statistical methods are available to determine whether particular pathways/GO terms
are significantly associated with differentially expressed (DE) genes: the hypergeometric, Fisher's exact and chi-square tests.

Statistics are returned separately for pathways/terms associated with up-regulated and down-regulated genes.

Two options to correct for multiple testing are included: the Benjamini & Hochberg correction for
the false discovery rate (FDR) and the more conservative Bonferroni correction.

The pathway analysis tool is expected to be more powerful than many existing tools, which tend to
use only one or two sources of pathway annotation for analysis.

We recommend uploading all genes from an array dataset, not just the DE genes.
 InnateDB uses the proportion of DE genes on the whole array to more accurately determine
whether a particular pathway is significant.

There is also an option to carry out pathway/ontology over-representation analysis (ORA) on a subset of genes.
 This uses a slightly different algorithm that does not take gene expression values into account.

If there are multiple probes for the same gene, their values are averaged for the purposes of
the pathway and ontology ORA.
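
To make the statistics concrete, here is a minimal textbook sketch of pathway ORA using the hypergeometric test with both multiple-testing corrections. It is not InnateDB's implementation; the function names, the toy gene identifiers (`g0`, `g1`, ...) and the pathway names are hypothetical. The whole array serves as the background, which is why uploading all genes matters.

```python
from collections import defaultdict
from math import comb


def average_probes(probe_values):
    """Average the values of multiple probes that map to the same gene."""
    by_gene = defaultdict(list)
    for gene, value in probe_values:
        by_gene[gene].append(value)
    return {gene: sum(vals) / len(vals) for gene, vals in by_gene.items()}


def hypergeom_upper_tail(N, K, n, k):
    """P(X >= k) when drawing n genes from N, of which K are DE."""
    total = comb(N, n)
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(K, n) + 1)) / total


def pathway_ora(de_genes, array_genes, pathways):
    """Raw p-value per pathway; call once for up- and once for down-regulated genes."""
    background = set(array_genes)
    de = set(de_genes) & background
    pvalues = {}
    for name, members in pathways.items():
        on_array = set(members) & background
        if on_array:
            hits = len(on_array & de)
            pvalues[name] = hypergeom_upper_tail(len(background), len(de), len(on_array), hits)
    return pvalues


def bonferroni(pvalues):
    """Multiply each p-value by the number of tests, capped at 1."""
    m = len(pvalues)
    return [min(1.0, p * m) for p in pvalues]


def benjamini_hochberg(pvalues):
    """Benjamini & Hochberg FDR-adjusted p-values, in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


array_genes = [f"g{i}" for i in range(10)]
up_regulated = ["g0", "g1", "g2"]
pathways = {"pathway_A": ["g0", "g1", "g5", "g6"], "pathway_B": ["g7", "g8"]}
raw = pathway_ora(up_regulated, array_genes, pathways)
names = list(raw)
print(dict(zip(names, benjamini_hochberg([raw[n] for n in names]))))
```

The Bonferroni correction controls the chance of any false positive, so it is stricter than the Benjamini & Hochberg procedure, which controls the expected proportion of false positives among the reported pathways.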
Pathways Associated with Uploaded List
Click here to do pathway overrepresentation analysis.
Choose Parameters for Pathway ORA
Pathways significantly associated with up-regulated genes.
Display Pathways significantly associated with down-regulated genes.
Click here to see a summary of the pathway and what genes are DE.
Pathway Summary Page for All Conditions
Constructing & Analysing Molecular
Interaction Networks
Constructing & Analyzing Networks Using InnateDB

Pathway analysis can be very powerful in determining which annotated pathways are
most significantly associated with DE genes.

Pathway analysis, however, relies on using the association of DE genes to known
biological pathways, which are often annotated as relatively simple linear cascades.

Network analysis moves from this simple view of the signaling response to a more
comprehensive analysis of the molecular interactions between DE genes and their
encoded proteins & RNAs.

It can potentially uncover as yet unknown signaling cascades or pathways, functionally
relevant sub-networks and the central molecules, or hubs, of these networks.

All interaction networks in InnateDB can be constructed with integrated gene
expression or other quantitative data from up to 10 conditions.

Networks can be interactively visualized using the Cerebral plugin in Cytoscape, or the
Cerebral layout may be deleted and any other Cytoscape plugin used for layout and
analysis.
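
One simple way to look for hubs is to rank molecules by their number of distinct interaction partners (their degree). The sketch below assumes an undirected edge list with hypothetical node names; degree is only one of several possible centrality measures and is not necessarily what InnateDB or Cytoscape plugins report.

```python
from collections import defaultdict


def rank_hubs(interactions):
    """Return (molecule, degree) pairs, most connected first."""
    partners = defaultdict(set)
    for a, b in interactions:
        if a != b:  # ignore self-interactions when counting partners
            partners[a].add(b)
            partners[b].add(a)
    ranked = [(node, len(neighbours)) for node, neighbours in partners.items()]
    # Sort by descending degree, then by name for a stable order.
    return sorted(ranked, key=lambda item: (-item[1], item[0]))


edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C")]
print(rank_hubs(edges))
```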
Constructing & Analyzing Networks Using InnateDB

By default, if one uploads a list of genes and returns a list of interactions:
 A network of all interactions, both between genes in the list and with any other molecules with
which they interact, is returned.
 This is useful to identify components of the network that interact with DE genes or other
genes of interest.
 It has the potential to identify key regulators of gene expression, even though these
regulators themselves may not be differentially expressed.
Constructing & Analyzing Networks Using InnateDB

Options to reduce complexity of network:

Limit the network to show only interactions between molecules in the uploaded list.
 This is useful, for example, to visualize and analyze a network of DE genes.
Constructing & Analyzing Networks Using InnateDB

Options: Limit the network to show only interactions between molecules annotated as
belonging to a particular pathway.
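
The three network construction options described on these slides can be sketched as filters over an edge list. The function and mode names are hypothetical, and the pathway mode here keeps every interaction between pathway members, which is one possible reading of the option rather than confirmed InnateDB behaviour.

```python
def build_network(interactions, uploaded, mode="all", pathway_members=None):
    """Filter an undirected edge list according to one of three options.

    "all":            any interaction involving an uploaded gene (the default).
    "within_list":    only interactions where both partners were uploaded.
    "within_pathway": only interactions where both partners are pathway members.
    """
    uploaded = set(uploaded)
    members = set(pathway_members or ())

    def keep(a, b):
        if mode == "all":
            return a in uploaded or b in uploaded
        if mode == "within_list":
            return a in uploaded and b in uploaded
        if mode == "within_pathway":
            return a in members and b in members
        raise ValueError(f"unknown mode: {mode}")

    return [(a, b) for a, b in interactions if keep(a, b)]


interactions = [("A", "B"), ("B", "C"), ("C", "D")]
uploaded = ["A", "B"]
print(build_network(interactions, uploaded))
print(build_network(interactions, uploaded, mode="within_list"))
print(build_network(interactions, uploaded, mode="within_pathway", pathway_members=["B", "C", "D"]))
```

The default mode is the one that can surface regulators that are not themselves DE, since it keeps neighbours outside the uploaded list.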
Results: Visualize Gene Expression Data in an Interaction
Network Context
Click here to visualize
interactions in Cerebral with
overlaid gene expression
data.
Multi-experiment View in Cerebral
Click on these buttons in 2 different mini-windows to display changes in gene expression from 1 condition to another in the bigger window.
Cerebral – Multi-Array Viewer
Click “Node Attribute Browser” and select from list to see node attributes.
Cerebral – Multi-Array Viewer
Click here to display graphs of change in expression for each gene.
Clicking a line highlights that gene in the network.
K-means clusters of genes with similar patterns of gene expression.
Interactively Link back to InnateDB to Look up Information on Particular
Genes/Interactions of Interest.
Future Plans for InnateDB

Continued manual curation of innate immune relevant interactions.
 Community participation.

Pathway curation – BIOPAX compliance – working on new pathway curation tool.

Improved functionality & performance on InnateDB website.

Improving ontology ORA & Transcription factor over-representation tools, Network
analysis tools.

Working with other DBs e.g. ISB, Immport, pathway DBs for better integration and
coordinating curation efforts.
Community Annotation & Submission

We hope to recruit members of the innate immunity research community to
participate in the InnateDB project.

What can you do?
 Submission of interaction data from your own published research – submission
to an interaction DB will likely be required prior to publication in the near future.
 Review of existing data in InnateDB.
 Curation of molecules or pathways of interest to you.
 Report bugs/errors.

Mail us at [email protected]

Why get involved?
 InnateDB development and curation teams can only do so much.
 Community participation will help increase data coverage and accuracy and help
InnateDB become a better resource for all.
Acknowledgements – The Bioinformatics Team

Principal Investigators

Fiona Brinkman

Bob Hancock

InnateDB Project Leader:

David Lynn

PI2 Project Management

Bernadette Mah

InnateDB Database Development:

Matthew Laird

Nicolas Richard

Avinash Chikatamarla

Fiona Roche

Timothy Chan

Michael Acab


InnateDB Search Engine & User Interface:

Geoff Winsor

Nicolas Richard
Manual Curation:

Misbah Naseer

Melissa Yau

Raymond Lo

Anastasia Sribnaia

Jaimmie Que

InnateDB Submission System & Curator tool:

Calvin Chan

Naisha Shah

Cerebral – Pathway Visualization Software:

Aaron Barsky

Jennifer Gardy

Tamara Munzner

Orthologs & Gene Order:

Dan Tulpan

Matthew Whiteside

Mark Sun

Systems Administration:

Matthew Laird

Meta-analysis Team:
David Lynn, Jennifer Gardy, Chris Fjell, Karsten
Hokamp, Nicolas Richard.