MBG404_LS_11

Computational Biology
Networks and Pathways
Lecture Slides Week 11
Data is Interconnected
What is a Graph
Complexity
A network is a collection of interactions
Pathways are a subset of networks
All pathways are networks of interactions, but
not all networks are pathways
Young et al.: Transcriptional Regulatory Networks in
Saccharomyces cerevisiae; Science 2002
Pathway is a biological network that corresponds to
a specific physiological process or phenotype
Biological pathways
Biological components interacting with each other
over time to bring about a single biological effect
Pathways can be broken down into sub-pathways
Some common pathways: signal transduction pathways,
metabolic pathways, gene regulatory pathways
Entities in one pathway can be found in others
3 types of interactions that can be mapped into
pathways
protein (enzyme) – metabolite (ligand)
metabolic pathways
protein – protein
cell signaling pathways, protein complexes
protein – gene
genetic networks
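The three interaction types above can be stored as labeled edges in a graph. The following is a minimal sketch, not a database format: the example interactions, the type labels, and the helper names `build_network` and `subnetwork` are illustrative assumptions.

```python
from collections import defaultdict

# Illustrative interactions, one or two per interaction type from the slide.
# Each entry: (source, target, interaction type)
interactions = [
    ("hexokinase", "glucose", "protein-metabolite"),  # metabolic pathway
    ("RAF1", "MAP2K1", "protein-protein"),            # cell signaling
    ("MAP2K1", "MAPK1", "protein-protein"),           # cell signaling
    ("GAL4", "GAL1", "protein-gene"),                 # genetic network
]

def build_network(edges):
    """Return an adjacency list: node -> list of (neighbor, interaction type)."""
    network = defaultdict(list)
    for source, target, kind in edges:
        network[source].append((target, kind))
    return dict(network)

def subnetwork(edges, kind):
    """Keep only the edges of one interaction type."""
    return [(s, t) for s, t, k in edges if k == kind]

network = build_network(interactions)
signaling = subnetwork(interactions, "protein-protein")
print(signaling)
```

Filtering by edge type is one simple way to pull a pathway-like subset out of a larger mixed network.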
Available resources
KEGG http://www.genome.jp/kegg/
BioCyc http://www.biocyc.org/
Reactome http://www.reactome.org/
GenMAPP http://www.genmapp.org/
BioCarta http://www.biocarta.com/
TransPATH http://www.biobaseinternational.com/pages/index.php?id=transpathdatabases
Pathguide – the pathway resource list http://www.pathguide.org/
Network Topology (PPI)
Network analysis and visualization tools
Databases for analysis
Text mining algorithms (e.g., natural language processing, NLP)
Expert human curation
Ingenuity Pathway Analysis
http://www.ingenuity.com/products/pathways_analysis.html
PathwayStudio
http://www.ariadnegenomics.com/products/pathway-studio/
PathwayArchitect
http://www.selectscience.net
Cytoscape
http://www.cytoscape.org/
Biological Networks
http://biologicalnetworks.net/
GeneGO
http://www.genego.com/
Nanduri et al. (unpublished)
GO term enrichment
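GO term enrichment is commonly scored with a one-sided hypergeometric test: how likely is it to see at least this many genes with a given GO term in a gene list by chance? The slides do not say which test was used, so the sketch below is an assumption, and all gene counts in the usage line are hypothetical.

```python
from math import comb

def enrichment_p_value(population, annotated, sample, hits):
    """P(X >= hits) for a hypergeometric draw.

    population: number of genes in the background set
    annotated:  background genes carrying the GO term
    sample:     size of the gene list under study
    hits:       genes in the list that carry the GO term
    """
    upper = min(annotated, sample)
    total = comb(population, sample)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, upper + 1)
    )
    return tail / total

# Hypothetical numbers: 6000 background genes, 120 with the term,
# a list of 50 genes of which 6 carry the term.
p = enrichment_p_value(population=6000, annotated=120, sample=50, hits=6)
print(f"p = {p:.3g}")
```

In practice many GO terms are tested at once, so the resulting p-values would also need a multiple-testing correction.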
End Theory I
5 min mindmapping
10 min break
Practice I
Cytoscape
Download and install Cytoscape
Add the Reactome app
Initialize the Reactome app
Inspect some metabolic pathways
End Practice I
15 min break
Theory II
Pathways vs. networks
Gene networks
• Clusters of genes (or gene products) with evidence of coexpression
• Connections usually represent degrees of co-expression
• In-depth knowledge of process is not necessary
• Networks are non-predictive
Biochemical pathways
• Series of chained, chemical reactions
• Connections represent describable (and quantifiable) relations
between molecules, proteins, lipids, etc.
• Enzymatic process is elucidated
• Changes via perturbation are predictable downstream
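Because a biochemical pathway is a chain of directed reactions, the entities affected by a perturbation can be found by following the edges downstream. A minimal sketch, assuming a hypothetical pathway with placeholder compounds A to E:

```python
from collections import deque

# Hypothetical pathway: each key is converted into the listed products.
pathway = {
    "A": ["B"],
    "B": ["C", "D"],
    "C": ["E"],
    "D": [],
    "E": [],
}

def downstream(pathway, start):
    """Return every node reachable from start, in breadth-first order."""
    seen = {start}
    order = []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in pathway.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                order.append(nxt)
                queue.append(nxt)
    return order

# Perturbing B (e.g. blocking the enzyme that consumes it) touches C, D and E.
affected = downstream(pathway, "B")
print(affected)
```

A co-expression network has no such direction or mechanism on its edges, which is one reason it is less predictive.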
Pathways vs. networks
Curation
• Gene networks: relatively easy (automated and manual)
• Biochemical pathways: difficult (mostly manual)
Nodes
• Gene networks: genes or gene products
• Biochemical pathways: any general molecule
Edges
• Gene networks: levels of co-expression/influence or a qualitative relation
• Biochemical pathways: representation of possibly quantifiable mechanisms between compounds
Fidelity
• Gene networks: low (usually very little detail)
• Biochemical pathways: high (specific processes)
Predictive power
• Gene networks: relatively low
• Biochemical pathways: relatively high
Pathway and network granularity
(figure axes: effort to curate, level of detail)
Introduction to pathways and networks
Examples of pathways and networks
Review of pathway databases and tools
Representing pathways and networks
Methods of inferring pathways and networks
Pathway and cellular simulations
Yeast gene interaction network
Tong, et al., Science 303, 808 (2004)
Characteristics of the yeast gene network
Some genes (e.g. regulatory factors) act as ‘hubs’ in a
network and have many interactions
Degree of connectivity follows a power law
Hubs may make interesting anti-cancer targets
Clusters of genes with known function suggest function for
hypothetical genes in same cluster
Network characteristics can be used to predict protein-protein interactions
Path between two genes tends to be short
(average ~3.3 hops)
Tong, et al., Science 303, 808 (2004)
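The characteristics listed above (hubs, degree distribution, short paths) can all be computed from an edge list. The sketch below uses a hypothetical six-node toy network, not the yeast data, and the hub cutoff of 3 is an arbitrary assumption.

```python
from collections import Counter, deque

# Hypothetical undirected toy network.
edges = [("hub", "g1"), ("hub", "g2"), ("hub", "g3"), ("hub", "g4"), ("g4", "g5")]

def adjacency(edges):
    """Build node -> set of neighbors for an undirected edge list."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj

def hop_counts(adj, source):
    """Breadth-first search: number of hops from source to each reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nb in adj[node]:
            if nb not in dist:
                dist[nb] = dist[node] + 1
                queue.append(nb)
    return dist

def average_path_length(adj):
    """Mean hop count over all ordered pairs of distinct, connected nodes."""
    total = pairs = 0
    for source in adj:
        for target, d in hop_counts(adj, source).items():
            if target != source:
                total += d
                pairs += 1
    return total / pairs

adj = adjacency(edges)
degree = {node: len(nbs) for node, nbs in adj.items()}
degree_distribution = Counter(degree.values())   # degree -> number of nodes
hubs = [node for node, d in degree.items() if d >= 3]
print(hubs, dict(degree_distribution), average_path_length(adj))
```

On a real network, plotting `degree_distribution` on log-log axes is the usual first check for power-law behavior.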
E. coli metabolic pathway
glycolysis
Karp, et al., Science 293, 2040 (2001)
Pathways: E. coli metabolic map
Encompasses >791 chemical compounds in >744
noted biochemical reactions
Pathway was compiled via literature information
extraction and extensive manual curation
System allows for users to indicate evidence of pathway
annotations
Curation is done collaboratively with numerous experts
outside of EcoCyc
Karp, et al., Science 293, 2040 (2001)
Pathways in bioinformatics
Most resources for pathways focus on metabolic
pathways (signaling and regulatory gaining
prominence)
Pathways as a very specific subtype of networks
Like networks, can be made in computable (symbolic)
form
Specificities in chemical reactions are more predictive
Pathways can chain together, forming larger pathways
Karp, et al., Science 293, 2040 (2001)
Pathway repositories
BioCyc/MetaCyc
Kyoto Encyclopedia of Genes and Genomes (KEGG)
PATHWAY DB
BioCarta
BioModels database
BioCyc database
http://www.biocyc.org
Pathway/genome database (PGDB) for organisms
with completely sequenced genomes
409 full genomes and pathways deposited
Species-specific pathways are inferred from
MetaCyc
Query/navigation/pathway creation support
through the Pathway Tools software suite
MetaCyc database
http://www.metacyc.org
Non-redundant reference database for metabolic pathways,
reactions, enzymes and compounds
Curation through experimental verification and manual
literature review
>1200 pathways from 1600+ species (mostly plants and
microorganisms)
Glycolysis pathway in MetaCyc
http://www.metacyc.org
KEGG PATHWAY database
http://www.kegg.com
Consolidated set of databases that cover genomics
(GENE), chemical compounds (LIGAND) and reaction
networks (PATHWAY)
Broad focus on metabolism, signal transduction, disease,
etc.
Species-specific views available (but networks are static
across all organisms)
Glycolysis pathway in KEGG
http://www.kegg.com
Global Pathway Map
BioCarta database
http://www.biocarta.com
Corporate-owned, publicly-curated pathway database
Series of interactive, “cartoon” pathway maps
Predominantly human and mouse pathways
Contains 120,000 gene entries and 355 pathways
Glycolysis pathway in BioCarta
http://www.biocarta.com
BioModels database
http://www.biomodels.net
Database for published, quantitative models of biochemical
processes
All models/pathways curated manually, compliant with
MIRIAM
Models can be output in SBML format for quantitative
modeling
86 curated models, 40 models pending curation
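SBML is an XML format, so the species and reactions of a model can be listed with a standard XML parser. The sketch below is an assumption-laden illustration: the embedded model `toy_glycolysis_step` is hypothetical (not a BioModels entry), it assumes SBML Level 2 Version 4, and real work would normally use a dedicated SBML library.

```python
import xml.etree.ElementTree as ET

# Namespace of SBML Level 2 Version 4 (assumed level/version for this toy model).
SBML_NS = {"sbml": "http://www.sbml.org/sbml/level2/version4"}

# Hypothetical one-reaction model: glucose -> glucose-6-phosphate.
SBML_TEXT = """<sbml xmlns="http://www.sbml.org/sbml/level2/version4" level="2" version="4">
  <model id="toy_glycolysis_step">
    <listOfCompartments>
      <compartment id="cytosol" size="1"/>
    </listOfCompartments>
    <listOfSpecies>
      <species id="glucose" compartment="cytosol" initialConcentration="5"/>
      <species id="g6p" compartment="cytosol" initialConcentration="0"/>
    </listOfSpecies>
    <listOfReactions>
      <reaction id="hexokinase" reversible="false">
        <listOfReactants>
          <speciesReference species="glucose"/>
        </listOfReactants>
        <listOfProducts>
          <speciesReference species="g6p"/>
        </listOfProducts>
      </reaction>
    </listOfReactions>
  </model>
</sbml>
"""

def summarize(sbml_text):
    """Return (species ids, {reaction id: (reactant ids, product ids)})."""
    root = ET.fromstring(sbml_text)
    species = [s.get("id") for s in root.findall(".//sbml:species", SBML_NS)]
    reactions = {}
    for rxn in root.findall(".//sbml:reaction", SBML_NS):
        reactants = [r.get("species") for r in rxn.findall(
            "sbml:listOfReactants/sbml:speciesReference", SBML_NS)]
        products = [p.get("species") for p in rxn.findall(
            "sbml:listOfProducts/sbml:speciesReference", SBML_NS)]
        reactions[rxn.get("id")] = (reactants, products)
    return species, reactions

species, reactions = summarize(SBML_TEXT)
print(species, reactions)
```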
Glycolysis pathways in BioModels
http://www.biomodels.net
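Comparison of pathway databases
Curation
• MetaCyc/BioCyc: manual and automated
• KEGG PATHWAYS: automated
• BioCarta: manual
• BioModels: manual
Size
• MetaCyc/BioCyc: ~621+ pathways
• KEGG PATHWAYS: ~289 reference pathways
• BioCarta: ~355 pathways
• BioModels: ~126 models
Nomenclature
• MetaCyc/BioCyc: EC, GO
• KEGG PATHWAYS: EC, KO
• BioCarta: none
• BioModels: GO
Organism coverage
• MetaCyc/BioCyc: ~500 species
• KEGG PATHWAYS: ~475 species
• BioCarta: primarily human and mouse
• BioModels: various
Visuals
• MetaCyc/BioCyc: species-specific, custom
• KEGG PATHWAYS: reference and species-specific
• BioCarta: animated, cartoonish
• BioModels: non-standardized
Primary usage
• MetaCyc/BioCyc: PGDB, computational biology
• KEGG PATHWAYS: PGDB, pathway comparisons
• BioCarta: human pathways, disease
• BioModels: simulations, modeling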
Inferring pathways and networks
Experimental methods
Microarray co-expression
Quantitative trait locus mapping (QTL)
Isotope-coded affinity tagging (ICAT)
Yeast two-hybrid assay
Green fluorescent protein tagging (GFP tagging)
Computational methods
Database-driven protein-protein interactions
Expression clustering techniques
Literature-mining for specified interactions
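One of the simplest computational routes from expression data to a gene network is to connect genes whose expression profiles correlate strongly. The sketch below assumes Pearson correlation and an arbitrary cutoff of 0.9; the gene names and expression values are hypothetical.

```python
from itertools import combinations
from math import sqrt

# Hypothetical expression values: one list per gene, one entry per condition.
expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],
    "geneC": [4.0, 3.0, 2.0, 1.0],
    "geneD": [1.0, 3.0, 2.0, 4.0],
}

def pearson(x, y):
    """Pearson correlation coefficient of two equally long profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

def coexpression_edges(expression, threshold=0.9):
    """Connect gene pairs with |r| >= threshold; keeps anti-correlated pairs too."""
    edges = []
    for g1, g2 in combinations(sorted(expression), 2):
        r = pearson(expression[g1], expression[g2])
        if abs(r) >= threshold:
            edges.append((g1, g2, round(r, 3)))
    return edges

edges = coexpression_edges(expression)
for g1, g2, r in edges:
    print(g1, g2, r)
```

The edge weight here is only a degree of co-expression; it says nothing about mechanism, which matches the low-fidelity character of gene networks.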
Cellular simulations
Study the effect perturbation has on a pathway
(and thus the organism)
Generally require extensive detail on the pathway
or reactions of interest (flux equations,
metabolite concentration, etc.)
Cellular pathway simulations must manage both
temporal and spatial complexity
Temporal intervals: picosec., nanosec., microsec., millisec., sec., min., yr.
Spatial dimension: 0.1 nm, 10 nm, 1 µm, 1 mm, 1 cm, 1 m
Adapted from Kelly, H., http://www.fas.org/resource/05242004121456.pdf , via Neal, Yngve 2006 VHS, UW MEBI 591
Simulation methods and techniques
Metabolism
• Phenomena: enzymatic reaction
• Computation scheme: differential-algebraic equations, flux-based analysis
Signal transduction
• Phenomena: binding
• Computation scheme: differential-algebraic equations, stochastic algorithms, diffusion-reaction
Gene expression
• Phenomena: binding, polymerization, degradation
• Computation scheme: object-oriented modeling, differential-algebraic equations, stochastic algorithms, boolean networks
DNA replication
• Phenomena: binding, polymerization
• Computation scheme: object-oriented modeling, differential-algebraic equations
Membrane transport
• Phenomena: osmotic pressure, membrane potential
• Computation scheme: differential-algebraic equations, electrophysiology
Adapted from Tomita 2001
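A differential-equation scheme for a single enzymatic reaction can be sketched in a few lines. The example below assumes Michaelis-Menten kinetics, dS/dt = -Vmax·S/(Km + S), integrated with a plain forward-Euler step; all parameter values are hypothetical, and a real simulator would use an adaptive ODE solver.

```python
def simulate(vmax, km=0.5, s0=10.0, dt=0.01, steps=1000):
    """Forward-Euler integration of substrate S converted to product P.

    dS/dt = -vmax * S / (km + S)
    dP/dt = +vmax * S / (km + S)
    Returns the final (S, P) after steps * dt time units.
    """
    s, p = s0, 0.0
    for _ in range(steps):
        rate = vmax * s / (km + s)
        s -= rate * dt
        p += rate * dt
    return s, p

baseline = simulate(vmax=1.0)
perturbed = simulate(vmax=0.5)  # hypothetical perturbation: enzyme activity halved

print("baseline  S, P:", baseline)
print("perturbed S, P:", perturbed)
```

Comparing the two runs shows the effect of the perturbation on product formation, which is the kind of question a pathway simulation is meant to answer.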
Research in simulation and modeling
Virtual Cell (National Resource for Cell Analysis and
Modeling)
MCell (the Salk Institute)
Gepasi (Virginia Tech)
E-CELL (Institute for Advanced Biosciences, Keio
University)
Karyote/CellX (Indiana University)
End Theory II
5 min mindmapping
10 min break
Term Project
Max 3000 words
Focus on results and their discussion
Make sure to incorporate all the little hints we gave
Incorporate runtime for the new dataset as another
performance measure
Practice
Perform the steps as described here:
http://wiki.cytoscape.org/GettingStarted