Transcript Slides

Chapter 4: Protein Interactions
and Disease
Mileidy W. Gonzalez, Maricel G.
Kann
Presented by Md Jamiul Jahid
What to learn in this chapter
• Experimental and computational methods
to detect protein interactions
• Protein networks and disease
• Studying the genetic and molecular basis
of disease
• Using protein interactions to understand
disease
What is Protein interaction
• Proteins are the main agents of biological
function
– Proteins determine the phenotype of all
organisms
• Proteins don't function alone
– interaction with other proteins
– interaction with other molecules (e.g. DNA,
RNA)
What is Protein interaction
• Protein interaction generally means
physical contact between proteins and
their interacting partners.
• Proteins associate physically to create
macromolecular structures of various
complexities and heterogeneities
• Protein pairs can form dimers, multi-protein
complexes or long chains
What is Protein interaction
• But the interaction need not always be physical
• Besides physical contact, protein
interaction can also mean metabolic or genetic
correlation or co-localization
• Metabolic -> in the same pathway
• Genetically correlated -> co-expressed
• Co-localization -> proteins in the same
cellular compartment
PPI Network
• A PPI network represents interactions among
proteins
• Each node represents a protein
• Each link represents an interaction
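The node-and-link picture above can be sketched as an adjacency map. This is a minimal illustration, not a method from the chapter; the protein names "A" to "D" and the helper `build_network` are hypothetical placeholders.

```python
from collections import defaultdict

# Hypothetical interactions; the protein names are placeholders, not real data.
interactions = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")]

def build_network(pairs):
    """Return an undirected adjacency map: protein -> set of partners."""
    network = defaultdict(set)
    for p, q in pairs:
        network[p].add(q)  # each link is stored in both directions
        network[q].add(p)
    return dict(network)

network = build_network(interactions)
nodes = sorted(network)  # one node per protein
# Every link is stored twice (no self-interactions in this toy list), so halve the total.
num_edges = sum(len(partners) for partners in network.values()) // 2
```

Here `nodes` lists the proteins and `network["C"]` gives the interaction partners of protein C.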
PPI Network
A PPI network of the proteins encoded by radiation-sensitive genes in
mouse, rat, and human, reproduced from [89].
PPI Network
• Some uses of PPI networks
– To learn about the evolution of different proteins
– To learn about the different systems they are involved in
– Networks can be used to infer interactions in
other species
– Helpful to identify functions of uncharacterized
proteins
Experimental Identification of PPIs
• Biophysical Methods
• High-Throughput Methods
– Direct high-throughput methods
– Indirect high-throughput methods
Biophysical Methods
• Mainly biochemical, physical and genetic
methods
– X-ray crystallography
– NMR spectroscopy
– Fluorescence
– Atomic force microscopy
Biophysical Methods
• Biophysical methods identify interacting
partners
• They also reveal chemical features of the interaction
• Problem:
– Time and resource consumption is high
– Applicable only at small scale
High Throughput Methods
• Direct high-throughput methods
• Indirect high-throughput methods
Direct high-throughput methods
• Yeast two-hybrid (Y2H)
– Most common
– Fuse the two proteins to the two halves of a
transcription factor (its DNA-binding domain and
its activation domain)
– If the proteins interact -> the transcription complex
is activated
Direct high-throughput methods
Y2H overview
Image courtesy Wikipedia.org
Direct high-throughput methods
• Problems (Yeast two-hybrid)
– Cannot identify complex protein interactions,
i.e. those involving more than two proteins
– Cannot characterize interactions of proteins that
themselves initiate transcription
Indirect high-throughput methods
• Look at characteristics of the genes
that encode the proteins
• Gene co-expression
– Assumption: genes of interacting proteins must
be co-expressed to provide the products for the
interaction
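The co-expression assumption can be illustrated by correlating two expression profiles. The sketch below uses the Pearson correlation as one possible measure (the slides do not name one); the gene names and expression values are made up for illustration.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("a constant profile has no defined correlation")
    return cov / (sd_x * sd_y)

# Hypothetical expression levels measured under four conditions.
expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],  # rises and falls together with geneA
    "geneC": [4.0, 1.0, 3.0, 2.0],  # does not follow geneA
}

r_ab = pearson(expression["geneA"], expression["geneB"])
r_ac = pearson(expression["geneA"], expression["geneC"])
```

A high value such as `r_ab` would mark the pair as a candidate interaction under this assumption; a low or negative value such as `r_ac` would not.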
Computational Predictions of PPIs
• Empirical predictions
• Theoretical predictions
– Coevolution at the residue level
– Coevolution at the full sequence level
Empirical predictions
• Based on
– Relative frequency of interacting domains
– Maximum likelihood estimation
– Co-expression
• Disadvantage
– Rely on existing networks
– Propagate their inaccuracies
Theoretical Predictions of PPIs
Based on Coevolution
• Coevolution at the residue level
• Coevolution at the full sequence level
• In biology, coevolution is "the change of a
biological object triggered by the change
of a related object."
Coevolution at the residue
• Pairs of residues of the same protein can
co-evolve because of three-dimensional proximity
or shared function
• A pair of proteins is assumed to interact if
they show enrichment of the same
correlated mutations
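One common way to quantify correlated mutations between two alignment columns is mutual information. The slides do not say which measure is used, so the choice of mutual information and the toy columns below are assumptions made for illustration only.

```python
import math
from collections import Counter

def mutual_information(col_x, col_y):
    """Mutual information (in bits) between two alignment columns."""
    n = len(col_x)
    count_x = Counter(col_x)
    count_y = Counter(col_y)
    count_xy = Counter(zip(col_x, col_y))
    mi = 0.0
    for (a, b), c in count_xy.items():
        p_ab = c / n
        p_a, p_b = count_x[a] / n, count_y[b] / n
        mi += p_ab * math.log2(p_ab / (p_a * p_b))
    return mi

# Hypothetical columns: one character per species, in the same species order.
col1 = "AAGG"
col2 = "CCTT"  # changes whenever col1 changes
col3 = "CTCT"  # changes independently of col1

mi_12 = mutual_information(col1, col2)
mi_13 = mutual_information(col1, col3)
```

Columns that always change together (`col1`, `col2`) score high, while independent columns (`col1`, `col3`) score zero.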
Coevolution at the full sequence
level
• Basic idea: changes in one protein are
compensated by correlated changes in its
interacting partners to preserve interaction
• -> Interacting proteins have phylogenetic
trees with topologies more similar than expected
by chance
• Mirrortree is the most accurate option to
identify interactions
Mirrortree
• Identify the orthologs of both proteins in
common species
• Create a multiple sequence alignment
(MSA) of each protein with its orthologs
• Create a distance matrix from each MSA
• Calculate the correlation coefficient between
the two distance matrices
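The steps above can be sketched in a few lines. This is a simplified illustration, not the published mirrortree implementation: it assumes the orthologs are already aligned, uses the fraction of differing positions as a stand-in for an evolutionary distance, and works on made-up sequences and species labels.

```python
import math
from itertools import combinations

def p_distance(seq1, seq2):
    """Fraction of aligned positions that differ (a simple stand-in distance)."""
    return sum(a != b for a, b in zip(seq1, seq2)) / len(seq1)

def distance_vector(msa, species):
    """Pairwise distances between all species pairs, in a fixed order."""
    return [p_distance(msa[a], msa[b]) for a, b in combinations(species, 2)]

def pearson(x, y):
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

def mirrortree_score(msa_a, msa_b):
    """Correlate the pairwise distances of two protein families."""
    species = sorted(set(msa_a) & set(msa_b))  # orthologs in common species
    return pearson(distance_vector(msa_a, species),
                   distance_vector(msa_b, species))

# Hypothetical alignments: species label -> aligned ortholog sequence.
msa_a = {"sp1": "AAAA", "sp2": "AAAT", "sp3": "AATT", "sp4": "TTTT"}
msa_b = {"sp1": "CCCC", "sp2": "CCCG", "sp3": "CCGG", "sp4": "GGGG"}
msa_c = {"sp1": "GGGG", "sp2": "CCCC", "sp3": "GGGC", "sp4": "GGGG"}

score_ab = mirrortree_score(msa_a, msa_b)
score_ac = mirrortree_score(msa_a, msa_c)
```

Families A and B diverge in the same pattern across species, so `score_ab` is high; family C diverges differently, so `score_ac` is lower.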
Mirrortree
Different methods for computing
PPI
Protein Network and Disease
• Studying the Genetic Basis of Disease
• Studying the Molecular Basis of Disease
Studying the Genetic Basis of
Disease
• Since Mendelian genetics in the 1900s, a lot
of effort has gone into categorizing disease genes
• Positional cloning: the process of isolating a
gene on the chromosome based on its
position
• Genes identified by this approach
– cystic fibrosis, Huntington's disease (HD), breast cancer, etc.
– Still, a mutation in the gene does not always
correlate with symptoms
Studying the Genetic Basis of
Disease
• Several reasons
– pleiotropy
– influence of other genes
– environmental factors
Studying the Genetic Basis of
Disease
• Pleiotropy: when a single gene produces
multiple phenotypes
• Problem: complicates the disease elucidation
process because a mutation in such a gene
can affect some, all or none of its
traits.
• This means a mutation in a pleiotropic gene
may cause multiple syndromes or only
cause disease in some of the biological
processes the gene takes part in
Studying the Genetic Basis of
Disease
• Influence of other genes
– Interact synergistically
– Modify one another
Studying the Genetic Basis of
Disease
• Environmental factors
– diet
– infection etc.
• Cancers are believed to be caused by
several genes and affected by several
environmental factors
Studying the Molecular Basis of
Disease
• Identifying genes associated with disease is
important
• Molecular details are also important to
identify the mechanisms triggering,
participating in and controlling perturbed
biological functions
The role of protein interaction in
disease
• Protein interactions provide a vast source of
molecular information because they
are involved in
– metabolic
– signaling
– immune
– gene regulatory networks
• Protein interactions should be a key
target for understanding the molecular
basis of disease
The role of protein interaction in
disease
• Protein-DNA interaction disruption
• Protein misfolding
• New undesired protein interaction
Protein-DNA interaction
disruption
• p53 tumor suppressor
• Mutations in the p53 DNA-binding domain
destroy its ability to bind its target DNA
sequences
• This prevents several anticancer
mechanisms that p53 mediates
Protein misfolding and undesired
interaction
• Protein misfolding
– Protein folding: the process by which a protein reaches its
functional 3D shape
• New undesired protein interaction
– Main cause of several diseases such as Huntington's disease,
cystic fibrosis, Alzheimer's disease, etc.
Using PPI network to understand
disease
• PPI networks can help identify novel
pathways
• PPI networks can help explore
differences between healthy and disease
states
• Protein interaction studies play a major
role in the prediction of genotype-phenotype associations
Using PPI network to understand
disease
• New diagnostic tools can result from
genotype-phenotype associations
• Can identify disease subnetworks
• Drug design
PPI networks can help identify
novel pathways
• PPI network: maps physical and functional
interactions of protein pairs
• Pathway: represents genetic, metabolic,
signaling or neural processes as a series
of sequential biochemical reactions
PPI networks can help identify
novel pathways
• Pathways alone cannot uncover disease
details
• When performing pathway analysis to
study disease, differential expression is the
key
• The majority of human genes haven't been
assigned to a pathway
PPI networks can help identify
novel pathways
• In this scenario, PPI networks can help
identify novel pathways
• Some key findings
– Disease genes generally occupy
peripheral positions in the PPI network
– Few cancer genes are hubs
– Disease genes tend to cluster together
– Proteins involved in similar phenotypes are
highly connected
PPI networks can help
explore differences between
healthy and disease states
Source: Dynamic modularity in protein interaction networks
predicts breast cancer outcome, Nature Biotechnology 27, 2009
Genotype-phenotype
association and new disease
genes
• Predict new disease genes from the interacting
partners of already known disease genes
• Use topological features to predict disease
genes
– 970/5000 genes are disease genes
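Predicting disease genes from the partners of known disease genes can be sketched as a simple neighbor count. The scoring rule below (number of known disease genes among a protein's partners) is an assumed simplification, and the network and gene names are hypothetical.

```python
# Hypothetical PPI network: protein -> set of interaction partners.
network = {
    "G1": {"X", "Y"},
    "G2": {"X", "Z"},
    "X": {"G1", "G2"},
    "Y": {"G1", "W"},
    "Z": {"G2"},
    "W": {"Y"},
}
known_disease_genes = {"G1", "G2"}  # placeholder labels

def rank_candidates(network, known):
    """Rank unlabeled proteins by how many known disease genes they touch."""
    scores = {}
    for protein, partners in network.items():
        if protein in known:
            continue
        hits = len(partners & known)
        if hits:
            scores[protein] = hits
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))

candidates = rank_candidates(network, known_disease_genes)
```

Protein X interacts with both known disease genes, so it ranks first; W has no disease-gene partner and is not listed.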
Disease subnetwork identification
Disease subnetwork
identification
Drug design
• Hub nodes in a PPI network are not good drug
targets
• Less connected nodes may be good drug
targets
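Separating hubs from less connected nodes comes down to comparing node degrees against a cutoff. The sketch below is illustrative only; the cutoff of 3 and the toy network are arbitrary assumptions, not values from the chapter.

```python
# Hypothetical PPI network: protein -> set of interaction partners.
network = {
    "H": {"a", "b", "c", "d"},
    "a": {"H"},
    "b": {"H", "c"},
    "c": {"H", "b"},
    "d": {"H"},
}

def split_by_degree(network, hub_threshold):
    """Split proteins into hubs (degree >= threshold) and all others."""
    hubs, others = [], []
    for protein in sorted(network):
        degree = len(network[protein])  # number of interaction partners
        (hubs if degree >= hub_threshold else others).append(protein)
    return hubs, others

hubs, less_connected = split_by_degree(network, hub_threshold=3)
```

Under the point made on this slide, the proteins in `less_connected` would be the ones to consider first as drug targets.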
Exercise
• Objective: investigate Epstein-Barr Virus
pathogenesis using PPI
• EBV is one of the most common human viruses
• 95% of adults are infected with this virus
• EBV replicates in epithelial cells and
establishes latency in B lymphocytes
– Causes mononucleosis 35-50% of the time
– Sometimes cancer
Dataset
• Dataset S1: EBV interactome
• Dataset S2: EBV-Human interactome
• Software requirement:
– Cytoscape (DL link: www.cytoscape.org)
Questions
• How many nodes and edges are featured in this
network?
• How many self interactions does the network have?
• How many pairs are not connected to the largest
connected component?
• Define the following topological parameters and explain
how they might be used to characterize a protein-protein
interaction network: node degree (or average number of
neighbors), network heterogeneity, average clustering
coefficient distribution, network centrality.
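Several of the quantities asked about above can be computed directly from an edge list. The sketch below is independent of Cytoscape and runs on a made-up edge list, not on Dataset S1; it counts self-interactions, finds connected components and computes a node's clustering coefficient.

```python
# Hypothetical edge list; ("C", "C") is a self-interaction.
edges = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "C"), ("D", "E")]

def build_adjacency(edges):
    """Return (adjacency map without self-loops, number of self-loops)."""
    adjacency, self_loops = {}, 0
    for u, v in edges:
        adjacency.setdefault(u, set())
        adjacency.setdefault(v, set())
        if u == v:
            self_loops += 1
        else:
            adjacency[u].add(v)
            adjacency[v].add(u)
    return adjacency, self_loops

def connected_components(adjacency):
    """Return the components as sets of nodes, largest first."""
    seen, components = set(), []
    for start in adjacency:
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        components.append(component)
    return sorted(components, key=len, reverse=True)

def clustering_coefficient(adjacency, node):
    """Fraction of a node's neighbor pairs that are themselves linked."""
    neighbors = adjacency[node]
    k = len(neighbors)
    if k < 2:
        return 0.0
    links = sum(1 for u in neighbors for v in neighbors
                if u < v and v in adjacency[u])
    return 2 * links / (k * (k - 1))

adjacency, n_self_loops = build_adjacency(edges)
components = connected_components(adjacency)
n_nodes, n_edges = len(adjacency), len(edges)
```

Nodes outside `components[0]` are the ones not connected to the largest connected component.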
Questions
• How many unique proteins were found to interact in each
organism?
• How many interactions are mapped?
• How many human proteins are targeted by multiple (i.e.
how many individual human proteins interact with >1)
EBV proteins?
• How does identifying the multi-targeted human proteins
help you understand the pathogenicity of the virus? —
Hint: Speculate about the role of the multi-targeted
human proteins in the virus life cycle.
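Counting multi-targeted human proteins amounts to grouping the virus-human pairs by human protein. The pair list and protein labels below are hypothetical and do not reflect the format or content of Dataset S2.

```python
from collections import defaultdict

# Hypothetical (EBV protein, human protein) interaction pairs.
virus_human_pairs = [
    ("EBV_p1", "HUMAN_x"),
    ("EBV_p2", "HUMAN_x"),
    ("EBV_p1", "HUMAN_y"),
    ("EBV_p3", "HUMAN_z"),
    ("EBV_p3", "HUMAN_x"),
    ("EBV_p1", "HUMAN_x"),  # duplicate record, counted only once
]

# Map each human protein to the set of distinct EBV proteins targeting it.
targeted_by = defaultdict(set)
for virus_protein, human_protein in virus_human_pairs:
    targeted_by[human_protein].add(virus_protein)

multi_targeted = sorted(h for h, viral in targeted_by.items() if len(viral) > 1)
unique_human = len(targeted_by)
unique_virus = len({v for v, _ in virus_human_pairs})
```

Using sets keeps duplicate records from inflating the counts.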
Questions
• Based on the ‘degree’ property, what can
you deduce about the connectedness of
ET-HPs? What does this tell you about the
kind of proteins (i.e. what type of network
component) EBV targets?
Questions
• What do the number and size of the
largest components tell you about the
inter-connectedness of the ET-HP
subnetwork?
Questions
• Why is distance relevant to network
centrality? What is unusual about the
distance of ET-HPs to other proteins and
what can you deduce about the
importance of these proteins in the
Human-Human interactome?
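The link between distance and centrality can be made concrete with closeness centrality, which is the inverse of a node's average shortest-path distance to the nodes it can reach. Closeness is only one of several centrality measures, and the five-protein chain below is a made-up example.

```python
from collections import deque

# Hypothetical chain of five proteins: A - B - C - D - E.
chain = {
    "A": {"B"},
    "B": {"A", "C"},
    "C": {"B", "D"},
    "D": {"C", "E"},
    "E": {"D"},
}

def bfs_distances(adjacency, source):
    """Shortest-path length (in links) from source to every reachable node."""
    distances = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency[node]:
            if neighbor not in distances:
                distances[neighbor] = distances[node] + 1
                queue.append(neighbor)
    return distances

def closeness(adjacency, node):
    """Reachable nodes divided by the sum of distances to them."""
    distances = bfs_distances(adjacency, node)
    total = sum(distances.values())
    return (len(distances) - 1) / total if total else 0.0

closeness_middle = closeness(chain, "C")
closeness_end = closeness(chain, "A")
```

The middle protein C is closer on average to every other protein than the end protein A, so it has the higher closeness.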
Questions
• Based on your conclusions from questions
i-iii, explain why EBV targets the ET-HP
set over the other human proteins and
speculate on the advantages to virus
survival the protein set might confer.
Thanks