Young-RaeCho
Download
Report
Transcript Young-RaeCho
KOCSEA Technical Symposium 2010
Systematic Analysis of Interactome:
A New Trend in Bioinformatics
Young-Rae Cho, Ph.D.
Assistant Professor
Department of Computer Science
Baylor University
History of Bioinformatics
Stage 1.
Sequence Analysis
•
Gene sequencing
•
Sequence alignment
•
Homolog search
•
Motif finding
History of Bioinformatics
Computational Biology
Stage 1.
Sequence Analysis
Stage 2.
Structure Analysis
•
Gene sequencing
•
Protein folding
•
Sequence alignment
•
Homolog search
•
Homolog search
•
Binding site prediction
•
Motif finding
•
Function prediction
History of Bioinformatics
Computational Biology
Stage 1.
Sequence Analysis
Stage 2.
Structure Analysis
Functional Genomics
Stage 3.
Expression Analysis
•
Gene sequencing
•
Protein folding
•
Function prediction
•
Sequence alignment
•
Homolog search
•
Gene clustering
•
Homolog search
•
Binding site prediction
•
Sample classification
•
Motif finding
•
Function prediction
History of Bioinformatics
Computational Biology
Stage 1.
Sequence Analysis
Stage 2.
Structure Analysis
Functional Genomics
Systems Biology
Stage 3.
Expression Analysis
Stage 4.
Network Analysis
•
Gene sequencing
•
Protein folding
•
Function prediction
•
Network modeling
•
Sequence alignment
•
Homolog search
•
Gene clustering
•
Interaction prediction
•
Homolog search
•
Binding site prediction
•
Sample classification
•
Function prediction
•
Motif finding
•
Function prediction
•
Pathway identification
•
Module detection
Biological Networks
Definition
Maps of biochemical reactions, interactions, regulations between genes or proteins
Importance
Provide insights into the mechanisms of molecular function within a cell
Significant resource for functional characterization of genes or proteins
Require computational and systematic approaches
Examples
Metabolic networks
Protein-protein interaction networks
Genetic interaction networks
Gene regulatory networks (Signal transduction networks)
Protein Interaction Networks
Determination
Experimental methods: Y2H, MS, Protein Microarray
Computational methods: Homolog search, Gene fusion analysis, Phylogenetic profiles
Genome-scale protein-protein interactions Interactome
Representation
Un-weighted, undirected graph
Challenges
Unreliability
Large scale
Complex connectivity
Network Re-structuring
Strategy
unweighted network
To resolve complex connectivity
Converts the complex graph to
a hierarchical tree structure
Uses the concepts of path strength,
functional linkage, and centrality
Process
Input: a protein interaction network
edge weighting
weighted network
functional linkage measurement
score matrix
network restructuring
structured network
Output: a list of functional modules
hub confidence measurement
hubs
network clustering
clusters
Path Strength
Path Strength Model
Assumption: each node in a path chooses a succeeding edge based on the weighted
probability
Path Strength Factors
Edge weights
Path length
Node weighted degree
Functional Linkage
Measurements
Path strength of the strongest path between two nodes
Computational problem
Needs a heuristic approach
Uses a user-specified threshold of the max path length
Formula
k-length path strength:
Functional linkage:
shortest path length
threshold
Network Restructuring
Centrality
Weighted closeness:
Algorithm
Computes centrality for each node a
Selects a set of ancestor nodes, T(a), of a by
Selects a parent node, p(a), of a by
Example
Hub Confidence
Measurement
Selects a set of child nodes, D(a), of a by
Selects a set of descendent nodes, La, of a by
Computes the hub confidence, H(a), of a by
Example
Clustering
Algorithm
Iteratively select a hub a with the highest hub confidence
Output the sub-tree La including a as a cluster (functional module)
Cluster Depth
The max path length from the root of the sub-tree to a leaf
Example
Topological Assessment of Hubs
Network Vulnerability
Random attack: repeatedly disrupt a randomly selected node
Degree-based hub attack: repeatedly disrupt the highest degree node
Structural hub attack: repeatedly disrupt the node with the highest hub confidence
For each iteration, observes the largest component
Results
fraction of largest component
1.00
0.95
0.90
0.85
0.80
0.75
random attack
degree-based hub attack
structural hub attack
0.70
0.65
0.60
0
20
40
60
80
100
number of nodes
120
140
160
Biological Assessment of Hubs
Protein Lethality
Determines lethal / viable proteins by knock-out experiment
Lethality represents functional essentiality
Orders proteins by degree and hub confidence
Observes the cumulative proportion of lethal proteins for every 10 proteins
Results
1.0
structural hubs
average lethality
0.9
degree-based hubs
0.8
0.7
0.6
0.5
0.4
0
20
40
60
80
number of hubs
100
120
140
Topological Assessment of Clusters
Modularity
A combined measure of density within each cluster and separability among clusters
Estimated by the ratio of the number of edges within a cluster (sub-graph)
to the number of all edges starting from the nodes in the cluster (sub-graph)
Observes the average modularity of clusters with respect to the cluster depth
Results
More specific function module has
Justify the general-to-specific concepts
of hierarchical functional modules
160
average modularity
higher modularity
180
140
120
100
80
60
40
20
0
1
2
3
4
5
6
7
8
cluster depth
9
10 11 12
Biological Assessment of Clusters
f-Measure
Compares each output cluster X with the real functional annotation Y (from MIPS)
Recall = (# of common proteins of X and Y) / (# of proteins in Y)
Precision = (# of common proteins of X and Y) / (# of proteins in X)
f-measure = 2 × Recall × Precision / (Recall + Precision)
Results
Compared with the results from previous hierarchical clustering methods, e.g.,
edge-betweenness (top-down approach) and ProDistIn (bottom-up-approach)
Conclusion
Motivation
Significant functional knowledge in protein interaction networks (interactome)
Complex connectivity
Contributions
Convert an unstructured network to a structured network
Conserve functional information through pathways
High network vulnerability, low functional lethality at hubs as a drug target
Applicable to various fields, e.g., social networks, WWW
Foundation of structural dynamics during network evolution
Questions ?
Reference
Y.-R. Cho and A. Zhang, “Identification of functional modules by converting
interactome networks into hierarchical ordering of proteins”. BMC Bioinformatics,
11(Suppl 3):S3, 2010