PPT - Stockholm Bioinformatics Center

Sunday, April 10, 2016
From high-throughput data to network biology: gain in statistical power and biological relevance
Stockholm Bioinformatics Centre
Andrey Alexeyenko
PLoS Med 2005 2(8):e124
Why Most Published Research Findings Are False
Statistical model: no positive facts, and an allowed rate of Type I error
(figure labels: true negatives, false positives)
Biological reality: negative facts are the vast majority, positive facts are yet to be discovered
(figure labels: positive facts, true positives, negative facts)
“Positive facts”: the discoveries we are after, e.g. genomic associations,
differentially expressed genes, relations “phenotype<->disease” etc.
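The argument of the cited paper (PLoS Med 2005 2(8):e124) can be illustrated with its positive predictive value formula, PPV = (1 - β)R / (R - βR + α), where R is the pre-study odds that a tested relation is a positive fact. The sketch below is only an illustration; the function name and the default values of α and power are assumptions, not taken from the slides.

```python
def positive_predictive_value(prior_odds, alpha=0.05, power=0.8):
    """Share of claimed findings that are true positives.

    prior_odds: R, ratio of positive to negative facts among tested hypotheses.
    alpha:      allowed Type I error rate (assumed default).
    power:      1 - beta, probability of detecting a positive fact (assumed default).
    """
    beta = 1.0 - power
    return (power * prior_odds) / (prior_odds - beta * prior_odds + alpha)


# When positive and negative facts are equally common, most findings hold up.
balanced = positive_predictive_value(prior_odds=1.0)

# When negative facts are the vast majority (one positive per 1000 negatives),
# most "discoveries" are false positives even at alpha = 0.05.
sparse = positive_predictive_value(prior_odds=0.001)
print(round(balanced, 3), round(sparse, 3))
```

With these assumed settings the PPV is about 0.94 in the balanced case and under 2% in the sparse case, which is the situation the "biological reality" panel describes.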
A network is just a graph!
The fact that I can draw a network does not yet make it a biological reality!
Conversion “data pieces → confidence” in a Bayesian framework
D. rerio, 17.3%
D. melanogaster, 9.8%
C. elegans, 9.3%
R. norvegicus, 5.1%
S. cerevisiae, 10.2%
M. musculus, 25.4%
A. thaliana, 6.5%
H. sapiens, 16.5%
Phylogenetic profiling, 18.6%
Protein interactions, 10.6%
Protein expression, 6.1%
TF targeting, 12.3%
miRNA targeting, 2.0%
Sub-cellular localization, 7.3%
mRNA expression, 43.1%
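One common way to turn heterogeneous data pieces into a single confidence value is naive Bayesian integration: each evidence type contributes a log-likelihood ratio learned from gold-standard positive and negative gene pairs, and the ratios are summed onto the prior log-odds. The sketch below assumes this scheme; the slides do not spell out the exact procedure, and the function names, counts, pseudocount, and prior are hypothetical.

```python
import math


def log_likelihood_ratio(pos_in_bin, pos_total, neg_in_bin, neg_total, pseudocount=1.0):
    """ln[P(score bin | positive pair) / P(score bin | negative pair)].

    The pseudocount (an assumption) avoids division by zero for empty bins.
    """
    p_pos = (pos_in_bin + pseudocount) / (pos_total + pseudocount)
    p_neg = (neg_in_bin + pseudocount) / (neg_total + pseudocount)
    return math.log(p_pos / p_neg)


def combine_evidence(llrs, prior_probability):
    """Posterior probability of a functional link, assuming independent evidence."""
    prior_log_odds = math.log(prior_probability / (1.0 - prior_probability))
    posterior_log_odds = prior_log_odds + sum(llrs)
    return 1.0 / (1.0 + math.exp(-posterior_log_odds))


# Hypothetical gene pair supported by two evidence types.
evidence = {
    "mRNA expression": log_likelihood_ratio(30, 100, 3, 100),
    "Protein interactions": log_likelihood_ratio(20, 100, 5, 100),
}
confidence = combine_evidence(evidence.values(), prior_probability=0.01)
print(round(confidence, 3))
```

The independence assumption is what makes the sum valid; correlated evidence types would need to be down-weighted or merged before integration.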
Enrichment of functional groups
Enrichment analysis in networks turns out to be
more powerful than on gene lists
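As a rough illustration of what enrichment "in the network" can mean, the sketch below counts network links between a gene list and a functional group and compares the count with a null distribution obtained by shuffling node labels. This is a simplified stand-in, not necessarily the method used in the talk: a label permutation does not preserve the degrees of the query genes, and all gene names and parameters are hypothetical.

```python
import random


def count_links(edges, set_a, set_b):
    """Number of edges with one end in set_a and the other in set_b."""
    return sum(
        1
        for u, v in edges
        if (u in set_a and v in set_b) or (u in set_b and v in set_a)
    )


def network_enrichment(edges, gene_list, functional_group, n_permutations=1000, seed=0):
    """Return (observed links, mean null links, z-score) under label permutation."""
    rng = random.Random(seed)
    nodes = sorted({node for edge in edges for node in edge})
    observed = count_links(edges, gene_list, functional_group)
    null_counts = []
    for _ in range(n_permutations):
        shuffled = nodes[:]
        rng.shuffle(shuffled)
        relabel = dict(zip(nodes, shuffled))
        permuted = [(relabel[u], relabel[v]) for u, v in edges]
        null_counts.append(count_links(permuted, gene_list, functional_group))
    mean = sum(null_counts) / len(null_counts)
    variance = sum((c - mean) ** 2 for c in null_counts) / (len(null_counts) - 1)
    z = (observed - mean) / variance ** 0.5 if variance > 0 else float("nan")
    return observed, mean, z


# Toy network: genes "a" and "b" share no members with the group {"x", "y"},
# so a list-overlap test sees nothing, yet both are linked to "x".
toy_edges = [("a", "x"), ("b", "x"), ("c", "d"), ("d", "y")]
observed, null_mean, z_score = network_enrichment(toy_edges, {"a", "b"}, {"x", "y"})
print(observed, round(null_mean, 2))
```

The toy case shows where the extra power can come from: the gene list and the functional group need not overlap for a relation between them to be detected.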
Enrichment of functional groups
Partial correlations
rPLC = 0.95
rPLC = 0.88
rPLC = 0.76
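For readers unfamiliar with partial correlations, the following minimal sketch computes the first-order partial correlation of x and y given a third variable z from the three pairwise Pearson coefficients. The data are made up for illustration and are unrelated to the rPLC values shown above.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def partial_correlation(x, y, z):
    """Correlation of x and y after removing the linear effect of z."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# x and y are both driven by z plus mutually independent noise.
z = [-1, -1, 1, 1]
x = [-2, 0, 0, 2]
y = [0, -2, 0, 2]
print(pearson(x, y), partial_correlation(x, y, z))
```

Here x and y are correlated only through their common driver z, so the plain correlation is clearly positive while the partial correlation vanishes. In a network this is one way to tell a direct link from an indirect one.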
Benjamini-Hochberg correction
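The Benjamini-Hochberg procedure can be sketched in a few lines: sort the p-values, scale each by m / rank, and enforce monotonicity from the largest p-value downwards. The function name and the example p-values are illustrative only.

```python
def benjamini_hochberg(p_values):
    """Return BH-adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that each adjusted
    # value is never larger than the one for the next-larger p-value.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.20])
significant = [q <= 0.05 for q in adjusted]
print(adjusted, significant)
```

Edges or gene sets whose adjusted value falls below the chosen false discovery rate are retained; the rest are treated as not significant.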
Quantitative modeling of a multi-component
system with mutually dependent elements
Why is going “list → network”
an advancement?
• Functional context
• “Anchoring”, i.e. interdependence
• Biological interpretability
• Statistical features
• Data integration
Many of those can be applied to the lists as
well, but mind the flexibility!
Ways to augment confidence
Trivial:
1) increase power
2) decrease false prediction rate
• Data integration
– Evaluation prior to integration!
• Consider biological context
• Remove spurious edges
• Generalize to a higher level of organization
Ways to evaluate confidence
• Supervised learning
• Balance comprehensiveness and
complexity (so-called information criteria)
• Benjamini-Hochberg
• Show it to a biologist
• Go out to the real world and test
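The information criteria mentioned in the list above trade goodness of fit against the number of parameters. A minimal sketch of the two most common ones, AIC and BIC, follows; the log-likelihoods, parameter counts, and sample size in the example are invented for illustration.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood, n_params, n_observations):
    """Bayesian information criterion: k ln n - 2 ln L (lower is better)."""
    return n_params * math.log(n_observations) - 2 * log_likelihood


# Hypothetical comparison: a sparse network model against a denser one that
# fits slightly better but needs many more parameters.
sparse_model = {"log_likelihood": -100.0, "n_params": 3}
dense_model = {"log_likelihood": -98.0, "n_params": 10}
for name, model in (("sparse", sparse_model), ("dense", dense_model)):
    print(name, aic(**model), round(bic(n_observations=100, **model), 1))
```

In this made-up case both criteria prefer the sparse model, because the small gain in likelihood does not pay for the seven extra parameters.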
Ways to employ confidence
• Initialize network
• Add node and edge attributes to the network
• Filter network elements for higher relevance
• Build more complex models accounting for confidence
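The first three uses listed above can be sketched with plain dictionaries: initialize a network from scored gene pairs, store the confidence as an edge attribute, and filter edges by a threshold. Gene names, scores, and the threshold are hypothetical placeholders.

```python
def build_network(scored_pairs):
    """Initialize an undirected network; each edge carries its confidence."""
    network = {}
    for gene_a, gene_b, confidence in scored_pairs:
        # frozenset makes (a, b) and (b, a) the same undirected edge.
        network[frozenset((gene_a, gene_b))] = {"confidence": confidence}
    return network


def filter_by_confidence(network, threshold):
    """Keep only edges whose confidence reaches the threshold."""
    return {
        edge: attributes
        for edge, attributes in network.items()
        if attributes["confidence"] >= threshold
    }


scored_pairs = [
    ("geneA", "geneB", 0.95),
    ("geneB", "geneC", 0.40),
    ("geneA", "geneC", 0.76),
]
network = build_network(scored_pairs)
reliable = filter_by_confidence(network, threshold=0.75)
print(sorted(sorted(edge) for edge in reliable))
```

More complex models can then use the stored confidence directly, for example as an edge weight, instead of discarding low-scoring edges outright.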