FunctionSIGTalkx - Scholars at Harvard

Predicting Functional Relationships in Osteoblasts
JACOB M. LUBER1,2, CATHERINE SHARP2, KB CHOI2, CHERYL ACKERT-BICKNELL2,3 & MATTHEW A. HIBBS1,2
1 DEPARTMENT OF COMPUTER SCIENCE, TRINITY UNIVERSITY, SAN ANTONIO, TEXAS 78212 USA
2 THE JACKSON LABORATORY, BAR HARBOR, MAINE 04609 USA
3 UNIVERSITY OF ROCHESTER MEDICAL CENTER, ROCHESTER, NEW YORK 14642 USA
CORRESPONDING AUTHOR EMAIL: [email protected]
Tissue Context Specificity
Bicknell & Hibbs, 2012
Functional Relationship Networks
Node for each gene
Edges between functionally related (or predicted to be related) genes
Correlation-based measures examine trends, rather than absolute values
Unrelated pairs not connected
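A minimal sketch of the network idea above, assuming Pearson correlation as the correlation-based measure and a fixed cutoff for drawing edges. The gene names, expression values, `threshold` parameter, and function names are all hypothetical and chosen only for illustration; the talk does not state which measure or cutoff was used.

```python
from itertools import combinations
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # a flat profile carries no trend; treat as uncorrelated
    return cov / (sx * sy)

def build_network(expression, threshold=0.9):
    """Return {(gene_a, gene_b): r} for gene pairs with r >= threshold.

    Pairs below the (assumed) threshold get no edge at all.
    """
    edges = {}
    for a, b in combinations(sorted(expression), 2):
        r = pearson(expression[a], expression[b])
        if r >= threshold:
            edges[(a, b)] = r
    return edges

# Made-up profiles: GeneB follows the same trend as GeneA at 10x the level.
toy_expression = {
    "GeneA": [1.0, 2.0, 3.0, 4.0],
    "GeneB": [10.0, 20.0, 30.0, 40.0],
    "GeneC": [2.0, 1.0, 2.0, 1.0],
}
network = build_network(toy_expression)
```

GeneA and GeneB are connected even though their absolute values differ tenfold, because the correlation only looks at the shared trend. GeneC is left unconnected.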
Steps to Predict Improved Pathways
Mouse Biology
Genomic Data
Features
Gold Standard
Heterogeneous Data Integration
Machine Learning
Predictions!
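The steps above can be sketched end to end on toy data. This sketch assumes a naive Bayes classifier over binned per-dataset features, which is one common choice for heterogeneous data integration. The talk does not name the classifier, so the model, the function names, the two-bin discretization, and all data values here are assumptions for illustration only.

```python
def train_naive_bayes(features, gold, n_bins=2, pseudo=1.0):
    """Fit per-dataset bin likelihoods for related / unrelated gene pairs.

    features: {pair: [bin index in each dataset]}
    gold:     {pair: True if functionally related, else False}
    pseudo:   pseudocount added to every bin to avoid zero probabilities
    """
    n_datasets = len(next(iter(features.values())))
    class_counts = {True: 0, False: 0}
    bin_counts = {
        c: [[pseudo] * n_bins for _ in range(n_datasets)] for c in (True, False)
    }
    for pair, label in gold.items():
        class_counts[label] += 1
        for d, b in enumerate(features[pair]):
            bin_counts[label][d][b] += 1
    total = sum(class_counts.values())
    prior = {c: class_counts[c] / total for c in class_counts}
    likelihood = {
        c: [[count / sum(row) for count in row] for row in bin_counts[c]]
        for c in bin_counts
    }
    return prior, likelihood

def predict(model, bins):
    """Posterior probability that a gene pair is functionally related."""
    prior, likelihood = model
    score = {}
    for c in (True, False):
        p = prior[c]
        for d, b in enumerate(bins):
            p *= likelihood[c][d][b]
        score[c] = p
    return score[True] / (score[True] + score[False])

# Features: bin 1 = high correlation in that dataset, bin 0 = low (made up).
toy_features = {
    ("A", "B"): [1, 1], ("A", "C"): [1, 0],
    ("B", "D"): [0, 0], ("C", "D"): [0, 1],
    ("B", "C"): [1, 1],  # unlabeled pair to predict
}
toy_gold = {("A", "B"): True, ("A", "C"): True,
            ("B", "D"): False, ("C", "D"): False}

model = train_naive_bayes(toy_features, toy_gold)
posterior = predict(model, toy_features[("B", "C")])
```

In this toy setup only the first dataset separates the gold standard classes, so it drives the prediction while the second dataset contributes nothing.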
Machine Learning & Context Specificity
We need to consider both:
What Context Our Data Come From: All Mouse Data or Tissue Specific Data (Bone Element)
&
How We Handle Ground Truth
ROC Curves
[Figure: one ROC panel per model]
Curated GS, All Tissue Model: model trained on all tissue data with a manually curated gold standard
Curated GS, Bone Only Model: model trained on bone element data with a manually curated gold standard
GO Derived GS, All Tissue Model: model trained on all tissue data with a GO derived gold standard
GO Derived GS, Bone Only Model: model trained on bone element data with a GO derived gold standard
ROC Curves
[Figure: panels for Bone Only Model and All Tissue Model, each with Curated GS and GO Derived GS]
PR Curves
[Figure: panels for Bone Only Model and All Tissue Model, each with Curated GS and GO Derived GS]
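For readers unfamiliar with the two curve types, here is a minimal sketch of how ROC and PR points can be computed from classifier scores and gold standard labels. The scores and labels are made up, the function names are hypothetical, and tied scores are not handled specially. This is not the evaluation code behind the figures.

```python
def roc_and_pr(scores, labels):
    """Sweep a cutoff from the highest score down.

    Returns ROC points as (false positive rate, true positive rate)
    and PR points as (recall, precision), one point per ranked pair.
    """
    ranked = sorted(zip(scores, labels), key=lambda t: -t[0])
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    tp = fp = 0
    roc = [(0.0, 0.0)]
    pr = []
    for _, is_pos in ranked:
        if is_pos:
            tp += 1
        else:
            fp += 1
        roc.append((fp / n_neg, tp / n_pos))
        pr.append((tp / n_pos, tp / (tp + fp)))
    return roc, pr

def auc(points):
    """Area under a curve given as (x, y) points, by the trapezoid rule."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )

# Made-up scores for four gene pairs; label 1 = in the gold standard.
toy_scores = [0.9, 0.8, 0.7, 0.3]
toy_labels = [1, 0, 1, 0]
roc_points, pr_points = roc_and_pr(toy_scores, toy_labels)
roc_auc = auc(roc_points)
```

ROC curves weigh positives and negatives equally. PR curves focus on how many top-ranked predictions are correct, which matters when related gene pairs are rare among all possible pairs.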
WNT Signaling (KEGG)
WNT Signaling
[Figure: panels for Bone Only Model and All Tissue Model, each with Curated GS and GO Derived GS]
BMP Signaling (KEGG)
BMP Signaling
[Figure: panels for Bone Only Model and All Tissue Model, each with Curated GS and GO Derived GS]
Key Takeaways
■ Predictions made by the four classifiers are very dissimilar
■ Some edges predicted with high confidence by classifiers trained on all data likely do not reflect real relationships within the context of bone biology
■ Literature evidence suggests that classifiers trained on a manually curated gold standard and applied only to bone element data provide the most accurate picture of bone biology (Cain et al.)
■ The curated gold standard contains edges not supported by bone-only data, suggesting that only a subset of FRs in the literature are supported by co-expression data
■ These methods are a next step beyond current state-of-the-art methods like FNTM
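One way to put a number on the first takeaway is to compare the top-ranked edges of two classifiers. The sketch below uses Jaccard overlap of the top-k edges, which is an assumption; the talk does not say how dissimilarity was measured. The model names, edge scores, and function names are hypothetical.

```python
def top_k_edges(predictions, k):
    """Return the k highest-scoring edges from {edge: score}."""
    ranked = sorted(predictions, key=predictions.get, reverse=True)
    return set(ranked[:k])

def jaccard(a, b):
    """Size of the intersection over size of the union of two edge sets."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0

# Made-up scores from two hypothetical classifiers over the same gene pairs.
model_x = {("A", "B"): 0.9, ("A", "C"): 0.8, ("B", "C"): 0.1}
model_y = {("A", "B"): 0.7, ("A", "C"): 0.2, ("B", "C"): 0.6}

overlap = jaccard(top_k_edges(model_x, 2), top_k_edges(model_y, 2))
```

An overlap near 1 means the two classifiers agree on their strongest predictions. A value near 0 means they highlight largely different edges.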
Acknowledgements
■ Matt Hibbs
■ Carol Bult
■ KB Choi
■ Adam Lavertu, Evan Cofer
■ Cheryl Ackert-Bicknell, Catherine Sharp
■ Troyanskaya Group & Casey Greene
■ Huttenhower Group
■ NIH
■ NSF
■ Trinity University Mach Research Fellowship
More Details @ scholar.harvard.edu/~jluber
Contact Me @ [email protected]
Gaussian Fits