Transcript Gonzalez
Knowledge Integration for Gene
Target Selection
Graciela Gonzalez, PhD
Juan C. Uribe
Contact: [email protected]
GeneRanker in a Nutshell
• Integration of knowledge from
– biomedical literature
– curated PPI databases, and
– protein network topology
• Seeks to prioritize lists of genes by
their association with specific diseases and
phenotypes [1].
• Such associations may or may not have
been published (thus, not text mining)
[1] Gonzalez G, Uribe JC, Tari L, Brophy C, Baral C. Mining Gene-Disease Relationships from Biomedical
Literature: Incorporating Interactions, Connectivity, Confidence, and Context Measures. Pacific Symposium on
Biocomputing; 2007; Maui, Hawaii.
GeneRanker Interface
1. The user types a disease or
biological process to be
searched.
2. Genes found to be
associated with the disease
are extracted from the
literature.
3. Protein-protein interactions
involving those genes are
then pulled from the
literature and curated sources.
4. The protein network is built
and each gene is ranked.
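Steps 3 and 4 can be illustrated with a minimal sketch. It assumes interactions arrive as simple gene pairs and uses node degree as a stand-in score. The slides do not specify the scoring here, so `build_network`, `rank_by_degree`, and the placeholder gene names are hypothetical and are not GeneRanker's actual method.

```python
from collections import defaultdict


def build_network(interactions):
    """Build an undirected adjacency map from (gene_a, gene_b) pairs."""
    network = defaultdict(set)
    for a, b in interactions:
        if a == b:
            continue  # ignore self-interactions in this toy example
        network[a].add(b)
        network[b].add(a)
    return network


def rank_by_degree(seed_genes, network):
    """Rank seed genes by number of distinct partners (illustrative score only)."""
    scores = {g: len(network.get(g, ())) for g in seed_genes}
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Placeholder gene names and interactions, not real data.
seed_genes = ["G1", "G2", "G3", "G4"]
interactions = [("G1", "G2"), ("G1", "G3"), ("G1", "G4"), ("G2", "G3")]

ranking = rank_by_degree(seed_genes, build_network(interactions))
print(ranking)
```

Degree is only the simplest topology measure; any other network score could be swapped into `rank_by_degree` without changing the rest of the sketch.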
GeneRanker Interface
Collaboration: Application of GeneRanker to
a biological context, with Dr. Michael Berens,
Director of the Brain Tumor Unit at the
Translational Genomics Research Institute (TGen).
GeneRanker is available as an online
application at http://www.generanker.org.
• Each gene is scored and can be annotated (count of
co-occurrences and statistical representation)
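As a rough illustration of the co-occurrence count, the sketch below counts the abstracts in which a gene name and the disease term both appear. It assumes plain whole-word, case-insensitive matching. The function name `cooccurrence_counts` and the toy abstracts are hypothetical, and the statistical representation mentioned above is not modeled.

```python
import re
from collections import Counter


def cooccurrence_counts(abstracts, genes, disease):
    """Count, per gene, the abstracts that mention both the gene and the disease."""
    disease_re = re.compile(r"\b" + re.escape(disease) + r"\b", re.IGNORECASE)
    gene_res = {
        g: re.compile(r"\b" + re.escape(g) + r"\b", re.IGNORECASE) for g in genes
    }
    counts = Counter()
    for text in abstracts:
        if not disease_re.search(text):
            continue  # the disease term must be present for a co-occurrence
        for gene, gene_re in gene_res.items():
            if gene_re.search(text):
                counts[gene] += 1
    return counts


# Placeholder abstracts and gene names, not real data.
abstracts = [
    "G1 expression is altered in glioma.",
    "G2 and G1 interact in glioma cells.",
    "G2 is studied in yeast.",
]
counts = cooccurrence_counts(abstracts, ["G1", "G2"], "glioma")
print(counts)
```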
Evaluation of GeneRanker
[Figure: Mining genes related to glioma: precision by method. Methods compared: Ranked list (top 50), Ranked list (top 100), Ranked list (top 200), Gene-disease search, Random list. Axis: 0% to 100%. Legend: Related (>10 articles); Possibly related (1 to 10 articles); No evidence of relation or not a gene.]
• The contextual (PubMed search) based method shows a jump of more than 20% in precision
over NLP-based extraction.
• Synthetic network results show AUC > 0.984.
• Empirical validation against a glioma dataset shows consistent results
(118 vs. 22 differentially expressed probes from the top vs. the bottom of the list).
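The two metrics quoted in this evaluation, precision over the top of a ranked list and AUC, can be computed as in the following minimal sketch. It assumes a strict ranking with no ties and a known set of relevant genes. The function names and the toy ranking are hypothetical and do not reproduce the reported results.

```python
def precision_at_k(ranked, relevant, k):
    """Fraction of the top-k ranked genes that are in the relevant set."""
    top = ranked[:k]
    if not top:
        return 0.0
    return sum(1 for g in top if g in relevant) / len(top)


def roc_auc(ranked, relevant):
    """AUC: probability that a relevant gene is ranked above an irrelevant one."""
    pos = sum(1 for g in ranked if g in relevant)
    neg = len(ranked) - pos
    if pos == 0 or neg == 0:
        raise ValueError("AUC needs at least one relevant and one irrelevant item")
    correct_pairs = 0
    neg_below = 0
    # Walk from the bottom of the list upward; each relevant gene
    # outranks every irrelevant gene already seen below it.
    for g in reversed(ranked):
        if g in relevant:
            correct_pairs += neg_below
        else:
            neg_below += 1
    return correct_pairs / (pos * neg)


# Placeholder ranking, not real data.
ranked = ["G1", "G2", "G3", "G4", "G5"]
relevant = {"G1", "G3"}

p_at_2 = precision_at_k(ranked, relevant, 2)
auc = roc_auc(ranked, relevant)
print(p_at_2, auc)
```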
Complementary Work
• CBioC: www.cbioc.org shows PPIs,
gene-disease, and gene-bioprocess
associations extracted from abstracts
• BANNER: sourceforge.banner.org
(presenting a poster on this one). An
open source entity recognizer available
now.
• Gene normalization: a similar open
source system soon to be available.