EGAN - iPlant Pods


EGAN: Exploratory Gene Association Networks
by Jesse Paquette
Biostatistics and Computational Biology Core
Helen Diller Family Comprehensive Cancer Center
University of California, San Francisco
(AKA BCBC HDFCCC UCSF)
EGAN
http://akt.ucsf.edu/EGAN/
• Features
– Downloadable Java application
• but could be re-composed as components for a web service architecture
– Graphics provided by Cytoscape; graph layout algorithms imported from open source
– Data pre-loaded for analysis. Each data set must include an assay ID, a measure (e.g., correlation coefficient, expression level), and a significance value (e.g., p-value)
– Currently for the human and rat genomes, but other model species in August (including Arabidopsis)
• Key focus: interactive analysis of sets of genes
– User identifies the sets interactively
– Enrichment: uses Fisher's exact test to see whether genes in a pathway are “overrepresented” relative to chance selection. Based on the hypergeometric distribution, an n-choose-k distribution for sampling without replacement
– Gene sets graphed based on relationships
• Counts (simply connect each gene to others in the set; can graph multiple sets)
• Protein-protein interaction
• Co-occurrence in literature
– Access to PubMed literature and external links
• For demos, slides, presentations
– http://akt.ucsf.edu/EGAN/documentation.php
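The enrichment test named on this slide can be illustrated with a short calculation. The following is a minimal sketch, not EGAN's own code: it computes the one-sided (over-representation) p-value as the upper tail of the hypergeometric distribution, which is what a one-sided Fisher's exact test evaluates. The function name and the parameter names N, K, n, k are assumptions chosen for this example.

```python
from math import comb

def hypergeom_enrichment_p(N, K, n, k):
    """Return P(X >= k) for X ~ Hypergeometric(N, K, n).

    N: genes in the background (hypothetical parameter name)
    K: background genes that belong to the pathway
    n: genes selected by the user
    k: selected genes that belong to the pathway
    """
    total = comb(N, n)           # all ways to choose n genes out of N
    upper = min(K, n)            # the overlap cannot exceed either set size
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total

# Toy numbers: 10 genes, 4 in the pathway, 3 selected, 2 of them in the pathway.
# (C(4,2)*C(6,1) + C(4,3)*C(6,0)) / C(10,3) = 40/120
p = hypergeom_enrichment_p(N=10, K=4, n=3, k=2)
print(p)
```

A small p-value means an overlap of at least k genes would be unlikely if the n genes had been drawn at random from the background.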
Producing insight from clusters and gene lists
• Summarize: find enriched pathways (and other gene sets)
– Hypergeometric over-representation
• DAVID
– Global trends
• GSEA
• Visualize: gene relationships in a graph
– Protein-protein interactions
• Cytoscape
– Network module discovery
• Ingenuity IPA
– Literature co-occurrence
• PubGene
• Contextualize: pertinent literature
• PubMed
• Google
• iHOP
High-throughput experiments
• EGAN applies to
– Expression microarrays
– aCGH
– SNP/CNV arrays
– MS/MS Proteomics
– DNA methylation
– ChIP-Seq
– RNA-Seq
– In-silico experiments
• If parts of the output can be mapped to gene IDs
– You can use EGAN
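The requirement that output be mappable to gene IDs can be sketched as follows. This is an illustrative sketch only and does not reflect EGAN's file format or API: the AssayResult type, the significant_genes function, the assay-to-gene dictionary, and all identifiers are hypothetical. It assumes each result carries an assay ID, a measure, and a p-value, keeps the most significant assay per gene, and applies an adjustable p-value cutoff.

```python
from typing import NamedTuple

class AssayResult(NamedTuple):
    assay_id: str     # e.g. a probe or peak identifier (hypothetical)
    measure: float    # e.g. expression level or correlation coefficient
    p_value: float    # significance of the measure

def significant_genes(results, assay_to_gene, p_cutoff=0.05):
    """Map assay results to gene IDs and keep genes with p <= p_cutoff.

    When several assays map to one gene, the most significant one is kept.
    Assays without a gene mapping are skipped.
    """
    best = {}
    for r in results:
        gene = assay_to_gene.get(r.assay_id)
        if gene is None or r.p_value > p_cutoff:
            continue
        if gene not in best or r.p_value < best[gene].p_value:
            best[gene] = r
    return best

# Made-up data for illustration only.
results = [
    AssayResult("probe_1", 2.1, 0.001),
    AssayResult("probe_2", 1.4, 0.030),   # second probe for the same gene
    AssayResult("probe_3", 0.2, 0.600),   # not significant
    AssayResult("probe_4", 3.0, 0.010),   # has no gene mapping
]
assay_to_gene = {"probe_1": "GENE_A", "probe_2": "GENE_A", "probe_3": "GENE_B"}

hits = significant_genes(results, assay_to_gene, p_cutoff=0.05)
print(sorted(hits))
```

Re-running the function with a different p_cutoff gives a different gene list, which is the kind of adjustment an interactive cutoff control would trigger.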
Gene sets
• EGAN contains a database of gene sets
– You can also add your own
– Download from MSigDB (Broad)
• A gene set defines a semantically-meaningful subset of genes
– Signaling or metabolic pathway
– Gene Ontology (GO) term
– Previously-reported gene list (“signature”)
– Cytoband
– Transcription factor targets
– miRNA targets
– Conserved domain
– Drug targets
– &c.
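Gene sets downloaded from MSigDB are commonly distributed as GMT files: one gene set per line, tab-separated, with the set name first, a description second, and the member genes after that. The sketch below parses that layout into a dictionary of sets. It is an assumption-laden illustration, not a description of how EGAN imports gene sets; the function name parse_gmt and the example set names and gene identifiers are made up.

```python
def parse_gmt(lines):
    """Parse GMT-style lines into {gene_set_name: set_of_gene_ids}.

    Each line is expected to hold: name <TAB> description <TAB> gene ...
    Lines with fewer than three fields are skipped.
    """
    gene_sets = {}
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue
        name, _description, *genes = fields
        gene_sets[name] = {g for g in genes if g}   # drop empty trailing fields
    return gene_sets

# Hypothetical gene sets for illustration only.
example = [
    "HYPOTHETICAL_PATHWAY\tmade-up description\tGENE_A\tGENE_B\tGENE_C\n",
    "HYPOTHETICAL_SIGNATURE\tmade-up description\tGENE_B\tGENE_D\n",
    "MALFORMED_LINE\n",
]
gene_sets = parse_gmt(example)
print(sorted(gene_sets))
```

With real data the same function could be fed an open file handle, since iterating over a text file yields lines.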
Gene-gene relationships
• EGAN contains
– Protein-protein interactions (PPI)
– Literature co-occurrence
– Chromosomal adjacency
– Kinase-target relationships
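One way to picture how such relationships become a graph is to keep only the edges whose two genes are both currently shown. The sketch below is a hypothetical illustration, not EGAN's data model: the edge tuples, the relationship labels, the gene identifiers, and the function name induced_subgraph are all assumptions.

```python
from collections import defaultdict

def induced_subgraph(edges, visible_genes):
    """Return an adjacency map restricted to visible_genes.

    edges: iterable of (gene_a, gene_b, kind) tuples, treated as undirected
    visible_genes: set of gene IDs currently shown on the graph
    """
    adjacency = defaultdict(set)
    for a, b, kind in edges:
        if a in visible_genes and b in visible_genes:
            adjacency[a].add((b, kind))
            adjacency[b].add((a, kind))
    return dict(adjacency)

# Made-up relationships for illustration only.
edges = [
    ("GENE_A", "GENE_B", "PPI"),
    ("GENE_A", "GENE_C", "literature co-occurrence"),
    ("GENE_B", "GENE_D", "chromosomal adjacency"),
]
graph = induced_subgraph(edges, visible_genes={"GENE_A", "GENE_B", "GENE_C"})
print(graph)
```

Keeping the relationship kind on each edge lets a viewer show or hide PPI, literature, or adjacency edges independently.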
The article will be shown in your default web browser.
Finding Counts
EGAN Summary: Exploratory Gene Association Networks
• Methods: state-of-the-art analysis of clusters and gene lists
– Hypergeometric enrichment of gene sets
– Global trends of gene sets
– Graph visualization
– Literature identification
– Network module discovery
• User Interface: responds quickly to new queries from the biologist
– Fluid adjustment of p-value cutoffs
– Point-and-click interface
– All data in-memory for immediate access
– Links to external websites
• Modular: integrates as a flexible plug-and-play cog
– All data is customizable
– Proprietary data can be restricted to the client location
– Java runs on almost every OS (PC, Mac, Linux)
– Can be configured and launched from a different application (e.g., GenePattern)
– Analyses can be scripted for automation
Keys to getting the most out of EGAN
• Don’t panic!
• Load as much data as possible
– Assay results for every gene
– Multiple experiments
– Pathways and gene sets
• MSigDB
– Previously-published gene lists and clusters
• Supplementary data
• Oncomine
• Think about the context of the experiment
– Show appropriate genes on graph
• Think about the semantic meaning of the enriched gene sets
– Show appropriate gene sets on graph
• Follow links to literature
• Use appropriate Google/PubMed search queries
• Create high-quality reports
– Save your custom gene sets
– Export graph screenshots to PDF
– Export tables with enrichment scores to Excel
– Record details in your lab notebook
Where to find EGAN
• Website
– http://akt.ucsf.edu/EGAN/
• 2010 paper in Bioinformatics
– http://www.ncbi.nlm.nih.gov/pubmed/19933825