geWorkbench - Association for Pathology Informatics


geWorkbench
John Watkinson
Columbia University
geWorkbench


The bioinformatics platform of the National
Center for the Multi-scale Analysis of
Genomic and Cellular Networks (MAGNet).
Also part of the NCI’s cancer Biomedical
Informatics Grid (caBIG) initiative. The project
was formerly called caWorkbench.
geWorkbench (cont.)





A desktop application for integrative
genomics.
Runs on Windows, Linux and Macintosh.
Includes a variety of informatics tools, but
specializes in microarray analysis.
Open-source and free for non-commercial
use.
Includes an API for plugin development.
geWorkbench (cont.)
Integrative Genomics



Increasingly, researchers need to combine several
data sources (microarray assays, DNA/RNA/protein
sequences, protein structure, gene ontology, clinical
data, etc.)
geWorkbench attempts to move past simple
microarray analysis to include integrative methods.
Plugin framework allows geWorkbench to interact
with other major software packages, including
Bioconductor, GenePattern and Cytoscape.
Data Support







Microarray assays (one-color and two-color,
as well as caArray assays).
Sequence files.
BLAST queries.
Gene-Gene interaction networks
(Interactomes).
Gene Ontology Terms.
caBIO pathways and annotations.
Protein structure files (PDB).
Components



geWorkbench has a plugin interface for the
development of 3rd-party components.
Documentation and developer support are
available from the geWorkbench team.
All visualizations and analyses have been
written using the API. Several groups at
Columbia are developing for the platform.
Microarray Analysis






Summarization of raw chip data (via
Bioconductor).
Normalization and Filtering.
Differential expression analysis.
Clustering (Hierarchical and Self-Organizing
Maps).
Classification (SVM and SMLR).
Many visualization tools.
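
As a rough illustration of the differential expression step listed above, the sketch below ranks genes by a Welch t-statistic between two sample groups. It is not geWorkbench code: the function name, gene labels and expression values are all hypothetical, and it assumes each group has at least two samples with non-zero variance.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(group_a, group_b):
    """Welch t-statistic for two groups of expression values."""
    # Sample variances; each group needs at least two values.
    var_a, var_b = variance(group_a), variance(group_b)
    standard_error = sqrt(var_a / len(group_a) + var_b / len(group_b))
    return (mean(group_a) - mean(group_b)) / standard_error


# Hypothetical log-expression values: (phenotype 1 samples, phenotype 2 samples).
expression = {
    "GENE_A": ([8.1, 8.3, 7.9, 8.2], [5.0, 5.2, 4.9, 5.1]),
    "GENE_B": ([6.0, 6.2, 5.9, 6.1], [6.1, 5.9, 6.0, 6.2]),
}

# Rank genes by the magnitude of the statistic, largest first.
ranked = sorted(
    expression,
    key=lambda gene: abs(welch_t(*expression[gene])),
    reverse=True,
)
print(ranked)
```

A real analysis would also convert the statistic to a p-value and correct for multiple testing across all genes on the array.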
Hierarchical Clustering
Scatter Plot Visualization
caBIO Pathway Viewer
Sequence Analysis





BLAST and HMM search interface.
Pattern discovery.
Synteny analysis.
Promoter region analysis.
A variety of sequence viewers.
Pattern Discovery Viewer
Promoter Viewer
GO Term Enrichment




Traditional t-tests on microarray data identify
genes that are differentially expressed between two
phenotypes.
Gene Ontology (GO) term enrichment can
determine which functional or structural categories
are significantly over-represented among those genes.
Supported in geWorkbench’s GO Panel component.
A similar technique can be applied to other gene
sets, such as KEGG pathways.
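
One common way to score enrichment of a gene set is a one-sided hypergeometric test. The slides do not say which test the GO Panel uses, so the sketch below is an assumption for illustration only; the function name and the counts are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, selected, overlap):
    """P(X >= overlap) when `selected` genes are drawn without replacement
    from `population` genes, of which `annotated` carry the term."""
    upper = min(annotated, selected)
    total = comb(population, selected)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, selected - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts: 1000 genes on the array, 50 annotated with the term,
# 100 genes called differentially expressed, 15 of which carry the term.
p_value = hypergeom_enrichment_p(
    population=1000, annotated=50, selected=100, overlap=15
)
print(p_value)
```

The same function applies unchanged to other gene sets, such as pathway membership, by swapping in the corresponding counts.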
GO Terms (cont.)
Reverse Engineering



Microarray data can be used to infer
biological pathways.
geWorkbench’s Reverse Engineering
component uses the ARACNE algorithm to
build gene-gene interaction networks.
These can be compared and combined with
an online database of interactions, curated by
Columbia.
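
ARACNE scores gene pairs by mutual information and then uses the data processing inequality (DPI) to discard the weakest edge in each connected gene triplet, since that edge is likely an indirect interaction. The sketch below is a simplified illustration, not the geWorkbench implementation: it swaps in a plain plug-in estimate on pre-discretized values for ARACNE's own estimator, omits the DPI tolerance parameter, and uses hypothetical gene labels and data.

```python
from collections import Counter
from itertools import combinations
from math import log


def mutual_information(xs, ys):
    """Plug-in mutual information (in nats) of two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * log(c * n / (px[x] * py[y]))
        for (x, y), c in pxy.items()
    )


def prune_with_dpi(mi, genes, threshold=0.0):
    """Keep edges with MI above threshold, then drop the weakest edge of
    every fully connected triplet (data processing inequality)."""
    edges = {
        frozenset(pair)
        for pair in combinations(genes, 2)
        if mi[frozenset(pair)] > threshold
    }
    to_remove = set()
    for trio in combinations(genes, 3):
        trio_edges = [frozenset(pair) for pair in combinations(trio, 2)]
        if all(edge in edges for edge in trio_edges):
            to_remove.add(min(trio_edges, key=lambda edge: mi[edge]))
    return edges - to_remove


# Hypothetical discretized expression (0 = low, 1 = high) across 8 samples.
profiles = {
    "A": [0, 0, 0, 0, 1, 1, 1, 1],
    "B": [0, 0, 0, 1, 1, 1, 1, 1],
    "C": [0, 0, 1, 1, 1, 1, 1, 1],
}
genes = sorted(profiles)
mi = {
    frozenset((g, h)): mutual_information(profiles[g], profiles[h])
    for g, h in combinations(genes, 2)
}
network = prune_with_dpi(mi, genes)
print(sorted(tuple(sorted(edge)) for edge in network))
```

In this toy data the A-C pair has the lowest mutual information of the three, so the DPI step removes it and leaves the chain A-B-C.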
Reverse Engineering (cont.)
Matrix REDUCE


Given microarray data and upstream
sequences for genes, transcription factor
binding sites can be inferred.
The Matrix REDUCE component in
geWorkbench provides this analysis and
tools to visualize the results.
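
To give a feel for how upstream sequence can be related to expression, the sketch below counts occurrences of a fixed candidate motif in each gene's upstream region and fits a least-squares slope of expression against that count. This is a much simpler stand-in and not the Matrix REDUCE algorithm itself; the motif, sequences, expression values and function names are all hypothetical.

```python
def count_motif(sequence, motif):
    """Count occurrences of motif in sequence, allowing overlaps."""
    last_start = len(sequence) - len(motif) + 1
    return sum(sequence.startswith(motif, i) for i in range(last_start))


def regression_slope(xs, ys):
    """Least-squares slope of ys on xs (xs must not all be equal)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Hypothetical upstream sequences and log-ratio expression values.
upstream = {
    "g1": "ACGCGTTTACGCGT",
    "g2": "TTTACGCGTAAA",
    "g3": "TTTTTTTTTT",
    "g4": "GGGACGCGTCCACGCGTACGCGT",
}
log_ratio = {"g1": 1.9, "g2": 1.1, "g3": 0.0, "g4": 3.0}

genes = sorted(upstream)
counts = [count_motif(upstream[g], "ACGCGT") for g in genes]
slope = regression_slope(counts, [log_ratio[g] for g in genes])
print(counts, slope)
```

A clearly non-zero slope suggests that the motif's presence tracks expression, which is the kind of signal a binding-site inference method looks for.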
For More Information



http://www.geworkbench.org
Mailing List:
[email protected]
John Watkinson: [email protected]
Acknowledgements



ARACNE algorithm by Califano et al.
Matrix REDUCE algorithm by Bussemaker et
al.
geWorkbench team: Aris Floratos, Eileen
Daly, Kenneth Smith, Kiran Keshav, Xiaoqing
Zhang, Manjunath Kustagi, Matthew Hall,
Bernd Jagla, Mary VanGinhoven, John
Watkinson.