Transcript Slide 1

EGAN tutorial: Loading experiment results
October, 2009
Jesse Paquette
UCSF Helen Diller Family
Comprehensive Cancer Center
[email protected]
Preamble
• This document has many slides with multi-step
animations
– Best viewed in Slide Show mode
• The EGAN graphical user interface is evolving
– Icons may change
– Menus may change
– Button/widget placement may change
– This document probably won’t change as quickly
– Please contact the developers if you notice major
discrepancies between this and EGAN
Loading experiment results: An overview
• EGAN is designed to help you interpret the results of
exploratory assays
• EGAN does not actually do the multivariate statistical
analysis for your experiment
– It picks up where many useful analysis programs stop: at
the gene list
• If the entities measured in an assay can be mapped
to genes, the results can be loaded in EGAN
– Expression microarrays
– MS/MS peptide identifications
– Genome-wide SNP/CNV assays
– Next-gen sequencing
– DNA methylation assays
– ChIP chips
Loading experiment results: An overview
• EGAN works best when you load results
for all entities measured in the assay
– i.e., don’t apply a p-value cutoff on the results before
loading into EGAN
– A gene that misses a cutoff of p < 0.001 may still be
a meaningful hit
• Especially if it is related to other top hit genes
– EGAN will allow you to adjust the statistic/p-value cutoff
dynamically
• Then you can directly observe how networks/enrichment
scores change with different cutoff values
– Of course you can still load post-cutoff experiment results
• If that’s all you have…
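To illustrate why unfiltered results are useful, here is a minimal sketch (not EGAN code) of how the hit list grows as the cutoff is relaxed. The gene IDs, p-values and the `hits_at_cutoff` helper are all hypothetical and exist only for illustration.

```python
# Hypothetical results: gene ID -> p-value (illustrative values only).
results = {
    "GENE_A": 0.0002,
    "GENE_B": 0.0008,
    "GENE_C": 0.0030,
    "GENE_D": 0.0400,
    "GENE_E": 0.2500,
}


def hits_at_cutoff(pvalues, cutoff):
    """Return the sorted IDs whose p-value is strictly below the cutoff."""
    return sorted(gene for gene, p in pvalues.items() if p < cutoff)


# Relaxing the cutoff only works if the unfiltered results were loaded.
for cutoff in (0.001, 0.01, 0.05):
    print(cutoff, hits_at_cutoff(results, cutoff))
```

Had the file been trimmed at p < 0.001 before loading, GENE_C and GENE_D could never appear at the looser cutoffs.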
Loading experiment results
into EGAN
Loading experiment results into EGAN:
The file format
• Tab-delimited text
– Easy to create in Excel from existing result files
• Header line required
– Header of statistic (second) column will become the experiment name
in EGAN
• Three columns
– 1) Entity ID
• e.g., probe set ID, UniProt ID, refSNP ID, etc.
• You can use any IDs that can be mapped to Entrez Gene IDs
• EGAN provides a wide variety of mapping file options
– HUGO Gene Symbol, Affymetrix/Agilent/Illumina IDs, GenBank, Ensembl, UniProt, etc.
• EGAN expects that all entity IDs are the same type
– 2) Statistic (fold-change, regression coefficient, log-odds ratio, etc.)
• EGAN visualization schemes work best when the statistic column is centered
around 0
• Ratio and fold-change data can be 0-centered by taking the logarithm (e.g., log2)
– 3) P-value (unadjusted, adjusted or q-value)
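The format above can also be produced with a short script instead of Excel. The following is a minimal sketch, assuming fold-change ratios as input. The probe IDs, values, the "ID" and "p-value" header names, the experiment name and the `write_results` helper are hypothetical. Only the three-column layout, the required header line and the log transform come from the slide.

```python
import csv
import io
import math

# Hypothetical rows: (entity ID, fold-change ratio, p-value).
rows = [
    ("probe_0001", 2.0, 0.0004),
    ("probe_0002", 0.5, 0.0300),
    ("probe_0003", 1.0, 0.7500),
]


def write_results(handle, experiment_name, rows):
    """Write a header line plus three tab-delimited columns per entity."""
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    # The second column header is the one that names the experiment.
    writer.writerow(["ID", experiment_name, "p-value"])
    for entity_id, ratio, p_value in rows:
        # log2 centers ratio data around 0 (a ratio of 1.0 becomes 0.0).
        writer.writerow([entity_id, math.log2(ratio), p_value])


buffer = io.StringIO()
write_results(buffer, "Resistant vs sensitive (log2 ratio)", rows)
text = buffer.getvalue()
print(text)
```

To write to disk instead of a string buffer, pass a file opened with `open(path, "w", newline="")` as `handle`.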
Loading experiment results into EGAN:
An example
Header line: the statistic (second) column header should be descriptive
Each row represents the analysis result for one entity in the experiment
Three columns: ID, statistic, p-value
Loading experiment results into EGAN:
An example
Save as tab-delimited text
Loading experiment results into EGAN:
An example
• Download or construct an experiment result file
– This example will use two pre-made experiment
result files (download these files to follow along)
• Affymetrix (expression) predictors of Herceptin
resistance in HER2 over-expressing breast cancer cell
lines
• aCGH (copy number) predictors of Herceptin
resistance in HER2 over-expressing breast cancer cell
lines
– …and one custom mapping file
• Launch EGAN H. sapiens
Loading experiment results into EGAN:
An example
• For simplicity’s sake, we’re not going to cover items 3-5 right now. Click on “6) Experiments”.
• Click “Browse…”. Select the experiment and click “Specify empirical data set”.
• This experiment result file uses Affymetrix HG-U133A probe set identifiers. Select “Affymetrix HG-U133A” from the drop-down menu.
• Click “Add Experiment”. The expression results are ready to be loaded.
• Let’s load the aCGH results. We want to load a new experiment, so click “New Data Set”.
• Click “Browse…”. Select your aCGH experiment and click “Specify empirical data set”.
• For the aCGH clones we have a custom mapping file. Click “Browse…”, select the mapping file and click “Specify mapping file”.
• Click “Add Experiment”. Now both experiment results are ready to be loaded.
• There’s one more thing to consider before launching EGAN: click on “5) Gene Nodes”.
• Enrichment calculations in EGAN depend on how we define the background population of genes. In this case we only want genes to be in the background if they are present in all experiment results. Select “intersection” from the drop-down list.
• Finally, click “Finish – Launch EGAN”.
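The "intersection" background chosen on this slide can be pictured with plain sets. This is a minimal sketch under stated assumptions, not EGAN's implementation: the gene IDs and the `background` helper are hypothetical, and the "union" mode is shown only for contrast and is not claimed to be an EGAN option.

```python
# Hypothetical Entrez Gene IDs covered by each experiment after mapping.
expression_genes = {"1001", "1002", "1003", "1004"}
acgh_genes = {"1002", "1003", "1004", "1005"}


def background(gene_sets, mode="intersection"):
    """Combine per-experiment gene sets into one background population."""
    sets = [set(s) for s in gene_sets]
    if not sets:
        return set()
    if mode == "intersection":
        # Keep only genes present in every experiment.
        return set.intersection(*sets)
    if mode == "union":
        # Keep genes present in at least one experiment.
        return set.union(*sets)
    raise ValueError(f"unknown mode: {mode}")


print(sorted(background([expression_genes, acgh_genes])))
```

With the intersection, genes measured in only one of the two assays ("1001" and "1005" here) are left out of the background.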
Loading experiment results into EGAN:
An example
Whenever you change the network configuration by
adding or removing files, you will be given the option to
save the new configuration to a tab-delimited text file.
If you choose to save a .config file, next time you will
only need to specify that file (item 3 in the Launch
EGAN Wizard).
Loading experiment results into EGAN:
An example
Your experiments are now accessible in EGAN:
as columns in the Entrez Gene Node Table and
as rows in the Experiments Table.
Questions/comments?
• Visit http://groups.google.com/group/ucsf-egan
for downloads, documentation and discussion
– Requires an account with Google Groups