Transcript BiGCaT

Methods to select
physiological function related
changes
(Dis)advantages of the z-score
Rachel van Haaften and Chris Evelo
BiGCaT Bioinformatics Group – BMT-TU/e & UM
How to find the most relevant
pathways?

MAPPFinder and z-score
Other initiatives are available (e.g. StratiGO, MAGO),
but to us they appear even less useful
MAPPFinder and z-score
Number of changed genes at a specific level minus
the expected number at that level from the overall fraction,
divided by the standard deviation of the observed number of genes
and normalized for the size of the observation
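
The verbal definition above can be sketched in code. This is a minimal sketch, assuming the commonly cited hypergeometric form of the MAPPFinder z-score; the function name and the symbols N, R, n and r are assumptions introduced here, not taken from the slides.

```python
import math

def z_score(N, R, n, r):
    """Sketch of a MAPPFinder-style z-score for one pathway or GO level.

    Assumed symbols (not from the slides):
      N: genes measured overall      R: of those, genes changed
      n: genes measured at this level r: of those, genes changed
    """
    if N <= 1:
        return 0.0  # nothing to compare against
    p = R / N                      # overall fraction of changed genes
    expected = n * p               # expected number changed at this level
    # Variance of the observed count, corrected for the size of the observation
    variance = n * p * (1 - p) * (1 - (n - 1) / (N - 1))
    if variance <= 0:
        return 0.0                 # degenerate case: no spread, score set to 0
    return (r - expected) / math.sqrt(variance)

# Hypothetical numbers: 1000 genes measured, 100 changed overall,
# 50 genes at this level of which 15 changed.
print(round(z_score(1000, 100, 50, 15), 2))
```

Returning 0 when the variance vanishes is a choice made for this sketch; it keeps the function defined when all measured genes are changed or none are.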
MAPPFinder and GO hierarchy
Pitfalls

GO levels are not independent
(one gene can appear in more than one level)

Number of genes in a level is arbitrary
(due to isoenzymes etc.)
example:

genes X, Y and Z (where Z has 10 isoenzymes): a change in both genes X and Y
counts the same as a change in two isoenzymes of Z
More than one reporter on the array for the
same gene
This is why we can calculate scores, not P-values
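
The isoenzyme example can be made concrete with a few lines of counting. This is only an illustrative sketch: the gene names Z1 to Z10 and the idea of collapsing isoenzymes into one functional step are assumptions added here to show the difference, not a method from the slides.

```python
# Hypothetical pathway: genes X and Y plus ten isoenzymes of Z.
pathway = ["X", "Y"] + [f"Z{i}" for i in range(1, 11)]

def changed_genes(changed):
    """Number of pathway genes that are in the changed set."""
    return sum(1 for gene in pathway if gene in changed)

def changed_functions(changed):
    """Same count after collapsing all Z isoenzymes into one step 'Z'."""
    steps = {"Z" if gene.startswith("Z") else gene
             for gene in pathway if gene in changed}
    return len(steps)

scenario_a = {"X", "Y"}      # two different genes change
scenario_b = {"Z1", "Z2"}    # two isoenzymes of the same gene change

for name, changed in (("X and Y", scenario_a), ("Z1 and Z2", scenario_b)):
    print(name, changed_genes(changed), "of", len(pathway), "genes;",
          changed_functions(changed), "functional steps")
```

Both scenarios count as 2 changed genes out of 12, so a gene-count based score cannot tell them apart, although they touch a different number of functional steps.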
Pitfalls 2

Z-score calculation starts with:
– Number of genes on map
– Number of genes detectable
– Number of reporters detected

For 2D gels (and in silico studies):
detectable = detected, so z-score = 0
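
One reading of this pitfall is that when every detectable gene is also detected, the observed number on any map equals the expected number, so the score carries no information. The sketch below shows only that numerator; the function name and the symbols N, R, n and r are assumptions, and the data are hypothetical.

```python
def excess_over_expected(r, n, R, N):
    """Observed minus expected number of detected genes on one map.

    Assumed symbols: N detectable genes overall, R detected overall,
    n detectable genes on the map, r detected genes on the map.
    """
    return r - n * R / N

N = 400          # hypothetical number of detectable proteins
R = N            # 2D gel case: detectable = detected

for n in (5, 20, 80):
    r = n        # every detectable protein on the map is detected
    print(n, excess_over_expected(r, n, R, N))
```

Whatever the map size, the difference is zero, which is consistent with the z-score of 0 stated on the slide.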
More challenges
Can we use fold changes?
(1.4-fold changed < 5-fold changed)
Can we combine fold changes and
change statistics (Affy specific)?
Can we use pathway connections?
(e.g. gene Y is important and the
regulation of gene Y is known to be
affected)

More challenges 2
Can we combine transcriptomics and
proteomics?
1 on 1 (mRNA + protein changed)
Causal, e.g.:
protein modified and
modifier mRNA regulated

Our current answer
It is just a tool!
Use your own judgment
Future answer?
Top scoring pathways are:
the most likely influenced
and
the most likely important
Now What?
We need some defined follow-up