
GENIE – GEne Network Inference
with Ensemble of trees
Van Anh Huynh-Thu
Department of Electrical Engineering and Computer Science,
Systems and Modeling, University of Liege, Belgium
Inference of GRNs
• Gene regulatory networks (GRNs) are the behind-the-scenes players in gene expression
• How do we determine the regulators of each gene?
• Input:
• Gene expression data in different conditions/time points
• A subset of the genes that contains all the regulators (without this subset, GENIE's accuracy plummets)
Underlying Model
• Every reverse engineering tool assumes an underlying model
• GENIE assumes that the GRN is a Boolean network
• Therefore, the regulation of each gene is a Boolean function

GENIE Strategy Outline
• Make no strong assumptions about the possible regulatory interactions (linearity, for example, is a strong assumption)
• Treat time series as static experiments
• Solve the problem for each gene separately, and combine the results
• The final output is a ranking of potential interactions in descending order of confidence
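
The "solve per gene, then combine" step can be sketched as follows. This is a minimal illustration, not GENIE's actual code: the function name `rank_interactions`, the score dictionary layout, and the numbers are all assumptions made for the example.

```python
def rank_interactions(scores_per_target):
    """Merge per-target regulator scores into one ranked edge list.

    scores_per_target maps target gene -> {regulator: confidence score}.
    """
    edges = [
        (regulator, target, score)
        for target, regulator_scores in scores_per_target.items()
        for regulator, score in regulator_scores.items()
        if regulator != target  # skip self-regulation
    ]
    # Highest-confidence interactions first
    return sorted(edges, key=lambda edge: edge[2], reverse=True)


# Hypothetical scores, as if produced by one sub-problem per target gene
scores = {
    "geneA": {"geneB": 0.7, "geneC": 0.1},
    "geneB": {"geneA": 0.4, "geneC": 0.9},
}
ranking = rank_interactions(scores)
```

Each tuple reads (regulator, target, confidence), so the head of `ranking` is the most confident predicted interaction.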

GENIE workflow
Tree-based Ensemble Methods
• A regulation function is a binary tree: at each node, a binary test on a different regulator is performed
• The prediction is at the leaf
• For each gene, randomly draw several sets of samples and grow a tree from each set (the root is the single gene that best splits K random conditions of the target, and so on)
• Rank the regulators according to their importance in the trees
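
A rough sketch of how the root of one tree could be chosen, assuming the split quality is measured by the variance left in the two branches. The names `split_cost` and `best_root`, the threshold search, and the toy data are illustrative assumptions; a real tree ensemble repeats this recursively and over many random samples.

```python
import random
from statistics import pvariance


def split_cost(values_true, values_false):
    """Total squared deviation left in the two branches after a split."""
    cost = 0.0
    for branch in (values_true, values_false):
        if branch:
            cost += len(branch) * pvariance(branch)
    return cost


def best_root(expression, target, candidates, sample_ids):
    """Return (regulator, threshold) that best splits the sampled conditions.

    expression maps gene -> list of expression values, one per condition.
    """
    best = None
    for regulator in candidates:
        for sample in sample_ids:
            threshold = expression[regulator][sample]
            above = [expression[target][s] for s in sample_ids
                     if expression[regulator][s] > threshold]
            below = [expression[target][s] for s in sample_ids
                     if expression[regulator][s] <= threshold]
            if not above or not below:
                continue  # a split must send samples to both branches
            cost = split_cost(above, below)
            if best is None or cost < best[0]:
                best = (cost, regulator, threshold)
    return None if best is None else best[1:]


# Toy data (hypothetical): the target follows regA and ignores regB
expression = {
    "regA":   [0.1, 0.2, 0.9, 1.0, 0.15, 0.95],
    "regB":   [0.5, 0.1, 0.4, 0.2, 0.9, 0.8],
    "target": [1.0, 1.1, 3.0, 3.1, 0.9, 3.2],
}
rng = random.Random(0)
bootstrap = rng.choices(range(6), k=6)  # random conditions, drawn with replacement
root = best_root(expression, "target", ["regA", "regB"], bootstrap)
```

On the full set of six conditions, the search picks `regA` because it separates the low and high target values cleanly, while no threshold on `regB` does.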

Ranking of regulators
#S is the number of samples that reach the node N
#St (#Sf) is the number of samples for which the test at N is true (false)
Var() is the variance of the output
To avoid bias towards highly variable genes, the expression values are first normalized to unit variance
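
The symbols above match the usual variance-reduction importance of a tree node, I(N) = #S·Var(S) − #St·Var(St) − #Sf·Var(Sf). The sketch below assumes that this is the formula the slide showed; the function names and the toy values are hypothetical.

```python
from statistics import pstdev, pvariance


def to_unit_variance(values):
    """Rescale expression values so that their variance equals 1."""
    sigma = pstdev(values)
    return [v / sigma for v in values]


def node_importance(outputs, test_results):
    """I(N) = #S*Var(S) - #St*Var(St) - #Sf*Var(Sf) for one tree node."""
    s_true = [y for y, passed in zip(outputs, test_results) if passed]
    s_false = [y for y, passed in zip(outputs, test_results) if not passed]

    def weighted_var(subset):
        return len(subset) * pvariance(subset) if subset else 0.0

    return weighted_var(outputs) - weighted_var(s_true) - weighted_var(s_false)


# Hypothetical target expression in four conditions
target = to_unit_variance([1.0, 1.2, 3.0, 3.2])
perfect = node_importance(target, [False, False, True, True])
useless = node_importance(target, [False, True, False, True])
```

A test that separates low from high target values (`perfect`) removes almost all of the variance, while a test unrelated to the target (`useless`) removes almost none. Summing I(N) over all nodes that test a given regulator gives one plausible way to score that regulator.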
Best performer in DREAM5
network inference
The Genetic Landscape of the
Cell
Charles Boone
University of Toronto, Donnelly Center
Synthetic Genetic Arrays
• Single mutant strain (query gene) is crossed with all other single mutants
• Double mutants are selected
• Currently done for budding yeast, E. coli and S. pombe
No growth
Genetic Interactions
• Positive interaction: the double knockout is more viable than would be expected from the separate contributions of the single knockouts
• Negative interaction: the double knockout is less viable than would be expected from the separate contributions of the single knockouts
• They crossed ~1,700 yeast single mutants with ~3,800 single mutants and, after filtering out failures, obtained ~5.4 million double mutants
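
One common way to make "expected from the separate contributions" concrete is a multiplicative model, where the expected double-mutant fitness is the product of the two single-mutant fitnesses. The slide does not state the model, so the sketch below is an assumption, and the cutoff and fitness values are purely illustrative.

```python
def interaction_score(fitness_a, fitness_b, fitness_ab):
    """Observed double-mutant fitness minus the multiplicative expectation."""
    return fitness_ab - fitness_a * fitness_b


def classify(score, cutoff=0.08):
    """Label an interaction; the cutoff is a hypothetical illustration value."""
    if score > cutoff:
        return "positive"
    if score < -cutoff:
        return "negative"
    return "none"


# Hypothetical fitness values relative to wild type (1.0)
more_viable = classify(interaction_score(0.8, 0.7, 0.75))  # observed above expected
less_viable = classify(interaction_score(0.8, 0.7, 0.20))  # observed below expected
as_expected = classify(interaction_score(0.8, 0.7, 0.56))  # observed matches expected
```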

Yeast Interaction Map
Edges are interactions that pass the cutoff threshold (170,000)
Proximity in the layout reflects similarity in interaction profiles
Colored sets = GO enrichment
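
Similarity between interaction profiles can be illustrated with a Pearson correlation between the score vectors of two genes. The choice of Pearson correlation and the profile values below are assumptions for the example; the slide does not name the similarity measure.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equally long interaction profiles."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (norm_x * norm_y)


# Hypothetical interaction scores of three query genes against five array genes
profiles = {
    "gene1": [-0.5, 0.1, 0.3, -0.4, 0.0],
    "gene2": [-0.4, 0.0, 0.2, -0.5, 0.1],
    "gene3": [0.3, -0.2, -0.4, 0.2, 0.1],
}
similar = pearson(profiles["gene1"], profiles["gene2"])
dissimilar = pearson(profiles["gene1"], profiles["gene3"])
```

Genes with highly correlated profiles (here `gene1` and `gene2`) would be drawn close together in such a layout, while anti-correlated ones would be drawn far apart.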
Proximity between clusters and
related functions
Proximate clusters
Both require
cytoskeleton genes
Zoom in on pathway
Required for polarization
and growth
Cell division
Red - Negative
Green - Positive
Translation
Budding
Interactions between pathways and complexes were often monochromatic
Positive vs. negative interactions
No interaction
Negative interactions are about twice as prevalent as positive ones
Degree distribution
Hubs are less numerous
Severe fitness defects in single
mutants correlate with degree
Gene duplicates interact less
Correlation between degree and
gene properties
Black - PPI
# morphological phenotypes
# chemical perturbations
unstable structure
Genetic interactions between
cellular processes
Cell cycle is more buffered?
Hubs in the chemical interaction
networks match hubs in GI
network
Single mutant +
chemical =
chemical
interaction
Hydroxyurea blocks
DNA synthesis
Erodoxin (new)
similar to protein
folding-related gene
DNA repair
Discovering Master Regulators of
Alcohol Addiction
William Shin
Center for Computational Biology and
Bioinformatics
Columbia University
Rat Model of Alcohol Addiction
Alcohol Self Administration
Alcohol Vapor Treatment
(Chronic alcohol addiction)
Dependent
Control
No Alcohol Vapor
Non-Dependent
Control
Rat model of alcohol addiction
Induction of alcohol dependence
[Figure: alcohol intake during early withdrawal. Y-axis: alcohol responding (0.5 hr), alcohol self-administration (lever pressing), 0 to 100. Groups: baseline, non-dependent (exposed to air), dependent (exposed to alcohol vapor).]
Identification of TF-target
interactions

• Rat brain regions were sliced and used as microarray samples
• 92 samples from Dependent, Non-Dependent, and Control rats across 8 regions that are known sites of action for addictive drugs
• Applied ARACNE to this data
• Information-theory based (mutual information, MI)
• Tests triplets of genes for indirect interactions
• 130,000 TF-target interactions in total
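
The triplet test can be sketched with the data processing inequality: in every fully connected triplet, the edge with the lowest MI is treated as indirect and dropped. This is a simplified illustration, not the ARACNE implementation; the function name `apply_dpi`, the `tolerance` parameter, and the MI values are assumptions.

```python
from itertools import combinations


def apply_dpi(mi, tolerance=0.0):
    """Drop the weakest edge of every fully connected gene triplet.

    mi maps frozenset({gene_x, gene_y}) -> mutual information estimate.
    """
    genes = sorted({g for pair in mi for g in pair})
    to_remove = set()
    for triplet in combinations(genes, 3):
        edges = [frozenset(pair) for pair in combinations(triplet, 2)]
        if not all(edge in mi for edge in edges):
            continue  # the test only applies when all three edges are present
        weakest = min(edges, key=lambda edge: mi[edge])
        others = [mi[edge] for edge in edges if edge != weakest]
        # Treat the weakest edge as indirect if it is below both other edges
        if mi[weakest] < min(others) * (1.0 - tolerance):
            to_remove.add(weakest)
    return {edge: value for edge, value in mi.items() if edge not in to_remove}


# Hypothetical MI estimates for one triplet
mi = {
    frozenset({"TF1", "geneX"}): 0.8,
    frozenset({"geneX", "geneY"}): 0.6,
    frozenset({"TF1", "geneY"}): 0.3,  # likely indirect, via geneX
}
pruned = apply_dpi(mi)
```

All removal decisions are made against the original MI values, so the outcome does not depend on the order in which triplets are visited.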
Screening of false positives
TF1
TF2
Targets
of TF1
Targets
of TF2
TF1 shadows TF2:
TF2 appears enriched only
because it shares common
targets with TF1
THE MASTER REGULATORS ARE ENRICHED TFS NOT
SHADOWED BY ANY OTHER TF
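
A shadow check along these lines could be sketched as follows: test TF2 for enrichment again after removing the targets it shares with TF1. The hypergeometric test is a stand-in, since the slide does not name the enrichment statistic; `enrichment_pvalue`, `is_shadowed`, `alpha`, and the gene sets are all hypothetical.

```python
from math import comb


def enrichment_pvalue(targets, signature, n_genes):
    """Hypergeometric P(overlap >= observed) for a target set vs. a signature."""
    overlap = len(targets & signature)
    n_targets, n_signature = len(targets), len(signature)
    total = comb(n_genes, n_targets)
    upper = min(n_targets, n_signature)
    return sum(
        comb(n_signature, k) * comb(n_genes - n_signature, n_targets - k)
        for k in range(overlap, upper + 1)
    ) / total


def is_shadowed(tf2_targets, tf1_targets, signature, n_genes, alpha=0.05):
    """True if TF2 is enriched overall but not on its targets unshared with TF1."""
    enriched = enrichment_pvalue(tf2_targets, signature, n_genes) < alpha
    private_targets = tf2_targets - tf1_targets
    if not private_targets:
        return enriched  # every TF2 target is also a TF1 target
    still_enriched = enrichment_pvalue(private_targets, signature, n_genes) < alpha
    return enriched and not still_enriched


# Hypothetical universe of 100 genes (ids 0..99); genes 0..9 form the signature
signature = set(range(10))
tf1_targets = set(range(9)) | {20}
tf2_targets = set(range(6)) | {30, 31, 32, 33}
tf2_is_shadowed = is_shadowed(tf2_targets, tf1_targets, signature, 100)
tf1_is_shadowed = is_shadowed(tf1_targets, tf2_targets, signature, 100)
```

In this toy case TF2 loses its enrichment once the shared targets are removed, while TF1 stays enriched on its own targets, so only TF2 is flagged as shadowed.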
Master regulators in the
accumbens shell
Activity profile at different brain
regions
siRNA validation has a 50-75%
success rate
NOT ALL TARGETS HAVE BEEN TESTED YET