An investigation of conserved coexpression amongst seven

An investigation of conserved
coexpression in bacteria
Nels Thorsteinson
Research and Training Centre on Bioinformatics
Institute for Information Transmission Problems
Russian Academy of Sciences
Bioinformatics
Introduction
• Coexpression
– groups of genes with similar expression profiles
– measured by Pearson correlation
– involved in similar functions
• Conserved coexpression
– groups of genes which are coexpressed in multiple species
– involved in core biological processes
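As a minimal sketch of the coexpression definition above, the following scores every gene pair by the Pearson correlation of their expression profiles and keeps the pairs above a cutoff. The gene names, expression values, and the 0.8 threshold are illustrative assumptions, not values from the slides.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    cov = sum(a * b for a, b in zip(dx, dy))
    norm = sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    # A constant profile has no defined correlation; treat it as 0 here.
    return cov / norm if norm else 0.0


def coexpressed_pairs(profiles, threshold=0.8):
    """Return {(gene_a, gene_b): r} for gene pairs with r >= threshold.

    The default threshold is an assumption made for this example.
    """
    pairs = {}
    for gene_a, gene_b in combinations(sorted(profiles), 2):
        r = pearson(profiles[gene_a], profiles[gene_b])
        if r >= threshold:
            pairs[(gene_a, gene_b)] = r
    return pairs


# Hypothetical expression values measured across five conditions.
profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0, 5.0],
    "geneB": [2.0, 4.0, 6.0, 8.0, 10.0],
    "geneC": [5.0, 3.0, 4.0, 1.0, 2.0],
}
pairs = coexpressed_pairs(profiles)
```

Here geneA and geneB rise together and form the only coexpressed pair; geneC is anticorrelated with both and is excluded.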
Methods
• Public data from GEO, ArrayExpress, Stanford
– Escherichia coli
– Bacillus subtilis
– Mycobacterium tuberculosis
– Vibrio cholerae
– Streptococcus pneumoniae
– Campylobacter jejuni
– Streptomyces coelicolor
• NCBI’s COG database
– Orthologue assignment
• STRING database
– Evaluation of coexpression networks
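The slides do not say how coexpressed pairs from different species were combined, so the following is only one plausible sketch: map each gene to its COG, turn coexpressed gene pairs into COG pairs, and call a COG pair conserved when it appears in at least `min_species` species. The function names, the `min_species` parameter, the gene names, and the COG identifiers are all hypothetical.

```python
from collections import Counter


def to_cog_pairs(gene_pairs, gene_to_cog):
    """Translate coexpressed gene pairs into unordered COG pairs."""
    cog_pairs = set()
    for gene_a, gene_b in gene_pairs:
        cog_a, cog_b = gene_to_cog.get(gene_a), gene_to_cog.get(gene_b)
        # Skip genes without a COG assignment and pairs within one COG.
        if cog_a and cog_b and cog_a != cog_b:
            cog_pairs.add(tuple(sorted((cog_a, cog_b))))
    return cog_pairs


def conserved_pairs(per_species_cog_pairs, min_species=2):
    """Keep COG pairs that are coexpressed in at least min_species species."""
    counts = Counter()
    for cog_pairs in per_species_cog_pairs.values():
        counts.update(cog_pairs)
    return {pair for pair, n in counts.items() if n >= min_species}


# Hypothetical coexpressed gene pairs and COG assignments for two species.
species_a = to_cog_pairs(
    [("a1", "a2"), ("a1", "a3")],
    {"a1": "COG_X", "a2": "COG_Y", "a3": "COG_Z"},
)
species_b = to_cog_pairs(
    [("b1", "b2")],
    {"b1": "COG_Y", "b2": "COG_X"},
)
conserved = conserved_pairs({"species_a": species_a, "species_b": species_b})
```

Sorting each COG pair makes the comparison independent of gene order, so the pair found as (COG_Y, COG_X) in the second species matches (COG_X, COG_Y) in the first.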
Figure 1: Similarity of single genome coexpression sets
[Panels a and b: Pearson correlation between coexpression sets (y-axis, -0.02 to 0.16) plotted against evolutionary distance (x-axis; 0 to 80 in one panel, 61.5 to 64.5 in the other)]
Figure 2: Correlation of coexpression sets to STRING's neighbourhood score
[Plot: correlation (y-axis, 0 to 0.25) against number of genomes (x-axis, 1 to 7; genomes labelled E, V, C, M, B, S, S); series: conserved coexpression, averaged single genomes]
Figure 3: Correlation of conserved coexpression sets to STRING's neighbourhood score
[Panels a and b: fold difference of correlations (y-axis, 0 to 1.8) plotted against evolutionary distance (x-axis; 0 to 80 in one panel, 61 to 65 in the other)]
Methods
• Functional classification of the genes in the conserved coexpression network

Functional Classification          Number of genes
Translation, ribosomal             47
Energy production                  9
Transcription                      7
Carbohydrate transport             6
Intracellular trafficking          4
Cell motility                      3
Posttranslational modification     2
Amino acid                         2
Replication, recombination         1

• Only one third of gene pairs consist of genes belonging to the same operon
Conclusion
• The more genomes used when calculating a conserved coexpression network, the higher its correlation to functional interactions
• The greater the evolutionary distance between the species for which a conserved coexpression network is calculated, the higher the correlation of the resulting network to functional interactions
• A conserved coexpression network was presented
Acknowledgements
Mikhail Gelfand
Anya Gerasimova
Alexey Kazakov
Artem Cherkasov
Research and Training Centre on Bioinformatics
Institute for Information Transmission Problems
Russian Academy of Sciences
Bioinformatics