Slide 1 - Institut Pasteur

Putative Biology of PE/PPE Multigene
Families of Mycobacteria and their
Relevance with Regard
to Evolution
Helmi Mardassi, Institut Pasteur de Tunis
M. tuberculosis, the most successful human pathogen,
displays restricted genetic polymorphism and appears
to be exceptionally stable
LAM , Haarlem…
H37Rv, H37Ra…
Beijing…
Sreevatsan et al., 1997
PE/PPE gene families have been uncovered
owing to the availability of the M. tuberculosis
genome sequence
(S.Cole et al., 1998)
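Structure of the PE family members (99 members)
PE subfamily (N=34): conserved N-terminal sequence (PE region, ~110 aa), alone or followed by a C-terminal extension of variable length (any aa sequence)
PE_PGRS subfamily (N=65): PE region followed by a C-terminal extension of variable length, the PGRS portion, (GGAGGA)n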
Structure and sequence of the PE_PGRS33 member
(M. tuberculosis Rv1818c)
Delogu and Brennan, 2002
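The PPE family (69 members)
PPE MPTR (N=23): PPE region (180 aa) with (NxGxGNxG)n repeats; ~200 to >3,500 aa
PPE SVP (N=24): PPE region (180 aa) with the GxxSVPxxW motif; ~200 to ~400 aa
PPE PPW (N=10): PPE region (180 aa) with the PxxPxxW motif; ~200 to ~400 aa
PPE with unique sequence (N=12): PPE region (180 aa) followed by a unique sequence of 0 to ~400 aa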
Constraints
1. Repetitive sequences: extensive cross-reactivity (hybridization and antigenic assays)
2. High G+C content: very difficult to amplify by PCR
3. Poorly expressed in the conventional host E. coli (toxic effect? instability?)
4. Very acidic and membrane-associated proteins: notoriously refractory to analysis by two-dimensional electrophoresis and mass spectrometry
PE/PPE genes are particularly abundant within the
M. tuberculosis complex
Association of PE/PPE with the ESAT-6 (esx) gene cluster
Gey van Pittius, 2006
A plausible scenario for the expansion of the PE/PPE
gene families has been recently proposed
Several questions relating to these multigene families
came to mind
1. Do all PE and PE_PGRS proteins share similar functions?
2. What is the extent of their genetic variability?
3. Do they really play a role in antigenic variation?
4. Are they associated with enhanced virulence and/or transmissibility?
5. Which members are conserved and essential?
Two functions were immediately proposed
Antigenic variability
Interference with antigen processing and presentation in
the context of MHC I molecules
(Cole et al., 1998)
The PGRS domain of PE_PGRS proteins displays significant
sequence similarity with the EBNA1 antigen of EBV
(Brennan & Delogu, 2002)
The PGRS domain confers increased stability
to GFP protein in eukaryotic cells
(Brennan & Delogu, 2002)
PE_PGRS33
Michael Brennan
TM region
PE region
PGRS region
PGRS33 DNA vaccine: reasonable level of protection
The vaccine produced an antibody response only to the PGRS tail region, not to the PE region
PE region: high IFN-γ (low IL-10, no antibody), high Th1 response
PGRS region: low IFN-γ (high IL-10, high antibody), high Th2 response
PE/PPE are genetically variable
◤ In silico comparative sequence analysis
(Cole et al., 1998, Gordon et al., 2001, Fleischmann et al., 2002, Garnier et al., 2003)
◤ Sequence analysis of clinical isolates
[PE_PGRS33 (Talarico et al., 2005), PPE8 (Srivastava et al., 2006), PE_PGRS17, PE_PGRS18 (Karboul et al., 2006)]
◤ Microarray data (Tsolaki et al., 2004; Garcia-Pelayo et al., 2004)
The Molecular mechanisms that operate to generate the
genetic variability in PE/PPE genes
● Dislocations between a replicating strand and its template at repetitive DNA
sequences (replication slippage) (Cole et al., 1998, Machowski et al., 2007)
● Intergenic and intragenic recombination/gene conversion events (Cole et al., 1998,
Gutacker et al., 2006, Karboul et al., 2006, Liu et al., 2006)
● Microsatellite polymorphism (Sreenu et al., 2006)
● Insertion/deletion events of IS and phage sequences within PE/PPE genes
A genomic library-based amplification strategy (GL-PCR)
for efficient mapping of insertion sequences
A typical GL-PCR profile
[Gel figure, panels A and B: lanes M and 1 to 4 for the control vector, MTB14323, and a Haarlem3 MDR-TB outbreak isolate.]
Rv1755 plcD
Rv0403c mmpS1
Rv2819c
Rv0794c:Rv0795c
Rv2815c :Rv2816c
Rv2017:Rv2018
Rv0171 mce1c
Genomic location of IS6110 in
the M. tuberculosis reference
strain MTB14323
Rv2328 PE23
Rv2352c PPE38
Rv2336
PE/PPE genes are differentially expressed
◤ DFI
◤ Promoter trap
◤ cDNA microarray
◤ RT-PCR and QRT-PCR
Subcellular location
● PE/PPE proteins are associated with the cell wall and cell membrane
fractions and appear to be partly exposed on the cell surface (Doran et al., 1992;
Brennan et al., 2001; Banu et al., 2002; Sampson et al., 2001; Okkels et al., 2003; Delogu et al., 2004; Le Moine et al., 2005)
● In silico analysis identified potential beta-barrel outer-membrane structures
in 40 PE/PPE proteins (Pajon et al., 2006)
● PE_PGRS33 influences the cellular architecture, colony morphology and
bacillus-bacillus interaction (Brennan et al., 2001)
Structural genomics/structural biology determined the crystal structure
of a PE/PPE protein complex
Strong et al., 2006
Additional putative function(s) and
relevance of the PE/PPE proteins
1. Antigenic variations
2. Interference with antigen processing and presentation
3. Necessary for replication and persistence of the bacillus within the host cell
4. Vaccine candidate
5. Pre-clinical expression (in the mouse model)
6. Architecture of the bacillus
diagnostic potential
Phenotypic characteristics as inferred from gene
inactivation based experiments
► A transposon mutant of PPE46 was found to be attenuated for growth in macrophages
(Camacho et al., 1999)
► In the M. marinum model, two PE_PGRS genes were found to be essential for the bacillus to
replicate in macrophages and persist in the host granulomas (Ramakrishnan et al., 2000)
► An M. bovis BCG strain in which PE_PGRS33 expression was abrogated could not infect and
survive in macrophages (Brennan et al., 2001)
► PPE31, PPE68, and PE35 are required for growth in vivo during infection of mice (Sassetti et
al., 2003)
► PPE25 (Li et al., 2005) and PPE10 (Stewart et al., 2005) mutants seem to be associated
with the control of phagosomal acidification
Effect of expression of certain PE_PGRS genes in the non
pathogenic M. smegmatis
● M. smegmatis expressing PE_PGRS33 displayed enhanced colonization of
BMM macrophages and increased cell necrosis (Dheenadhayalan et al., 2005)
● PE_PGRS33 elicits TNF-alpha release from macrophages in a TLR2-dependent manner (Basu et al., 2007)
● M. smegmatis expressing the PE_PGRS gene Rv3812c displays increased
resistance in vitro to low pH (Karboul et al., in preparation)
CONCLUSION
● Expansion of PE/PPE proteins in pathogenic mycobacteria seems to have been
accompanied by functional divergence
● Although several members are homologous, there does not seem to be any
compensatory effect. Thus, a high level of functional specialization could have been
reached during evolution
● From these preliminary studies, certain members seem to be endowed with multiple
functions
● Although deletion analyses of PE/PPE genes were accompanied by phenotypic
characteristics, the detailed molecular mechanisms responsible for the observed
effects remain to be demonstrated
The research program relating to PE/PPE gene
families carried out at IPT
Primary objective:
Evaluate the distribution of PE/PPE genes among mycobacteria and
the extent of their genetic variability
Specific aims:
-Comparative sequence analyses of selected genes
-Development of efficient tools for the specific detection of PE/PPE
genes (identification of specific probes)
PE/PE_PGRS members subjected to comparative sequence analysis
[Tree figure: the members below, listed in slide order, are partitioned into five groups, Gr.1 to Gr.5.]
PE34, Rv3872, Rv3020c, Rv1089, Rv3893c, Rv0335c, Rv3018A, Rv3022A, Rv0285, Rv1386, Rv2408,
Rv2328, Rv3812, Rv1646, Rv2107, Rv1169c, Rv3097c, Rv2431c, Rv0151c, Rv0152c, Rv0159c, Rv0160c,
Rv1430, Rv3650, Rv1172c, Rv2099c, Rv1788, Rv1791, Rv2769c, Rv1040c, Rv3622c, Rv1195, Rv3477,
Rv0754, Rv1806, Rv2340c, Rv0916c, Rv3652, Rv1088, Rv1983, Rv2519, Rv1214c, Rv0977, Rv0109,
Rv1768, Rv1087, Rv2162c, Rv1091, Rv1840c, Rv1651c, Rv3653, Rv2098c, Rv1803c, Rv0532, Rv0124,
Rv1396c, Rv0578c, Rv1067c, Rv1068c, Rv1468c, Rv3388, Rv0278c, Rv0279c, Rv0747, Rv0742, Rv2741,
Rv1818c, Rv0746, Rv2396, Rv1325c, Rv0834c, Rv2591, Rv3344c, Rv3512, Rv0833, Rv2126c, Rv3507,
Rv0297, Rv1243c, Rv2490c, Rv0872c, Rv2853, Rv2487c, Rv3367, Rv3595c, Rv0832, Rv3590c, Rv2634c,
Rv1441c, Rv3345c, Rv0978c, Rv0980c, Rv2371, Rv2615c, Rv1450c, Rv1452c, Rv3508, Rv3514, Rv3511
PCR amplification of PE members through the Mycobacterium tuberculosis
complex
Rv 0978
Rv 0285
Rv 0160
Rv 1169
Rv 3367
Rv 1195
Rv 1040
Rv 0980
Rv 1441
Sequence analysis of 22 PE members in Mycobacterium
tuberculosis complex
RESULTS
0 PE
5 PE_PGRS
Conserved
1 PE
2 PE_PGRS
4 PE_PGRS
10 PE
variable
Highly variable
Among the highly variable genes, two conform with the definition of a
duplicated gene pair
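[Figure, panel A: genomic neighborhood of the duplicated gene pair. Labels shown: Rv0981, PE_PGRS16, Rv0982, RpmF, FadE13, Rv0979c, PE_PGRS17, Rv0976c, PE_PGRS18.]
[Figure, panel B: comparison of PE_PGRS17, PE_PGRS18 and PE_PGRS45 across the PE and PGRS regions, with segments labelled MAR1 and MAR2. Values shown: 97 % identity; 98 % identity; 85 % nt identity (90 % aa similarity); 98 % nt identity (98 % aa similarity), shown twice; 30 % nt identity (38 % aa similarity). Coordinates shown: -235, 1, 506, 507, 596, 729, 996, 1206, 1374, 1386.]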
PE_PGRS17
(dots: same base as H37Rv and H37Ra; dashes: gap)
                        250       260       270       280       290       300       310       320
                 ....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|
H37Rv and H37Ra  CAAGCTGGCAGCACCTACGCGGTCGCCGAAGCGGCCAGCGCAACACCGCTGCAGAA------------CGTGCTCGATGC
CDC1551          ......................................................C.GATCGAGCAGGC.C..T.G.GG.T
M.bov AF2122/97  ......................................................C.GATCGAGCAGGC.C..T.G.GG.T
Translation, H37Rv and H37Ra: Q A G S T Y A V A E A A S A T P L Q N V L D A
Variant residues, CDC1551 and M.bov AF2122/97: Q I E Q A L G V

                        330       340       350       360       370       380       390       400
                 ....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|
H37Rv and H37Ra  GATCAACGCACCCGTTCAGTCGCTGACCGGGCGCCCATTGATCGGCGACGGCGCGAACGGGATCGACGGGACCGGGCAAG
CDC1551          .......A.G..GAC.G..G.....GTG......AAGC.......T........CC.....GCGCC...C........G.
M.bov AF2122/97  .......A.G..GAC.G..G.....GTG......AAGC.......T........CC.....GCGCC...C........G.
Translation, H37Rv and H37Ra: I N A P V Q S L T G R P L I G D G A N G I D G T G Q
Variant residues, CDC1551 and M.bov AF2122/97: T T E A V K H A P

                        410       420       430       440       450
                 ....|....|....|....|....|....|....|....|....|....|
H37Rv and H37Ra  CCGGCGGTAACGGCGGGTGGCTGTGGGGCAACGGCGGCAACGGCGGGTCG
CDC1551          .......GGC......CATCT...................T.........
M.bov AF2122/97  .......GGC......CATCT...................T.........
Translation, H37Rv and H37Ra: A G G N G G W L W G N G G N G G S
Variant residues, CDC1551 and M.bov AF2122/97: A I
PE_PGRS18
(dots: same base as H37Rv and H37Ra; dashes: gap)
                        250       260       270       280       290       300       310       320
                 ....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|
H37Rv and H37Ra  CAAGCTAGCAGCACCTACGCGGTCGCCGAAGCGGCCAGCGCAACACCGCTGCAGAA------------CGTGCTCGATGC
CDC1551          ......................................................C.GATCGAGCAGGC.C..T.G.GG.T
M.bov AF2122/97  ........................................................------------............
Translation, H37Rv and H37Ra: Q A G S T Y A V A E A A S A T P L Q N V L D A
Variant residues, CDC1551: Q I E Q A L G V

                        330       340       350       360       370       380       390       400
                 ....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|
H37Rv and H37Ra  GATCAACGCACCCGTTCAGTCGCTGACCGGGCGCCCATTGATCGGCGACGGCGCGAACGGGATCGACGGGACCGGGCAAG
CDC1551          .......A.G..GAC.G..G.....GTG......AAGC.......T...C....CC.....GCGCC...C........G.
M.bov AF2122/97  ................................................................................
Translation, H37Rv and H37Ra: I N A P V Q S L T G R P L I G D G A N G I D G T G Q
Variant residues, CDC1551: T T E A V K R H A P

                        410       420       430       440       450
                 ....|....|....|....|....|....|....|....|....|....|
H37Rv and H37Ra  CCGGCGGTAACGGCGGGTGGCTGTGGGGCAACGGCGGCAACGGCGGGTCG
CDC1551          .......GGC......CATCT...................T.........
M.bov AF2122/97  ..................................................
Translation, H37Rv and H37Ra: A G G N G G W L W G N G G N G G S
Variant residues, CDC1551: A I
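The alignments above appear to use the common dot notation, in which a dot means "same base as the first row" and a dash marks a gap. The slide does not state this convention, so the following Python sketch treats it as an assumption. The function names `expand_row` and `count_substitutions` are illustrative and do not come from the presentation. The sketch expands a dotted row into its full sequence and counts the substituted positions, using the last PE_PGRS17 block (H37Rv versus CDC1551) as input.

```python
def expand_row(reference: str, row: str) -> str:
    """Replace every '.' in an aligned row with the reference base at that column."""
    if len(reference) != len(row):
        raise ValueError("aligned rows must have the same length")
    return "".join(ref if ch == "." else ch for ref, ch in zip(reference, row))


def count_substitutions(reference: str, row: str) -> int:
    """Count columns where the row shows a base that replaces a reference base.

    Columns with '.' (identity) or '-' (gap) in the row, and columns where the
    reference itself has a gap (an insertion in the row), are not counted.
    """
    return sum(
        1
        for ref, ch in zip(reference, row)
        if ch not in ".-" and ref != "-"
    )


# Last block of the PE_PGRS17 alignment, copied from the slide.
h37rv = "CCGGCGGTAACGGCGGGTGGCTGTGGGGCAACGGCGGCAACGGCGGGTCG"
cdc1551 = ".......GGC......CATCT...................T........."

full_cdc1551 = expand_row(h37rv, cdc1551)
n_subs = count_substitutions(h37rv, cdc1551)
```

Applied block by block, the same two helpers can tally the substitutions and the inserted bases that separate the H37Rv-type and CDC1551-type sequences.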
The distribution of the 12/40 polymorphism could define three new
PE_PGRS-based groups
[Figure: schematic of PE_PGRS18 and PE_PGRS17 (PE and PGRS regions, segments MAR1 and MAR2) for each group.]
PGRST3: M. tb H37Rv and H37Ra
PGRST2: M. tb CDC1551
PGRST1: M. tb 210 and M. bovis AF2122/97
A worldwide collection of tubercle bacilli strains was subjected to sequencing analyses
The 12/40 polymorphism was not randomly distributed
[Map figure: four regional labels read PGRST1 and one reads PGRST1>PGRST2>PGRST3.]
Development of a reverse hybridization assay (PEGAssay) for the
large scale analysis of the 12/40 polymorphism distribution
[Figure: PEGAssay results, with rows labelled 17 and 18, for the following samples:]
M. tuberculosis (H37Rv)
M. tuberculosis (H37Ra)
Negative control (Buffer)
M. tuberculosis (Erdman)
M. tuberculosis (CDC1551)
M. smegmatis (mc² 155)
M. africanum (ATCC 25420)
M. microti (ATCC 35782)
M. pinnipedii (FCC69)
M. caprae (CIP 105776)
M. bovis (AF2122/97)
M. bovis BCG (ATCC 27290)
Overall, 521 MTBC isolates were analyzed
415 M. tuberculosis [108 PGG1 (57 Ancestral), 259 PGG2, 48 PGG3]
42 M. bovis (5 BCG strains)
30 M. africanum (14 A1, 6 A2, 8 A3)
17 M. microti (9 voles, 3 llama, 2 cat, 2 human, 1 pig)
3 dassie
4 M. pinnipedii
2 M. caprae
6 “M. canettii” and 2 smooth tubercle bacilli
Within the whole collection of MTBC strains, only the three newly
defined PGRST types could be identified
PE_PGRS17
PE_PGRS18
PGRST1 (+/-)
PGRST2 (+/+)
PGRST3 (-/-)
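The three types can be read as a lookup on two yes/no observations. The Python sketch below assumes that "+" means the 12/40 polymorphism is present in the gene and "-" that it is absent, with the pair written in the order PE_PGRS17/PE_PGRS18 as on the slide. The name `pgrst_type` and the boolean encoding are illustrative choices, not part of the presentation.

```python
from typing import Optional

# (present in PE_PGRS17, present in PE_PGRS18) -> type, as listed on the slide.
PGRST_TYPES = {
    (True, False): "PGRST1",  # (+/-)
    (True, True): "PGRST2",   # (+/+)
    (False, False): "PGRST3", # (-/-)
}


def pgrst_type(in_pe_pgrs17: bool, in_pe_pgrs18: bool) -> Optional[str]:
    """Return the PGRST type for a pair of observations.

    The (-/+) combination is not among the three types reported on the slide,
    so it returns None rather than an invented label.
    """
    return PGRST_TYPES.get((in_pe_pgrs17, in_pe_pgrs18))


example = pgrst_type(True, True)
```

Returning `None` for the fourth combination keeps the sketch from asserting a type the slides do not define.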
PGRST1 was associated with all ancestral tubercle bacilli
and was the most abundant
The whole collection
New York and New Jersey
Tunisia
South Africa
Gene conversion involving the two paralogous PE_PGRS
genes appears to play a crucial role in the diversification
of the modern M. tuberculosis population
Gene conversion is a class of homologous recombination
Recombinant DNA
Gene conversion
Parental DNAs
Crossing over
Recombinant DNAs
Strand break coupled to mismatch repair as the most plausible
explanation for gene conversion
+
-
+
+
The gene conversion event occurs independently multiple times
CONCLUSION
As far as could be ascertained, this work provided the most obvious example of a gene
conversion event in the natural evolution of the mycobacterial species
The findings reinforce the role of gene conversion as a mechanism for the
generation of genetic variability associated with PE/PPE families
Strains of the M. bovis lineage appear to be refractory to gene conversion
The study offers a new perspective to trace back the evolution of tubercle
bacilli
and other smooth tubercle bacilli (-/-)
Phylogenetic analysis of smooth tubercle bacilli (referred to as M.
prototuberculosis) provided insights into the genetics of strains that might have
predominated prior to the expansion of the MTBC
Gutierrez et al., 2005
The sequence polymorphism within the housekeeping genes of the smooth
tubercle bacilli group shows gene mosaicism
Genetic variability of the PE_PGRS duplicated genes
The duplicated PE_PGRS members were previously shown to be
preferentially upregulated in vivo
Development of efficient tools for the
specific detection of PE/PPE genes
Development of a Perl scripting program for the
identification of PE/PE_PGRS member-specific
sequences
PE/PE_PGRS
database
30-base window size
ATCGGGATCCAGGAAT
TCGATCCCCGGTTTTA
ACTATACGCATGTCAT
GCAAGTCCCGTGGGGG
Script 1
extracts a subsequence of the specified length from the gene
sequence, starting at the first base, then shifting to the second,
and so on until the last possible subsequence of the desired
length is obtained. These constitute the candidate primers
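The sliding-window extraction described for Script 1 can be sketched in a few lines. The original scripts were written in Perl and are not shown on the slides, so the Python version below is an illustration under stated assumptions, not the authors' code. The function name `candidate_primers` is hypothetical. Coordinates are reported 1-based and inclusive, which matches probe labels such as PE30_175_204 (a 30-base window).

```python
def candidate_primers(gene_sequence: str, window: int = 30) -> list:
    """Return (start, end, subsequence) for every window of the given length.

    start and end are 1-based inclusive positions. A sequence shorter than
    the window yields an empty list.
    """
    seq = gene_sequence.upper()
    return [
        (i + 1, i + window, seq[i:i + window])
        for i in range(len(seq) - window + 1)
    ]


# Example sequence taken from the slide illustrating the window shift.
slide_sequence = "CCTTAAGGTTGCAACACATGTGGGCCTTAGGAGTCGTTGTTTGTTACGTAATGGGCGTTGG"
primers = candidate_primers(slide_sequence, window=30)
```

Each consecutive entry starts one base further along the gene, so a gene of length L yields L - 30 + 1 candidates for a 30-base window.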
CCTTAAGGTTGCAACACATGTGGGCCTTAGGAGTCGTTGTTTGTTACGTAATGGGCGTTGG
[Figure: the slide repeats this sequence eight times to illustrate the window shifting along it.]
Script 2
converts a sequence in FASTA format
to a format that enables the candidate
primers to search the gene sequence
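The slides do not say which target format Script 2 produces. A plausible reading, assumed here, is that each FASTA record is flattened into one contiguous string so that a window can be matched across what were line breaks in the file. The Python sketch below shows that step only; the function name `read_fasta` and the fallback record names are hypothetical.

```python
def read_fasta(text: str) -> dict:
    """Parse FASTA text into {record name: contiguous upper-case sequence}.

    The record name is the first word after '>'. Wrapped sequence lines are
    joined, and blank lines are ignored.
    """
    chunks = {}
    name = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            parts = line[1:].split()
            # Fall back to a positional name if the header is empty.
            name = parts[0] if parts else "record%d" % (len(chunks) + 1)
            chunks[name] = []
        elif name is not None:
            chunks[name].append(line.upper())
    return {key: "".join(value) for key, value in chunks.items()}


# Toy input, not real gene sequences.
toy_fasta = """>geneA toy record
aaaccc
gggttt
>geneB
CCCGGGAAATTT
"""
toy_database = read_fasta(toy_fasta)
```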
Candidate primers
database
Script 3
searches for the number of times each
candidate primer occurs in the gene
sequences. Extracts those which occur
once.
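The counting step of Script 3 can be sketched as follows. This Python version assumes that "occurs once" means once across the whole set of gene sequences in the database, and it searches the given strand only (reverse complements are ignored). Both points are assumptions, since the slides do not spell them out. The function names are hypothetical, and the brute-force search is written for clarity rather than speed.

```python
def count_occurrences(probe: str, sequence: str) -> int:
    """Count occurrences of probe in sequence, including overlapping ones."""
    count = 0
    start = sequence.find(probe)
    while start != -1:
        count += 1
        start = sequence.find(probe, start + 1)
    return count


def member_specific(database: dict, window: int = 30) -> dict:
    """Return, per gene, the windows that occur exactly once in the database.

    Each hit is (label, subsequence), where label follows the
    name_start_end pattern with 1-based inclusive coordinates.
    """
    result = {}
    for name, seq in database.items():
        hits = []
        for i in range(len(seq) - window + 1):
            probe = seq[i:i + window]
            total = sum(count_occurrences(probe, other) for other in database.values())
            if total == 1:
                hits.append(("%s_%d_%d" % (name, i + 1, i + window), probe))
        result[name] = hits
    return result


# Toy database with one shared 6-mer (CCCGGG); not real gene sequences.
toy_db = {"geneA": "AAACCCGGGTTT", "geneB": "CCCGGGAAATTT"}
specific = member_specific(toy_db, window=6)
```

Shrinking the window, as in the 25, 20 and 16 mer searches reported later in the deck, is then just a matter of calling `member_specific` again with a smaller `window`.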
Putative
PE/PE_PGRS
member specific
primer
30 mer signature sequences (100% identity)

PE subfamily
Probe                Sequence                          Gene
PE30_175_204         #catggtcaggactatcaagctcttagcgca#  Rv3097c
PE30_176_205         #atggtcaggactatcaagctcttagcgcac#  Rv3097c
PE30_177_206         #tggtcaggactatcaagctcttagcgcaca#  Rv3097c
PE30_178_207         #ggtcaggactatcaagctcttagcgcacag#  Rv3097c
PE30_179_208         #gtcaggactatcaagctcttagcgcacagc#  Rv3097c
PE30_180_209         #tcaggactatcaagctcttagcgcacagct#  Rv3097c
PE30_181_210         #caggactatcaagctcttagcgcacagctt#  Rv3097c
PE22_46_75           #gcgacactggagtcccttggttcccacatg#  Rv2107
PE22_47_76           #cgacactggagtcccttggttcccacatgg#  Rv2107
PE22_48_77           #gacactggagtcccttggttcccacatggc#  Rv2107
PE22_49_78           #acactggagtcccttggttcccacatggcg#  Rv2107
PE2_776_805          #ttgcaggcatcacattcgtacacaccaagt#  Rv0152c
PE2_777_806          #tgcaggcatcacattcgtacacaccaagta#  Rv0152c
PE2_778_807          #gcaggcatcacattcgtacacaccaagtat#  Rv0152c
PE2_779_808          #caggcatcacattcgtacacaccaagtatt#  Rv0152c
PE26_1290_1319       #tatctcaatctcaatacatgacaaccagac#  Rv2519
PE26_1288_1317       #gttatctcaatctcaatacatgacaaccag#  Rv2519
PE26_1289_1318       #ttatctcaatctcaatacatgacaaccaga#  Rv2519
PE4_862_891          #ccggcgaatagtccctacccgacacacatt#  Rv0160c
PE12_242_271         #tgagagcaagtgcagacgcgtatgcaaccg#  Rv1172c
PE18_229_258         #gtcaacactctacagatgagctcagggtcg#  Rv1788
PE1_1209_1238        #cgaaccgaacttggaagtaatcgtcaatct#  Rv0151c
PE24_134_163         #caattgccgcaatattgctgtcacacgccc#  Rv2408

PE_PGRS subfamily
Probe                Sequence                          Gene
PE_PGRS21_1027_1056  #gtcaccttcagtagtagcttaagtggcctt#  Rv1087
PE_PGRS21_1028_1057  #tcaccttcagtagtagcttaagtggccttt#  Rv1087
PE_PGRS21_1029_1058  #caccttcagtagtagcttaagtggcctttc#  Rv1087
PE_PGRS21_1030_1059  #accttcagtagtagcttaagtggcctttcc#  Rv1087
PE_PGRS21_1031_1060  #ccttcagtagtagcttaagtggcctttccg#  Rv1087
PE_PGRS21_1032_1061  #cttcagtagtagcttaagtggcctttccgg#  Rv1087
PE_PGRS62_585_614    #ggcgtactacatccaacagattattagctc#  Rv3812
PE_PGRS62_586_615    #gcgtactacatccaacagattattagctcg#  Rv3812
PE_PGRS62_587_616    #cgtactacatccaacagattattagctcgc#  Rv3812
PE_PGRS62_588_617    #gtactacatccaacagattattagctcgca#  Rv3812
PE_PGRS62_589_618    #tactacatccaacagattattagctcgcag#  Rv3812
PE_PGRS62_590_619    #actacatccaacagattattagctcgcaga#  Rv3812
PE_PGRS62_591_620    #ctacatccaacagattattagctcgcagat#  Rv3812
Using a window size of 30 mers
9 PE member specific sequences
2 PE_PGRS member specific sequences
Results of the probe search
Newly identified member specific sequences
Window size                  PE    PE_PGRS
25 mers                       0          1
20 mers                       5         10
16 mers                      13         31
Total (including 30 mers)    27         44
27 PE + 44 PE_PGRS = 71 PE member
specific sequences
82.5% coverage
Overall
2187 member specific sequences targeting PE and PPE
families were derived
Hybridization of a biotinylated PCR product corresponding to a
PE_PGRS gene with member specific sequences of 34 other
PE/PE_PGRS gene probes
a3812 Ps1
a3812 Ps1
Set up of a 50 mer-based PE/PPE
specific microarray protocol
Initial conventional
hybridization
conditions
Improved hybridization
conditions
29 polymorphic sites out of a total of 177 could be detected
58 Tunisian isolates
[Figure: hybridization matrix for the 58 Tunisian isolates and the control strains. Cells are scored + in the large majority of cases, with occasional ? and a single -. Controls: M. bov (two columns), H37Rv, CDC1551.]
Towards PE/PPE-based phylogenomics
Acknowledgements
Part of this work was supported by funds from the United Nations
Development Program/World Bank/World Health Organization Special
Program for Research and Training in Tropical Diseases (TDR).
Special thanks to Anis Karboul and Amine Namouchi, Nico Gey van
Pittius (US, Cape Town), Roland Brousseau (BRI, Montreal), and Cristina
Gutierrez (IP, Paris).