Genomes of bacterial pathogens and their diversity

Philippe Glaser
1. Introduction: general concepts on pathogenic
bacteria and their genomes
2. How to sequence a bacterial genome
3. Two examples: the genus Listeria and
Streptococcus agalactiae
Examples of bacterial species and diseases
Tuberculosis: Mycobacterium tuberculosis
Leprosy: Mycobacterium leprae
Cholera: Vibrio cholerae
Whooping cough: Bordetella pertussis
Sore throat: Streptococcus pyogenes and viruses
Meningitis: Neisseria meningitidis and other bacteria
Gonococcal infections (gonorrhea): Neisseria gonorrhoeae
Plague: Yersinia pestis
Dysentery: Shigella flexneri
Gastric cancer, ulcer, gastritis: Helicobacter pylori
Multiple diseases: Escherichia coli, Staphylococcus aureus
…
Published genome sequences of bacterial pathogens
Shigella
Escherichia coli
Salmonella
Helicobacter
Pseudomonas
Yersinia
Stenotrophomonas
Burkholderia
Flavobacterium
Acinetobacter
Vibrio
Campylobacter
Staphylococcus
Enterococcus
Streptococcus
Listeria
Nocardia
Corynebacterium
Mycoplasma
2
4+2
3
3
1+2
3
0
0
0
0
4
1
4
1
9
4+1
0
1+3
6
1
12
1
3
2
Chlamydiae
Neisseria
Branhamella
Bordetella
Pasteurella
Actinobacillus
Haemophilus
Bartonella
Legionella
Leptospira
Borrelia
Treponema
Mycobacterium
Rickettsia
Anaplasma
Coxiella
Ehrlichia
Clostridium
4+1
2
0
3
1
0
2
3
3
2
1
2
5
3
0
1
0
2+1
2
4
2
1
Total: > 80 published genomes
Biodiversity of the microbial world
4 000 000 000 000 000 000 000 000 000 000 bacteria on hearth
3,5 billion years of evolution
5000 culturable species - 500 000 (?) species
Bacterial diversity in a Yellowstone
hot spring
Principle of the experiment:
Sample
PCR amplification of 16S rRNA genes
Cloning: 300 clones
First analysis by restriction
DNA sequencing of 122 clones: 54 bacterial groups
84 sequences: 14 phyla
38 sequences: 12 new phyla
(Hugenholtz et al., J. Bacteriol. 1998, 180: 366-376)
Diversity of the non-culturable bacterial world
How to define a bacterial species
• For eukaryotes the species definition is based on
sexual reproduction.
Not possible for bacteria
1. Phenotypic definition
2. Molecular definition:
70% “similarity” by genomic DNA hybridization
More than 97% identity between the 16S rRNA genes
=> A convenient definition but not fully satisfactory
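The 16S criterion can be illustrated with a minimal sketch. It assumes the two 16S rRNA gene sequences have already been aligned to the same length with `-` as the gap character; the function names and the choice to skip gapped columns are assumptions made for illustration, not part of the original slides.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent identity over aligned columns, skipping columns with a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # gapped columns are ignored in this simple sketch
        compared += 1
        matches += a == b
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


def same_species_by_16s(seq_a: str, seq_b: str, threshold: float = 97.0) -> bool:
    """Apply the 'more than 97% identity' rule quoted on the slide."""
    return percent_identity(seq_a, seq_b) > threshold
```

A real comparison would start from a proper alignment of full-length 16S genes; as the slide notes, the threshold is convenient rather than fully satisfactory.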
Interactions between humans (the host)
and bacteria
• The human body constitutes multiple ecosystems for bacterial communities:
– The digestive tract
– The throat
– The skin
– Other places are normally sterile (urine, milk, blood)
› Symbiotic bacteria
› Commensal bacteria
› Pathogenic bacteria
• Opportunistic pathogens and obligatory pathogens
Bacteria and their environments
Reservoir: animals, water, soil, food, …
Human host
Vectors
The ecology of a pathogenic bacterium: understanding its adaptation to these environments (growth conditions)
Some questions in the study of human
bacterial pathogens
• What are the virulence factors and the host-pathogen interaction factors?
• What is the physiology (the metabolism) of the bacterium in interaction with the host?
• What evolution led to the adaptation of the bacterium to its host, and how does it relate to the non-pathogenic related species?
• The identification of diagnostic and typing molecular tools
• The identification on a rational basis of antigens for acellular vaccines
• The identification of drug targets
=> How to use genomics (and post-genomics) to solve these questions
Evolution & Biodiversity
Genome variability
DNA repair
Barriers to DNA transfer
Selection
Point mutation
Genome rearrangement
Gene duplication
Horizontal gene transfer
Biodiversity
=> virulence and pathogenicity
Size of bacterial genomes
Nanoarchaeum equitans: <500 kb
Mycoplasma genitalium: 0.580 Mb, 481 genes
Minimal genome: 300-400 genes
Escherichia coli: 4.6-5.6 Mb, 4289-5648 genes
Mesorhizobium loti: 7.036 Mb, 6752 genes
Streptomyces coelicolor: 8.667 Mb, 7825 genes
Human: 3,000 Mb, 30000 genes
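As a worked illustration of the figures on this slide, the sketch below computes gene density (genes per Mb) from the listed sizes and gene counts. The dictionary layout and function name are assumptions made for illustration; only the numbers come from the slide.

```python
# Genome size (Mb) and gene count as listed on the slide.
GENOMES = {
    "Mycoplasma genitalium": (0.580, 481),
    "Mesorhizobium loti": (7.036, 6752),
    "Streptomyces coelicolor": (8.667, 7825),
    "Human": (3000.0, 30000),
}


def genes_per_mb(size_mb: float, genes: int) -> float:
    """Gene density in genes per megabase."""
    if size_mb <= 0:
        raise ValueError("genome size must be positive")
    return genes / size_mb


if __name__ == "__main__":
    for name, (size_mb, genes) in GENOMES.items():
        print(f"{name}: {genes_per_mb(size_mb, genes):.0f} genes per Mb")
```

With these numbers the three bacterial genomes come out at roughly 800 to 1000 genes per Mb, about one gene per kb, while the human figure is 10 genes per Mb.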
Adaptation: number of transcription regulators vs. genome size
(http://www.regx.de/m_project_bioinformatics.php)
Gene transfers in bacteria
Bacteriophages: transduction
Plasmids, transposons: conjugation
Competence: transformation
Mobile elements and gene gain
• IS elements => no associated function, gene integration by IS-mediated homologous recombination, gene inactivation.
• Transposons => carry functional genes
• Integrons => a platform to incorporate new functions, multiple antibiotic resistance.
• Phages => may carry virulence genes (cholera toxin)
• Pathogenicity (functional) islands
• Plasmids => may also carry transposons or integrons
• + gene duplication
=> Identification of such elements in genome sequences
Gene loss
• By homologous recombination
• By insertion of IS elements
• By mutation: gene => pseudogene
=> Evolutionary impact
=> Reductive evolution (M. leprae, Y. pestis, B. pertussis)
=> Role in virulence: lysine decarboxylase in Shigella (cadA+ derivatives are less virulent)
Antigenic variation
• By recombination: a gene cassette is inserted in front of an active promoter or removed from this position (Brucella, Mycoplasma gallisepticum)
• By mutation: variation in the length of a microsatellite sequence (homopolymer tract) leads to a frameshift or to its reversion (Helicobacter pylori, Neisseria meningitidis)
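The mutation mechanism can be pictured with a toy model: if a homopolymer tract inside a coding sequence gains or loses bases, the downstream reading frame is preserved only when the length change is a multiple of three. The sketch below assumes a hypothetical reference tract length at which the gene is in frame ("ON"); the names are illustrative, and real phase variation also depends on details such as premature stop codons.

```python
def is_in_frame(tract_length: int, reference_length: int) -> bool:
    """True if the tract length differs from the in-frame reference by a multiple of 3."""
    return (tract_length - reference_length) % 3 == 0


def phase_states(reference_length: int, lengths) -> dict:
    """Map each tract length to 'ON' (in frame) or 'OFF' (frameshifted)."""
    return {
        n: "ON" if is_in_frame(n, reference_length) else "OFF"
        for n in lengths
    }


if __name__ == "__main__":
    # Hypothetical gene that is in frame with a 10-base tract.
    for length, state in phase_states(10, range(7, 14)).items():
        print(length, state)
```

In this model a single-base slip switches the gene off and a compensating slip restores it, which corresponds to the frameshift and reversion described in the bullet above.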
Protein families and gene duplications
• May arise by gene duplication or horizontal gene acquisition
• Metabolic functions, surface proteins (antigens)
• Often specific to a species
• Frequently discovered after whole genome sequencing
Analysis of the genome of a bacterial pathogen
• Annotation of the genome
• Analysis of regulatory genes
• Analysis of inactivated genes (pseudogenes)
• Identification of protein families and mechanisms of phase variation
• Identification of mobile elements
• Identification of atypical regions (recently acquired)
=> Information obtained from comparative genomics
DNA sequencing
Automated DNA sequencing machines produce reads 800 bases long with an accuracy of 99%.
=> How to sequence a 4 Mb bacterial genome with an accuracy higher than 99.99%?
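One way to see why redundancy answers this question is a deliberately simple model, sketched below: each position is read several times, reads err independently with probability 1%, and the consensus is wrong only if a strict majority of the reads is wrong. The independence assumption and the function names are illustrative assumptions; real assemblers such as Phrap weight bases by quality scores instead of taking a plain majority.

```python
from math import comb


def consensus_error(depth: int, read_error: float = 0.01) -> float:
    """Probability that a strict majority of `depth` independent reads is wrong."""
    if depth < 1:
        raise ValueError("depth must be at least 1")
    first_losing = depth // 2 + 1  # smallest number of wrong reads forming a majority
    return sum(
        comb(depth, k) * read_error**k * (1 - read_error) ** (depth - k)
        for k in range(first_losing, depth + 1)
    )


def min_odd_depth(read_error: float = 0.01, target_error: float = 1e-4) -> int:
    """Smallest odd depth whose consensus error does not exceed the target."""
    if not 0 <= read_error < 0.5:
        raise ValueError("read_error must be in [0, 0.5)")
    depth = 1
    while consensus_error(depth, read_error) > target_error:
        depth += 2
    return depth
```

Under these assumptions three reads per position give an error of about 3 in 10,000, and five reads already fall below the 1 in 10,000 target.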
Two strategies : directed or random
Directed strategy
Chromosome
Ordering clones of a large-insert library (cosmids, lambda or BAC)
Sequencing clone by clone of the minimum tiling path
Complete sequence
Random strategy
Chromosome
Random sequencing of a large number of clones
Sequence assembly
Complete sequence
‘Whole genome shotgun’
Chromosome
Large-insert library (pSYX34 and BAC): end-sequencing (large-insert fragments)
Small-insert library (pcDNA2.1): end-sequencing (small-insert fragments)
Assembly of sequences in contigs
Closure and annotation
Complete genome sequence
Organization of a project
Choice of the strategy
Library construction
DNA preparation of plasmid clones
High throughput sequencing of both ends of inserts
Assembly
Finishing: gap closure and resequencing of low quality regions
Annotation
Libraries
Libraries of insufficient quality => No sequence
Important features : coverage of the chromosome, absence of
co-ligation, absence of clones without an insert, size of the inserts.
Different types of libraries:
* size of the inserts
* copy number of the vector
High-copy number vector : 1 to 3 kb inserts
Low-copy number vector : 8 to 12 kb inserts
Bacterial artificial chromosome : 50 to 100 kb inserts
Construction of a library with 1 - 3 kb inserts
Insert preparation: chromosomal DNA, nebulization, end repair by T4 polymerase, ligation of BstXI adaptors, size selection of the inserts
BstXI adaptor:
5’pCTTTCCAGCACA3’
3’GAAAGGTCp5’
Vector preparation: pcDNA (high-copy-number vector) with two repeated BstXI sites, purification of the digested vector (two 5’ protruding ends)
5’CCAG TGTG ATGG…CCAG CACA CTGG3’
3’GGTC ACAC TACC…GGTC GTGT GACC5’
Cohesive ends labelled in the figure: TGTG / ACAC and CACA / GTGT
Ligation, transformation
Recombinant plasmid
Bacterial artificial chromosome (BAC)
Vector based on the naturally occurring F-factor plasmid found in E. coli
Cloning of DNA fragments of 100 to 300 kb (average, 150 kb) in E. coli
» strict copy number control
» stably maintained at 1-2 copies per cell
» lacZ-based color selection of BAC clones with inserts
BAC library construction
Preparation of chromosomal DNA in agarose plugs
Partial digestion with HindIII or BamHI
Ligation of the vector with DNA purified from agarose plugs
Electroporation into E. coli DH10B
Verification of insert size on PFGE gels after NotI digestion: inserts of 70 - 150 kb, linearized BAC vector (7 kb)
[Gel images with size markers at 50, 100, 150 and 200 kb]
Automation
High throughput sequencing
DNA Sequencing 15 years ago!
Automated DNA sequencing
Automated sequencing
Sequence assembly
Phred, Phrap, Consed
http://www.phrap.org
Statistics and progress of the project
Finishing
Re-sequencing of regions containing low ‘quality’ sequences
Sequencing of ‘missing’ regions
Sequence gaps and cloning gaps
[Figure: contigs A to F separated by sequence gaps and cloning gaps]
Timing of a bacterial genome project
Library construction and verification (one month)
Plasmid preparation 5000 minipreps per Mb (7 days)
Sequencing : 10000 sequences per Mb (20 days, ABI 3700)
PCR : highly variable (250 reactions per Mb)
Consumable costs : 10 000 Euro per Mb
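The sequencing figures above can be turned into an expected depth of coverage. The sketch below combines the 10000 sequences per Mb quoted here with the 800-base read length quoted on the DNA sequencing slide, and applies the simplest form of the Lander-Waterman estimate for the number of gaps in a random shotgun (N x e^(-c), ignoring the minimum overlap needed to join reads). Function names are illustrative, and usable read lengths are in practice shorter than 800 bases, so this is an optimistic sketch.

```python
from math import exp


def coverage(n_reads: int, read_length: int, genome_size: int) -> float:
    """Average depth of coverage c = N * L / G."""
    if genome_size <= 0:
        raise ValueError("genome size must be positive")
    return n_reads * read_length / genome_size


def uncovered_fraction(depth: float) -> float:
    """Expected fraction of the genome covered by no read (Poisson model)."""
    return exp(-depth)


def expected_gaps(n_reads: int, depth: float) -> float:
    """Simplified Lander-Waterman estimate of the number of sequence gaps."""
    return n_reads * exp(-depth)


if __name__ == "__main__":
    depth = coverage(10_000, 800, 1_000_000)  # reads per Mb from the slide
    reads_4mb = 4 * 10_000                    # a 4 Mb genome
    print(f"depth: {depth:.1f}x")
    print(f"uncovered fraction: {uncovered_fraction(depth):.5f}")
    print(f"expected gaps for 4 Mb: {expected_gaps(reads_4mb, depth):.1f}")
```

Under these assumptions the project reaches about 8x coverage and leaves on the order of ten gaps in a 4 Mb genome, which is consistent with a finishing stage based on a few hundred PCR reactions per Mb.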
Listeria monocytogenes
Foodborne pathogen
Transmission: dairy products, meat, vegetables, fish
Disease: meningitis, encephalitis, septicemia, abortions, neonatal infections, gastroenteritis
Population at risk: elderly, newborns, immunocompromised, pregnant women
Mortality rate: 30%
Concern for public health
Problem for food industry
Ecology of L. monocytogenes
• Ability to survive and to grow in extreme conditions: low
temperature, low water activity, broad ranges of pH…
• Ubiquitous in the environment but at very low count
• Variable count depending on the microenvironment and the
season at a single location
• Interaction with the vegetal world (silage) and the animal
world (waste)
Interaction of Listeria with its hosts
• Carriage is frequent but transient
• Low concentration of Listeria in feces
• Intracellular parasite
• Ability to cross three barriers: intestinal, hemato-encephalic and placental barrier
• Provokes a broad range of diseases: gastroenteritis, septicemia, meningitis, encephalitis, abortions
• At-risk population: immunocompromised, elderly, pregnant women and newborns
What are the relations between the two facets of this bacterium?
Phylogenetic tree of the genus Listeria
L. ivanovii
L. grayi
L. seeligeri
L. innocua
L. welshimeri
L. monocytogenes
(Pathogenic species)
B. subtilis
Vaneechoutte et al. Int J Syst Bact. (1998) 48, 127-139
Genome comparison
(columns: L. monocytogenes EGDe | L. monocytogenes 4b | L. innocua | L. ivanovii)
Genome size: 2944 kb | 2943 kb | 3011 kb | 2929 kb
rRNA operons: 6 | 6 | 6 | 6
CDS: 2848 | 2795 | 2968 | 2782
Phages: 1 | 0 | 5 | 0
IS: 1 (3 copies), 1 transposon | 0 | 0 | 5
Plasmid: -- | -- | 81.9 kb | --
L. monocytogenes/B. subtilis synteny
[Dot plot: gene positions in Listeria monocytogenes (0 to 3,000,000 bp) against B. subtilis (0 to 4,500,000 bp)]
Synteny between Listeria genomes
[Dot plots: L. monocytogenes EGDe against L. innocua and against L. ivanovii]
=> Absence of rearrangement between genomes
=> Rare translocations: probably deletion + insertion
L. monocytogenes chromosome map
L. monocytogenes: 270 ‘specific’ genes
L. innocua: 149 ‘specific’ genes
http://genolist.pasteur.fr/listilist
G+C content of the 270 CDSs specific for L. monocytogenes
[Histogram: number of CDSs (%, 0 to 14) as a function of G+C% (25 to 52), for all CDSs (Total) and for the specific CDSs]
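A G+C content that departs from the genome average is one of the usual hints that a region was acquired recently. The sketch below flags sequence windows whose G+C% differs from the whole-sequence mean by more than a chosen margin; the window size, step, margin and function names are illustrative assumptions, not the parameters used for the figure.

```python
def gc_percent(seq: str) -> float:
    """G+C content of a sequence, in percent."""
    if not seq:
        raise ValueError("empty sequence")
    seq = seq.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def atypical_windows(genome: str, window: int = 1000, step: int = 1000,
                     max_deviation: float = 5.0) -> list:
    """Return (start, G+C%) for windows deviating from the genome mean."""
    mean = gc_percent(genome)
    flagged = []
    for start in range(0, len(genome) - window + 1, step):
        gc = gc_percent(genome[start:start + window])
        if abs(gc - mean) > max_deviation:
            flagged.append((start, gc))
    return flagged


if __name__ == "__main__":
    # Toy sequence: 50% G+C backbone with a 100-base A+T-only insert.
    toy = "ATGC" * 100 + "AT" * 50 + "ATGC" * 100
    print(atypical_windows(toy, window=100, step=100, max_deviation=10.0))
```

The same idea can be applied per CDS rather than per window, which is closer to what the histogram shows.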
Competence operons in L. monocytogenes
[Figure: the comG (genes A to G), comE (genes A to C), comC and comF (genes A and C) operons. Each gene carries a pair of values such as 37-34: the first number is the G+C%, the second the % identity with the B. subtilis (Bs) ortholog.]
2695.1 and 2014.1: two comEC paralogs (DNA binding protein)
41 surface proteins with an LPXTG motif
[Bar chart: length in amino acids (0 to 2500) of the 41 proteins, including the InlA-like proteins; * = absent from the L. innocua genome]
L. monocytogenes / L. innocua comparison
Known virulence factors missing in L. innocua
Surface proteins missing in L. innocua
Metabolic pathways missing in L. innocua
Sugar PTS
Hexose phosphate permease
Bile acid hydrolase
Arginine deiminase
Glutamate decarboxylase
L. monocytogenes - L. ivanovii
[Circular genome map (origin at 2944 / 0) showing the virulence gene cluster, inlA inlB, hpt, bsh and inlC]
L. monocytogenes: 345 ‘specific’ genes
L. ivanovii: 350 ‘specific’ genes
The virulence gene cluster
[Figure: organization of the chromosomal region between prs and ldh, drawn along the phylogenetic tree of L. monocytogenes, L. innocua, L. ivanovii, L. welshimeri, L. seeligeri, L. grayi and B. subtilis. Gene orders shown include prs prfA plcA hly mpl actA plcB orfX orfZ orfB orfA ldh and plcA hly mpl i-actA plcB orfS orfT orfB orfA; other labelled genes are gcaD, spoVC, ctc, mfd and yabK, and one segment is labelled orfB 5.5 kb.]
=> Complex history with several events of insertion and deletions
[Figure: infection cycle of L. monocytogenes]
Entry: InlA, InlB
Lysis of the vacuole: LLO, PlcA
Intracellular movement: ActA
Cell-to-cell spread: ActA
Lysis of the double membrane: LLO, PlcB
The inlA - inlB region
[Figure: gene maps of the inlA - inlB region in L. monocytogenes EGDe, L. monocytogenes 4b, L. innocua and L. ivanovii. Labelled genes: lmo0415 (amidase), lmo0432, inlA, inlB, lmo0435 (LPXTG), lmo0439, lin439, wapA-like, inl-like and inlB-like; one segment is marked as 17 genes.]
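Other virulence genes
bsh, bile salt hydrolase
[Figure: the bsh region next to groEL (gene 2066) in L. monocytogenes, L. ivanovii and L. innocua; an LPXTG gene is also labelled.]
PrfA box is not conserved
hpt, hexose phosphate transport
[Figure: the hpt region between genes 837 and 839 in L. monocytogenes, L. ivanovii and L. innocua, with intergenic distances of 295 nt, 107 nt, 457 nt, 153 nt and 31 nt; one symbol marks a pseudogene.]
The PrfA box is conserved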
Listeria ivanovii - closer to a real pathogen?
Some specific functions related to virulence
A second pathogenicity island
[Figure: the region between lmo1240 and lmo1242, next to a tRNA gene. L. monocytogenes and L. innocua: tRNA, lmo1240, 1241, 1242. L. ivanovii: tRNA, lmo1240, i-inlB2, sphingomyelinase-c, i-inlL, i-inlK, i-inlB, i-inlJ, i-inlI, i-inlH, i-inlG, i-inlF, i-inlE, lmo1242.]
[Figure: the region between lmo2699 and 2700 in L. monocytogenes and L. innocua compared with L. ivanovii; labels include a soluble internalin and an LPXTG protein.]
Capsule biosynthesis ?
And 96 inactivated genes (pseudogenes)
Conclusions
=> Contrary to the rest of the genome, virulence genes have a complex history.
=> Possible cycles of virulence gene gain and loss. These cycles may play a role in the evolution of the genus and in the emergence of species.
=> Functions required for intracellular multiplication are conserved between the two pathogenic species.
=> Interactions with the host and physiopathology are probably different and involve different factors.
=> The specialization of L. ivanovii is linked to the presence of specific genes and to the loss of a large number of functions.
What is the diversity within the species L. monocytogenes?
Listeria monocytogenes
Serovars: 1/2a, 1/2b, 1/2c, 3a, 3b, 3c, 4a, 4ab, 4b, 4c, 4d, 4e, 7
Epidemiological data
• The great majority of human listeriosis
cases is caused by 1/2a, 1/2b and 4b strains
• Serovar 4b strains are responsible for almost
all major epidemics of human listeriosis as well
as for most of the sporadic cases
AscI profiles of L. monocytogenes strains (WHO multicenter study)
AscI genomic fingerprints of 62 representative Listeria monocytogenes strains
Genomic Division I: 1/2a, 3a, 1/2c, 3c
Genomic Division II: 1/2b, 3b, 4b, 4d, 4e
[Gel size markers: 582, 485, 388, 291, 242, 194, 145, 97, 48 and 23 kb]
Brosch et al., 1994, AEM 60:2584-92
High density membranes for Listeria
hybridisation with chromosomal DNA of
• clinical (epidemic) isolates
• food isolates
• environmental isolates
Correlation of genomic and epidemiological data
Should allow the:
Identification of genes consistently absent or present in e.g.
epidemic and clinical isolates
Development of:
New tools for genomic typing
New accurate methods for diagnostics
[Figure: membrane spots for gene A, gene B, gene C and a control, hybridized with DNA of L. monocytogenes EGDe (1/2a), L. innocua 6a and L. monocytogenes 4b]
Hybridization patterns of L. monocytogenes
[Figure: membranes hybridized with genomic DNA of L. monocytogenes serovars 1/2a, 1/2c, 4b and 1/2b]
Hybridisation with different Listeria strains
L. monocytogenes: 94 strains
Serovar: 1/2a, 1/2c, 1/2b, 3a, 3b, 3c, 4a, 4b, 4c, 4d, 4e, 7
Origin: environment, food, animals, production environment, human (sporadic and epidemic cases)
Listeria ivanovii: 5 strains
Listeria innocua: 7 strains
Listeria welshimeri: 2 strains
Listeria seeligeri: 2 strains
In total 110 strains belonging to all species of the genus Listeria
Grouping 460 genes for 112 strains of Listeria
[Figure: clustering of the strains into Listeria sp. and L. monocytogenes groups I (I.1, I.2), II (II.1, II.2) and III]
Serovar groups: 4b, 4e, 4d; 1/2b, 3b, 7; 1/2a, 3a; 1/2c, 3c; 4a, 4c
Genes labelled in the figure: ORF0799, ORF2372, ORF2110, ORF2819, ORF3840, ORF2568, ORF1761, ORF0029, Lmo0171, Lmo0172, Lmo0525, Lmo0734, Lmo0735, Lmo0736, Lmo0737, Lmo0738, Lmo0739, Lmo1060, Lmo1061, Lmo1062, Lmo1063, Lmo1968, Lmo1969, Lmo1971, Lmo1973, Lmo1974
Conclusion
=> The L. monocytogenes species shows a broad genomic diversity
=> Genomes are stable and horizontal genetic exchanges are rare.
=> The species and subspecies are well defined by a set of genes and it seems that there is no continuum between groups.
=> The notion of species is probably not only an arbitrary one.
=> DNA arrays are a powerful genome-level typing tool for epidemiological studies and research.
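Array hybridization yields, for each strain, a presence/absence profile over the genes on the membrane. As a minimal sketch of how such profiles can be grouped, the code below measures the number of genes that differ between two profiles and joins strains by single linkage under a distance threshold. The strain names, the `+`/`-` encoding, the threshold and the linkage method are assumptions for illustration and are not the clustering procedure used in the study.

```python
from itertools import combinations


def hamming(profile_a: str, profile_b: str) -> int:
    """Number of genes whose presence/absence call differs."""
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must cover the same genes")
    return sum(a != b for a, b in zip(profile_a, profile_b))


def single_linkage_groups(profiles: dict, max_distance: int) -> list:
    """Group strains linked by chains of pairs differing at <= max_distance genes."""
    group_of = {name: {name} for name in profiles}
    for a, b in combinations(profiles, 2):
        close = hamming(profiles[a], profiles[b]) <= max_distance
        if close and group_of[a] is not group_of[b]:
            merged = group_of[a] | group_of[b]
            for name in merged:
                group_of[name] = merged
    unique = {id(group): group for group in group_of.values()}
    return sorted(sorted(group) for group in unique.values())


if __name__ == "__main__":
    # Hypothetical profiles over 8 genes ('+' present, '-' absent).
    toy = {
        "strain_A": "++++----",
        "strain_B": "+++-----",
        "strain_C": "----++++",
        "strain_D": "---+++++",
    }
    print(single_linkage_groups(toy, max_distance=2))
```

Well-separated groups with no intermediate profiles, as in this toy example, are what the "no continuum between groups" conclusion describes.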
Streptococcus agalactiae (group B)
Part of the normal flora colonizing the gastrointestinal tract of a large part of the population; may also colonize the urogenital tract.
Disease: Rare infections of immunocompromised adults
Leading cause of invasive infections in neonates
septicemia (early onset disease)
pneumonia (early onset disease)
meningitis (late onset disease)
=> Surveillance of pregnant women to avoid mother-infant transmission
=> Development of a vaccine ?
Biodiversity within the species S. agalactiae
Two ecovars
Characteristics
Human
Bovine mastitis
__________________________________________________
Pigment
Lactose
Salicin
Beta-galactosidase
Bacitracin sensitivity
Protein antigens
+
+
R, Icp
+
+
+
+
X
(Finch & Martin, 1984)
Other animal origins: diseases in various mammals and fishes
Human origin: carriage or invasive strains
MLEE and MLST pointed to the existence of a hypervirulent lineage.
Q. What is the genomic basis of this diversity?
Phylogenetic relationship among Streptococci
S. agalactiae
S. pyogenes
S. equi
S. anginosus
S. pneumoniae
S. mitis
S. uberis
S. sanguis
S. suis
S. salivarius
S. bovis
S. mutans
S. pleomorphus
(from Kawamura et al. J. Syst. Bacteriol. 1995)
Genome comparison
(columns: S. agalactiae NEM316 | S. pyogenes | S. pneumoniae)
Size of the genome: 2 206 kb | 1852 kb | 2160 kb
Ribosomal operons: 8 | 6 | 4
CDSs: 2182 | 1752 | 2236
Mobile elements: 8 IS, 12 phage-like integrases, 2 integrated plasmids (1 with 3 copies, 42 kb) | 17 IS, 4 bacteriophages | 105 IS
Synteny between S. agalactiae and S. pneumoniae
(1141 pairs of orthologous genes)
[Dot plot: gene positions in S. agalactiae against S. pneumoniae, both axes from 0 to 2,000,000 bp]
Synteny between S. agalactiae and S. pyogenes
(1170 pairs of orthologous genes)
[Dot plot: gene positions in S. agalactiae against S. pyogenes, both axes from 0 to 2,000,000 bp]
36 recombination breakpoints
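A breakpoint count such as the 36 quoted for S. agalactiae and S. pyogenes can be pictured as follows: list the orthologous genes in the order of one genome, replace each by its rank in the other genome, and count the places where two neighbours are no longer adjacent. The sketch below treats the chromosome as linear and ignores gene orientation, both simplifying assumptions; the function name is illustrative and this is not necessarily how the figure on the slide was obtained.

```python
def count_breakpoints(ranks: list) -> int:
    """Count adjacencies of genome A that are broken in genome B.

    `ranks` holds, in genome A order, the rank of each ortholog in genome B.
    Neighbours whose ranks differ by exactly 1 (in either direction, so that
    inverted segments stay contiguous) are considered conserved.
    """
    return sum(
        1 for left, right in zip(ranks, ranks[1:]) if abs(right - left) != 1
    )


if __name__ == "__main__":
    print(count_breakpoints([1, 2, 3, 4, 5]))  # colinear genomes
    print(count_breakpoints([1, 4, 3, 2, 5]))  # one internal inversion
```

A single internal inversion creates two breakpoints in this model, so a few dozen breakpoints between two species can correspond to a modest number of rearrangement events.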
14 mobile islands (532 genes)
[Circular map with G+C/G-C skew and G+C% tracks]
Genes related to mobile elements within the islands
[Figure: islands I to XIV, several of them next to tRNA genes (tRNA-A, tRNA-L, tRNA-R, tRNA-K, tRNA-T). Island sizes shown: 11, 16, 18, 19, 23, 25, 33, 45, 46, 46, 46, 59 and 86 kb. Labelled genes: int, rep, mob, tra, tnp, parA, ssb, pol, rel, hel, and plasmid- or phage-related genes.]
NEM316 - SAG2603 genome comparison
• No chromosomal rearrangement between the two strains
• No integrated plasmid in SAG2603 but three prophages
• 1799 orthologs between the two genomes (633 are 100% identical)
=> 241 NEM316 genes are missing in SAG2603 (37 in the backbone)
=> 258 SAG2603 genes are missing in NEM316 (42 in the backbone)
Although highly variable, 10 mobile islands are conserved.
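NEM316 / SAG2603 - conserved backbone
[Figure: circular map (origin at 0) comparing the backbone of NEM316 (gbs genes) and SAG2603 (sag genes). Labelled loci: gbs0046-47, gbs0086-87, gbs0162-163, gbs0493, gbs1240-1242, gbs1400-1401, gbs1740-1749, gbs1823, sag0046, sag0086-88, sag1330-1331, sag1697-1703 and sag1780. Annotations: His triad protein, ABC transporter (two loci), protein R5 and cpsJDNMH.]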
Comparative analysis of island XII
[Figure: island XII in NEM316 and in SAG2603 A/B. Labelled features: lmb, scpB, lactose utilization, mercuric and cadmium resistance. Colour scale for sequence identity: 98-100%, 95-98%, 90-95%, 80-90%, 70-80%, 60-70% and <60%.]
MLST results for S. agalactiae
Loci used:
adhP: alcohol dehydrogenase
pheS: phenylalanyl tRNA synthetase
atr: amino acid transporter
glnA: glutamine synthetase
sdhA: serine dehydratase
glcK: glucokinase
tkt: transketolase
[Figure: MLST tree showing the positions of SAG2603, NEM316 and the "hypervirulent" lineage]
(Jones et al., 2003 Int. J. Clin. Microbiol.)
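In MLST, each distinct sequence at a locus receives an allele number, and each distinct combination of allele numbers over the seven loci defines a sequence type (ST). The sketch below shows that bookkeeping for the seven loci listed above. The numbering is assigned locally in order of first appearance, so the resulting numbers are hypothetical and do not reproduce the curated database numbering behind names such as ST17; the toy sequences are also invented.

```python
LOCI = ("adhP", "pheS", "atr", "glnA", "sdhA", "glcK", "tkt")


def assign_sequence_types(strains: dict) -> dict:
    """Map strain name -> (allele profile, ST), numbered by first appearance.

    `strains` maps a strain name to a dict giving the sequence at each locus.
    """
    alleles = {locus: {} for locus in LOCI}   # per-locus: sequence -> allele number
    st_of_profile = {}                        # allele profile -> ST number
    result = {}
    for name, sequences in strains.items():
        profile = tuple(
            alleles[locus].setdefault(sequences[locus], len(alleles[locus]) + 1)
            for locus in LOCI
        )
        st = st_of_profile.setdefault(profile, len(st_of_profile) + 1)
        result[name] = (profile, st)
    return result


if __name__ == "__main__":
    base = {locus: "AAAA" for locus in LOCI}   # invented toy sequences
    variant = dict(base, tkt="AAAT")           # differs at one locus
    toy = {"strain_1": base, "strain_2": dict(base), "strain_3": variant}
    for name, (profile, st) in assign_sequence_types(toy).items():
        print(name, profile, f"ST{st}")
```

Two strains share an ST only if they are identical at all seven loci, which is why a homogeneous lineage appears as a single ST or a tight group of STs.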
DNA array hybridization for genome characterization
68 strains analyzed by MLST and hybridization
• 10 invasive ST-17 strains (MLST study)
• BM110, hypervirulent clone defined by
MLEE
• 18 invasive strains (Hôpital Necker)
• 13 carriage strains (Hôpital Necker)
• 14 strains from bovine mastitis
• 12 strains of animal origin (horse, dog, cat,
rabbit, guinea pig, fish)
Genome diversity is essentially located within genomic islands
[Bar chart: values from 0 to 300 per strain, split between islands and backbone. Strain labels: lapin_6144_98, gui_pig_622, chien_928662, chat_693, chat_3448_97, poisson_2_22, bov_44, bov_501_19, bov_549.13, bov_547.25, bov_543.05, bov_527.25, bov_411.07, port_60_36bis, port_65.8bis, port_37.39, port_41bis, port_38bis, inv_1573, inv_318, inv_1568, inv_1560, inv_1002, inv_1572, inv_1000, inv_wc3, inv_mk2, inv_j95, inv_j81, inv_h11, inv_b9]
Hierarchical clustering of 69 strains and comparison with MLST data
[Dendrogram with clusters labelled ST19, ST1, ST10/6/9, ST23, ST17, ST103 and ST23]
Two loci heterogeneously distributed among isolates
[Figure: loci I and II. Labelled genes: rofA, hemagglutinin, glycosyl transferase, secY, secA, rogB, fibronectin binding protein, two LPXTG protein genes, srtB and srtC.]
rofA and rogB are mutated in SAG2603
I ------------------++++++++++++++++++++++++++++++++--++++++++++
II ------------------++++++++++++++++++++++++++++--------+-+-++++
Conclusion
• Strains of different origins do not cluster, except invasive ST17 strains.
• ST17 strains constitute a highly homogeneous group
• Diversity resides mostly within islands
• Antigenic diversity is highlighted by genome analysis and is found both within and outside islands
• DNA arrays are a powerful method for molecular epidemiology