Pathogenomics: An interdisciplinary approach for the study of infectious disease
Fiona S. L. Brinkman 1,2, Yossef Av-Gay 3, David L. Baillie 4, Stefanie Butland 5, Rachel C. Fernandez 2, B. Brett Finlay 2,6, Robert E.W. Hancock 2,
Christy Haywood-Farmer 7, Steven J. Jones 8, Audrey de Koning 7, Don G. Moerman 7,9, Sarah P. Otto 7, B. Francis Ouellette 5, Iain E. P. Taylor 10, and
Ann M. Rose 1.
1 Dept of Medical Genetics, 2 Dept of Microbiology and Immunology, 3 Dept of Medicine, 6 Biotechnology Laboratory, 7 Dept of Zoology, 9 C. elegans Reverse
Genetics Facility, 10 Dept of Botany, University of British Columbia, 4 Dept of Biological Sciences, Simon Fraser University, 5 Centre for Molecular Medicine and
Therapeutics and 8 BC Genome Sequence Centre, Centre for Integrated Genomics, Vancouver, British Columbia, Canada.
Goal
This project brings together a unique combination of UBC researchers
and affiliates who, through exchange of new data and ideas, and
capitalizing on new genomic and bioinformatic tools, will develop an
automated approach to identify previously unrecognized mechanisms of
pathogenicity.
Rationale and Power of the Approach
The processes of microbial pathogenicity at the molecular level are still
minimally understood. Genomics and bioinformatics provide powerful new
tools for the study of pathogenicity, hence the initiation at UBC by Dr.
Julian Davies of a new field, Pathogenomics. The specific approach we
are proposing is anchored in the fact that, as part of the infectivity
process, many pathogens make use of host cellular processes. We
hypothesize that some pathogen genes involved in such processes will be
more similar to host genes than would be expected (based on phylogeny).
We will identify such genes by applying specific bioinformatic and
evolutionary analysis tools to sequenced genome datasets, and further
examine such genes in the laboratory (both the pathogen gene and the
homologous model host gene). We hypothesize that this approach will
reveal new mechanisms of pathogen-host interaction, leading to a deeper
understanding of the fundamentals of pathogenicity.
Power of the Approach
• Expression-independent method for identifying possible pathogenicity
factors.
• Interdisciplinary team fosters unique ideas and collaborations.
• Automated approach can be continually updated.
• Enables better understanding of both the pathogen gene and
homologous host/model host gene.
• Provides insight into horizontal gene transfer events and the evolution of
pathogen-host interactions.
• Public database of findings, to be developed, will enable other
researchers to capitalize on the findings and promote further
collaboration.
An Interdisciplinary Team
[Figure labels: Genomics and Bioinformatics; Pathogen Functions; Evolutionary Theory]
Project Summary
We are utilizing bioinformatics tools to identify pathogen genes which interact
with their host proteins and pathways. A unique combination of informatics,
evolutionary biology, microbiology and eukaryotic genetics is being exploited to
identify pathogen genes which are more similar to host genes than expected,
and likely to interact with, or mimic, their host’s gene functions. We are building
a database of the sequences of these proteins, based on the increasing number
of pathogen genomes which have been, or are currently being, sequenced.
Candidate functions identified by our informatics approach will be tested in the
laboratory (see flow chart) to investigate their role in pathogen infection and host
interaction. All information will eventually be made available in a public
Pathogenomics Database.
Pathogens being Studied (selected examples)
Iteratively refine the
initial screening
methods and
candidate ranking.
Evolutionary significance.
Manually inspect candidates. Are these valid cases of horizontal
transfer, convergence or co-evolution, or are they similar by
chance? If horizontal transfer may be involved, when did this transfer
occur?
Prioritize for further biological study.
Has the candidate pathogen gene or a eukaryotic homolog been
previously studied biologically? Can a putative function be inferred
from its sequence? Is there a C. elegans homolog? Is the pathogen
currently studied by the UBC functional pathogenomics bacterial group?
Has the genetic pathway of the host protein been dissected?
If a C. elegans homolog exists: target the gene for knockout by the
knockout facility.
Analysis of the knockout through expression chip, and susceptibility to
infection by the pathogen.
Host Functions
Target for GFP fusion analysis to see when and where the gene is
expressed in C. elegans.
If the pathogen is being studied by the UBC functional pathogenomics
bacterial group: examine subcellular localization and obtain a
knockout of the gene.
If the pathogen is not a focus of the UBC group: contact other groups
regarding results and instigate collaboration for further study.
Analysis of the knockout and gene through expression chip analysis and
infectivity in an animal/tissue culture model, and in the C. elegans
model if appropriate.
Pathogen                    Primary Disease
Bordetella pertussis        Whooping cough
Borrelia burgdorferi        Lyme disease
Campylobacter jejuni        Gastroenteritis
Chlamydia pneumoniae        Chlamydial pneumonia
Chlamydia trachomatis       Chlamydia
Escherichia coli            Diarrheal and urinary tract infections
Haemophilus influenzae      Upper respiratory infections and meningitis
Helicobacter pylori         Peptic ulcers and gastritis
Leishmania major            Leishmaniasis (kala azar)
Listeria monocytogenes      Listeriosis
Mycoplasma pneumoniae       Mycoplasmal pneumonia
Mycobacterium tuberculosis  Tuberculosis
Neisseria gonorrhoeae       Gonorrhea
Neisseria meningitidis      Meningitis
Plasmodium falciparum       Malaria
Pseudomonas aeruginosa      Variety of mucosal infections (opportunistic)
Rickettsia prowazekii       Epidemic typhus
Salmonella typhi            Typhoid fever
Streptococcus pyogenes      Strep throat, scarlet fever, necrotizing fasciitis
Treponema pallidum          Syphilis
Ureaplasma urealyticum      Urethritis
Vibrio cholerae             Cholera
Yersinia pestis             Plague
Selected examples of pathogen proteins with higher than expected similarity
to host/eukaryotic proteins:
Yop proteins of Yersinia species
The Yop virulon is an integrated system allowing extracellular Yersinia
bacteria to disarm host cells involved in the immune response, to
disrupt their communications (or even to induce their apoptosis) by
the injection of bacterial effector proteins (for review, see Cornelis,
1998). YopH, a protein-tyrosine phosphatase, is a member of this
system and shares higher than expected similarity with eukaryotic
protein-tyrosine phosphatases.
Examples
Initial screen for candidate genes.
Search pathogen proteins against sequence databases. Are the
results inconsistent with the phylogeny (i.e. does the protein match
the host, or its relatives, more strongly than you would expect)? Use
low-complexity filtering such as SEG.
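The screening question above can be sketched in a few lines of Python. This is only an illustrative sketch, not the project's actual pipeline: it assumes the database search results have already been reduced to (protein, taxonomic group of the hit, bit score) records, that hits to the pathogen's own genome have been removed, and that low-complexity filtering was applied during the search. The function name, the group labels and the example scores are all hypothetical.

```python
from collections import defaultdict


def flag_candidates(hits, own_group, host_group, min_delta=0.0):
    """Flag proteins whose best host-group hit outscores the best own-group hit.

    hits: iterable of (protein_id, hit_group, bit_score) tuples (assumed format).
    Returns {protein_id: score difference} for the flagged proteins.
    """
    # Keep the best score seen for each protein within each taxonomic group.
    best = defaultdict(dict)
    for protein_id, group, score in hits:
        if score > best[protein_id].get(group, float("-inf")):
            best[protein_id][group] = score

    flagged = {}
    for protein_id, by_group in best.items():
        host_score = by_group.get(host_group)
        if host_score is None:
            continue  # no host-group hit at all, nothing to flag
        own_score = by_group.get(own_group, 0.0)  # no own-group hit counts as 0
        delta = host_score - own_score
        if delta > min_delta:
            flagged[protein_id] = delta
    return flagged


# Hypothetical scores, for illustration only.
example_hits = [
    ("yopH", "Eukarya", 210.0),
    ("yopH", "Bacteria", 95.0),
    ("rpoB", "Bacteria", 1800.0),
    ("rpoB", "Eukarya", 400.0),
    ("orfX", "Eukarya", 60.0),
]
flagged = flag_candidates(example_hits, own_group="Bacteria", host_group="Eukarya")
```

A raw score difference is a crude signal: it ignores protein length and how well each group is sampled in the database, which is one reason a screen like this would be followed by ranking and manual inspection.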
Rank candidates.
Rank pathogen proteins in terms of how much more they resemble
proteins of their host's phylum than those of their own (e.g. by the
difference in BLAST score, through phylogenetic tree building, and by
identifying unusual codon usage). Is the gene or the gene's pathway a
usual component of the pathogen's phylum? Also rank based on other
factors, such as whether the candidate gene encodes a probable
surface-exposed or secreted protein.
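One of the ranking signals named above, unusual codon usage, could be approximated as follows. This is a sketch under stated assumptions rather than the method used in the project: it compares the codon frequencies of one gene with the pooled codon frequencies of a set of reference genes from the same genome, using half the sum of absolute frequency differences (0 = identical usage, 1 = no codons in common). The function names and the choice of distance are illustrative assumptions; the measure is also sensitive to amino-acid composition and to short genes.

```python
from collections import Counter


def codon_counts(cds):
    """Count in-frame codons of a coding sequence (trailing partial codon ignored)."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    return Counter(cds[i:i + 3] for i in range(0, usable, 3))


def frequencies(counts):
    """Convert codon counts to relative frequencies."""
    total = sum(counts.values())
    return {codon: n / total for codon, n in counts.items()} if total else {}


def codon_usage_distance(gene_cds, genome_cds_list):
    """Distance in [0, 1] between a gene's codon usage and the genome background."""
    gene = frequencies(codon_counts(gene_cds))
    background = Counter()
    for cds in genome_cds_list:
        background.update(codon_counts(cds))  # pool counts, keeping each gene in frame
    ref = frequencies(background)
    codons = set(gene) | set(ref)
    return 0.5 * sum(abs(gene.get(c, 0.0) - ref.get(c, 0.0)) for c in codons)
```

Genes with the largest distances would be ranked higher as possible recent acquisitions. An anciently transferred gene may already have adapted to its new genome's codon usage, so a low distance does not rule out horizontal transfer.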
Pathogen
Isoleucyl-tRNA synthetase of Staphylococcus aureus and others
Resistance to mupirocin, a topical antimicrobial agent used against S.
aureus, appears to be mediated by amino-acid substitutions in
isoleucyl-tRNA synthetase (ITS), which mupirocin normally inactivates.
The source of this mutant ITS is not recent random mutation of the S.
aureus ITS, but rather a plasmid containing an ITS gene that is more
similar to eukaryotic ITS than organism phylogeny would predict
(Brown et al., 1998). Other bacteria have been identified that contain
this mutant ITS (which is similar to eukaryotic ITS), and all of these
bacteria share resistance to mupirocin. Based on phylogenetic
analysis, Brown et al. propose that a eukaryotic ITS gene was
transferred to an unknown bacterium shortly after the divergence of
Eukarya and Archaea, and that this gene was then recently transferred
via a plasmid to S. aureus. Since Pseudomonas fluorescens naturally
produces mupirocin (as pseudomonic acid), resistance to this
compound may have conferred a competitive advantage on specific
bacteria.
Continually exchange pathogen gene information with collaborators and with
eukaryotic geneticists studying the homologous gene in C. elegans.
Continually exchange C. elegans gene information with microbiologists
studying the homologous pathogen gene.
Database development. Create and maintain a database of pathogen-host
interactions. Establish this as a platform for accelerating the study of pathogenicity
and the identification of therapeutic drug targets.
Our team comprises a unique group of Bioinformaticians,
Evolutionary Theorists/Mathematical Modelers,
Microbiologists, Geneticists and an Ethicist
(not all are shown above).
[Figure: small subunit rRNA tree, scale bar 0.1.
BACTERIA: H. pylori, B. burgdorferi, M. genitalium, M. pneumoniae, R. prowazekii,
P. aeruginosa, E. coli, C. trachomatis, C. pneumoniae, H. influenzae, Aquifex,
B. subtilis, Synechocystis, M. tuberculosis, T. maritima, T. pallidum.
EUKARYA: H. sapiens, M. musculus, S. cerevisiae, C. elegans, P. falciparum, Leishmania.
ARCHAEA: A. fulgidus, P. furiosus, A. pernix, M. thermoautotrophicum, M. jannaschii.]
Above: Small subunit rRNA tree for organisms whose genomes are completed (plus
selected reference eukaryotes). Neighbor-joining tree constructed using Ribosomal
Database Project (www.cme.msu.edu/RDP/html) alignments.
Enoyl-ACP reductase of Chlamydia trachomatis
C. trachomatis contains a number of “eukaryotic-like” genes involved
in functions such as fatty acid biosynthesis. Most of these group
phylogenetically with plant proteins (see tree below). Stephens et al.
(1998) have proposed that the evolution of chlamydiae as intracellular
parasites started with an opportunistic interaction with amoebal hosts,
and the protochlamydiae became amoebal parasites or symbionts for
a period long enough to acquire the “plant-like” genes, whose origin
may actually be amoebal.
[Figure: enoyl-ACP reductase tree, scale bar 0.1. Taxa shown: Aquifex aeolicus,
Escherichia coli, Haemophilus influenzae, Anabaena, Synechocystis,
Chlamydia trachomatis, Petunia x hybrida, Nicotiana tabacum, Brassica napus,
Arabidopsis thaliana, Oryza sativa.]
Above: Phylogeny of chlamydial enoyl-acyl carrier protein reductase
(a protein involved in lipid metabolism) using the neighbor-joining
distance method (Felsenstein, 1996). Numbers at forks indicate the
number of times out of 100 that the given node was observed.
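The "times out of 100" figure in the caption is a bootstrap-style support value, and the counting step behind it can be illustrated with a short sketch. The sketch assumes the replicate trees have already been built (the resampling and tree-building steps are not shown) and that each tree is represented simply as a set of clades, each clade being a frozenset of taxon names. This representation, the function name and the toy trees are assumptions for illustration, not the software used for the poster.

```python
def bootstrap_support(clade, replicate_trees):
    """Return how often (out of 100, rounded) a clade appears among replicate trees.

    clade: iterable of taxon names that should group together.
    replicate_trees: list of trees, each a set of frozensets of taxon names.
    """
    if not replicate_trees:
        raise ValueError("at least one replicate tree is required")
    target = frozenset(clade)
    observed = sum(1 for tree in replicate_trees if target in tree)
    return round(100 * observed / len(replicate_trees))


# Toy replicate trees with hypothetical groupings, for illustration only.
plants = frozenset({"Arabidopsis thaliana", "Brassica napus"})
with_chlamydia = frozenset({"Chlamydia trachomatis", "Arabidopsis thaliana", "Brassica napus"})
with_e_coli = frozenset({"Chlamydia trachomatis", "Escherichia coli"})
toy_trees = [
    {plants, with_chlamydia},
    {plants, with_chlamydia},
    {plants, with_chlamydia},
    {plants, with_e_coli},
]
support = bootstrap_support(with_chlamydia, toy_trees)
```

With 100 replicates the returned value is simply the count of trees containing the node, which matches the convention described in the caption.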
Acknowledgements
This project is funded by the Peter
Wall Institute for Advanced Studies,
which supports “fundamental,
interdisciplinary research and
creative activities, which have the
potential to result in significant
advances to knowledge.”
References
1. Felsenstein, J. 1996. Methods Enzymol. 266:418-427.
2. Stephens, R.S., S. Kalman, C. Lammel, J. Fan, R. Marathe, L. Aravind,
W. Mitchell, L. Olinger, R.L. Tatusov, Q. Zhao, E.V. Koonin, R.W. Davis.
1998. Science 282:754-759.
3. Brown, J.R., J. Zhang, J.E. Hodgson. 1998. Curr. Biol. 8:R365-R367.
4. Cornelis, G.R., A. Boland, A.P. Boyd, C. Geuijen, M. Iriarte, C. Neyt,
M.P. Sory, I. Stainier. 1998. Microbiol. Mol. Biol. Rev. 62:1315-1352.