Microbial genome analysis and
comparisons
Dave Baumler
Genome Center of Wisconsin, UW-Madison
[email protected]
Today’s session overview:
Introduction
Module #1) Microbial genomes at NCBI
(http://www.ncbi.nlm.nih.gov/Class/minicourses/)
-become familiar with the tools and options using the NCBI tutorial “Microbial genomes
Quickstart”, and learn how to download genome (.gbk) files
Module #2) Conduct genome alignments of phage genomes
-using Mauve to conduct whole genome alignments, familiarize yourself
with Mauve
Module #3) Compare genomes from 3 outbreaks of E. coli O157:H7
-identify genomic islands using Mauve & conservation of virulence factors
Module #4) Compare genomes from 5 strains of Yersinia pestis
-identify genomic islands, conservation of virulence factors, analyze
mutations with phenotypic consequences due to insertion and/or deletion
events and single nucleotide polymorphisms (SNP’s), and paleomicrobiology
Conclusion
Choose one of the two Problems:
#1 Escherichia coli O157:H7 strain
Sakai
#2 Rickettsia prowazekii strain
Madrid E
Lists of all complete and in progress microbial genomes
Download full genome sequence (.gbk) files
Downloading Microbial Genome Files
#1) Look for the largest .gbk file
which is the main genome,
smaller .gbk files are plasmids
#2) Double click on the file
#3) From the File pull-down, choose “Save page as” and give the file a name with a .gbk at the end
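The rule of thumb in step #1 (the largest .gbk file is the main genome, smaller ones are plasmids) can be sketched in a few lines of Python. This is only an illustration: the function name and the folder layout are assumptions, not part of the NCBI site or of any tool used in this session.

```python
from pathlib import Path


def main_genome_file(folder):
    """Return the largest .gbk file in folder (assumed to be the main genome)."""
    gbk_files = list(Path(folder).glob("*.gbk"))
    if not gbk_files:
        raise FileNotFoundError(f"no .gbk files found in {folder}")
    # File size on disk is used as a stand-in for sequence length.
    return max(gbk_files, key=lambda path: path.stat().st_size)
```

Calling main_genome_file on a folder of downloaded files returns the path of the largest one; the remaining .gbk files would then be the plasmids.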
Links to other E. coli databases and/or resources
Brief information about the organism
Overview with links to assorted tools
Search on page for words using the Edit>>Find
in this page pulldown
Entrez
protein view
Clusters of Orthologous Groups of proteins (COGs) were delineated by comparing protein
sequences encoded in complete genomes, representing major phylogenetic lineages. Each COG
consists of individual proteins or groups of paralogs from at least 3 lineages and thus
corresponds to an ancient conserved domain.
COG link
Geneplot
Entrez Genome offers a new pairwise
comparison tool called GenePlot to visualize
similarities among bacterial genomes.
Support for fungal genomic comparisons is
also planned. To construct a GenePlot, genes
are numbered sequentially along the genomic
sequences of two organisms and the two
corresponding sets of predicted proteins are
compared using BLAST. For every case in
which a pair of proteins, one from each
genome, are mutual best matches, a point is
plotted using the indices of the equivalent
gene in the two genomes as the X and Y
coordinates. Use the GenePlot link from an
organism’s genome record to see a GenePlot
against the organism with which it shares the
highest number of reciprocal best hits.
Comparisons between other organisms can be
made using pull-down menus.
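The plotting rule described above can be sketched as follows: number the genes along each genome, then turn every mutual best match into an (X, Y) point. This is a minimal sketch, not NCBI's implementation; the gene names and the list of mutual best matches are hypothetical inputs.

```python
def geneplot_points(genes_a, genes_b, mutual_best_pairs):
    """Convert mutual best matches into (x, y) gene-index coordinates.

    genes_a and genes_b list gene names in genomic order; genes are
    numbered sequentially starting at 1, as in the description above.
    """
    index_a = {gene: i for i, gene in enumerate(genes_a, start=1)}
    index_b = {gene: i for i, gene in enumerate(genes_b, start=1)}
    return sorted((index_a[a], index_b[b]) for a, b in mutual_best_pairs)


# Hypothetical gene orders and mutual best matches
genes_a = ["a1", "a2", "a3", "a4"]
genes_b = ["b1", "b2", "b3", "b4"]
pairs = [("a1", "b1"), ("a2", "b2"), ("a4", "b3")]
print(geneplot_points(genes_a, genes_b, pairs))  # [(1, 1), (2, 2), (4, 3)]
```

Points that fall on a diagonal indicate conserved gene order; off-diagonal points suggest rearrangements.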
TaxMap
Comparisons of COG groups between various organisms
The ERIC database houses all of the available
genomes of the members of family Enterobacteriaceae
Figure: genera of the family Enterobacteriaceae. Boxes represent organisms with at least one genome sequenced.
Legend: Human Pathogens; Insect Pathogens/Endosymbionts; Environmental/Animals/Industrial; Phytopathogens/Plant-associated
Genera shown: Alterococcus, Arsenophonus, Brenneria, Buchnera, Budvicia, Buttiauxella, Calymmatobacterium, Cedecea, Citrobacter, Dickeya, Edwardsiella, Enterobacter, Erwinia, Escherichia, Ewingella, Hafnia, Klebsiella, Kluyvera, Leclercia, Leminorella, Moellerella, Morganella, Obesumbacterium, Pantoea, Pectobacterium, Phlomobacter, Plesiomonas, Pragia, Proteus, Providencia, Rahnella, Saccharobacter, Salmonella, Samsonia, Serratia, Shigella, Sodalis, Tatumella, Trabulsiella, Wigglesworthia, Xenorhabdus, Yersinia, Yokenella
Orthologs
If at least two of these criteria are met for the pair of genes in question they
are typically assigned as orthologs.
•Percentage identity and alignment percentage are in the typical range
•Local genome context, the conserved gene is part of an operon with other
genes that are already considered orthologs.
•Larger scale conservation of genomic context, the conserved gene is in
the same general genomic context as other orthologs.
•Functional conservation, the conserved gene is predicted or known to
perform the same function as the potential ortholog in another genome.
Reciprocal Best Blast hits
BlastP X against Y: >60%
BlastP Y against X: >60%
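The reciprocal best hit idea can be sketched in Python. This sketch assumes the >60% on the slide refers to percent identity, and it takes BlastP results as simple (query, subject, percent identity) tuples; both the input format and the protein names are hypothetical, not the output of any particular BLAST run.

```python
def best_hits(hits, min_identity=60.0):
    """Map each query to its best subject among hits above min_identity."""
    best = {}
    for query, subject, identity in hits:
        if identity <= min_identity:
            continue  # below the assumed >60% cutoff
        if query not in best or identity > best[query][1]:
            best[query] = (subject, identity)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(hits_x_vs_y, hits_y_vs_x, min_identity=60.0):
    """Return (x, y) pairs that are each other's best hit in both searches."""
    forward = best_hits(hits_x_vs_y, min_identity)
    reverse = best_hits(hits_y_vs_x, min_identity)
    return sorted((x, y) for x, y in forward.items() if reverse.get(y) == x)


# Hypothetical BlastP results: (query, subject, percent identity)
x_vs_y = [("x1", "y1", 95.0), ("x1", "y2", 70.0), ("x2", "y2", 55.0), ("x3", "y3", 88.0)]
y_vs_x = [("y1", "x1", 94.0), ("y2", "x1", 71.0), ("y3", "x3", 87.0)]
print(reciprocal_best_hits(x_vs_y, y_vs_x))  # [('x1', 'y1'), ('x3', 'y3')]
```

Here y2 finds x1 as its best hit, but x1 prefers y1, so the x1/y2 pair is not reciprocal and is left out.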
Enterobacteria cont.
Generated from 180 orthologs
ERIC-Enteropathogen Resource Integration Center
Genomes
Tools & Annotations
Genome Views and
Comparisons
Part of a genome sequence
TCAGCGAAGATGAGATAGTTTTTAAAGGTGGGATTTCCCCACCTTTAAAAAGCGAGAAGTCCCGGTTTTAA
AGAGGAGTAAAATCCTCTTTTTCTAGCCCACTCAGGTGGTTTTTTTGGTTTTCGCTCCTTGCCGCATCTTC
TGTGCCTTTGATGGCGGCTGGTTGGGGTGAAAGGCTGCATATTCCAGAATTTCAGACAGTAGATTGTTTTT
GAAATCTTCCGTTTTATCGTTGACGAACTTAACCATCCTGTTGAAATCATCTTCCTTTGATACACCTTCAG
GAAATGCCTTAGGAACTGATGTTTGGCTATCCAAGGCATCTTGCAATATCTGCACGATCTCCGAATTCATT
GATCGCCCATTGGCCTTTGCTCTGGCGGCAACTGCGTCACGCATACCGTCAGGCATCCTAACTGTAAATCT
CTCAATGAAAGCTGGATCTTCTTTTTCAGTCATCATCTTAAACCATAAAAATTTATACAAAACACACTAGC
ATCATATTGACATTACCCACAATGACATCATAATGGTGTCAGGCATCAAAATGATGTCATCATGACAAGGG
GAAAGTAAATGCAAGATGTTCTCTATACAGGTCGTAAGAACGACAGCTTTCAGCTTCGTCTGCCTGAGCGA
ATGAAAGAAGAGATCCGTCGCATGGCAGAGATGGACGGCATTTCGATTAATTCTGCAATCGTGCAGCGCCT
TGCTAAAAGCTTGCGTGAGGAAAGAGTTAATGGGCAGTAAAAACAGCGAAGCCCGGAAGTGTGGGGACACT
AACCGGGCTTCTAATGTCAGTTACCTAGCGGGAAACCAACAATGACCAGTATAGCAATCTTTGAAGCAGTA
AACACTATCTCTCTTCCATTCCACGGACAGAAGATCATAACTGCGATGGTGGCGGGTGTGGCGTATGTGGC
AATGAAGCCCATCGTGGAAAACATCGGTTTAGACTGGAAGAGCCAGTATGCCAAGCTCGTTAGTCAGCGTG
AAAAGTTCGGGTGTGGTGATATCACCATACCTACCAAAGGTGGTGTTCAGCAGATGCTTTGCATCCCTTTG
AAGAAACTGAATGGATGGCTCTTCAGCATTAACCCAGCAAAAGTACGTGATGCAGTTCGTGAAGGTTTAAT
TCGCTATCAAGAAGAGTGTTTTACAGCTTTGCACGATTACTGGAGCAAAGGTGTTGCAACGAATCCCCGGA
CACCGAAGAAACAGGAAGACAAAAAGTCACGCTATCACGTTCGCGTTATTGTCTATGACAACCTGTTTGGT
GGATGCGTTGAATTTCAGGGGCGTGCGGATACGTTTCGGGGGATTGCATCGGGTGTAGCAACCGATATGGG
ATTTAAGCCAACAGGATTTATCGAGCAGCCTTACGCTGTTGAAAAAATGAGGAAGGTCTACTGATTGGCGT
ATTGGAAGGCGCAAAAAGAAAAGCCAGCAGATGGGCTGCTGGCATTCATTGGGTATATGAACTTTCGGAGA
ACATATGAAGTCAATTATCAAGCATTTTGAGTTTAAGTCAAGTGAAGGGCATGTAGTGAGCCTTGAGGCTG
CAAGCTTTAAAGGCAAGCCAGTTTTTTTAGCAATTGATTTGGCTAAGGCTCTCGGGTACTCAAATCCGTCA
What exactly are gene annotations?
Genome annotation is the process of attaching biological
information to sequences. It consists of two main steps:
1. identifying elements on the genome, a process called
“structural annotation” or “gene finding”
2. attaching information to these elements, such as their
molecular and biological functions.
Annotation step #1: Structural Annotation
Example of a gene - the start codon is
green and the stop codon is red
Structural annotation consists of the identification of genomic elements (e.g. genes).
•Open Reading Frames (ORFs) also called coding sequences (CDSs) must have a start
codon and a stop codon
•location of regulatory motifs (such as promoters and ribosome binding sites)
•This step is typically automated using gene prediction software (Automation only
finds ~50-90% of the genes)
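To make the start codon/stop codon requirement concrete, here is a minimal ORF scan in Python. It is a deliberately naive sketch, not how gene prediction software works: it assumes ATG is the only start codon, looks at the forward strand only, and uses an arbitrary minimum length.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(sequence, min_codons=30):
    """Return (start, end) 0-based, end-exclusive coordinates of ORFs.

    An ORF here runs from the first ATG in a reading frame to the next
    in-frame stop codon, and must span at least min_codons codons
    (start and stop included).
    """
    sequence = sequence.upper()
    orfs = []
    for frame in range(3):
        start = None
        for pos in range(frame, len(sequence) - 2, 3):
            codon = sequence[pos:pos + 3]
            if start is None and codon == "ATG":
                start = pos
            elif start is not None and codon in STOP_CODONS:
                if (pos + 3 - start) // 3 >= min_codons:
                    orfs.append((start, pos + 3))
                start = None
    return sorted(orfs)


# Toy sequence: ATG, three GCT codons, then the stop codon TAA
print(find_orfs("CCATGGCTGCTGCTTAAGG", min_codons=5))  # [(2, 17)]
```

Real gene finders also score codon usage and ribosome binding sites, which is part of why a simple scan like this both misses genes and reports false ones.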
Annotation step #1: Structural Annotation (cont.)
using Genemark.hmm a statistical model
Annotation step #2
Functional annotation consists of attaching biological
information to genomic elements:
•biochemical function
•regulation and interactions involved
•expression
•cellular location
Three examples of annotations for one gene:
•Name/synonym: a short “word” used to refer to the gene
(Ex. ureC)
•Product: a descriptive protein name (Ex. Urease gamma
subunit)
•Function : Describes what the protein does (Ex. Catalyzes
the hydrolysis of urea to form ammonia and carbon dioxide)
Module #2 Conduct genome alignments of phage genomes
-this module teaches how to use Mauve, using enterobacteria phage as examples
-phage genomes can be aligned using Mauve in a matter of minutes
-applicable as a teaching tool to decipher the mosaicism of phage genomes
-comparative studies of 30 mycobacteriophage genomes revealed new insights into their
diverse architecture and into gene exchange
(Hatfull et al. PLoS Genetics 2006)
-using Mauve, you could align EVERY mycobacteriophage genome available
-How diverse are enterobacteriophage?
(the following series of slides are Mauve alignments of phage isolated from E. coli,
Salmonella spp., Yersinia spp., and Shigella spp.; all alignments are also provided for
further inquiry)
-we will run alignments with 3 phage genomes from E. coli O157:H7
Mauve: Multiple Genome Aligner
• Able to identify and align collinear
regions of multiple genomes even in the
presence of rearrangements
• Find and extend seed matches
• Group into locally collinear blocks
• Align intervening regions
(Darling et al. Genome Res. 2004
Jul;14(7):1394-403.)
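The first step, finding seed matches, can be illustrated with a simplified Python sketch. This is not Mauve's actual algorithm or code: it assumes a seed is a fixed-length k-mer that occurs exactly once in each of two sequences, and it ignores the reverse strand. The function names and the value of k are illustrative choices.

```python
from collections import Counter


def unique_kmers(sequence, k):
    """Map each k-mer that occurs exactly once in sequence to its position."""
    counts = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
    return {kmer: sequence.find(kmer) for kmer, n in counts.items() if n == 1}


def seed_matches(seq_a, seq_b, k=11):
    """Return sorted (pos_a, pos_b) pairs for k-mers unique in both sequences."""
    unique_a = unique_kmers(seq_a, k)
    unique_b = unique_kmers(seq_b, k)
    shared = unique_a.keys() & unique_b.keys()
    return sorted((unique_a[kmer], unique_b[kmer]) for kmer in shared)


print(seed_matches("ACGTACGGTT", "TTACGGTTAA", k=5))  # [(3, 1), (4, 2), (5, 3)]
```

In the toy output the three seeds lie on one diagonal (the offset between positions is constant), which is the kind of run that would be extended and grouped into a locally collinear block.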
Module #2 Understanding phage, the viruses that infect
microorganisms, via genome alignments
56 enterobacterial phage were recently aligned; phage genomes are
ideal training tools for teaching how to set up Mauve alignments
Why Phage? Genomics timeline
(timeline figure, years marked: 1977, 1982, 1995, 1996, 1997, 1998, 2000, 2001, 2008)
Step #1 copy the folder called 3 phage genomes for alignment
exercise, and paste it on the hard drive of your computer (C: drive)
Step #2 from the start menu, in programs select Mauve 2.1.1
Step #3 under the File pull down select Align with progressive Mauve
This new window will appear
#4) Click here to choose where to send the output file, find the folder (from
Step #1), and double click on the folder
#5) Type in a file name, and click on Save
Next add the sequences to align
Click on Add sequence
Select the first phage genome
and click on Open, then
continue with the 2nd and 3rd
phage genomes. Then click on
Align to start the genome
alignment
When viewing the LCB’s, Mauve displays regions that are highly
conserved/identical in full color.
Areas that are unique/variable to one genome appear in white
and represent unique islands
Your tool bar is at the top left; the tools you will use
are in the View pull-down and also the buttons:
Returns the viewer back to home
Move left or right: you will find this useful to center a region of interest in the middle of the screen prior to zooming in
Zoom in/out: you can also hold down the Ctrl button and use the arrows on the keyboard
Search for features
Other useful commands in Mauve
Function                              Key
Zoom in                               Ctrl+Up
Zoom out                              Ctrl+Down
Scroll left                           Ctrl+Left
Scroll right                          Ctrl+Right
Export the current view as an image   Ctrl+E
Module #3) Dissecting virulence of E. coli O157:H7
using genome alignments
The first E. coli genome sequenced was the nonpathogenic E. coli K-12 genome MG1655
-determination of the complete E. coli
sequence required almost 6 years
-E. coli is the preferred model in
biochemical genetics, molecular
biology, and biotechnology and its
genomic characterization will
undoubtedly further research toward a
more complete understanding of this
important experimental, medical, and
industrial organism
(Blattner et al. Science 1997)
The first pathogenic E. coli genome sequence was
enterohaemorrhagic (EHEC) Escherichia coli O157:H7
strain 933 EDL
-In 1982 Escherichia coli
O157:H7 was recognized as a
human pathogen
-Also known as EDL933
from the Michigan
outbreak in 1982 from
ground beef
-shiga toxin producing
(STEC)
(Perna et al. Nature 2001)
The completion of the 2nd E. coli O157:H7 (EHEC)
sequence strain Sakai
-In July 1996, an outbreak of Escherichia
coli O157:H7 infection occurred among
schoolchildren in Sakai City, Osaka,
Japan.
-8,938 schoolchildren sickened, 3 deaths
-We are starting to ask: what genomic
differences determine differences in
virulence, epidemiology, and fatality?
(Hayashi et al. DNA Res 2001)
In 2006 E. coli O157:H7
outbreak from bagged
spinach
(from CDC)
-multistate outbreak
205 people sickened, 3
deaths
Currently there are 13 E. coli O157:H7 Genomes sequenced,
we will have you focus on three that are all in the
Enteropathogen Resource Integration Center (ERIC)
database (www.ericbrc.org)
The three strains you will focus on are:
Escherichia coli EDL933 (EHEC)
Escherichia coli Sakai (EHEC) also called RIMD
Escherichia coli EC4042 (EHEC)
In your start menu under programs, go to Mauve 2.1.1 and start up
Mauve. Notice there is a user's guide in PDF form in this folder; it
contains useful information and commands for navigation.
Note: your computer may need to update Java, since Mauve uses a
Java platform for the alignment.
You should see a
window for
Mauve appear
Next, double click on the uncompressed 3 O157H7 folder. It should
contain the following 19 files. Take the first one (3 O157 alignment)
and drag and drop it into the Mauve window.
It should start to say reading sequences here, and in a few seconds
the alignment will appear. Note: computers with less than 512MB
RAM may not be able to open the file
Your alignment should look like this
Organism name: notice the first is EDL933, the second is RIMD (Sakai), and the third is EC4042 (spinach)
Using the up or down arrows, you can switch
the position of the genomes
Top strand
Bottom strand
The colored blocks are called locally collinear blocks (LCB’s) and
represent regions of the genome that Mauve has identified as
conserved. The lines connect the LCB’s. Notice that some are in
different positions in the other genomes, and some are inverted and
appear on the bottom strand of the double-stranded genome
When you move your mouse over a region of one genome
it will show a black box and also show the corresponding
region (boxes) in the other two genomes, try scrolling left
to right on one genome
Notice that when you scroll (slowly) over a white region (island),
the black boxes pause in the other genomes, then come back once
you have passed over the island and back into conserved regions
If you would like to look at all three LCB’s, even
though one is in a different position, scroll over one
LCB and click the mouse button
Let’s use the zoom function: press the home button
to restore the alignment to the original view
Now click on the white island in the top genome,
and using the right button bring it to the center of
the screen, then start to zoom in multiple times
You will start to see the genes, scroll
over one and pause, and a window will
pop-up with the product annotation, so
here you can view what genes are
present in this EDL933 island, and not
in the other two
Now place your mouse over one of the genes; in my example I
have iha, the irgA homolog adhesin
Click your mouse once on the gene, and a window will pop up;
scroll down and select View CDS iha in ERICdb
This will open the page in the ERIC database for that gene, containing
all of the annotations, you can look to see if it is involved in virulence
Let’s use the search feature
#1) Click on the search feature
#2) Choose a genome (EDL933)
#3) Type in a gene name (stx2A)
#4) Click on search
Notice that it has found the stx2A gene (highlighted in blue), and also
in the RIMD strain. Just because it isn't aligned in the EC4042 strain
does not mean it isn't there, if you look to the right in the EC4042
genome, you will find it
Stx2A
One last feature you can use in Mauve
To find an island that is in 2 out of 3
strains you will use the backbone view
Press the home
button first
Then go to the View pull down select
color scheme then backbone color
Your alignment should look like this in backbone color: regions in
all three appear in light purple, and there will be regions in
different colors that correspond to 2 out of 3 genomes (you may
have to zoom in a bit to see these regions)
Regions in only EDL933 and RIMD appear olive green
Regions in only EDL933 and EC4042 appear maroon
Regions in only RIMD and EC4042 appear tan/brown
This is how you
identify islands unique
to 2/3 strains
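The backbone color logic can be written down as a small lookup in Python. The colors are the ones listed on the slides for this 3-genome alignment; the region names and the presence/absence data are hypothetical, and this is not how Mauve itself stores its backbone.

```python
# Colors as described above for the 3-genome O157:H7 alignment.
BACKBONE_COLORS = {
    frozenset({"EDL933", "RIMD", "EC4042"}): "light purple",
    frozenset({"EDL933", "RIMD"}): "olive green",
    frozenset({"EDL933", "EC4042"}): "maroon",
    frozenset({"RIMD", "EC4042"}): "tan/brown",
}


def backbone_color(genomes):
    """Return the backbone color for the set of genomes sharing a region."""
    genomes = frozenset(genomes)
    if len(genomes) == 1:
        return "white"  # islands unique to one genome appear in white
    return BACKBONE_COLORS.get(genomes, "unknown")


def two_of_three(regions):
    """Keep only the regions shared by exactly two genomes, with their color."""
    return {name: backbone_color(genomes)
            for name, genomes in regions.items() if len(set(genomes)) == 2}


# Hypothetical regions and the genomes that contain them
regions = {
    "region_1": {"EDL933", "RIMD", "EC4042"},
    "region_2": {"EDL933", "RIMD"},
    "region_3": {"EDL933"},
    "region_4": {"RIMD", "EC4042"},
}
print(two_of_three(regions))  # {'region_2': 'olive green', 'region_4': 'tan/brown'}
```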
Using genomics to track the dissemination of
Yersinia pestis strains
Courtesy of www.cdc.gov
Deng et al. 2002 J. Bacteriol. 184:16 4601-4611
Transmission cycle of Plague
Historic 3 pandemics of plague
-pandemic: an epidemic that spreads
throughout the human population across a large region
such as a continent or worldwide
-1st pandemic ~550 A.D. confined mainly to Africa and
some parts of the Middle East
-2nd pandemic originated in Central Asia and spread via
trading routes into Europe (Killed ~30% of Europe
population)
Courtesy of edsitement.neh.gov
-3rd pandemic started in the 1850’s in China’s Yunnan
province, confined mainly to Asia
The first two genomes of Yersinia pestis CO92 & KIM
Parkhill et al. 2001 Nature 413, 523-527
Deng et al. 2002 J. Bacteriol. 184:16 4601-4611
Comparison of 2 genomes was not interactive initially
As of 04/2008 there are 7 complete and 14 Y.
pestis draft genomes
Traditionally the strains are classified as serovars (Antiqua, Mediaevalis,
Orientalis, and other) based on the following phenotypic characteristics:
-Antiqua = East Africa: (glycerol positive, arabinose positive, and nitrate
positive)
-Mediaevalis = Central Asia: (glycerol positive, arabinose positive, and nitrate
negative)
-Orientalis = Central Asia: (glycerol negative, arabinose positive, and nitrate
positive)
-other (i.e., Microtus, Pestoides) not consistent for these phenotypes
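The phenotype-based classification above amounts to a lookup table, sketched here in Python. The three phenotype combinations are taken directly from the list above; the function name and the use of True/False for positive/negative are illustrative choices.

```python
# (glycerol, arabinose, nitrate) -> serovar; True means positive
SEROVAR_PHENOTYPES = {
    (True, True, True): "Antiqua",
    (True, True, False): "Mediaevalis",
    (False, True, True): "Orientalis",
}


def classify_serovar(glycerol, arabinose, nitrate):
    """Return the serovar matching the three phenotypes, or 'other'."""
    return SEROVAR_PHENOTYPES.get((glycerol, arabinose, nitrate), "other")


print(classify_serovar(glycerol=True, arabinose=True, nitrate=False))  # Mediaevalis
print(classify_serovar(glycerol=False, arabinose=False, nitrate=True))  # other
```

Each phenotype traces back to a gene studied later in this module (glpD, araC, napA), so a loss-of-function mutation in one of them changes the classification.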
Paleomicrobiology
Partial view of the grave in Dreux investigated in this work, which
illustrates anthropologic features of a mass grave suitable for
paleomicrobiology research. (courtesy of www.cdc.gov)
-the prefix paleo comes from the Greek word palaios
meaning “ancient”
-bacterial colonization of dental pulp can occur during
bacteremia
-Bacteremia (also known as plague septicaemia with Y.
pestis) is the presence of bacteria in the blood
Courtesy of www.nidcr.nih.gov
Extraction of bacterial DNA from Dental pulp
-Some historians believed
that a flu-like virus and not Y.
pestis was responsible for the
1st and 2nd pandemics
-DNA detected in dental pulp
confirms that Y. pestis was the
cause
-Which serovar(s) are most
similar to the Y. pestis
strain(s) from the dental pulp
from the corpses?
Figure 1 The original protocol developed in our study allows recovering the dental pulp and minimizes the risk of
laboratory-acquired contamination of the specimen. The tooth was encased in sterile resin (1a); the apex was sterilely
sectioned (1b) to give access to the canal system (1c); solutions were injected (1d); after incubation, the tooth was put upside
down into a sterile tube (1e) and centrifuged (1f).
Tran-Hung et al. PLoS ONE v.2(10); 2007
Use of genomic tools to study Y. pestis
Concepts in this module that you will address:
#1) mutations that affect the production of a fully functional gene product and have
phenotypic consequences (insertions, deletions, single nucleotide polymorphisms
[SNP’s]); you will study the genes glpD, napA, and araC
#2) Paleomicrobiology investigation: determine which serovar(s) have the most similar
matching genes compared to the amplified sequence from the dental pulp of 3 corpses.
#3) use of genome alignments: determine an island that is unique to the 4 genomes that
infect humans and is absent in Y. pestis strain 91001
#4) determine the conservation of a virulence factor in the 5 strains in the genome
alignment. Determine if it is a fully functional product in strain 91001.
Next double click on the uncompressed Yersinia pestis alignment 5 genome folder,
it should contain the following 29 files, take the one
(yersinia_pestis_alignment_5genomes), and drag and drop it into the mauve
window
It should start to say reading sequences here, and in a few seconds the alignment will
appear, note computers with less than 512MB RAM may not be able to open the file
Your alignment should look like this
Organism name: notice the first is CO92, the second is KIM, the third is 91001, the fourth is Antiqua, and the fifth is Nepal516
Using the up or down arrows, you can switch the position of the genomes
You may find it easier to view the 5 genome alignment
without the connecting lines:
on your keyboard press Shift L
(pressing this again makes them reappear)
Now place your mouse over one of the genes.
Click your mouse once on a gene, and a window will pop up;
scroll down and select View CDS in ERICdb
This will open the page in the ERIC database for that gene, containing all of the
annotations, you can look to see what is known about it and/or if it is involved in
virulence (note you may be prompted to a log-in screen, click on the button that says
“Enter ASAP”)
Let’s use the search feature to find the genes glpD, napA, and araC
#1) Click on the search feature
#2) Choose a genome or search all of the genomes
#3) Type in a gene name (glpD)
#4) Click on search
Notice that it has found the glpD gene (highlighted in blue), and also
a corresponding gene in each genome. You need to determine which
of the five CDS’s produce the full-length functional protein
Method #1: click on each gene and go to the view CDS in ERICdb.
Look at the length and check whether any are labeled as pseudogenes.
If so, look for a note that describes why it is thought to be a pseudogene
Identifying mutations in glpD, napA, and araC cont.
Method #2: from the feature
page in ERIC
Scroll down to the feature
context part of the page
This is a list of all features that are
neighboring your gene in the genome,
notice some are upstream, downstream,
or contained within
Notice that contained within your glpD
gene there are polymorphic sites
(otherwise known as SNP’s)
For SNP analysis, you will use a
new tool called “Snippy”
In a new tab or web browser window go to
http://asap.ahabs.wisc.edu/~cabot/aep/snippy.php
It should look like this:
Highlight and copy all feature ID’s for polymorphic sites from
glpD and paste them into here and click submit
feature ID’s
In your SNP analysis, you want to look for SNP’s that cause a
change in the amino acid that it encodes for. In some cases the
change results in a premature stop-codon, which may generate a
truncated non-functional protein
#1) note Snippy shows you if the SNP variation results in an
amino acid change, in this case A (Alanine) to T (Threonine)
#2) In this second SNP, the change resulted in a stop codon
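The two outcomes above (an amino acid change and a premature stop codon) can be reproduced with a short Python sketch. This is not Snippy's code; it assumes the standard genetic code and a SNP given as a codon, a position within that codon (0, 1, or 2), and the new base. The example codons are illustrative and are not taken from glpD.

```python
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code; '*' marks a stop codon.
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def snp_effect(codon, position, new_base):
    """Classify the effect of changing one base of a codon."""
    codon = codon.upper()
    mutated = codon[:position] + new_base.upper() + codon[position + 1:]
    before, after = CODON_TABLE[codon], CODON_TABLE[mutated]
    if before == after:
        return "synonymous"
    if after == "*":
        return "nonsense (premature stop)"
    return f"missense ({before} to {after})"


print(snp_effect("GCT", 0, "A"))  # missense (A to T)
print(snp_effect("CAG", 0, "T"))  # nonsense (premature stop)
print(snp_effect("GCT", 2, "C"))  # synonymous
```

A nonsense change early in a gene is the case most likely to produce a truncated, non-functional protein.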
In the middle of each region you will see the polymorphic site (in this
case capital G’s) and the corresponding base in each genome. Note that you
are interested in variations in YPKIM, YPCO92, YP91001, YPNepal,
and YpAntiqua.
-in this case there is no difference among these 5 genomes in this analysis.
Scroll down and search the remaining polymorphic sites to see if
there is any difference at the various polymorphic sites in the 5
genomes; if not, the mutation is probably a larger deletion or insertion event
Using the DNA sequence obtained from the dental pulp from three
corpses (found in the file called Ypestis corpse and CA88-4125YPE
genes.doc), conduct a BlastN search within the ERIC database with each
sequence against the 91001,Nepal, Kim, Antiqua, and CO92 genomes.
For each of the three corpses, which serovar is most similar to the strains
that caused the 1st and 2nd pandemics?
From the ERIC
home page you
can select to
run a Blast
search here
(http://www.ericbrc.org/)
Paste the first
nucleotide
sequence from
corpse #1
Select entire
genomes
Select the genomes to query:
hold down the Ctrl key and
select Y. pestis genomes
91001, Antiqua, CO92, KIM,
and Nepal
Finally click on the Submit Query
button, then repeat with the other two
corpses’ sequences
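Once the BlastN results are in, deciding which genome is "most similar" comes down to comparing identities. The Python sketch below shows that comparison in its simplest form. It is not BLAST: it assumes the sequences are already aligned and of equal length, with no gaps, and the sequences and genome names are made-up placeholders rather than real Y. pestis data.

```python
def percent_identity(seq_a, seq_b):
    """Percent of matching positions between two aligned, equal-length sequences."""
    if not seq_a or len(seq_a) != len(seq_b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


def most_similar(query, candidates):
    """Return the best-matching candidate name and all identity scores."""
    scores = {name: percent_identity(query, seq) for name, seq in candidates.items()}
    best = max(scores, key=scores.get)
    return best, scores


# Placeholder sequences, not real data
query = "ACGTACGTAC"
candidates = {
    "genome_A": "ACGTACGTAC",
    "genome_B": "ACGTTCGTAC",
    "genome_C": "TCGTTCGAAC",
}
best, scores = most_similar(query, candidates)
print(best, scores)  # genome_A {'genome_A': 100.0, 'genome_B': 90.0, 'genome_C': 70.0}
```

With real BLAST output you would compare the reported percent identity (and alignment length) of the top hit in each genome instead.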
Next repeat the BlastN process using the gene sequences from a known North
American ancestor (Y. pestis CA88-4125/YPE) for glpD, napA, and araC. Of the 5
genomes (91001, Antiqua, CO92, KIM, and Nepal) representing the three serovars,
which is most similar to the known North American ancestor?
Based on your analysis did Y. pestis arrive in North America via shipping routes
over the Atlantic or Pacific?
Atlantic? (Serovar Antiqua of African origin)
Pacific? (Serovar Orientalis or Mediaevalis of Asian origin)
Courtesy of education.usgs.gov
Your alignment should look like this in backbone color, regions in
all five appear in light purple color, there will be regions that are
different colors that will correspond to 2, 3, 4 out of 5 genomes (you
may have to zoom in a bit to see these regions)
Look for a region in the lightest blue color that is present in CO92, KIM, Antiqua, and
Nepal, but absent in the 91001 strain. Analyze the contents and determine if any of the
genes may contribute to human infection of Y. pestis.
Conclusion
If you are interested in using
some or all of these modules in
your class, please sign up, and
provide email, institution,
course(s)
-In the last two weeks of August
2008 I will be leading multiple
WebX training sessions to refresh
and field Q&A, you need a
telephone and internet-ready
computer
Thanks for your time
Collaborators:
Dr. Kai F. (Billy) Hung (UW-Madison/Assistant Prof. at Eastern Illinois University Fall 2008)
Dr. Amy C. Wong (UW-Madison)
Dr. Lois Banta (Williams College)
Mentors:
Dr. Nicole Perna (UW-Madison)
Dr. Charles Kaspar (UW-Madison)
Dr. Jeffrey Byrd (St. Mary’s College)
Dr. Bob Kadner and the ASM Summer Institute
Thank you: everyone on the ERIC database team (especially Guy Plunkett III for setting up
module #1 & Eric Cabot for making Snippy) and all of the members of the Perna Genome
Evolution Laboratory
Funding: This project has been funded with Federal funds from the National Institute of
Allergy and Infectious Diseases, National Institutes of Health, Department of Health and
Human Services, under contract No. HHSN266200400040C