The Microbiome and
Metagenomics
Catherine Lozupone
CPBS 7711
October 13, 2015
What is the microbiome?
• “The ecological community of commensal,
symbiotic, and pathogenic microorganisms that
share our body space”
• Microbiota: “collection of organisms”
Microbiome: “collection of genes”
• Bacteria, Archaea, microbial eukaryotes (e.g.
fungi or protists) and viruses.
• Body Sites
– Important roles in health and disease: Gut, Mouth,
Vagina, Skin (diverse sites: nasal epithelium)
– Important roles in disease: Lung, blood, liver, urine
The big tree
• Majority of life’s
diversity is microbial
• Majority of microbial
life cannot be grown
in pure culture
Pace, N.R., The Universal Nature of Biochemistry. PNAS
Vol 98(3) pp 805-808.
The Human Gut Microbiota
• About 100 trillion microbial cells: a commonly cited
estimate is that they outnumber human cells 10 to 1
• Most gut microbes are harmless or beneficial.
– Protect against enteropathogens
– Extract dietary calories and vitamins
– Prevent immune disorders
• List of diseases associated with dysbiosis ever growing
– Inflammatory Diseases: IBD, IBS
– Metabolic Diseases: Obesity, Malnutrition
– Neurological Disorders
– Cancer
What do we want to understand?
• What does a healthy microbiome look like?
– How diverse is it?
– What types of bacteria are there?
– What is their function?
• How variable is the microbiome?
– Over time within an individual?
– Across individuals?
– Functionally?
• What are driving factors of variability?
– Age, culture, physiological state (pregnancy)
• How do changes affect disease?
– What properties (taxa, amount of diversity) change with disease?
– Cause or effect?
– Functional consequences of dysbiosis
• Host Interactions
– Evolution/adaptation to the host over time.
– Immune system
Culture-independent studies revolutionized
our understanding of gut bacteria
• Culture-based studies over-emphasized
the importance of easily culturable
organisms (e.g. E. coli).
Culture-independent surveys
1. Extract DNA from environmental samples.
2. PCR amplify SSU rRNA gene (which species?)
or sequence random fragments (which function?)
3. Evaluate sequences
Gut microbiota has simple
composition at the phylum level
Data from: Yatsunenko et al. 2012. Nature.
Different phyla: Animals
and plants
Diversity of Firmicutes in 2 healthy
adults
• Each person
harbors > 1000
species.
• Some species
are unique (red
and blue)
• Some shared
(purple)
• We know very
little about
what most of
these species
do!
Sequencing technology renaissance enabled
more complex study designs
• Sanger Sequencing (thousands)
• Pyrosequencing (millions)
• Illumina (billions!)
Metagenomics
• The study of metagenomes, genetic material
recovered directly from environmental
samples.
• Marker gene
– PCR amplify a gene of interest
– Tells you what types of organisms are there
– Bacteria/Archaea (16S rRNA), Microbial Euks (18S
rRNA), Fungi (ITS), Virus (no good marker)
• Shotgun
– Fragment DNA and sequence randomly.
– Tells you what kind of functions are there.
Small Subunit Ribosomal RNA
• Present in all known life
forms
• Highly conserved
• Resistant to horizontal
transfer events
16S rRNA secondary structure
Other ‘Omics
• MetaTranscriptomics (sequence version of
microarray)
– Isolate all RNA
– Deplete rRNA
– Sequence all transcripts
– Sometimes phenotype only seen in activity of the
microbiota
• Metabolomics
– What metabolites does a community produce?
– E.g. in feces or urine
• MetaProteomics
– What proteins does a community produce?
Integrating Data Types
• 16S rRNA -> shotgun metagenomics
– What gene differences cannot be explained by
16S?
– Selection by HGT
• 16S/genomics -> transcriptomics -> metabolomics
– What species or genes (or combination of species
or genes), when expressed, are responsible for
producing a given metabolite?
Sequencing Technologies
• Sanger -> 454 Pyrosequencing -> Illumina
Short reads (pyrosequencing)
can recapture the result.
• Unweighted (UW) UniFrac
clustering with Arb
parsimony insertion
of 100 bp reads
extending from
primer R357.
• Assignment of
short reads to an
existing phylogeny
(e.g. greengenes
coreset) allows for
the analysis of very
large datasets.
Liu Z, Lozupone C, Hamady M, Bushman FD & Knight R (2007) Short pyrosequencing
reads suffice for accurate microbial community analysis. Nucleic Acids Res 35: e120.
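Unweighted UniFrac can be illustrated with a small sketch: the distance between two samples is the fraction of branch length in the tree that leads to tips from only one of the two samples. The tree representation below (a list of branch length plus the set of tips under that branch) and the function name are assumptions made for illustration, not the UniFrac or QIIME implementation.

```python
def unweighted_unifrac(branches, sample_a, sample_b):
    """Toy unweighted UniFrac.

    branches: iterable of (length, set_of_descendant_tip_names)
    sample_a, sample_b: sets of tip names observed in each sample
    """
    unique = 0.0  # branch length leading to only one sample
    total = 0.0   # branch length leading to either sample
    for length, tips in branches:
        in_a = bool(tips & sample_a)
        in_b = bool(tips & sample_b)
        if in_a or in_b:
            total += length
            if in_a != in_b:
                unique += length
    return unique / total if total else 0.0


# Hypothetical tree ((t1:1,t2:1):1,(t3:1,t4:1):1) written as branches.
example_branches = [
    (1.0, {"t1"}), (1.0, {"t2"}), (1.0, {"t1", "t2"}),
    (1.0, {"t3"}), (1.0, {"t4"}), (1.0, {"t3", "t4"}),
]
```

Two samples drawn from opposite clades share no branch length and get the maximum distance; two samples with identical membership get zero. Placing short reads onto an existing reference tree only changes which tips each sample contains, which is why the same calculation scales to large datasets.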
Preprocessing pyrosequencing datasets
• Quality filtering: Discard sequences that:
– Are too short or too long (outside the 200-1000 range)
– Have low quality scores
– Have long homopolymers
– Can trim poor quality regions from the ends
• PyroNoise and Chimeras
– Can greatly inflate OTU counts
– PyroNoise algorithm uses SFF files to fix noisy
sequences
• Use barcodes to assign sequences to
samples
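The filtering and barcode steps above can be sketched as follows. Only the 200-1000 length range comes from the slide; the mean-quality cutoff, the homopolymer limit, and all function names are assumed values chosen for illustration, and denoising and chimera removal are not covered.

```python
import re


def passes_quality_filter(seq, quals, min_len=200, max_len=1000,
                          min_mean_qual=25, max_homopolymer=6):
    """Return True if a read survives the length, quality and homopolymer checks.

    quals is a list of per-base quality scores, the same length as seq.
    min_mean_qual and max_homopolymer are assumed thresholds.
    """
    if not (min_len <= len(seq) <= max_len):
        return False
    if sum(quals) / len(quals) < min_mean_qual:
        return False
    # Reject any run of more than max_homopolymer identical bases.
    if re.search(r"(.)\1{%d,}" % max_homopolymer, seq):
        return False
    return True


def assign_to_sample(seq, barcode_to_sample):
    """Match the leading barcode; return (sample, read with barcode removed)."""
    for barcode, sample in barcode_to_sample.items():
        if seq.startswith(barcode):
            return sample, seq[len(barcode):]
    return None, seq  # no known barcode: leave the read unassigned
```

This version requires an exact barcode match at the start of the read; real pipelines often tolerate a small number of barcode errors.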
Defining species: OTU picking
• Cluster sequences based on % identity
– 97% id typical for species
– CD-HIT, UCLUST
• For Phylogenetic diversity measures need
to make a tree
– Align sequences: NAST, PyNAST
– De novo tree building: FastTree
– Assign reads to sequences in a pre-defined
reference tree
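Clustering at 97% identity can be illustrated with a greedy sketch: each sequence joins the first existing cluster whose seed it matches at or above the threshold, otherwise it starts a new cluster. This is a simplification for illustration, not the CD-HIT or UCLUST algorithm; it assumes the sequences are already aligned to equal length, and the function names are hypothetical.

```python
def percent_identity(a, b):
    """Fraction of matching positions between two equal-length aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(x == y for x, y in zip(a, b))
    return matches / len(a)


def greedy_otu_pick(seqs, threshold=0.97):
    """Return OTUs as lists of indices into seqs.

    Each sequence is compared to the seed (first member) of every
    existing OTU, in order, and joins the first one it matches.
    """
    seeds = []  # one representative sequence per OTU
    otus = []   # member indices, parallel to seeds
    for i, seq in enumerate(seqs):
        for k, seed in enumerate(seeds):
            if percent_identity(seq, seed) >= threshold:
                otus[k].append(i)
                break
        else:
            # No seed was close enough: this sequence founds a new OTU.
            seeds.append(seq)
            otus.append([i])
    return otus
```

Because the result depends on input order, greedy pickers are usually run on sequences sorted by abundance or length, so that likely true sequences become seeds first.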
Comparing Diversity
• Overview of methods for evaluating/comparing microbial
diversity across samples using 16S rRNA
– α diversity: How much is there?
– β diversity: How much is shared?
• Phylogenetic versus taxon-based diversity.
• Quantitative versus qualitative diversity.
• What types of taxa are driving the patterns? Which
species are associated with measured properties?
• Tools: UniFrac/QIIME/Topiary Explorer
• Lozupone, C.A. and R. Knight (2008) Species divergence and the
measurement of microbial diversity. FEMS Microbiol Rev. 1-22.
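The distinctions above can be made concrete with a few taxon-based measures. The sketch below assumes each sample is a dict mapping OTU name to read count (an assumed layout, not a QIIME format). Observed OTUs and Shannon index describe diversity within one sample; Jaccard distance is a qualitative (presence/absence) comparison between samples and Bray-Curtis is a quantitative (abundance) one. Phylogenetic measures such as UniFrac additionally need a tree and are not shown here.

```python
import math


def observed_otus(counts):
    """Within-sample richness: number of OTUs with at least one read."""
    return sum(1 for c in counts.values() if c > 0)


def shannon(counts):
    """Within-sample Shannon index (natural log) from OTU counts."""
    total = sum(counts.values())
    return -sum((c / total) * math.log(c / total)
                for c in counts.values() if c > 0)


def jaccard_distance(a, b):
    """Qualitative between-sample distance: 1 - shared OTUs / all OTUs."""
    present_a = {k for k, c in a.items() if c > 0}
    present_b = {k for k, c in b.items() if c > 0}
    union = present_a | present_b
    if not union:
        return 0.0
    return 1 - len(present_a & present_b) / len(union)


def bray_curtis(a, b):
    """Quantitative between-sample dissimilarity based on count differences."""
    keys = set(a) | set(b)
    num = sum(abs(a.get(k, 0) - b.get(k, 0)) for k in keys)
    den = sum(a.values()) + sum(b.values())
    return num / den if den else 0.0
```

Two samples can look similar by a qualitative measure and different by a quantitative one when they share the same OTUs at very different abundances, which is why both kinds are usually reported.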