The Microbiome and
Metagenomics
Catherine Lozupone
CPBS 7711
October 13, 2015
What is the microbiome?
• “The ecological community of commensal,
symbiotic, and pathogenic microorganisms that
share our body space”
• Microbiota: “collection of organisms”
Microbiome: “collection of genes”
• Bacteria, Archaea, microbial eukaryotes (e.g.
fungi or protists) and viruses.
• Body Sites
– Important roles in health and disease: Gut, Mouth,
Vagina, Skin (diverse sites: nasal epithelium)
– Important roles in disease: Lung, blood, liver, urine
The big tree
• Majority of life’s
diversity is microbial
• Majority of microbial
life cannot be grown
in pure culture
Pace, N.R., The Universal
Nature of Biochemistry. PNAS
Vol 98(3) pp 805-808.
The Human Gut Microbiota
• 100 trillion microbial cells: outnumber human
cells 10 to 1!
• Most gut microbes are harmless or beneficial.
– Protect against enteropathogens
– Extract dietary calories and vitamins
– Prevent immune disorders
• List of diseases associated with dysbiosis is ever
growing
– Inflammatory Diseases: IBD, IBS
– Metabolic Diseases: Obesity, Malnutrition
– Neurological Disorders
– Cancer
What do we want to understand?
• What does a healthy microbiome look like?
– How diverse is it?
– What types of bacteria are there?
– What is their function?
• How variable is the microbiome?
– Over time within an individual?
– Across individuals?
– Functionally?
• What are driving factors of variability?
– Age, culture, physiological state (pregnancy)
• How do changes affect disease?
– What properties (taxa, amount of diversity) change with disease?
– Cause or effect?
– Functional consequences of dysbiosis
• Host Interactions
– Evolution/adaptation to the host over time.
– Immune system
Culture-independent studies revolutionized
our understanding of gut bacteria
• Culture-based studies over-emphasized
the importance of easily culturable
organisms (e.g. E. coli).
Culture-independent surveys
1. Extract DNA from environmental samples.
2. Either:
– PCR amplify SSU rRNA gene (which species?)
– Sequence random fragments (which function?)
3. Evaluate sequences
Gut microbiota has simple
composition at the phylum level
Data from: Yatsunenko et al. 2012. Nature.
Different phyla: Animals
and plants
Diversity of Firmicutes in 2 healthy
adults
• Each person
harbors > 1000
species.
• Some species
are unique (red
and blue)
• Some shared
(purple)
• We know very
little about
what most of
these species
do!
Sequencing technology renaissance enabled
more complex study designs
• Sanger Sequencing (thousands)
• Pyrosequencing (millions)
• Illumina (billions!)
Metagenomics
• The study of metagenomes, genetic material
recovered directly from environmental
samples.
• Marker gene
– PCR amplify a gene of interest
– Tells you what types of organisms are there
– Bacteria/Archaea (16S rRNA), Microbial Euks (18S
rRNA), Fungi (ITS), Virus (no good marker)
• Shotgun
– Fragment DNA and sequence randomly.
– Tells you what kind of functions are there.
Small Subunit Ribosomal RNA
• Present in all known life
forms
• Highly conserved
• Resistant to horizontal
transfer events
16S rRNA secondary structure
Other ‘Omics
• MetaTranscriptomics (sequence version of
microarray)
– Isolate all RNA
– Deplete rRNA
– Sequence all transcripts
– Sometimes phenotype only seen in activity of the
microbiota
• Metabolomics
– What metabolites does a community produce?
– E.g. in feces or urine
• MetaProteomics
– What proteins does a community produce?
Integrating Data Types
• 16S rRNA -> shotgun metagenomics
– What gene differences cannot be explained by
16S?
– Selection by HGT
• 16S/genomics -> transcriptomics ->
metabolomics
– What species or genes (or combination of species
or genes), when expressed, are responsible for
producing a given metabolite?
Sequencing Technologies
• Sanger -> 454 Pyrosequencing -> Illumina
Short reads (pyrosequencing)
can recapture the result.
• Unweighted (UW) UniFrac
clustering with Arb
parsimony insertion
of 100 bp reads
extending from
primer R357.
• Assignment of
short reads to an
existing phylogeny
(e.g. greengenes
coreset) allows for
the analysis of very
large datasets.
Liu Z, Lozupone C, Hamady M, Bushman FD & Knight R (2007) Short pyrosequencing
reads suffice for accurate microbial community analysis. Nucleic Acids Res 35: e120.
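To make the unweighted UniFrac idea concrete, here is a minimal Python sketch. It assumes a toy tree stored as a dictionary mapping each node to its parent and branch length; the tree, the tip names, and the function names are hypothetical and are not taken from the paper or from the UniFrac software. As a simplification, it counts every branch between a tip and the root.

```python
def branches_to_root(tips, parent):
    """Collect every branch (named by its child node) between the tips and the root."""
    covered = set()
    for node in tips:
        # Stop at the root, or as soon as we reach a branch already recorded.
        while parent[node][0] is not None and node not in covered:
            covered.add(node)
            node = parent[node][0]
    return covered


def total_length(branches, parent):
    """Sum the lengths of the given branches."""
    return sum(parent[node][1] for node in branches)


def unweighted_unifrac(tips_a, tips_b, parent):
    """Fraction of observed branch length that leads to only one of the two samples."""
    a = branches_to_root(tips_a, parent)
    b = branches_to_root(tips_b, parent)
    return total_length(a ^ b, parent) / total_length(a | b, parent)


# Hypothetical reference tree: node -> (parent, branch length).
toy_tree = {
    "root": (None, 0.0),
    "X": ("root", 1.0), "Y": ("root", 1.0),
    "t1": ("X", 0.5), "t2": ("X", 0.5),
    "t3": ("Y", 0.5), "t4": ("Y", 0.5),
}

# Two samples with no shared lineages are maximally distant.
print(unweighted_unifrac({"t1", "t2"}, {"t3", "t4"}, toy_tree))  # 1.0
```

Because each short read only needs to be placed on a tip of an existing tree, the distance calculation itself never has to rebuild the phylogeny.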
Preprocessing pyrosequencing datasets
• Quality filtering: Discard sequences that:
– Are too short or too long (outside the 200-1000 range)
– Have low quality scores
– Have long homopolymers
– Can also trim poor quality regions from the ends
• PyroNoise and Chimeras
– Can greatly inflate OTU counts
– PyroNoise algorithm uses SFF files to fix noisy
sequences
• Use barcodes to assign sequences to
samples
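The filtering and barcode steps above can be sketched in a few lines of Python. This is an illustration only: the 200-1000 length range comes from the slide, while the quality threshold, the homopolymer limit, the barcode length, and the function names are assumptions chosen for the example. It does not attempt PyroNoise denoising or chimera checking.

```python
import re


def passes_quality_filter(seq, quals, min_len=200, max_len=1000,
                          min_mean_qual=25, max_homopolymer=6):
    """Return True if a read survives the length, quality and homopolymer checks."""
    if not (min_len <= len(seq) <= max_len):
        return False
    if sum(quals) / len(quals) < min_mean_qual:
        return False
    # A homopolymer is a run of one repeated base, e.g. AAAAAAA.
    longest_run = max(len(m.group(0)) for m in re.finditer(r"(.)\1*", seq))
    return longest_run <= max_homopolymer


def demultiplex(reads, barcode_to_sample, barcode_len=8):
    """Assign reads to samples by exact match on the leading barcode, then strip it."""
    by_sample = {}
    for seq in reads:
        sample = barcode_to_sample.get(seq[:barcode_len])
        if sample is not None:  # reads with unknown barcodes are dropped
            by_sample.setdefault(sample, []).append(seq[barcode_len:])
    return by_sample
```

Production pipelines usually also tolerate a small number of barcode errors; exact matching is used here to keep the sketch short.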
Defining species: OTU picking
• Cluster sequences based on % identity
– 97% id typical for species
– CD-HIT, UCLUST
• Phylogenetic diversity measures require
a tree
– Align sequences: NAST, PyNAST
– De novo tree building: FastTree
– Or assign reads to sequences in a pre-defined
reference tree
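Clustering by % identity can be illustrated with a toy greedy algorithm in Python. This is a sketch under strong assumptions: reads are taken to be already aligned to the same length, identity is a simple per-position match fraction, and the function names are hypothetical. It is not how CD-HIT or UCLUST are implemented; those tools use their own alignment heuristics.

```python
def percent_identity(a, b):
    """Fraction of matching positions between two pre-aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return matches / len(a)


def pick_otus(seqs, threshold=0.97):
    """Greedy centroid clustering: each read joins the first centroid it matches."""
    otus = []  # list of (centroid_sequence, [member_ids])
    for seq_id, seq in seqs.items():
        for centroid, members in otus:
            if percent_identity(seq, centroid) >= threshold:
                members.append(seq_id)
                break
        else:
            # No existing centroid is similar enough, so this read seeds a new OTU.
            otus.append((seq, [seq_id]))
    return [members for _, members in otus]
```

With greedy clustering the result depends on input order, which is one reason real tools typically sort reads (for example by length or abundance) before clustering.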
Comparing Diversity
• Overview of methods for evaluating/comparing microbial
diversity across samples using 16S rRNA
• α diversity: How much is there?
• β diversity: How much is shared?
• Phylogenetic versus taxon-based diversity.
• Quantitative versus qualitative diversity.
• What types of taxa are driving the patterns? Which
species are associated with measured properties?
• Tools: UniFrac/QIIME/Topiary Explorer
• Lozupone, C.A. and R. Knight (2008) Species divergence and the
measurement of microbial diversity. FEMS Microbiol Rev. 1-22.
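As a rough illustration of the distinctions above, the Python sketch below computes two taxon-based α diversity measures and two taxon-based β diversity measures from per-sample count tables. The choice of Shannon, Jaccard, and Bray-Curtis is an assumption made for the example (the slides do not name specific indices), and the taxon counts are invented.

```python
import math


def observed_richness(counts):
    """Alpha diversity, qualitative: number of taxa present."""
    return sum(1 for c in counts.values() if c > 0)


def shannon(counts):
    """Alpha diversity, quantitative: Shannon index using natural logarithms."""
    total = sum(counts.values())
    return -sum((c / total) * math.log(c / total)
                for c in counts.values() if c > 0)


def jaccard_distance(a, b):
    """Beta diversity, qualitative: based only on presence/absence of taxa."""
    present_a = {t for t, c in a.items() if c > 0}
    present_b = {t for t, c in b.items() if c > 0}
    return 1 - len(present_a & present_b) / len(present_a | present_b)


def bray_curtis(a, b):
    """Beta diversity, quantitative: accounts for differences in abundance."""
    taxa = set(a) | set(b)
    shared = sum(min(a.get(t, 0), b.get(t, 0)) for t in taxa)
    return 1 - 2 * shared / (sum(a.values()) + sum(b.values()))


# Invented counts: same taxa in both samples, different proportions.
sample_a = {"Bacteroides": 50, "Prevotella": 50}
sample_b = {"Bacteroides": 90, "Prevotella": 10}

print(jaccard_distance(sample_a, sample_b))  # 0.0, identical membership
print(bray_curtis(sample_a, sample_b))       # about 0.4, abundances differ
```

The two samples look identical to the qualitative measure and different to the quantitative one, which is the contrast the slide draws. Phylogenetic measures such as UniFrac additionally weigh how closely related the differing taxa are.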