Metabarcoding - Bioinformatics Institute
Metabarcoding
16S rRNA targeted sequencing
Peter Tsai
Bioinformatics Institute, University of Auckland
Overview
What are metagenomics and metabarcoding?
Next generation sequencing and metabarcoding
How NGS changes metagenomics
Analysis approach
Taxonomic dependent and independent analysis
Study example
NZ vine yeast biogeography by pyrosequencing
Metagenomics
Study of metagenomes: genetic material recovered directly from
environmental samples.
Shotgun metagenomics
◦ Randomly shears DNA, sequences many different species in the environment
and attempts to reconstruct multiple genomes.
Metabarcoding
Subset of metagenomics.
Study of one or more marker genes.
Gene-specific primers ‘barcode’ that gene, e.g. 16S, ITS or CO1
Aim is often to identify different species and compare different
communities
NGS and metagenomics
Accelerated by NGS, predominantly 454 sequencing
because of its longer read length, and now increasingly Illumina-based
chemistry.
Organisms no longer need to be cultivated and cloned:
culture-independent insight
Direct sequencing from environment as a “community”
You can pool multiple samples together
Not all microbes can be cultured
Analysis approach
Taxonomy independent analysis
Reads are grouped into operational taxonomic units (OTUs)
based on a specified level of sequence variation.
Taxonomy dependent analysis
Assignment at the level of domain, phylum, class, order,
family, genus, and species
Requires a reference database
Taxonomy independent analysis
Group reads into OTUs based on an imposed
similarity threshold
In studies of bacteria, 97% is a common starting point
The threshold may vary with the species and the gene
1 OTU is treated as 1 organism
Extract an OTU representative sequence
Most common sequence
Sequence that has the minimum difference to all other
sequences in the same OTU
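The two steps above (grouping reads at a similarity threshold, then picking a representative) can be sketched as follows. This is a minimal illustration, not the method used by any particular tool: it assumes reads are already aligned to the same length, measures identity as the fraction of matching positions, and uses a simple greedy scheme in which the most abundant sequences become OTU seeds. The function names and the toy reads are made up for the example.

```python
from collections import Counter


def identity(a, b):
    """Fraction of matching positions between two equal-length, pre-aligned reads."""
    if len(a) != len(b):
        raise ValueError("reads must be aligned to the same length")
    return sum(1 for x, y in zip(a, b) if x == y) / len(a)


def cluster_otus(reads, threshold=0.97):
    """Greedy clustering: each read joins the first OTU whose seed it matches."""
    counts = Counter(reads)
    otus = []  # each OTU is a list of reads; its first element is the seed
    # Visit the most abundant unique sequences first so they become seeds.
    for seq, n in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        for otu in otus:
            if identity(seq, otu[0]) >= threshold:
                otu.extend([seq] * n)
                break
        else:
            otus.append([seq] * n)  # no seed was close enough: start a new OTU
    return otus


def most_common_representative(otu):
    """Representative option 1: the most frequent sequence in the OTU."""
    return Counter(otu).most_common(1)[0][0]


def medoid_representative(otu):
    """Representative option 2: smallest total difference to all reads in the OTU."""
    return min(sorted(set(otu)), key=lambda s: sum(1 - identity(s, t) for t in otu))


# Toy data (hypothetical): a 100-base read, a 1-mismatch variant, an unrelated read.
base = "ACGT" * 25
near = base[:10] + "T" + base[11:]  # 99% identical to base
far = "TTTT" * 25                   # 25% identical to base
otus = cluster_otus([base] * 3 + [near] + [far] * 2, threshold=0.97)

for otu in otus:
    print(len(otu), most_common_representative(otu) == medoid_representative(otu))
```

With these toy reads the near variant falls into the same OTU as the base sequence, while the unrelated read forms its own OTU. Raising or lowering `threshold` changes how many OTUs appear, which is why the threshold has to be chosen per gene and per group of organisms.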
Taxonomy dependent analysis
Classify sequences
BLAST
Simply BLAST what you have
Online RDP classifier (Ribosomal Database Project)
RDP 10.26 (Release 10, Update 26) consists of 1,613,063 aligned and
annotated 16S rRNA sequences
Limited by the number of reads you can submit
Online Greengenes classifier based on NAST alignment
Requires a pre-aligned dataset
Limited by the number of reads you can submit
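To make the idea of reference-based classification concrete, here is a toy sketch that assigns a read to whichever reference sequence shares the largest fraction of its 8-mers. It is not the algorithm used by BLAST, the RDP classifier or Greengenes; the reference sequences, lineage names and the `min_shared` cutoff are all invented for illustration.

```python
def kmers(seq, k=8):
    """Set of all overlapping k-mers in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def classify(read, reference, k=8, min_shared=0.5):
    """Return (lineage, score) for the best-matching reference sequence.

    score is the fraction of the read's k-mers found in that reference.
    Reads scoring below min_shared are reported as "unclassified".
    """
    read_kmers = kmers(read, k)
    if not read_kmers:
        return "unclassified", 0.0
    best_lineage, best_score = "unclassified", 0.0
    for lineage, ref_seq in reference.items():
        score = len(read_kmers & kmers(ref_seq, k)) / len(read_kmers)
        if score > best_score:
            best_lineage, best_score = lineage, score
    if best_score < min_shared:
        return "unclassified", best_score
    return best_lineage, best_score


# Hypothetical reference database: lineage string -> marker gene sequence.
reference = {
    "Bacteria;PhylumA;GenusA": "ACGTTGCAAGGCTTAACCGGTTAACGTAGCATCGA",
    "Bacteria;PhylumB;GenusB": "GGGCCCATATATCGCGCGTTTAAACCCGGGTATAC",
}

known_read = reference["Bacteria;PhylumA;GenusA"][5:25]  # fragment of GenusA
novel_read = "TTTTTTTTTTTTTTTT"                          # matches nothing

print(classify(known_read, reference))
print(classify(novel_read, reference))
```

The second read illustrates the main limitation of taxonomy dependent analysis: a sequence with no close relative in the reference database cannot be assigned, however abundant it is in the sample.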
NZ vine yeast biogeography
by pyrosequencing
M. W. Taylor, N. Anfang, A. H. Thrimawithana, P. Tsai, H. Ross and M. R. Goddard
School of Biological Sciences, University of Auckland
NZ vine yeast biogeography by pyrosequencing
Yeasts are the agents responsible for fermentation of
fruits into wine
Yeasts naturally associated with vines and wines are
reasonably well characterised
Microbes have an effect on both vine and fruit
development (as some are pathogens), as well as the
resulting wine quality and style
Investigations into the ecology of these organisms are
lacking.
Vitis vinifera
NZ vine yeast biogeography by pyrosequencing
6 distinct vineyards in each of four major and distinct
wine-producing regions
West Auckland (WA)
Hawke’s Bay (HB)
Marlborough (MB)
Central Otago (CO)
26S rRNA gene from DNA extracted directly from
microbial communities associated with ripe
Chardonnay fruit
NZ vine yeast biogeography by pyrosequencing
Quality checks
◦ Remove short reads
◦ Remove reads containing ambiguity
◦ Trim off low quality regions
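The three quality checks can be sketched as a single filter. The thresholds (`min_len`, `min_q`) and the choice to trim before checking length are assumptions made for this example, not values taken from the study; reads are represented as a sequence string plus a list of per-base quality scores.

```python
def trim_low_quality(seq, quals, min_q=20):
    """Cut the read at the first base whose quality falls below min_q."""
    for i, q in enumerate(quals):
        if q < min_q:
            return seq[:i], quals[:i]
    return seq, quals


def quality_filter(reads, min_len=10, min_q=20):
    """Trim low-quality regions, then drop short or ambiguous reads."""
    kept = []
    for seq, quals in reads:
        seq, quals = trim_low_quality(seq, quals, min_q)
        if len(seq) < min_len:
            continue  # too short after trimming
        if any(base not in "ACGT" for base in seq):
            continue  # contains an ambiguous base such as N
        kept.append((seq, quals))
    return kept


# Hypothetical reads: (sequence, per-base quality scores).
reads = [
    ("ACGTACGTACGTACGT", [30] * 16),            # clean read
    ("ACGTACGTACGTACGT", [30] * 12 + [5] * 4),  # low-quality tail
    ("ACGTNCGTACGTACGT", [30] * 16),            # ambiguous base
    ("ACGTAC", [30] * 6),                       # too short
    ("ACGTACGTACGTACGT", [30] * 4 + [5] * 12),  # too short once trimmed
]

for seq, quals in quality_filter(reads):
    print(seq, len(seq))
```

Trimming first means a read is judged on the bases that will actually be used downstream, so a long read with a poor-quality tail is still discarded if too little good sequence remains.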
Taxonomy independent analysis
◦ No well-established reference database for eukaryotic 26S
◦ Clustering into 98% OTUs
◦ ANOSIM for statistical tests between regions
◦ Limited classification, relying on the NCBI Taxonomy DB
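For readers unfamiliar with ANOSIM, the sketch below shows the core of the test under simple assumptions: pairwise distances between samples are ranked, the R statistic compares the mean rank of between-group distances with that of within-group distances, and a p-value is estimated by shuffling the group labels. The distance matrix and group labels are hypothetical, and this is a bare-bones illustration rather than the implementation used in the study.

```python
import random
from itertools import combinations


def average_ranks(values):
    """Rank values from 1, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def anosim_r(dist, groups):
    """R = (mean between-group rank - mean within-group rank) / (n(n-1)/4)."""
    n = len(groups)
    pairs = list(combinations(range(n), 2))
    ranks = average_ranks([dist[i][j] for i, j in pairs])
    within = [r for r, (i, j) in zip(ranks, pairs) if groups[i] == groups[j]]
    between = [r for r, (i, j) in zip(ranks, pairs) if groups[i] != groups[j]]
    mean_within = sum(within) / len(within)
    mean_between = sum(between) / len(between)
    return (mean_between - mean_within) / (n * (n - 1) / 4)


def anosim(dist, groups, permutations=999, seed=0):
    """Return (R, p) where p comes from random relabelling of the samples."""
    rng = random.Random(seed)
    observed = anosim_r(dist, groups)
    shuffled = list(groups)
    hits = 0
    for _ in range(permutations):
        rng.shuffle(shuffled)
        if anosim_r(dist, shuffled) >= observed:
            hits += 1
    return observed, (hits + 1) / (permutations + 1)


# Hypothetical distances between four samples, two from each of two regions.
groups = ["WA", "WA", "CO", "CO"]
dist = [
    [0.00, 0.10, 0.80, 0.90],
    [0.10, 0.00, 0.70, 0.85],
    [0.80, 0.70, 0.00, 0.20],
    [0.90, 0.85, 0.20, 0.00],
]

r_stat, p_value = anosim(dist, groups, permutations=99)
print(r_stat, p_value)
```

An R close to 1 means samples within a region are more similar to each other than to samples from other regions; an R near 0 means region explains little. With only four samples the permutation p-value cannot become small, so real analyses need more samples per group.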
NZ vine yeast biogeography by pyrosequencing
2,000 species were found using deep sequencing across all regions.
Culture-based analysis recovered 7 species from West Auckland and
Hawke’s Bay
Deep sequencing identified ~700 from the same West Auckland and
Hawke’s Bay samples.
All 7 species were found in the pyrosequencing dataset
Culture-based analysis may miss ~99% of the community
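The ~99% figure follows directly from the two counts above, taking the approximate total of 700 at face value:

```latex
\[
\frac{7}{700} = 0.01 = 1\%
\qquad\Longrightarrow\qquad
1 - \frac{7}{700} = 0.99 \approx 99\%
\]
```

That is, culturing recovered about 1% of the species detected by deep sequencing in the same regions.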
[Figure: the four sampled regions (Marlborough, Hawke’s Bay, Central Otago, West Auckland)]
Geographic patterns for yeast communities
Central Otago harbours the most distinct community
Different communities associated with Chardonnay vines in
different areas of NZ
Community similarity significantly decays with distance and
temperature
Different regions harbour different communities, which may, in part,
contribute to the distinctiveness of wines from each
area.
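Comparing communities between regions needs a numeric measure of how different two samples are. The slides do not say which measure was used; the sketch below assumes Bray-Curtis dissimilarity on OTU count vectors, a common choice, and the counts are hypothetical.

```python
def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two OTU count vectors.

    0 means identical communities; 1 means no shared OTUs.
    """
    total = sum(a) + sum(b)
    if total == 0:
        raise ValueError("both samples are empty")
    return sum(abs(x - y) for x, y in zip(a, b)) / total


# Hypothetical read counts for three OTUs in three regions.
samples = {
    "WA": [10, 0, 5],
    "HB": [8, 2, 5],
    "CO": [0, 12, 3],
}

names = sorted(samples)
for i, first in enumerate(names):
    for second in names[i + 1:]:
        print(first, second, round(bray_curtis(samples[first], samples[second]), 3))
```

A table of such pairwise values is the usual input to tests of regional differences, and plotting it against geographic distance is one way to look for the distance decay described above.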
Key questions associated with Metagenomics
Number of reads needed
Statistical power
Overestimating diversity due to sequencing error
Results in a large number of OTUs
Multiple copies of the 16S rRNA gene in some species
Leads to overrepresentation
Accuracy of taxonomic classification
Not all rRNA genes amplify equally well with the same
“universal” primers
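The copy-number issue raised above can be illustrated with a small sketch: if a taxon carries several copies of the 16S rRNA gene, its read count overstates its cell abundance, so one simple correction is to divide by the copy number before computing proportions. The taxa, read counts and copy numbers here are hypothetical, and in practice copy numbers are unknown for many organisms.

```python
def correct_for_copy_number(read_counts, copy_numbers):
    """Divide each taxon's reads by its gene copy number, then renormalise."""
    adjusted = {taxon: n / copy_numbers[taxon] for taxon, n in read_counts.items()}
    total = sum(adjusted.values())
    return {taxon: value / total for taxon, value in adjusted.items()}


# Hypothetical example: taxon_A has 7 copies of the gene, taxon_B has 1.
read_counts = {"taxon_A": 700, "taxon_B": 100}
copy_numbers = {"taxon_A": 7, "taxon_B": 1}

raw_total = sum(read_counts.values())
raw = {taxon: n / raw_total for taxon, n in read_counts.items()}
corrected = correct_for_copy_number(read_counts, copy_numbers)

print(raw)
print(corrected)
```

In this toy case the raw reads suggest taxon_A makes up 87.5% of the community, while the corrected estimate puts the two taxa at equal abundance.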
Summary
Basic introduction, basic method, one of many ways of
analysing a metabarcoded dataset.
Increasingly popular way of extracting the genomes of microorganisms.
Direct insight into communities without the need for culturing
Culture-based and sequencing-based methods may recover
different proportions of organisms