TGAC_course_EGPa


Metagenomics Bench and data analysis: concepts,
historical milestones and next advances
Center of Astrobiology, Madrid
Laboratory of Molecular Adaptation
Eduardo González-Pastor
TGAC Norwich, 2014
Metagenomics: From the Bench to Data Analysis
OUTLINE
1. Introduction
• What is the metagenome?
• Why and how to study the metagenome?
- Sequence
- Functional analysis
2. Functional metagenomic approach to search for novel mechanisms of adaptation to extreme environments
• Metal and acid resistance mechanisms in microbial communities from the Río Tinto (Spain)
What is the metagenome?
Metagenome: the genomes of all the microorganisms (viruses included) in an environmental sample. It is studied using culture-independent techniques (“metagenomics”).
Handelsman, J.; Rondon, M. R.; Brady, S. F.; Clardy, J.;
Goodman, R. M. (1998). "Molecular biological access to the
chemistry of unknown soil microbes: A new frontier for
natural products". Chemistry & Biology 5 (10): 245–249.
Why study the metagenome?
Only a small percentage of microorganisms (around 1%) can be cultured (Pace et al., 1985).
For instance, soil microbial communities could contain between 5,000 and 20,000 different species, but only a few (50-200) can be isolated and cultured.
The study of the metagenome provides culture-independent information about the microorganisms of an environmental sample.
[Figure: phylogenetic tree of bacteria (16S rRNA); area: relative abundance of sequences]
How to study the metagenome?
Culture-independent techniques to study microbial communities: metagenomics, metatranscriptomics, metaproteomics, metabolomics
Total DNA isolation from the environmental sample (soil, water, insect guts, human intestine, skin, saliva, etc.)
• Which microbes are in the sample? Analysis of microbial diversity (sequencing of 16S rRNA libraries)
• Construction of metagenomic libraries (in a host that can be cultured and genetically manipulated)
• Sequencing of the metagenome
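Construction of metagenomic libraries
1. Fragmentation of the environmental DNA
2. Insertion of the fragments (inserts) into a vector to give recombinant vectors
3. Introduction into the host: Escherichia coli
4. The resulting metagenomic library is studied by sequence analysis or by functional analysis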
Selecting the appropriate protocol
• Sample type: liquid, solid (soil, sediment, etc.), faeces
• DNA extraction: from the raw sample; after matrix/cell separation; extraction of DNA alone or of DNA and RNA together
• Insert size: short insert (phagemid or plasmid); large insert (fosmid or cosmid); mega-large insert (pBAC)
• Enzymatic or physical
• Host: Escherichia coli, Pseudomonas putida, Bacillus subtilis, Streptomyces, Pichia pastoris
Sequencing of the metagenomic DNA
[Figure: routes from the environmental sample to assembled sequence: direct sequencing of total DNA (pyrosequencing), and sequencing of metagenomic libraries (“shotgun” libraries with 3 kb inserts, end sequencing; plasmid or fosmid isolation). Vector sequences are discarded and the DNA is assembled in silico. Platforms listed: Roche/454 FLX (pyrosequencing), Illumina/Solexa, Applied Biosystems SOLiD]
Sequencing of the metagenomic DNA
Bioinformatic analysis:
• gene annotation
• genome and metabolism reconstruction of microbial communities
• comparison of microbial communities from different environments
1. Rhodopsins in marine bacteria, a new group of phototrophs
Béjà et al., Science 2000
Bacteriorhodopsins
• Proton pumps located in the cytoplasmic membrane of archaea
• Associated with retinal, a chromophore that changes conformation when it absorbs a photon. This induces a conformational change in the protein, which activates proton pumping out of the cell. The proton gradient is then converted into chemical energy
First time that a rhodopsin was discovered in an uncultured bacterium (SAR86 group, γ-Proteobacteria): proteorhodopsin
[Figure: 130 kb genomic fragment carrying the 16S rRNA gene and the rhodopsin gene]
The bacterial proteorhodopsin can be expressed in Escherichia coli, and it is functional:
• it binds retinal (cells are red pigmented)
• it works as a proton pump activated by light
2. Sequencing of the microbial communities from the Sargasso sea
Venter et al., Science 2004
Microorganisms were collected from the Sargasso Sea
Metagenomic DNA was fractionated and libraries were constructed with inserts of 2-6 kbp (“shotgun” sequencing, paired-end sequencing)
• Weatherbird II: 1.66 million sequences (1.36 Gbp)
• Sorcerer II: 325,561 sequences (265 Mbp)
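As a quick consistency check, the sequence counts and base totals quoted for the two Sargasso Sea data sets imply a mean read length of roughly 800 bp in both cases. A minimal sketch of that arithmetic (the function name is illustrative and not taken from the original study):

```python
def mean_read_length_bp(total_bp: float, n_reads: int) -> float:
    """Mean read length in bp: total sequenced bases divided by read count."""
    if n_reads <= 0:
        raise ValueError("n_reads must be positive")
    return total_bp / n_reads


# Figures quoted on the slide for the two data sets.
weatherbird = mean_read_length_bp(1.36e9, 1_660_000)  # 1.36 Gbp, 1.66 million reads
sorcerer = mean_read_length_bp(265e6, 325_561)        # 265 Mbp, 325,561 reads

print(f"Weatherbird II: {weatherbird:.0f} bp per read")
print(f"Sorcerer II:    {sorcerer:.0f} bp per read")
```

Both values fall close to each other, which suggests the two totals were produced with the same sequencing approach.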
1,800 species or phylotypes (148 new)
782 novel rhodopsin receptors from the Sargasso microorganisms, in 13 subfamilies:
• 4 known (cultured organisms)
• 9 from uncultured organisms, 7 of them new
3. Genome reconstruction of microorganisms from acid mine drainage
Tyson et al., Nature, 2004
• Acid mine drainage: process in which water, oxygen and chemolithotrophic
microorganisms interact with sulfide minerals producing very acidic solutions
• Bacterial biofilms floating on acidic water from Richmond Mine (Iron Mountain, California)
(pH 0-1 and high concentrations of the toxic metals Fe, Zn, Cu and As)
Community composition: Leptospirillum group II 75%, Leptospirillum group III 10%, Archaea 10%, Eukaryotes 4%, Sulfobacillus spp. 1%
Labelling of cells (FISH):
• yellow, Leptospirillum
• green, other bacteria
• blue, archaea
Sequencing of the microorganisms from the biofilms of the acidic waters, and reconstruction of their metabolism
Reconstruction of the near-complete genome sequences of the two most abundant microorganisms, Leptospirillum and Ferroplasma, both of which obtain energy from iron oxidation.
The sequence data made it possible to build a model of the biogeochemical cycles driven by the microorganisms in this environment.
4. Comparative metagenomics of microbial communities
Tringe et al., Science 2006
• Comparison of unassembled sequence data obtained by shotgun sequencing of DNA isolated from different environments.
• Quantitative gene content analysis (abundance or absence) reveals habitat-specific fingerprints that reflect known characteristics of the sampled environment.
• Identification of genes or metabolic pathways specific to a particular environment.
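The idea of a habitat-specific fingerprint can be illustrated with a small sketch that compares the relative abundance of functional categories between two samples. This is not the statistical procedure used by Tringe et al.; the category names, the read counts and the pseudocount-based log2 ratio are all assumptions chosen for illustration.

```python
import math
from typing import Dict


def relative_abundance(counts: Dict[str, int]) -> Dict[str, float]:
    """Fraction of annotated reads assigned to each functional category."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample has no annotated reads")
    return {category: n / total for category, n in counts.items()}


def log2_enrichment(sample_a: Dict[str, int], sample_b: Dict[str, int],
                    pseudocount: float = 1.0) -> Dict[str, float]:
    """log2 ratio of relative abundances (A over B) for every category.

    A pseudocount is added to each category so that a category absent
    from one sample does not produce a division by zero.
    """
    categories = set(sample_a) | set(sample_b)
    total_a = sum(sample_a.values()) + pseudocount * len(categories)
    total_b = sum(sample_b.values()) + pseudocount * len(categories)
    scores = {}
    for category in categories:
        freq_a = (sample_a.get(category, 0) + pseudocount) / total_a
        freq_b = (sample_b.get(category, 0) + pseudocount) / total_b
        scores[category] = math.log2(freq_a / freq_b)
    return scores


# Hypothetical read counts per category (illustrative only, not real data).
sargasso = {"rhodopsin": 30, "cellobiose_phosphorylase": 2, "other": 968}
soil = {"rhodopsin": 0, "cellobiose_phosphorylase": 25, "other": 975}

scores = log2_enrichment(sargasso, soil)
for category, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{category:26s} log2(sea/soil) = {score:+.2f}")
```

Positive scores mark categories over-represented in the first sample and negative scores those over-represented in the second; a real analysis would also need a significance test and correction for multiple comparisons.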
Comparison of 8 libraries: 3 from the Sargasso Sea, 3 from whale falls (whale carcasses on the deep-sea floor), 1 from farm soil and 1 from acid mine drainage
[Figure: comparison of libraries from soil, whale falls and the Sargasso Sea. Functions highlighted: bacteriorhodopsin, transport of proline/glycine betaine, cellobiose phosphorylase, photosynthesis, polyketide synthesis (antibiotics)]
COGs: Clusters of Orthologous Groups of proteins
KEGG: Kyoto Encyclopedia of Genes and Genomes (high-order cellular processes)
Functional metagenomics: search for genes expressing a function
• Screening of metagenomic libraries for a particular function (resistance to certain compounds, fluorescence, etc.).
• Many products have been discovered this way: antibiotics, quorum sensing inhibitors or inducers, enzymes of commercial interest, pigments, etc.
Allen HK, Moe LA, Rodbumrer J, Gaarder A, Handelsman J. Functional metagenomics reveals diverse β-lactamases in a remote Alaskan soil. The ISME Journal, 9 October 2008.
2. Functional metagenomic approach to search for novel
mechanisms of adaptation to extreme environments
Study of life in extreme environments
What are the limits of life?
Search for novel molecular mechanisms by which microorganisms adapt to extreme conditions (toxic metals, acidic pH, low and high temperatures, high radiation and high salt concentrations)
Biotechnological applications, bioremediation, biomining…
Bias in the known mechanisms of adaptation: most come from cultured microorganisms
Functional Metagenomic approach
(culture independent)
OUTLINE
1. Search for metal resistance genes in microorganisms from the Río Tinto
• Nickel resistance genes from rhizosphere communities
2. Search for acid pH resistance genes in microorganisms from the Río Tinto
3. Construction of nickel resistant transgenic plants
4. Future: search for adaptation mechanisms in microorganisms from the rhizosphere and phyllosphere of Antarctic plants, and from hypersaline environments
1. Search for metal resistance genes in microorganisms from the Río Tinto
Río Tinto
• The Tinto river flows through the Iberian Pyrite Belt (FeS2), southwestern Spain
• Natural environment (not the result of mining), at least 2,000,000 years old
• Acid mine drainage (AMD): natural process in which water, oxygen and chemolithotrophic microorganisms interact with the pyrite to produce oxidized iron and highly acidic solutions (average pH = 2.3)
[Figure: pyrite (FeS2) oxidation scheme. Iron: Fe2+ is oxidized to Fe3+ (Acidithiobacillus ferrooxidans, Leptospirillum ferrooxidans). Sulfur: reduced sulfur is oxidized to sulfate, SO4(2-), giving H2SO4 (Acidithiobacillus ferrooxidans)]
Acid water and oxidation increase the solubility of other metals and metalloids:
As 380 ppm, Cr 380 ppm, Cu 110 ppm, Zn 220 ppm, Ni 10 ppm
Complex microbial communities (high diversity of eukaryotes, but low diversity of bacteria and archaea in the planktonic phase)
Metagenomic libraries
• Planktonic phase: highly enriched in toxic metals, very low pH, low bacterial diversity (fewer than ten species)
• Rhizosphere of the endemic heather Erica andevalensis: less enriched in heavy metals, pH ~ 4-5, high bacterial diversity (root exudates are rich in nutrients)
[Figure: phylogenetic tree of bacterial 16S rRNA sequences (1,450 bp) from rhizosphere clones; scale bar 0.1]
Bacterial diversity in rhizosphere: Actinobacteria (46.4%), Acidobacteria (26.2%), α-proteobacteria (18%), TM7 (1.2%), γ-proteobacteria (1%)
Reference sequences in the tree: Uncultured acidobacterium (AF200698), Acidobacterium capsulatum (D26171), Uncultured acidobacterium (AB192240), Uncultured planctomycete (AF465657), Uncultured candidate bacterium TM7 (AY225653), Acidiphilium acidophilum (D86511), Acidocella sp. (X91797), Rhodopila globiformis (M59066), Bacterium Ellin 340 (AF498722), Enterobacter dissolvens (Z96079), Conexibacter woesei (AJ440237), Mycobacterium florentinum (AJ616230), Acidimicrobium ferrooxidans (U75647), Uncultured actinomycetales bacterium (X92708)
Mirete et al. Appl. Env. Microbiol, 2007
Construction of metagenomic libraries
• Environmental DNA: partial Sau3AI digestion, inserts of 1-10 kb
• Vector: pBluescript SKII, BamHI digested
• Insert + vector give the recombinant vectors
• Host: Escherichia coli
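Two points behind this cloning strategy can be sketched in code. Sau3AI recognizes GATC and BamHI recognizes GGATCC, and both leave the same GATC overhang, so Sau3AI fragments can be ligated into a BamHI-cut vector. A 4-base site is also expected about once every 4^4 = 256 bp in a sequence with equal base frequencies, so a complete digest would give fragments far shorter than 1-10 kb, which is why the digestion is partial. The helper names below are illustrative, and the equal-base-frequency model is an assumption.

```python
import re
from typing import List

SAU3AI_SITE = "GATC"    # Sau3AI cuts immediately before the G: ^GATC
BAMHI_SITE = "GGATCC"   # BamHI cuts G^GATCC and leaves the same GATC overhang


def site_positions(seq: str, site: str) -> List[int]:
    """0-based start positions of every (possibly overlapping) site."""
    return [m.start() for m in re.finditer(f"(?={site})", seq.upper())]


def complete_digest_fragments(seq: str, site: str = SAU3AI_SITE) -> List[int]:
    """Fragment lengths after cutting a linear sequence at every site.

    The cut is placed at the first base of the site, which matches
    Sau3AI (^GATC); other enzymes would need a cut offset.
    """
    bounds = [0] + site_positions(seq, site) + [len(seq)]
    return [b - a for a, b in zip(bounds, bounds[1:]) if b > a]


def expected_site_spacing(site_length: int) -> int:
    """Expected distance between sites, assuming equal base frequencies."""
    return 4 ** site_length


toy = "AAGATCTTTTGATCAA"  # toy sequence with two Sau3AI sites
print(complete_digest_fragments(toy))
print(expected_site_spacing(len(SAU3AI_SITE)))  # 4-base cutter
print(expected_site_spacing(len(BAMHI_SITE)))   # 6-base cutter
```

Every BamHI site contains a Sau3AI site, but not the reverse, which is why the frequent cutter is used on the environmental DNA and the rare cutter on the vector.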
• Rhizosphere library: 750,000 recombinants, average insert size 2 kb, 1.4 Gbp (~350 bacterial genomes)
• Planktonic library: 30,000 recombinants, average insert size 2.5 kb, 75 Mbp (~19 bacterial genomes)
Next steps: amplification and screening
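The library sizes quoted here follow from simple arithmetic: number of recombinants times average insert size gives the total cloned DNA, and dividing by a typical genome size gives genome equivalents. The sketch below assumes an average bacterial genome of 4 Mbp, a value inferred from the slide's own figures (75 Mbp for about 19 genomes) rather than stated in it.

```python
def library_size_bp(n_recombinants: int, avg_insert_bp: int) -> int:
    """Total cloned DNA in the library, in base pairs."""
    return n_recombinants * avg_insert_bp


def genome_equivalents(total_bp: float, avg_genome_bp: float = 4_000_000) -> float:
    """Library size expressed in average bacterial genomes.

    avg_genome_bp = 4 Mbp is an assumption inferred from the slide.
    """
    return total_bp / avg_genome_bp


planktonic_bp = library_size_bp(30_000, 2_500)    # 30,000 clones x 2.5 kb
rhizosphere_bp = library_size_bp(750_000, 2_000)  # 750,000 clones x 2 kb

print(f"Planktonic:  {planktonic_bp / 1e6:.0f} Mbp, "
      f"~{genome_equivalents(planktonic_bp):.0f} genomes")
print(f"Rhizosphere: {rhizosphere_bp / 1e9:.1f} Gbp, "
      f"~{genome_equivalents(rhizosphere_bp):.0f} genomes")
```

The planktonic figures reproduce the slide exactly. For the rhizosphere library the product is 1.5 Gbp rather than the reported 1.4 Gbp, which would be consistent with an average insert slightly below the rounded 2 kb.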
Screening of metagenomic libraries
1. Selection of resistant clones from the library pool
2. Isolation of individual clones and of their plasmid DNA
3. Retransformation (to discard chromosomal mutations) and confirmation of resistance
4. Digestion of the plasmids (to identify independent clones)
5. Identification of the genes involved in the resistance phenotype: subcloning, in vitro transposon mutagenesis
6. Sequence and annotation
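The sequence and annotation step starts by locating open reading frames (ORFs) in each insert. The sketch below is a deliberately naive six-frame ORF scan (ATG to the first in-frame stop codon), given only to illustrate the idea; it is not the annotation pipeline used by the authors, and the function names and the minimum length threshold are assumptions.

```python
from typing import List, Tuple

STOP_CODONS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence (non-ACGT symbols are kept)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_orfs(seq: str, min_codons: int = 50) -> List[Tuple[str, int, int]]:
    """Return (strand, frame, length in codons) for each ORF found.

    An ORF here runs from the first ATG in a frame to the next in-frame
    stop codon; the stop codon is not counted in the length.
    """
    orfs = []
    for strand, s in (("+", seq.upper()), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i
                elif start is not None and codon in STOP_CODONS:
                    n_codons = (i - start) // 3
                    if n_codons >= min_codons:
                        orfs.append((strand, frame, n_codons))
                    start = None
    return orfs


toy_insert = "ATG" + "GCT" * 5 + "TAA"  # toy sequence: one 6-codon ORF
print(find_orfs(toy_insert, min_codons=3))
```

Real bacterial gene prediction also has to handle alternative start codons, overlapping genes and ribosome binding sites, so dedicated gene finders are normally used before the predicted proteins are compared against databases.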
1.1. Nickel resistance genes from rhizosphere communities
Screening for nickel resistance genes on LB with 2 mM nickel (a toxic concentration for the E. coli host)
13 clones with different DNA fragments inserted (pSM1 to pSM13)
[Figure: drop assay of serial dilutions (undiluted to 10^-4) of clones pSM1 to pSM13 and the pSKII+ control on LB-nickel 2 mM]
Salvador Mirete, Carolina G. de Figueras
• Mirete et al. Appl. Env. Microbiol, 2007
• Gonzalez-Pastor & Mirete, Metagenomics:
methods and protocols, 2010
Intracellular nickel concentration in the resistant clones
[Figure: bar chart of Ni concentration (mg/g dry weight, measured by ICP-MS; axis 0 to 16,000) in DH5α and in clones pSM1 to pSM13]
Active transport of nickel?
[Figure: Ni concentration (mg/g dry weight; axis 0 to 9,000) in the DH5α control and in clones pSM5 and pSM12, with a drop assay of serial dilutions (undiluted to 10^-4) of pSM5 and pSM12. ORF maps of pSM5 and pSM12, each with two ORFs (ORF1 and ORF2; sizes shown: 261, 229, 178 and 298 aa)]
ORF 1: ABC transporter, membrane subunit (48%)
ORF 1: ABC transporter, ATPase subunit (43%)
ORF 2: ABC transporter, ATPase subunit (57%)
ORF 2: ABC transporter, membrane subunit (36%)
ABC transporters (ATP Binding Cassette)
First description of this type of ABC
transporter related to metal export
but not import
Resistance by intracellular protection
[Figure: Ni concentration (mg/g dry weight; axis 0 to 16,000) in the control DH5α (pBluescript) and in DH5α (pSM11), with a drop assay of serial dilutions (undiluted to 10^-4). ORF map of pSM11 with two ORFs, of 253 aa and 74 aa]
Serine O-acetyltransferase (SAT) (51%)
SAT is involved in nickel resistance in plants (Thlaspi)
SAT overexpression in plant cells increases the intracellular levels of reduced glutathione (GSH), which protects against the oxidative stress produced by Ni (Freeman et al., AEM, 2005)
ORFs organization of other nickel resistant clones
• pSM1: unknown and hypothetical proteins
• pSM2: protein of unknown function DUF195; COG1322 (uncharacterized protein conserved in bacteria)
• pSM3: hypothetical protein
• pSM4: DnaA protein
• pSM6: conserved hypothetical protein
• pSM7: acyl-CoA sterol acyltransferase (fungi)
• pSM8: hypothetical protein Cphamn1DRAFT_2587; VrlI-like protein
• pSM9: penicillin-binding protein 1A; Tfp pilus assembly protein, ATPase PilM
• pSM10: similar to amino acid transporters; apolipoprotein N-acyltransferase
• pSM13: conjugal transporter protein TraA
(ORF maps; scale bar 0.5 kb)
Mirete et al. Appl. Env. Microbiol, 2007
Gonzalez-Pastor & Mirete, Metagenomics: methods and protocols,
2010
2. Search for acid pH resistance
genes in microorganisms from
the Río Tinto
Screening by acid shock (pH 1.8) in liquid medium (2 h)
• Libraries: rhizosphere and planktonic; negative control: E. coli DH10B
• Dilution 10^-3 in LB (pH 1.8)
• Incubation at 37 °C with shaking (2 h)
• Plating on LB agar-Ap-Xgal
• DNA digestion: 15 independent clones
[Figure: plates from the library screening (labelled 1AA and A to E) and from the E. coli DH10B control]
María Eugenia Guazzaroni
Guazzaroni et al. Env. Microbiol, 2012
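Survival after an acid shock is usually expressed as the viable count after the treatment relative to the count before it, with viable counts obtained from colonies on plated dilutions. A minimal sketch of that calculation follows; the colony counts and the plated volume of 0.1 ml are hypothetical values, not data from the study.

```python
import math


def cfu_per_ml(colonies: int, dilution: float, plated_volume_ml: float = 0.1) -> float:
    """Viable count (CFU/ml) from one plate.

    dilution is the dilution factor of the plated sample, e.g. 1e-5.
    plated_volume_ml = 0.1 is an assumed plating volume.
    """
    if dilution <= 0 or plated_volume_ml <= 0:
        raise ValueError("dilution and plated volume must be positive")
    return colonies / (dilution * plated_volume_ml)


def percent_survival(cfu_before: float, cfu_after: float) -> float:
    """Viable cells after treatment as a percentage of those before it."""
    if cfu_before <= 0:
        raise ValueError("cfu_before must be positive")
    return 100.0 * cfu_after / cfu_before


# Hypothetical colony counts (illustrative only).
cfu_t0 = cfu_per_ml(colonies=150, dilution=1e-5)  # before the acid shock
cfu_t1 = cfu_per_ml(colonies=45, dilution=1e-5)   # after 1 h at pH 1.8

survival = percent_survival(cfu_t0, cfu_t1)
print(f"{survival:.1f}% survival (log10 = {math.log10(survival):.2f})")
```

Because survival values span several orders of magnitude between resistant clones and the negative control, they are normally plotted on a logarithmic axis.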
[Figure: percent survival at pH 1.8 after 1 h (log scale, 0.001 to 100) for clones A1, A2, A3, A5, A6, B1, B2, D1, D2, D3, 1AA-8, 1AA-10, 1AA-11, 1AA-12 and 1AA-13, compared with DH10B pSKII+ (negative control). Drop assays of dilutions 10^-3, 10^-5 and 10^-7 at T = 0 h and T = 1 h for clone A2 and the negative control]
Guazzaroni et al. Env. Microbiol, 2012
DNA protection
Clone B1 (2,855 bp): glycosyl hydrolase BNR repeat-containing protein; ferritin Dps family protein (marked * on the map)
25% survival at pH 1.8 (1 h)
Dps: DNA-binding protein from starved cells
Some Dps proteins bind DNA nonspecifically, protecting it from cleavage caused by reactive oxygen species.
Guazzaroni et al. Env. Microbiol, 2012
A chaperone involved in acid pH resistance
Clone B2 (1,701 bp): ATP-dependent Clp protease, ATP-binding subunit ClpX; ATP-dependent Clp protease, proteolytic subunit ClpP (marked * on the map)
32% survival at pH 1.8 (1 h)
ClpXP: a two-component protease involved in removing heat-damaged proteins (heat shock). Not previously reported to be involved in acid pH tolerance
• ClpP is the proteolytic subunit
• ClpX is the ATP-binding subunit and works as a molecular chaperone.
Guazzaroni et al. Env. Microbiol, 2012
ORFs organization of other acid pH resistant clones
[Figure: ORF maps of clones A1, A2, A5, D1, D3, 1AA10, 1AA12 and 1AA13 (inserts of 1.3 to 2.4 kb; scale bar 0.2 kb; some ORFs are marked with an asterisk). Annotated ORFs: 4-hydroxy-3-methylbut-2-enyl diphosphate reductase, multi-sensor hybrid histidine kinase, PhoH family protein, alkyl hydroperoxide reductase, amino acid-binding ACT domain-containing protein, LexA repressor, RNA-binding protein, DNA-binding protein HU, Gp45 protein, integrase family protein, and several hypothetical or unknown proteins]
Notes on the maps:
• LexA repressor: repressor of genes in the cellular SOS response to DNA damage (non-active heterodimers?)
• DNA-binding protein HU: involved in DNA repair; plays a positive role in translation of RpoS
• Stringent response (label next to clone D1)
Guazzaroni et al. Env. Microbiol, 2012
Test of the ORFs involved in acid pH resistance in E. coli, and also in Pseudomonas putida and Bacillus subtilis
[Figure: percent survival (log scale, 0.001 to 100) for the (-) control and for each gene tested: Dps protein, RNA-binding protein, ACT domain-containing protein, HU protein, ClpP protease, LexA repressor and hypothetical proteins (HP), including one with no homology; one panel per host]
• E. coli DH10B: pSKII+, ≈500 copies per cell; pH 1.8 (60 min)
• P. putida KT2440: pSEVA, 15-20 copies per cell; pH 3.8 (10 min)
• B. subtilis PY79: gene inserted in the chromosome, promoter induced with IPTG; pH 4.0 (10 min)
3. Construction of nickel resistant transgenic plants
Cloning in pCAMBIA3500 to transform Arabidopsis thaliana
[Figure: map of the T-DNA between the left and right T-borders, with the elements CaMV polyA, phosphinothricin resistance gene, CaMV 35S, 2x CaMV 35S, nickelR (the cloned resistance gene) and CaMV polyA]
• Replication origin of Agrobacterium tumefaciens
• T-DNA from Agrobacterium:
– Three copies of the 35S promoter from Cauliflower Mosaic Virus (CaMV 35S): one to transcribe the phosphinothricin resistance gene (phosphinothricin is the herbicide used to select the transgenic plants), and two to transcribe the gene to be cloned.
– Transcriptional terminator, CaMV polyA
Carolina González de Figueras
Salvador Mirete
3. Construction of nickel resistant transgenic plants
pSM6: Conserved hypothetical protein
pSM7: Acyl-CoA sterol acyltransferase (fungi). This enzyme removes sterol from the membrane, and the sterol accumulates in the cytoplasm.
Could the Ni resistance be explained by changes in membrane permeability?
[Figure: Ni concentration (mg/g dry weight; axis 0 to 10,000) in DH5α, pSM6 and pSM7]
3. Construction of nickel resistant transgenic plants
[Figure: wild-type (Wt) plants and plants transformed with the genes from pSM6 and pSM7]
Third generation of plants transformed with two genes involved in metal resistance, from plasmids pSM6 and pSM7 (125 µg/ml Ni, 18 days)
3. Construction of acid pH resistant transgenic plants
5 individual genes were selected for cloning in the pCAMBIA3500 vector:
• ORF4 (clone B1): ferritin Dps family protein
• ORF5 (clone D3): RNA-binding protein
• ORF9 (clone A5): amino acid-binding ACT domain-containing protein
• ORF14 (clone 1AA10): DNA-binding protein HU
• ORF23 (clone B2): ATP-dependent Clp protease, proteolytic subunit ClpP
M Eugenia Guazzaroni
Carolina González de Figueras
4. Search for adaptation mechanisms in microorganisms from the rhizosphere and phyllosphere of Antarctic plants
Colobanthus quitensis
Deschampsia antarctica
• Microbial diversity of the rhizosphere and phyllosphere
• Metagenomics:
- sequence
- functional (genes involved in cold and radiation adaptation)
Verónica Morgante
4. Search for adaptation mechanisms in microorganisms from hypersaline environments (collaboration with Ramón Rosselló-Móra)
• Hypersaline Antarctic ponds (Bratina Island) [Figure: aerial image of the ponds on Bratina Island]
• Salt flats: Añana (Spain)
• Coastal salt flats: Boyeruca (Chile), Es Trenc (Mallorca)
• Rhizosphere and phyllosphere of Salicornia
• Calonectris diomedea (nostril salt glands)
• Microbial and viral diversity
• Functional diversity: salt resistance, UV radiation resistance, low temperatures, etc. (functional metagenomics, sequencing, and metatranscriptomics in experiments with mesocosms)
CONCLUSIONS
• Small-insert metagenomic libraries have been useful to retrieve genes involved in resistance to toxic metals and acidic pH:
- genes previously described (chaperones, transporters, DNA-binding proteins…)
- hypothetical and unknown genes not previously associated with resistance to these conditions, which can now be annotated
The team……
Carolina González de Figueras
M. Eugenia Guazzaroni
Salvador Mirete Castañeda
Verónica Morgante
Maria Lamprecht
Olga Zafra
Collaborators from CAB
Manuel Gómez
Marina Postigo
M. Paz Martín