PowerPoint® Lecture Presentations prepared by
John Zamora, Middle Tennessee State University
CHAPTER 6
Microbial Genomics
© 2015 Pearson Education, Inc.
Entering the era of Science Fiction
Sequencing the Oceans
The Human Ecosystem
Artificial Genomes
Personalized Genomics
6.1 Introduction to Genomics
• Genome
• Entire complement of genetic information
• Includes genes, regulatory sequences, and noncoding
DNA
• Genomics
• Discipline of mapping, sequencing, analyzing, and
comparing genomes
• Sequencing: determining the precise order of nucleotides
in a DNA or RNA molecule
6.2 Sequencing Genomes
• Sanger method
• Dideoxy analogs of
dNTPs used in
conjunction with
dNTPs
• Analog prevents
further extension of
DNA chain
• Bases are labeled with
radioactivity
• Gel electrophoresis is
then performed on
products
Frederick Sanger
Figure 6.1
6.2 Sequencing Genomes
Frederick Sanger
Figure 6.2a
6.2 Sequencing Genomes
G
A
T
C
Sequence from bottom:
5’ – A G C T A A G – 3’
Sequence of unknown strand:
3’ – T C G A T T C – 5’
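The gel-reading step can be sketched in a few lines of Python. This is only an illustration: the lane data format and the function names `read_gel` and `complement` are assumptions made for this example, with fragment lengths chosen to reproduce the read shown on the slide.

```python
def read_gel(lanes):
    """Read a Sanger gel from bottom (shortest fragment) to top.

    `lanes` maps each base (the dideoxy analog used in that lane) to the
    lengths of the terminated fragments seen in that lane.
    This input format is hypothetical, for illustration only.
    """
    by_length = {}
    for base, lengths in lanes.items():
        for n in lengths:
            by_length[n] = base
    # Shortest fragments migrate farthest, so sorting by length reads 5'->3'.
    return "".join(by_length[n] for n in sorted(by_length))


def complement(seq):
    """Return the base-by-base complement (not reversed)."""
    pairs = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(pairs[b] for b in seq)


# Lane data chosen to reproduce the example read on the slide.
lanes = {"G": [2, 7], "A": [1, 5, 6], "T": [4], "C": [3]}
synthesized = read_gel(lanes)        # newly synthesized strand, 5'->3'
template = complement(synthesized)   # unknown strand, written 3'->5'
print(synthesized, template)
```

Because the complement is not reversed, the second string lines up under the first and is read 3' to 5', as on the slide.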
Frederick Sanger
Figure 6.2b
6.2 Sequencing Genomes
Radioactivity
replaced by
fluorescent dye
Sequence from bottom:
5’ – A G C T A A G – 3’
Sequence of unknown strand:
3’ – T C G A T T C – 5’
Frederick Sanger
Figure 6.2c
6.2 Sequencing Genomes
Virtually all genomic sequencing projects use shotgun sequencing
Entire genome is cloned, and resultant clones are sequenced
Much of the sequencing is redundant (7-10 fold = DEPTH)
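Sequencing depth is simple arithmetic: total bases sequenced divided by genome size. The sketch below works one example. The genome size, read length, and read count are illustrative numbers, not values from the lecture, and the Poisson estimate of uncovered bases is an idealized model that assumes reads land uniformly at random.

```python
import math


def coverage_depth(num_reads, read_length, genome_size):
    """Average depth = total sequenced bases / genome size."""
    return num_reads * read_length / genome_size


def fraction_unsequenced(depth):
    """Poisson estimate of the fraction of bases covered by no read.

    Assumes reads are placed uniformly at random (an idealization).
    """
    return math.exp(-depth)


# Illustrative numbers only: a 4.6 Mbp genome and 800-bp reads.
genome_size = 4_600_000
read_length = 800
num_reads = 46_000

depth = coverage_depth(num_reads, read_length, genome_size)
missing = fraction_unsequenced(depth)
print(f"depth = {depth:.1f}x, expected uncovered fraction = {missing:.2e}")
```

Under this model, redundancy in the 7-10 fold range leaves well under 0.1% of bases unsampled, which is one way to see why shotgun projects accept so much repeated sequencing.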
6.2 Sequencing Genomes – 2nd GEN
• Second-generation DNA sequencing
• Generates data 100x faster than Sanger method
• Massively parallel methods = DEPTH
• Large number of amplified samples sequenced side by side
• Uses increased computer power and miniaturization
• 454 and Illumina: Enzymes generate light, which is
quantified
6.2 Sequencing Genomes – 2nd GEN
• Illumina
6.2 Sequencing Genomes – 3rd GEN
• Pacific Biosciences SMRT (PacBio)
• Single-stranded DNA fragments attached
• Complementary strand synthesized
• Fluorescent tags monitored
“Nanocontainers”
6.2 Sequencing Genomes – 4th GEN
• Ion torrent semiconductor sequencing
Figure 6.4a
6.2 Sequencing Genomes – 4th GEN
• Nanopore sequencing
“Pocket Sequencing”
Figure 6.4b
6.3 Bioinformatics and Annotating Genomes
Figure 6.5
6.3 Bioinformatics and Annotating Genomes
• Functional ORF: an open reading frame that
encodes a peptide or protein
• Computer algorithms used to search for ORFs
• Look for start/stop codons and Shine–Dalgarno
sequences
• ORFs can be compared to ORFs in other
genomes = comparative genomics
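The ORF search described above can be sketched as a scan for a start codon followed by the first in-frame stop codon. This is a minimal illustration rather than a real annotation tool: it checks only the forward strand, accepts only ATG as a start, and ignores Shine–Dalgarno sequences. The function name and the `min_codons` parameter are assumptions of this example.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=3):
    """Return (start, end, frame) tuples for ORFs on the forward strand.

    An ORF here runs from an ATG to the first in-frame stop codon.
    `start` is 0-based; `end` is exclusive and includes the stop codon.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        i = frame
        while i + 3 <= len(seq):
            if seq[i:i + 3] == "ATG":
                # Scan forward in-frame for the first stop codon.
                for j in range(i + 3, len(seq) - 2, 3):
                    if seq[j:j + 3] in STOP_CODONS:
                        if (j - i) // 3 >= min_codons:
                            orfs.append((i, j + 3, frame))
                        i = j  # resume scanning after this stop codon
                        break
            i += 3
    return orfs


# Toy sequence: ATG AAA TTT GGG TAA starting at index 2.
print(find_orfs("CCATGAAATTTGGGTAACC"))
```

A practical gene finder would also scan the reverse complement, allow alternative start codons, and score upstream ribosome-binding sites.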
6.3 Bioinformatics and Annotating Genomes
Figure 6.6
6.3 Bioinformatics and Annotating Genomes
• Number of genes with role that can be clearly
identified in a given genome is 70% or less of total
ORFs detected
• Hypothetical proteins: uncharacterized ORFs;
proteins that likely exist but whose function is
currently unknown
• BIOPROSPECTING
Enzyme Discovery
& Engineering
6.4 Genome Size and Content
• Genome size correlates with the number of ORFs
• On average, a prokaryotic gene is 1,000 bp long
• ~1,000 genes per megabase
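The rule of thumb of roughly 1,000 genes per megabase gives a quick estimate of gene count from genome size. The sketch below is only that rough estimate; the function name is hypothetical and the genome sizes are illustrative, not taken from the lecture.

```python
def estimate_gene_count(genome_size_bp, genes_per_mbp=1000):
    """Rough prokaryotic gene count from genome size (rule of thumb only)."""
    return round(genome_size_bp / 1_000_000 * genes_per_mbp)


# Illustrative genome sizes in base pairs.
for size in (600_000, 4_600_000, 9_000_000):
    print(f"{size / 1e6:.1f} Mbp -> ~{estimate_gene_count(size)} genes")
```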
Figure 6.7
6.4 Genome Size and Content
NCBI Genomes Jan 25 = 94,992
Plus incomplete, unassembled, and unannotated…
6.4 Genome Size and Content
• Comparative analyses allow for predictions of metabolic
pathways and transport systems
Figure 6.9
6.4 Genome Size and Content
• Percentage of an organism's genes devoted to a specific cell function
is to some degree a function of genome size
• Genes for replication and translation are indispensable; regulation increases in
complexity with genome size
Figure 6.10
6.5 Genomes of Organelles
• Chloroplast genomes encode proteins required for
photosynthetic reactions and
CO2 fixation
• Also encode rRNA used in
chloroplast ribosomes, tRNA
for translation, and several
proteins used in transcription
and translation
• Some chloroplast proteins
are encoded in the nucleus
Figure 6.11
6.5 Genomes of Organelles
• Mitochondrial genomes primarily encode proteins for
oxidative phosphorylation
• Use simplified genetic codes
rather than "universal" code
• Some contain small
plasmids
• Mammalian mitochondria
encode 13 proteins
Figure 6.12
6.5 Genomes of Organelles
• Many insects and some
other invertebrates contain
symbiotic bacteria
• Symbionts are not capable
of independent life
• Restricted genomes
• Host receives essential
amino acids and other
nutrients
Figure 6.12
6.6 Eukaryotic Microbial Genomes
• The haploid yeast genome
• Entire genome is ~13,400 kbp
• Encodes ~6,000 ORFs
• ~4,000 encode proteins with
known functions
• About 900 ORFs are essential (single deletions)
• Contains a large amount of repetitive DNA
• Genes contain introns
6.6 Eukaryotic Microbial Genomes
Avg introns = 10 per gene
Figure 6.16
III. Functional Genomics
“From the growth of the Internet through to the mapping of the human genome and our
understanding of the human brain, the more we understand, the more there seems to be for
us to explore.”
- Martin Rees
Explore 6.1
6.7 Microarrays and the Transcriptome
• Transcriptome
• The entire complement of RNA produced under a given
set of conditions
• Microarrays
• Small solid-state supports to which genes or portions of
genes are fixed and arrayed spatially in a known pattern
6.7 Microarrays and the Transcriptome
• DNA segments on
arrays are hybridized
with mRNA from cells
grown under specific
conditions and
analyzed to determine
patterns of gene
expression
Figure 6.17
6.7 Microarrays and the Transcriptome
• What can be learned from microarray
experiments?
• Global gene expression
• Expression of specific groups of genes under different
conditions
• Expression of genes with unknown function; can yield
clues to possible roles
• Identification of specific organisms
6.7 Microarrays and the Transcriptome
• RNA-SEQ: all RNA molecules from a cell are sequenced
• Don’t need a genome or other template
• Quantitative: can be used to determine induction
Exponential Phase:
4.5 hours
Stationary Phase:
14 hours
Figure 6.19
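Because RNA-Seq is quantitative, induction between two conditions can be expressed as a fold change in normalized read counts. The sketch below is one simple way to do this, not the method behind the figure: the gene names and counts are invented for illustration, counts-per-million is only one of several normalizations, and the pseudocount is an assumption that avoids division by zero.

```python
import math


def counts_per_million(counts):
    """Scale raw read counts so each sample sums to one million."""
    total = sum(counts.values())
    return {gene: c * 1_000_000 / total for gene, c in counts.items()}


def log2_fold_change(cond_a, cond_b, pseudocount=1.0):
    """log2(cond_b / cond_a) per gene after CPM normalization."""
    cpm_a = counts_per_million(cond_a)
    cpm_b = counts_per_million(cond_b)
    return {
        gene: math.log2((cpm_b[gene] + pseudocount) / (cpm_a[gene] + pseudocount))
        for gene in cpm_a
    }


# Hypothetical read counts; gene names are placeholders.
exponential = {"geneA": 500, "geneB": 100, "geneC": 400}
stationary = {"geneA": 250, "geneB": 400, "geneC": 350}

lfc = log2_fold_change(exponential, stationary)
for gene, value in sorted(lfc.items()):
    print(f"{gene}: log2 fold change = {value:+.2f}")
```

In this toy data set, geneB is induced about fourfold in stationary phase (log2 fold change near +2), while geneA drops by about half.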
6.8 Proteomics and the Interactome
• Proteomics
• Study of the structure, function, and regulation of an organism's proteins
• Proteome
• The entire set of proteins expressed by a genome, cell, tissue, or organism
at a certain time
2-D Gel Electrophoresis
Figure 6.20
6.8 Proteomics and the Interactome
• Proteins with >50% sequence similarity typically
have similar functions
• Proteins with >70% sequence similarity almost
certainly have similar functions
• Protein domains
• Distinct structural modules within proteins
• Have characteristic functions that can reveal much
about a protein's role, even in the absence of complete
sequence homology
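The similarity thresholds above can be applied once two protein sequences have been aligned. The sketch below computes percent identity over the ungapped columns of a pairwise alignment. Note the simplifications: identity is a stricter measure than similarity (which also credits conservative substitutions), the alignment itself is assumed to come from another tool, and the peptide strings are made up for illustration.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identical residues over alignment columns without gaps."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == "-" or y == "-":
            continue  # skip gap columns
        compared += 1
        matches += x == y
    return 100.0 * matches / compared if compared else 0.0


def similarity_call(pid):
    """Apply the slide's rule-of-thumb thresholds to a percentage."""
    if pid > 70:
        return "almost certainly similar function"
    if pid > 50:
        return "typically similar function"
    return "no call from sequence alone"


# Made-up aligned peptides ('-' marks a gap).
pid = percent_identity("MKTAYIAKQR", "MKTAHIAK-R")
print(f"{pid:.1f}% -> {similarity_call(pid)}")
```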
6.8 Proteomics and the Interactome
• Interactome
• Complete set of interactions among molecules
Figure 6.22
Interactions are a Social Network
6.9 Metabolomics and Systems Biology
• Metabolome
• The complete set of metabolic intermediates and other
small molecules produced in an organism
• Mass spectrometry is one of the primary
techniques for monitoring metabolites
Figure 6.23
6.9 Metabolomics and Systems Biology
Figure 6.24
6.10 Metagenomics
• Metagenome
• The total gene content of the organisms
present in an environment
• Several environments have been surveyed by large-scale metagenome projects
• Examples: human body, marine ecosystems,
fertile soil
• HMP: Human Microbiome Project
6.10 Metagenomics
Jeff Gordon
The Human Microbiome Project
• The NIH Common Fund Human Microbiome Project (HMP) was
established in 2008, with the mission of generating resources that would
enable the comprehensive characterization of the human microbiome
and analysis of its role in human health and disease.
• The HMP has characterized the microbial communities found at several
different sites on the human body: nasal passages, oral cavity, skin,
gastrointestinal tract, and urogenital tract. The project has examined the
role of these microbes in human health and disease.
• The HMP is an interdisciplinary effort involving four sequencing centers:
The Broad Institute, the Baylor College of Medicine, Washington
University School of Medicine, and the J. Craig Venter Institute.
6.10 Metagenomics
The Human Microbiome Project
The 5 stated aims of the project include:
1. Development of a reference set of 3,000 isolate microbial genomes
2. Initial 16S & mWGS metagenomic studies at each of the 5 target sites
(i.e. "core" microbiomes)
3. Demonstration projects to determine the relationship between disease
and changes in the human microbiome
4. Development of new tools and technologies for computational
analysis, and establishment of data analysis and resource repositories
5. Examination of the ethical, legal and social implications (ELSI)
“Omics” – Summary
Table 6.6
6.11 Gene Families, Duplications, and Deletions
Some Terms:
• Homologous: related sequence that implies
common genetic ancestry
• Gene families: groups of gene homologs
• Paralogs: genes within an organism whose
similarity to one or more genes in the same
organism is the result of gene duplication
• Orthologs: genes found in one organism that are
similar to those in another organism but differ
because of speciation
6.11 Gene Families, Duplications, and Deletions
Figure 6.27
6.11 Gene Families, Duplications, and Deletions
• Gene duplications thought to be mechanism for
evolution of most new genes
“neofunctionalization”
Figure 6.28
6.11 Gene Families, Duplications, and Deletions
• Gene duplications thought to be mechanism for
evolution of most new genes
http://www.personal.psu.edu/rua15/Stage3.jpg
6.11 Gene Families, Duplications, and Deletions
• Deletions can eliminate genes that are no longer needed
• Gene analysis in the three domains of life
suggests that many genes present in all
organisms have common evolutionary roots
6.12 Horizontal Gene Transfer and Genome
Stability
• The transfer of genetic information between organisms, as
opposed to vertical inheritance from parental organism(s)
• May cross phylogenetic domain boundaries
Figure 6.29
6.12 Horizontal Gene Transfer and Genome
Stability
• Detecting horizontal gene flow:
• Presence of genes typically
found only in distantly related
species
• Presence of DNA with GC
content or codon bias that
differs significantly from
remainder of genome
• Horizontally transferred genes typically do not encode core
metabolic functions
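The GC-content criterion above lends itself to a sliding-window scan: compute GC content in each window and flag windows that deviate strongly from the genome-wide value. The sketch below is a toy version under stated assumptions: the window size, step, and threshold are arbitrary choices, the function names are hypothetical, and the "genome" is a synthetic string with an AT-rich block inserted. Real analyses would typically also consider codon bias.

```python
def gc_fraction(seq):
    """Fraction of G and C bases in a sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0


def flag_atypical_windows(genome, window=1000, step=500, threshold=0.10):
    """Return (start, gc) for windows whose GC differs from the genome mean.

    A window is flagged when |window GC - genome GC| > threshold.
    """
    overall = gc_fraction(genome)
    flagged = []
    for start in range(0, len(genome) - window + 1, step):
        gc = gc_fraction(genome[start:start + window])
        if abs(gc - overall) > threshold:
            flagged.append((start, round(gc, 3)))
    return flagged


# Synthetic example: 50% GC flanks around a 1,000-bp block with 0% GC.
genome = "ATGC" * 500 + "AT" * 500 + "ATGC" * 500
flagged = flag_atypical_windows(genome, threshold=0.2)
print(flagged)
```

Only the window that lies entirely inside the inserted block exceeds the 0.2 threshold here; windows that straddle its edges deviate less and are not flagged.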
6.12 Horizontal Gene Transfer and Genome
Stability
• Transposons: pieces of DNA that can move between
chromosomes, plasmids, and viruses
Figure 6.30
6.12 Horizontal Gene Transfer and Genome
Stability
• Transposons may transfer DNA between different
organisms
• Transposons may also mediate large-scale
chromosomal changes within a single organism
• Presence of multiple insertion sequences (IS)
• Recombination among identical IS can result in
chromosomal rearrangements
• Examples: deletions, inversions, or translocations
6.13 Core Genome versus Pan Genome
• The "pan"/"core" concept: genomes of bacterial
species consist of two components
• Core genome: shared by all strains of the species
• Pan genome: includes all the optional extras present in
some but not all strains of the species
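The core and pan genome map naturally onto set operations: the core genome is the intersection of the strains' gene sets, and the pan genome is their union. The sketch below illustrates this with three hypothetical strains. The strain names and the accessory gene names are invented, and real pipelines must first cluster genes into ortholog groups before sets like these can be built.

```python
def core_and_pan(strain_genes):
    """Return (core, pan) gene sets for a dict of strain -> set of genes."""
    gene_sets = list(strain_genes.values())
    core = set.intersection(*gene_sets)  # present in every strain
    pan = set.union(*gene_sets)          # present in at least one strain
    return core, pan


# Hypothetical strains; accessory gene names are placeholders.
strains = {
    "strain_1": {"dnaA", "rpoB", "gyrA", "toxin1"},
    "strain_2": {"dnaA", "rpoB", "gyrA", "phageX"},
    "strain_3": {"dnaA", "rpoB", "gyrA", "toxin1", "abcT"},
}

core, pan = core_and_pan(strains)
accessory = pan - core  # the "optional extras" found in only some strains
print(sorted(core), sorted(accessory))
```

In this representation the pan genome contains the core genome plus the optional extras, so the extras alone are the difference between the two sets.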
6.13 Core Genome versus Pan Genome
Core = Black
Pan = Everything
Figure 6.31
6.13 Core Genome versus Pan Genome
Figure 6.32
6.13 Core Genome versus Pan Genome
Figure 1. Circular representation of the genome of Campylobacter jejuni NCTC 11168.
Genome maps (in order of presentation from outside to inside) are: (A) NCTCK12E5 from an
infected human being; (B) NCTC 11168-GSv; and (C) NCTC 11168-V26. The scale on the
outside of the outermost map represents genome location (x 10^4 bases). Red and orange
bars represent mutations relative to the original annotated reference NCTC 11168-GS
strain deposited in GenBank.
Thomas DT, Lone AG, Selinger LB, Taboada EN, Abbott DW, Inglis GD (2014) Comparative
Variation within the Genome of Campylobacter jejuni NCTC 11168 in Human and Murine Hosts.
PLOS One 9(2):e88229.
6.13 Core Genome versus Pan Genome
• Chromosomal islands believed to have a "foreign"
origin based on several observations
• Extra regions often flanked by inverted repeats
• Base composition and codon usage in chromosomal
islands often differ from rest of genome
• Often found in some strains of a species but not others
6.13 Core Genome versus Pan Genome
• Chromosomal islands: Region of bacterial chromosome of
foreign origin that contains clustered genes for some extra property
such as virulence or symbiosis
GC Content
Red = different
Blue = average
Gene comparison
Red = virulence
Green = conserved
PAI: Pathogenicity Island
CI: Chromosomal Island
Figure 6.33
6.13 Core Genome versus Pan Genome
• Chromosomal islands contribute specialized
functions not essential to growth
• Virulence
• Biodegradation of recalcitrant compounds
• Symbiosis