Freshman Seminar

Analysis of your 16S rRNA
DNA sequencing
• Most current sequencing projects use the
chain termination method
– Also known as Sanger sequencing, after its
inventor
• Based on action of DNA polymerase
– Adds nucleotides to complementary strand
• Requires template DNA and primer
Chain-termination sequencing
• Dideoxynucleotides
stop synthesis
– Chain terminators
• Included in limited
amounts so that, across
many copies, a chain
terminates at each
position where the
base occurs
• Use four reactions
– One for each base:
A,C,G, and T
Template
3’ ATCGGTGCATAGCTTGT 5’
Sequence reaction products
5’ TAGCCACGTATCGAACA* 3’
5’ TAGCCACGTATCGAA* 3’
5’ TAGCCACGTATCGA* 3’
5’ TAGCCACGTA* 3’
5’ TAGCCA* 3’
5’ TA* 3’
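The products listed above can be reproduced with a short sketch. It assumes the template is written 3' to 5' as on the slide, ignores the primer, and models only the ddATP reaction. The function names are illustrative, not part of any sequencing software.

```python
# Watson-Crick pairing used by DNA polymerase when copying the template.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def synthesized_strand(template_3_to_5):
    """Return the new strand (read 5'->3') for a template written 3'->5'."""
    return "".join(COMPLEMENT[base] for base in template_3_to_5)


def terminated_fragments(template_3_to_5, dideoxy_base):
    """List every product that ends where dideoxy_base was incorporated."""
    strand = synthesized_strand(template_3_to_5)
    return [strand[: i + 1] for i, base in enumerate(strand) if base == dideoxy_base]


template = "ATCGGTGCATAGCTTGT"  # 3'->5', as on the slide
for fragment in reversed(terminated_fragments(template, "A")):
    print(f"5' {fragment}* 3'")
```

Running the same function with "C", "G", and "T" gives the products of the other three reactions.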
Sequence detection
• To detect products of
sequencing reaction
• Include labeled
nucleotides
• Formerly, radioactive
labels were used
• Now fluorescent
labels
• Use different
fluorescent tag for
each nucleotide
• Can run all four
reactions together
TAGCCACGTATCGAA*
TAGCCACGTATC*
TAGCCACG*
TAGCCACGT*
Sequence separation
• Terminated chains need
to be separated
• Requires single-base
resolution
– Must distinguish
chains of X and X+1
bases
• Gel electrophoresis
– Very thin gel
– High voltage
– Works with radioactive
or fluorescent labels
Sequence reading of
radioactively labeled reactions
[Gel figure: lanes A, T, C, G]
• Radioactively labeled
reactions
– Gel dried
– Placed on X-ray film
• Sequence read from
bottom up
• Each lane is a different
base
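Reading from the bottom up works because the shortest chains travel farthest through the gel. A minimal sketch, assuming each band is recorded as the length of its fragment (a simplification; a real gel gives positions, not lengths) and using hypothetical lane data for the sequence TAGCCACGTATCGAACA:

```python
def read_gel(lanes):
    """Read a sequence from four lanes, shortest fragment (bottom band) first.

    lanes maps each base to the fragment lengths seen in that base's lane.
    """
    bands = [(length, base) for base, lengths in lanes.items() for length in lengths]
    bands.sort()  # bottom of the gel = shortest fragment = 5' end of the read
    return "".join(base for _, base in bands)


# Hypothetical band data: one lane per terminator, as on the slide.
lanes = {
    "A": [2, 6, 10, 14, 15, 17],
    "C": [4, 5, 7, 12, 16],
    "G": [3, 8, 13],
    "T": [1, 9, 11],
}
print(read_gel(lanes))  # TAGCCACGTATCGAACA
```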
Capillary electrophoresis
• Newer automated
sequencers use very
thin capillary tubes
• Run all four
fluorescently tagged
reactions in same
capillary
• Can have 96
capillaries running at
the same time
[Figure labels: robotic arm and syringe, 96 glass capillaries, 96-well plate, load bar]
Sequence reading of
fluorescently labeled reactions
• Fluorescently labeled
fragments scanned by a
laser as they pass a
fixed point
• Color picked up by
detector
• Output sent directly to
computer
Summary of chain termination
sequencing
Sequence databases
• What is a database?
– An indexed set of records
– Records retrieved using a query language
– Database technology is well established
• Examples of sequence databases
– GenBank
• Encompasses all publicly available protein and
nucleotide sequences
– Protein Data Bank
• Contains 3-D structures of proteins
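The idea of an indexed set of records retrieved with a query language can be illustrated with Python's built-in sqlite3 module. This is only a toy: the table layout, the accession strings, and the organism names are invented for the example, and it does not reflect how GenBank or the Protein Data Bank are actually stored or queried.

```python
import sqlite3

# An in-memory database holding a few made-up sequence records.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sequences (accession TEXT PRIMARY KEY, organism TEXT, residues TEXT)"
)
# The index lets the database find records by organism without scanning every row.
conn.execute("CREATE INDEX idx_organism ON sequences (organism)")
conn.executemany(
    "INSERT INTO sequences VALUES (?, ?, ?)",
    [
        ("DEMO0001", "Example organism A", "TAGCCACGTATCGAACA"),
        ("DEMO0002", "Example organism B", "TAGCCACGTATC"),
        ("DEMO0003", "Example organism A", "TAGCCACG"),
    ],
)


def find_by_organism(connection, organism):
    """Retrieve records with a query (SQL) rather than by reading every record."""
    rows = connection.execute(
        "SELECT accession, residues FROM sequences WHERE organism = ? ORDER BY accession",
        (organism,),
    )
    return rows.fetchall()


print(find_by_organism(conn, "Example organism A"))
```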
The biological importance of
sequence alignment
• Sequence alignments assess the degree
of similarity between sequences
• Similar sequences suggest similar function
– Proteins with similar sequences are likely to
play similar biochemical roles
– Regulatory DNA sequences that are similar
will likely have similar roles in gene regulation
• Sequence similarity suggests evolutionary
history
– Fewer differences mean more recent
divergence
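Counting differences is the simplest measure of similarity. The sketch below assumes the two sequences are already aligned and of equal length. The function names are illustrative, and the peptides are two of the example sequences used on the alignment slides.

```python
def count_differences(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must already be aligned to the same length")
    return sum(1 for x, y in zip(a, b) if x != y)


def percent_identity(a, b):
    """Share of positions that are identical, as a percentage."""
    if not a:
        raise ValueError("sequences must not be empty")
    return 100.0 * (len(a) - count_differences(a, b)) / len(a)


print(count_differences("QKESGPSRSYC", "QQESGPVRSTC"))  # 3
print(round(percent_identity("QKESGPSRSYC", "QQESGPVRSTC"), 1))  # 72.7
```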
Sequence alignment
• Sequence alignments
search for matches
between sequences
• Two broad classes of
sequence alignments
– Global
– Local
• Alignment can be
performed between
two or more
sequences
QKESGPSSSYC
VQQESGLVRTTC
Global alignment
ESG
ESG
Local alignment
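The shared ESG region can be found automatically. The sketch below uses the longest common substring as a stand-in for local alignment. That is a deliberate simplification: it allows no mismatches or gaps, whereas real local alignment methods score both.

```python
def longest_common_substring(a, b):
    """Return the longest run of characters that appears in both a and b."""
    best_len, best_end = 0, 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # Extend the common run that ended at the previous characters.
                curr[j] = prev[j - 1] + 1
                if curr[j] > best_len:
                    best_len, best_end = curr[j], i
        prev = curr
    return a[best_end - best_len : best_end]


print(longest_common_substring("QKESGPSSSYC", "VQQESGLVRTTC"))  # ESG
```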
The algorithmic problem of aligning
sequences
• Comparison of similar
sequences of similar
length is
straightforward
• How does one deal
with insertions and
deletions (gaps) that
may hide true similarity?
• How does one
interpret minimal
similarity?
QKESGPSRSYC
QQESGPVRSTC
RQQEPVRSTC
QQESGPVRSTC
QKGSYQEKGYC
QQESGPVRSTC
Methods of sequence alignment
• Graphical methods
• Dynamic-programming methods
• Heuristic methods
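As an illustration of the dynamic-programming approach, here is a minimal global alignment in the style of Needleman-Wunsch. The scores (match +1, mismatch -1, gap -1) are assumptions chosen for simplicity. Protein alignments in practice use a substitution matrix and more refined gap penalties.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-1):
    """Return (score, aligned_a, aligned_b) for a global alignment of a and b."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * gap  # a prefix of a aligned against nothing
    for j in range(1, cols):
        score[0][j] = j * gap  # a prefix of b aligned against nothing
    for i in range(1, rows):
        for j in range(1, cols):
            step = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + step,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,       # gap in b
                score[i][j - 1] + gap,       # gap in a
            )
    # Trace back from the bottom-right corner to recover one best alignment.
    out_a, out_b = [], []
    i, j = len(a), len(b)
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            step = match if a[i - 1] == b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + step:
                out_a.append(a[i - 1])
                out_b.append(b[j - 1])
                i, j = i - 1, j - 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[-1][-1], "".join(reversed(out_a)), "".join(reversed(out_b))


total, top, bottom = global_align("RQQEPVRSTC", "QQESGPVRSTC")
print(total)
print(top)
print(bottom)
```

The table has one cell per pair of prefixes, so the work grows with the product of the two sequence lengths. That cost is why heuristic methods are used for searching whole databases.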
A practical example of sequence
alignment
BLAST results
Detailed BLAST results
A pairwise alignment with MASH-1
• HASH-2, a human homolog of MASH-1
– “+” indicates conservative amino acid substitution
– “–” indicates gap/insertion
– XXXX… shows areas of low complexity
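The middle line of a pairwise alignment can be sketched as follows. This simplified version marks identical residues only. Producing the "+" marks would require an amino acid substitution matrix, which is left out here. The function name is illustrative, and the ASCII hyphen "-" is assumed as the gap character.

```python
def match_line(aligned_query, aligned_subject):
    """Build the middle line of a pairwise alignment (identities only)."""
    if len(aligned_query) != len(aligned_subject):
        raise ValueError("aligned sequences must have the same length")
    return "".join(
        q if q == s and q != "-" else " "
        for q, s in zip(aligned_query, aligned_subject)
    )


query = "RQQE--PVRSTC"
subject = "-QQESGPVRSTC"
print(query)
print(match_line(query, subject))
print(subject)
```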