Bio Sci 203 bb-lecture 6 – DNA sequence analysis
• Bruce Blumberg ([email protected])
– office – 2113E McGaugh Hall
– 824-8573
– office hours MWF 11-12.
• Today
– Characterization of Selected DNA Sequences
• DNA sequence analysis
BioSci 203 blumberg lecture 6
page 1
©copyright
Bruce Blumberg 2001-2005. All rights reserved
How to identify your gene of interest (contd)
• You have one protein and want to identify proteins that interact with it
– some sort of interaction screen is indicated
• straight biochemistry
• phage display
• two hybrid
• in vitro expression cloning
How to identify your gene of interest (contd)
• biochemical approach
– purify cellular proteins that interact with your protein
• co-immunoprecipitation
• affinity chromatography
• biochemical fractionation
– pure protein(s) are microsequenced
• if not in database then make oligonucleotides and screen cDNA library
from appropriate tissues
– advantages
• functional approach
• stringency can be manipulated
• can identify multimeric proteins or complexes
• will work if you can purify proteins
– disadvantages
• much skill required
• low throughput
• considerable optimization required
How to identify your gene of interest (contd)
• Phage display screening (a.k.a. panning)
– requires a library that expresses inserts as fusion proteins with a phage capsid protein
• most are M13 based
• some lambda phages used
– prepare target protein
• as affinity matrix
• or as radiolabeled probe
– test for interaction with library members
• if using affinity matrix you purify phages from a mixture
• if labeling protein, one plates the fusion protein library and probes with the protein
– called receptor panning based on similarity with panning for gold
How to identify your gene of interest (contd)
• Phage display screening (a.k.a. panning) (contd)
– advantages
• stringency can be manipulated
• if the affinity matrix approach works the cloning could go rapidly
– disadvantages
• Fusion proteins bias the screen against full-length cDNAs
• Multiple attempts required to optimize binding
• Limited targets possible
• may not work for heterodimers
• unlikely to work for complexes
• panning can take many months for each screen
How to identify your gene of interest (contd)
• Two hybrid screening
– originally used in yeast; other systems are now possible
– prepare bait: target protein fused to a DNA-binding domain (DBD), usually GAL4
• stable cell line is commonly used
– prepare fusion protein library with an activation domain
– What is the key factor required for success?
• No activation domain in the bait!
– approach
• transfect library into cells and either select for survival or screen for activation of a reporter gene
• purify and characterize positive clones
How to identify your gene of interest (contd)
• Two hybrid screening (contd)
– advantages
• seems simple and inexpensive on its face
– in materials
• functional assay
– disadvantages
• fusion proteins bias the screen against full-length cDNAs.
• Binding parameters not manipulable
• Difficult or impossible to detect interactions between proteins and
complexes.
• Doesn’t work for secreted proteins
• Many months to screen
– savings in materials are eaten up by salaries
– avg grad student costs $30k/year
– avg postdoc or tech costs $40k/year
• MANY false positives
How to identify your gene of interest (contd)
• In vitro interaction screening
– based on in vitro expression cloning (IVEC)
• transcribe and translate cDNA libraries in vitro into small pools of
proteins (~100)
• test these proteins for their ability to interact with your protein of
interest
– EMSA, co-ip, FRET, SPA
– advantages
• functional approach
• smaller pools increase sensitivity
• automated variant allows diversity of targets
– proteins, protein complexes, nucleic acids, protein/nucleic acid
complexes, small molecule drugs
– very fast
– disadvantages
• can’t detect heterodimers unless 1 partner known
• expensive consumables (but cheap salaries)
– typical screen will cost $10-15K
• expense of automation
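How many pools such a screen needs can be estimated with standard library-screening arithmetic. The sketch below is illustrative only: the function name, the target frequency and the 95% confidence level are assumptions, and only the pool size of about 100 comes from the slide.

```python
import math

def pools_needed(target_freq, pool_size=100, confidence=0.95):
    """Estimate clones and pools to screen to see a target at least once.

    target_freq is the assumed fraction of library clones that encode an
    interactor. Uses P(at least one hit in n clones) = 1 - (1 - f) ** n.
    """
    if not 0 < target_freq < 1 or not 0 < confidence < 1:
        raise ValueError("target_freq and confidence must be between 0 and 1")
    clones = math.ceil(math.log(1 - confidence) / math.log(1 - target_freq))
    pools = math.ceil(clones / pool_size)
    return clones, pools

# Hypothetical target present in 1 of every 10,000 clones
clones, pools = pools_needed(target_freq=1e-4)
print(clones, pools)
```

For a target present in 1 of 10,000 clones this comes to roughly 30,000 clones, or about 300 pools of 100.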
Analysis of genes and cDNAs
• Characterization of cloned DNA (what do we want to know about a new gene?)
– Complete DNA sequence
• cDNA sequence
• genomic sequence? (promoters, introns and exons)
• Restriction enzyme maps?
– where is the promoter(s)?
• Alternative promoter use?
• Mapping transcription start(s)
– where and when is mRNA expressed?
• How abundantly is it expressed in each place?
• association between expression levels and putative function?
– What is the function of this gene?
• Loss-of-function analysis decisive
– Knockout or mutation
– Knockdown (morpholino antisense, siRNA)
– mutant mRNA e.g. dominant negative
• gain of function may be helpful
– transgenic
– mutant mRNA - constitutively active transcription factor
DNA Sequence analysis
• Complete DNA sequence (all nts both strands, no gaps)
– complete sequence is desirable but takes time
• how long depends on size and strategy employed
– which strategy to use depends on various factors
• how large is the clone?
– cDNA
– genomic
• How fast is sequence required?
• sequencing strategies
– primer walking
– cloning and sequencing of restriction fragments
– progressive deletions
• Bidirectional, unidirectional
– Shotgun sequencing
• whole genome
• with mapping
– map first (C. elegans)
– map as you go (many)
DNA Sequence analysis (contd)
• Primer walking - walk from the ends with oligonucleotides
– sequence, back up ~50 nt from end, make a primer and continue
– Why back up?
• To get adequate overlap
• May not get readable sequence within 50 nt of the primer with current sequencing
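The arithmetic of a walk can be sketched as follows. This is a minimal sketch under stated assumptions: the 600 nt read length and the function name are hypothetical, and only the ~50 nt back-up comes from the slide. Primer length and the unreadable stretch next to each primer are ignored.

```python
def primer_walk_rounds(insert_len, read_len=600, backup=50):
    """Count sequencing rounds needed to walk along one strand of an insert.

    The first round reads read_len nt. Each later primer sits backup nt
    before the end of the previous read, so it adds read_len - backup new nt.
    """
    if read_len <= backup:
        raise ValueError("read_len must exceed backup")
    rounds = 1
    covered = read_len
    while covered < insert_len:
        covered += read_len - backup
        rounds += 1
    return rounds

print(primer_walk_rounds(3000))  # rounds for a 3 kb insert, one strand
```

Under these assumptions a 3 kb insert takes 6 rounds on one strand, each requiring a new oligonucleotide designed from the previous read.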
DNA Sequence analysis (contd)
• Primer walking (contd)
– advantages
• very simple
• no possibility to lose bits of DNA
– restriction mapping
– deletion methods
• no restriction map needed
• best choice for short DNA
– disadvantages
• slowest method
– about a week between sequencing runs
• oligos are not free (and not reusable)
• not feasible for large sequences
– applications
• cDNA sequencing when time is not critical
• targeted sequencing
– verification
– closing gaps in sequences
DNA Sequence analysis (contd)
• Cloning and sequencing of restriction fragments
– once the most popular method
• make a restriction map, subclone fragments
• sequence
– advantages
• straightforward
• directed approach
• can go quickly
• cloned fragments often useful otherwise
– RNase protection, nuclease mapping, in situ hybridization
– disadvantages
• possible to lose small fragments
– must run high quality analytical gels
• depends on quality of restriction map
– mistaken mapping -> wrong sequence
• restriction site availability
– applications
• sequencing small cDNAs
• isolating regions to close gaps
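Once some sequence is known, the expected fragments from a digest can be predicted and compared with the gel. The following is a rough sketch, not a replacement for a real mapping tool: the function name and the toy sequence are made up for illustration, and it scans only the top strand of a linear molecule. EcoRI (G^AATTC, cutting one base into the site) is used as the example enzyme.

```python
def restriction_fragments(seq, site, cut_offset):
    """Return fragment lengths from a complete digest of a linear sequence.

    site is the recognition sequence; cut_offset is the number of bases
    from the start of the site to the cut on the top strand.
    """
    seq = seq.upper()
    site = site.upper()
    cuts = []
    start = 0
    while True:
        i = seq.find(site, start)
        if i == -1:
            break
        cuts.append(i + cut_offset)
        start = i + 1  # step by one so overlapping sites are not missed
    bounds = [0] + cuts + [len(seq)]
    return [b - a for a, b in zip(bounds, bounds[1:])]

# Toy 21 nt sequence with two EcoRI sites
toy = "AAA" + "GAATTC" + "CCCC" + "GAATTC" + "TT"
print(restriction_fragments(toy, "GAATTC", 1))  # [4, 10, 7]
```

Very short predicted fragments are the ones most easily lost on a gel, which is the risk noted under disadvantages.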
DNA Sequence analysis (contd)
• nested deletion strategies - sequential deletions from one end of the clone
– cut, close and sequence
• Approach
– make restriction map
– use enzymes that cut in polylinker and insert
– Religate, sequence from end with restriction site
– repeat until finished, filling in gaps with oligos
• advantages
– Fast, simple, efficient
• disadvantages
– limited by restriction site availability in vector and insert
– need to make a restriction map
DNA Sequence analysis (contd)
• nested deletion strategies (contd)
– Exonuclease III-mediated deletion
• cut with polylinker enzyme
– protect ends
» 3’ overhang
» phosphorothioate
• cut with enzyme between first cut and the insert
– must not leave a 3’ overhang (Exonuclease III needs a blunt or 5’-overhang end)
• timed digestions with Exonuclease III
• stop reactions, blunt ends
• ligate and size select recombinants
• sequence
• advantages
– unidirectional
– processivity of enzyme gives nested deletions
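Planning the timed digestions is simple arithmetic if the enzyme is assumed to digest at a roughly constant rate. In the sketch below the function name and the 300 nt/min rate are placeholders, not measured values; the real rate depends on temperature and reaction conditions and has to be calibrated.

```python
def exo_deletion_series(insert_len, rate_nt_per_min, interval_min):
    """List (time, expected remaining insert length) for timed aliquots.

    Assumes a constant digestion rate from one end of the insert, which
    is an idealization of the timed-digestion step.
    """
    if rate_nt_per_min <= 0 or interval_min <= 0:
        raise ValueError("rate and interval must be positive")
    series = []
    t = 0
    while insert_len - rate_nt_per_min * t > 0:
        series.append((t, insert_len - rate_nt_per_min * t))
        t += interval_min
    return series

# Hypothetical 3 kb insert, aliquots every 2 minutes
for minutes, remaining in exo_deletion_series(3000, 300, 2):
    print(minutes, remaining)
```

The spacing between successive deletions (rate times interval) should stay below the usable read length, otherwise adjacent clones will not overlap.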
DNA Sequence analysis (contd)
• Nested deletion strategies
– Exonuclease III-mediated deletion (contd)
• disadvantages
– need two unique restriction sites flanking insert on each side
– best used successively to get > 10kb total deletions
– may not get complete overlaps of sequences
» fill in with restriction fragments or oligos
• applications
– method of choice for moderate size sequencing projects
» cDNAs
» genomic clones
– good for closing larger gaps
Large-Scale DNA Sequence analysis
• Shotgun sequencing NOT invented by Craig Venter
– Messing 1981 first description of shotgun
– Sanger lab developed current methods in 1983
– approach
• blast genome into small chunks
• clone these chunks
– 3-5 kb, 8 kb plasmid
– 40 kb fosmid, to jump repetitive sequences
• sequence + assemble by computer
– A priori difficulties
• how to get nice uniform distribution
• how to assemble fragments
• what to do about repeats?
• How to minimize sequence redundancy?
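The "assemble by computer" step can be illustrated with a toy greedy overlap merger. This is a sketch only: the function names and the 25 nt test sequence are invented for illustration, it assumes error-free reads from a single strand, and real assemblers are far more elaborate. It also shows why repeats are a problem, since any repeat longer than the minimum overlap can join the wrong reads.

```python
def overlap(a, b, min_len):
    """Length of the longest suffix of a that matches a prefix of b."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0

def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of contigs with the longest overlap."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates
    # drop reads wholly contained in another read
    contigs = [r for r in contigs
               if not any(r != s and r in s for s in contigs)]
    while True:
        best_n, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_n:
                        best_n, best_i, best_j = n, i, j
        if best_n == 0:
            return contigs  # nothing left to merge
        merged = contigs[best_i] + contigs[best_j][best_n:]
        contigs = [c for k, c in enumerate(contigs)
                   if k not in (best_i, best_j)] + [merged]

# Three overlapping reads cut from a made-up 25 nt sequence
source = "ATGGCGTACGTTAGCCTAGGATCCA"
reads = [source[0:10], source[6:16], source[12:25]]
print(greedy_assemble(reads))
```

With these three reads the merger rebuilds the original sequence as a single contig; reads with no sufficient overlap would be returned as separate contigs, which is what a gap looks like.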
Large-Scale DNA Sequence analysis (contd)
• Shotgun sequencing (contd)
– How to minimize sequence redundancy?
• Best way to minimize redundancy is map before you start
– C. elegans was done this way - when the sequence was finished,
it was FINISHED
» mapping took almost 10 years
– mapping much too tedious and nonprofitable for Celera
» who cares about redundancy, let’s sequence and make $$
• why does redundancy matter?
– Finished sequence today costs about $0.50/base
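The trade-off between redundancy and gaps can be estimated with the idealized Lander-Waterman model, in which reads land uniformly at random. The sketch below is an assumption-laden illustration: the function names and example numbers are made up, and repeats and cloning bias, which make real projects worse than the model, are ignored.

```python
import math

def shotgun_coverage(genome_len, read_len, n_reads):
    """Idealized estimates for random shotgun reads.

    Returns (coverage, expected fraction of genome sequenced,
    expected number of gaps), assuming uniformly random read starts.
    """
    coverage = n_reads * read_len / genome_len
    missed = math.exp(-coverage)  # chance a given base is in no read
    return coverage, 1 - missed, n_reads * missed

def reads_for_coverage(genome_len, read_len, coverage):
    """Number of reads needed to reach a given average coverage."""
    return math.ceil(coverage * genome_len / read_len)

# Hypothetical 1 Mb target sequenced with 500 nt reads
for fold in (1, 2, 4, 8):
    n = reads_for_coverage(1_000_000, 500, fold)
    c, frac, gaps = shotgun_coverage(1_000_000, 500, n)
    print(f"{c:.0f}x  reads={n}  sequenced={frac:.4f}  gaps={gaps:.1f}")
```

Each extra unit of coverage costs the same number of reads but recovers less new sequence: going from 1x to 2x raises the sequenced fraction from about 63% to about 86%, while going from 7x to 8x changes it by less than 0.1%.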
Large-Scale DNA Sequence analysis (contd)
– Mapping by hybridization
– Mapping by fingerprinting
Large-Scale DNA Sequence analysis (contd)
– Map as you go
Large-Scale DNA Sequence analysis (contd)
• Whole genome shotgun sequencing (Celera)
– premise is that rapid generation of draft sequence is valuable
– why bother trying to clone and sequence difficult regions?
• Basically just forget regions of repetitive DNA - not cost effective
– using this approach, it is easy to get to 90% finished
• rule of thumb is that it takes at least as long to finish the last 5% as it
took to get the first 95%
– problems
• sequences done this way may never be complete, as the C. elegans sequence is
• much redundant sequence with many sparse regions and lots of gaps.
• Fragment assembly for regions of highly repetitive DNA is dubious at
best
• “Finished” fly and human genomes lack more than a few already
characterized genes
Large-Scale DNA Sequence analysis (contd)
• How to approach a large new genome, knowing what we know now?
– Xenopus tropicalis 1.7 Gb (about ½ human)
• Whole genome shotgun
• BAC end sequencing
• EST sequencing
– 8x coverage currently
– How to finish?
• Gaps could be closed with BACs
• Finishing dependent on additional funding