Complementary DNA Sequencing: Expressed Sequence Tags and Human Genome Project
Adams MD, Kelly JM, Gocayne JD, Dubnick M,
Polymeropoulos MH, Xiao H, Merril CR, Wu A, Olde B,
Moreno RF, Kerlavage AR, McCombie WR,
and Venter JC*
Presented by Malva “Lisa” Severios
Background
• Partial DNA sequencing was done on more than 600
randomly selected human brain cDNA clones to
generate expressed sequence tags (ESTs).
• With up to 100,000 genes present in the
human genome, 30,000 are expected to
be expressed in the brain.
Expressed Sequence Tags
• Markers for genome mapping.
• Obtained by sequencing randomly selected cDNA
clones.
• Quick method that gives information
about the diversity of genes expressed.
• Can be mapped to chromosomes by FISH,
RH panels, BAC contigs, and polymorphic STSs.
• Uses include: determining relationships between
genes and forensic analysis.
Purpose
• Promote a faster approach to cDNA
characterization.
• Tag most of the human genes expressed
in the brain.
• Up to one-fourth of all genetic diseases affect
neurological functions, indicating the importance
of the genes expressed in the brain.
• Provide a less expensive method to find human
genes.
Purpose…cont.
• Examine diversity of representative cDNA
libraries.
• Identify advantages / disadvantages of
cDNA libraries.
• Determine information content and accuracy of
single-run sequencing reactions.
Experimental Goal
Sequencing random-primed and partial cDNA clones
is more informative for identifying genes and
builds a more useful EST database than sequencing
from the ends of full-length cDNAs.
Therefore, obtain coding sequences in order to
take advantage of the more sensitive peptide
sequence comparisons, in addition to nucleotide
sequence comparisons.
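To make the point about peptide comparisons concrete: an EST that falls in a coding region can be translated in all six reading frames, and each translation can then be compared against a protein database. The following is a minimal sketch in Python, assuming the standard genetic code; the function names are illustrative and are not taken from the paper.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate complete codons; codons with ambiguous bases become 'X'."""
    seq = seq.upper()
    codons = (seq[i:i + 3] for i in range(0, len(seq) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def six_frame_translations(seq):
    """Map frame labels ('+1'..'+3', '-1'..'-3') to peptide strings."""
    frames = {}
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(strand_seq[offset:])
    return frames
```

For example, `six_frame_translations("ATGGCCTAA")["+1"]` gives `"MA*"`, where `*` marks a stop codon. A read taken from an untranslated end of a full-length cDNA would give no meaningful peptide in any frame, which is the motivation for preferring clones that start inside the coding sequence.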
Experiment
• 3 commercial human brain cDNA libraries
~ made from mRNA isolated from the
hippocampus and temporal cortex of a
2-year-old and a fetus ~
• Single-run DNA sequence data obtained
from randomly chosen cDNA clones.
• Tested subtractive hybridization to
increase the # of brain-specific clones.
EST Matches to Human Genes
EST Matches to Human Genes…cont.
Map ESTs to Human Chromosomes
• PCR used to screen somatic cell hybrid
cell lines with defined sets of human
chromosomes → a hybrid containing the human
gene corresponding to a specific EST yields an
amplified fragment.
• The chromosome present in all hybrids
that yielded an amplified fragment is the
location of the EST.
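The assignment rule on this slide can be sketched as a set intersection. The Python below is an illustration only: the panel contents and hybrid names are hypothetical, and the extra step of excluding chromosomes found in PCR-negative hybrids is an added assumption (a common concordance check), not something stated on the slide.

```python
def assign_chromosome(panel, pcr_positive):
    """Return candidate chromosomes for one EST.

    panel        -- dict: hybrid cell line name -> set of human chromosomes
                    retained by that hybrid (hypothetical data layout)
    pcr_positive -- set of hybrid names that yielded an amplified fragment
    """
    positives = [chroms for name, chroms in panel.items() if name in pcr_positive]
    negatives = [chroms for name, chroms in panel.items() if name not in pcr_positive]
    if not positives:
        return set()

    # Chromosomes present in every hybrid that gave a PCR product.
    candidates = set.intersection(*positives)

    # Assumption: a chromosome carried by a PCR-negative hybrid is excluded.
    for chroms in negatives:
        candidates -= chroms
    return candidates


# Hypothetical four-hybrid panel for illustration.
example_panel = {
    "H1": {"1", "7", "X"},
    "H2": {"7", "12"},
    "H3": {"1", "12"},
    "H4": {"7", "X", "21"},
}
example_result = assign_chromosome(example_panel, {"H1", "H2", "H4"})
```

Here `example_result` is `{"7"}`, because chromosome 7 is the only one shared by all three positive hybrids. An empty result or a result with several chromosomes would mean the panel cannot resolve that EST on its own.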
Chromosome Segregation of ESTs
GenBank
• Matching ESTs with human sequences already in
GenBank makes it possible to analyze the accuracy of
double-stranded (ds) automated DNA sequencing.
• GenBank: NIH genetic sequence database, providing a
public collection of DNA sequences.
• Avg accuracy rate: 97.7% for reads of 150 - 400 bases
with < 3% ambiguous base calls.
• 348 ESTs met these criteria and were submitted to
GenBank.
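The two quantities on this slide, the share of ambiguous base calls and the accuracy against a known GenBank sequence, can be computed as in the rough Python sketch below. The thresholds are read directly from the slide and the paper's exact criteria may differ; the identity function assumes the EST and the reference are already aligned without gaps, which is a simplification.

```python
def ambiguous_fraction(seq):
    """Fraction of base calls that are not A, C, G, or T."""
    if not seq:
        raise ValueError("empty sequence")
    ambiguous = sum(1 for base in seq.upper() if base not in "ACGT")
    return ambiguous / len(seq)


def percent_identity(est, reference):
    """Percent of matching positions between two pre-aligned,
    equal-length sequences (assumption: no gaps)."""
    if not est or len(est) != len(reference):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(1 for a, b in zip(est.upper(), reference.upper()) if a == b)
    return 100.0 * matches / len(est)


def passes_filter(seq, min_len=150, max_len=400, max_ambiguous=0.03):
    """Check the length window and ambiguity limit taken from the slide."""
    if not min_len <= len(seq) <= max_len:
        return False
    return ambiguous_fraction(seq) < max_ambiguous
```

For instance, `percent_identity("ACGT", "ACGA")` is `75.0`, and a 200-base read with no ambiguous calls passes the filter while a 40-base read does not.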
EST Similarities in GenBank and PIR Databases
Conclusions
• Single-run DNA sequencing is efficient in
obtaining preliminary data on cDNA clones.
• Found 230 ESTs representing new genes.
• Random selection approach yields a large
number of highly represented (redundant) clones
in the cDNA libraries used ~ NOT GOOD!!
• EST and physical mapping → high resolution
map of the location of genes on chromosomes ~
more efficient and cheaper than genomic
sequencing.
Conclusions…cont.
• Using ESTs will provide a better way of
analyzing chromosomes and discovering
more human genes.
• EST method will result in partial
sequencing of most human brain cDNAs in
a couple of years → further identification of
genes involved in neurological diseases.
What’s next?
• Characterize the 230 new genes found by:
~ complete sequencing and expression
~ chromosome mapping
~ tissue distribution
~ immunology
Debate against cDNA for the Human Genome Project
• With cDNAs, can only find the sequence of
protein-encoding information (i.e., know the sequence
of exons, not introns ~ which are important for
regulation and control).
• Difficulty finding every single mRNA expressed
in all tissues, cell types, and developmental
stages.
• Claim that gene coding regions, and therefore
mRNA sequences, are predictable from genomic
sequences → don’t need large-scale cDNA
sequencing.