Transcript Genome
Jeff Young, Botanist
x3638
Office: BI412
Office Hours
MWF, 1 - 2 pm
…by appointment.
Arabidopsis thaliana
Genome-based, molecular
study of plant physiology and
environmental responses.
DNA Sequence
Reagent for the 21st Century
“Biology is in the midst of an intellectual and
experimental sea change....
...essentially the discipline is moving from being
largely a data-poor science to becoming a data-rich
science. ”
Vukmirovic and Tilghman, Nature 405, 820-822 (2000)
Data Poor Era
• Great DATA, but you had to get it yourself,
– Collect data from previous day,
– Set-up experiment,
– Lunch,
– Analyze, discuss data,
– Repeat…
Data Rich Era
• Free DATA, more than any one person could ever use.
Course Goals
• Introduce Genome Scale Research,
• Develop and improve skills in reading, analyzing
and understanding primary literature,
• Enjoy, responsibly, the enormous amount of
creativity and genius that is being expended, right
now, in the biological sciences.
Class Evaluation
Reading Assignments
Available online.
Materials will be from the primary literature and journal
reviews.
All materials may be downloaded (for free) for printing;
however, figures are sometimes best viewed on a monitor.
You are responsible for understanding these papers, including all
figures and tables.
You must read each assigned paper prior to lecture (if you want
to do OK).
Recommended (optional) background and supporting materials will
be made available.
Reading Recommendations
Read before class,
Follow references,
– abstracts, if not entire papers, are free online (NCBI: PubMed),
– may contain materials and methods,
Look up words and concepts that aren’t familiar,
Don’t neglect Figures and Tables.
Genomics
…the systematic study of genomes that
begins with large scale DNA sequencing,
– Structural genomics: the study of DNA sequence,
chromatin structure, and DNA physical interactions,
– Functional genomics: how particular DNA
sequences facilitate biological functions,
– Bioinformatics: computational discipline that has
evolved to handle modern biological data...
Hieter P and Boguski M. Science 278, 601-02.
Genomics
... Genomics...is characterized by high throughput
or large-scale experimental methodologies
combined with statistical and computational
analysis of the results.
...the fundamental strategy in a functional
genomics approach is to expand the scope of
biological investigation from studying single
genes or proteins, to studying all genes or proteins
at once in a systematic fashion.
DNA --> mRNA --> Protein
Genome --> Transcriptome --> Proteome
• Genome... the dynamic complement of heritable genetic material,
• Transcriptome... mRNA in a cell, tissue, organ or individual,
– complexity increases as a result of transcriptional control and post-transcriptional modification,
• Proteome... protein in a cell, tissue, organ or individual,
– complexity increases due to post-translational modification, protein-protein interactions, etc.
Modern research integrates data from all of these sources.
Course Contents
• Introduction to Functional Genomics
• Sequencing Complex Genomes
• Environmental Genome Sequencing
• NexGen Technology
• Bioinformatics I
• Bioinformatics II
(Genetics, Mouse Knockouts)
(Protein Biochemistry)
• Reverse Genetics I
(RNAi)
• Reverse Genetics II
(Target Genes)
• Transcriptome I
(Expression Microarray)
• Transcriptome II*
(DNA Microarray)
• Proteomics I+
(Mass Spectrometry, Y2H)
• Student Presentations
Student Presentations
Environmental/Ecological Genomics
Bioinformatics
Canine Genomics
Personal Genome Projects
Malaria Genomics
Comparative Genomics
NexGen Results
Evolutionary Genomics
Sequencing Projects/Results: Mouse, Chicken, Chimpanzee, etc.
Systems Biology
Others, with approval.
GENOMICS
Controversial From the Start
Objection #1: Big Biology Is Bad Biology
Objection #2: Why Sequence the Junk?
Objection #3: Impossible to Do!
Besides, who’d want to do it?
"Absurd," "dangerous," and "impossible," scoffed numerous
critics, who noted that the technology did not exist to sequence
a bacterium, much less a human. And even if the project's
starry-eyed proponents could by some miracle pull it off, who
would want the complete sequence data anyway? [1]
In the late 1970s, an entire doctoral thesis might be devoted
to reporting the sequence of a gene of several thousand DNA
bases. [2]
[1] Science 291 (5507), 1182-1188
[2] Science 287 (5459), 1777-1782
Decade  Gene Sequencing              Ph.D. Projects
1970s   Thesis Title                 Sequence Gene
1980s   Chapter Title                Sequence Gene(s) + Mol. Analysis
1990s   Material and Methods entry   Mol. Analysis + Pre-Genomics
2000s   Reference to Database        Post-Genom. + System Analysis
Science 291 (5507), 1182-1188
In the 1980s…
Sydney Brenner
... facetiously suggested that project leaders parcel out the job to
prisoners as punishment--the more heinous the crime, the bigger
the chromosome they would have to decipher.
Who wanted to do it?
It turns out a lot of people did.
...with the help of lots of machines.
• “This once-ludicrous proposal became one of the most
hotly contested--and contentious--races in recent
scientific history.”
• “Although the race has been dominated in the past
few years by the acrimonious feud between the
public and private teams, tensions go way back…”
Science 291 (5507), 1182-1188
Objection #1: Big Biology Is Bad Biology
• Researchers feared that a massive sequencing project
would siphon precious dollars from investigator-initiated
research, destroying the cottage industry culture of biology
in the process.
– 1988, US Congress agreed to fund the HGP separately.
• ...just as bad, the project didn't even amount to hypothesis-driven science at all. Rather, critics charged, it was no
more than a big fishing expedition, a mindless factory
project that no scientists in their right minds would join.
Science 291 (5507), 1182
Hypothesis vs. Discovery
• "Discovery science has absolutely revolutionized biology,"
says Leroy Hood, now director of the Institute for Systems
Biology in Seattle, Washington...
• "...it's given us new tools for doing hypothesis-driven
research," maintains Hood, and these tools help rather than
hinder individual investigators.
Science 291 (5507), 1182
Objection #2: Why Sequence the Junk?
• ~2% of the human genome codes for polypeptides,
– why not sequence only the 60 million bases that “make
something”?
• besides, sequencing the rest, often called “junk DNA”,
– “...(it) would be a waste of time and money to include the
repetitive, hard-to-sequence regions in the genome project.”
Science 291 (5507), 1184
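A quick arithmetic check of the "2% of the genome" figure on this slide, as a minimal sketch. The genome size of roughly 3 billion bases is an assumption (it is not stated on the slide); only the ~2% coding fraction comes from the slide.

```python
# Assumption: a haploid human genome of roughly 3 billion bases.
# This number is NOT on the slide; it is used here for illustration only.
GENOME_BASES = 3_000_000_000

# From the slide: ~2% of the genome codes for polypeptides.
CODING_PERCENT = 2

# Integer arithmetic avoids floating-point rounding in the result.
coding_bases = GENOME_BASES * CODING_PERCENT // 100
noncoding_bases = GENOME_BASES - coding_bases

print(f"coding:     {coding_bases:,} bases")
print(f"non-coding: {noncoding_bases:,} bases")
```

Under that assumption, the coding portion comes to 60 million bases, which is why skipping the remaining 98% looked like a large saving.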
Why Sequence the Junk?
• Promoters!
– control expression.
• Telomeres!
– prevent the ends of the chromosome from fraying during cell division and
help determine a cell's life-span.
• Repetitive and non-protein-coding sequences!
– play a crucial role in X chromosome inactivation,
• play a similar role in the regulation of other genes/genomic regions,
– play a role in genome surveillance/protection,
– noncoding DNA (may) provide "a built-in plasticity that ... if an organism
is going to evolve, may be a huge selective advantage."
• Other?
Science 291 (5507), 1184
Objection #3: Impossible to Do
• State-of-the-art sequencing could produce about 500 bases
per 8 hours per rig, working day in and day out,
– and the computer technology that came to play such a vital role in
the project wasn't even invented yet.
• "In the early days, it was believed that a radical new
technology would be required to sequence the full human
genome,
– ...but it didn't turn out that way.”
- Stanford University geneticist David Botstein.
Science 291 (5507), 1186
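To see why the project looked impossible, the throughput on this slide can be turned into a rough time estimate. This is a back-of-the-envelope sketch, not a figure from the source: the ~3 billion base genome size and round-the-clock operation are assumptions, and the estimate ignores the repeated coverage that real sequencing needs. The function name `rig_days` is hypothetical.

```python
# From the slide: about 500 bases per 8 hours per rig.
BASES_PER_RUN = 500
HOURS_PER_RUN = 8

# Assumptions (not on the slide): rigs run 24 hours a day, and the
# genome is roughly 3 billion bases, each sequenced exactly once.
HOURS_PER_DAY = 24
GENOME_BASES = 3_000_000_000


def rig_days(genome_bases: int, rigs: int = 1) -> float:
    """Days needed to read genome_bases once with the given number of rigs."""
    bases_per_day = BASES_PER_RUN * (HOURS_PER_DAY / HOURS_PER_RUN) * rigs
    return genome_bases / bases_per_day


one_rig_days = rig_days(GENOME_BASES)
one_rig_years = one_rig_days / 365

print(f"one rig: {one_rig_days:,.0f} days (~{one_rig_years:,.0f} years)")
print(f"1,000 rigs: ~{rig_days(GENOME_BASES, rigs=1000) / 365:.1f} years")
```

Under these assumptions a single rig would need about two million days, which is the scale of the gap that the incremental improvements on the next slide had to close.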
Not Revolution, Evolution
• radioactive probes --> fluorescent probes,
– allowed automated, laser-based detection,
• slab gels --> capillary tubes,
• automation and computer technology.
"It was definitely evolution...but you can go a long
way with evolution.”
...David Baltimore, president of the California Institute of Technology.
Science 291 (5507), 1186
Presently
> 206 Gb (Dec. 2007)
> 165,000 organisms
Reference, GOLD
Nature Reviews Genetics
Friday: pp. 302 - 307 (figs 1 - 3)