David Haussler's Presentation - Research Review Day

The Human Genome Project
and 100 Million Years of Human
Evolution
David Haussler
Center for Biomolecular Science and Engineering
University of California, Santa Cruz
The human genome is a recipe for an
entire body and brain
• The genome is organized into
23 pairs of human chromosomes
(1-22 and the pair X,Y or X,X)
• Each chromosome consists of
DNA – molecular string of A,
C, G, & T (bases), 3 billion in
all
• All cells in the body have the
same DNA that was in the
original fertilized egg
• Genes are DNA sequences that
code for proteins (only about
1.5% of the human genome)
To what extent does a person’s
genome define them?
On July 7, 2000, UCSC posted
the human genome on the web
Outgoing UCSC internet traffic for year 2000
The UCSC Genome Browser: a new kind
of web-based genome microscope
• Data from all over the world are fed into
nightly updates of the UCSC browser database,
analysis, and display
• Every day, more than 7,000 biomedical
researchers use it to scan the genome at ever
greater detail, dimension, and depth, making
more than 300,000 web page requests
Explore the genome at http://genome.ucsc.edu
UCSC Genome Bioinformatics Group
Large-scale Operations in
Genome Evolution
Zack Sanborn
Example: evolutionary history of a
mammalian chromosome
History of rat chromosome X
Jian Ma, Bernard Suh, Brian Raney
Morpheus: new genes by segmental duplication
Baboon
chromosome
Human
chromosome
Evan Eichler
The demise of a gene
Codon TGG for amino acid tryptophan became a stop codon in this gene before
the human-chimp ancestor, killing the gene. Proteins of this type (acyltransferase
3) appear in all branches of life; this was the last in the hominid genome.
Jing Zhu, Zack Sanborn, Craig Lowe
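The slide does not say which base of the TGG codon changed. As a minimal sketch, assuming the standard genetic code, the following enumerates every single-base variant of TGG and reports the ones that are stop codons. The function names are illustrative only.

```python
# Stop codons of the standard genetic code, written in the DNA alphabet.
STOP_CODONS = {"TAA", "TAG", "TGA"}
BASES = "ACGT"


def single_base_neighbors(codon):
    """Return every codon that differs from `codon` at exactly one position."""
    neighbors = []
    for i, original in enumerate(codon):
        for base in BASES:
            if base != original:
                neighbors.append(codon[:i] + base + codon[i + 1:])
    return neighbors


def stop_neighbors(codon):
    """Return the single-base variants of `codon` that are stop codons, sorted."""
    return sorted(n for n in single_base_neighbors(codon) if n in STOP_CODONS)


# TGG encodes tryptophan; list the one-step mutations that end translation.
print(stop_neighbors("TGG"))
```

Under this assumption, two of the nine possible single-base changes to the tryptophan codon (TGG to TAG, or TGG to TGA) introduce a premature stop, which is enough to truncate the protein.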
Project to
reconstruct the
evolutionary
history of the
genomes of
placental
mammals
Data from
NHGRI
Comparative
Genome
Sequencing
Program
Homo sapiens sapiens
Homo sapiens
neanderthalensis
Homo sapiens
chimpanzee
(Pan troglodytes)
Homo/Pan
Gorilla
Homo/Pan/Gorilla
orangutan
Hominidae
(great apes)
gibbon
Hominoidea
(apes)
rhesus macaque
Catarrhini
(old world monkeys
and apes)
marmoset
Anthropoidea
tarsier
Haplorhines
bushbaby
Primates
pygmy tree shrew
Euarchonta
mouse
(Mus musculus
“genomicus”)
Euarchontoglires
Boreoeutheria
common shrew
Eutheria
(placental mammals)
elephant shrew
Tursiops truncatus
Not all descendants of
the eutherian ancestor
are shrew-like
We found 49 genomic regions that showed
extremely accelerated evolution in humans
Human Accelerated Region 1
Katie Pollard and Sofie Salama
HAR1 produces a structured RNA
sequence that is expressed in the fetal brain
New interactions in
the human version
of this gene
Computational prediction of structure conserved throughout amniotes
Jakob Pedersen
The six layers of the cerebral cortex are
built during fetal brain development
During development, the
cerebral cortex is built
“inside-out” by neurons
migrating radially from
the subventricular zone to
the pial surface. This
process is guided by the
neurodevelopmental gene
Reelin.
Image: www.thebrain.mcgill.ca
HAR1 is expressed in the same cells as Reelin (the
Cajal-Retzius neurons), and during the same period
of development (8-20 GW)
Nelle Lambert, Marie-Alexandra Lambot, Sandra Coppens, Pierre Vanderhaeghen
We are pursuing the hypothesis that HAR1
functions in cortical development and was
involved in the evolution of the human brain
Grand challenge of human
molecular evolution
Reconstruct the evolutionary history
of each base in the human genome
• Discover functional elements of the genome
• Find the human evolutionary innovations
• Map the important human genetic variation
• Map the genome adaptations in individual cancer
tumors that make them dangerous
The UCSC Team
Katie Pollard and Gill Bejerano
Jim Kent
Sofie Salama
Adam Siepel
Extended Credits
Thanks to Jim Kent, Sofie Salama, Gill Bejerano*, Katie
Pollard*, Adam Siepel*, Robert Baertsch, Galt Barber, Hiram
Clawson, Mark Diekhans, Jorge Garcia, Rachel Harte, Angie
Hinrichs, Fan Hsu, Donna Karolchik, Sol Katzman, Andy
Kern, Bryan King, Robert Kuhn, Victoria Lin, Andre Love,
Craig Lowe, Yontao Lu, Jian Ma, Chester Manuel, Courtney
Onodera, Jakob Pedersen, Andy Pohl, Brian Raney, Brooke
Rhead, Kate Rosenbloom, Krishna Roskin, Zack Sanborn,
Kayla Smith, Mario Stanke, Bernard Suh, Paul Tatarsky,
Archana Thakkapallayil, Daryl Thomas, Heather Trumbower,
Jason Underwood, Ting Wang, Erich Weiler, Chen-Hsiang
Yeang, Jing Zhu, and Ann Zweig, in my group at UCSC
And to Webb Miller, Nadav Ahituv, Manny Ares, Mathieu
Blanchette, Rico Burhans, Michele Clamp, Richard Gibbs,
Eric Green, Haller Igel, John Karro, Eric Lander, Kerstin
Lindblad-Toh, Jim Mullikin, Tom Pringle, Eddy Rubin,
Armen Shamamian, Pierre Vanderhaeghen, and many other
outside collaborators
Single nucleotide polymorphisms (SNPs)
• When we compare the genomes of many people, we
see ~3 million variable bases (SNPs). That is one
every 1000 bases.
• Each SNP is a change that happened only once.
• The more ancient the SNP, the more common – most
SNPs come from before the time of a population
bottleneck about 100,000 years ago, before our
ancestors migrated out of Africa.
• Each of your kids has about 175 new DNA changes,
but nearly all changes are lost within 20 generations.
• SNPs inherited together with no recombination form
“haplotype blocks”.
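The "one every 1000 bases" figure follows from the two round numbers given in these slides (about 3 billion bases in the genome, about 3 million SNPs). A small sketch of that arithmetic, using only those approximate values:

```python
# Approximate figures quoted in the slides; both are round numbers.
GENOME_BASES = 3_000_000_000  # about 3 billion bases in the human genome
SNP_COUNT = 3_000_000         # about 3 million variable bases (SNPs)


def bases_per_snp(genome_bases, snp_count):
    """Average spacing between SNPs, in bases."""
    return genome_bases / snp_count


print(bases_per_snp(GENOME_BASES, SNP_COUNT))
```

This is an average only; it says nothing about how evenly SNPs are spread along each chromosome.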
Polymorphism Data is Used to
Help Locate Disease Genes
• With new genotyping technology, there has been
a revolution in our ability to discover disease-related
genes. New discoveries have been made
for diabetes, cancer, cardiovascular disease,
autoimmune diseases, and neurological diseases.
• The ability to interactively explore the genome
on the web is accelerating biomedical research
and will eventually help us to better diagnose
and cure disease.
Genomes and the Central Dogma
of Molecular Biology
The Tree of Life
DNA -> DNA (molecular evolution)
DNA -> RNA -> protein (molecular cell biology)
Neutral drift: a genetic change that does not affect the organism
• Mutations occur all the time in protein-coding regions; some do not
change the protein, so do not affect the fitness of the organism
• Changing the third DNA base in this codon does not change the
amino acid it encodes, alanine (A)
Browser: Kent et al; conservation track: Siepel and Rosenbloom
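The silent third-base change described above can be checked against the standard genetic code. The sketch below builds the standard codon table and translates all four third-base variants of an alanine codon. The slide does not name the specific codon shown in the browser, so GCT is used here purely as an illustrative assumption, and the function name is hypothetical.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order
# and the third base varying fastest. '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def third_base_variants(codon):
    """Map each third-base variant of `codon` to the amino acid it encodes."""
    prefix = codon[:2]
    return {prefix + base: CODON_TABLE[prefix + base] for base in BASES}


# GCT is an assumed example codon for alanine (one-letter code 'A').
print(third_base_variants("GCT"))
```

All four GC* codons translate to alanine, so any substitution at the third position leaves the protein unchanged and is invisible to selection acting on the protein sequence.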
Negative selection: rejecting a change that decreases fitness
• Some mutations would change the protein and thereby reduce fitness
• Such changes are rejected by natural selection, and the DNA is conserved
Browser: Kent et al; conservation track: Siepel and Rosenbloom
Positive selection: a genetic change that increases fitness
• Some mutations have a positive effect: This change from C to A in the gene
FOXP2 changed the amino acid from threonine (T) to asparagine (N), which
may have improved fitness
• Possible role in the evolution of speech
Browser: Kent et al; conservation track: Siepel and Rosenbloom; FOXP2 results: Enard et al, Nature, 2002
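The slide gives the base change (C to A) and the amino acid change (threonine to asparagine) but not the codon itself. As a hedged sketch, assuming the standard genetic code, the following lists every threonine codon in which a single C-to-A substitution produces an asparagine codon. It narrows the possibilities; it does not assert which codon FOXP2 actually uses. The function name is hypothetical.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order
# and the third base varying fastest. '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def c_to_a_changes(from_aa, to_aa):
    """List (old_codon, new_codon) pairs where one C->A substitution
    turns a codon for `from_aa` into a codon for `to_aa`."""
    pairs = []
    for codon, aa in CODON_TABLE.items():
        if aa != from_aa:
            continue
        for i, base in enumerate(codon):
            if base == "C":
                mutated = codon[:i] + "A" + codon[i + 1:]
                if CODON_TABLE[mutated] == to_aa:
                    pairs.append((codon, mutated))
    return sorted(pairs)


# Threonine is 'T' and asparagine is 'N' in one-letter amino acid code.
print(c_to_a_changes("T", "N"))
```

Under these assumptions only the second codon position qualifies: ACC to AAC and ACT to AAT. The other two threonine codons (ACA, ACG) become lysine codons under the same substitution.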