Data visualization in the post-genomics era
Carol Morita
Genentech, Inc.
Pre-Genomics: assembling the pieces
GenBank
Genome project initiated
Where we are today
Organism              Size (bp)      # genes
E. coli (bacteria)    4.67 million   3,237
Arabidopsis (plant)   100 million    25,000
C. elegans (worm)     97 million     19,099
Drosophila (fly)      136 million    13,061
Mouse                 3 billion      ~40,000
Human                 3 billion      ~40,000
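One way to read the table is as gene density: genes per million base pairs. The sketch below does that arithmetic for the eukaryote rows, using the sizes and gene counts exactly as listed above. The function names are illustrative, and the result is only as good as the approximate figures on the slide.

```python
# Figures copied from the table above: (genome size in bp, approximate gene count).
GENOMES = {
    "Arabidopsis": (100_000_000, 25_000),
    "C. elegans": (97_000_000, 19_099),
    "Drosophila": (136_000_000, 13_061),
    "Mouse": (3_000_000_000, 40_000),
    "Human": (3_000_000_000, 40_000),
}


def genes_per_megabase(size_bp, n_genes):
    """Return the average number of genes per million base pairs."""
    return n_genes / (size_bp / 1_000_000)


def density_table(genomes):
    """Return (organism, genes per Mb) pairs, densest genome first."""
    rows = [
        (name, genes_per_megabase(size, genes))
        for name, (size, genes) in genomes.items()
    ]
    return sorted(rows, key=lambda row: row[1], reverse=True)


if __name__ == "__main__":
    for name, density in density_table(GENOMES):
        print(f"{name:<12} {density:8.1f} genes/Mb")
```

On these figures the mouse and human genomes come out far sparser in genes than the smaller plant, worm and fly genomes, so genome size alone says little about gene count.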
American view of the genome
Entrez Genome Browser
National Center for Biotechnology Information
National Institutes of Health
http://www.ncbi.nlm.nih.gov:80/PMGifs/Genomes/euk_g.html
European view of the genome
Ensembl Genome Browser
European Molecular Biology Laboratory
http://www.ensembl.org/
What the genomes of model organisms tell us
Maturation   10 days          9 weeks        20-25 years
Genome       165 million bp   3 billion bp   3 billion bp
Genes        13,600           ~40,000        ~40,000
Almost every human gene has a counterpart in the mouse, and some blocks of DNA are proving impossible to tell apart.
Human genes mapped onto mouse chromosomes
If we are so similar genetically, why are we so different?
Proteomics: the real work begins
Definition: Description and functional characterization of the full complement of an organism’s proteins
What’s at play…
– Multiple proteins can be derived from one gene
– Protein interactions can be complex and are poorly understood
– ‘Plasticity’ of the genome
– Spatial and temporal regulation
Increased diversity due to alternative splicing
gene A
Alternative splicing
• Plays an important role in:
– expanding protein diversity
– generating proteins with subtle or opposing functional roles
– enabling an organism to respond to environmental pressures
• >35% of human genes undergo alternative splicing; the true figure is probably higher
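A minimal sketch of how one gene can yield several transcripts, and hence several proteins. It models only one splicing mode, exon skipping, in which each optional ("cassette") exon is either kept or dropped; real genes also use other modes, such as alternative splice sites and intron retention. The exon layout given for "gene A" is hypothetical and is not taken from the slide's figure.

```python
from itertools import product


def enumerate_isoforms(exons):
    """Yield every exon combination allowed by exon skipping.

    `exons` is an ordered list of (name, is_constitutive) pairs.
    Constitutive exons appear in every transcript; each cassette
    exon is independently kept or skipped.
    """
    cassette = [name for name, constitutive in exons if not constitutive]
    for choices in product([True, False], repeat=len(cassette)):
        kept = dict(zip(cassette, choices))
        yield tuple(
            name for name, constitutive in exons if constitutive or kept[name]
        )


# Hypothetical layout: exons E1 and E4 are constitutive, E2 and E3 are optional.
GENE_A = [("E1", True), ("E2", False), ("E3", False), ("E4", True)]
isoforms = list(enumerate_isoforms(GENE_A))

if __name__ == "__main__":
    for isoform in isoforms:
        print("-".join(isoform))
```

With n cassette exons this simple model allows up to 2^n transcripts, so two optional exons already give four candidate isoforms from a single gene.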
Complexity due to protein interactions
Death Receptor signaling pathway
DNA Microarrays
Microarray chips may contain 50,000 known DNA fragments on a single slide
Visualizing microarray data
Source: Silicon Genetics: GeneSpring
Limitations of DNA microarrays
• Only ‘snapshots’ of the DNA activity in a cell; we would prefer movies!
• Many important biological events cannot be detected because transcription of DNA is not involved
• Protein array technology is still in its infancy
The curse of dimensionality
Source: Klausner, 2002, Cancer Cell 1, p. 3-10
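A chip carrying tens of thousands of DNA fragments turns each sample into a point in a space with tens of thousands of dimensions. The slide does not say which aspect of the curse it has in mind; the sketch below illustrates one commonly cited symptom, distance concentration, in which the nearest and farthest neighbours of a point become almost equally far away as the number of dimensions grows. The data are uniform random numbers, not expression values, and the function name is illustrative.

```python
import math
import random


def relative_contrast(n_points, n_dims, rng):
    """Return (max - min) / min over the distances from one random
    query point to n_points random points in the unit hypercube."""
    query = [rng.random() for _ in range(n_dims)]
    distances = []
    for _ in range(n_points):
        point = [rng.random() for _ in range(n_dims)]
        distances.append(math.dist(query, point))
    return (max(distances) - min(distances)) / min(distances)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so repeated runs agree
    for dims in (2, 20, 200, 2000):
        contrast = relative_contrast(50, dims, rng)
        print(f"{dims:>5} dimensions: relative contrast {contrast:.3f}")
```

As the contrast shrinks, "nearest neighbour" carries less information, which is one reason clustering and visualizing high-dimensional expression data usually begins with some form of dimension reduction or feature selection.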