Everything you wanted to know about ENCODE


Everything you wanted to know
about ENCODE
But were afraid to ask
Top 5 Reasons Biologists Go Into
Bioinformatics
5 - Microscopes and biochemistry are so 20th
century.
4 - Got started purifying proteins, but it turns
out the cold room is really COLD.
3 - After 23 years of school wanted to make
MORE than $23,000/year as a postdoc.
2 - Like to swear, @ttracted to $_ Perl #!!
1 - Getting carpal tunnel from pipetting
Top 5 Reasons Computer People
go into Bioinformatics
5 - Bio courses actually have some females.
4 - Human genome more stable than Windows XP
3 - Having mastered binary trees, quad trees, and parse
trees, ready for phylogenetic trees.
2 - Missing heady froth of the internet bubble.
1 - Must augment humanity to defeat evil artificial
intelligent robots.
The Paradox of Genomics
How does a long, static, one-dimensional string
of DNA turn into the remarkably complex,
dynamic, and three-dimensional human body?
GTTTGCCATCTTTTG
CTGCTCTAGGGAATC
CAGCAGCTGTCACCA
TGTAAACAAGCCCAG
GCTAGACCAGTTACC
CTCATCATCTTAGCT
GATAGCCAGCCAGCC
ACCACAGGCATGAGT
Looks like ‘code’ not enough,
must study actual cells & DNA
How DNA is Used by the Cell
Promoter Tells Where to Begin
Different promoters activate different genes in
different parts of the body.
A Computer in Soup
Idealized promoter for a gene involved in making hair.
Proteins that bind to specific DNA sequences in the
promoter region together turn a gene on or off. These
proteins are themselves regulated by their own promoters
leading to a gene regulatory network with many of the
same properties as a neural network.
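The neural-network comparison can be made concrete with a minimal sketch in which a promoter behaves like a threshold unit: each bound protein contributes a positive (activator) or negative (repressor) weight, and the gene turns on when the sum reaches a threshold. The factor names, weights, and threshold below are hypothetical and chosen only for illustration; real promoters are not known to follow this exact rule.

```python
def promoter_active(bound_factors, weights, threshold):
    """Return True when the summed influence of the bound factors reaches the threshold.

    bound_factors: set of factor names currently bound at the promoter.
    weights: dict mapping factor name to its influence (hypothetical values).
    """
    total = sum(weights.get(factor, 0) for factor in bound_factors)
    return total >= threshold


# Hypothetical "hair gene" promoter: two activators and one strong repressor.
HAIR_PROMOTER_WEIGHTS = {"activator_A": 1, "activator_B": 1, "repressor_R": -2}
HAIR_THRESHOLD = 2

if __name__ == "__main__":
    for bound in ({"activator_A"},
                  {"activator_A", "activator_B"},
                  {"activator_A", "activator_B", "repressor_R"}):
        state = promoter_active(bound, HAIR_PROMOTER_WEIGHTS, HAIR_THRESHOLD)
        print(sorted(bound), "->", "on" if state else "off")
```

Because the proteins doing the binding are themselves gene products, the output of one such unit can feed the input of another, which is what makes the whole system a network.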
Regulation By Txn Factor Binding
When I-KB is removed from NF-KB by phosphorylation,
the NF-KB complex binds to DNA.
Note that both the NF-KB p65 and NF-KB p50
subunits would need to be expressed in the same cell
for this transcription activation pathway to work.
Selective, combinatorial expression of txn factors
is very important in defining different types of cells.
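The requirement described on this slide amounts to a logical AND, sketched below. The function name and the example cell descriptions are hypothetical; only the rule itself (both subunits expressed, I-KB removed) comes from the slide.

```python
def nfkb_binds_dna(expressed_factors, ikb_removed):
    """Return True when the NF-KB complex can bind DNA.

    The slide's rule: both the p65 and p50 subunits must be expressed
    in the same cell, and I-KB must have been removed by phosphorylation.
    """
    required_subunits = {"NF-KB p65", "NF-KB p50"}
    has_both_subunits = required_subunits <= set(expressed_factors)
    return has_both_subunits and ikb_removed


if __name__ == "__main__":
    # Hypothetical cells, described by the factors they express.
    cell_with_both = {"NF-KB p65", "NF-KB p50"}
    cell_with_one = {"NF-KB p65"}
    print(nfkb_binds_dna(cell_with_both, ikb_removed=True))
    print(nfkb_binds_dna(cell_with_both, ikb_removed=False))
    print(nfkb_binds_dna(cell_with_one, ikb_removed=True))
```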
The Decisions of a Cell
• When to reproduce?
• When to migrate and where?
• What to differentiate into?
• When to secrete something?
• When to make an electrical signal?
The more rapid decisions usually are via the cell
membrane and 2nd messengers. The longer
acting decisions are usually made in the nucleus.
Nucleus Used to Appear Simple
• Cheek cells stained with basic dyes. Nuclei are
readily visible.
Mammalian Nuclei Stained in Various Ways
Image from Tom Misteli lab
Artist’s rendition of nucleus
Image from nuclear protein database
Chromatin
Turning on a gene:
• Getting DNA into the right compartment of the
nucleus (may involve very diffuse signals in DNA
over very long distances)
• Loosening up chromatin structure (this involves
enhancers and repressors which can act over
relatively long distances)
• Attracting RNA Polymerase II to the transcription
start site (these involve relatively close factors
both upstream and downstream of transcription
start).
HISTONE MODIFICATIONS
Modification Effect
H3K4me3
H3K4me2
H3K4me1
H3acK9/14
H4acK5/8/12/16
Slide adapted from Christoph Koch, Sanger Institute
Methods for Studying Transcription
Traditional
• Genetics in model organisms
• Promoters/enhancers hooked to reporter genes
• Gel shifts and DNase footprinting.
ENCODE/High Throughput
• Phylogenetic footprinting
• Motif searches in clusters of coregulated genes.
• Chromatin Immunoprecipitation & CHIP/CHIP
• DNase hypersensitivity
Drosophila Genetics
normal
antennapedia
mutant
Reporter Gene Constructs
promoter to study
easily seen gene
Drosophila embryo transfected with the ftz promoter hooked
up to a lacZ reporter gene, creating stripes where the ftz promoter
is active.
Biochemical Footprinting Assays
Gel showing selective protection of DNA from nuclease digestion
where transcription factor is bound.
Txn factor footprint
Comparative Genomics
Webb Miller
Comparative Genomics at BMP10
Conservation of Gene Features
[Plot: percent aligning and percent identity, y-axis from 50% to 100%]
Conservation pattern across 3165 mappings of human RefSeq mRNAs
to the genome. A program sampled 200 evenly spaced bases across
500 bases upstream of transcription, the 5’ UTR, the first coding exon,
introns, middle coding exons, introns, the 3’ UTR and 500 bases after
polyadenylation. There are peaks of conservation at the transition from
one region to another.
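The sampling step described above can be sketched roughly as follows. This is not the program that produced the plot; the function names, the `is_conserved` callback, and the choice to ignore strand are assumptions made for illustration. The idea is that sampling a fixed number of evenly spaced bases lets regions of very different lengths be averaged position by position.

```python
def sample_positions(start, end, n_samples=200):
    """Return n_samples evenly spaced integer positions in [start, end).

    Regions shorter than n_samples bases will repeat some positions.
    """
    length = end - start
    return [start + (i * length) // n_samples for i in range(n_samples)]


def conservation_profile(regions, is_conserved, n_samples=200):
    """Average conservation at each relative position across many regions.

    regions: list of (start, end) coordinates for the same kind of feature
             (for example, the 5' UTR) taken from many different genes.
    is_conserved: hypothetical callback returning True if the base at a
                  given position is conserved in the alignment.
    """
    counts = [0] * n_samples
    for start, end in regions:
        for i, pos in enumerate(sample_positions(start, end, n_samples)):
            if is_conserved(pos):
                counts[i] += 1
    return [count / len(regions) for count in counts]


if __name__ == "__main__":
    # Toy data: pretend every base below position 50 is conserved.
    toy_regions = [(0, 100), (0, 200)]
    print(conservation_profile(toy_regions, lambda pos: pos < 50, n_samples=4))
```

Concatenating the profiles for each feature type in order (upstream, 5’ UTR, first coding exon, and so on) would give a curve of the kind shown on the slide.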
Normalized eScores
Conservation Levels of
Regulatory Regions in
Human/Mouse Alignments
DNase I Hypersensitivity, CHIP/CHIP, transcription data on ENR333
CHromatin ImmunoPrecipitation
• Crosslink cells with formaldehyde.
• Sonicate to shear DNA
• Add antibody to a protein involved in
transcription.
• Precipitate antibody and everything
attached
• Heat to release DNA.
• Analyse DNA with PCR or microarrays
– CHIP on microarray = CHIP/CHIP
CHIP/CHIP in ENCODE
• groups: Sanger, Yale, Affy, UCSD,
Stanford, GIS (more?)
• proteins: RNA Pol II, TAF1, histones in
various states of acetylation/methylation
• cells: various cell lines treated various
ways.
CHIP/CHIP Groups
• Sanger - sequencing center in UK that does
a lot of annotation as well.
• UCSD/Ludwig Institute - where Bing Ren,
a pioneer of CHIP, works
• GIS - Genome Institute Singapore - using
“paired-end ditag” CHIP.
• Stanford, Yale, Affy - you all know.
CHIP/CHIP Targets
• RNA Polymerase II, converts DNA->RNA
for protein coding genes.
– Antibody targets form in initiation complex
(start of gene)
• TAF1 - A basal transcription factor.
Involved in recruiting Pol II to initiation site
• Histones 3&4 - the balls DNA winds around
– Antibodies against various acetylated and
methylated forms, most of which are associated
with chromatin opening
Cell Types
• HeLa - cervical epithelial carcinoma
• HCT116 - colon epithelial carcinoma
• IMR90 - lung fibroblast
• THP1 - blood monocyte leukemia
• GM06990 - lymphoblastoid
• HL-60 - promyelocytic leukemia cell line
• Many others in Stanford promoter track.
DNase hypersensitivity
• Very old technique being adapted to high
throughput.
• DNA cutting enzymes can access open chromatin
faster than closed chromatin
• Other things may also influence how susceptible a
particular piece of DNA is to DNase cutting.
• What is hypersensitive in a particular cell line is
quite reproducible.
• There are various techniques for seeing where cut
is: sequencing cut ends, PCR around cut site, etc.
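For the "sequencing cut ends" approach, one simple way to turn mapped cut positions into candidate hypersensitive sites is to count cuts in fixed-size windows and keep the windows with many cuts. The sketch below is only an illustration: the window size and the minimum cut count are arbitrary assumptions, not thresholds used by any ENCODE group.

```python
from collections import Counter


def cut_counts_by_window(cut_positions, window_size=100):
    """Count cut sites falling in each fixed-size window, keyed by window start."""
    counts = Counter()
    for pos in cut_positions:
        counts[(pos // window_size) * window_size] += 1
    return counts


def hypersensitive_windows(cut_positions, window_size=100, min_cuts=3):
    """Return sorted start coordinates of windows with at least min_cuts cuts.

    window_size and min_cuts are arbitrary illustrative values.
    """
    counts = cut_counts_by_window(cut_positions, window_size)
    return sorted(start for start, n in counts.items() if n >= min_cuts)


if __name__ == "__main__":
    # Toy cut positions along one chromosome (hypothetical data).
    cuts = [5, 20, 99, 150, 420, 430, 440, 445]
    print(hypersensitive_windows(cuts))
```

Since open chromatin is cut faster than closed chromatin, windows that collect many cut ends are the candidates for open regions, subject to the caveat above that other things also influence susceptibility.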
DNase I Hypersensitivity, CHIP/CHIP, transcription data on ENR333
DNase I Hypersensitivity, CHIP/CHIP, transcription data on ENR333
Close up of same region
The END
How is a gene turned on?
• “Pioneering” transcription factors bind to
DNA and tag it for “chromatin opening”
• Histones are acetylated/methylated which
opens chromatin.
• More transcription factors bind newly
exposed sites in DNA.
• RNA Polymerase II attracted to txn factors
• Yet more txn factors phosphorylate tail of
Pol II, allowing it to start transcription.