Transcript lecture 4
Limitations of genome projects
Windowjhgjhddoorhubbahubbastairduh
10^7
3 × 10^9
What do proteins do for a living?
Post-genomics
(A) Identifying genes from the sequence
(B) Gene expression profiling
(C) Genome activity studies
Genomes2 by TA Brown; chapter 7
(A) Hunting genes from the sequence
2 broad approaches
1) Ab initio method (computational)
2) Experimental method
Ab initio method (computational)
Scanning ORFs (open reading frames) –
initiation or termination codons
Codon bias found in specific species
Exon-intron boundaries
Upstream control sequences – e.g. conserved
motifs in transcription factor binding
regions
CpG islands
Homology searches
Ab initio method (computational)…..
Software for automated gene annotation, such as
GENSCAN, Genie and GENEBUILDER, is in use.
These programs scan for special features such as
1) ORFs (open reading frames), found by scanning
for initiation and termination codons
AAC
TAA
ATG
5’- ATGACGCATGATCGAGGAT –3’
3’ – TACTGCGTACTAGCTCCTA –5’
CTA
CCT
TCC
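The six-frame ORF scan can be sketched in a few lines of Python. This is only a minimal illustration, not how GENSCAN or the other programs work internally: it assumes ATG as the only initiation codon, the standard three termination codons, and a hypothetical `min_codons` cut-off; the function names are made up for this example.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")
STOP_CODONS = {"TAA", "TAG", "TGA"}


def reverse_complement(seq):
    """Return the opposite strand, written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq, min_codons=2):
    """List ATG...stop ORFs in all six reading frames.

    Each hit is (strand, frame, start, end, orf_sequence); coordinates are
    0-based positions on the strand being read. min_codons is an arbitrary
    illustrative threshold that counts the stop codon as well.
    """
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first initiation codon in this frame
                elif start is not None and codon in STOP_CODONS:
                    if (i + 3 - start) // 3 >= min_codons:
                        orfs.append((strand, frame, start, i + 3, s[start:i + 3]))
                    start = None  # look for the next ORF in this frame
    return orfs
```

Run on the 19-base example sequence from the slide, the scan finds initiation codons but no in-frame termination codon downstream of them, so it reports no complete ORF. Real genes need much longer windows, and in eukaryotes introns interrupt the ORF, which is why the other features are scanned as well.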
Ab initio method (computational)…
Codon bias found in specific species
Not all codons used at same frequency
e.g. in humans, leucine is mainly coded by CTG
and rarely by TTA or CTA
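Codon bias can be measured by simply counting codons in known coding sequences. The sketch below is an illustration under stated assumptions: the input is an in-frame coding sequence, and the function names are hypothetical. It reports the fraction of leucine codons contributed by each of the six synonymous codons.

```python
from collections import Counter

# The six codons for leucine in the standard genetic code.
LEUCINE_CODONS = ("TTA", "TTG", "CTT", "CTC", "CTA", "CTG")


def codon_counts(cds):
    """Count codons in an in-frame coding sequence (trailing bases ignored)."""
    usable = len(cds) - len(cds) % 3
    return Counter(cds[i:i + 3] for i in range(0, usable, 3))


def leucine_usage(cds):
    """Return the relative frequency of each leucine codon in the sequence."""
    counts = codon_counts(cds)
    total = sum(counts[c] for c in LEUCINE_CODONS)
    if total == 0:
        return {}
    return {c: counts[c] / total for c in LEUCINE_CODONS}
```

A gene finder can compare such frequencies in a candidate ORF with those of known genes from the same species; an ORF that uses the rare codons heavily is less likely to be a real gene.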
Exon-intron boundaries (splice sites)
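5’-AG GTAAGT-3’ (consensus; the space marks the exon-intron junction)
Finding boundaries from this motif alone is a hit and miss affair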
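Why the search is hit and miss can be seen from a simple motif scan. The sketch below is a toy illustration, not a real splice-site predictor: it slides the 8-base consensus along the sequence and reports windows within a hypothetical `max_mismatches` tolerance. Allowing mismatches is needed because many genuine sites deviate from the consensus, but it also lets in chance matches.

```python
CONSENSUS = "AGGTAAGT"  # 5'-AG|GTAAGT-3', junction after the first two bases


def donor_site_candidates(seq, max_mismatches=1):
    """Return (intron_start, window, mismatches) for near-consensus windows.

    intron_start is the 0-based position of the G that begins the intron.
    """
    hits = []
    k = len(CONSENSUS)
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        mismatches = sum(a != b for a, b in zip(window, CONSENSUS))
        if mismatches <= max_mismatches:
            hits.append((i + 2, window, mismatches))
    return hits
```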
Upstream control sequences – e.g.
conserved motifs in transcription factor
binding regions
CpG islands
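A CpG island can be flagged from two simple statistics: GC content and the ratio of observed to expected CpG dinucleotides. The thresholds in the sketch below (length of at least 200 bp, GC content of at least 50%, observed/expected of at least 0.6) are a widely used rule of thumb and are an assumption here, not something stated in the lecture; the function names are hypothetical.

```python
def cpg_stats(seq):
    """Return (GC fraction, observed/expected CpG ratio) for a sequence."""
    seq = seq.upper()
    n = len(seq)
    c = seq.count("C")
    g = seq.count("G")
    cpg = seq.count("CG")  # CpG dinucleotides on this strand
    gc_content = (c + g) / n if n else 0.0
    expected = c * g / n if n else 0.0  # CpGs expected from base composition
    obs_exp = cpg / expected if expected else 0.0
    return gc_content, obs_exp


def looks_like_cpg_island(seq, min_len=200, min_gc=0.5, min_obs_exp=0.6):
    """Apply the rule-of-thumb thresholds to one window of sequence."""
    gc_content, obs_exp = cpg_stats(seq)
    return len(seq) >= min_len and gc_content >= min_gc and obs_exp >= min_obs_exp
```

In practice the test is applied to a window that slides along the genome, and windows that pass are merged into candidate islands.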
Experimental method
Experimental evaluation uses transcribed RNA
to locate exons and entire genes within
a DNA fragment.
Experimental method
2 main strategies
Hybridisation approaches – Northern
Blots, cDNA capture / cDNA select, Zoo
blots
Transcript mapping: RT-PCR, exon
trapping etc
Homology searches: known DNA databases are
searched to find out whether the test
sequence is similar to any known gene;
similarity suggests an evolutionary
relationship.
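The idea of a homology search can be sketched with a small local alignment score. This is a toy stand-in for real database search programs, written under loud assumptions: the scoring values (match 2, mismatch -1, gap -2) are arbitrary, the "database" is a plain dictionary, and the function names are hypothetical.

```python
def local_alignment_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best Smith-Waterman local alignment score between two sequences."""
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A local alignment may restart anywhere, hence the floor at 0.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


def rank_hits(query, database):
    """Score the query against every database entry, best hit first."""
    scores = {name: local_alignment_score(query, seq) for name, seq in database.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

A high score for a test sequence against a known gene is the kind of similarity the slide refers to; deciding whether it reflects common ancestry still needs a statistical significance estimate, which this sketch omits.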
Northern Blot
Fig 7.4: Genomes 2
Zoo Blot
Fig 7.5: Genomes 2
RT-PCR
Fig 7.: Genomes 2
Exon trapping
Fig 7.8: Genomes 2
(B) Gene expression profiling
• COMPUTATIONAL APPROACH
Homology searches for either
- Orthologous genes (homologues in
different organisms with common
ancestor)
- Paralogous genes (genes in the same
organism, e.g. multigene families)
(B) Gene expression profiling…..
• EXPERIMENTAL APPROACH
Gene inactivation methods (knockouts, RNAi,
site-directed mutagenesis, transposon
tagging, genetic footprinting etc)
Gene overexpression methods (knock-ins,
transgenics, reporter genes etc)
(C) Genome activity studies
Gene expression needs to be
complemented by
Transcriptome analysis
Proteome analysis
The transcriptome
Total RNA
• Coding RNA (4%): hnRNA (pre-mRNA) → mRNA
• Non-coding RNA (96%): pre-rRNA → rRNA, pre-tRNA → tRNA, snRNA, snoRNA, scRNA, tmRNA etc
Found in: all organisms (rRNA, tRNA, mRNA); eukaryotes (hnRNA, snRNA, snoRNA, scRNA); bacteria (tmRNA etc)
The transcriptome
The complete collection of transcribed
elements of the genome
Transcriptome maps will provide clues on
• Regions of transcription
• Transcription factor binding sites
• Sites of chromatin modification
• Sites of DNA methylation
• Chromosomal origins of replication
The transcriptome
Analysis can be done by either
SAGE (serial analysis of gene expression)
technology
Microarray technology
SAGE
Shortcut to doing cDNA library screening
SAGE tags identify
• mRNAs derived from known genes
• anonymous mRNAs, also known as expressed sequence
tags (ESTs)
• mRNAs derived from currently unidentified genes
Advantages
• Analyzes all transcripts (Transcriptome) without prior
selection of known genes
• Provides quantitative data on both known and unknown
genes
• Ideally suited for determining changes in gene
expression as a consequence of an experimental treatment
(e.g. carcinogen, hormone)
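The counting step behind SAGE can be sketched as follows. This is a simplified illustration, not the laboratory protocol: it assumes the anchoring enzyme is NlaIII (site CATG), that the tag is the 10 bases immediately downstream of the 3'-most site, and that full cDNA sequences are available as input; the function names are hypothetical.

```python
from collections import Counter

ANCHOR_SITE = "CATG"  # NlaIII recognition site (assumed anchoring enzyme)
TAG_LENGTH = 10       # assumed tag length in bases


def sage_tag(cdna):
    """Return the tag next to the 3'-most anchoring site, or None if absent."""
    pos = cdna.rfind(ANCHOR_SITE)
    if pos == -1:
        return None
    start = pos + len(ANCHOR_SITE)
    tag = cdna[start:start + TAG_LENGTH]
    return tag if len(tag) == TAG_LENGTH else None


def tag_counts(cdnas):
    """Count how often each tag occurs in a collection of transcripts."""
    return Counter(tag for tag in map(sage_tag, cdnas) if tag is not None)
```

Because each transcript contributes one tag, the count for a tag estimates the abundance of its mRNA. Comparing the counts from a treated and an untreated sample gives the quantitative expression changes listed above.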
SAGE
Microarrays – allow comparisons
Microarrays….
Proteomics
Nature (2003) March 13: Insight articles from pg 194
Proteomics
Proteome projects - co-ordinated by HUPO
(Human Proteome Organisation)
Involve protein biochemistry on a high-throughput scale
Problems
limited and variable sample material,
sample degradation,
abundance,
post-translational modifications,
huge tissue, developmental and temporal
specificity as well as disease and drug
influences.
Nature (2003) March 13: Insight articles from pgs 191-197.
Approaches in proteomics
High throughput approach
1) Mass spectrometry-based
2) Array based
3) Structural proteomics
4) Informatics
5) Clinical proteomics
High throughput approaches in proteomics
1) Mass spectrometry-based proteomics:
Became possible with the discovery of protein
ionisation techniques.
Used for
• protein identification and quantification
• profiling
• protein interactions and modifications
Mass spectrometry (MS)
Principle of MS
A mass spectrometer consists of
• an ion source
• a mass analyser that measures the mass-to-charge ratio (m/z)
• a detector that registers the number of ions at each m/z value
Two ionisation techniques are used:
• Electrospray ionisation (ESI)
• Matrix-assisted laser desorption/ionisation (MALDI)
MALDI-MS is used for simple peptide mixtures, whereas
ESI-MS is used for complex samples.
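The mass-to-charge ratio measured by the analyser can be worked out with a one-line formula. The sketch below assumes positive-ion mode, where a peptide of neutral mass M picks up z protons, so m/z = (M + z × 1.007276) / z; the function name and the example mass are hypothetical.

```python
PROTON_MASS = 1.007276  # mass of a proton in daltons


def mass_to_charge(neutral_mass, charge):
    """m/z of a peptide that carries `charge` extra protons (positive mode)."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (neutral_mass + charge * PROTON_MASS) / charge


# A peptide of 1000 Da appears at different m/z values for different charges.
singly_charged = mass_to_charge(1000.0, 1)
doubly_charged = mass_to_charge(1000.0, 2)
```

The same peptide can therefore give several peaks, one per charge state, when it acquires more than one charge during ionisation.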
Principle of MALDI-TOF
Matrix-assisted laser desorption/ionisation – time of flight
Fig 7.24, Genomes 2 by TA Brown, pg 210
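The time-of-flight principle follows from basic physics: an ion of mass m and charge z·e accelerated through a voltage U gains kinetic energy z·e·U = ½·m·v², so over a field-free tube of length L the flight time is t = L·√(m / (2·z·e·U)). The sketch below is an idealised calculation under these assumptions only; the function name, the 20 kV voltage and the 1 m tube are hypothetical example values, not figures from the lecture.

```python
import math

ELEMENTARY_CHARGE = 1.602176634e-19  # coulombs
DALTON = 1.66053906660e-27           # kilograms per dalton


def flight_time(mass_da, charge, accel_voltage, tube_length):
    """Idealised flight time in seconds for an ion in a linear TOF tube."""
    mass_kg = mass_da * DALTON
    energy = charge * ELEMENTARY_CHARGE * accel_voltage  # joules gained
    velocity = math.sqrt(2 * energy / mass_kg)           # metres per second
    return tube_length / velocity


# Hypothetical instrument: 20 kV acceleration, 1 m flight tube.
t_light = flight_time(1000.0, 1, 20000.0, 1.0)
t_heavy = flight_time(2000.0, 1, 20000.0, 1.0)
```

Lighter ions reach the detector first, and the flight time grows with the square root of m/z, which is how the arrival time is converted into a mass.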
2) Array-based proteomics
Based on cloning and amplifying
identified ORFs in
homologous systems (ideal for bacterial and
yeast proteins) or sometimes
heterologous systems (e.g. insect cells, which
give post-translational
modifications similar to those of mammalian
cells).
A fusion tag (a short peptide or protein
domain linked to each protein
member, e.g. GST) is incorporated
into the plasmid construct.
Array based proteomics….
a. Protein expression and purification
b. Protein activity: Analysis can be done using
biochemical genomics or
functional protein microarrays.
c. Protein interaction analysis
two-hybrid analysis (yeast 2-hybrid),
FRET (Fluorescence resonance energy
transfer),
phage display etc
d. Protein localisation:
immunolocalisation of epitope-tagged
products.
E.g. the use of GFP or luciferase tags
3) Structural proteomics
PROTEIN INTERACTION MAPS FOR
MODEL ORGANISMS
Nature Reviews Molecular Cell Biology 2; 55-63 (2001); doi:10.1038/35048107
Challenges for the future – ‘physiome’
Nature Reviews Molecular Cell Biology 4; 237-243 (2003)