Information Encoding in Biological Molecules: DNA and

High Throughput Methods
in Proteomics
David Wishart
University of Alberta
Edmonton, AB
[email protected]
Lecture 1.1
1
Proteomics
Proteomics employs an incredibly diverse
range of technologies including:
– molecular biology
– X-ray crystallography
– chromatography
– NMR spectroscopy
– electrophoresis
– microscopy
– mass spectrometry
– computational biology
Proteomics Tools
• Molecular Biology Tools
• Separation & Display Tools
• Protein Identification Tools
• Protein Structure Tools
Molecular Biology Tools
• Northern/Southern Blotting
• Differential Display
• RNAi (small RNA interference)
• Serial Analysis of Gene Expression (SAGE)
• DNA Microarrays or Gene Chips
• Yeast two-hybrid analysis
• Immuno-precipitation/pull-down
• GFP Tagging & Microscopy
SAGE
• Principle is to convert every mRNA molecule
into a short (10-14 base), unique tag.
Equivalent to reducing all the people in a
city into a telephone book with surnames
• After creating the tags, these are assembled
or concatenated into a long “list”
• The list can be read using a DNA sequencer
and the list compared to a database to ID
genes or proteins and their frequency
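The tag-and-count idea above can be sketched in a few lines. This is a minimal illustration, not the published SAGE pipeline: it assumes the anchoring site is CATG (the NlaIII site), takes the 10 bases downstream of the 3'-most site as the tag, and uses made-up transcript sequences and gene names.

```python
from collections import Counter

ANCHOR_SITE = "CATG"   # NlaIII recognition site
TAG_LENGTH = 10        # assumed tag length (the slide gives 10-14 bases)


def extract_tag(cdna, site=ANCHOR_SITE, tag_length=TAG_LENGTH):
    """Return the tag just downstream of the 3'-most anchoring site, or None."""
    pos = cdna.rfind(site)
    if pos == -1:
        return None
    start = pos + len(site)
    tag = cdna[start:start + tag_length]
    return tag if len(tag) == tag_length else None


def count_tags(transcripts):
    """Count tags over a list of cDNA sequences (one entry per mRNA molecule)."""
    tags = (extract_tag(seq) for seq in transcripts)
    return Counter(tag for tag in tags if tag is not None)


# Hypothetical toy data: two copies of "gene A" and one of "gene B".
gene_a = "GGCATGAAACCCGGGTTTACGT"
gene_b = "TTCATGCCCATGTTTTGGGGCCAA"
tag_database = {extract_tag(gene_a): "gene A", extract_tag(gene_b): "gene B"}

counts = count_tags([gene_a, gene_b, gene_a])
for tag, n in counts.most_common():
    print(tag_database.get(tag, "unknown"), tag, n)
```

The tag frequency stands in for transcript abundance, which is why the lookup table plays the role of the "telephone book".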
SAGE Tools
SAGE
Convert mRNA to dsDNA
Digest with NlaIII
Split into 2 aliquots
Attach linkers
SAGE
Linkers have PCR & tagging endonuclease (TE) sites
Cut with TE (BsmFI)
Mix both aliquots
Blunt-end ligate to make “Ditag”
Concatenate & sequence
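Reading the sequenced concatemer back into tags can be sketched as follows. This is a simplified illustration under stated assumptions: ditags are separated by the CATG anchoring site, each tag is 10 bases long, the second tag of a ditag is read as a reverse complement, and no tag itself contains CATG. The tag sequences are made up.

```python
def reverse_complement(seq):
    """Reverse complement of a DNA string containing only A, C, G, T."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def split_ditags(concatemer, site="CATG"):
    """Split a concatemer into ditags at the anchoring-enzyme site."""
    return [piece for piece in concatemer.split(site) if piece]


def ditag_to_tags(ditag, tag_length=10):
    """Recover the two tags of a ditag: left end as read, right end reverse-complemented."""
    if len(ditag) < 2 * tag_length:
        return []  # too short to hold two full tags
    return [ditag[:tag_length], reverse_complement(ditag[-tag_length:])]


# Hypothetical tags used to build a toy concatemer.
tag_a, tag_b, tag_c = "AAACCCGGGT", "TTTTGGGGCC", "ACACACACAC"
ditag_1 = tag_a + reverse_complement(tag_b)
ditag_2 = tag_c + reverse_complement(tag_a)
concatemer = "CATG" + ditag_1 + "CATG" + ditag_2 + "CATG"

tags = []
for ditag in split_ditags(concatemer):
    tags.extend(ditag_to_tags(ditag))
print(tags)
```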
SAGE of Yeast Chromosome
DNA Microarrays
• Principle is to analyze gene (mRNA) or
protein expression through large scale
non-radioactive Northern (RNA) or
Southern (DNA) hybridization analysis
• The brighter the spot, the more DNA
• Microarrays are like Velcro chips made of
DNA fragments attached to a substrate
• Requires robotic arraying device and
fluorescence microarray reader
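Spot brightness is usually turned into a number before comparison. The sketch below assumes a two-channel experiment (a sample and a reference intensity per spot) and reports a log2 ratio; the background value, the floor, the 2-fold threshold, and the intensities are all hypothetical choices for illustration.

```python
import math


def log2_ratio(sample, reference, background=0.0, floor=1.0):
    """log2 of background-corrected sample/reference intensity for one spot."""
    s = max(sample - background, floor)      # floor avoids log of zero or negatives
    r = max(reference - background, floor)
    return math.log2(s / r)


def classify(ratio, threshold=1.0):
    """Label a spot using an assumed 2-fold (log2 = 1) cut-off."""
    if ratio >= threshold:
        return "up"
    if ratio <= -threshold:
        return "down"
    return "unchanged"


# Hypothetical spot intensities: (sample channel, reference channel).
spots = {"spot 1": (800.0, 200.0), "spot 2": (100.0, 400.0), "spot 3": (510.0, 500.0)}
for name, (sample, reference) in spots.items():
    ratio = log2_ratio(sample, reference)
    print(name, round(ratio, 2), classify(ratio))
```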
Gene Chip Tools
DNA Microarrays
DNA Microarray
Microarrays & Spot Colour
Microarray Analysis Examples
Lung 20,224
Liver 37,807
Prostate 7,971
Skin 3,043
Brain 67,679
Heart 9,400
Colon 4,832
Bone 4,832
[Image panels: Brain, Lung; Liver, Liver Tumor]
Microarray Software
Yeast Two-Hybrid Analysis
• Yeast two-hybrid
experiments yield
information on protein-protein
interactions
• GAL4 Binding Domain
• GAL4 Activation Domain
• X and Y are two proteins of
interest
• If X & Y interact then
reporter gene is expressed
Invitrogen Yeast 2-Hybrid
[Diagram labels: X, Y, LexA, B42, lacZ]
Example of 2-Hybrid Analysis
• Uetz P. et al., “A Comprehensive Analysis
of Protein-Protein Interactions in
Saccharomyces cerevisiae” Nature
403:623-627 (2000)
• High Throughput Yeast 2 Hybrid Analysis
• 957 putative interactions
• 1004 of 6000 predicted proteins involved
Example of 2-Hybrid Analysis
• Rain JC. et al., “The protein-protein
interaction map of Helicobacter pylori”
Nature 409:211-215 (2001)
• High Throughput Yeast 2 Hybrid Analysis
• 261 H. pylori proteins scanned against genome
• >1200 putative interactions identified
• Connects >45% of the H. pylori proteome
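Coverage figures like the ones above come from counting the distinct proteins that appear in at least one interaction. A minimal sketch, using hypothetical protein names and a hypothetical proteome size:

```python
def proteins_in_network(interactions):
    """Set of proteins that take part in at least one reported interaction."""
    covered = set()
    for bait, prey in interactions:
        covered.update((bait, prey))
    return covered


def coverage(interactions, proteome_size):
    """Fraction of the proteome connected by the interaction list."""
    return len(proteins_in_network(interactions)) / proteome_size


# Hypothetical bait-prey pairs; ("P4", "P4") is a self-interaction.
pairs = [("P1", "P2"), ("P2", "P3"), ("P4", "P4")]
print(f"{coverage(pairs, proteome_size=10):.0%} of a 10-protein toy proteome")

# Same arithmetic with the yeast figures quoted earlier (1004 of 6000 proteins).
print(f"{1004 / 6000:.1%}")
```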
Another Way?
• Ho Y, Gruhler A, et al. Systematic identification
of protein complexes in Saccharomyces
cerevisiae by mass spectrometry. Nature
415:180-183 (2002)
• High Throughput Mass Spectral Protein
Complex Identification (HMS-PCI)
• 10% of yeast proteins used as “bait”
• 3617 associated proteins identified
• 3-fold higher sensitivity than yeast 2-hybrid
Affinity Pull-down
Molecular Biology Tools
• Northern/Southern Blotting
• Differential Display
• RNAi (small RNA interference)
• Serial Analysis of Gene Expression (SAGE)
• DNA Microarrays or Gene Chips
• Yeast two-hybrid analysis
• Immuno-precipitation/pull-down
• GFP Tagging & Microscopy
Yeast Protein Localization
Huh, K. et al., Nature 425:686-691 (2003)
Yeast Proteome Localized
• Used 6234 yeast strains expressing full-length,
chromosomally tagged green
fluorescent protein (GFP) fusion proteins
• Measured localization by fluorescence
microscopy
• Localized 75% of the yeast proteome, into 22
distinct subcellular localization categories
• Provided localization information for 70% of
previously unlocalized proteins
22 Different Cellular Zones
GFP Tagging the Yeast
Proteome
Fluorescence Microscopy
Nucleus
Bud Neck
Nuclear Periphery
Endoplasmic Retic.
Mitochondria
Lipid particles
Confirmation by Co-localization
(GFP/RFP merging)
Results
Proteomics Tools
• Molecular Biology Tools
• Separation & Display Tools
• Protein Identification Tools
• Protein Structure Tools
Separation & Display Tools
• 1D Slab Gel Electrophoresis
• 2D Gel Electrophoresis
• Capillary Electrophoresis
• HPLC (SEC, IEC, RP, Affinity, etc.)
• Protein Chips
SDS PAGE
SDS PAGE Tools
Isoelectric Focusing (IEF)
Isoelectric Focusing
• Separation on basis of pI, not Mw
• Requires much higher voltages
• Requires much longer period of time
• IPG (Immobilized pH Gradient)
• Typically done in strips or tubes (to
facilitate 2D gel work)
• Uses ampholytes to establish pH gradient
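Because IEF separates by pI, it can help to see how a pI is estimated from sequence. The sketch below sums Henderson-Hasselbalch charges over ionizable groups and bisects for the pH of zero net charge. The pKa values are an assumed, approximate set (published tables differ by a few tenths of a pH unit), and the test sequences are made up.

```python
# Assumed approximate pKa values; not taken from the lecture.
POSITIVE_PKA = {"K": 10.8, "R": 12.5, "H": 6.5}
NEGATIVE_PKA = {"D": 3.9, "E": 4.1, "C": 8.5, "Y": 10.1}
N_TERM_PKA = 8.6
C_TERM_PKA = 3.6


def net_charge(sequence, ph):
    """Net charge of a peptide at a given pH (Henderson-Hasselbalch per group)."""
    charge = 1.0 / (1.0 + 10 ** (ph - N_TERM_PKA))      # protonated N-terminus
    charge -= 1.0 / (1.0 + 10 ** (C_TERM_PKA - ph))     # deprotonated C-terminus
    for residue in sequence:
        if residue in POSITIVE_PKA:
            charge += 1.0 / (1.0 + 10 ** (ph - POSITIVE_PKA[residue]))
        elif residue in NEGATIVE_PKA:
            charge -= 1.0 / (1.0 + 10 ** (NEGATIVE_PKA[residue] - ph))
    return charge


def isoelectric_point(sequence, tolerance=0.001):
    """Bisect between pH 0 and 14 for the pH at which the net charge is zero."""
    low, high = 0.0, 14.0
    while high - low > tolerance:
        mid = (low + high) / 2.0
        if net_charge(sequence, mid) > 0:
            low = mid     # still positive: the pI lies at higher pH
        else:
            high = mid
    return (low + high) / 2.0


for peptide in ("DDDD", "GGGG", "KKKK"):   # acidic, neutral, basic toy peptides
    print(peptide, round(isoelectric_point(peptide), 2))
```

Bisection works here because net charge only decreases as pH rises, so there is a single zero crossing.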
2D Gel Principles
IEF
SDS PAGE
Advantages and Disadvantages
Advantages
• Provides a hard-copy record of separation
• Allows facile quantitation
• Separation of up to 9000 different proteins
• Highly reproducible
• Gives info on Mw, pI and post-translational modifications
• Inexpensive
Disadvantages
• Limited pI range (4-8)
• Proteins >150 kD not seen in 2D gels
• Difficult to see membrane proteins (>30% of all proteins)
• Only detects high abundance proteins (top 30% typically)
• Time consuming
2D Gel Software
Capillary Electrophoresis
Capillary Electrophoresis
• Capillary Zone Electrophoresis (CZE)
– Separates on basis of m/z ratio
• Capillary Gel Electrophoresis (CGE)
– Separates by MW and m/z ratio
• Capillary Isoelectric Focusing (CIEF)
– Separates on basis of pI
• 2-Dimensional Electrophoresis (2D-CE)
– Separates using tandem CE methods
Chromatography
• Size Exclusion (size)
• Reverse Phase (hydrophobicity)
• Ion Exchange (charge)
• Normal Phase (TLC)
• Affinity (ligand)
• HIC (hydrophobicity)
• 2D Chromatography
Ciphergen Protein Chips
Ciphergen Protein Chips
• Hydrophobic (C8) Arrays
• Hydrophilic (SiO2) Arrays
• Anion exchange Arrays
• Cation exchange Arrays
• Immobilized Metal Affinity
(NTA, nitrilotriacetic acid)
Arrays
• Epoxy Surface (amine and
thiol binding) Arrays
Ciphergen Protein Chips
Normal
Tumor
Protein Arrays
Different Kinds of Protein
Arrays
Antibody Array
Antigen Array
Ligand Array
Detection by: SELDI MS, fluorescence, SPR,
electrochemical, radioactivity, microcantilever
Protein (Antigen) Chips
H Zhu, J Klemic, S Chang, P Bertone, A Casamayor, K Klemic, D Smith,
M Gerstein, M Reed, & M Snyder (2000). Analysis of yeast protein kinases
using protein chips. Nature Genetics 26:283-289
ORF
GST
His6
Nickel coating
Protein (Antigen) Chips
Nickel coating
Arraying Process
Probe with anti-GST Mab
Nickel coating
Anti-GST Probe
Probe with Cy3-labeled
Calmodulin
Nickel coating
“Functional” Protein Array
Nickel coating
Proteomics Tools
• Molecular Biology Tools
• Separation & Display Tools
• Protein Identification Tools
• Protein Structure Tools
Microsequencing
Electro-blotting
Edman Sequencing
Microsequencing
• Generates sequence info from N terminus
• Commonly done on low picomole
amounts of protein (5-50 ng)
• Newer techniques allow sequencing at the
femtomole level (100 pg)
• Up to 20 residues can be read
• Allows unambiguous protein ID for 8+ AA
• Relatively slow, modestly expensive
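The mass figures above translate into molar amounts once a molecular weight is chosen. A quick check, assuming a 50 kDa protein (the molecular weight is an assumption for illustration, not a value from the lecture):

```python
def mass_to_moles(mass_grams, molecular_weight_da):
    """Amount of substance in moles for a given mass and molecular weight (g/mol)."""
    return mass_grams / molecular_weight_da


ASSUMED_MW_DA = 50_000  # hypothetical 50 kDa protein

for label, grams in (("5 ng", 5e-9), ("50 ng", 50e-9), ("100 pg", 100e-12)):
    moles = mass_to_moles(grams, ASSUMED_MW_DA)
    print(f"{label}: {moles * 1e12:.3f} pmol")
```

Under this assumption 5-50 ng corresponds to roughly 0.1-1 pmol and 100 pg to about 2 fmol; a smaller protein gives proportionally more moles per nanogram.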
Protein ID by MS and 2D gel
Protein ID by MS and 2D gel
• Requires gel spots to be cut out (tedious)
• Ideal for high throughput (up to 500
samples per day)
• Allows modifications to be detected
• MS allows protein identification by:
– Intact protein molecular weight
– Peptide fingerprint molecular weights
– Sequencing through MS/MS
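Identification by peptide fingerprint can be sketched as an in-silico digest followed by mass matching. The example assumes trypsin as the protease (cleavage after K or R, but not before P), neutral monoisotopic peptide masses, a 0.01 Da tolerance, and a made-up two-protein database; real search tools such as those listed later use more elaborate scoring.

```python
# Monoisotopic residue masses in Da.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276, "V": 99.06841,
    "T": 101.04768, "C": 103.00919, "L": 113.08406, "I": 113.08406, "N": 114.04293,
    "D": 115.02694, "Q": 128.05858, "K": 128.09496, "E": 129.04259, "M": 131.04049,
    "H": 137.05891, "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056


def tryptic_digest(protein):
    """Cleave after K or R unless the next residue is P (assumed trypsin rule)."""
    peptides, start = [], 0
    for i, residue in enumerate(protein):
        next_residue = protein[i + 1] if i + 1 < len(protein) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        peptides.append(protein[start:])
    return peptides


def peptide_mass(peptide):
    """Neutral monoisotopic mass: residue masses plus one water."""
    return sum(RESIDUE_MASS[residue] for residue in peptide) + WATER


def count_matches(observed_masses, protein, tolerance=0.01):
    """Number of observed masses explained by some tryptic peptide of the protein."""
    theoretical = [peptide_mass(p) for p in tryptic_digest(protein)]
    return sum(
        any(abs(obs - theo) <= tolerance for theo in theoretical)
        for obs in observed_masses
    )


def identify(observed_masses, database, tolerance=0.01):
    """Name of the database entry with the most matching peptide masses."""
    return max(database, key=lambda name: count_matches(observed_masses, database[name], tolerance))


# Hypothetical database and a simulated spectrum derived from "protein 1".
database = {"protein 1": "AAKRPGGRCC", "protein 2": "GGGGKWWWR"}
observed = [peptide_mass(p) for p in tryptic_digest(database["protein 1"])]
print(identify(observed, database))
```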
Protein ID Protocol
Typical Results
• 401 spots identified
• 279 gene products
• Confirmed by SAGE,
Northern or Southern
• Confirmed by amino
acid composition
• Confirmed by amino
acid sequencing
• Confirmed by MW & pI
MS Analysis Software
Protein Prospector
MS-Fit
Mowse
PeptideSearch
PROWL
Proteomics Tools
• Molecular Biology Tools
• Separation & Display Tools
• Protein Identification Tools
• Protein Structure Tools
Protein Structure Initiative
• 35,000 proteins
• 10,000 subset
• 30% ID or 30 seq
• Solve by 2010
• $20,000/Structure
Structure Determination
NMR
X-ray
X-ray Crystallography
FT
NMR Spectroscopy
FT
Structure Determination
Bottlenecks
X-ray
• Producing enough
protein for trials
• Crystallization time and
effort
• Crystal quality, stability
and size control
• Finding isomorphous
derivatives
• Chain tracing & checking
NMR
• Producing enough
labeled protein for
collection
• Sample “conditioning”
• Size of protein
• Assignment process is
slow and error prone
• Measuring NOE’s is
slow and error prone
Protein Expression
Robotic Crystallization
Synchrotron Light Source
MAD & X-ray Crystallography
• MAD (Multiwavelength
Anomalous Dispersion)
• Requires synchrotron
beam lines
• Requires protein with
multiple scattering centres
(selenomethionine labeled)
• Allows rapid phasing
• Proteins can now be
“solved” in just 1-2 days
High Throughput NMR
• Higher magnetic fields
(From 400 MHz to 900 MHz)
• Higher dimensionality
(From 2D to 3D to 4D)
• New pulse sequences
(TROSY, CBCANNH)
• Improved sensitivity
• New parameters (Dipolar
coupling, cross relaxation)
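NMR field strengths are quoted as proton resonance frequencies, so the figures above can be converted to tesla with the proton gyromagnetic ratio. A small sketch, using the standard value of about 42.577 MHz per tesla (the constant is not from the lecture):

```python
PROTON_GAMMA_MHZ_PER_TESLA = 42.577  # proton gyromagnetic ratio divided by 2*pi


def field_strength_tesla(proton_frequency_mhz):
    """Magnetic field (T) at which protons resonate at the given frequency (MHz)."""
    return proton_frequency_mhz / PROTON_GAMMA_MHZ_PER_TESLA


for mhz in (400, 900):
    print(f"{mhz} MHz -> {field_strength_tesla(mhz):.1f} T")
```

This gives roughly 9.4 T for a 400 MHz magnet and 21.1 T for a 900 MHz magnet.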
Automated Structure
Generation
NMR & Structural Proteomics
Proc. Natl. Acad. Sci. USA, Vol. 99, 1825-1830, 2002
NMR & Structural Proteomics
Proc. Natl. Acad. Sci. USA, Vol. 99, 1825-1830, 2002
Auto-comparative Modeling
ACDEFGHIKLMNPQRST--FGHQWERT-----TYREWYEGHADS
ASDEYAHLRILDPQRSTVAYAYE--KSFAPPGSFKWEYEAHADS
MCDEYAHIRLMNPERSTVAGGHQWERT----GSFKEWYAAHADD
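Comparative modeling starts from how similar the aligned sequences are. The sketch below computes pairwise percent identity for the three aligned sequences above. It is only an illustration: identity is counted over columns where neither sequence has a gap, which is one of several conventions in use.

```python
from itertools import combinations

ALIGNMENT = [
    "ACDEFGHIKLMNPQRST--FGHQWERT-----TYREWYEGHADS",
    "ASDEYAHLRILDPQRSTVAYAYE--KSFAPPGSFKWEYEAHADS",
    "MCDEYAHIRLMNPERSTVAGGHQWERT----GSFKEWYAAHADD",
]


def percent_identity(seq_a, seq_b, gap="-"):
    """Percent of identical residues over columns where neither sequence has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    compared = matches = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue          # skip gapped columns
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


for i, j in combinations(range(len(ALIGNMENT)), 2):
    identity = percent_identity(ALIGNMENT[i], ALIGNMENT[j])
    print(f"sequence {i + 1} vs sequence {j + 1}: {identity:.1f}% identity")
```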
The Goal