Transcript Lecture 1

Workshop in Computational
Structural Biology
2016
81813, 4 credit points
Orly Marcu & Emma-joy Dodson
Contents by Prof. Ora Schueler-Furman
Introduction – When, Where, How?
• When & Where:
– Thursdays, Givat Ram
– Lecture & Exercise: 14:00-18:00, Sprinzak computer class #2
– Lectures & exercises available on moodle2:
http://moodle2.cs.huji.ac.il/nu15/course/view.php?id=81813
• How:
– Make sure you have an account in CS ✓
• Exercises:
– Submit 8 of the 11 exercises
– Due within 2 weeks
– Submit by email to [email protected]
• Contact:
– Orly: 87063, [email protected]
– Emma: 87063, [email protected]
Acknowledgements: Figures and slides draw on Branden & Tooze; some slides have been adapted from members of the Rosetta community, especially Jens Meiler.
Exercises in PyRosetta have been adapted from teaching material by Jeff Gray.
What will we learn?
• Structure prediction (mainly Rosetta; also I-TASSER, MODELLER):
– From sequence alone to high-resolution models (ab initio modeling)
– From homologous structures to high-resolution models
MSKAVGIDLGTTYSC……
• Protein design – engineering novel proteins not found in nature to fit a desired fold/function
Gordon et al., JACS (2012). Computational Design of an α-Gliadin Peptidase
• Protein-protein docking – obtain models of protein complexes given two monomers (Rosetta, PatchDock, PIPER, HADDOCK)
• Interface analysis and design – identify interface “hotspots” (via computational alanine scanning); change protein specificity
• Optimization techniques:
– Energy minimization: concepts and implementation in Rosetta
– Monte Carlo methods
• Side chain modeling
– Deterministic and heuristic methods for finding preferred side chain combinations given a certain backbone
[Figure: energy as a function of conformation, with the START point marked]
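As a rough illustration of the Monte Carlo idea listed above, the sketch below runs a Metropolis search on a toy one-dimensional energy function. The energy function, step size, temperature (kT) and the name `metropolis` are all hypothetical choices made for this example; this is not how Rosetta implements its search.

```python
import math
import random


def energy(x):
    """Toy 1D energy landscape with several local minima (hypothetical)."""
    return 0.1 * x * x + math.sin(3.0 * x)


def metropolis(start, steps=5000, step_size=0.5, kT=0.5, seed=0):
    """Metropolis Monte Carlo: always accept downhill moves, accept uphill
    moves with probability exp(-delta/kT). Returns the best state seen."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    x, e = start, energy(start)
    best_x, best_e = x, e
    for _ in range(steps):
        trial = x + rng.uniform(-step_size, step_size)  # random perturbation
        e_trial = energy(trial)
        delta = e_trial - e
        if delta <= 0 or rng.random() < math.exp(-delta / kT):
            x, e = trial, e_trial  # accept the move
            if e < best_e:
                best_x, best_e = x, e
    return best_x, best_e


start = 4.0
best_x, best_e = metropolis(start)
print(f"start energy {energy(start):.3f}, best energy found {best_e:.3f} at x={best_x:.3f}")
```

Accepting some uphill moves is what lets the search escape local minima, which plain energy minimization cannot do.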
What we will not learn
Existing protocols that are outside the scope of this course:
• Protein-ligand docking
• Membrane protein modeling and design
• Peptide-protein docking
• DNA & RNA modeling
• Antibody modeling
The code: 4 bases, 64 triplets, 20 amino acids
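The numbers on this slide can be checked with a short sketch: enumerating all triplets over 4 bases gives 64 codons, and 3 is the shortest codon length that can encode 20 amino acids plus a stop signal. The function name `min_codon_length` is introduced here only for illustration.

```python
from itertools import product

BASES = "ACGU"  # the four RNA bases

# All possible triplets (codons) over the four bases.
codons = ["".join(triplet) for triplet in product(BASES, repeat=3)]
print(len(codons))  # 4**3 codons


def min_codon_length(n_symbols, n_bases=4):
    """Smallest word length whose number of words covers n_symbols meanings."""
    length = 1
    while n_bases ** length < n_symbols:
        length += 1
    return length


# 20 amino acids + 1 stop signal need at least 21 distinct words.
print(min_codon_length(21))
```

Since 64 triplets code for only 20 amino acids, most amino acids are encoded by more than one codon.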
4 Hierarchies of protein structure
• Anfinsen: sequence determines structure
The building blocks: amino acids
Special amino acids
Glycine
• The simplest aa
• No sc
• Very flexible bb
Proline
• Cyclic aa
• sc connects to bb N
• Very constrained bb
Aliphatic amino acids
• sc contains only carbon and hydrogen atoms
• hydrophobic
Amino acids with hydroxyl group
Negatively charged amino acids
Different size → different tendency for secondary structure
Amide amino acids
Positively charged amino acids
• pKa 11.1
• pKa 12
• large sc
Aromatic amino acids
• sc contains aromatic ring
Figure from Wikipedia
Figure from Proteopedia
Amino acids with sulfur
Cystine
Oxidation of sulfur atoms creates a covalent disulfide bond (S-S bond) between two cysteines
Hydrogen bonding potential of amino acids
Primary sequence: concatenated amino acids
Formation of a peptide bond
[Figure: amino acid in zwitterionic form, +H3N-CH(R)-COO-]
cpk colors
O - oxygen
H - hydrogen
N - nitrogen
C - carbon
Dihedral angles
Dihedral angles 1-4 define side chain
• Dihedral angle: defines geometry of
4 atoms (given bond lengths and
angles)
From wikipedia
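To make the definition concrete, here is a minimal sketch that computes the dihedral angle defined by four atom positions, using plain Python tuples as 3D coordinates. The helper names and the example coordinates are made up for illustration; the sign follows the common right-handed convention.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral_degrees(p0, p1, p2, p3):
    """Dihedral angle (degrees) about the p1-p2 bond, in (-180, 180]."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(x / norm for x in b1)  # unit vector along the central bond
    # Project the outer bonds onto the plane perpendicular to the central bond.
    v = _sub(b0, tuple(_dot(b0, b1) * x for x in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * x for x in b1))
    return math.degrees(math.atan2(_dot(_cross(b1, v), w), _dot(v, w)))


# Made-up coordinates: the first and last atoms lie on the same side (cis).
print(dihedral_degrees((1, 0, 0), (0, 0, 0), (0, 0, 1), (1, 0, 1)))
# Opposite sides (trans).
print(dihedral_degrees((1, 0, 0), (0, 0, 0), (0, 0, 1), (-1, 0, 1)))
```

The same function applies to backbone and side chain dihedrals alike; only the choice of the four atoms differs.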
The geometry of the peptide backbone
The peptide bond is planar & polar:
ω = 180° (trans) or 0° (cis)
• Peptide bond length and angles do not change
• Peptide dihedral angles define structure
The search for the native fold
The Levinthal paradox: a 100-residue protein would require 10^16 seconds to explore all possible conformations and choose the native one.
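The arithmetic behind this kind of estimate can be sketched as below. The number of states per residue and the sampling rate are assumptions chosen for illustration (they are not given on the slide), and the resulting time changes by many orders of magnitude with them; the qualitative conclusion, that exhaustive search is infeasible, does not.

```python
def levinthal_seconds(n_residues, states_per_residue, samples_per_second):
    """Time to enumerate every conformation, one after the other."""
    conformations = states_per_residue ** n_residues
    return conformations / samples_per_second


# Hypothetical parameter choices for a 100-residue protein.
for states in (2, 3):
    seconds = levinthal_seconds(100, states, 10**13)
    print(f"{states} states/residue at 1e13 samples/s: {seconds:.2e} s")
```

Real proteins fold in milliseconds to seconds, so folding cannot be a random exhaustive search.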
Quick collapse to an intermediate state, followed by formation of accurate contacts
Quick collapses followed by unfolding, until a near-native state is reached
Ramachandran plot
[Figure: Ramachandran plots (backbone φ vs. ψ). Panels: all residues except glycine; glycine (flexible backbone)]
Secondary structure: local interactions
Secondary structure – built from backbone hydrogen bonds
α helix
• discovered 1951 by Pauling
• 5-40 aa long
• average: 10 aa
• right handed
• O(i)-NH(i+4) hydrogen bonds: bb atoms satisfied
• π helix: i - i+5
• 3₁₀ helix: i - i+3
• rise: 1.5 Å/residue
Favored: Glu, Ala, Leu, Arg, Met, Lys
Disfavored: Asn, Thr, Cys, Asp, Gly
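The helix geometry above lends itself to a small worked example: the length of a helix from the rise per residue, and the list of backbone hydrogen-bond pairs for each helix type. The function names and the dictionary are introduced here for illustration only; residues are numbered from 1.

```python
# Hydrogen-bond offset k: the C=O of residue i pairs with the N-H of residue i+k.
HBOND_OFFSET = {"3_10": 3, "alpha": 4, "pi": 5}


def helix_length(n_residues, rise_per_residue=1.5):
    """Approximate helix length in Å (1.5 Å rise per residue for an α helix)."""
    return n_residues * rise_per_residue


def backbone_hbonds(n_residues, helix_type="alpha"):
    """(i, i+k) residue pairs joined by backbone hydrogen bonds."""
    k = HBOND_OFFSET[helix_type]
    return [(i, i + k) for i in range(1, n_residues - k + 1)]


print(helix_length(10))              # an average 10-residue α helix
print(backbone_hbonds(10, "alpha"))  # i -> i+4 pairs
```

Note that the first and last few residues of the helix cannot form the full set of backbone hydrogen bonds.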
Frequent amino acids at the N-terminus of α helices
Ncap, N1, N2, N3 … Ccap
• Pro: blocks the continuation of the helix by its side chain
• Asn, Ser: block the continuation of the helix by hydrogen bonding with the donor (NH) of N3
α helix: dipole
• binds negative charges at N-terminus
Representation: helical wheel
1. buried
2. partially exposed: amphipathic helix
3. exposed
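A helical wheel can be sketched numerically by placing consecutive residues 100° apart (3.6 residues per turn). The score below is a simplified, hypothetical measure of amphipathicity: the hydrophobic set and the +1/-1 weights are assumptions for this example, not a published hydrophobicity scale.

```python
import math

ANGLE_PER_RESIDUE = 100            # degrees between consecutive residues
HYDROPHOBIC = set("AVLIMFWC")      # assumed hydrophobic set, illustration only


def wheel_angles(sequence):
    """Angular position (degrees) of each residue on the helical wheel."""
    return [(i * ANGLE_PER_RESIDUE) % 360 for i in range(len(sequence))]


def amphipathic_score(sequence):
    """Length of the mean weighted direction vector (+1 hydrophobic, -1 polar).
    Near 0: residues of one kind all around the wheel (buried or exposed helix).
    Larger: hydrophobic and polar residues on opposite faces (amphipathic)."""
    sx = sy = 0.0
    for aa, angle in zip(sequence, wheel_angles(sequence)):
        weight = 1.0 if aa in HYDROPHOBIC else -1.0
        sx += weight * math.cos(math.radians(angle))
        sy += weight * math.sin(math.radians(angle))
    return math.hypot(sx, sy) / len(sequence)


print(wheel_angles("AAAA"))
print(amphipathic_score("L" * 18))  # uniformly hydrophobic 18-mer
```

The fraction of hydrophobic residues then separates the buried case from the exposed one.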
β-sheet
• Involves several regions in sequence
• Residue side chains point up/down/up ..
• O(i)-NH(j) hydrogen bonds
• Parallel and anti-parallel sheets
Favored: Tyr, Thr, Ile, Phe, Trp
Disfavored: Glu, Ala, Asp, Gly, Pro
Antiparallel β-sheet
• Parallel H-bonds
• Pleated
Beta-hairpin Loops
• Connect strands in antiparallel sheet
[Figure: residues frequent at hairpin loop positions: G, N, D; G; G; S, T]
Parallel β-sheet
• less stable than antiparallel sheet
• angled H-bonds
Connecting elements of secondary structure define tertiary structure
Tertiary structure defines protein function
Loops
• connect helices and strands
• at surface of molecule
• more flexible
• contain functional sites
Important bonds for protein folding and stability
Dipole moments attract each other by van der Waals forces (transient and very weak: 0.1-0.2 kcal/mol)
Hydrophobic interaction – hydrophobic groups/molecules tend to cluster together and shield themselves from the hydrophilic solvent
Interplay of enthalpy and entropy in protein folding
Formation of the aforementioned bonds contributes to the enthalpy of the system, while decreasing the entropy of the protein
ΔG = ΔH - TΔS
• ΔG: change in Gibbs free energy
• ΔH: change in enthalpy
• TΔS: change in the entropic term
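The balance between the two terms follows the standard relation ΔG = ΔH - TΔS, and a small worked example shows how a favorable enthalpy can be largely cancelled by an unfavorable entropy. The numbers below are hypothetical, chosen only to illustrate the arithmetic; they are not measurements for any particular protein.

```python
def gibbs_free_energy(delta_h, delta_s, temperature):
    """ΔG = ΔH - TΔS (ΔH in kcal/mol, ΔS in kcal/(mol·K), T in K)."""
    return delta_h - temperature * delta_s


# Hypothetical folding values: favorable enthalpy, unfavorable chain entropy.
delta_h = -50.0   # kcal/mol (assumed)
delta_s = -0.15   # kcal/(mol·K) (assumed)

delta_g = gibbs_free_energy(delta_h, delta_s, 298.15)
print(f"ΔG at 298.15 K: {delta_g:.2f} kcal/mol")

# Temperature at which ΔG = 0 (folded and unfolded states equally populated).
t_mid = delta_h / delta_s
print(f"ΔG = 0 at T = {t_mid:.1f} K")
```

With these values the net stability is only a few kcal/mol, a small difference between two large opposing terms.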
The hydrophobic effect
• A central effect in protein folding
• Driven by entropy – gain in the entropy of water molecules
Water molecules near hydrophobic elements have less freedom to form and break hydrogen bonds with neighboring waters
After clustering: more water molecules are not in direct contact with hydrophobic elements
Figures from post by Dr. Steve Mack on www.madsci.org
The quaternary structure of a protein defines its biological functional unit
Quaternary structure: assembly of protein domains (from two distinct protein chains, or two domains in one protein sequence)
Glyceraldehyde phosphate dehydrogenase:
• domain 1 binds the substance to be metabolized
• domain 2 binds a cofactor
1. Introduction to Computational Structural Biology
Experimental determination of protein structure: X-ray diffraction and NMR
X-ray diffraction
• Rotation of the crystal enables recording different diffraction patterns
• Resolution is determined by the diffraction angles: higher-angle peaks → higher resolution
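The link between diffraction angle and resolution is Bragg's law, λ = 2d·sin θ: a reflection at a larger angle θ corresponds to a smaller spacing d, i.e. finer detail. The sketch below assumes a Cu Kα wavelength of 1.5418 Å, a common laboratory source that is not mentioned on the slide; θ is half of the scattering angle 2θ.

```python
import math

CU_K_ALPHA = 1.5418  # Å, assumed X-ray wavelength


def resolution_angstrom(theta_degrees, wavelength=CU_K_ALPHA):
    """Bragg spacing d = λ / (2 sin θ) for a reflection at Bragg angle θ."""
    return wavelength / (2.0 * math.sin(math.radians(theta_degrees)))


for theta in (10, 20, 30):
    print(f"θ = {theta}°: d = {resolution_angstrom(theta):.2f} Å")
```

Numerically smaller d means higher resolution, so recording peaks at higher angles yields a more detailed electron density map.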
NMR (Nuclear Magnetic Resonance)
• NMR-active nuclei (with spin): ¹H, ¹³C, ¹⁵N
• Application of a magnetic field reorients spins – measure resonance between close nuclei
• Extract constraints & determine structure
• More constraints – better-defined structure
Experimental determination of structure
X-ray crystallography:
• Determines electron density – positions of atoms in structure
• Highly accurate
• Technically challenging
• Depends on crystal (static; artifacts?)
NMR:
• Determines constraints between labeled spins
• Allows measurement of structure in solution
Progress in experimental determination of structures
• 1950s: first protein structure solved by Kendrew & Perutz: sperm whale myoglobin
• Today: ~114,000 structures solved, most by X-ray crystallography