Development of Software Package for Determining Protein Titration Properties
Final Presentation, Winter 2010
By Kaila Bennett, Amitoj Chopra, Jesse Johnson, Enrico Sagullo
Background
Electrostatic interactions are very important for protein functions, which include:
Binding
Enzymatic catalysis
Conformational transitions
Electrostatic Interaction
Stability
Ionizable amino acids
Electrostatic interactions
Salt Bridges
Dipole-Dipole
Coulombic interaction
Facilitate interactions with aqueous environments
Mediate polar contributions to biological processes
Figure: electrostatic potential (isopotential contours); red represents negative and blue represents positive potential
Protein functions such as catalysis depend on the protonation state of ionizable amino acid residues
The pKa of a single amino acid is the pH at which it is 50% protonated
pKa values are environment dependent
The environment may cause shifts in pKa
pKa values are important for understanding many biological processes
Intrinsic pKa: pKa for one amino acid
Apparent pKa: pKa in the context of the entire protein
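The statement that pKa corresponds to 50% protonation can be illustrated with the Henderson-Hasselbalch relation. The sketch below is a minimal illustration, not part of the presented software; the function name and the pKa value of 4.0 are hypothetical.

```python
def protonated_fraction(ph: float, pka: float) -> float:
    """Fraction of a titratable group that is protonated at a given pH.

    Uses the Henderson-Hasselbalch relation: 1 / (1 + 10**(pH - pKa)).
    """
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


if __name__ == "__main__":
    pka = 4.0  # hypothetical pKa, chosen only for illustration
    for ph in (2.0, 4.0, 6.0):
        print(f"pH {ph:4.1f}: {protonated_fraction(ph, pka):.3f} protonated")
```

At pH equal to pKa the fraction is one half, and an environment-induced pKa shift moves the whole curve along the pH axis.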
Figure: partial charge versus pH (0 to 14) for VCP E108, VCP E120, SPICE K108, and SPICE K120
Catalysis
Asp102 of chymotrypsin forms a hydrogen bond with His57, which increases the pKa of His57
His57 can then accept a proton from Ser195, which activates the serine protease for cleavage of the substrate
The pKa shift is important for each chemical reaction in the catalytic mechanism
It is necessary for donating protons to, and abstracting protons from, neighboring groups
Without the pKa shift of His57, catalysis would not be possible!
Salt Bridge
pKa shifts also affect intermolecular salt bridges
Salt bridges are short-range Coulombic interactions that occur between two ionizable amino acid residues
Figure: from S. Fischer et al., Proteins 2009
Conformation Change
Another important biological process that depends on pKa and the environment is the conformational transition of proteins
Conformational switch
Figure panels: partial charge versus pH for Tyr67, Tyr100, Tyr115, Tyr177, Asp68, Asp76, Asp144, Asp160, Glu162, and Glu173; helix (h) to coil (c) cycle relating Gh-c(neutral), Gh,ion(pH), Gc,ion(pH), and Gh-c(pH) for the neutral and ionized states; catalytic site with His108, His119, His121, His132, and His137
Figure: Morikis et al, Protein Sci 2001
Binding
Background
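Linearized Poisson-Boltzmann Equation (LPBE)
$$\nabla \cdot \varepsilon(\mathbf{r}) \nabla \varphi(\mathbf{r}) - \varepsilon(\mathbf{r}) \kappa^2(\mathbf{r}) \varphi(\mathbf{r}) = -\frac{4\pi e^2}{\varepsilon_0 k_B T} \sum_{i=1}^{F} z_i \, \delta(\mathbf{r} - \mathbf{r}_i)$$
$$\kappa^2(\mathbf{r}) = \frac{4\pi e^2 I}{\varepsilon_0 k_B T}, \qquad I = \frac{1}{2} \sum_{i=1}^{M} z_i^2 n_i^0$$
ε: dielectric coefficient
κ: ion accessibility function
I: ionic strength
q: charge
φ: electrostatic potential
Figure labels: ε high, ε low, ε surface, κ surface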
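The ionic strength that enters the LPBE is half the sum of squared ion charges weighted by concentration. The sketch below evaluates that sum; treating the 150 mM ionic strength used later as a 1:1 salt such as NaCl is an assumption made only for illustration, and the function name is hypothetical.

```python
def ionic_strength(ions):
    """Return I = 1/2 * sum(z_i**2 * n_i) for (charge, molar concentration) pairs."""
    return 0.5 * sum(z * z * n for z, n in ions)


# Assumed 1:1 electrolyte at 150 mM: one cation species and one anion species.
monovalent_salt_150mM = [(+1, 0.150), (-1, 0.150)]

if __name__ == "__main__":
    print(f"I = {ionic_strength(monovalent_salt_150mM):.3f} M")
```

For a 1:1 salt the ionic strength equals the salt concentration; multivalent ions contribute more because the charge enters squared.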
Electrostatic Free Energies
$$G_{\text{electro}} = \frac{1}{2} \sum_i q_i \varphi_i$$
Quantities: q, ε, κ, φ
Figure labels: κ = 0; κ ≠ 0; solvent charges; partial charges (electric dipoles); background charges
Courtesy of C. Kieslich
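The electrostatic free energy on this slide is half the sum of each charge times the potential at its position. A minimal sketch of that sum follows; the function name and the input numbers are hypothetical, and the units are whatever consistent units the charges and potentials are given in.

```python
def electrostatic_energy(charges, potentials):
    """Return 1/2 * sum(q_i * phi_i) over matching charges and potentials."""
    if len(charges) != len(potentials):
        raise ValueError("charges and potentials must have the same length")
    return 0.5 * sum(q * phi for q, phi in zip(charges, potentials))


if __name__ == "__main__":
    # Made-up values purely for illustration.
    print(electrostatic_energy([1.0, -1.0], [2.0, 4.0]))
```

In the presented workflow this sum is evaluated by APBS from the solved potential rather than by hand.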
Background
Intrinsic pKa is calculated from the free energies of the thermodynamic cycle
The thermodynamic cycle has four proposed states:
1 - Neutral to charged, bound
2 - Bound charged to free charged amino acid
3 - Neutral to charged, free
4 - Bound neutral to free neutral amino acid
This method also allows calculation of free energy values
This ultimately allows the elucidation of intrinsic pKa values and titration curves
Figure: thermodynamic cycle connecting AH and A- in the polymer (bound) and free, with steps 1 to 4
Background
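$$\Delta G_{\text{free}} - \Delta G_{\text{protein}} + \Delta G_{\text{neutral}} - \Delta G_{\text{charge}} = 0$$
$$\Delta\Delta G_{n \to c} = \Delta G_{\text{charge}} - \Delta G_{\text{neutral}}$$
$$K_a^{\text{protein}} = e^{-\Delta G_{\text{protein}} / RT}$$
$$pK_a^{\text{protein}} = -\log\left(K_a^{\text{protein}}\right) = \frac{\Delta G_{\text{protein}}}{2.303RT}, \qquad pK_a^{\text{free}} = \frac{\Delta G_{\text{free}}}{2.303RT}$$
$$2.303RT \left( pK_a^{\text{free}} - pK_a^{\text{protein}} \right) = \Delta\Delta G_{n \to c}$$
$$pK_a^{\text{protein}} = pK_a^{\text{free}} - \frac{\Delta\Delta G_{n \to c}}{2.303RT}$$
Figure: Courtesy of Morikis et al
Adapted from lecture notes of Bioengineering 135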
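The final relation of this derivation, pKa(protein) = pKa(free) - ΔΔG/(2.303RT), can be evaluated directly. The sketch below assumes ΔΔG is given in kJ/mol and uses ln(10) in place of the rounded 2.303; the function name and the example numbers are hypothetical, not values from the presented scripts.

```python
import math

R_KJ_PER_MOL_K = 8.314462618e-3  # gas constant in kJ/(mol*K)


def thermal_energy(temperature_k: float = 298.15) -> float:
    """Return ln(10) * R * T in kJ/mol, the energy of one pKa unit."""
    return math.log(10.0) * R_KJ_PER_MOL_K * temperature_k


def intrinsic_pka(model_pka: float, ddg_kj_per_mol: float,
                  temperature_k: float = 298.15) -> float:
    """pKa(protein) = pKa(free) - ddG / (2.303 * R * T)."""
    return model_pka - ddg_kj_per_mol / thermal_energy(temperature_k)


if __name__ == "__main__":
    # Hypothetical model pKa and free energy difference, for illustration only.
    print(f"2.303RT at 298.15 K = {thermal_energy():.3f} kJ/mol")
    print(f"pKa = {intrinsic_pka(4.0, -5.0):.2f}")
```

At 298.15 K one pKa unit corresponds to roughly 5.7 kJ/mol, so a free energy difference of about 50 kJ/mol shifts the pKa by close to nine units.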
Background (PDB file)
The Protein Data Bank (PDB) archive is the single worldwide repository of 3-D structures of proteins and other large biological molecules.
A PDB file is a file downloaded from the databank that contains the information about a protein needed for 3-D modeling and for our calculations.
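A PDB file stores each atom as a fixed-column ATOM record. The sketch below reads the main fields of one such record using the standard PDB column positions; it is a minimal illustration, not the reader used in the presented scripts, and the example record (atom, residue, and coordinates) is made up.

```python
from typing import NamedTuple


class Atom(NamedTuple):
    serial: int
    name: str
    res_name: str
    chain: str
    res_seq: int
    x: float
    y: float
    z: float


def parse_atom_line(line: str) -> Atom:
    """Parse one ATOM/HETATM record using the fixed PDB column layout."""
    if not line.startswith(("ATOM", "HETATM")):
        raise ValueError("not an ATOM/HETATM record")
    return Atom(
        serial=int(line[6:11]),        # columns 7-11
        name=line[12:16].strip(),      # columns 13-16
        res_name=line[17:20].strip(),  # columns 18-20
        chain=line[21].strip(),        # column 22
        res_seq=int(line[22:26]),      # columns 23-26
        x=float(line[30:38]),          # columns 31-38
        y=float(line[38:46]),          # columns 39-46
        z=float(line[46:54]),          # columns 47-54
    )


# Made-up record, assembled field by field so the column widths are explicit.
EXAMPLE_LINE = (
    "ATOM  " "    1" " " " N  " " " "GLU" " " "A" "  35" "    "
    "  11.104" "   6.134" "  -6.504"
)

if __name__ == "__main__":
    print(parse_atom_line(EXAMPLE_LINE))
```

A PQR file keeps the same kind of per-atom records but carries a charge and a radius in place of the occupancy and temperature factor fields.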
Background
PDB → PQR → APBS
Converting a PDB file to a PQR file involves several modifications. These modifications include:
Adding a limited number of missing heavy atoms
Placing polar hydrogens
Optimizing the protein for favorable hydrogen bonding
Removing unfavorable van der Waals clashes (when two atoms try to occupy the same space)
Assigning charges (partial or whole) and van der Waals radii from a variety of force fields
Rationale
Developing a software package that not only incorporates APBS to calculate free energies but also calculates protein titration characteristics will ultimately help to elucidate protein stability, catalysis, salt bridges, and binding
Figure: Test case protein 1LY2
Experimental Procedure (So Far)
1 - Make two PDB files (neutral and charged)
    Two PDBs are made to be incorporated into the free energy calculations
    One neutral PDB contains all the amino acids in their neutral forms
    One charged PDB contains all the amino acids in their charged forms
2 - PDB to PQR
    Take the cleaned PDB file and convert it to a PQR file to make it compatible with the APBS software
    Obtain a PDB to PQR converter
    Call the Python converter from R
    Use a system call from R to convert the file
3 - Generate the four states of the thermodynamic cycle (TC)
    Charged and neutral PQRs are combined and trimmed to make the four states of the thermodynamic cycle
    Each ionizable amino acid is placed, in its charged form, within the neutral protein, leaving one residue charged and the rest neutral, to make the first state
    The charged and neutral amino acids by themselves correspond to two of the states
    The last state is the neutral PQR by itself
4 - Call APBS
    The newly converted PQR files are used for energy calculations with the APBS software
    Make four PQR files that correspond to the four states of the thermodynamic cycle
    Develop a template input file that is edited through scripts to make a specific input file
    The template is read in, edited, and then written to a new input file
    Use a system call with the new input file to calculate free energies using APBS
5 - Calculate intrinsic pKa
    Each ΔG value is used to calculate the pKa of its corresponding residue
    The ΔG values are first divided by the thermal energy (2.303RT) and then subtracted from the model pKa
Experimental Parameters
Ionic strength   Dielectric, solvent (ε high)   Dielectric, solute (ε low)   Temp
150.0 mM         80.0                           20.0                         298.15 K

Box size (Å)          X     Y     Z
fglen                 100   100   100
cglen                 100   100   100
Grid size (points)    129   129   129
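These parameters map onto keywords in an APBS input file (`dime`, `cglen`, `fglen`, `pdie`, `sdie`, `temp`). The sketch below renders them as keyword lines; it is an illustrative fragment rather than a complete APBS input, the dictionary layout and function name are hypothetical, and the ion statements that would carry the 150 mM ionic strength are left out on purpose.

```python
# Values taken from the Experimental Parameters slide.
PARAMS = {
    "solvent_dielectric": 80.0,
    "solute_dielectric": 20.0,
    "temperature_k": 298.15,
    "fglen": (100, 100, 100),  # fine grid lengths in angstroms
    "cglen": (100, 100, 100),  # coarse grid lengths in angstroms
    "dime": (129, 129, 129),   # grid points per axis
}


def render_apbs_lines(params):
    """Return APBS keyword lines for the grid, dielectric, and temperature."""
    def triple(values):
        return " ".join(str(v) for v in values)

    return [
        f"dime {triple(params['dime'])}",
        f"cglen {triple(params['cglen'])}",
        f"fglen {triple(params['fglen'])}",
        f"pdie {params['solute_dielectric']}",
        f"sdie {params['solvent_dielectric']}",
        f"temp {params['temperature_k']}",
    ]


if __name__ == "__main__":
    print("\n".join(render_apbs_lines(PARAMS)))
```

Keeping the numbers in one dictionary makes it easier to confirm that all four thermodynamic-cycle calculations use identical settings.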
Results (PDB2PQR)
Code (General):
$ python pdb2pqr.py [options] --ff={forcefield} {path} {output-path}
Forcefield: PARSE is used to assign van der Waals radii and atomic charges
Path: where the PDB file is located
Output_path: where the PQR file is to be generated
Figure: Protein 1LY2
Code used in program:
system("python /Users/senior_design/pdb2pqr-1.5/pdb2pqr.py --ff=parse 1LY2.pdb 1LY2.pqr")
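The same conversion can be driven from Python. The sketch below builds the argument list following the general usage line above and passes it to `subprocess.run`; the function names are hypothetical, and the script path in the usage comment is copied from the slide, not verified.

```python
import subprocess


def build_pdb2pqr_command(script_path, pdb_path, pqr_path, forcefield="parse"):
    """Build the argument list: python pdb2pqr.py --ff=<forcefield> <in> <out>."""
    return ["python", script_path, f"--ff={forcefield}", pdb_path, pqr_path]


def run_pdb2pqr(script_path, pdb_path, pqr_path, forcefield="parse"):
    """Run the converter and raise CalledProcessError if it exits with an error."""
    command = build_pdb2pqr_command(script_path, pdb_path, pqr_path, forcefield)
    return subprocess.run(command, check=True)


# Example call (requires a local pdb2pqr installation):
# run_pdb2pqr("/Users/senior_design/pdb2pqr-1.5/pdb2pqr.py", "1LY2.pdb", "1LY2.pqr")
```

Passing the arguments as a list avoids shell quoting problems, and `check=True` makes a failed conversion stop the pipeline instead of silently producing no PQR file.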
Results ( Neutral and Charge)
library(bio3d)  # provides atom.select() and write.pdb()

# Rename ionizable residues to their neutral-form residue names (column 4 holds the residue name)
Neu_Char_pdb <- function(pdb)
{
  x <- pdb
  x$atom[atom.select(x, resid = "ASP")$atom, 4] <- sub("ASP", "ASH", x$atom[atom.select(x, resid = "ASP")$atom, 4])
  x$atom[atom.select(x, resid = "GLU")$atom, 4] <- sub("GLU", "GLH", x$atom[atom.select(x, resid = "GLU")$atom, 4])
  x$atom[atom.select(x, resid = "LYS")$atom, 4] <- sub("LYS", "LYN", x$atom[atom.select(x, resid = "LYS")$atom, 4])
  x$atom[atom.select(x, resid = "ARG")$atom, 4] <- sub("ARG", "AR0", x$atom[atom.select(x, resid = "ARG")$atom, 4])
  write.pdb(pdb = x, file = "1ly2_neutral")
}
Generates the neutral and charged PDBs
The newly generated PDBs will be incorporated into the calculation of free energies
Results (Call APBS Script)
# Read in our input template
con <- file("apbs_template.in", "r")
in_file <- readLines(con)
close(con)

# Four PQR files, which correspond to the four states of the TC
bdp_file <- "1LY2_noGLU35.pqr"
bp_file <- "1LY2_GLU35.pqr"
fdp_file <- "GLU35_no.pqr"
fp_file <- "GLU35.pqr"
length <- 100
width <- 100
height <- 100

# Write a new input file with our specific parameters
in_file[2] <- paste("    mol pqr ", bdp_file, sep = "")
in_file[3] <- paste("    mol pqr ", bp_file, sep = "")
in_file[4] <- paste("    mol pqr ", fdp_file, sep = "")
in_file[5] <- paste("    mol pqr ", fp_file, sep = "")
# Each of the four calculations has a cglen line followed by an fglen line
for (i in c(11, 34, 57, 80)) {
  in_file[i] <- paste("    cglen ", length, width, height, sep = " ")
  in_file[i + 1] <- paste("    fglen ", length, width, height, sep = " ")
}
con <- file("infile.in", "w")
writeLines(in_file, con, sep = "\n")
close(con)

# System call to APBS to use the new input file and calculate free energies
TC <- system(paste("/apbs-1.2-mac-univ/bin/apbs", "infile.in", ">", "outfile.txt", sep = " "))
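The script above edits the template by fixed line numbers, which breaks if the template gains or loses a line. One alternative, sketched here in Python, is to mark the editable spots with named placeholders and substitute them. The template fragment, placeholder names, and function name below are hypothetical; this is not the group's template or a complete APBS input.

```python
from string import Template

# Hypothetical template fragment with named placeholders instead of line numbers.
TEMPLATE_TEXT = """read
    mol pqr $bound_deprotonated
    mol pqr $bound_protonated
    mol pqr $free_deprotonated
    mol pqr $free_protonated
end
elec
    cglen $box
    fglen $box
end
"""

# File names as used in the R script above.
PQR_FILES = {
    "bound_deprotonated": "1LY2_noGLU35.pqr",
    "bound_protonated": "1LY2_GLU35.pqr",
    "free_deprotonated": "GLU35_no.pqr",
    "free_protonated": "GLU35.pqr",
}


def fill_template(text, pqr_files, box):
    """Substitute PQR file names and box lengths; raises KeyError if one is missing."""
    box_text = " ".join(str(length) for length in box)
    return Template(text).substitute(box=box_text, **pqr_files)


if __name__ == "__main__":
    print(fill_template(TEMPLATE_TEXT, PQR_FILES, (100, 100, 100)))
```

Because `substitute` fails loudly on a missing placeholder, a template edit that drops a field is caught before APBS is ever called.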
Results (Free Energy Calc.)
# Indexing: start at the residue number of the first residue
k <- as.numeric(neutral_pqr$atom[1, "resno"])
end_of_seq <- length(seq.pdb(neutral_pqr)) - 1
seq <- our_seq(LY2, end_of_seq)
AAdf <- NULL
# For loop to run through the sequence one amino acid at a time
for (i in seq)
{
  # Only ionizable residues are processed
  if (i %in% c("R", "K", "H", "C", "Y", "D", "E"))
  {
    Before <- trim.pdb(neutral_pqr, atom.select(neutral_pqr, resno = 1:(k - 1)))
    Free_protonated <- trim.pdb(charged_pqr, atom.select(charged_pqr, resno = k))
    After <- trim.pdb(neutral_pqr, atom.select(neutral_pqr, resno = (k + 1):end_of_seq))
    Free_deprotonated <- trim.pdb(neutral_pqr, atom.select(neutral_pqr, resno = k))
    write.pqr(Free_protonated, file = "Free_protonated.pqr")
    Before_FP <- cat_pdb(Before, Free_protonated)
    Total <- cat_pdb(Before_FP, After)
    write.pqr(Total, file = "Bound_Protonated.pqr")
    write.pqr(Free_deprotonated, file = "Free_deprotonated.pqr")
    bp <- read.pqr("Bound_Protonated.pqr")
    bdp <- read.pqr("1ly2_neutral.pqr")
    fp <- read.pqr("Free_protonated.pqr")
    fdp <- read.pqr("Free_deprotonated.pqr")
    # Calls APBS for every ionizable amino acid to calculate specific ΔG values
    delta_G <- call_apbs(in_file)
    AAdf <- rbind(AAdf, c("Resid" = i, "Resno" = k + 1, "delta_G" = delta_G))
  }
  k <- k + 1
}
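The loop above walks the sequence and acts only on the ionizable one-letter codes R, K, H, C, Y, D, and E. The sketch below isolates that selection step in Python; the function name and the short example sequence are hypothetical, and residue numbering is assumed to be contiguous from `first_resno`.

```python
# One-letter codes treated as ionizable in the loop above.
IONIZABLE = frozenset("RKHCYDE")


def ionizable_positions(sequence, first_resno=1):
    """Return (residue number, one-letter code) for each ionizable residue."""
    return [
        (first_resno + offset, residue)
        for offset, residue in enumerate(sequence)
        if residue in IONIZABLE
    ]


if __name__ == "__main__":
    # Made-up sequence for illustration only.
    for resno, residue in ionizable_positions("MKDAEGY"):
        print(resno, residue)
```

Collecting the positions first makes it possible to check how many APBS runs a protein will need before starting any of them.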
Results (Intrinsic pKa)
Protein 1LY2
Residue          Average ΔG (kJ/mol)   Average pKa
Arginine         -50.21                20.79
Aspartic Acid    -50.29                12.71
Cysteine         -50.30                17.09
Glutamic Acid    -48.86                12.86
Histidine        -47.94                14.28
Lysine           -50.29                19.31
Tyrosine         -46.51                18.24
Discussion
We believe that our ΔG values may be off by an order of magnitude
If the ΔG values are off by an order of magnitude, this would throw off our pKa values as well
A complete evaluation of all scripts will be done to check that they run the right calculations
Special attention will be given to the APBS template file
The pKa values are off because the free energies are off
However, the pKa values of the acidic amino acid residues are lower than those of the basic amino acid residues
pKa values from established software with the same parameters:
Arginine = 10.7
Aspartic Acid = 3.1
Cysteine = N/A (the software doesn't recognize cysteine as ionizable)
Glutamic Acid = 2.6
Histidine = 5.2
Lysine = 10.9
Tyrosine = 9.6
Values courtesy of H++ software
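To put a number on how far the calculated values sit from the reference values, the two lists can be subtracted residue by residue. The sketch below uses the average pKa values from the results slide and the H++ values listed above; the function name is hypothetical, and cysteine is left out because no reference value is available.

```python
# Average intrinsic pKa values from the results slide.
CALCULATED = {
    "Arginine": 20.79,
    "Aspartic Acid": 12.71,
    "Glutamic Acid": 12.86,
    "Histidine": 14.28,
    "Lysine": 19.31,
    "Tyrosine": 18.24,
}

# Reference values listed on the discussion slide (H++).
REFERENCE = {
    "Arginine": 10.7,
    "Aspartic Acid": 3.1,
    "Glutamic Acid": 2.6,
    "Histidine": 5.2,
    "Lysine": 10.9,
    "Tyrosine": 9.6,
}


def pka_offsets(calculated, reference):
    """Return calculated minus reference pKa for residues present in both."""
    return {
        name: round(calculated[name] - reference[name], 2)
        for name in reference
        if name in calculated
    }


if __name__ == "__main__":
    for name, offset in pka_offsets(CALCULATED, REFERENCE).items():
        print(f"{name:14s} {offset:+.2f}")
```

The offsets fall between about 8.4 and 10.3 pKa units for every residue type, which points to a systematic error in the free energies rather than a residue-specific one.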
Progress Tracker (Winter)
Gantt chart, 1/2/10 to 3/13/10 (axis: duration in days):
Learn R
Conversion of PDB to PQR
Function to call APBS from R
Calculations of Free Energies
Retrieving of Charged/Neutral from PDB
Calculate Intrinsic pKa
Combination of all Scripts
Reduce Run Time on Scripts
Future work
Gantt chart, 3/29/10 to 6/7/10 (axis: duration in days):
Calculate Apparent pKa by using Intrinsic pKa
Clustering or Monte Carlo
Generate Titration Curves
Thermodynamic Stability of Protein
User Friendly Graphical Interface
Conclusion
Developed and refined scripts that take in PDB files and convert them to neutral and charged PQR files
Developed and refined scripts that take neutral and charged PQR files and generate files that correspond to the four states of the thermodynamic cycle
Integrated all code to run sequentially to calculate free energies and pKa
Successfully took the protein 1LY2 PDB file and calculated intrinsic pKa values for all ionizable amino acids of 1LY2
Acknowledgments
Dr. Dimitrios Morikis
Chris Kieslich
Ronald Gorham
Dr. Jerome Schultz
Gokul Upadhyayula
Hong Xu
Dr. Thomas Girke
References
Questions?
Our group would like to mention that no computers were injured in the making of the software package