David Kim
Allergan Inc.
SoCalBSI
California State University, Los Angeles
Objective
Develop a model to predict corneal
permeability based on literature
compounds
Introduction
Ocular drug delivery mechanism
(through cornea and/or conjunctiva)
Focus of the project is the corneal route
Three major cell layers of the Cornea
Why predict corneal permeability?
Allergan, Inc. develops drugs which are
administered through the eye
A drug is only effective if it can reach its
target tissue
Predicting whether a drug can pass through
the cornea before it is synthesized can save
the company time and money
Introduction
Few models have been developed to
predict corneal permeability
Congeneric model (one class of compounds)
Non-congeneric model (multiple classes of compounds)
Develop non-congeneric model
focused on drug-like compounds
Literature
• Compound names
• logPC and logD
• Structures of compounds
Generate descriptor values
Filter descriptors (intuitively)
Find optimal training and testing set percentage
Run Partial Least Squares modeling
Statistical analysis
Pick best model
Remove descriptor with the lowest importance
Rebuild model
Final Model
Log PC = log of the Permeability Coefficient (cm/s)
Partition Coefficient:
Log D = log of the Distribution Coefficient (pH 7.65)
Yoshida, F., Topliss, J.G., J. Pharm. Sci. 85, 819-823 (1996)
Compounds in Literature
Went through published literature
Filtered compounds to keep only drug-like
compounds
Came up with 30 compounds and their
measured permeability
Next step in our model building process is to
produce descriptors for each of our
compounds
Descriptors
Molecular weight or volume
Degree of ionization
Aqueous solubility
Hydrogen-bonding
Log D
Polar surface area (PSA)
pKa
Solvent accessible surface area
Schrödinger Software
Named after Erwin Schrödinger, Nobel Prize
winner known for the Schrödinger equation
of quantum mechanics
Suite of various programs dealing with
computational chemistry
Two programs used:
Maestro – calculate descriptor values
Canvas – generate model
Maestro Program
Can generate 77 descriptors
Can manually input descriptors (e.g., log D)
Filtered descriptors which do not deal with
permeability (intuitively) to reduce noise
Came up with 30 descriptors to use
Export the 30 compounds and their 30
descriptors to Canvas
Canvas Program
Partial Least Squares (PLS) modeling
Can specify what descriptors to use to build
the model
Can specify the compounds used for
training and testing the model
Model assessment: corresponding statistics
of the model
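As a rough illustration of what PLS modeling does with a descriptor table, here is a minimal pure-Python sketch of single-response PLS (PLS1, NIPALS form). It is not Canvas's implementation; the function names (`pls1_fit`, `pls1_predict`, `pls1_coefficients`) and the toy data are hypothetical, and the inputs are assumed to be already mean-centered.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def column(X, j):
    return [row[j] for row in X]

def pls1_fit(X, y, n_factors):
    """Fit PLS1 on centered X (list of rows) and centered y.
    Returns one (weights, loadings, y_loading) triple per factor."""
    X = [row[:] for row in X]  # work on copies; both are deflated below
    y = y[:]
    m = len(X[0])
    factors = []
    for _ in range(n_factors):
        # Weight vector: direction with maximal covariance with y
        w = [dot(column(X, j), y) for j in range(m)]
        norm = math.sqrt(dot(w, w))
        if norm < 1e-12:  # nothing left to explain
            break
        w = [v / norm for v in w]
        t = [dot(row, w) for row in X]  # scores
        tt = dot(t, t)
        p = [dot(column(X, j), t) / tt for j in range(m)]  # X loadings
        q = dot(y, t) / tt  # y loading
        # Deflate X and y by the part explained by this factor
        X = [[row[j] - ti * p[j] for j in range(m)] for row, ti in zip(X, t)]
        y = [yi - q * ti for yi, ti in zip(y, t)]
        factors.append((w, p, q))
    return factors

def pls1_predict(factors, x):
    """Predict centered y for one centered descriptor row."""
    x = x[:]
    y_hat = 0.0
    for w, p, q in factors:
        t = dot(x, w)
        y_hat += q * t
        x = [xj - t * pj for xj, pj in zip(x, p)]
    return y_hat

def pls1_coefficients(factors, m):
    """The prediction is linear in x, so coefficient j is the prediction for unit vector e_j."""
    return [pls1_predict(factors, [1.0 if k == j else 0.0 for k in range(m)])
            for j in range(m)]

# Toy, already-centered data (hypothetical): y = 2*x1 - x2 exactly
X = [[-2.0, -1.0], [-1.0, -1.0], [0.0, 0.0], [1.0, 0.0], [2.0, 2.0]]
y = [2 * a - b for a, b in X]
factors = pls1_fit(X, y, n_factors=2)
bx = pls1_coefficients(factors, 2)
print(bx)
```

With as many factors as descriptors the fit coincides with ordinary least squares, which is why the toy coefficients come back as roughly 2 and -1. A real model would normally use fewer factors than descriptors.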
Statistics
Training Set
Standard deviation (SD) – low
Coefficient of determination (R2) – high (close to 1)
Coefficient of determination, cross validation (R2-CV) –
high (close to 1)
Stability – close to 1
F-statistic (overall significance of the model) – high
P-value (probability that the correlation happened by
chance) – low (< 0.01)
Statistics
Testing Set
Root Mean Squared Error (RMSE) – low
Q2 – high (close to 1)
Pearson correlation coefficient (r-Pearson) – high (close
to 1)
Important for the assessment of what percentage of the
compounds we want to use for the training set
Important for the assessment of our model as we start to
remove unnecessary descriptors
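The test-set statistics can be sketched with common textbook definitions. This assumes Q2 is the coefficient of determination computed on the test compounds; Canvas's exact formulas may differ. The observed and predicted log PC values below are made up for illustration.

```python
import math

def r_squared(observed, predicted):
    """1 - SS_res / SS_tot; called Q2 here when applied to the test set."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

def rmse(observed, predicted):
    """Root mean squared error of the predictions."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)

def pearson_r(a, b):
    """Pearson correlation coefficient between two equal-length lists."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

# Hypothetical test-set values (log PC), not data from the project
observed = [-5.0, -4.5, -6.0, -5.5]
predicted = [-5.1, -4.4, -5.8, -5.7]

q2 = r_squared(observed, predicted)
error = rmse(observed, predicted)
r = pearson_r(observed, predicted)
print(round(q2, 3), round(error, 3), round(r, 3))
```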
Finding the ideal training set
percentage
Ran PLS modeling specifying various percentages
to use for the training set
40%, 50%, 60%, 70%, 80%
Looked at the statistics of each of the models built
Found that using 80% of the compounds for the
training set was ideal
30 compounds found in literature
24 in training set and 6 in the testing set
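The set sizes for each training percentage follow directly from the 30 compounds. A small sketch, assuming a random split (the slides do not say how compounds were assigned to each set); `split_compounds` and the compound labels are hypothetical:

```python
import random

def split_compounds(names, train_percent, seed=0):
    """Shuffle the compounds and split them into training and testing sets."""
    shuffled = names[:]
    random.Random(seed).shuffle(shuffled)
    n_train = len(shuffled) * train_percent // 100
    return shuffled[:n_train], shuffled[n_train:]

compounds = [f"compound_{i:02d}" for i in range(1, 31)]  # placeholder labels

sizes = {}
for percent in (40, 50, 60, 70, 80):
    train, test = split_compounds(compounds, percent)
    sizes[percent] = (len(train), len(test))

print(sizes)
```

At 80% this gives the 24 training and 6 testing compounds quoted above.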
bx coefficient
After the PLS model is built, it gives the bx coefficient
for each descriptor in order to predict permeability
The bx coefficient is the weight that the model puts on
the descriptor after the descriptor values have been
scaled
Example:
log PC = 0.348(scaled MW) - 0.221(scaled log D) - 0.002(scaled log P) …
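To make the weighting concrete, the sketch below scales three descriptors and sums the three terms shown in the example. It assumes "scaled" means autoscaling (mean-centering and dividing by the sample standard deviation), which the slides do not specify. The descriptor values are hypothetical, and the result is only a partial sum because the remaining terms of the example are elided.

```python
import math

def autoscale(values):
    """Mean-center and divide by the sample standard deviation (assumed scaling)."""
    mean = sum(values) / len(values)
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (len(values) - 1))
    return [(v - mean) / sd for v in values]

# Hypothetical descriptor values for three compounds
descriptors = {
    "MW": [180.0, 250.0, 320.0],
    "log D": [0.5, 1.5, 2.5],
    "log P": [1.0, 2.0, 3.0],
}

# bx coefficients for the three terms shown in the example
bx = {"MW": 0.348, "log D": -0.221, "log P": -0.002}

scaled = {name: autoscale(values) for name, values in descriptors.items()}

# Contribution of these three terms only; the other terms are not shown
partial_sums = [
    sum(bx[name] * scaled[name][i] for name in bx)
    for i in range(3)
]
print(partial_sums)
```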
Removal of descriptors
Started with 30 descriptors and built a model
Identified the descriptor whose bx coefficient has the smallest
magnitude (lowest importance) and removed it
Rebuilt model with 29 descriptors
Repeated, tracking the statistics at each step to know
when to stop
Example:
log PC = 0.348(scaled MW) - 0.221(scaled log D) - 0.002(scaled log P) … (30 descriptors)
log PC = 0.392(scaled MW) - 0.183(scaled log D) … (29 descriptors)
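The removal loop can be sketched as follows. To stay short it refits a one-factor PLS model at each step, which is a simplification and not necessarily what Canvas does, and it stops at a fixed descriptor count instead of judging the tracked statistics. The descriptor names and values are hypothetical.

```python
import math

def autoscale(values):
    """Mean-center and divide by the sample standard deviation."""
    mean = sum(values) / len(values)
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (len(values) - 1))
    return [(v - mean) / sd for v in values]

def one_factor_pls(columns, y):
    """One-factor PLS1 coefficients for scaled columns and centered y."""
    names = list(columns)
    w = [sum(x * yi for x, yi in zip(columns[n], y)) for n in names]
    norm = math.sqrt(sum(v * v for v in w))
    w = [v / norm for v in w]
    t = [sum(wj * columns[n][i] for wj, n in zip(w, names)) for i in range(len(y))]
    q = sum(yi * ti for yi, ti in zip(y, t)) / sum(ti * ti for ti in t)
    return {n: q * wj for n, wj in zip(names, w)}

def training_r2(columns, y, bx):
    """R2 of the fitted model on the training data (y is centered)."""
    predicted = [sum(bx[n] * columns[n][i] for n in bx) for i in range(len(y))]
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, predicted))
    return 1.0 - ss_res / sum(yi ** 2 for yi in y)

def backward_eliminate(columns, y, keep):
    """Repeatedly drop the descriptor with the smallest-magnitude coefficient."""
    columns = dict(columns)
    history = []  # (descriptor count, training R2, descriptor removed)
    while len(columns) > keep:
        bx = one_factor_pls(columns, y)
        weakest = min(bx, key=lambda n: abs(bx[n]))
        history.append((len(columns), training_r2(columns, y, bx), weakest))
        del columns[weakest]
    return list(columns), history

# Hypothetical raw descriptor values and log PC for five compounds
raw = {
    "desc_a": [1.0, 2.0, 3.0, 4.0, 5.0],
    "desc_b": [2.0, 1.0, 4.0, 3.0, 5.0],
    "desc_c": [2.0, -1.0, 0.0, -1.0, 1.0],
}
log_pc = [-6.0, -5.5, -5.0, -4.5, -4.0]
mean_y = sum(log_pc) / len(log_pc)
y = [v - mean_y for v in log_pc]
scaled = {name: autoscale(values) for name, values in raw.items()}

kept, history = backward_eliminate(scaled, y, keep=1)
print(kept)
for step in history:
    print(step)
```

Recording the statistics at every step, as `history` does here for the training R2, is what makes it possible to choose the stopping point afterwards.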
Statistics based on Descriptor Removal
[Plot: statistical values (0.00 to 1.00) vs. number of descriptors (29 down to 2). Training statistics: SD, R^2, R^2-CV, Stability. Test statistics: RMSE, Q^2, r-Pearson.]
Remaining 8 Descriptors
CIQPlogS – conformation-independent predicted
aqueous solubility
QPlogS – predicted aqueous solubility
FOSA – hydrophobic component of the total
solvent accessible surface area
PISA – π (carbon and attached hydrogen)
component of the total solvent accessible surface
area
Remaining 8 Descriptors
QPlogKp - predicted skin permeability
QPlogBB – predicted blood/brain partition
coefficient
donorHB - Estimated number of hydrogen bonds
that would be donated by the solute to water
molecules in an aqueous solution
log D – Distribution coefficient
Permeability Model Function
log PC = -0.1371(scaled CIQPlogS) - 0.1383(scaled FOSA) +
0.1792(scaled PISA) + 0.1558(scaled QPlogBB) + 0.2815(scaled QPlogKp) 0.1451(scaled QPlogS) - 0.2242(scaled donorHB) + 0.2646(scaled log D)
SD = 0.460791
R2 = 0.814213
F = 46.0162 (p < 0.0000001)
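As a consistency check, the reported F can be compared with the textbook regression formula F = (R2 / k) / ((1 - R2) / (n - k - 1)). The sketch assumes n = 24 training compounds and k = 2 PLS factors. The factor count is not stated in the slides; it is an assumption chosen because it reproduces the reported value closely.

```python
def f_statistic(r2, n_compounds, n_factors):
    """Overall F-statistic of a regression with n_factors model terms."""
    explained = r2 / n_factors
    unexplained = (1.0 - r2) / (n_compounds - n_factors - 1)
    return explained / unexplained

# R2 and training-set size from the slides; n_factors=2 is an assumption
f_value = f_statistic(0.814213, n_compounds=24, n_factors=2)
print(round(f_value, 4))
```

The agreement with F = 46.0162 suggests, but does not prove, that the final model used two PLS factors.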
Predicted vs Observed Permeability
[Plot: predicted permeability (log PC) vs. observed permeability (log PC), both axes from -6.5 to -4.0. Series: Training Set, Testing Set.]
Conclusion
Successfully created a model to predict the corneal
permeability of compounds
Showed that the Schrödinger software generates
significant descriptors to build a permeability model
Potential Future Work
Apply the model to an external test set to assess its
predictive power
Build a more refined model with more compounds
Find descriptors other than the ones generated
by Maestro and use them in the model building
Acknowledgments
Dr. Ping Du
Dr. Chungping Yu
Pushpa Chandrasekar
Noeris Salem
Allergan
SoCalBSI