O. Arodola - QSAR study
QSAR study on diketo acid and carboxamide
derivatives as potent HIV-1 integrase inhibitors
Presented By
Olayide Arodola
(Master's student – Pharmaceutical Chemistry)
Aim of this study
The aim of this study is to assess how accurately a QSAR model predicts the
activities of compounds compared with their experimental biological activities.
To this end, a 2-dimensional QSAR model was used to analyze 40 diketo acid and
carboxamide-based compounds as potential HIV-1 integrase inhibitors.
KEY WORDS:
Diketo acid and Carboxamide derivatives
2D-QSAR (2-dimensional quantitative structure-activity relationship)
GFA (Genetic function approximation)
Integrase inhibitor
SOFTWARE USED IN THIS STUDY
ChemDraw Ultra 10.0 (to draw 2D structures of the compounds)
Discovery Studio v3.5 (to perform QSAR analysis)
About HIV-1 integrase
The integration of HIV-1 DNA into the host chromosome involves a series of DNA cutting and joining reactions. The first step in
the integration process is 3'-end processing. In the second step, termed DNA strand transfer, the viral DNA end is inserted into the
target DNA. Thus, the integrase enzyme is crucial for viral replication and represents a potential target for antiretroviral drugs.
• First, a quick reminder: what do you understand by 'drug'?
• A very broad definition of a drug would include “all chemicals other than food that
affect living processes”. If it helps the body, it is a medicine; if it harms the body,
it is a poison.
Nowadays, we face the problem of screening a huge number of molecules in order
to test:
• If they are toxic to humans
• If they have an effect on viruses, e.g. HIV, HPV (cervical cancer), H1N1 (flu), Ebola, etc.
• Such screening relies on laborious experiments.
• Researchers therefore developed a process that relates a series of molecular
features to biological activities or chemical reactivities. It is expected to
reduce the number of laborious and expensive experiments by selecting a
small number of promising compounds for later synthesis.
QSAR
• QSAR is a mathematical relationship between the biological activity of a molecular
system and its physical and chemical characteristics, i.e. QSAR is an attempt
to develop correlations between biological activity and the physicochemical
properties of a set of molecules.
• In pharmacology, biological activity describes the beneficial or adverse effects of
a drug on living matter.
• The physicochemical properties of a compound are its physical and chemical
properties taken together.
• The first application of QSAR is attributed to Hansch (1969), who developed an
equation that related biological activity to certain physicochemical properties of a
set of structures.
WHY QSAR
The number of compounds that would have to be synthesized in order to place 10
different groups in 4 positions of a benzene ring is 10^4 (10,000).
Solution: synthesize a small number of compounds and, from their data,
derive rules to predict the biological activity of other compounds.
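The count above is plain enumeration: each of the 4 positions can independently carry any of the 10 groups. A minimal sketch, assuming placeholder group labels (not substituents from this study) and ignoring ring symmetry, as the slide does:

```python
from itertools import product

# Ten placeholder substituent labels (hypothetical; any 10 distinct groups work).
groups = [f"G{i}" for i in range(1, 11)]
positions = 4  # substitutable ring positions considered on the slide

# Every assignment of one group to each position is one candidate compound.
candidates = list(product(groups, repeat=positions))
n_candidates = len(candidates)
print(n_candidates)  # 10000
```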
QSAR and Drug Design
Correlate chemical structure with activity using a statistical approach
Compounds + biological activity → QSAR → New compounds with improved biological activity
BASIC PRINCIPLES
A QSAR normally takes the general form of a linear equation:
Biological activity = Const + (C1×P1) + (C2×P2) + (C3×P3) + ... + (Cn×Pn)
where the parameters P1 through Pn are computed for each molecule in the series and the
coefficients C1 through Cn are calculated by fitting variations in the parameters to the
biological activity.
• The same relationship in compact notation: A = k1d1 + k2d2 + k3d3 + ... + kndn + Const
A – Biological activity
d – Structural properties (descriptors)
k – Regression coefficients
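As an illustration of how the coefficients k1...kn and the constant can be obtained, here is a minimal least-squares sketch in Python. The descriptor matrix and activities are synthetic (generated from made-up coefficients), and NumPy's `lstsq` stands in for whatever fitting routine a QSAR package uses; none of this is data from the study.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic descriptor matrix: 20 hypothetical molecules x 3 descriptors (d1, d2, d3).
D = rng.normal(size=(20, 3))

# Made-up "true" coefficients and constant, used only to generate toy activities.
k_true = np.array([0.5, -1.2, 2.0])
const_true = 0.3
A = D @ k_true + const_true + rng.normal(scale=0.01, size=20)

# Append a column of ones so the constant is fitted together with k1..k3.
X = np.column_stack([D, np.ones(len(D))])
solution, *_ = np.linalg.lstsq(X, A, rcond=None)
k_fit, const_fit = solution[:3], solution[3]

print("k:", np.round(k_fit, 2), "Const:", round(float(const_fit), 2))
```

With low noise the fitted values land close to the generating ones, which is the sense in which the coefficients are "calculated by fitting".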
Several statistical methods are used to develop a QSAR model. They
include:
Multiple linear regression (MLR)
Principal component analysis (PCA)
Partial least squares (PLS)
Genetic function approximation (GFA)
Why GFA
GFA was used for variable selection in developing the QSAR models. The
purpose of variable selection is to keep the variables that contribute
significantly to prediction and to discard the others, as judged by a
fitness function.
Ability to build multiple models rather than a single model
Ability to incorporate the lack-of-fit (LOF) error measure, which resists over-fitting
Automatic removal of outliers, e.g. the value 100 in the series 1, 3, 6, 9, 100
Provision of additional information not available from other statistical methods
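The idea of evolving several candidate models and ranking them by LOF can be sketched in a few lines of Python. This is a deliberately simplified illustration, not the Discovery Studio implementation: the data are synthetic, models are restricted to linear terms, and the LOF expression is one commonly cited form of Friedman's lack-of-fit score, with the smoothing parameter (1.0) and safety factor (0.99) chosen here as assumptions.

```python
import numpy as np

def lof(X, y, cols, smoothing=1.0, safety=0.99):
    """Friedman-style lack-of-fit for a linear model on the chosen columns (assumed form)."""
    n = len(y)
    design = np.column_stack([X[:, cols], np.ones(n)])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    sse = float(np.sum((y - design @ coef) ** 2))
    c = p = len(cols)  # one linear term per descriptor
    return sse / (n * (1.0 - safety * (c + smoothing * p) / n) ** 2)

def gfa_sketch(X, y, n_terms=2, pop_size=20, generations=30, seed=0):
    """Toy genetic search over descriptor subsets; returns models ranked by LOF."""
    rng = np.random.default_rng(seed)
    n_desc = X.shape[1]

    def random_model():
        picks = rng.choice(n_desc, size=n_terms, replace=False)
        return tuple(sorted(picks.tolist()))

    population = {random_model() for _ in range(pop_size)}
    while len(population) < 2:  # crossover needs at least two distinct parents
        population.add(random_model())

    for _ in range(generations):
        ranked = sorted(population, key=lambda m: lof(X, y, list(m)))
        parents = ranked[: max(2, len(ranked) // 2)]  # keep the fitter half
        children = set()
        for _ in range(pop_size):
            i, j = rng.choice(len(parents), size=2, replace=False)
            pool = sorted(set(parents[i]) | set(parents[j]))
            # Crossover: the child draws its descriptors from both parents.
            child = set(rng.choice(pool, size=n_terms, replace=False).tolist())
            if rng.random() < 0.3:  # mutation: swap one descriptor for an unused one
                removed = int(rng.choice(sorted(child)))
                child.discard(removed)
                unused = [d for d in range(n_desc) if d not in child and d != removed]
                child.add(int(rng.choice(unused)))
            children.add(tuple(sorted(child)))
        population = set(parents) | children
    return sorted(population, key=lambda m: lof(X, y, list(m)))

# Synthetic example: only descriptors 2 and 5 actually drive the activity.
rng = np.random.default_rng(1)
X = rng.normal(size=(30, 8))
y = 1.5 * X[:, 2] - 2.0 * X[:, 5] + rng.normal(scale=0.1, size=30)

models = gfa_sketch(X, y)
for m in models[:3]:
    print(m, round(lof(X, y, list(m)), 4))
```

The function returns the whole ranked population rather than a single winner, which mirrors the "multiple models" point above.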
Cpd | Core | R1 | R2 | R3 | IC50 (μM) | *pIC50 (IC50 in μM) | Predicted pIC50
1 | A | Pyrrole | 4'-F | - | 0.17 | 0.770 | 0.409
2 | A | o-Xylene | - | - | 5.67 | -0.754 | 0.105
3 | A | 1,2-(CH3)-1H-pyrrole | | - | 0.22 | 0.658 | 0.377
4a | A | 2,3-(CH3) thiophene | - | - | 0.18 | 0.745 | 0.326
5 | A | 2,4-(CH3) thiophene | - | - | 0.16 | 0.796 | 0.498
6 | A | 1,3-(CH3)-1H-pyrrole | | - | 0.5 | 0.301 | 0.616
7 | A | 2,5-(CH3) thiophene | - | - | 0.5 | 0.301 | 0.608
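The tabulated pIC50 values are consistent with pIC50 = -log10(IC50), with IC50 expressed in μM. The sketch below assumes that definition (the footnote marked by the asterisk is not available in this transcript) and recomputes the values for compounds 1 to 7.

```python
import math

def pic50_from_ic50_um(ic50_um):
    """Return -log10 of an IC50 value expressed in micromolar (assumed definition)."""
    return -math.log10(ic50_um)

# IC50 values (uM) for compounds 1-7 as listed in the table above.
ic50_um = {"1": 0.17, "2": 5.67, "3": 0.22, "4a": 0.18, "5": 0.16, "6": 0.5, "7": 0.5}

pic50 = {cpd: round(pic50_from_ic50_um(value), 3) for cpd, value in ic50_um.items()}
print(pic50)
```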
Methods
Of the 40 compounds, 30 were used as a training set to build the QSAR equations and 10 as a
test set to evaluate their predictivity.
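The result tables in this presentation list compounds 4, 8, 12, ..., 40 separately from the other 30, which suggests that every fourth compound was held out. A minimal sketch of such a split, assuming that rule (the presentation does not state how the test set was chosen):

```python
compound_ids = list(range(1, 41))  # 40 compounds, numbered 1-40

# Assumption: every fourth compound is held out, matching the compound
# numbers that appear in the two result tables of this presentation.
test_set = [c for c in compound_ids if c % 4 == 0]
training_set = [c for c in compound_ids if c % 4 != 0]

print(len(training_set), len(test_set))  # 30 10
print(test_set)
```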
The 2D structures were drawn in ChemDraw Ultra 10.0 and then converted to 3D
structures with reasonable conformations using Discovery Studio v3.5.
A large number of descriptors were then calculated (e.g. ALogP, molecular weight, molar refractivity,
dipole moment, heat of formation, radius of gyration, Wiener index, Zagreb index, etc.).
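Two of the topological descriptors named above are easy to compute by hand on a hydrogen-suppressed molecular graph. The sketch below uses the textbook definitions: the Wiener index is the sum of shortest-path distances over all atom pairs, and the first Zagreb index is the sum of squared atom degrees. That Discovery Studio's "Zagreb index" is the first Zagreb index is an assumption, and the two small alkanes are illustrative, not compounds from this study.

```python
from collections import deque

def shortest_path_lengths(adjacency, start):
    """Breadth-first search distances (in bonds) from one atom to all others."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        atom = queue.popleft()
        for neighbour in adjacency[atom]:
            if neighbour not in dist:
                dist[neighbour] = dist[atom] + 1
                queue.append(neighbour)
    return dist

def wiener_index(adjacency):
    """Sum of shortest-path distances over all unordered atom pairs."""
    total = sum(sum(shortest_path_lengths(adjacency, a).values()) for a in adjacency)
    return total // 2  # every pair was counted twice

def first_zagreb_index(adjacency):
    """Sum of squared vertex degrees."""
    return sum(len(neighbours) ** 2 for neighbours in adjacency.values())

# Hydrogen-suppressed carbon skeletons as adjacency lists.
n_butane = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
isobutane = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}

print(wiener_index(n_butane), first_zagreb_index(n_butane))    # 10 10
print(wiener_index(isobutane), first_zagreb_index(isobutane))  # 9 12
```

The two isomers have the same formula but different index values, which is why such indices are useful as structural descriptors.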
2D QSAR analysis was carried out using genetic function approximation (GFA).
RESULT
A QSAR model was generated for integrase inhibitory activity. To select the
optimal set of descriptors, we used systematic variable selection with the
leave-one-out (LOO) method in a stepwise forward manner. The three best
QSAR equations generated for this study using the GFA approach and the
LOO method are shown in the table below.
No. | Equation | R2 | Q2 | LOF | P-value
1 | Y = -11.65 − 0.0024929W + 0.088809Z + 0.01936M + 1.1879R | 0.820 | 0.558 | 0.193 | 5.174e-0
2 | Y = -12.896 − 0.0028585W + 0.077907Z + 0.020068M + 0.015681Ms | 0.812 | 0.470 | 0.202 | 9.270e-0
3 | Y = -9.6736 − 0.0020098W + 0.078883Z + 0.89779R | 0.790 | 0.620 | 0.190 | 5.641e-0
Y: pIC50; W, Z, M, R, Ms: descriptors; R2: squared correlation coefficient; Q2: cross-validated R squared; LOF: lack of fit;
P-value: significance level
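For readers unfamiliar with Q2, the following sketch shows how a leave-one-out cross-validated R squared can be computed for a linear model. It runs on synthetic data with four made-up descriptors and uses the overall mean of the activities in the denominator, which is one common convention; the exact convention used by the software in this study is not stated in the presentation.

```python
import numpy as np

def r2_fit(X, y):
    """Ordinary R squared of a least-squares fit with a constant term."""
    design = np.column_stack([X, np.ones(len(y))])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    sse = float(np.sum((y - design @ coef) ** 2))
    return 1.0 - sse / float(np.sum((y - y.mean()) ** 2))

def q2_loo(X, y):
    """Leave-one-out Q2 = 1 - PRESS / SS_total."""
    n = len(y)
    design = np.column_stack([X, np.ones(n)])
    press = 0.0
    for i in range(n):
        keep = np.arange(n) != i  # refit without compound i
        coef, *_ = np.linalg.lstsq(design[keep], y[keep], rcond=None)
        press += float((y[i] - design[i] @ coef) ** 2)  # error on the left-out compound
    return 1.0 - press / float(np.sum((y - y.mean()) ** 2))

# Synthetic training set: 30 hypothetical compounds, 4 hypothetical descriptors.
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 4))
y = X @ np.array([1.0, -2.0, 0.5, 1.5]) + rng.normal(scale=0.5, size=30)

r2 = r2_fit(X, y)
q2 = q2_loo(X, y)
print(round(r2, 3), round(q2, 3))
```

Because each left-out compound is predicted by a model that never saw it, Q2 cannot exceed R2 for the same least-squares model, which is the pattern seen in the table.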
pIC50 = -11.65 − 0.0025W + 0.089Z + 0.019M + 1.188R
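To show how such an equation is applied to a new compound, here is a small sketch that evaluates equation 1 with the unrounded coefficients from the table and converts the result back to an IC50 in μM. The descriptor values passed in are invented purely for illustration; they are not taken from any compound in this study.

```python
def predict_pic50_model1(w, z, m, r):
    """Equation 1 of the results table, using its unrounded coefficients."""
    return -11.65 - 0.0024929 * w + 0.088809 * z + 0.01936 * m + 1.1879 * r

# Hypothetical descriptor values (W, Z, M, R) for an imaginary compound.
predicted_pic50 = predict_pic50_model1(w=1500.0, z=120.0, m=50.0, r=4.0)

# Back-conversion, assuming pIC50 = -log10(IC50 in uM).
predicted_ic50_um = 10 ** (-predicted_pic50)

print(round(predicted_pic50, 3), round(predicted_ic50_um, 3))
```

The negative coefficient on W means that, with the other descriptors fixed, a larger W lowers the predicted pIC50.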
[Figure: chemical structures of compounds 30, 34, 17, 19 and 35, labelled with IC50 values of 0.02, 0.02, 0.04, 0.03 and 0.015 μM, respectively]
Training set (30 compounds)
Cmpds | pIC50 | Predicted1 | Residual1 | Predicted2 | Residual2 | Predicted3 | Residual3
1 | 0.770 | 0.409 | 0.361 | 0.393 | 0.377 | 0.274 | 0.496
2 | -0.754 | 0.105 | -0.859 | 0.407 | -1.161 | 0.335 | -1.089
3 | 0.658 | 0.377 | 0.281 | 0.397 | 0.261 | 0.261 | 0.397
5 | 0.796 | 0.498 | 0.298 | 0.618 | 0.178 | 0.228 | 0.568
6 | 0.301 | 0.616 | -0.315 | 0.536 | -0.235 | 0.422 | -0.121
7 | 0.301 | 0.608 | -0.307 | 0.398 | -0.097 | 0.512 | -0.211
9 | 0.602 | 0.463 | 0.139 | 0.330 | 0.272 | 0.602 | 0.000
10 | 0.824 | 0.505 | 0.319 | 0.563 | 0.261 | 0.692 | 0.132
11 | 0.854 | 0.591 | 0.263 | 0.900 | -0.046 | 0.725 | 0.129
13 | 0.638 | 0.971 | -0.333 | 0.676 | -0.038 | 1.017 | -0.379
14 | 0.432 | 1.280 | -0.848 | 1.316 | -0.884 | 1.276 | -0.844
15 | 1.398 | 1.239 | 0.159 | 1.166 | 0.232 | 1.260 | 0.138
17 | 1.398 | 1.267 | 0.131 | 1.401 | -0.003 | 1.340 | 0.058
18 | 1.699 | 1.580 | 0.119 | 1.311 | 0.388 | 1.559 | 0.139
19 | 1.523 | 1.276 | 0.247 | 1.464 | 0.059 | 1.362 | 0.160
21 | 2.699 | 2.495 | 0.204 | 2.796 | -0.097 | 2.334 | 0.365
22 | 2.155 | 1.681 | 0.474 | 1.672 | 0.483 | 1.713 | 0.442
23 | 2.097 | 1.973 | 0.124 | 2.034 | 0.063 | 1.989 | 0.108
25 | 1.921 | 1.957 | -0.036 | 1.998 | -0.077 | 1.975 | -0.054
26 | 2.000 | 1.704 | 0.296 | 1.724 | 0.276 | 1.777 | 0.223
27 | 1.824 | 1.797 | 0.027 | 1.707 | 0.117 | 1.867 | -0.043
29 | 1.824 | 1.943 | -0.119 | 1.851 | -0.027 | 1.883 | -0.059
30 | 1.699 | 1.970 | -0.271 | 1.926 | -0.227 | 1.929 | -0.230
31 | 1.585 | 1.391 | 0.194 | 1.499 | 0.086 | 1.594 | -0.009
33 | 2.046 | 1.739 | 0.307 | 1.845 | 0.201 | 1.860 | 0.186
34 | 1.699 | 2.020 | -0.321 | 1.809 | -0.110 | 2.154 | -0.455
35 | 1.824 | 1.931 | -0.107 | 1.787 | 0.037 | 2.017 | -0.193
37 | 2.155 | 2.325 | -0.170 | 2.302 | -0.147 | 2.090 | 0.065
38 | 2.097 | 2.221 | -0.124 | 2.243 | -0.146 | 2.109 | -0.012
39 | 2.222 | 2.357 | -0.135 | 2.219 | 0.002 | 2.133 | 0.089
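The fit statistics reported for model 1 can be cross-checked from the training-set values above. The sketch below recomputes R2 from the experimental pIC50 values and the Residual1 column, and then a Friedman-style LOF. The LOF expression, the smoothing parameter (1.0) and the safety factor (0.99) are assumptions, since the presentation does not state the settings used.

```python
# Experimental pIC50 and model 1 residuals for the 30 training compounds,
# copied from the table above (compound order: 1, 2, 3, 5, 6, 7, ..., 39).
pic50 = [0.770, -0.754, 0.658, 0.796, 0.301, 0.301, 0.602, 0.824, 0.854, 0.638,
         0.432, 1.398, 1.398, 1.699, 1.523, 2.699, 2.155, 2.097, 1.921, 2.000,
         1.824, 1.824, 1.699, 1.585, 2.046, 1.699, 1.824, 2.155, 2.097, 2.222]
residual1 = [0.361, -0.859, 0.281, 0.298, -0.315, -0.307, 0.139, 0.319, 0.263, -0.333,
             -0.848, 0.159, 0.131, 0.119, 0.247, 0.204, 0.474, 0.124, -0.036, 0.296,
             0.027, -0.119, -0.271, 0.194, 0.307, -0.321, -0.107, -0.170, -0.124, -0.135]

n = len(pic50)
mean_pic50 = sum(pic50) / n
sse = sum(r * r for r in residual1)               # residual sum of squares
sst = sum((y - mean_pic50) ** 2 for y in pic50)   # total sum of squares
r2 = 1.0 - sse / sst

# Assumed Friedman LOF form: SSE / (n * (1 - safety * (c + d * p) / n) ** 2),
# with c = p = 4 linear terms (W, Z, M, R), d = 1.0 and safety = 0.99.
c = p = 4
d, safety = 1.0, 0.99
lof = sse / (n * (1.0 - safety * (c + d * p) / n) ** 2)

print(round(r2, 3), round(lof, 3))
```

Under these assumptions the sketch gives an R2 of about 0.821 and an LOF of about 0.193, close to the values reported for model 1.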
Test set (10 compounds)
Cmpds | pIC50 | Predicted1 | Residual1 | Predicted2 | Residual2 | Predicted3 | Residual3
4 | 0.745 | 0.326 | 0.419 | 0.287 | 0.458 | 0.282 | 0.463
8 | 0.000 | 0.485 | -0.485 | 0.761 | -0.761 | 0.587 | -0.587
12 | 1.000 | 1.178 | -0.178 | 0.836 | 0.164 | 1.215 | -0.215
16 | 0.420 | 1.212 | -0.792 | 1.259 | -0.839 | 1.233 | -0.813
20 | 1.699 | 1.482 | 0.217 | 1.784 | -0.085 | 1.473 | 0.226
24 | 1.745 | 1.580 | 0.165 | 1.471 | 0.274 | 1.634 | 0.111
28 | 2.398 | 1.594 | 0.804 | 1.500 | 0.898 | 1.706 | 0.692
32 | 1.678 | 1.937 | -0.260 | 1.877 | -0.199 | 1.961 | -0.283
36 | 2.155 | 1.936 | 0.219 | 1.765 | 0.390 | 2.096 | 0.059
40 | 1.824 | 2.656 | -0.832 | 2.360 | -0.536 | 2.371 | -0.547
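One simple way to compare the three models on the held-out compounds is the root-mean-square error (RMSE) of their test-set residuals. The presentation does not report this statistic, so the sketch below is an added illustration that uses only the residual columns listed above.

```python
import math

# Test-set residuals copied from the table above (compounds 4, 8, ..., 40).
test_residuals = {
    "model 1": [0.419, -0.485, -0.178, -0.792, 0.217, 0.165, 0.804, -0.260, 0.219, -0.832],
    "model 2": [0.458, -0.761, 0.164, -0.839, -0.085, 0.274, 0.898, -0.199, 0.390, -0.536],
    "model 3": [0.463, -0.587, -0.215, -0.813, 0.226, 0.111, 0.692, -0.283, 0.059, -0.547],
}

def rmse(residuals):
    """Root-mean-square error of a list of residuals."""
    return math.sqrt(sum(r * r for r in residuals) / len(residuals))

test_rmse = {name: rmse(res) for name, res in test_residuals.items()}
best_model = min(test_rmse, key=test_rmse.get)

for name, value in test_rmse.items():
    print(name, round(value, 3))
print("lowest test-set RMSE:", best_model)
```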
Conclusion
From the results above, it can be concluded that radius of gyration,
Zagreb index, Wiener index and minimized energy are statistically
important descriptors: the model built on them has a squared correlation
coefficient (R2) of 0.8209, indicating a statistically significant fit.
This QSAR approach can be used to predict the activities of future HIV-1
integrase inhibitors.
My Current Research
Could the FDA-approved anti-HIV drugs be promising anti-cancer
agents? An answer from extensive molecular dynamics
analyses
Acknowledgement
Dr Mahmoud Soliman (my supervisor) & the lab members
CHPC (Technical support)
UKZN School of Health Sciences (Financial support)
Thank you