Quantitative Structure-Activity
Relationship Analysis of Functionalized
Amino Acid Anticonvulsant Agents Using k
Nearest Neighbor and Simulated
Annealing PLS Methods
Shen, M.; LeTiran, A.; Xiao, Y.; Golbraikh, A.;
Kohn, H.; Tropsha, A. J. Med. Chem. 2002,
45, 2811-2823
Study aims to develop a QSAR for
anticonvulsant drugs
• Epilepsy is a neurological disorder affecting millions
• Functionalized amino acids (FAA) show promise as
therapeutics
• A quantitative structure-activity relationship (QSAR)
would assist the development of new FAA
2D computational models were used
• Two variable selection approaches were used
– Simulated annealing partial least squares (SAPLS)
– k nearest neighbor (kNN)
• Advantages of 2D over 3D models
– Readily automated
– Adaptable to database searching
– Faster
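A minimal sketch of the kNN idea, assuming an unweighted average over the k nearest training compounds and the usual leave-one-out definition q² = 1 - PRESS / SS_total. The slides do not state the exact weighting scheme or distance metric, so those choices, the function names, and the toy data are all hypothetical:

```python
import math

def euclidean(a, b):
    """Euclidean distance between two descriptor vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn_predict(query, X, y, k):
    """Predict activity as the mean activity of the k nearest training compounds."""
    ranked = sorted(range(len(X)), key=lambda i: euclidean(query, X[i]))
    nearest = ranked[:k]
    return sum(y[i] for i in nearest) / len(nearest)

def loo_q2(X, y, k):
    """Leave-one-out cross-validated q^2 = 1 - PRESS / SS_total."""
    mean_y = sum(y) / len(y)
    press = 0.0
    for i in range(len(X)):
        # Hold out compound i and predict it from all the others.
        X_train = X[:i] + X[i + 1:]
        y_train = y[:i] + y[i + 1:]
        press += (y[i] - knn_predict(X[i], X_train, y_train, k)) ** 2
    ss_total = sum((v - mean_y) ** 2 for v in y)
    return 1.0 - press / ss_total

# Hypothetical toy data: two descriptors per compound, made-up activities.
X = [[0.0, 0.1], [0.1, 0.0], [0.2, 0.1], [1.0, 1.1], [1.1, 1.0], [1.2, 1.1]]
y = [1.0, 1.1, 1.2, 3.0, 3.1, 3.2]
q2 = loo_q2(X, y, k=2)
print(f"LOO q2 = {q2:.3f}")
```

Because compounds with similar descriptors have similar activities in this toy set, the held-out predictions stay close to the observed values and q² comes out high.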
Structure of test compounds
[Figure: general structure of the FAA test compounds, with variable substituents R1, R2, and R3]
2D models evaluate molecular
topology
• Molecules in the dataset were assigned a fingerprint
• 189 descriptors were used
• Descriptors were correlated with activity
[Figure: example molecule annotated with atom types (aromatic ring center, hydrogen bond acceptor, hydrogen bond donor) and a topological distance, e.g. N,N distance = 2]
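One common way to encode molecular topology is to count pairs of atom types at a given separation. The sketch below assumes that "distance" means the number of bonds on the shortest path between two atoms; the slides do not say whether the study used exactly this encoding. The fragment, atom numbering, and function names are hypothetical and are not taken from the paper:

```python
from collections import deque

def topological_distance(bonds, start, goal):
    """Number of bonds on the shortest path between two atoms (breadth-first search)."""
    neighbors = {}
    for a, b in bonds:
        neighbors.setdefault(a, []).append(b)
        neighbors.setdefault(b, []).append(a)
    seen = {start: 0}
    queue = deque([start])
    while queue:
        atom = queue.popleft()
        if atom == goal:
            return seen[atom]
        for nxt in neighbors.get(atom, []):
            if nxt not in seen:
                seen[nxt] = seen[atom] + 1
                queue.append(nxt)
    return None  # the two atoms are not connected

def atom_pair_counts(atom_types, bonds):
    """Count (type, type, distance) triples over all pairs of atoms."""
    counts = {}
    atoms = sorted(atom_types)
    for i, a in enumerate(atoms):
        for b in atoms[i + 1:]:
            d = topological_distance(bonds, a, b)
            if d is None:
                continue
            t1, t2 = sorted((atom_types[a], atom_types[b]))
            counts[(t1, t2, d)] = counts.get((t1, t2, d), 0) + 1
    return counts

# Hypothetical fragment: N1-C2-N3 with O4 attached to C2.
atom_types = {1: "N", 2: "C", 3: "N", 4: "O"}
bonds = [(1, 2), (2, 3), (2, 4)]
fingerprint = atom_pair_counts(atom_types, bonds)
print(fingerprint)
```

Each key of `fingerprint` plays the role of one descriptor, and its count is that descriptor's value for the molecule.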
Refinement of model
• Models to predict activity were generated
• Weighting of descriptors was varied to increase q² (the cross-validated R²)
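The refinement loop can be sketched as a simulated annealing search over descriptor subsets. This sketch makes several assumptions that go beyond the slides: "weighting" is simplified to switching descriptors in or out, a 1-nearest-neighbor model stands in for PLS to keep the code short, and the cooling schedule, function names, and toy data are hypothetical:

```python
import math
import random

def loo_q2(X, y, cols, k=1):
    """Leave-one-out q^2 of a kNN model restricted to the descriptor columns in cols."""
    n = len(X)
    mean_y = sum(y) / n
    press = 0.0
    for i in range(n):
        others = [j for j in range(n) if j != i]
        others.sort(key=lambda j: sum((X[i][c] - X[j][c]) ** 2 for c in cols))
        pred = sum(y[j] for j in others[:k]) / k
        press += (y[i] - pred) ** 2
    return 1.0 - press / sum((v - mean_y) ** 2 for v in y)

def anneal_select(X, y, n_select, steps=200, t_start=1.0, cooling=0.97, seed=0):
    """Simulated annealing over descriptor subsets of fixed size n_select."""
    rng = random.Random(seed)
    n_desc = len(X[0])
    current = rng.sample(range(n_desc), n_select)
    current_q2 = loo_q2(X, y, current)
    best, best_q2 = list(current), current_q2
    temp = t_start
    for _ in range(steps):
        # Propose a move: swap one selected descriptor for an unselected one.
        candidate = list(current)
        unused = [c for c in range(n_desc) if c not in current]
        candidate[rng.randrange(n_select)] = rng.choice(unused)
        cand_q2 = loo_q2(X, y, candidate)
        # Metropolis criterion: keep improvements, occasionally keep worse subsets.
        delta = cand_q2 - current_q2
        if delta >= 0 or rng.random() < math.exp(delta / temp):
            current, current_q2 = candidate, cand_q2
            if current_q2 > best_q2:
                best, best_q2 = list(current), current_q2
        temp *= cooling
    return sorted(best), best_q2

# Hypothetical toy data: only descriptor 0 tracks activity, the rest are noise.
X = [[0, 0.0, 0.0, 0.0], [1, 1.0, 1.0, 1.0], [2, 2.0, 2.0, 2.0], [3, 3.0, 3.0, 3.0],
     [4, 3.1, 0.1, 2.1], [5, 2.1, 1.1, 3.1], [6, 1.1, 2.1, 0.1], [7, 0.1, 3.1, 1.1]]
y = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
best, best_q2 = anneal_select(X, y, n_select=1)
print(best, round(best_q2, 3))
```

Accepting some worse subsets early on, while the temperature is high, is what lets the search escape poor local optima; as the temperature falls the search becomes effectively greedy.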
Visualization of refinement
Robustness of QSAR models
• Best models were cross-checked against datasets with
randomized (shuffled) activities
• Low q² of shuffled data supports statistical
significance of models
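A minimal sketch of this randomization check, assuming a 1-nearest-neighbor model and leave-one-out q² as the score. The number of shuffles, the function names, and the toy data are hypothetical, not values from the study:

```python
import random

def loo_q2_1nn(X, y):
    """Leave-one-out q^2 for a 1-nearest-neighbor model."""
    n = len(X)
    mean_y = sum(y) / n
    press = 0.0
    for i in range(n):
        # Nearest other compound by squared Euclidean distance.
        nearest = min(
            (j for j in range(n) if j != i),
            key=lambda j: sum((a - b) ** 2 for a, b in zip(X[i], X[j])),
        )
        press += (y[i] - y[nearest]) ** 2
    return 1.0 - press / sum((v - mean_y) ** 2 for v in y)

def y_randomization(X, y, n_trials=20, seed=0):
    """Compare q^2 on the real activities with q^2 on shuffled activities."""
    rng = random.Random(seed)
    real_q2 = loo_q2_1nn(X, y)
    shuffled_q2 = []
    for _ in range(n_trials):
        y_perm = list(y)
        rng.shuffle(y_perm)  # breaks the link between structure and activity
        shuffled_q2.append(loo_q2_1nn(X, y_perm))
    return real_q2, shuffled_q2

# Hypothetical toy data: one descriptor that tracks activity exactly.
X = [[0.0], [1.0], [2.0], [3.0], [4.0], [5.0], [6.0], [7.0]]
y = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
real_q2, shuffled_q2 = y_randomization(X, y)
mean_shuffled = sum(shuffled_q2) / len(shuffled_q2)
print(f"real q2 = {real_q2:.3f}, mean shuffled q2 = {mean_shuffled:.3f}")
```

If the shuffled datasets gave q² values comparable to the real one, the apparent fit would more likely reflect chance correlation than a genuine structure-activity relationship.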
Validating models with test set
• Best models (q² > 0.5) were tested on 14 additional
compounds
• Not all statistically significant models (high q²) were
highly predictive (high R²)
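The gap between internal and external performance can be illustrated with a short sketch. It assumes R² means the squared Pearson correlation between observed and predicted test-set activities, which the slides do not spell out. The model names, q² values, and activities are invented for illustration:

```python
def squared_correlation(observed, predicted):
    """Squared Pearson correlation between observed and predicted activities."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov * cov / (var_o * var_p)

# Hypothetical test-set activities and per-model predictions.
observed = [1.0, 2.0, 3.0, 4.0]
models = {
    "A": {"q2": 0.70, "predicted": [1.1, 1.9, 3.2, 3.9]},
    "B": {"q2": 0.65, "predicted": [3.0, 1.0, 4.0, 2.0]},
    "C": {"q2": 0.30, "predicted": [1.0, 2.0, 3.0, 4.0]},
}

# Only models that pass the internal threshold (q2 > 0.5) go on to external testing.
external_r2 = {
    name: squared_correlation(observed, m["predicted"])
    for name, m in models.items()
    if m["q2"] > 0.5
}
for name, r2 in external_r2.items():
    print(f"model {name}: q2 = {models[name]['q2']:.2f}, test-set R2 = {r2:.3f}")
```

In this made-up example, models A and B both pass the q² filter, but only A predicts the test set well, which mirrors the point that a high q² alone does not guarantee external predictivity.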
Predictive results
• Best models were reasonably predictive for the ED50
of test set
Conclusions
• A series of QSAR models were generated for FAA
anticonvulsants
• Models were optimized for both internal (q²) and external
(R²) predictive accuracy
• Difficult to visualize results
Discussion questions
1. The functionalized amino acids (FAA) that the authors screened were
racemic (a 50/50 mixture of enantiomers). What are the ramifications of using
racemic materials in a bio-assay? Is this a shortcoming of the evaluation?
2. The QSAR models that were generated have ambiguous implications about
the real-world criteria for FAA that are highly active. This derives from the 2D
methods used to generate these models. Discuss the relative advantages and
disadvantages of 2D vs. 3D methods in terms of interpreting the computational
output.
3. A training set of 48 compounds was used to generate the QSAR models.
When the models with the highest statistical significance (high q²) were
applied to a test set of compounds, good predictability (high R²) was observed
only when the test set was small (7 or 8 compounds). In larger test sets the R²
value did not exceed 0.5. Is the short scope of predictability acceptable given
the size of the training set? Or is it an indication that the models need further
refinement?