Discriminating between Drugs and
Nondrugs by Prediction of Activity
Spectra for Substances (PASS)
Soheila Anzali, Gerhard Barnickel, Bertram
Cezanne, Michael Krug, Dmitrii Filimonov, and
Vladimir Poroikov (collaboration between Merck
and an academic institution)
Max Shneider – Case study
Overview
Goal – better drug/nondrug classification at
beginning of drug discovery process (ADMET)
Method – used PASS, a computer system that
predicts more than 500 biological activities using
regression
Has a mean prediction accuracy of about 86%
2D compound representation – includes
information on each atom and its neighbors
Training set – 5,000 drugs from WDI database and
5,000 nondrugs from ACD database
Test set filtering – removed items that were already
in training set, had errors in structural formulas, etc.
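The 2D representation mentioned above describes each atom together with its neighbors. A minimal sketch of that idea follows, assuming a simplified scheme in which each atom is labeled by its element plus the sorted elements of its directly bonded neighbors. This illustrates the concept only and is not the actual descriptor set used by PASS; the function name and the ethanol example are hypothetical.

```python
from collections import Counter


def neighborhood_descriptors(atoms, bonds):
    """Return one label per atom: its element followed by the sorted
    elements of its directly bonded neighbors (simplified illustration)."""
    neighbors = {i: [] for i in range(len(atoms))}
    for a, b in bonds:
        neighbors[a].append(atoms[b])
        neighbors[b].append(atoms[a])
    return [
        f"{atoms[i]}({''.join(sorted(neighbors[i]))})"
        for i in range(len(atoms))
    ]


# Hypothetical example: heavy atoms of ethanol, C-C-O (hydrogens omitted)
atoms = ["C", "C", "O"]
bonds = [(0, 1), (1, 2)]

descriptors = neighborhood_descriptors(atoms, bonds)
# Counting the labels gives a simple fingerprint of the whole molecule
fingerprint = Counter(descriptors)
print(descriptors)
```

A count of such labels per molecule could then serve as the input features for a classifier.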
Results
Leave-one-out (LOO) cross-validation
Mean prediction accuracy of 79.9%
PASS vs Drugs
864 launched and registered compounds from Cipsline database
Predicted 78.5% drugs, 21.5% nondrugs
PASS vs Nondrugs
9,484 compounds with reactive groups, low molecular weight, etc.
Predicted 83.8% nondrugs, 16.2% drugs
PASS vs TOP-100 Drugs
88 compounds from top-100 prescription pharmaceuticals list
Predicted 87.5% drugs, 12.5% nondrugs
Evaluating PASS with Cleaned Training Set
Used filtered “Drugs” and “Nondrugs” test sets from above as training
sets instead of WDI and ACD
LOO cross-validation – mean prediction accuracy of 89.9%
vs TOP-100 Drugs – predicted 95.5% drugs, 4.5% nondrugs
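Both training sets above were evaluated with leave-one-out cross-validation: each compound is held out once, the model is built from the rest, and the held-out prediction is scored. A minimal sketch of the procedure follows, assuming a toy one-dimensional nearest-neighbor classifier as a stand-in for PASS. The data, feature values, and function names are hypothetical.

```python
def predict_1nn(train, x):
    """Return the label of the training point whose feature is closest to x."""
    nearest = min(train, key=lambda item: abs(item[0] - x))
    return nearest[1]


def loo_accuracy(data):
    """Hold out each sample once, train on the remainder, and return
    the fraction of held-out samples that were predicted correctly."""
    correct = 0
    for i, (x, label) in enumerate(data):
        train = data[:i] + data[i + 1:]  # everything except sample i
        if predict_1nn(train, x) == label:
            correct += 1
    return correct / len(data)


# Hypothetical toy data: (feature value, class label)
data = [
    (10, "nondrug"), (20, "nondrug"), (30, "nondrug"),
    (80, "drug"), (90, "drug"), (34, "drug"),
]

accuracy = loo_accuracy(data)
print(f"LOO accuracy: {accuracy:.3f}")
```

In this toy set the two samples near the class boundary (30 and 34) are each misclassified when held out, which is the kind of error LOO is meant to expose.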
Discussion
Chemical descriptors and algorithms in PASS provide
highly robust structure-activity relationships and
reliable predictions
PASS is in good accordance with other approaches
(Sadowski and Kubinyi, Ajay)
PASS is relatively successful on new compounds that
have nontraditional structures and/or belong to new
chemical classes
Computation is fast – one compound can be
predicted in 4 ms on a 300 MHz computer
Using PASS out of the box gives good results, but
better discrimination might be possible with more
specific drug information
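To put the 4 ms per-compound figure quoted above in context, the following sketch converts it into throughput and total screening time. The library size of one million compounds is a hypothetical example, not a number from the study.

```python
MS_PER_COMPOUND = 4  # from the slides: 4 ms per prediction on a 300 MHz computer


def screening_time_seconds(n_compounds, ms_per_compound=MS_PER_COMPOUND):
    """Total prediction time in seconds for n_compounds, run sequentially."""
    return n_compounds * ms_per_compound / 1000


compounds_per_second = 1000 / MS_PER_COMPOUND

library_size = 1_000_000  # hypothetical screening library
total_seconds = screening_time_seconds(library_size)
total_minutes = total_seconds / 60

print(f"{compounds_per_second:.0f} compounds per second")
print(f"{library_size} compounds in about {total_minutes:.0f} minutes")
```

At this rate a library of that size would take roughly an hour on a single machine of the kind described.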