Discriminating between Drugs and
Nondrugs by Prediction of Activity
Spectra for Substances (PASS)
Soheila Anzali, Gerhard Barnickel, Bertram
Cezanne, Michael Krug, Dmitrii Filimonov, and
Vladimir Poroikov (collaboration between Merck
and an academic institution)
Max Shneider – Case study
Overview
 Goal – better drug/nondrug classification at
beginning of drug discovery process (ADMET)
 Method – used PASS, a computer system that
predicts more than 500 biological activities using
regression
     Has a mean prediction accuracy of about 86%
 2D compound representation – includes
information on each atom and its neighbors
 Training set – 5,000 drugs from WDI database and
5,000 nondrugs from ACD database
 Test set filtering – removed items that were already
in training set, had errors in structural formulas, etc.
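A minimal sketch of what an "atom plus its neighbors" 2D representation can look like. This is a simplified illustration, not the actual PASS descriptor set; the function name neighborhood_descriptors and the ethanol example are assumptions made for this sketch.

```python
from collections import Counter


def neighborhood_descriptors(atoms, bonds):
    """Count one descriptor per atom: its element plus its sorted neighbor elements.

    atoms: list of element symbols, indexed by atom number
    bonds: list of (i, j) index pairs for bonded atoms
    """
    neighbors = {i: [] for i in range(len(atoms))}
    for a, b in bonds:
        # Each bond contributes a neighbor to both of its end atoms.
        neighbors[a].append(atoms[b])
        neighbors[b].append(atoms[a])
    return Counter(
        f"{atoms[i]}({''.join(sorted(neighbors[i]))})" for i in range(len(atoms))
    )


# Heavy atoms of ethanol (C-C-O), hydrogens omitted for brevity.
ethanol_atoms = ["C", "C", "O"]
ethanol_bonds = [(0, 1), (1, 2)]
descriptors = neighborhood_descriptors(ethanol_atoms, ethanol_bonds)
print(descriptors)
```

Each compound becomes a bag of local-environment strings, which can then be compared across the drug and nondrug training sets.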
Results
 Leave-one-out (LOO) cross-validation
     Mean prediction accuracy of 79.9%
 PASS vs Drugs
     864 launched and registered compounds from Cipsline database
     Predicted 78.5% drugs, 21.5% nondrugs
 PASS vs Nondrugs
     9,484 compounds with reactive groups, low molecular weight, etc.
     Predicted 83.8% nondrugs, 16.2% drugs
 PASS vs TOP-100 Drugs
     88 compounds from top-100 prescription pharmaceuticals list
     Predicted 87.5% drugs, 12.5% nondrugs
 Evaluating PASS with Cleaned Training Set
     Used filtered “Drugs” and “Nondrugs” test sets from above as training
    sets instead of WDI and ACD
     LOO cross-validation – mean prediction accuracy of 89.9%
     vs TOP-100 Drugs – predicted 95.5% drugs, 4.5% nondrugs
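The leave-one-out procedure behind the cross-validation figures can be sketched as follows. This is a generic illustration under stated assumptions: the nearest-neighbor classifier is a toy stand-in and not the PASS algorithm, and the one-dimensional data and all function names are hypothetical.

```python
def loo_accuracy(samples, labels, train_and_predict):
    """Leave-one-out: hold out each sample once, train on the rest, predict it."""
    correct = 0
    for i in range(len(samples)):
        train_x = samples[:i] + samples[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        if train_and_predict(train_x, train_y, samples[i]) == labels[i]:
            correct += 1
    return correct / len(samples)


def nearest_neighbor(train_x, train_y, query):
    """Toy classifier (NOT PASS): return the label of the closest training point."""
    best = min(range(len(train_x)), key=lambda j: abs(train_x[j] - query))
    return train_y[best]


# Hypothetical one-dimensional "descriptor" values and class labels.
samples = [0.1, 0.2, 0.3, 0.9, 1.0, 1.1, 0.55]
labels = ["nondrug", "nondrug", "nondrug", "drug", "drug", "drug", "drug"]

accuracy = loo_accuracy(samples, labels, nearest_neighbor)
print(f"LOO accuracy: {accuracy:.1%}")
```

Every compound is predicted by a model that never saw it during training, so the resulting accuracy estimates performance on unseen structures rather than on memorized ones.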
Discussion
 Chemical descriptors and algorithms in PASS provide
highly robust structure-activity relationships and
reliable predictions
 PASS is in good accordance with other approaches
(Sadowski and Kubinyi; Ajay et al.)
 PASS is relatively successful on new compounds that
have nontraditional structures and/or belong to new
chemical classes
 Computation is fast – one compound can be
predicted in 4 ms on a 300 MHz computer
 Using PASS out of the box gives good results, but
better discrimination might be possible with more
specific drug information
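To put the quoted speed in perspective, here is a back-of-the-envelope calculation. It assumes the 4 ms figure applies uniformly to every compound and ignores file I/O and structure parsing; the one-million-compound library is a hypothetical example, not a figure from the study.

```python
SECONDS_PER_COMPOUND = 0.004  # 4 ms per prediction, as quoted for a 300 MHz computer


def screening_time_seconds(n_compounds, seconds_per_compound=SECONDS_PER_COMPOUND):
    """Estimated wall-clock time to predict n_compounds one after another."""
    return n_compounds * seconds_per_compound


compounds_per_second = 1 / SECONDS_PER_COMPOUND

workloads = [
    ("Drugs test set", 864),
    ("Nondrugs test set", 9484),
    ("Hypothetical library", 1_000_000),
]
print(f"Throughput: {compounds_per_second:.0f} compounds per second")
for name, size in workloads:
    print(f"{name}: {size} compounds in {screening_time_seconds(size):.1f} s")
```

Under these assumptions both test sets take well under a minute, and a million-compound library takes a little over an hour.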