Catalyst™: Building Hypotheses and Searching Databases

Building Hypotheses and
Searching Databases
Two ways of creating an
hypothesis
• Automatically create a chemical feature-based hypothesis from a set of
compounds with respect to a type of
activity.
• Build an hypothesis by assembling
substructures and chemical functions
and specifying the geometric constraints
between them.
Components of an Hypothesis
• Assemble the components from known data such as
the atomic coordinates available from X-ray
crystallographic data.
• Express the characteristics of the hypothesis as a
collection of particular chemical substructures or a
collection of chemical functions such as hydrogen
bond donors and hydrophilic groups, or a
combination of substructures and chemical functions.
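A minimal sketch of how such a collection of components might be held in code. The class names, the `kind` labels, and the coordinates below are illustrative assumptions, not Catalyst's own data model or real crystallographic values.

```python
from dataclasses import dataclass, field
from typing import List, Set, Tuple

Point = Tuple[float, float, float]  # x, y, z in angstroms


@dataclass(frozen=True)
class Component:
    kind: str        # "function" or "substructure" (hypothetical labels)
    name: str        # e.g. "HB DONOR" or "hydroxyl"
    position: Point  # e.g. taken from X-ray atomic coordinates


@dataclass
class Hypothesis:
    components: List[Component] = field(default_factory=list)

    def add(self, kind: str, name: str, position: Point) -> "Hypothesis":
        # Reject anything that is neither a chemical function nor a substructure.
        if kind not in ("function", "substructure"):
            raise ValueError(f"unknown component kind: {kind!r}")
        self.components.append(Component(kind, name, position))
        return self

    def kinds(self) -> Set[str]:
        # Which kinds of component the hypothesis combines.
        return {c.kind for c in self.components}


# A hybrid hypothesis: two chemical functions plus one substructure.
# The coordinates are made up for illustration.
hyp = (
    Hypothesis()
    .add("function", "HB DONOR", (0.0, 0.0, 0.0))
    .add("function", "HYDROPHOBIC", (4.2, 1.1, -0.3))
    .add("substructure", "hydroxyl", (2.0, 3.5, 0.8))
)
```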
Chemical Substructures Available
• The feature dictionary contains a large
library of chemical functional groups
such as primary, secondary, and tertiary
amines, hydroxyl, carbonyl, acridyl,
acetoamido, 1-beta-glucopyranosyl,
amino acids, etc.
Chemical Functions Available
• The chemical functions available
include HB ACCEPTOR, HB
ACCEPTOR lipid, HB DONOR,
HYDROPHOBIC, HYDROPHOBIC
aliphatic, HYDROPHOBIC aromatic,
NEG CHARGE, NEG IONIZABLE, POS
CHARGE, POS IONIZABLE, RING
AROMATIC
Geometric Constraints Available
• The distances, angles, and/or torsions
between items in an hypothesis, the preferred
location of a chemical feature, and a range of
elements per atom position may be specified
within the hypothesis or a substructure of the
hypothesis.
• Excluded volumes may also be specified.
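The geometric quantities named above can be computed directly from 3D coordinates. The sketch below uses standard vector formulas. The function names, the target-plus-tolerance form of a constraint, and the spherical model of an excluded volume are assumptions made for illustration, not Catalyst's implementation.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _norm(a):
    return math.sqrt(_dot(a, a))


def distance(p, q):
    """Distance between two points, in the units of the coordinates."""
    return _norm(_sub(p, q))


def angle(p, q, r):
    """Angle p-q-r in degrees, measured at the vertex q."""
    u, v = _sub(p, q), _sub(r, q)
    cos_t = _dot(u, v) / (_norm(u) * _norm(v))
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_t))))


def torsion(p0, p1, p2, p3):
    """Torsion (dihedral) angle about the p1-p2 axis, in degrees."""
    b0, b1, b2 = _sub(p0, p1), _sub(p2, p1), _sub(p3, p2)
    n = _norm(b1)
    b1 = tuple(x / n for x in b1)
    # Project b0 and b2 onto the plane perpendicular to the axis.
    v = _sub(b0, tuple(_dot(b0, b1) * x for x in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * x for x in b1))
    return math.degrees(math.atan2(_dot(_cross(b1, v), w), _dot(v, w)))


def within(value, target, tolerance):
    """True if a measured value satisfies a target +/- tolerance constraint."""
    return abs(value - target) <= tolerance


def in_excluded_volume(atoms, center, radius):
    """True if any atom falls inside a spherical excluded volume."""
    return any(distance(a, center) < radius for a in atoms)
```

A candidate geometry would pass only if every specified distance, angle, and torsion is within tolerance and no atom lies in an excluded volume.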
Using the hypothesis?
• Once the hypothesis has been built, databases
may be searched with it to find the compounds
they contain that match the hypothesis.
Building a Substructure Hypothesis
and Searching Databases
Hydrogen count set to “anything”
Specifying atom range per atom
position
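One way to picture a hydrogen count of "anything" and a range of elements at an atom position is as a query atom that accepts several elements and ignores the hydrogen count unless one is given. This is a sketch under assumed names (`QueryAtom`, `Atom`, `positions_match`). It compares atoms position by position only and does not perform a real substructure (graph) match.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional, Sequence


@dataclass(frozen=True)
class Atom:
    element: str
    hydrogens: int  # number of attached hydrogens


@dataclass(frozen=True)
class QueryAtom:
    allowed_elements: FrozenSet[str]  # range of elements for this position
    hydrogens: Optional[int] = None   # None stands for "anything"

    def matches(self, atom: Atom) -> bool:
        if atom.element not in self.allowed_elements:
            return False
        return self.hydrogens is None or atom.hydrogens == self.hydrogens


def positions_match(query: Sequence[QueryAtom], atoms: Sequence[Atom]) -> bool:
    """Compare query and candidate atoms position by position."""
    return len(query) == len(atoms) and all(
        q.matches(a) for q, a in zip(query, atoms)
    )


# Position 1 may be N or O with any hydrogen count; position 2 must be C.
query = [QueryAtom(frozenset({"N", "O"})), QueryAtom(frozenset({"C"}))]
```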
Searching a Database with an
Hypothesis
• Once the hypothesis has been designed
and built it may be used to search a
database for compounds that contain
the defined features.
Results of Database search.
Default 300 hits.
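A search of this kind can be sketched as a loop that tests each compound against the hypothesis and stops once a hit limit is reached. The data layout, the function names, and the brute-force feature mapping below are assumptions for illustration. Only the default of 300 hits is taken from the slide.

```python
import math
from itertools import islice, permutations


def fits(hypothesis, features):
    """True if some assignment of compound features satisfies the hypothesis.

    hypothesis: {"kinds": [kind, ...],
                 "distances": {(i, j): (low, high), ...}}
    features:   [(kind, (x, y, z)), ...] for one compound conformation
    """
    kinds = hypothesis["kinds"]
    # Brute force: try every ordered selection of compound features.
    for combo in permutations(features, len(kinds)):
        if any(f[0] != k for f, k in zip(combo, kinds)):
            continue
        if all(low <= math.dist(combo[i][1], combo[j][1]) <= high
               for (i, j), (low, high) in hypothesis["distances"].items()):
            return True
    return False


def search(database, hypothesis, max_hits=300):
    """Return the names of at most max_hits matching compounds."""
    hits = (name for name, features in database if fits(hypothesis, features))
    return list(islice(hits, max_hits))


# Hypothetical query: a donor and a hydrophobe 4 to 6 angstroms apart.
hypothesis = {
    "kinds": ["HB DONOR", "HYDROPHOBIC"],
    "distances": {(0, 1): (4.0, 6.0)},
}
```

Because the hits are drawn lazily, the scan stops as soon as the limit is reached rather than testing the whole database.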
Hit example 1
Compound data
Compound sort. Handling large datasets.
Compound property report.
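For large hit lists, the best few compounds by some property can be picked without sorting the whole set. The sketch below uses the standard-library `heapq` module. The record layout, the property name `mol_weight`, and the tab-separated report format are assumptions for illustration.

```python
import heapq


def top_compounds(records, prop, n=10, largest=True):
    """Return the n records with the largest (or smallest) value of prop.

    records may be any iterable, including a generator that streams
    compounds from disk, so the full dataset need not fit in memory.
    """
    pick = heapq.nlargest if largest else heapq.nsmallest
    return pick(n, records, key=lambda r: r[prop])


def property_report(records, props):
    """Format records as a tab-separated table with a header row."""
    lines = ["\t".join(["name", *props])]
    for r in records:
        lines.append("\t".join([r["name"], *(str(r[p]) for p in props)]))
    return "\n".join(lines)
```

For example, `property_report(top_compounds(stream, "mol_weight", n=20), ["mol_weight"])` would list the twenty heaviest compounds from a hypothetical `stream` of records.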
Managing Databases: Coping with
extremely large numbers.
• Are scientists going to manage in the new
world without an in-depth knowledge of
mathematics and statistics?
• Are we training scientists to cope with tools
such as cluster analysis, discriminant
analysis, cross-validation techniques, neural
networks, Fourier transformations, etc.?
• Do we need courses in the design, building
and interrogation of databases?
Building a Feature-Based Hypothesis
and Searching Databases
Setting distance constraints.
Hybrid hypothesis of chemical
functions and fragments.
Using the generic β-adrenergic
agonist to search a database.
Databases?
• Global structural databases, e.g. CCSD.
• Structure-specific databases.
• Therapeutic-area-based databases.
• Multi-conformer databases.
• Composite databases that encapsulate the
information of multi-gigabyte files.
• QSAR-based databases.
• Commercially available versus
problem-specific databases.