ChemAxon Presentation

Solutions for Cheminformatics
Migration from ISIS environment
Szabolcs Csepregi et al
November 2008
Migration - Topics
• ChemAxon - Product Overview
• From ISIS/Host and MDL Direct to JChem Cartridge
• Alternatives to Cheshire (Standardizer)
• From ISIS/Base to Instant JChem
• From ISIS for Excel to JChem for Excel
• Migrating Custom Applications
• ChemAxon Web Services
• Appendix: ChemAxon for Developers (Resources)
Product Map
ChemAxon Embedded - examples
• Workflow
– Pipeline Pilot, Inforsense, and KNIME.
• ELN
– Agilent, Contur, DeltaSoft, Kinematic, etc.
• SAR
– Spotfire, Synaptic Science, Omniviz
• Databases
– Aureus, GVK, Jubilant Biosys, Patcore
• Web
– Thomson Reuters, Wiley, Houghton Mifflin, Cengage,
Prentice Hall, Collaborative Drug Discovery, RCSB PDB,
BindingDB, NIH/NLM ChemIDPlus, Molport etc
MarvinSketch/View http://www.chemaxon.com/MarvinSketch_View.ppt
MarvinSpace
http://www.chemaxon.com/MarvinSpace.ppt
The Marvin family
• MarvinSketch: structure, query & reaction editing
• MarvinView: individual and structure table visualization
• MarvinSpace: publication quality macromolecule visualization
Available as Java applets for HTML pages and Java beans for standalone apps (full API)
Marvin Development History
Timeline 1998-2008; feature groups in slide order:
• Applets, Molfiles, stereo support, Windows, Unix
• SDF, RDF, XYZ animations, CML, templates, compressed formats, Swing, 3D models
• SMILES, SMARTS, PDB, Rgroups, isotopes, shortcuts, Marvin Beans
• Ball and stick, JPG, PNG, SVG, Cut&Paste with Isis/ChemDraw, 2D cleaning, (de)aromatization, reaction drawing
• Mac support, signed applets, Java Web Start, atom mapping
• Partial charge, pKa, logP/logD, 3D optimization, radicals, abbreviated groups
• Marvin file format, enhanced stereo, shapes, text boxes, multiple groups, link nodes, TPSA, recursive SMARTS, Donor/Acceptor, electron arrows
• Tautomers, resonance, lone pairs, conformers, 3D sketching, MarvinSpace
• More Plugins, more R-groups, EMF, PDF and Mol2, improved property storage in MRV, SDfiles and RDfiles, .NET support in MarvinBeans
• Structure to name, coordination compounds, polymer drawing, OLE, Markush enumeration plugin
• Configurations, Name to structure, OLE 2, Chemical Terms, customizable GUI, topology analysis, presentation quality graphics,...
Calculator Plugins http://www.chemaxon.com/Calculator_Plugins.ppt
Calculator Plugins
A variety of structure based calculations are available from
the Marvin GUI, cxcalc command line tool and the API. The
calculations are widely used within several JChem tools and
are available as functions of Chemical Terms expressions.
• Elemental Analysis
• IUPAC Name: Standard IUPAC Name
• Protonation: pKa, Major Microspecies, Isoelectric Point
• Partitioning: logP, logD
• Charge: Charge, Polarizability, Orbital Electronegativity
• Isomers: Tautomerization, Resonance, Stereoisomer
• Conformation: Conformer, Molecular Dynamics
• Geometry: Topology Analysis, Geometry, Polar Surface Area (2D), Molecular Surface Area (3D)
• Markush enumeration
• Other: Hydrogen Bond Donor-Acceptor, Huckel Analysis, Refractivity
Chemical Naming
Structure to Name/ Name to structure
Supported nomenclatures:
• Chains, Monocycles / Traditional names with and without heteroatom / Spiro ring systems / Ethers / Common characteristic groups, Ionic compounds / Unlimited number of atoms and rings / All atom types / Stereochemistry / etc.
Usage:
• drag&drop or copy&paste to MarvinSketch
• Label updated in real-time
• Automatic format recognition
• Batch from command line
JChem family
• JChem Base: fast substructure and similarity searching; ChemAxon’s proprietary database structure
• JChem Cartridge: tight Oracle SQL integration; arbitrary database structure
• Instant JChem: desktop application for scientists; access local and remote databases
JChem development history
• 2000: Oracle, MySQL, SQLServer, Access, hashed fingerprints, substructure and similarity search
• 2001: Clustering, diversity
• 2002: DB2, PostgreSQL, Rgroup searching
• 2003: Reaction searching, fragmentation, reaction processing, standardization, pharmacophores, screening
• 2004: Cartridge, enhanced stereo searching, recursive SMARTS, Chemical Terms, virtual synthesis
• 2005: R-decomposition, R-enumeration, reaction library, custom fingerprints, random synthesis, link nodes…
• 2006: Tautomer search, Instant JChem, reaction similarity, Library MCS, GUI for Standardizer/Reactor…
• 2007: Calculated columns, Installer, Tautomer Duplicate filtering, Query tables, Markush tables, speed enhancements for JChem Cartridge, form design, relational data for Instant JChem...
• 2008: Position variation queries, Instant JChem (federated search, Cartridge support...), JChem for Excel
Structural Search http://www.chemaxon.com/Structural_Search.ppt
JChem Base http://www.chemaxon.com/JChem_Base.ppt
JChem Base
Features
• Fast and sophisticated searching (chemical and non-chemical data, Chemical Terms filter, many options)
• Custom standardization
• Calculated columns
• Combinatorial Markush structure tables
Interfaces
• Integration with most relational database engines
• JChem Cartridge for tight Oracle SQL integration
• JSP integration – open source web example
• Desktop-ready through Instant JChem
Searching in combinatorial Markush structures
Combinatorial Markush structure registration and search
• Markush features handled in search &
enumeration:
• R-groups (nesting to any depth)
• Atom lists, bond lists
• Position variation bond
• Link nodes
• Compatible Markush enumeration plugin
• Not all query features supported
Detailed description:
http://www.chemaxon.com/product/markush_search.html
JChem Cartridge http://www.chemaxon.com/JChem_Cartridge.ppt
JChem Cartridge for Oracle
• Access JChem functionality via SQL functions
• All search features of JChem Base
• JChem index for chemical data in arbitrary database structure
• Chemical filters and property predictors using Chemical Terms
• Standardization (structure canonicalization) during registration
• Structure format conversions
• 2D, 3D image generation
• Library enumeration using
virtual reactions and
Markush structures
Instant JChem: http://www.chemaxon.com/conf/Instant_JChem.ppt
Instant JChem
Desktop application for local and remote chemical database
management, search and structure based prediction
• Simply connect to external databases and share your native database simultaneously
• Powerful search functionalities
• Scalable – explore large datasets (10^6+)
• Dynamically predict properties using Calculator Plugins
• Apply canonicalization rules for import and viewing
• Wide import / export options
• Merge data sets into a single set
• Very active development – what do you want to do?
JChem for Excel
• Microsoft Excel integrated solution for Marvin and JChem functionality
• Use Excel’s powerful features: Functions, Sorting, Filtering, Charts…
• Implemented in C# .NET, and Visual Studio
– Proof that ChemAxon APIs can be used in a Java-less .NET environment
• Easy to install and deploy
• UNDER DEVELOPMENT
Standardizer http://www.chemaxon.com/Standardizer.ppt
Canonicalization with Standardizer
• Structure canonicalization
– Mesomers
– Tautomers
– Solvent and counter ion removal
– Aromatization, dearomatization
– Explicit/implicit hydrogen conversion
– Stoichiometry expansion
– Stereo manipulations
– 2D cleaning
– Template based cleaning
• Custom rules
• Availability
– JChem Base, Cartridge & IJC
– API (Java and .NET)
– Batch processing
– GUI
Drug discovery tools
• JKlustor: profiling, analysis, diversity
• Screen: virtual screening by topological descriptors
• Fragmenter: library profiling and reactant generation
• Reactor: virtual reactions and synthesis
Contents
• A short introduction of JChem Cartridge
• MDL/Symyx features in JChem
• Migration from MDL/Direct and ISIS/Host
• Migration case studies and user feedback
Purpose of JChem Cartridge
• Access JChem functionality using SQL:
SELECT count(*) FROM nci WHERE jc_contains(structure, 'Brc1cnc2ccccc12') = 1
Access JChem in any programming environment offering Oracle connectivity (.NET, Java, Perl, PHP, Python, Apache mod_plsql...)
• Execute SQL queries efficiently using extensible indexes
Precompute chemical information on structures by creating jc_idxtype indexes:
CREATE INDEX jcxnci ON nci(structure) INDEXTYPE IS jc_idxtype
The jc_idxtype implementation scans the indexed column for eligible structures in a single performance-optimized operation: a domain index scan
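As an illustration of the "any programming environment" point, here is a minimal Python sketch that builds the count query from the slide with a bind variable. It assumes the cartridge operator accepts a bind variable for the query structure in the same way ordinary Oracle operators do; the function names are hypothetical, and the cursor can be any Python DB-API cursor that supports named binds.

```python
import re

# Plain Oracle-style identifiers only; anything else is rejected rather than quoted.
_IDENTIFIER = re.compile(r"^[A-Za-z][A-Za-z0-9_$#]*$")


def substructure_count_sql(table, column):
    """Build a count query around the jc_contains operator shown on the slide."""
    for name in (table, column):
        if not _IDENTIFIER.match(name):
            raise ValueError(f"not a plain SQL identifier: {name!r}")
    # The query structure is passed as the named bind variable :query.
    return f"SELECT count(*) FROM {table} WHERE jc_contains({column}, :query) = 1"


def count_substructure_hits(cursor, table, column, query_structure):
    """Execute the query on a DB-API cursor (hypothetical helper, not a JChem API)."""
    cursor.execute(substructure_count_sql(table, column), {"query": query_structure})
    return cursor.fetchone()[0]


if __name__ == "__main__":
    print(substructure_count_sql("nci", "structure"))
```

Keeping the structure in a bind variable avoids pasting user-drawn query text into the SQL string; only the table and column names are interpolated, and those are validated first.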
Features of JChem Cartridge
• Adds chemistry knowledge into the SQL language of Oracle
(SELECT, INSERT, UPDATE, ...)
• Substructure, superstructure, exact structure, similarity
searching
• Fast: typically 10k hits in 3M structures within a second
• Complex chemical expressions using the Chemical Terms
language that includes logP, pKa, ...
• Automatic property calculation during registration
• Standardization (canonicalization) during registration
• Structure format conversions (MRV, Molfile, SDfile, RDfile,
SMILES, CML, etc.)
• 2D, 3D image generation
• Structure enumeration using reaction rules
• Interaction with Oracle optimizer
Operators and functions
• jc_compare: substructure/similarity/exact searching combined with Chemical Terms expressions
• jc_matchcount: number of occurrences of the query structure in the target
• jc_evaluate: Chemical Terms evaluation
• jc_molweight: molecular weight
• jc_formula: molecular formula
• jc_react: structure enumeration based on virtual reactions
• jc_standardize: structure canonicalization
• jc_molconvert: conversion to different formats (image generation is supported)
• jc_tanimoto: similarity search
• jcf.hitColorAndAlign: substructure coloring and alignment
Similarity search example displaying ID, SMILES code, and molweight:
SELECT cd_id, cd_smiles, cd_molweight FROM my_structures
WHERE jc_tanimoto(cd_smiles,
'CC(=O)Oc1ccccc1C(O)=O') >= 0.8;
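The threshold in the query above is a Tanimoto coefficient: the number of fingerprint bits two structures share, divided by the number of bits set in either. The sketch below shows that arithmetic on toy fingerprints given as sets of bit positions. It is not JChem's hashed fingerprint implementation, and the compound names and bit sets are made up for illustration.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient of two fingerprints given as sets of 'on' bit positions."""
    a, b = set(fp_a), set(fp_b)
    union = len(a | b)
    if union == 0:
        # Convention chosen for this sketch: two empty fingerprints count as identical.
        return 1.0
    return len(a & b) / union


def similar(fingerprints, query_fp, threshold=0.8):
    """Return (key, score) pairs whose Tanimoto score reaches the threshold."""
    hits = []
    for key, fp in fingerprints.items():
        score = tanimoto(fp, query_fp)
        if score >= threshold:
            hits.append((key, score))
    return hits


if __name__ == "__main__":
    # Hypothetical toy data: bit positions stand in for structural features.
    library = {"cpd_1": {1, 2, 3, 4, 5}, "cpd_2": {2, 3, 4, 9}, "cpd_3": {7, 8}}
    print(similar(library, {1, 2, 3, 4}, threshold=0.8))
```

With the toy data, only cpd_1 shares four of five bits with the query and so reaches the 0.8 threshold used in the SQL example.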
Structure search features
• Wide range of query atoms
• Query properties
• R-group queries
• Full SMARTS support
• Coordination compounds
• Link nodes
• Position variation
• Pseudo atoms
• Lone pairs
• Relative stereo
• Reaction search features
• Hit coloring
See detailed information on structure search:
www.chemaxon.com/conf/Structural_Search.ppt
Search options
• Chemical Terms filter constraint
• Tautomer search
• sp hybridization state check
• Stereo on/off
• Ignore charge/isotope/radical/valence/mixture brackets
• Vague bond matching modes: "or aromatic"; ignore bond types
• Inverse hit list
• Maximum search time / number of hits
• SQL SELECT statement for pre-filtering
• Ordering of results
• etc.
Compatibility and integration
File formats:
• SMILES
• MDL molfile (v2000 and v3000)
• MDL SDF
• RXN
• RDF
• MRV
• IUPAC name, InChI
Operating systems:
• Windows
• Linux
• Solaris
• HP-UX
• etc.
DB engines:
• Oracle versions 9i R2 or above
• For alternative RDBMS systems, see the JChem Base presentation: http://www.chemaxon.com/JChem_Base.ppt
Index parameters
Index parameters affect:
• Fingerprint attributes
• Standardizer configuration
• Table space and storage options of the index table
Examples:
• Standardization by stripping hydrogens and using basic aromatization:
CREATE INDEX jcxnci ON nci(structure) INDEXTYPE IS jc_idxtype
PARAMETERS('STD_CONFIG=dehydrogenize:optional..aromatize:b')
• Add structural keys to the fingerprint for more efficient substructure searching (structural keys are defined in table stfp_keys):
CREATE INDEX jcxnci ON nci(structure) INDEXTYPE IS jc_idxtype
PARAMETERS('STRUCTURALFP_CONFIG=select structure from stfp_keys')
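When index creation is scripted, the statements above can be generated rather than typed by hand. The following Python sketch composes a CREATE INDEX statement with at most one parameter, using only the two parameter names that appear on the slide. The helper name is hypothetical, and how several parameters are combined in one PARAMETERS string is not shown in the source, so the sketch deliberately does not attempt it.

```python
def create_jchem_index_sql(index, table, column, parameter=None):
    """Compose a CREATE INDEX statement for a jc_idxtype index.

    `parameter` is an optional (name, value) pair such as
    ("STD_CONFIG", "dehydrogenize:optional..aromatize:b").
    """
    sql = f"CREATE INDEX {index} ON {table}({column}) INDEXTYPE IS jc_idxtype"
    if parameter is not None:
        name, value = parameter
        # Double any single quotes so the text stays a valid SQL string literal.
        text = f"{name}={value}".replace("'", "''")
        sql += f" PARAMETERS('{text}')"
    return sql


if __name__ == "__main__":
    print(create_jchem_index_sql(
        "jcxnci", "nci", "structure",
        ("STD_CONFIG", "dehydrogenize:optional..aromatize:b"),
    ))
    print(create_jchem_index_sql(
        "jcxnci", "nci", "structure",
        ("STRUCTURALFP_CONFIG", "select structure from stfp_keys"),
    ))
```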
Supported Column Types
• VARCHAR2: typically for short formats, e.g. SMILES
• CLOB, BLOB: for longer formats, e.g. MDL molfile, Marvin (mrv)
MDL Feature Compatibility
The learning curve for chemists familiar with ISIS is very short. After some practice, users report Marvin to be a more productive drawing environment. Most MDL features are available in Marvin and JChem, along with many others that are not available in MDL technology.
• Generic atom types and bond types
• Link Nodes
• R-groups, R-logic
• Atom query properties
• Stereochemistry
– Chiral flag
– Parity
– Double bond stereo
– Enhanced stereo (abs/and/or)
– inv/ret
• Atom Lists/ not lists
• Aliases
• Pseudo atoms
• Atom values
• Group and brackets
– Abbreviated groups
– Multiple groups
– Repeating units
– Polymers
– Mixtures
– Attached Data
• Reacting center on bonds
• Reaction mapping
• Topology (ring/chain)
• Option for ISIS-like look
• Others…
MDL Feature Compatibility
What is missing:
• Polymer search (coming in 5.2)
• Attached data S-group search (coming in 5.2)
• 3D special features
• Exact change flag (reaction)
Migration from MDL/Direct cartridge
• 2 alternatives:
– JChem indexes need to be created on structure columns of
existing tables, or
– Structural data migrated to new tables with JChem
Cartridge indexes
• The MDL/Direct SQL operators need to be changed
to JChem operators in all uses.
• Non-chemistry tables: no need for migration
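Renaming operators across existing SQL can be partly mechanized. The Python sketch below renames operator calls using a mapping supplied by the caller. The old operator name in the example is a made-up placeholder, not a real MDL/Direct operator, and a plain rename assumes the old and new operators take their arguments in the same order, which has to be checked for each pair.

```python
import re


def rename_operators(sql, mapping):
    """Rename calls like old_op(...) to new_op(...), matching names case-insensitively."""
    if not mapping:
        return sql
    names = "|".join(re.escape(old) for old in mapping)
    # \b keeps longer identifiers that merely end in an old name untouched;
    # the trailing "(" restricts the match to call sites.
    pattern = re.compile(r"\b(" + names + r")\s*\(", re.IGNORECASE)
    lowered = {old.lower(): new for old, new in mapping.items()}
    return pattern.sub(lambda m: lowered[m.group(1).lower()] + "(", sql)


if __name__ == "__main__":
    # "legacy_substructure" is a hypothetical stand-in for an existing operator name.
    old_sql = "SELECT id FROM cpds WHERE LEGACY_SUBSTRUCTURE(ctab, :q) = 1"
    print(rename_operators(old_sql, {"legacy_substructure": "jc_contains"}))
```

The rewrite is purely textual, so it would also touch operator names that appear inside string literals or comments; reviewing the rewritten statements by hand remains necessary.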
Migration from ISIS/Host
• The molecule source needs to be accessible to JChem:
– By exporting SD/RDfiles from ISIS and importing them into
new tables with a JChem index, or
– By setting the option in ISIS/Host to include the molfile in RCG tables,
and then:
• Use SQL to insert the mol field into JChem tables, or
• Add a JChem index on the original tables
• ISIS/Host interfaces need to be rewritten to use SQL
only, referencing JChem operators.
• Hviews and GUIs need to be replaced separately.
(See further slides later.)
• Non-chemistry tables: no need for migration
An Independent Comparison
FMC migrated from MDL® ISIS/Base and ISIS/Host to ChemAxon’s JChem.
They later published their detailed scientific comparison.
• Used 1.8 million vendor compounds to create a testing database
• Prepared 115 different query structures for comparison
• 51 simple substructure searches
• 51 similarity searches
• 64 complex searches
Identical search hits in almost all cases; the major differences result from MDL’s incorrect aromatic bond definitions for 5-membered aromatic rings. ChemAxon's approach is the chemically correct one, and its performance is higher (faster).
Chart: Identical Results vs. Differences
Vague Bonds
For the sake of perfect compatibility with MDL searching, ChemAxon provides vague bond options to retrieve results according to MDL systems.
Technical Comparison
• Supported Platforms
– ISIS®: Sun Solaris, Windows Servers
– JChem: Sun Solaris, Windows Servers, Linux, Irix, Mac
• Supported Databases
– ISIS: Oracle
– JChem: Oracle, MySQL, SQL Server, PostgreSQL, Access, DB2
• Processing SD Files
– ISIS: 31 hours, Pipeline Pilot & ISIS
– JChem: 11 hours, JChem
• Technology Transparency
– ISIS®: Unclear Data/Table Structures
– JChem: Clear Understanding of
• Flow of Data
• Structure of Data
• Execution Process
Native Oracle Tables and Procedures
• Performance
– ISIS®: Slow similarity search
– JChem: Fast similarity search
Comparison Conclusions
• Technical Conclusion
– Clear and straightforward understanding of data representation and system architecture
– Integrated system
– Quicker and less error-prone
– Less hassle for software development
From a technical point of view, ChemAxon is favorable
• Business Conclusion
ChemAxon was the better choice
Migration Experience Questionnaire
Five companies were interviewed about their JChem Cartridge migration
experiences in the form of a questionnaire containing 14 questions.
• A UK based service/biotech company
• A Swedish biotech company
• A US branch of a Swiss pharmaceutical company
• A Japanese pharmaceutical company
• A US branch of a Japanese pharmaceutical company
Migration Experience
1. What was the platform you used before the migration?
– All systems were run using the Daycart cartridge on Linux servers
– MDL Cartridge running on Sun Solaris
– Daycart
– We used ISIS/Host as a server, the client was ISIS/Base customized using ISIS/PL
– Daylight and IDBS Chembridge
2. How long did it take to migrate?
– Very simple, hardly any time at all, just a few hours to uninstall old cartridge, install new cartridge and build indexes. Then modify a few SQL statements in the code to use the new cartridge functions.
– It took a full weekend to switch over and convert all old databases.
– Since we use SQL for structure searches, the actually change in the application code are few. Code changes takes about 1 day. However, we spent at least two weeks to compare the daylight and jcart.
– It took 1 year for planning, and another 1 year for designing and developing the system. 1-year-migration time includes all of the operation that is needed. That means our technical people worked for this project 1 year. We migrated the data structure of HView, but the form was re-designed in order to fit our existing (wet) workflow.
– Two months
Migration Experience
3. How many technical people were required in the migration process?
– It was fairly simple so just one developer with all round programming, database, and chemistry knowledge.
– One person
– 2 people
– 6 technical people. 2 were contacting with users. For the system design, 11 users were involved from chemistry, HTS, eADME groups.
– 1.5
4. Why did you decide on leaving the previous platform? (problems)
– Purely the cost. We found the Daycart system to be very good, very stable, fast, and the API was well thought out. However, it was just too expensive for us.
– Old technology not offering new functionality. High cost, in particular for new licenses.
– Daycart (at least at that time) did not take MOL query, not all query structures could be correctly presented as smiles/smarts.
– Two main reasons were the maintainance cost, and the accessibility. We had to suppress the raising system (software) cost, and at the same time we had to enlarge the number of users and client PCs from which we could use DB system.
– Cost, maintenance and risk
Migration Experience
5. What alternative platforms were considered/evaluated?
– Prior to selecting ChemAxon we looked at all the cartridges available at the time
– The Accord cartridge was also evaluated. Some others did not qualify for evaluation.
– None
– Accord (Accelrys), and ChemOffice (Cambridge Soft) were two major alternatives.
– Symyx/MDL Direct Oracle cartridge
6. Why did you choose ChemAxon technology? (advantages)
– Cost was a major factor, but also because we felt we could work with ChemAxon to develop the tools further as we wanted to use them. A very open approach. Another reason was that all the tools we needed were available from a single vendor, i.e. Oracle cartridge for searching, and sketching and viewing tools.
– Almost as good as Accord but with better impact on improvement and support.
– Marvin Sketch and JCart represent the molecules in MOL using exactly the same backend library. MOL is used instead of smiles/smarts. Much faster search. Price is good.
– We could keep the cost lowest by using ChemAxon, and more than that, the affinity for the web technology was favorable to our future vision of the cheminformatics system.
– The greatest advantage is the low cost and great support. We have always had MDL/Direct cartridge, but the greatest advantage is the low cost and stellar support speaks specifically to ChemAxon.
Migration Experience
7. What were the most problematic issues that occurred during the migration? (negative impressions)
– Understanding the finer points of all the search functions / options i.e. precisely how things like aromaticity, stereochemistry, etc. are handled. We've also had to spend time considering how to restandardise structures and how to rewrite SQL. When doing a straight forward structure search (i.e. benchmarking), the JChem cartridge performs very well against other systems such as Daylight, however, if you want to incorporate joins between tables can considerably affect the query times even when using what we call ChemAxon SQL.
– Structure matching bugs in the cartridge and undocumented actions needed to be performed.
– JCart installation was not so smooth 3 years ago. Much better now. Most of the problem and issues are because some structures are interpreted different between the two software. Some are Daylight bugs and others are jchem bugs. JChem has fix all their share.
– There were little problem, what I remember is that the response was slower than expected when the chemical object was included in the page.
– Identifying all the integration points.
Migration Experience
8. How did you overcome these difficulties? (resolutions)
– We spent a lot of time experimenting with the different functions/options so we completely understand what they do.
– The structure search bugs was overcome by rewriting the registration procedures, undocumented actions were overcome by hard work.
– Wait until major bugs in JChem are fixed. We live with about 0.01% of inconsistencies and work it out later.
– The needless chemical objects were replaced by pictures.
– Availability and quick turn around to patch any
9. Did you expect any other problems that did not occur? (positive impressions)
– We though there may be problems running two different cartridges on the same table but this worked fine
– Not really. Most MDL features were available in JChem. This was one of the selection criteria, particularly important for chemical registration.
– No
– We expected that the transfer of the existing data might be problematic, and that the system change might be inconsistent with existing 'wet' workflow. That was why we organized 11 users as a system designing team, and I think the team worked well.
– Migration went very smooth
Migration Experience
10. What additional components were purchased together with the JChem Cartridge?
– Most of them!
– Descriptor calculations.
– None. User probably should consider plug-ins for calculating HBD, HBA, logp, psa, etc. We did not because we need to stick to CLOGP in order to be consistent with the rest of the company.
– Standardizer.
– Standardizer.
11. How much technical support did you need from ChemAxon for the migration?
– Initially quite a lot, though the products have been developed a lot since then. We haven't required much support for structure migration, but we've also migrated a load of SMIRKS and we've needed support for that mainly because of the way in which they were handled in the old system (non-standard).
– A few needed support cases where filed on the support forum and fairly quickly resolved.
– Lots, we had close communication with dev team during the migration.
– Our technical people sent e-mail several times to your support team.
– Little.
Migration Experience
12. Were/Are you satisfied with the ChemAxon support?
– Yes. Support has always been good.
– Yes very satisfied. The support has always been very fast and accurate.
– Yes.
– Yes.
– Yes.
13. Did the migration reach its original goals?
– So far, yes! The systems are up and running.
– Yes.
– Yes.
– Yes.
– Yes.
Migration Experience
14. Are you satisfied with the performance/functions of the ChemAxon powered system?
– The number of functions available and flexibility of the JChem tools is excellent, and allows us to develop very interesting and useful drug discovery software for our scientists.
– Yes.
– Yes.
– Yes.
– Yes.
Useful migration resources
ChemAxon's Marvin & JChem (v 3.1.3) vs. MDL® ISIS/Draw ISIS/Host (v 4.0)
Seong Jae Yu, David Roush*, Usha Ganesh, Young Moon, Henry Liu, FMC Corp.
http://www.chemaxon.com/conf/FMC_ChemAxon_JCHEM_Cart_xnotes.ppt
User Group Meeting presentations:
http://www.chemaxon.com/UGM/ugm_land.html
Cheshire Alternatives from ChemAxon
What is Cheshire?
“Cheshire is a scripting language that enables you to
write scripts to validate, modify, or gather information
about chemical structures, such as molecules and
reactions.”
What alternatives can ChemAxon offer?
• ChemAxon’s Java API (also available from .NET)
• Chemical Terms
• Standardizer
Java API for Cheminformatics from
ChemAxon
ChemAxon’s class library consists of more than 1500 chemistry related classes tuned
for usability and high performance.
Chemical Terms
Chemical Terms offers more than a hundred popular chemistry functions, opening up the
power of cheminformatics for scientists who focus on quick results rather than the
details of programming and scripting. Integrating Chemical Terms makes
chemistry applications smarter and more customizable.
charge() and match(amine) or
match(hydrazine)
Standardizer for Batch Conversion
Standardizer is a batch conversion utility providing many useful and customizable
functions for the canonicalization of chemical structures and for the restoration of
chemical information in structures from older databases.
Standardizer Actions
Transform
Clear Stereo
Aromatize
Set Absolute Stereo
Dearomatize
Remove Absolute Stereo
Add Explicit Hydrogens
Convert Wedge Interpretation
Remove Explicit Hydrogens
Convert Double Bonds
Clean2D
Alias to Group, Alias to Atom
Clean3D
Contract Group
Wedge Clean
Expand Group
Clear Isotopes
Ungroup
Remove Fragments
Expand Stoichiometry
Remove R-groups
Map Reaction
Tautomerize
Unmap
Mesomerize
Neutralize
Counting Groups – Cheshire
Counting O=S=O groups in Cheshire
Counting Groups – Java API
Counting any functional groups with ChemAxon’s Java API
Counting O=S=O groups in Chemical Terms
Adding Explicit Hydrogens - Cheshire
Adding explicit hydrogens and cleaning the molecule in Cheshire
Adding Explicit Hydrogens – Java API
Adding explicit hydrogens and cleaning the molecule with ChemAxon’s Java API
Adding Explicit Hydrogens –
Standardizer
Adding explicit hydrogens and cleaning the molecule with Standardizer
The same in command line
Group Conversions – Cheshire
Conversion of neutral form of nitro to the ionic one in Cheshire
Group Conversions – Java API
Conversion of neutral form of nitro to the ionic one with ChemAxon’s Java API
Group Conversions – Standardizer
Conversion of neutral form of nitro to the ionic one in Standardizer
The same in command line
Structure Checker Framework
The new Structure Checker framework will provide plenty of validation and correction
functions to detect and repair defective or unpreferred structures.
• ValenceChecker
• ChiralFlagChecker
• AromaticityChecker
• CovalentSaltChecker
• OverlappingAtomsChecker
• FerroceneChecker
• OverlappingBondsChecker
• CumulatedRingBondChecker
• CrossedDoubleBondChecker
• UnbalancedReactionChecker
• WigglyDoubleBondChecker
• MultistepReactionChecker
• WedgeBondsChecker
• AtomMapChecker
• BondLengthChecker
• MissingAtomMapChecker
• BondAngleChecker
• AtomMapStyleChecker
• AliasChecker
• RgroupQueryChecker
• PseudoAtomChecker
• MarkushChecker
• AbbreviatedGroupChecker
• 3DCoordinateChecker
• MultiComponentChecker
• MolfileChecker
• QueryChecker
• RxnfileChecker
• MoleculeChargeChecker
• SmilesChecker
• RadicalChecker
• SmartsChecker
• IsotopeChecker
• InchiChecker
• ExplicitHydrogenChecker
• PeptideSequenceChecker
• StereoDoubleBondChecker
• CmlChecker
• TetrahedralStereoAtomChecker
• PdbChecker
• UnspecifiedStereoDoubleBondChecker
Summary
• ChemAxon’s Java API gives programmers freedom and flexibility similar to Cheshire to develop chemistry functions for any tier, such as web clients, desktop applications, server systems and Oracle stored procedures.
• Java is a standard language with a worldwide community, rich resources and many well-educated developers. (The ChemAxon Java API is also accessible from .NET.)
• Chemical Terms provides more than a hundred high-level, ready-to-use functions that replace dozens of lines of complex Cheshire code.
• Chemical Terms expressions can be used directly in database filters, virtual reactions, pharmacophore definitions or other cheminformatics applications.
• Standardizer is an easy-to-use batch tool and graphical interface that lets chemists create conversion rules without writing a single line of code.
• The upcoming Structure Checker will provide an extensible set of quick “problem detection” functions that can be integrated into any application and will be added to Marvin and Standardizer as well.
Instant JChem is…
• An “out of the box” desktop
application designed for
biologists and chemists
• A modular platform for
developing chemistry
applications
Instant JChem lets users…
• Create or connect to existing structure databases
• Easily manage relational data
• Import/export/merge/edit data
• Build forms for reporting
• Run combined structure + data searches
• Perform structure based predictions
• Access sophisticated chemistry features
• Collaborate with other users
Feature comparison to ISIS/Base
Feature | ISIS/Base | Instant JChem
Databases | Local + Oracle (differences in stereochemistry and calculations etc.). | Local + Oracle + MySQL (no differences in local and remote db functionality).
Forms | Form builder. | Form builder.
Tabular view | Limited. | Comprehensive.
Relational data | Hview | DataTree.
Deployment | Installer only. | Installer or Java Web Start. Many deployment features.
Collaboration | Limited. | Many collaboration and sharing features.
Scalability & performance | Limited, especially for local DBs. | Good.
IJC Architecture
• Built on modular platform
– Allows easy extension by
ChemAxon, customers
and 3rd parties
– Strong enforcement of
APIs
• API
– Allows extension
– IJC functionality is built
upon these APIs
Current architecture
[Diagram: IJC Client; Local DB; Remote DB; Oracle cartridge; Database]
IJC server architecture
[Diagram: Database; IJC Client; IJC Server; IJC Services API; Oracle cartridge; Web Apps; Web services]
IJC server due Q1 2009
Migration issues: general
Database artefacts
IJC is currently table based. Access to views, synonyms
etc. is currently being added.
Use of database links has not been investigated yet, but no
particular problems expected.
Security model
Current implementation provides basic access control:
1. Read-only
2. Read-write
3. Edit database model
Users can create forms/lists etc. even in read-only mode,
but can’t modify data that affects other users.
Security integration
LDAP is probably the most suitable, but the security
implementation is quite flexible and customisable.
Migration issues: migration of ISIS DBs
Hview vs. Data Tree
No direct conversion, so this would currently need to be
done manually, though some automation is potentially
possible.
Data Tree is modelled on the same approach as Hview, so
migration in most cases should be relatively simple.
Forms
No direct import from ISIS, but creating IJC forms is very
simple and fast.
Customisation
Currently no equivalent to ISIS/PL. ISIS applications with
complex logic may be more suited as IJC extensions, or as
standalone web or JWS applications.
Hview vs. Data Tree: standard tables
One-to-many relationship
master_table (master_table_id, col1, col2, col3) 1 to * detail_table (detail_table_id, master_table_id, cola, colb, colc)
ISIS Hview
HVIEW my_data
TREE master
DEVICE oracle
USERNAME scott
PASSWORD tiger
TNAME master_table
TREE detail
DEVICE oracle
USERNAME scott
PASSWORD tiger
TNAME detail_table
LINK master (master_table_id) over detail (master_table_id)
IJC Data Tree
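The master/detail link over master_table_id on this slide is an ordinary foreign-key relationship. The Python sketch below reproduces it with the standard-library sqlite3 module standing in for Oracle, and groups detail rows under their master row the way a tree view would. The sample rows are made up, only a subset of the slide's columns is created, and this is not how Instant JChem builds its Data Tree internally.

```python
import sqlite3


def build_demo(conn):
    """Create the two tables from the slide (reduced columns) with sample rows."""
    conn.executescript("""
        CREATE TABLE master_table (
            master_table_id INTEGER PRIMARY KEY,
            col1 TEXT
        );
        CREATE TABLE detail_table (
            detail_table_id INTEGER PRIMARY KEY,
            master_table_id INTEGER REFERENCES master_table(master_table_id),
            cola TEXT
        );
        INSERT INTO master_table VALUES (1, 'first'), (2, 'second');
        INSERT INTO detail_table VALUES (10, 1, 'a'), (11, 1, 'b'), (12, 2, 'c');
    """)


def details_by_master(conn):
    """Return {master_table_id: [cola, ...]}, following the link over master_table_id."""
    rows = conn.execute("""
        SELECT m.master_table_id, d.cola
        FROM master_table m
        LEFT JOIN detail_table d ON d.master_table_id = m.master_table_id
        ORDER BY m.master_table_id, d.detail_table_id
    """)
    tree = {}
    for master_id, cola in rows:
        children = tree.setdefault(master_id, [])
        if cola is not None:  # a master row without details keeps an empty list
            children.append(cola)
    return tree


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    build_demo(connection)
    print(details_by_master(connection))
```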
Hview vs. Data Tree: Mol + Rxn tables
ISIS Hview
<RC tables> 1 to * inventory (inventory_id, molregno, cola, colb, colc)
HVIEW cpds_inv
TREE compounds
DEVICE chemicaldb
USERNAME CPD/CPD
PASSWORD
TNAME compounds
TREE inventory
DEVICE oracle
USERNAME scott
PASSWORD tiger
TNAME inventory
LINK compounds (molregno) over inventory (molregno)
IJC Data Tree
compounds (molregno, structure [jc_index]) 1 to * inventory (inventory_id, molregno, cola, colb, colc)
Migration options
• Simple local or Host based databases used primarily for
searching/reporting
– Migrate to IJC
• ISIS/Base application with complex application logic but standard
(Hview based) data structure (e.g. registration applications)
– Create custom IJC extension module built upon IJC API
(much of the existing IJC functionality is essentially done this way)
• Application with complex data model and logic
– Either: Create standalone web application
– or: Create standalone JWS application
– or: Create custom IJC module that defines its own data access API.
Migration: local ISIS databases
• Analyse data hierarchy
• Export data as SDF/RDF
• Import into IJC
• Build forms
• It may be possible to automate this by writing a
COM application that reads data from ISIS and
writes it to an Oracle database.
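Before importing an exported SDF file, it can be useful to check which data fields each record carries. The following is a minimal sketch that relies only on the usual SDF conventions (a molfile block ending in `M  END`, data headers of the form `> <FIELD>`, and `$$$$` as record delimiter). The function names and the sample record are illustrative and are not part of any ChemAxon tool.

```python
SAMPLE_SDF = """\
methane
  sketch

  1  0  0  0  0  0  0  0  0  0999 V2000
    0.0000    0.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
M  END
> <ID>
1

> <NAME>
methane

$$$$
"""


def _split_record(record):
    """Split one record's lines into the molfile text and a field dict."""
    end = record.index("M  END")          # last line of the molfile block
    molfile = "\n".join(record[: end + 1])
    fields = {}
    name = None
    for line in record[end + 1:]:
        if line.startswith(">"):
            start = line.find("<")
            stop = line.find(">", start + 1) if start != -1 else -1
            if start == -1 or stop == -1:
                name = None               # header without a <FIELD> name
                continue
            name = line[start + 1:stop]
            fields[name] = []
        elif line and name is not None:
            fields[name].append(line)     # value line of the current field
        else:
            name = None                   # a blank line ends the value
    return molfile, {key: "\n".join(value) for key, value in fields.items()}


def read_sdf_records(lines):
    """Yield (molfile, fields) for every record terminated by $$$$."""
    record = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "$$$$":
            yield _split_record(record)
            record = []
        else:
            record.append(line)


records = list(read_sdf_records(SAMPLE_SDF.splitlines()))
```

Collecting the set of field names over all records gives a quick picture of the columns to expect after the import.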
Migration: ISIS/Host databases
• Analyse database tables and Hview
• Migrate RCG tables to JChem table(s)
• Connect IJC to the database
• Promote tables/columns/foreign keys into
IJC
• Assemble IJC Data Tree
• Build forms
• Some automation may be possible.
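The analysis step, finding the tables and the foreign keys that will later be promoted, can be scripted against the database catalogue. The sketch below uses an in-memory SQLite database purely as a stand-in so that it runs anywhere. The slides describe Oracle, where the same information would come from Oracle's own data dictionary instead. Table and column names are taken from the one-to-many slide, and the function name is hypothetical.

```python
import sqlite3

# Stand-in schema (assumption: SQLite instead of Oracle, simplified columns).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE master_table (
    master_table_id INTEGER PRIMARY KEY,
    col1 TEXT
);
CREATE TABLE detail_table (
    detail_table_id INTEGER PRIMARY KEY,
    master_table_id INTEGER REFERENCES master_table(master_table_id),
    cola TEXT
);
""")


def foreign_keys(connection):
    """Return (child table, child column, parent table, parent column) tuples."""
    tables = [row[0] for row in connection.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
    found = []
    for table in tables:
        # foreign_key_list columns: id, seq, table, from, to, ...
        for row in connection.execute(f'PRAGMA foreign_key_list("{table}")'):
            found.append((table, row[3], row[2], row[4]))
    return found


links = foreign_keys(conn)
```

Each tuple in `links` corresponds to one parent/child edge that would need to appear in the Data Tree.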
JChem for Excel
• Microsoft Excel integrated solution for Marvin and JChem
functionality
• Use Excel’s powerful features: Functions, Sorting, Filtering,
Charts…
• Implemented in C# .NET using Visual Studio
– Proof that ChemAxon APIs can be used in a Java-less
.NET environment
• Easy to install and deploy
• UNDER DEVELOPMENT
ISIS for Excel to JChem for Excel
• Import ISIS SARTables (January 2009)
– Workbook exported from ISIS for Excel
• Migration of standard ISIS Workbooks?
Custom Applications
• Java Applications
– Swing
• .NET Applications
– JNBridge: Commercial Java - .NET Proxy
– Byte Code to IL (.NET binary) translation (IKVM)
• Open Source, very good performance
• No full GUI support at the moment, but coming
• JChem for Excel is built using IKVM
• Web Based Applications
– JSP, ASP.NET, AJAX
• SOAP
Custom Components
• Plans to release custom components
– Java Swing
– AJAX Examples
– .NET
• Visual Studio integrated
• Windows Forms (from JChem for Excel), WPF?
• ASP.NET
• ASP.NET AJAX, MVC
.NET Integration Enhancements
• Problem: ChemAxon API uses Java
Classes, not familiar to .NET developers
• Higher Level .NET wrappers, components
– Properties, Events
– Search results in DataSet, IDataReader
– LINQ, IEnumerable interfaces
– GUI Components: DataGridView, Property Grids,
Components for Search
Custom Application Migration and Development
• Resources and experience for migrating
custom ISIS (Host/Base) based applications
– ISIS Forms to other applications
– Procedural Language (ISIS/PL)
• Consultation
– Help with custom application development on
ChemAxon platform
– Both in-house (CXN) staff and partner companies
are available
– Custom/prioritised improvements of ChemAxon
products
Web Services
• Extends ChemAxon functionality to Web
applications
• Enables interoperability from multiple
programming languages with SOAP Protocol
• Allows migration of existing web applications
to ChemAxon services
• Encourages creation of new web applications
Service Modules
• Application Building Blocks
– DB Searching
• Substructure, Similarity, Exact, etc.
– Molecular Standardization
– Clustering and Diversity
• Chemically Intelligent Tools
– Shorthand Chemical Terms and Calculator Plugins
• Lipinski Rule of 5, pKa, logP, logD, etc.
– Molecular Format Conversion
– Image Generation
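As an illustration of the kind of check the Lipinski Rule of 5 item refers to, the sketch below applies the four commonly cited thresholds (molecular weight at most 500, logP at most 5, at most 5 hydrogen bond donors, at most 10 hydrogen bond acceptors) to property values that are assumed to have been calculated already. The class, the function names and the input numbers are hypothetical; this is not the ChemAxon implementation or its Chemical Terms syntax.

```python
from dataclasses import dataclass


@dataclass
class Properties:
    """Pre-calculated properties for one structure (hypothetical container)."""
    mol_weight: float
    logp: float
    h_bond_donors: int
    h_bond_acceptors: int


def lipinski_violations(p):
    """Return the names of the Rule of 5 criteria that the values violate."""
    checks = [
        ("mol_weight > 500", p.mol_weight > 500),
        ("logP > 5", p.logp > 5),
        ("h_bond_donors > 5", p.h_bond_donors > 5),
        ("h_bond_acceptors > 10", p.h_bond_acceptors > 10),
    ]
    return [name for name, violated in checks if violated]


def passes_rule_of_five(p, max_violations=1):
    """A common convention tolerates at most one violation."""
    return len(lipinski_violations(p)) <= max_violations


# Made-up input values, for illustration only.
small = Properties(mol_weight=350.0, logp=2.1, h_bond_donors=2, h_bond_acceptors=5)
large = Properties(mol_weight=720.5, logp=6.3, h_bond_donors=2, h_bond_acceptors=5)
```

Here `small` has no violations, while `large` violates the weight and logP criteria and therefore fails under the one-violation convention.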
SOAP Protocol
• SOAP is supported by most major web
application languages
• Programming languages
– Java
– .NET (C#, ASP.NET)
• Scripting languages
– JavaScript
– Perl
– Python
• Etc.
AJAX Example
Migration of Existing Web Apps
• ChemAxon Web Services can be called
from existing web services
• ChemAxon Web Services can directly
replace specific functionality
• Migrate using Security Standards
– WS-Security, WS-Security Policy
– Integrate with existing authentication services
(e.g. LDAP, Active Directory)
Creation of New Web Apps
• Standard WSDL files allow for automated
client side code generation (Python, Perl,
Java, C#, etc.)
• AJAX provides asynchronous operation and
desktop-like application performance
• Easily integrate with Marvin applets
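Client code generated from a WSDL file ultimately sends a SOAP envelope over HTTP. To show what such a message looks like, the sketch below builds a SOAP 1.1 request with the Python standard library. The service namespace, the operation name `convertFormat` and its parameters are invented for illustration and are not taken from the ChemAxon WSDL.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
SERVICE_NS = "urn:example:chem-service"                # hypothetical service namespace


def build_request(operation, params):
    """Return a SOAP 1.1 envelope calling `operation` with the given parameters."""
    ET.register_namespace("soapenv", SOAP_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    call = ET.SubElement(body, f"{{{SERVICE_NS}}}{operation}")
    for name, value in params.items():
        ET.SubElement(call, f"{{{SERVICE_NS}}}{name}").text = str(value)
    return ET.tostring(envelope, encoding="unicode")


# Hypothetical operation and parameter names.
request_xml = build_request(
    "convertFormat",
    {"structure": "c1ccccc1", "outputFormat": "mol"},
)
```

In practice a WSDL-driven client library would generate this message and parse the response, so hand-built envelopes are mainly useful for debugging or for languages without such tooling.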
API and Compatibility
Java API (Marvin GUI included)
Marvin Applets for web applications
.NET API over JNBridge (Marvin GUI included)
Native .NET solution under development (Marvin GUI included)
API from SQL: JChem Cartridge for Oracle
SOAP interface (Python, C, .NET, ... over SOAP) under
development
AJAX interface under development (Marvin GUI included)
Instant JChem highly configurable + Java API
Integration: Pipeline Pilot, KNIME, Spotfire, ...
Java API
• Direct manipulation of structures
• Format conversions, name<=>structure, image
generation
• Structure searching with/without DB access
• Standardization of structures
• Property calculations
• Reaction modelling (enumeration)
• Clustering
• Sketcher, 2D/3D viewers (Marvin family)
• Etc
JChem API
Marvin Applets for Web Applications
• All relevant browsers (IE, FF, Safari, ...)
• Manipulation from HTML page (from JavaScript)
• Catching drawing events in JavaScript
• Can be used from .NET applications using the web
browser control
Marvin demo
MarvinSketch Applet Examples
MarvinView Applet Examples
MarvinSpace Applet Examples
.NET API Over JNBridge
• Tight integration with .NET
• Full Java API is mirrored in .NET
• Marvin GUI components are also supported
Native .NET Solution
• Translating the non-GUI elements from Java binary to .NET
binary (using IKVM)
• Building a thin .NET GUI for Marvin and other tools over the
core.
Advantages
• Pure .NET solution; Java does not need to be installed
• No license issues
• No performance overhead of proxying
under development
JChem Cartridge for Oracle
• API from Oracle SQL
• All features needed for structure handling and
searching
• Fast searching, insertion, and indexing
• Special features:
– Standardization of structures is tied with structure tables
– Property calculations
– Format conversions, name<=>structure, image generation
– Reaction and Markush based structure enumeration
– Markush libraries in structure tables (coming soon)
SOAP Interface
• Web services interface to most functionalities
• Bridges to Python, C, Perl, .NET, Java using
WSDL
• Enables both remote and local access to
ChemAxon functionalities
under development
AJAX GUI
• AJAX components for web
applications
• Customization using CSS and XSL
• Accesses SOAP interface
• Structure searching, database
handling example
• Fast and rich GUI
– Floating windows
– Scrolling through large database
without paging
• Marvin Applets are integrated
under development
Instant JChem for Developers
• Sharable forms, queries, lists
• URLs to sharable items - Demos
• Instant JChem API
Integrations
Several software vendors have integrated ChemAxon
components
- Pipeline Pilot
- KNIME (by Infocom)
- Spotfire
- Aureus
- Integrity (Thomson)
- Others: (Agilent, Tripos, Symyx, Deltasoft, GVK, Wiley,
Genedata, Contur, Inforsense, Kinematik, Houghton
Mifflin, Kelaroo, Patcore, Cengage, Prentice Hall,
Crossfire Beilstein, etc)
Visit other technical presentations
ChemAxon Overview
http://www.chemaxon.com/conf/ChemAxon_Overview.ppt
MarvinSketch/View
http://www.chemaxon.com/MarvinSketch_View.ppt
MarvinSpace
http://www.chemaxon.com/MarvinSpace.ppt
Calculator Plugins
http://www.chemaxon.com/Calculator_Plugins.ppt
Structural Search
http://www.chemaxon.com/Structural_Search.ppt
JChem Base
http://www.chemaxon.com/JChem_Base.ppt
Instant JChem
http://www.chemaxon.com/conf/Instant_JChem.ppt
JChem Cartridge
http://www.chemaxon.com/JChem_Cartridge.ppt
Standardizer
http://www.chemaxon.com/Standardizer.ppt
Screen
http://www.chemaxon.com/Screen.ppt
JKlustor
http://www.chemaxon.com/JKlustor.ppt
Fragmenter
http://www.chemaxon.com/Fragmenter.ppt
Reactor
http://www.chemaxon.com/Reactor.ppt
Find out more
• Product descriptions & links
www.chemaxon.com/products.html
• Forum
www.chemaxon.com/forum
• Presentations and posters
www.chemaxon.com/conf
• Download
www.jchem.com/licensefrset.html
Thank you for your attention!