Transcript - ChemAxon

ChemAxon's Java Components in a
Heterogeneous, Server-Centric
Application Environment
ChemAxon 2005 User Group Meeting
May 19th and 20th, Budapest, Hungary
Mark Runyan, Alex Tulinsky,
Richard Sandstrom, and Julie Myhre
Cell Therapeutics, Inc
Topics
• CTI Background
• Tactical Approach
• Infrastructure Overview
• Component Architecture, Features, and
Demonstrations
• Reporting
• Conclusions and Future Directions
Cell Therapeutics, Inc.
• http://www.ctiseattle.com/
• 389 people in the U.S. and Europe
• 25 Discovery Research Scientists in Seattle
• 20 Discovery Research Scientists in Bresso, Italy
• 37 in Pre-Clinical Development in Bresso, Italy
• 4 FTE in Scientific Systems
Informatics Background
• Traditional MDL/ISIS shop
Drawbacks:
- Cost of deployment and licensing
- Lack of native integration capabilities
• Demand for integrating increasingly complex biological
and chemical data
• Research conducted in Italy and the United States
Approach
• Maintain legacy registration system
• Develop cost-effective, scalable web-based system for
data access and mining
• Loose coupling of component-based systems
• JChem chosen for chemical component
• Emphasis on Open Source Infrastructure
Infrastructure
• Oracle and ISIS/Host on Windows 2003 Server
• ActivityBase
• MOE
• Linux platform / open source tools
- Java
- Apache / Tomcat
- JDO / Hibernate / Proxool
- Eclipse / Jasper Reports
- Ant
- MySQL
- CVS / Bugzilla / DocBook
JChem Import
[Diagram labels: Legacy Registration System; SDF; Automated Replication; ISIS/Host Db; JChem Db]
Import Process
[Diagram labels: New Registrations or Updates; at; ISIS RCG Database; JCHEM.structure; ISIS-JCHEM Differential Query; Command Line wrapper around jchem.db.Importer; JCHEM.jchem_struct_update; delete updated entries (to be re-registered next round)]
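The differential step in the import process can be illustrated with a small, self-contained sketch. It assumes each side can be summarised as a map from registry ID to a last-modified value; that representation and the names `DifferentialImport`, `Plan`, and `plan` are hypothetical, and neither the actual differential SQL nor the `jchem.db.Importer` invocation is shown on the slides. It follows one reading of the diagram: new registrations are imported, updated entries are only deleted, and the deleted entries then look new on the next round.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch of an ISIS-to-JChem differential comparison. */
final class DifferentialImport {

    /** Outcome of one round: IDs to import and IDs to delete from the JChem side. */
    static final class Plan {
        final List<String> toImport = new ArrayList<>();
        final List<String> toDelete = new ArrayList<>();
    }

    /**
     * Compares both sides, keyed by registry ID with a last-modified value.
     * Missing on the JChem side  -> import now.
     * Newer on the ISIS side     -> delete now, so it is re-registered next round.
     */
    static Plan plan(Map<String, Long> isis, Map<String, Long> jchem) {
        Plan plan = new Plan();
        // TreeMap only makes the output order deterministic.
        for (Map.Entry<String, Long> entry : new TreeMap<>(isis).entrySet()) {
            Long imported = jchem.get(entry.getKey());
            if (imported == null) {
                plan.toImport.add(entry.getKey());
            } else if (entry.getValue() > imported) {
                plan.toDelete.add(entry.getKey());
            }
        }
        return plan;
    }
}
```

Deleting updated entries instead of updating them in place keeps a single code path: everything that reaches the importer is treated as a new registration.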
Data Model
[Entity-relationship diagram]
Tables: jchem.structure, ctidb.cti_comp_names, ctidb.library_dict, chemdb.chem_desc, chemdb.cti_moltable, wh.bio_assays, ctidb.cti_batch, ctidb.cti_formulation
Columns shown: cd_id, cd_structure, cd_smiles, ct_number, lib_id, compound_name, cas_number, vendor_key, description, clogP, b_1rotn, tpsa, reactive, cdbregno, molformula, molweight, form_lot_number, batch_lot_number, batch_date, chemist, batch_amount, notebook, solvents, assay_date, protocol, response, ...
Cardinalities shown: 1..1, 0..1, 0..n, 1..n
DAO-layer (Structure Search Component)
[Layered architecture diagram]
UI client
- Browser - universal thin client
- Java Client
- ...
UI implementation
- Webapp (server-side): Structure Search Webapp, ...
Data Model/Data Access
- DAO query API
- Hibernate query, db/object mapping
- Persistent query definitions
- Proxool connection pool
- Data Access Object (DAO): Query, Result, ...
- JChemSearch: wraps JChem API instead of Hibernate, gets other DAO features for free
Mapping/Connectivity
- Hibernate - object/relational mapper
- JDBC - java-database connectivity
Persistence
- Oracle Database
- Database User Identity triggers, logging, etc.
Other labels: Authentication - Tomcat/JAAS/AD; Authorization - Webapp User Identity/Oracle Role
Flow labels (steps 1-4): structure, params; query ID, params; DAO structure search query; DAO structure search results
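The note that JChemSearch "wraps JChem API instead of Hibernate, gets other DAO features for free" suggests a shared DAO contract with interchangeable back ends. The sketch below is one way that could look; all type names (`Dao`, `AbstractDao`, `StructureSearchDao`) are hypothetical, the shared feature shown (a result cap read from `maxResultsCount`, with 0 assumed to mean "no limit") is an assumption, and the matcher is a plain predicate standing in for the real structure search, so no JChem call is made.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.BiPredicate;

/** Hypothetical contract every DAO component exposes to the webapp. */
interface Dao<T> {
    List<T> execute(Map<String, String> params);
}

/** Shared behaviour that every back end inherits; here only a result cap. */
abstract class AbstractDao<T> implements Dao<T> {
    @Override
    public final List<T> execute(Map<String, String> params) {
        String raw = params.get("maxResultsCount");
        int max = (raw == null) ? 0 : Integer.parseInt(raw); // assumption: 0 = unlimited
        List<T> hits = doQuery(params);
        if (max > 0 && hits.size() > max) {
            return new ArrayList<>(hits.subList(0, max));
        }
        return hits;
    }

    /** Back-end specific query: Hibernate for most DAOs, a search engine for this one. */
    protected abstract List<T> doQuery(Map<String, String> params);
}

/** Structure search back end; the matcher is a stand-in for the real search engine. */
final class StructureSearchDao extends AbstractDao<String> {
    private final List<String> candidates;
    private final BiPredicate<String, String> matcher; // (query, candidate) -> is a hit

    StructureSearchDao(List<String> candidates, BiPredicate<String, String> matcher) {
        this.candidates = candidates;
        this.matcher = matcher;
    }

    @Override
    protected List<String> doQuery(Map<String, String> params) {
        String target = params.get("structureFilter.target");
        List<String> hits = new ArrayList<>();
        for (String candidate : candidates) {
            if (matcher.test(target, candidate)) {
                hits.add(candidate);
            }
        }
        return hits;
    }
}
```

Because the cap lives in the base class, a new back end only has to implement `doQuery` to pick it up.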
Persistent Query Definition
Example of how a persistent query definition may encapsulate a structure search.
QueryDef
id: 1230
name: my-structure-search
timestamp: 2005-05-05 10:15:48.0
user_id: [email protected]
dao_component: com.ctiseattle.dao.wh.compound.StructureSearch
order_clause: <null>
result_count: <null>
QueryParam
id: 1332
query_id: 1230
name: structureFilter.target
value: Marvin 05050510202D ... [mol]
QueryParam
id: 1332
query_id: 1230
name: structureFilter.searchType
value: 1
Structure search parameters are captured, so the same search can be repeated in a separate session.
QueryMetadata
id: 1657
query_id: 1230
name: structureSearchId
value: 502
QueryMetadata
id: 1658
query_id: 1230
name: maxQueryTime
value: 0
QueryMetadata
id: 1659
query_id: 1230
name: maxResultsCount
value: 0
structureSearchId is the unique ID of the structure search results, used to retrieve and page through results after the structure search is executed.
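A persistent query definition of this shape can be modelled as one parent record with two name/value collections. The sketch below is an in-memory illustration only: the field and parameter names follow the slide, but the class layout, the `copyForRerun` method, and the choice to drop `structureSearchId` when repeating a search are assumptions, and the actual Hibernate mapping is not shown in the source.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical in-memory model of a persistent query definition. */
final class QueryDef {
    final long id;
    final String name;
    final String daoComponent;
    private final Map<String, String> params = new LinkedHashMap<>();   // QueryParam rows
    private final Map<String, String> metadata = new LinkedHashMap<>(); // QueryMetadata rows

    QueryDef(long id, String name, String daoComponent) {
        this.id = id;
        this.name = name;
        this.daoComponent = daoComponent;
    }

    QueryDef param(String key, String value) {
        params.put(key, value);
        return this;
    }

    QueryDef meta(String key, String value) {
        metadata.put(key, value);
        return this;
    }

    Map<String, String> params() {
        return Collections.unmodifiableMap(params);
    }

    String meta(String key) {
        return metadata.get(key);
    }

    /**
     * Copies the definition so the same search can be repeated in another session.
     * Assumption: the result handle (structureSearchId) belongs to one execution,
     * so it is not carried over.
     */
    QueryDef copyForRerun(long newId) {
        QueryDef copy = new QueryDef(newId, name, daoComponent);
        copy.params.putAll(params);
        for (Map.Entry<String, String> e : metadata.entrySet()) {
            if (!"structureSearchId".equals(e.getKey())) {
                copy.metadata.put(e.getKey(), e.getValue());
            }
        }
        return copy;
    }
}
```

Keeping parameters as generic name/value rows means a new search type needs no schema change, only new parameter names.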
Structure Search Navigation
browser (client-side)
- ChemAxon Marvin Sketcher applet
webapp (server-side)
- UI Controller - params validator: display structure search UI, validate parameters, persist query state
- UI Controller - query executor: do structure search, forward to displayer
- UI Controller - results displayer: query for structure search results, display, page through results
Simple Java code
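The "page through results" step of the results displayer can be sketched in a few lines. This is an illustrative helper, not the controller code from the talk: `ResultPager`, its method names, and the 1-based page numbering are all assumptions.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical paging helper for a stored list of search hits. */
final class ResultPager {

    /** Returns the hits on the given 1-based page; an empty list past the end. */
    static <T> List<T> page(List<T> results, int pageNumber, int pageSize) {
        if (pageNumber < 1 || pageSize < 1) {
            throw new IllegalArgumentException("pageNumber and pageSize must be >= 1");
        }
        long from = (long) (pageNumber - 1) * pageSize; // long avoids int overflow
        if (from >= results.size()) {
            return new ArrayList<>();
        }
        int to = (int) Math.min(results.size(), from + pageSize);
        return new ArrayList<>(results.subList((int) from, to));
    }

    /** Number of pages needed to show all results (ceiling division). */
    static int pageCount(int totalResults, int pageSize) {
        if (totalResults < 0 || pageSize < 1) {
            throw new IllegalArgumentException("invalid totalResults or pageSize");
        }
        return (totalResults + pageSize - 1) / pageSize;
    }
}
```

A displayer would look up the stored hits by their search ID and call `page` with the page number taken from the request.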
Web Client features
• Localization
• Thin client
- no client licensing (browser based, .pdf)
- no workstation maintenance required
- deployment efficiency
- applet integration (Marvin)
• User Profiles
- shared result pages (via URL)
- persistent customization
• Mature, Open Source-centric tools
Compound Search – Italian localization
Marvin Sketcher Applet
Compound Profile – Italian localization
UI Examples – HTS Browser
• Query Page
• Hyperlink Documentation
• Result Pages
UI Examples – HTS Browser Cont.
• Pagination Control
• Hyperlink to Compound Profile
• Branch to SSS or Similarity Search
Web Services
• Image processing
• Warehouse sourced graphical objects
- JChem rendered structures
- Chromatograms
- Dose response graphs
- Statistical Plots
Image Services
URL with unique compound ID:
http:// ... /image-services/StructImage.png?queryMethod=ctNumber&ctNumber=123 ...
Structure image rendered server-side from data
[Diagram: Image Services Webapp (Structure Image, IC50 Graph, ...) over Data Access Layer over Oracle Database]
...
molSource = compound.getStructure().getMolSource();
StructImageWriter imageWriter = new StructImageWriter();
imageWriter.writeImageFromMolSource(
    molSource, response.getOutputStream());
...
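The image URL carries a `queryMethod` parameter plus a second parameter holding the lookup value. The sketch below shows one way a service could resolve that pair before loading the compound. It assumes `queryMethod` names the parameter that carries the key, which is inferred from the example URL only; `ImageRequest`, `parseQuery`, and `lookupKey` are hypothetical names. A servlet would normally read parameters from the request object; the manual parsing here only keeps the example free of non-standard dependencies.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper that interprets an image-service query string. */
final class ImageRequest {

    /** Splits "a=1&b=2" into a map, URL-decoding names and values. */
    static Map<String, String> parseQuery(String query) {
        Map<String, String> params = new LinkedHashMap<>();
        if (query == null || query.isEmpty()) {
            return params;
        }
        for (String pair : query.split("&")) {
            int eq = pair.indexOf('=');
            String key = (eq < 0) ? pair : pair.substring(0, eq);
            String value = (eq < 0) ? "" : pair.substring(eq + 1);
            params.put(decode(key), decode(value));
        }
        return params;
    }

    /** Assumption: queryMethod names the parameter that holds the lookup value. */
    static String lookupKey(Map<String, String> params) {
        String method = params.get("queryMethod");
        if (method == null || method.isEmpty()) {
            throw new IllegalArgumentException("queryMethod is required");
        }
        String key = params.get(method);
        if (key == null || key.isEmpty()) {
            throw new IllegalArgumentException("missing value for " + method);
        }
        return key;
    }

    private static String decode(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }
}
```

Rejecting incomplete requests up front lets the rendering code assume it always has a usable compound key.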
Image Services in UI
Report Design (DAO and web service based)
• Design time: integrated, feature-rich, open source
Report Example
Conclusions
• Successfully implemented ChemAxon tools:
- JChem structure Import API
- JChem structure search API
- Marvin Sketcher for structure search input
- Structure rendering API for compound image services
- Marvin viewer for interactive compound display
• Deployed Jasper Reports for advanced reporting
• Successfully deployed web applications to Bresso, Italy
over wide area network, with language localization.
Future Directions
• Unified Warehouse Browser in Development
- More query fields and features
- Column Selection from full breadth of warehouse data
- Inclusion of complex biological data types
• Oracle Data Cartridge implementation
- Increased performance
- More sophisticated and automated structural searching
• JChem structure warehousing from multiple sites
• Replacement of legacy compound registration system
Acknowledgements
• ChemAxon Technical Support
• CTI Research Scientists
• Ray Luiggi – VP Global Information Technology
• Stewart Chipman – VP Research Programs
• Ambrogio Oliva – CTI Europe - Italian Localization
• Jed Malitz – Oracle DBA
• Jason Shrack – Linux Administration
• Open Source contributors