Exploring Chemical Structures using E-Science (ECSES)
Ken Meacham, IT Innovation
Crystal Grid Workshop, 13-17 Sept 2004
Overview
• Introduction
– Combinatorial Chemistry
– Comb-e-Chem
• ECSES objectives
• ECSES and Grid architecture
– Globus
– Web services
• ECSES migration into Comb-e-Chem
Combinatorial Chemistry
• Parallel synthetic approach
– create hundreds of materials
– screen properties to find those that fit the bill
• Typically requires several passes
– find chemical structure of the best candidates
– create new batches of similar materials for subsequent passes
• Leads to explosive growth in:
– volume of data generated
– potential to exploit this data
Comb-e-Chem Vision
A Pervasive Grid-Based e-Science Environment
Structure + Properties
Structures DB
Knowledge + Prediction
Properties DB
Simulation and calculation
Comb-e-Chem Programme
Proof of Concept
Prototype
Links to
Computation
Automation
& Knowledge
“ECSES”
Dissemination
Users
Chemistry: Specialist
Computer Science: Implementation
General users
Research
Statistics: availability of modern techniques to Chemistry
ECSES Objectives
• Build an impressive e-Science demonstrator
– for NeSC opening ceremony
– for other early dissemination venues
• Provide a proof-of-concept for Comb-e-Chem
– prove that Comb-e-Chem can be built
– proving ground for Comb-e-Chem requirements capture
• Assess the use of Globus in Comb-e-Chem
– Globus = leading Grid environment today
– demo must be based on Globus
ECSES Scope
E-Lab: X-Ray Crystallography, Laboratory Processes
Grid Infrastructure: Structures DB, Properties DB, Properties Prediction, Visualisation
ECSES: A Proof of Concept
Send sample material to NCS service
Collaborate in e-Lab experiment and obtain structure
X-Ray e-Laboratory
Search materials database and predict properties using Grid computations
Structures Database
Download full data on materials of interest
Computation Service
ECSES Demo Scenario
• Optically active materials design
– application to sensor technology
– needs high operating temperature
• Researcher has found a candidate material
– good optical properties, but
– melting point is too low for operational use
• Use e-Science structure-property queries
– to find alternative candidate materials
– to rank according to predicted melting points
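The structure-property query in this scenario can be sketched in a few lines of Python. This is only an illustration of the idea, not the ECSES code: the `Candidate` fields, the 0 to 1 similarity scale, the thresholds and all the values are made-up assumptions.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    name: str              # hypothetical compound label
    similarity: float      # structural similarity to the reference (assumed 0..1 scale)
    predicted_mp_c: float  # predicted melting point in degrees Celsius

def rank_candidates(candidates, min_similarity, min_mp_c):
    """Keep structurally similar candidates whose predicted melting point
    meets the operating requirement, highest melting point first."""
    keep = [c for c in candidates
            if c.similarity >= min_similarity and c.predicted_mp_c >= min_mp_c]
    return sorted(keep, key=lambda c: c.predicted_mp_c, reverse=True)

# Illustrative values only; none of these come from a real database.
candidates = [
    Candidate("A", 0.92, 140.0),   # similar, but melting point too low
    Candidate("B", 0.85, 210.0),
    Candidate("C", 0.40, 300.0),   # high melting point, not similar enough
    Candidate("D", 0.88, 175.0),
]
ranked = rank_candidates(candidates, min_similarity=0.8, min_mp_c=150.0)
print([c.name for c in ranked])
```

With these made-up inputs only B and D pass both filters, and B ranks first because its predicted melting point is higher.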
ECSES Demo Summary
• Log into the X-Ray e-Laboratory
– view experiment to determine crystal structure of new material
– collaborate with crystallographers in lab (video conf.)
• Search structures database (remote CCDC)
– retrieve structure from the experiment
– find structurally similar compounds
– compute predicted melting points
• Visualise and inspect 2-3 top candidates
• Design next combinatorial synthesis
The Globus Project
• Research
– Combining parallel, multimedia, distributed, and collaborative computing
• Globus Toolkit
– The core services for grid-enabled applications
• Testbeds
– Multiple deployments to organisations for prototyping
• Applications
– Distributed projects, tele-immersion, etc.
The Globus Architecture
• Layers: Applications, Toolkit, Services, Fabric
• Components: Condor-G, Nimrod/G, globusrun, DUROC, MPI, HBM, MDS, Nexus, GSI, GASS, GRAM, Condor, PBS, TCP, UDP
• Other diagram labels: Computation, Collaboration, Simulation, Parameters
Authentication
User
• Private Key
– private key encodes a challenge string
• Certificate
• Grid ID
Server
• Decodes challenge with public key
• Mapfile
– maps from Grid ID to Local ID
CA
• Signs users' certificates
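The map file step (Grid ID to Local ID) can be illustrated with a short Python sketch. It assumes one entry per line in the form `"<Grid ID>" <local account>`. The Grid IDs and account names below are hypothetical, and the sketch does not cover the certificate or challenge checks that happen first.

```python
import shlex

def parse_mapfile(text):
    """Parse lines of the assumed form: "<Grid ID>" <local account>."""
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = shlex.split(line)  # honours the quotes around the Grid ID
        if len(parts) != 2:
            raise ValueError(f"malformed map file line: {line!r}")
        grid_id, local_id = parts
        mapping[grid_id] = local_id
    return mapping

def local_account(mapping, grid_id):
    """Return the local account for an authenticated Grid ID, or None
    if the user is not listed and should be refused."""
    return mapping.get(grid_id)

# Hypothetical entries for illustration only.
EXAMPLE = '''
# Grid ID                                local account
"/O=Grid/O=Example/CN=Jane Researcher" jane
"/O=Grid/O=Example/CN=Lab Technician"  labtech
'''
mapping = parse_mapfile(EXAMPLE)
print(local_account(mapping, "/O=Grid/O=Example/CN=Jane Researcher"))
print(local_account(mapping, "/O=Grid/O=Example/CN=Unknown User"))
```

A user with a valid certificate but no map file entry gets `None` here, which mirrors the idea that the server authorises specific users rather than anyone the CA has signed.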
Globus Features / Issues
• Authentication using Globus certificate
– issued by Globus CA
– poor sign-up process (basic checking of identity)
– server “map file” for authentication of specific users
• Scary execution model
– allows user to upload (and run) any executable!
• Has certain useful features
– data staging (access to remote data by Globus-enabled programs)
• Other problems
– difficult (lengthy) to install, overweight
– complex firewall configuration
Compromise Globus / Web Services Approach
• Globus used for
– data staging
– remote execution of melting point simulations
– intermediate access to NCS lab “stepping stone”
• Web services used for
– access to NCS lab (from stepping stone)
– pre-determined (restricted) services, e.g.
– download x-ray images, and other raw data
– send/receive messages to/from lab technician
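The difference between "run any executable" and "pre-determined (restricted) services" can be sketched as a dispatcher with a fixed table of operations. This is a minimal illustration only: the operation names and handlers are hypothetical stand-ins, and no SOAP, HTTPS or PGP handling is shown.

```python
# Hypothetical handlers standing in for the restricted lab services.
def download_xray_image(image_id):
    return f"image:{image_id}"

def send_message(text):
    return f"queued:{text}"

# The only operations the service offers; nothing else can be invoked.
ALLOWED_OPERATIONS = {
    "download_xray_image": download_xray_image,
    "send_message": send_message,
}

def dispatch(operation, *args):
    """Run a request only if it names a pre-determined operation."""
    handler = ALLOWED_OPERATIONS.get(operation)
    if handler is None:
        raise PermissionError(f"operation not offered: {operation}")
    return handler(*args)

print(dispatch("download_xray_image", "frame-001"))
try:
    dispatch("run_executable", "/bin/sh")
except PermissionError as err:
    print("refused:", err)
```

Because the caller can only name an entry in the table, arbitrary code never reaches the lab side, which is the property the compromise approach relies on.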
ECSES Architecture
GASS
Grid Data Service
Globus 1.1.4 GRID
SOAP/HTTPS/PGP
X-Ray e-Laboratory
SOAP/HTTPS/PGP
Computation Service
NCS Laboratory FIREWALL
Southampton Campus FIREWALL
NCS GATEWAY SERVER
Structures Database
ECSES Network Config
Unregulated (Internal) Network Traffic
Globus "Stepping Stone"
Globus 1.1.4 Network
Lab SOAP Server
Experiment Controller
SOAP/HTTP Transactions
New NCS Firewall
SUCS Firewall
Globus "Demo Client"
IT Innovation Firewall
Globus "Compute Nodes"
IT Innovation DMZ Firewall
NCS Office W/S
ECSES Architecture
Demo Site (Linux)
Conquest Python
eScience Proxy - Java, Java Media Framework, CoGKit (including GRAM Client Library)
Computation Time Reservation
IT Innovation (SGI)
GARA
XML Messaging (wraps queries and results, encryption through GSI)
Grid Information Service
Query Resources
Resource Information
Melting Point calcs and results
GRAM
Lab Globus Gateway (Linux)
GRAM
Properties Database
GASS provides access to structure files
Portal Stepping Stone
GASS URLs + Structure Files
Structure File Cache
Lab Portal Machine (Linux)
SOAP Messaging (wraps queries, query results and structures; messages are encrypted and signed using PGP)
Visualisation Streams (RMI)
Portal Layer Webservice (accessed through single URL)
Lab Portal Layer - limits user access to areas in filesystem and data stores, handles encryption and decryption
Experiment
Visualisation
Melting Point Prediction Code
Schedule
Querying
CCDC Mediator - handles user-specific interactions with DB and CIF export
CCDC
Experiment
Laboratory Experiment Controller Data
(Linux box)
Schedules
Cambridge Crystallographic Database
Southampton Crystallographic Database
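The Lab Portal Layer in the diagram above is described as limiting user access to areas in the filesystem. One common way to enforce that kind of limit is to resolve every requested path and check that it still falls under the permitted directory. The sketch below assumes that approach; the function name and directory layout are hypothetical and are not taken from ECSES.

```python
from pathlib import Path

def resolve_within(root, requested):
    """Return the absolute path for `requested` under `root`, or raise
    PermissionError if it would fall outside `root`."""
    root = Path(root).resolve()
    target = (root / requested).resolve()  # collapses ".." and follows symlinks
    if target != root and root not in target.parents:
        raise PermissionError(f"outside permitted area: {requested}")
    return target

# Hypothetical per-user area; the directory need not exist for this check.
USER_AREA = "labdata/user1"
print(resolve_within(USER_AREA, "run7/frame001.img"))
try:
    resolve_within(USER_AREA, "../user2/secret.cif")
except PermissionError as err:
    print("refused:", err)
```

An absolute path in the request is also refused, because joining it to the root replaces the root and the containment check then fails.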
ECSES in Comb-e-Chem (original plans)
• Plan to use ECSES as an initial test rig
– extending Globus grid to include campus systems
• Experiment by trying to extend ECSES
– automatic transfer of experimental data to databases
– adding multimedia to the experimental archive
– greater range of property predictions
– more sophisticated DOE (design of experiments) for analysis service
• Then isolate what works and re-implement
– but this time using web services and not Globus