PI_AFG PRAGMA19 - PRAGMA grid

PRAGMA Institute on Implementation @ PRAGMA 19
Avian Flu Grid:
Transition from Grid to Cloud Computing
Wilfred W. Li, Ph.D., UCSD, USA
Xiaohui Wei, Ph.D., JLU, PRC
Hosted by JLU
Changchun, Jilin, PRC, Sept 13, 2010
Scientific Driver and Use Cases
Harris et al, PNAS, 2006
http://www.reactome.org/
http://www.wikipedia.org
http://library.thinkquest.org/05aug/01479/prevention1.html
Relaxed Complex Scheme and Ensemble based Virtual Screening
Contributed to HIV Integrase Inhibitor Development
Discovery of unexpected binding site in HIV-1 Integrase using MD and AutoDock:
Schames, … & McCammon, J. Med. Chem. (released on web, early 2004)
“ Exploration of the structural basis for this unexpected result … suggests an
approach to the development of integrase inhibitors with unique resistance profiles.”
D. Hazuda et al., Proc. Natl. Acad. Sci. USA (Aug. 2004), refers to Schames, et al. (2004).
MK-0518
New Class of HIV Drugs: Merck & Co.
Source: A. McCammon
February, 2006 – Phase III Clinical Trials
February, 2007 – Name announced:
Isentress (raltegravir)
October, 2007 – FDA “fast track” approval
Ensemble-based Virtual Screening with Relaxed Complex Scheme
NAMD2
Amber
NCI Diversity Set: 3.3 MB, 2000 compounds;
Required at each site
ZINC subset: 200,000. A few hundred MB
AutoDock4
Docking Data: hundreds of MB
Multiple targets: HA, NA subtypes
Each target: 30~50 MD snapshots, 1~2 MB each
Simulation Data: hundreds of GB
Total data to date: ~5 TB in long term storage.
Each experiment has a cumulative computation cost of about 1 Petaflops.
Source: Amaro
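The figures on this slide imply the scale of an ensemble-based screen. The sketch below is only a back-of-envelope calculation, and it assumes one independent docking run per (compound, MD snapshot) pair, which the slide does not state explicitly. The function names are illustrative.

```python
def docking_runs(n_compounds, n_snapshots, n_targets=1):
    """Docking runs needed if every compound is docked against every
    snapshot of every target (assumption: one run per pair)."""
    return n_compounds * n_snapshots * n_targets


def snapshot_storage_mb(n_snapshots, mb_per_snapshot):
    """Storage for the receptor snapshots of one target, in MB."""
    return n_snapshots * mb_per_snapshot


if __name__ == "__main__":
    # Library sizes and snapshot counts are the ones quoted on the slide.
    libraries = {"NCI Diversity Set": 2_000, "ZINC subset": 200_000}
    for name, size in libraries.items():
        low = docking_runs(size, 30)
        high = docking_runs(size, 50)
        print(f"{name}: {low:,} to {high:,} docking runs per target")
    print(
        "Snapshots per target:",
        snapshot_storage_mb(30, 1), "to", snapshot_storage_mb(50, 2), "MB",
    )
```

Under that assumption, the NCI Diversity Set needs 60,000 to 100,000 runs per target and the ZINC subset 6 to 10 million, which is one way to see why grid and cloud resources are attractive for this workload.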
Advances in Computing Infrastructure Enable Complex Simulations
of Biomolecular Systems
Amaro & Li, CTMC, 2010
New Challenges
• Virtualization – What does it mean to us?
– Rock’n Rolls, on-demand virtual machines
• Production environment – Where is it? What form
should it take?
– GPU clusters, virtual machines, cloud services
• Most work is still done on local clusters, but the desire to
use the grid/cloud is there
– It’s happening, and quite exciting
• Collaboration – How can we stay in touch better across PRIME,
MURPA, and research in general?
Transparent access of applications on Avian Flu
Grid through middleware
CNIC Duckling Portal
Konkuk Glyco-M*Grid
NBCR CADD
2 – 4 March 2010
PRAGMA 18, San Diego
Other Examples of Continued Software Development
at Member Institutions
– Drugscreener-G – KISTI, Korea
– Grid Enabled Virtual Screening Service – ASGC,
Taiwan
– CADD Pipeline – NBCR, USA
– WISDOM project – CNRS, EU
– Glyco-M*Grid – Kookmin & Konkuk U, Korea
Virtual Screening with CSF
• Virtual screening web services with remote clusters including
TeraGrid and PRAGMA Grid resources.
Virtual cluster at SDSC
AMAZON EC2
Integrating Visualization Workflows using
Real-time bioMEdical data Streaming and visualization (RIMES)
Kevin Dong, CNIC
ViewDock TDW
Lau, Haga and Date
New OPAL-CSF4 Cloud model
• OPAL as resource manager of CSF4
• CSF4 allocates service instances of OPAL for jobs
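As a rough illustration of the model on this slide, the sketch below shows a meta-scheduler handing jobs to service instances. It is not CSF4 or Opal code: the class names, the "most free slots" policy, and the example URLs are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class ServiceInstance:
    """Hypothetical stand-in for one Opal service instance."""
    url: str
    slots: int
    running: list = field(default_factory=list)

    @property
    def free_slots(self):
        return self.slots - len(self.running)


class MetaScheduler:
    """Toy scheduler: each job goes to the instance with the most free slots."""

    def __init__(self, instances):
        self.instances = list(instances)
        self.pending = []  # jobs that could not be placed yet

    def submit(self, job_id):
        best = max(self.instances, key=lambda inst: inst.free_slots)
        if best.free_slots <= 0:
            self.pending.append(job_id)
            return None
        best.running.append(job_id)
        return best.url


if __name__ == "__main__":
    scheduler = MetaScheduler([
        ServiceInstance("http://a.example.org/opal2", slots=1),
        ServiceInstance("http://b.example.org/opal2", slots=2),
    ])
    for job in ["dock-1", "dock-2", "dock-3", "dock-4"]:
        print(job, "->", scheduler.submit(job))
```

A real deployment would also have to release slots when jobs finish and retry the pending queue; those parts are left out to keep the allocation step visible.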
PRAGMA 19 workshop, Changchun, Jilin, China, Sep. 13-15, 2010.
Parallel job scheduling in CSF4
• Two-phase resource allocation in parallel job plugin
– Construct virtual clusters according to job requirements
– Distribute real jobs to virtual clusters
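The two phases above can be sketched as follows. This is a minimal illustration, not the CSF4 plugin itself: it assumes a job's only requirement is a CPU count, that a virtual cluster may span several sites, and that sites with more free CPUs are preferred. All names are hypothetical.

```python
def build_virtual_clusters(free_cpus, jobs):
    """Phase 1: reserve CPUs for each job.

    free_cpus: dict mapping site name -> free CPU count.
    jobs: list of (job_id, cpus_needed) in scheduling order.
    Returns (clusters, unplaced), where clusters maps
    job_id -> list of (site, cpus) making up its virtual cluster.
    """
    remaining = dict(free_cpus)
    clusters, unplaced = {}, []
    for job_id, needed in jobs:
        if sum(remaining.values()) < needed:
            unplaced.append(job_id)
            continue
        members = []
        # Take from the largest sites first so a cluster spans few sites.
        for site in sorted(remaining, key=remaining.get, reverse=True):
            if needed == 0:
                break
            take = min(remaining[site], needed)
            if take:
                members.append((site, take))
                remaining[site] -= take
                needed -= take
        clusters[job_id] = members
    return clusters, unplaced


def distribute(clusters):
    """Phase 2: turn each virtual cluster into per-site sub-jobs."""
    return [
        {"job": job_id, "site": site, "cpus": cpus}
        for job_id, members in clusters.items()
        for site, cpus in members
    ]


if __name__ == "__main__":
    clusters, unplaced = build_virtual_clusters(
        {"siteA": 4, "siteB": 2}, [("namd-1", 5), ("namd-2", 2)]
    )
    print(distribute(clusters))
    print("unplaced:", unplaced)
```

Separating reservation from dispatch means a parallel job is only started once its whole virtual cluster has been secured.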
Social Networks and Collaborative Environment
Social Network Site | Number of Users     | Features                         | API Examples
Google              | 170 million (Gmail) | Google Integrated Suite of Tools | Google Apps Engine
LinkedIn            | 65 million          | Professional                     | Huddle/Zoho Office Online
Twitter             | 100 million         | Short MMS/SMS                    | TwitPic
Google Wave         | 100,000 X 7?        | Upload any file                  | Google Wave Robot
Facebook            | 400 million+        | Social network                   | Facebook Apps
Are these too big to fail?
Utility Computing finally?
Kepler Opal Web Services Actor
Web Form for Virtual Screening Service
Cloud Computing with Amazon EC2
AutoDock Workflow
A Virtual Screening Vision Workflow
A web service
Transparent Access Layer for Applications
Opal GUI
PMV/Vision
Kepler
Application Services
Grid/Cloud Resources
Globus: Condor pool
Globus: SGE Cluster
Globus: PBS Cluster
Vision Workflow Snippet Using Opal
• Two Major Steps
1. Run PDB2PQR web service (macro that runs PDB2PQR web service).
   This step is skipped if an appropriate PQR file exists on the local machine.
2. Run PrepareReceptor web service (macro that runs PrepareReceptor web service).
   Output is URL to PDBQT.
• PDB2PQR and PrepareReceptor are skipped if an appropriate PDBQT file exists on the
local machine.
– Output is PDBQT path on local machine.
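The skip rules on this slide amount to a small caching decision. The sketch below is not Vision or Opal code: the two `run_*` arguments are caller-supplied stand-ins for the web-service calls, and it assumes an "appropriate" file is one with the same base name as the input PDB file.

```python
import os


def resolve_receptor(pdb_path, run_pdb2pqr, run_prepare_receptor,
                     exists=os.path.exists):
    """Return where the receptor PDBQT can be found.

    run_pdb2pqr(pdb_path) and run_prepare_receptor(pqr) are hypothetical
    callables wrapping the two web services; each returns a URL or path.
    """
    base, _ = os.path.splitext(pdb_path)
    pdbqt_path = base + ".pdbqt"
    pqr_path = base + ".pqr"

    # A local PDBQT makes both services unnecessary.
    if exists(pdbqt_path):
        return {"source": "local", "pdbqt": pdbqt_path}

    # A local PQR lets us skip only the PDB2PQR step.
    pqr = pqr_path if exists(pqr_path) else run_pdb2pqr(pdb_path)
    return {"source": "service", "pdbqt": run_prepare_receptor(pqr)}
```

Passing the service calls and the `exists` check in as arguments keeps the decision logic easy to test without a network or a populated working directory.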
Opal 2 for SaaS
VM Replication Experiment
http://goc.pragma-grid.net/wiki/index.php/VC-replication-2
• VM hosting server:
– Rocks 5.3 Xen roll
• Avian Flu Grid VM
– Rocks VM
– Globus/SGE
– AutoDock
• Replication updates
– hostname and IP
– Compute nodes
– Network configurations
– Globus configuration
– SGE configuration
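The replication updates listed above are the settings that must change when a VM image is copied to another hosting server. A minimal sketch of that idea follows. The field names and the dictionary layout are assumptions for illustration and do not reflect the actual Rocks, Globus, or SGE configuration formats.

```python
import copy

# Settings the slide lists as needing an update on each replica
# (key names are hypothetical).
SITE_SPECIFIC = ("hostname", "ip", "compute_nodes", "network", "globus", "sge")


def replicate_vm_config(original, site_overrides):
    """Copy a VM description and apply the site-specific settings.

    Raises ValueError if any site-specific setting is missing, so a
    replica never silently keeps the original site's values.
    """
    missing = [key for key in SITE_SPECIFIC if key not in site_overrides]
    if missing:
        raise ValueError("replica still needs: " + ", ".join(missing))
    replica = copy.deepcopy(original)
    replica.update({key: site_overrides[key] for key in SITE_SPECIFIC})
    return replica


if __name__ == "__main__":
    original = {
        "software": ["Rocks", "Globus", "SGE", "AutoDock"],
        "hostname": "afg.site-a.example.org", "ip": "192.0.2.10",
        "compute_nodes": 4, "network": "site-a", "globus": "site-a",
        "sge": "site-a",
    }
    copy_b = replicate_vm_config(original, {
        "hostname": "afg.site-b.example.org", "ip": "192.0.2.20",
        "compute_nodes": 2, "network": "site-b", "globus": "site-b",
        "sge": "site-b",
    })
    print(copy_b["hostname"], copy_b["software"])
```

Everything outside the site-specific list, such as the installed software stack, carries over unchanged, which is the point of replicating a VM rather than rebuilding it.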
User Interface
AutoDock
NAMD
Service Manager
OPAL2
Resource Manager
(edu.sdsc.nbcr.opal.manager.CSFJobManager)
generate RSL files
Scheduling:
Workflow Job
CSF4
Array Job
Input/Output Files:
StageIn and StageOut
MetaScheduler
Grid Resources
Grid Sites
Biomedical CLOUD
AFG VM (original): SDSC VM hosting server
AFG VM (copy): NBCR VM hosting server
AFG VM (copy): AIST VM hosting server
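The "generate RSL files" and "StageIn and StageOut" steps in the diagram above can be pictured with a small generator for Globus RSL job descriptions. This sketch is not the code in CSFJobManager. It uses the standard GRAM RSL attributes `executable`, `arguments`, `count`, `file_stage_in` and `file_stage_out`, while the executable path and the gsiftp URLs are hypothetical.

```python
def rsl_quote(value):
    """Wrap a value in double quotes for RSL.

    Values containing a double quote are rejected rather than escaped,
    to keep this sketch free of escaping rules.
    """
    text = str(value)
    if '"' in text:
        raise ValueError("double quotes are not supported in this sketch")
    return '"' + text + '"'


def build_rsl(executable, arguments=(), count=1, stage_in=(), stage_out=()):
    """Build a one-line RSL string.

    stage_in: iterable of (remote_url, local_path) pairs.
    stage_out: iterable of (local_path, remote_url) pairs.
    """
    parts = ["(executable=" + rsl_quote(executable) + ")"]
    if arguments:
        parts.append("(arguments=" + " ".join(rsl_quote(a) for a in arguments) + ")")
    parts.append("(count=" + str(int(count)) + ")")
    if stage_in:
        pairs = " ".join(f"({rsl_quote(s)} {rsl_quote(d)})" for s, d in stage_in)
        parts.append("(file_stage_in=" + pairs + ")")
    if stage_out:
        pairs = " ".join(f"({rsl_quote(s)} {rsl_quote(d)})" for s, d in stage_out)
        parts.append("(file_stage_out=" + pairs + ")")
    return "&" + "".join(parts)


if __name__ == "__main__":
    # One RSL per ligand, in the spirit of an array job.
    for ligand in ["lig001", "lig002"]:
        print(build_rsl(
            "/opt/autodock/autodock4",  # hypothetical install path
            ["-p", ligand + ".dpf", "-l", ligand + ".dlg"],
            stage_in=[(f"gsiftp://stage.example.org/in/{ligand}.dpf", ligand + ".dpf")],
            stage_out=[(ligand + ".dlg", f"gsiftp://stage.example.org/out/{ligand}.dlg")],
        ))
```

Generating one description per ligand mirrors how an array job fans a single screening request out into many independent docking runs.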