BiosciencesReport - PRAGMA grid

Biosciences Working Group Update
Wilfred W. Li, Ph.D., UCSD, USA
Habibah Wahab, Ph.D., USM, Malaysia
You-Qiang Song, Ph.D, HKU, PRC
Hosted by HKU
Hong Kong, PRC, March 2-4, 2010
PRAGMA: model for international
collaboration in Technology and Science
Broadening Impact of Technology
Engaging Future Generations
PRIME Student 2009: Jessica Hsieh, USM
Scientific Drivers and Use Cases: Influenza A Virus
Harris et al, PNAS, 2006
http://www.reactome.org/
http://www.wikipedia.org
http://library.thinkquest.org/05aug/01479/prevention1.html
2009 H1N1 Pandemic Influenza
• Cumulative cases represented in Google Map as of 21 Apr, 2010
• WHO: 18769 deaths to date
• US: 4642 deaths; 480230 total cases
• Malaysia: 74 deaths; 6463 total cases
• China: 650 deaths; 124300 total cases
• Postpandemic period as of Aug 2010
• 0.5~1% death rate, similar to seasonal flu
• Targets younger and healthy individuals, unlike seasonal flu (90% older than 65 years)
Source: http://flutracker.rhizalabs.com/
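Worked check (not on the original slide): dividing the deaths by the total cases quoted for each country gives rates in roughly the 0.5~1% range stated above. The sketch below uses only the slide's numbers; the function and variable names are illustrative.

```python
# Deaths and total cases exactly as quoted on the slide.
counts = {
    "US": (4642, 480230),
    "Malaysia": (74, 6463),
    "China": (650, 124300),
}


def death_rate_percent(deaths: int, cases: int) -> float:
    """Return deaths as a percentage of reported cases."""
    return 100.0 * deaths / cases


rates = {country: death_rate_percent(d, c) for country, (d, c) in counts.items()}

for country, rate in rates.items():
    print(f"{country}: {rate:.2f}% of reported cases")
```

These are ratios of reported counts only; they say nothing about unreported cases.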
WHO Status Update
Week 17 (Apr 19, 2009) to Week 13 (Apr 3, 2010)
http://www.who.int/csr/disease/swineflu/laboratory23_04_2010/en/index.html
Transparent access to applications on the Avian Flu Grid
through middleware
CNIC Duckling Portal
Konkuk/Kookmin
Glyco-M*Grid
NBCR CADD
Relaxed Complex Scheme and Ensemble based Virtual Screening
Contributed to HIV Integrase Inhibitor Development
Discovery of unexpected binding site in HIV-1 Integrase using MD and AutoDock:
Schames, … & McCammon, J. Med. Chem. (released on web, early 2004)
“Exploration of the structural basis for this unexpected result … suggests an
approach to the development of integrase inhibitors with unique resistance profiles.”
D. Hazuda et al., Proc. Natl. Acad. Sci. USA (Aug. 2004), refers to Schames, et al. (2004).
MK-0518
New Class of HIV Drugs: Merck & Co.
Source: A. McCammon
February, 2006 – Phase III Clinical Trials
February, 2007 – Name announced:
Isentress (raltegravir)
October, 2007 – FDA “fast track” approval
Ensemble-based Virtual Screening with Relaxed Complex Scheme
NAMD2
Amber
NCI Diversity Set: 3.3 MB, 2000 compounds; required at each site
ZINC subset: 200,000 compounds; a few hundred MB
AutoDock4
Docking Data: hundreds of MB
Multiple targets: HA, NA subtypes
Each target: 30~50 MD snapshots, 1~2 MB each
Simulation Data: hundreds of GB
Total data to date: ~5 TB in long term storage.
Each experiment has a cumulative computation cost of about 1 Petaflops.
Source: Amaro
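Illustrative sketch (not from the slides): in ensemble-based screening every compound is docked against every MD snapshot of a target, so the job count is the product of the two. The slide does not say how scores are combined across snapshots; keeping the best (lowest) energy per compound is one common choice and is an assumption here. The compound names and energies are made up.

```python
def count_docking_jobs(n_compounds: int, n_snapshots: int, n_targets: int = 1) -> int:
    """One docking run per (compound, snapshot, target) combination."""
    return n_compounds * n_snapshots * n_targets


def rank_by_best_score(scores):
    """Rank compounds by their best (lowest) energy over all snapshots.

    scores maps (compound, snapshot_index) -> docking energy, lower is better.
    """
    best = {}
    for (compound, _snapshot), energy in scores.items():
        if compound not in best or energy < best[compound]:
            best[compound] = energy
    return sorted(best.items(), key=lambda item: item[1])


# NCI Diversity Set size and snapshot range as quoted on the slide.
low = count_docking_jobs(2000, 30)
high = count_docking_jobs(2000, 50)
print(f"{low} to {high} docking runs per target")

# Hypothetical scores for two compounds against two snapshots.
example_scores = {
    ("cmpd_A", 0): -6.1,
    ("cmpd_A", 1): -8.3,
    ("cmpd_B", 0): -7.0,
    ("cmpd_B", 1): -6.5,
}
ranking = rank_by_best_score(example_scores)
print(ranking)
```

In this made-up example cmpd_A ranks first only because of snapshot 1, a hit that docking against a single receptor structure could miss.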
Advances in Computing Infrastructure Enable Complex Simulations
of Biomolecular Systems
Amaro & Li, CTMC, 2010
Opal 2 for SaaS
Opal WS: Transparent Access Layer for Applications
Opal App
MGLTools
CADD
Taverna
Vistrails
Kepler
Opal Application Services
Grid/Cloud Resources
Condor
Condor pool
CSF4
TeraGrid/PRAGMA Grid
Globus
PBS/SGE Clusters
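Illustrative sketch of the "transparent access layer" idea: the client calls one application service, and the back end that runs the job can be swapped without changing the client call. None of the classes below are the Opal API; all names are hypothetical and nothing is actually submitted.

```python
from typing import Protocol


class JobManager(Protocol):
    """Anything that can run a command line on some back end."""

    def submit(self, command: str) -> str: ...


class CondorLikeManager:
    """Stand-in for a Condor pool back end (hypothetical, runs nothing)."""

    def submit(self, command: str) -> str:
        return f"condor-pool <- {command}"


class ClusterLikeManager:
    """Stand-in for a PBS/SGE cluster back end (hypothetical, runs nothing)."""

    def submit(self, command: str) -> str:
        return f"pbs-sge-cluster <- {command}"


class ApplicationService:
    """Wraps one executable; callers never see which back end is used."""

    def __init__(self, executable: str, manager: JobManager) -> None:
        self.executable = executable
        self.manager = manager

    def launch(self, arguments: str) -> str:
        return self.manager.submit(f"{self.executable} {arguments}")


on_condor = ApplicationService("autodock4", CondorLikeManager())
on_cluster = ApplicationService("autodock4", ClusterLikeManager())
print(on_condor.launch("-p ligand.dpf"))
print(on_cluster.launch("-p ligand.dpf"))
```

The two `launch` calls are identical; only the manager passed at construction differs.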
Opal Plugins for Popular Workflow Software
CADD: Opal Web Services for Biomedical Applications
• Ren et al, NAR 2010, Web Server Issue
• http://cadd.nbcr.net
• Modules supporting MD simulation and analysis, Virtual Screening, Docking, Visualization
• Project management under development
Opal MetaService: Transparent Access
to Workflows and Applications
Social Networks and Collaborative Environment
Social Network Site | Number of Users | Features | API Examples
Google | 170 million (Gmail) | Google Integrated Suite of Tools | Google App Engine
LinkedIn | 65 million | Professional | Huddle/Zoho Office Online
Twitter | 100 million | Short MMS/SMS | TwitPic
Google Wave | 100,000 X 7? | Upload any file | Google Wave Robot
Facebook | 500 million+ | Social network | Facebook Apps
Are these too big to fail?
Utility Computing finally?
New OPAL-CSF4 Cloud model
• OPAL as resource manager of CSF4
• CSF4 allocates service instances of OPAL for jobs
PRAGMA 19 workshop, Changchun, Jilin, China, Sep.13-15, 2010.
2 – 4 March 2010
PRAGMA 18, San Diego
Integrating Visualization Workflows using
Real-time bioMEdical data Streaming and visualization (RIMES)
Kevin Dong, CNIC
VM Replication Experiment
http://goc.pragma-grid.net/wiki/index.php/VC-replication-2
• VM hosting server:
• Rocks 5.3 Xen roll
• Avian Flu Grid VM
• Rocks VM
• Globus/SGE
• Autodock
• Replication updates
• hostname and IP
• Compute nodes
• Network configurations
• Globus configuration
• SGE configuration
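Illustrative sketch of the "replication updates" step: after a VM image is copied to a new hosting server, the old hostname and IP have to be replaced wherever they appear in configuration text. This is not the script used in the experiment; the hostnames, addresses and config keys are hypothetical placeholders.

```python
import re


def retarget_config(text: str, old_host: str, new_host: str,
                    old_ip: str, new_ip: str) -> str:
    """Replace whole-token occurrences of the old hostname and IP."""
    # Lookarounds keep "xafg.example.org" or "10.1.1.10" from matching
    # when the targets are "afg.example.org" and "10.1.1.1".
    host_pattern = rf"(?<![\w.-]){re.escape(old_host)}(?![\w.-])"
    ip_pattern = rf"(?<![\d.]){re.escape(old_ip)}(?![\d.])"
    text = re.sub(host_pattern, lambda _m: new_host, text)
    text = re.sub(ip_pattern, lambda _m: new_ip, text)
    return text


original = (
    "hostname=afg.example.org\n"
    "addr=10.1.1.1\n"
    "peer=10.1.1.10\n"
    "alias=xafg.example.org\n"
)
updated = retarget_config(
    original, "afg.example.org", "afg-copy.example.net", "10.1.1.1", "192.0.2.7"
)
print(updated)
```

The same function could be applied in turn to the network, Globus and SGE configuration files listed above, assuming they hold the hostname and IP as plain text.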
User Interface: AutoDock, NAMD
Service Manager: OPAL2
Resource Manager (edu.sdsc.nbcr.opal.manager.CSFJobManager): generates RSL files
Scheduling: Workflow Job, CSF4, Array Job
Input/Output Files: StageIn and StageOut
MetaScheduler
Grid Resources: Grid Sites
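Illustrative sketch of what "generate RSL files" could look like for an array job with staged input. The slides do not show the RSL produced by CSFJobManager; the attribute names below (executable, arguments, count, file_stage_in, file_stage_out) follow Globus GRAM RSL conventions as an assumption, and the paths and URLs are hypothetical.

```python
def quote(value: str) -> str:
    """Wrap a value in double quotes; embedded quotes are rejected for simplicity."""
    if '"' in value:
        raise ValueError("embedded double quotes are not handled in this sketch")
    return f'"{value}"'


def build_rsl(executable, arguments=(), count=1, stage_in=(), stage_out=()):
    """Build a single-line RSL-style job description string."""
    parts = [f"(executable={quote(executable)})"]
    if arguments:
        parts.append("(arguments=" + " ".join(quote(a) for a in arguments) + ")")
    parts.append(f"(count={count})")
    if stage_in:
        pairs = "".join(f"({quote(src)} {quote(dst)})" for src, dst in stage_in)
        parts.append(f"(file_stage_in={pairs})")
    if stage_out:
        pairs = "".join(f"({quote(src)} {quote(dst)})" for src, dst in stage_out)
        parts.append(f"(file_stage_out={pairs})")
    return "&" + "".join(parts)


rsl = build_rsl(
    "/opt/autodock/autodock4",
    arguments=["-p", "ligand.dpf"],
    count=4,
    stage_in=[("gsiftp://host.example.org/in/ligand.dpf", "ligand.dpf")],
)
print(rsl)
```

Here `count=4` stands in for the array job and the `stage_in` pair for StageIn; a real deployment would need whatever attributes the CSF4 installation actually accepts.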
Biomedical CLOUD
AFG VM (original): SDSC VM hosting server
AFG VM (copy): NBCR VM hosting server
AFG VM (copy): AIST VM hosting server
ViewDock TDW
Lau, Haga and Date
Other Examples of Continued Software Development
at Member Institutions
– Drugscreener-G – KISTI, Korea
– Grid Enabled Virtual Screening Service – ASGC, Taiwan
– CADD Pipeline – NBCR, USA
– WISDOM project – CNRS, EU
– Glyco-M*Grid – Kookmin & Konkuk U, Korea
Meeting the New Challenges
• Virtualization – What does it mean to us?
– Virtual machines, CSF server, Gfarm server and virtual clusters
• Production environment – Where is it? What form should it take (EC2, VC replication)?
• Collaboration – How to stay in touch better, PRIME, MURPA, research in general?
PRAGMA 20: Look Around Session
• Arry Yanuar, University of Indonesia
– Molecular dynamics simulation of disordered regions of the RGK family of small GTPases revealed no GTPase activity
– Simulations often suffered from communication failures
– Request: Application service for GROMACS in PRAGMA?
[Figure: small GTPase cycle between the inactive (GDP-bound) and active (GTP-bound) forms, labeled with GEF, GAP, GDI, Pi, Effector and downstream signaling]
Look Around 2
• Suntae Hwang
– Kookmin University, South Korea
– MGrid Service on Nationwide Consortium of Supercomputing Infrastructure (PLSI) in Korea
– Features: IP free, virtualized resource selection
[Figure labels: MGrid, PLSI, LoadLeveler]
Look Around Session
• Day 2 – NBCR CADD Pipeline Demonstration
– CADD for Computer Aided Drug Discovery, based upon the Vision workflow platform.
• Nadya Williams
• Requested features:
– Ability to track data provenance
– Ability to run workflow using own cluster
• Potential collaboration
– Alzheimer’s Disease – HKU
Look Around
– KISTI BioWorks
• Bioworks workflow environment for Bioinformatics and Virtual Cell applications in Systems Biology
• Seok Jeong Yu
• Features:
– Client server architecture
– Uses Java Web Start technology
– Allows users to share, edit and execute workflows
– Will add PSExplorer (parameter space explorer)
• Not ready for release yet
Looking ahead
• NBCR/SDSC and KISTI: co-development of Opal Plugin for Bioworks; scholastic exchange
• UCSD and HKU: Use CADD pipeline in Alzheimer’s Disease Research
• KISTI and HKU: Use Bioworks in Alzheimer’s Disease Research
• UCSD and University of Indonesia: Security requirement for proprietary compound library and Opal services
Looking Ahead
• PRAGMA Resources:
– More application services? Amber? Virtual Machines?
– How can one replicate application services at own site? Is this necessary?
– Data resource? Gfarm-iphone bridge is cool. Can we actually share data securely this way?
– PRAGMA Institute? More hands on training on applications such as CADD for local participants?
– Nimrod/K 2.0 for parameter sweep in Sapporo, Japan
– Opal App for Biomedical Services – Leveraging Google App Engine and Social Network Infrastructure
– Duckling Portal with Opal 2.4 support