volunteer computing

Drugdiscovery@home - distributed volunteer
computing project in the fields of cancer, aging
and stem cells
Andrey Voronkov* and John Shultz
*[email protected]
What is VCSC and distributed computing?
Computational tasks go from server to locally or globally
distributed computers and computed results go back to the server.
Volunteers, Projects, BOINC server
Local network: VCSC
Internet: volunteer computing
• Helps science
• Involves the public in science
http://boinc.berkeley.edu/trac/wiki/BoincPapers
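The round trip described on this slide (tasks go out from the server, results come back) can be illustrated with a toy simulation. This is only a sketch: `run_round_trip`, the host names, and the round-robin assignment are hypothetical stand-ins and do not reflect how the BOINC scheduler actually assigns work.

```python
from collections import deque

def run_round_trip(tasks, hosts, compute):
    """Hand each task to a host in turn and collect the results on the 'server'."""
    queue = deque(tasks)   # work waiting on the server
    results = {}           # task -> (host, result) returned to the server
    turn = 0
    while queue:
        task = queue.popleft()
        host = hosts[turn % len(hosts)]   # round-robin stand-in for real scheduling
        results[task] = (host, compute(task))
        turn += 1
    return results

if __name__ == "__main__":
    # `len` stands in for a real computation such as a docking run.
    out = run_round_trip(["ligand_1", "ligand_2", "ligand_3"], ["host_a", "host_b"], len)
    for task, (host, value) in out.items():
        print(task, host, value)
```

In a VCSC the hosts would be machines on the local network; in volunteer computing they would be volunteers' computers on the Internet.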
DRUGDISCOVERY@HOME PROJECT WORKFLOW
METHODS OF THE PROJECT:
• Distributed computing, GPU computing
• Virtual screening with flexible amino acids
• Relaxed complex scheme for docking
• Molecular dynamics with explicit solvent models for evaluating the stability of protein-ligand complexes
• Interactive pathway mapping with modeling of dynamic changes
FIELDS OF THE RESEARCH:
• Biotargets involved in stem cell niche signaling pathways
that are related, but not limited, to cancer and
neurodegenerative disease pathways.
• Biotargets that fit cancer/aging regulation according to the
hypothesis in Pic. A.
• Examples of biotargets: proteins involved in the Wnt, Shh and
Notch signaling pathways.
• Other biological targets related to cancer, degenerative
diseases and stem cell biology can be considered in
collaboration with experimental biology groups.
The working hypothesis of cancer/degeneration and
symmetric/asymmetric division of stem cells
ACCOMPLISHMENTS:
– Initial integration of the project website with Drupal
– High throughput Molecular Docking CPU
• Distributes Python
• Distributes some MGLTools Packages
• Managed by BOINC Wrapper
– GROMACS integration with BOINC Wrapper for CPU
• Simulates 100 ps in 2.5 hrs
• Trajectory files range from 10 to 40 MB
• Results compressed in 7zip format
– Autodock 4.0 integration with BOINC Wrapper for CPU
– Protein-ligand docking->MD workflow setup (acpypi)
– Major Platform Integrations
• Windows
• Mac PPC & Intel
• Linux
Team:
• Andrey Voronkov, PhD, Moscow State
University, Department of Chemistry: project
leader, molecular modeling, drug design,
BOINC server setup
• John Shultz, National Academy of Sciences,
Washington D.C., IT, coding, BOINC server
setup
• Jorden van der Elst, main software tester
• We also collaborate with several people from
industry who work on the systems biology part and
prefer to remain undisclosed for now.
COLLABORATION
OPTION 1: Collaboration with the experimental biologists
OPTION 2: Virtual Campus Super Computing for
universities and organizations
Advantages over cluster supercomputing:
• New pool of computing power at very low cost
• Enhanced stability compared to clusters & supercomputers
• Suits applications not built for the cluster architecture
• Positive PR for the university
Advantages over distributed volunteer computing:
• Purely VCSC, no volunteers outside the network
o No credits, no cheaters, only one result needed per workunit
(better performance per CPU), better security, more flexibility
regarding software licenses
• Volunteer project
o Need to prevent cheating and validate results; more limitations on
redistributing licensed software
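The difference between "one result per workunit" and validated volunteer results can be sketched as a quorum check. This is a minimal illustration, not BOINC's validator API: the function name `validate` and the use of exact string comparison are assumptions made for the example (real validators typically compare numerical output within a tolerance).

```python
from collections import Counter

def validate(results, quorum):
    """Return the agreed result if at least `quorum` hosts returned it, else None."""
    if not results:
        return None
    value, count = Counter(results).most_common(1)[0]
    return value if count >= quorum else None

# Trusted local network (VCSC): a single result is accepted.
print(validate(["-7.2"], quorum=1))

# Volunteer project: the same workunit is sent to several hosts,
# and a result counts only when enough of them agree.
print(validate(["-7.2", "-7.2", "-3.1"], quorum=2))
print(validate(["-7.2", "-3.1"], quorum=2))
```

With a quorum of 2, every workunit costs at least two computations, which is where the "better performance per CPU" of a pure VCSC comes from.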
Examples of applications for drug design
VCSC increases computing resources by several orders of magnitude and makes it
possible to apply existing software applications to many more objects.
Example 1. Virtual screening by docking of organic compounds to biotargets.
1 average CPU
  ~1 000 000 compounds screened by rigid protein model docking with Autodock 4.0
  Time: 100 days
VCSC with 200 CPUs
  ~1 000 000 compounds docked with Autodock 4.0 to a rigid protein model,
  or ~50 000 compounds docked with a flexible protein model
  Time: 1 day
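The arithmetic behind Example 1 can be checked with a short estimate. This is a sketch under an ideal linear-scaling assumption; the function name and the `efficiency` parameter (the fraction of time hosts actually spend computing) are hypothetical and not taken from the project. The per-CPU rate is derived from the single-CPU figure of 1 000 000 compounds in 100 days.

```python
def screening_days(compounds, compounds_per_cpu_day, cpus, efficiency=1.0):
    """Wall-clock days to dock `compounds` on `cpus` hosts at a given efficiency."""
    cpu_days = compounds / compounds_per_cpu_day
    return cpu_days / (cpus * efficiency)

RATE = 1_000_000 / 100   # 10 000 compounds per CPU-day (rigid protein model)

print(screening_days(1_000_000, RATE, cpus=1))                    # 100.0
print(screening_days(1_000_000, RATE, cpus=200))                  # 0.5
print(screening_days(1_000_000, RATE, cpus=200, efficiency=0.5))  # 1.0
```

Perfect scaling would give half a day on 200 CPUs; the quoted 1 day corresponds to about 50% efficiency in this model, which may reflect idle hosts and transfer overhead, though the slides do not say so.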
Example 2. Molecular dynamics of protein-ligand complexes
with explicit water molecule models
System: complex of a 100-amino-acid protein with a small-molecule ligand,
with explicit water and explicit salts
1 average CPU
  Molecular dynamics of 100 picoseconds
VCSC with 200 CPUs
  100 trajectories of 100 ps for one complex,
  or one 100 ps trajectory for each of 100 different ligand-protein complexes
Duration: 2 days
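The two ways of spending the same pool in Example 2 (many trajectories for one complex, or one trajectory for each of many complexes) can be sketched as a workunit generator. The function `make_work_units`, the dictionary fields, and the complex names are hypothetical; they only illustrate how independent 100 ps runs could be enumerated.

```python
def make_work_units(complexes, trajectories_per_complex, length_ps=100):
    """One workunit per (complex, trajectory); the seed distinguishes repeats."""
    return [
        {"complex": name, "seed": seed, "length_ps": length_ps}
        for name in complexes
        for seed in range(trajectories_per_complex)
    ]

# 100 trajectories of 100 ps for a single complex
one_complex = make_work_units(["complex_0"], 100)

# one 100 ps trajectory for each of 100 different complexes
many_complexes = make_work_units([f"complex_{i}" for i in range(100)], 1)

print(len(one_complex), len(many_complexes))  # 100 100
```

Either way the pool receives 100 independent workunits, so both options cost the same amount of computing.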
GPU usage can increase computing resources by roughly 10 to 50 times compared to CPUs
Virtual Campus Supercomputing Center
creation process
I. Campus virtual supercomputing center BOINC
server setup
• I.1 Evaluation of potential computing
resources and server requirements
• I.2 BOINC server setup
II. Communication with computer owners
and system administrators
III. Communication with computational scientists
• Identification of scientists with computationally-intensive
applications that map well to volunteer computing
• Porting of applications to BOINC
• Compilation of applications for CPU (Windows/Linux)
• Compilation of applications for GPU (Nvidia/ATI AMD)
• BOINC options setup (priority system, task limits)
IV. VCSC maintenance
TOTAL TIME for VCSC: 2-3 person-months
PLANS (2 years):
1) GPU coding for applications and the BOINC client: a significant increase
of computational power for virtual screening and molecular
dynamics with explicit solvent models.
2) Implementation of several protein flexibility methods, such as the relaxed
complex scheme and protein Monte Carlo dynamics.
3) Dynamic modeling of the signaling pathway network, which should result
in interactive mapping and prediction of the most promising biotargets
for the suggested diseases.
4) Drug design and biological compound trials for promising
biotargets of the Wnt signaling pathway (1st year, ~8-10 biotargets), and
of Shh, Notch and other stem cell niche regulating proteins in the
second year (10-15 biotargets).
Funding required: $150 000/year
- full-time salary for 4 persons, hosting, some software licenses
Funding alternatives now under consideration:
- Grants for small entities: the project must be made non-commercial (in collaboration with universities)
- Sales and services (profit sharing with volunteers; an initial general business
plan is available upon request); an office is required, preferably in
Maryland, US
Thank you for your attention!