Biomedical computing
Michael Welge, Ian Brooks, Victor Jongeneel
• Biomedicine identified as being of high importance to
NCSA, but definition of strategic plan still in early stages
• Expect to have a fully formed strategy by the end of 2010
• Overall process for developing a plan:
• Understand the strengths of NCSA and be prepared to capitalize
on them; identify weaknesses that need to be addressed
• Define the overall areas of opportunity for high-performance
computing in the biomedical area
• Identify strategic partnerships on- and off-campus, with the
potential to initiate projects congruent with the above
• Engage in substantive discussions with partners and define the
scope of common projects
NCSA Strategic Planning Presentation (April 20, 2010)
Understanding the strengths of NCSA
• Most relevant NCSA activities (View of NCSA):
• High-performance data services (2)
• Science visualization for scientists and engineers (5)
• Cyber-applications for scientific and engineering communities (6)
• Security technologies and software (11)
• High-performance computing services (1)
• Core competence in large-scale data mining
• Handling and extracting information from massive datasets
• Building on past successes
• Continuing work on NAMD and its applications
• NCSA's seminal contribution to MIDAS (Models of Infectious Disease
Agents Study)
• Collaborative projects with IGB (e.g. Evolution Highway)
• International and national disease monitoring and control
Defining areas of opportunity
(from 2006 NSF workshop)
• Biomolecular Structure Modeling (for example extending classical
Molecular Dynamics calculations to account for quantum
mechanical effects, multidimensional free energy surfaces, transition
state ensembles)
• Modeling Complex Biological Systems (for example developing
models of cell and organ function)
• Genomics (for example search calculations mapping phylogeny to
ontogeny)
• Customized Patient Care (for example computing drug interactions
in the context of individual physiology and blood chemistry)
• Ecological component of earth system modeling (for example adding
plant cover to climate models)
• Infectious disease modeling (for example modeling of disease
spreading and the likely impact of containment strategies)
Information-based medicine (Mayo Clinic, Division of Biomedical Sciences)
Additional areas of opportunity
• Data management solutions for high-throughput biology
• “Next generation” sequencing currently generates tens of TB of raw
data per experiment; steep increases are likely
• Other technologies are also rapidly increasing output:
proteomics with prior spatial and/or chemical separation,
high-throughput high-resolution imaging, …
• Understanding the relationship between genotype and
phenotype
• Rapidly increasing production of full genome sequences from
individuals within one species (mostly human) and from different
species; millions of differences observed, thousands of genomes
being sequenced
• Identifying genomic determinants of phenotypic differences is a
major data mining / statistical problem
High-throughput biology - one recent example
190,000 movies recording over 19 million cell divisions…
Data analysis and hit detection.
B Neumann et al. Nature 464, 721-727 (2010) doi:10.1038/nature08869
A schema for implementing individualized medicine
Strategic Partnerships
• On campus
• IGB – medium-scale sequencing projects, comparative genomics, evolution
• Beckman, Bioeng. – high-throughput imaging and analysis
• Genome Institute of Singapore
• Existing collaboration with Ed Liu, institutional ties, immediate need for
expertise
• Mayo Clinic
• MoU in place, unmatched patient records, committed to developing
individualized medicine, poised to boost sequencing capability
• Genome Center at Wash U
• One of the pre-eminent genome centers worldwide
• Faced with critical data management and analysis problems in spite of
heavy investments
• Possible new partnerships
• Wellcome Trust Sanger Institute – where much of high-throughput biology is
happening
• EMBL/EBI, Broad Institute, BGI-Shenzhen, Genoscope, ???
Scope of projects (with partners)
• Discussions are ongoing, nothing decided yet
• Overall themes emerging:
• Practical solutions for large-scale data management in the life
sciences
• Development of tools, including visualization, for comparative
genomics (within or between species)
• Data mining for personalized medicine – integration of “omic”
data with phenotypic records
• Disease prevention through monitoring and modeling
• Real-time medical decision support
• Needed soon: a WOW project highlighting the know-how
and capabilities of NCSA
Sources
• NSF Workshop Report: Petascale Computing in the
Biological Sciences (2006)
• Edited by Allan Snavely, Gwen Jacobs, and David A. Bader
• Presentation from Division of Biomedical Sciences on
“Information-based Medicine”
• Discussion with Roberto Fabbretti, Jacques Rougemont,
Ioannis Xenarios (Lausanne) on data needs of next-gen
sequencing
• Many discussions with faculty and staff from NCSA, IGB,
Bioengineering, other UIUC Departments
• Dozens of recent papers, particularly in Nature and
Nature Genetics