EGEE_aspera_gw


Enabling Grids for E-sciencE
The EGEE project and the
future of European grids
Guy Wormser
Director of “Institut des Grilles du CNRS”
Aspera Workshop
www.eu-egee.org
EGEE-II INFSO-RI-031688
EGEE and gLite are registered trademarks
Grid: Resource Sharing
• Share more than information
• Data, computing power, applications
• Middleware handles everything
[Figure: a single computer compared with the Grid. Single computer: PROGRAMS (Word/Excel, Games, Email/Web, Your Program) on an OPERATING SYSTEM managing Disks, CPU etc. The Grid: Your Program, User Interface Machine, MIDDLEWARE with a Resource Broker, Disk Server, CPU Clusters.]
Inauguration de l'Institut des Grilles, Paris, 3rd December 2007
Electricity Grid
Analogy with the Electricity Power Grid
Power Stations
Distribution Infrastructure
'Standard Interface'
EGEE : Enabling Grids for E-sciencE
Goal
• Create a general European Grid production quality infrastructure on top of present and future EU RN infrastructure
Build on
• EU and EU member states major investment in Grid Technology
• Several pioneering prototype results
• Largest Grid development team in the world
Goal can be achieved for about €100m/4 years on top of the national and regional initiatives
Approach
• Leverage current and planned national and regional Grid programmes (e.g. LCG)
• Work closely with relevant industrial Grid developers, NRNs and US
[Figure: layered diagram with applications on top of EGEE on top of the Geant network]
The Large Hadron Collider Project
4 detectors
CMS
ATLAS
LHCb
Bat 40
New solutions are necessary!
How e-Infrastructures help e-Science
• e-Infrastructures provide easier access for
– Small research groups
– Scientists from many different fields
– Remote and still developing countries
• To new technologies
– Produce and store massive amounts of data
– Transparent access to millions of files across different administrative domains
– Low cost access to resources
▪ Mobilise large amounts of CPU & storage on short notice (PC clusters)
– High-end facilities (supercomputers)
• And help to find new ways to collaborate
– Develop applications using distributed complex workflows
– Ease distributed collaborations
– Provide new ways of community building
– Give easier access to higher education
EGEE
Flagship grid infrastructure project co-funded by the European Commission
Now in 2nd phase with 91 partners in 32 countries
Main Objectives
• Operate a large-scale, production quality grid infrastructure for e-Science
• Attract new resources and users from industry as well as sciences
EGEE – What do we deliver?
• Infrastructure operation
– Currently includes ~250 sites across 45 countries
▪ Continuous monitoring of grid services & automated site configuration/management
▪ Support many Virtual Organisations from diverse research disciplines
• Middleware
– Production quality middleware distributed under business friendly open source licence
▪ Implements a service-oriented architecture that virtualises resources
▪ Adheres to recommendations on web service interoperability and evolving towards emerging standards
• User Support - Managed process from first contact through to production usage
– Training
– Expertise in grid-enabling applications
– Online helpdesk
– Networking events (User Forum, Conferences etc.)
240 sites
45 countries
41,000 CPUs
5 PetaBytes
>5000 users
>100 VOs
>100,000 jobs/day
[Figure: number of CPUs and number of sites over time, with a "32%" annotation; the dates on the time axis are truncated in the source]
Application domains:
Archeology
Astronomy
Astrophysics
Civil Protection
Comp. Chemistry
Earth Sciences
Finance
Fusion
Geophysics
High Energy Physics
Life Sciences
Multimedia
Material Sciences
…
Types of applications
• Simulation
– LHC Monte Carlo simulations; Fusion; WISDOM
– Jobs needing significant processing power; large number of independent jobs; limited input data; significant output data
• Bulk Processing
– HEP; processing of satellite data
– Distributed input data; large amount of input and output data; job management (WMS); metadata services; complex data structures
• Parallel Jobs
– Climate models, computational chemistry
– Large number of independent but communicating jobs; need for simultaneous access to large number of CPUs; MPI libraries
• Short-response delays
– Prototyping new applications; grid monitoring; interactivity
– Limited input & output data; processing needs but fast response and quality of service
• Workflow
– Medical imaging; flood analysis
– Complex analysis algorithms; complex dependencies between jobs
• Commercial Applications
– Non-open source software; Geocluster (seismic platform); FlexX (molecular docking); Matlab, Mathematica; IDL, …
– License server associated to an application deployment model
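The "complex dependencies between jobs" of the Workflow category can be pictured as a dependency graph that has to be ordered before submission. The sketch below is only an illustration under stated assumptions: the job names are hypothetical and do not come from the talk, and it uses Python's standard graphlib module rather than any EGEE or gLite workflow tool.

```python
from graphlib import TopologicalSorter

# Hypothetical workflow: each job maps to the set of jobs it depends on.
# The job names are illustrative only and do not come from the talk.
dependencies = {
    "register_images": {"fetch_images"},
    "segment_images": {"register_images"},
    "compute_statistics": {"segment_images"},
    "render_report": {"segment_images", "compute_statistics"},
}


def submission_order(deps):
    """Return one valid order in which the jobs could be submitted."""
    return list(TopologicalSorter(deps).static_order())


order = submission_order(dependencies)
print(order)
```

A job only appears in the list after every job it depends on, which is the minimum a workflow manager has to guarantee before it can start running steps in parallel.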
LHC Computing Model
[Figure: LHC computing model diagram. Labels: CERN (The LHC Computing Centre); Tier 1: USA Brookhaven, USA FermiLab, UK, France, Italy, NL, Germany; Tier2: Lab a, Lab b, Lab c, Lab m, Uni a, Uni b, Uni n, Uni x, Uni y; Physics Department; Desktop]
SEISMOLOGY[1]
Fast determination of the mechanisms of important earthquakes (IPGP: E. Clévédé, G. Patau)
Challenge
• Provide results 24-48 h after an earthquake occurs
• 5 earthquakes already ported: Peru, Guadeloupe, Indonesia (Dec.), Japan, Indonesia (Feb.)
Application to run on alert
• Collect data from 30 seismic stations of the GEOSCOPE worldwide network
• Select stations and data
• Define a spatial 3D grid + time
• Run, for example, 50-100 jobs
Peru earthquake, 23/6/2001, Mw=8.3
• Data used: 15 GEOSCOPE stations
Management of water resources in
Mediterranean area (SWIMED)
G. Lecca (CRS4 Italy), P. Renard (Unine, CH),
J. Kerrou (INAT, Tunisia), R. Ababou (IMFT, Fr)
Korba coastal aquifer
Tunisia
45 km
Cape Bon
Peninsula
70km
south-east
of Tunis
GEOSCIENCES
• Generic seismic platform software, based on Geocluster commercial software developed by CGG
• Includes 400 geophysical modules, implemented on EGEE
• Used by both academics and private companies
• Free of charge for academics, with a charge for R&D
GATE
Geant4 Application for Tomographic Emission
• Scientific objectives
Radiotherapy planning to improve the treatment of cancer by ionizing radiation of tumours.
Therapy planning is computed from pre-treatment MR scans by accurately locating tumours in 3D and computing the radiation doses applied to patients.
• Method
GEANT4 base software to model the physics of nuclear medicine.
Use Monte Carlo simulation to improve the accuracy of computations (as compared to the deterministic classical approach).
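Monte Carlo work of this kind maps well onto a grid because the samples can be split into independent jobs and merged afterwards. The sketch below illustrates only that split-and-merge pattern; it is a toy stand-in (estimating pi by random sampling), not GATE or GEANT4 physics, and the function names, seeds and sample counts are assumptions made for the example.

```python
import random


def run_job(seed, n_samples):
    """One independent job: count random points that fall inside the unit quarter circle."""
    rng = random.Random(seed)  # a distinct seed per job, so jobs never need to communicate
    hits = 0
    for _ in range(n_samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            hits += 1
    return hits


def merge(hit_counts, n_samples_per_job):
    """Combine the per-job counts into a single estimate of pi."""
    total_samples = len(hit_counts) * n_samples_per_job
    return 4.0 * sum(hit_counts) / total_samples


# Toy sizes chosen for illustration; on a grid each call to run_job would be a separate job.
n_jobs, n_samples = 20, 50_000
results = [run_job(seed, n_samples) for seed in range(n_jobs)]
estimate = merge(results, n_samples)
print(estimate)
```

Because each job depends only on its own seed, a failed job can be rerun alone, and adding more jobs improves the statistical accuracy without changing the code.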
Drug Discovery
• WISDOM focuses on in silico drug discovery for
neglected and emerging diseases.
• Malaria — Summer 2005
– 46 million ligands docked
– 1 million selected
– 1TB data produced; 80 CPU-years used in 6 weeks
• Avian Flu — Spring 2006
– H5N1 neuraminidase
– Impact of selected point mutations on the efficiency of existing drugs
– Identification of new potential drugs acting on mutated N1
• Fall 2006
– Extension to other neglected diseases
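The malaria figures above (80 CPU-years used in 6 weeks) imply a sustained level of parallelism that is easy to estimate. The short calculation below is a back-of-the-envelope sketch; the 52-weeks-per-year constant is an approximation introduced here, not a number from the talk.

```python
# Figures quoted on the slide for the malaria data challenge.
cpu_years_used = 80
duration_weeks = 6

# Assumption: a year is taken as 52 weeks, which is close enough for an estimate.
WEEKS_PER_YEAR = 52


def average_busy_cpus(cpu_years, weeks):
    """Average number of CPUs that must be busy for the whole period."""
    return cpu_years * WEEKS_PER_YEAR / weeks


average = average_busy_cpus(cpu_years_used, duration_weeks)
print(f"About {average:.0f} CPUs busy on average")
```

The result, roughly 700 CPUs kept busy for six weeks, gives a sense of why such a campaign is impractical on a single cluster of that era and a natural fit for a shared grid.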
High Throughput Virtual Docking
• Millions of chemical compounds available in laboratories
• High Throughput Screening: 1-10$/compound, nearly impossible
• Molecular docking (FlexX, Autodock): ~80 CPU years, 1 TB data
• Computational data challenge: ~6 weeks on ~1000/1600 computers
• Hits screening using assays performed on living cells
• Leads
• Clinical testing
• Drug
Chemical compounds: ZINC
– Chembridge – 500,000
– Drug like – 500,000
Molecular docking: FlexX, Autodock
Target structures: PDB
– Plasmepsin II (1lee, 1lf2, 1lf3)
– Plasmepsin IV (1ls5)
Grid infrastructure: EGEE
Grid workflow
[Figure: grid workflow diagram. Labels: User interface; Parameter settings; Target structures; Compounds sublists; Site1 and Site2, each with a Computing Element and a Storage Element; Software; Compounds database; Results; Statistics; Compounds list]
• FlexX license server :
– 3000 floating licenses given by BioSolveIT to SCAI
– Maximum number of used licenses was 1008
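The workflow diagram refers to "compounds sublists" being sent to computing elements at several sites. A minimal sketch of that splitting step is shown below; the compound identifiers, the sublist size and the round-robin assignment are assumptions made for illustration and are not taken from the WISDOM deployment.

```python
def split_into_sublists(compounds, sublist_size):
    """Cut the full compound list into consecutive sublists of at most sublist_size items."""
    return [
        compounds[start:start + sublist_size]
        for start in range(0, len(compounds), sublist_size)
    ]


def assign_round_robin(sublists, sites):
    """Pair each sublist with a site, cycling through the sites in order."""
    return [
        {"site": sites[index % len(sites)], "compounds": sublist}
        for index, sublist in enumerate(sublists)
    ]


# Hypothetical compound identifiers; a real run would read them from the compounds database.
compounds = [f"compound_{n:04d}" for n in range(10)]
jobs = assign_round_robin(split_into_sublists(compounds, 4), ["Site1", "Site2"])

for job in jobs:
    print(job["site"], len(job["compounds"]))
```

Each resulting job carries its own sublist, so the docking runs stay independent and the number of simultaneous jobs can be capped to respect a limit such as the number of available floating licenses.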
gPTM3D
3D Medical Image Analysis Software
• Scientific objectives
Interactive volume reconstruction on large radiological data.
PTM3D is an interactive tool for performing computer-assisted 3D segmentation and volume reconstruction and measurement (RSNA 2004).
Reconstruction of complex organs (e.g. lung) or the entire body from modern CT scans is involved in augmented reality use cases, e.g. therapy planning.
• Method
Starting from a hand-made rough initialization, a snake-based algorithm segments each slice of a medical volume.
3D reconstruction is achieved in parallel by triangulating contours from consecutive slices.
Grids: key competitive advantages
• Transparent access to distributed data
– Examples: Earth sciences, Life sciences
• Handling of huge datasets
– Particle physics, astrophysics, human sciences
• Large flexibility in computing resources
– Disaster management
– Avian flu, malaria challenges
• Synergy between the grid network and the human network
EGEE User Forum 2007
Co-located with OGF20:
900+ attendees
~50 booths
User Forum:
~30 sessions
100+ presentations
20 demos
~60 posters
Registered Collaborating Projects
25 projects have registered as of September 2007
• Infrastructures: geographical or thematic coverage
• Applications: improved services for academia, industry and the public
• Support Actions: key complementary functions
Collaborating infrastructures
EGEE-II to EGEE-III
• EGEE-III proposal currently under negotiation with European Commission
• Key objectives
– Expand/optimise existing EGEE infrastructure, include more resources and user communities
– Prepare migration from a project-based model to a sustainable federated infrastructure based on National Grid Initiatives
• 2 year period – spring 2008 to spring 2010
– No gap between EGEE-II and EGEE-III
• Similar consortium
– Now structured on a national basis (National Grid Initiatives/Joint Research Units)
Networking activities
NA1: Management
NA2: Dissemination & Business outreach
NA3: Training
NA4: Applications
NA5: International Coop. & Policy
Specific Service Activities
SA1: Operations
SA2: Networking Support
SA3: Integration, testing & Cert.
Joint Research Activities
JRA1: Middleware engineering
European Grid Initiative
• Need to prepare permanent, common Grid infrastructure
• Ensure the long-term sustainability of the European e-Infrastructure independent of short project funding cycles
• Coordinate the integration and interaction between National Grid Infrastructures (NGIs)
• Operate the production Grid infrastructure on a European level for a wide range of scientific disciplines
Must be no gap in the support of the production grid
• EGI Design Study proposal approved by the European Commission (started 1st September 2007)
• Supported by 30+ National Grid Initiatives (NGIs)
• 2 year project to prepare the setup and operation of a new organizational model for a sustainable pan-European grid infrastructure
• Federated model bringing together NGIs to build a European organisation
• Well defined, complementary responsibilities between NGIs and EGI
http://www.eu-egi.org
Characteristics of NGIs
Each NGI
• recognized national body with a single point-of-contact
• mobilise national funding and resources
• operate the national e-Infrastructure
• support user communities (application independent, and open to new user communities and resource providers)
• contribute and adhere to international standards and policies
Responsibilities between NGIs and EGI are clearly separated and complementary
37 European NGIs
+ Asia, US, Latin America
+ PRACE
+ OGF-Europe
+…
Why a “Grid Institute”
• Considerable importance of Grid-related activity within CNRS
• Federate all activities in CNRS related to research on Grids, grids for research and production grids
– Better visibility
– Better efficiency
– Strengthen the links between these domains
• Provide a single, well-identified point of contact for national and international collaborations
– CNRS representative for all European contracts and for discussions within the French Ministry
– Central core for the emerging French “National Grid Initiative”
– Partnership with all major French research organisations: CEA, CNES, INRIA, …
• Outreach activities, evangelisation of new scientific communities, training, ...
Institut des Grilles composition
• 30 laboratories:
APC, CC_IN2P3, CPPM, CREATIS, LIP, I3S, IBCP, IN2P3_adm,
IPGP, IPHC, IPNL, IPNO, IRISA, IRIT, LABRI, LAL, LAPP, LIFL,
LIG, LIP6, LLR, LORIA, LPC Clermont, LPNHE, LRI, IPSL,
LPSC, LSIT, Subatech, UREC
– 13 IN2P3 labs linked to EGEE/LCG
– 11 computing science labs
– 5 labs linked to various applications
– Administrative support
• GDR Architecture Systèmes et Réseaux (ASR)
• Total of 310 people
IdG objectives and means
• Scientific animation
– Organization of the national prospective on the needs of the scientific community related to Grids
– Call for proposals
• Dialogue forum between production grids and research grids
– Interoperability GRID5000/EGEE
– Grids Observatory
– Middleware of the future
• Training
• Communication
The national prospective working groups
• Thematic working groups
– Planet and Universe sciences; environmental sciences
– Life sciences
– Human sciences
– Chemistry
– Physics
– Engineering and computing sciences
– Subatomic physics (including astroparticles)
• Transverse working groups
– Data grids
– Grids and supercomputers
– Regional grids
– Grids and very large research infrastructures
– Grids and the user
– Relationship with industry
Summary
• Grids are all about sharing: they are a means of working with groups around the world
– Today we have a window of opportunity to move grids from research prototypes to permanent production systems (as networks did a few years ago)
• Interoperability is key to providing the level of support required for our user communities
• EGEE operates the world’s largest multi-disciplinary grid infrastructure for scientific research
– In constant and significant production use
• A third phase of EGEE is under preparation
• Need to prepare for the long term
– EGEE, collaborating projects, national grid initiatives and user communities are working to define a model for a sustainable grid infrastructure that is independent of short project cycles
• In France, creation of the CNRS Grid Institute as the nucleus of the French NGI to help implement this future!
• Bright opportunities for grids and astroparticles!
www.eu-egee.org