Grids in Europe and in France
EGEE and Institut des Grilles
Guy Wormser
Institut des Grilles Director
Visit to Meraka Institute, South Africa
May 12, 13
Outline
Rationale for this visit
The EGEE project
EGEE and International collaboration
Status report on European Grid Initiative (EGI)
Presentation of the CNRS « Institut des Grilles »
Conclusion
Rationale for this visit
Computing grids are a very efficient tool to
promote scientific collaboration
We observe a very strong correlation between
the presence of a grid node and the local
scientific activity on the Grid
Grids can be an excellent tool to promote
North-South scientific collaboration and
especially in Africa
We would like to explore the possibility to help
install a grid node in South Africa and to
promote several applications on the grid
Grid: Resource Sharing
Enabling Grids for E-sciencE
• Share more than information
• Data, computing power, applications
• Middleware handles everything
[Diagram: on a single computer, programs (Word/Excel, email/web, games, your program) sit on the operating system, which manages disks, CPU etc. On the Grid, your program sits on middleware (user interface machine, resource broker), which manages disk servers, CPUs and clusters.]
EGEE-II INFSO-RI-031688
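As an illustration of what "middleware handles everything" can mean for a resource broker, here is a minimal sketch in Python. It is not the EGEE middleware: the Resource and Job classes, the broker function and the "most free CPUs wins" rule are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    """A hypothetical grid resource (for example a cluster at one site)."""
    name: str
    free_cpus: int
    free_disk_gb: int


@dataclass
class Job:
    """A hypothetical job description with its requirements."""
    name: str
    cpus: int
    disk_gb: int


def broker(job, resources):
    """Pick a resource that satisfies the job, or return None.

    Assumed policy: among resources with enough CPUs and disk,
    choose the one with the most free CPUs, then reserve the capacity.
    """
    candidates = [
        r for r in resources
        if r.free_cpus >= job.cpus and r.free_disk_gb >= job.disk_gb
    ]
    if not candidates:
        return None
    best = max(candidates, key=lambda r: r.free_cpus)
    best.free_cpus -= job.cpus
    best.free_disk_gb -= job.disk_gb
    return best.name


if __name__ == "__main__":
    clusters = [Resource("cluster-a", 8, 500), Resource("cluster-b", 64, 2000)]
    print(broker(Job("my-program", 16, 100), clusters))
```

The user only describes the job; the choice of cluster is made by the broker, which is the point of the single computer versus Grid comparison.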
Electricity Grid
Analogy with the Electricity Power Grid
Power Stations
Distribution Infrastructure
'Standard Interface'
EGEE : Enabling Grids for E-sciencE
Goal
create a general European Grid
production quality infrastructure on top of
present and future EU RN infrastructure
Build on
EU and EU member states major
investment in Grid Technology
Several pioneering prototype results
Largest Grid development team in the
world
Goal can be achieved for about €100m/4 years on top
of the national and regional initiatives
Approach
Leverage current and planned national
and regional Grid programmes (e.g.
LCG)
Work closely with relevant industrial Grid
developers, NRNs and US
[Diagram: applications on top of EGEE, on top of the Geant network]
The Large Hadron Collider Project
4 detectors
CMS
ATLAS
LHCb
Bat 40
New solutions are necessary!
How e-Infrastructures help e-Science
• e-Infrastructures provide easier access for
– Small research groups
– Scientists from many different fields
– Remote and still developing countries
• To new technologies
– Produce and store massive amounts of data
– Transparent access to millions of files across different administrative domains
– Low cost access to resources
– Mobilise large amounts of CPU & storage on short notice (PC clusters)
– High-end facilities (supercomputers)
• And help to find new ways to collaborate
– Develop applications using distributed complex workflows
– Ease distributed collaborations
– Provide new ways of community building
– Give easier access to higher education
EGEE
Flagship grid infrastructure project co-funded by the European Commission
Now in 2nd phase with 91 partners in 32 countries
Main Objectives
• Operate a large-scale,
production quality grid
infrastructure for e-Science
• Attract new resources and
users from industry as well
as sciences
EGEE – What do we deliver?
• Infrastructure operation
– Currently includes ~250 sites across 45 countries
– Continuous monitoring of grid services & automated site configuration/management
– Support many Virtual Organisations from diverse research disciplines
• Middleware
– Production quality middleware distributed under business friendly open source licence
– Implements a service-oriented architecture that virtualises resources
– Adheres to recommendations on web service interoperability and evolving towards emerging standards
• User Support - Managed process from first contact through to production usage
– Training
– Expertise in grid-enabling applications
– Online helpdesk
– Networking events (User Forum, Conferences etc.)
240 sites
45 countries
41,000 CPUs
5 PetaBytes
>5000 users
>100 VOs
>100,000 jobs/day
[Charts: No. CPU and No. Sites over time]
Archeology
Astronomy
Astrophysics
Civil Protection
Comp. Chemistry
Earth Sciences
Finance
Fusion
Geophysics
High Energy Physics
Life Sciences
Multimedia
Material Sciences
Types of applications
• Simulation
– LHC Monte Carlo simulations; Fusion; WISDOM
– Jobs needing significant processing power; large number of independent jobs; limited input data; significant output data
• Bulk Processing
– HEP; processing of satellite data
– Distributed input data; large amount of input and output data; job management (WMS); metadata services; complex data structures
• Parallel Jobs
– Climate models, computational chemistry
– Large number of independent but communicating jobs; need for simultaneous access to large number of CPUs; MPI libraries
• Short-response delays
– Prototyping new applications; grid monitoring; interactivity
– Limited input & output data; processing needs but fast response and quality of service
• Workflow
– Medical imaging; flood analysis
– Complex analysis algorithms; complex dependencies between jobs
• Commercial Applications
– Non-open source software; Geocluster (seismic platform); FlexX (molecular docking); Matlab, Mathematica; IDL, …
– License server associated to an application deployment model
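The "large number of independent jobs" pattern of the Simulation category can be sketched as follows. This is only an illustration: estimating pi stands in for a real physics simulation, and the function names, seeds and job counts are assumptions, not taken from any EGEE application.

```python
import random


def run_job(seed, n_samples):
    """One independent job: count random points falling inside the unit circle.

    Each job gets its own seed, so jobs can run on any site, in any order.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            hits += 1
    return hits


def split_and_merge(total_samples, n_jobs):
    """Split the work into n_jobs independent jobs, then merge their outputs."""
    per_job = total_samples // n_jobs
    # On a grid each call would be a separately submitted job;
    # here they simply run one after the other.
    hits = [run_job(seed, per_job) for seed in range(n_jobs)]
    return 4.0 * sum(hits) / (per_job * n_jobs)


if __name__ == "__main__":
    print(split_and_merge(200_000, 20))
```

The jobs share no state and need almost no input data, which is why this category suits a grid: only the small per-job results have to be collected at the end.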
LHC Computing Model
[Diagram: the LHC computing model. The LHC Computing Centre at CERN is linked to Tier 1 centres (France, Germany, Italy, NL, UK, USA Brookhaven, USA FermiLab, …), which serve Tier 2 centres (labs and universities), physics departments and desktops.]
SEISMOLOGY[1]
Fast determination of mechanisms of important earthquakes (IPGP: E. Clévédé, G. Patau)
Challenge
Provide results 24-48 h after the earthquake occurs
5 earthquakes already ported: Peru, Guadeloupe, Indonesia (Dec.), Japan, Indonesia (Feb.)
Application to run on alert
Collect data of 30 seismic stations from GEOSCOPE worldwide network
Select stations and data
Definition of a spatial 3D grid + time
Peru earthquake, 23/6/2001, Mw=8.3
Data used: 15 Geoscope Stations
Run for example 50-100 jobs
Management of water resources in
Mediterranean area (SWIMED)
G. Lecca (CRS4 Italy), P. Renard (Unine, CH),
J. Kerrou (INAT, Tunisia), R. Ababou (IMFT, Fr)
[Map: Korba coastal aquifer, Cape Bon Peninsula, Tunisia, 70 km south-east of Tunis; 45 km marked on the map]
GEOSCIENCES
• Generic seismic platform software, based on
Geocluster commercial software developed by CGG
• Includes 400 geophysical modules, implemented on
EGEE
• Used by both academics and private companies.
• Free of charge for Academics, with charge for R&D
GATE
GEANT4 Application for Tomographic Emission
• Scientific objectives
Radiotherapy planning for improving the treatment of cancer by ionizing
radiations of the tumours.
Therapy planning is computed from pre-treatment MR scans by
accurately locating tumours in 3D and computing radiation doses applied
to the patients.
• Method
GEANT4 base software to model
physics of nuclear medicine.
Use Monte Carlo simulation to
improve accuracy of computations (as
compared to the deterministic classical
approach)
Drug Discovery
• WISDOM focuses on in silico drug discovery for
neglected and emerging diseases.
• Malaria — Summer 2005
– 46 million ligands docked
– 1 million selected
– 1TB data produced; 80 CPU-years used in 6 weeks
• Avian Flu — Spring 2006
– H5N1 neuraminidase
– Impact of selected point mutations on the efficiency of existing drugs
– Identification of new potential drugs acting on mutated N1
• Fall 2006
– Extension to other neglected diseases
High Throughput Virtual Docking
Millions of chemical compounds available in laboratories
High Throughput Screening: 1-10$/compound, nearly impossible
Molecular docking (FlexX, Autodock): ~80 CPU years, 1 TB data
Computational data challenge: ~6 weeks on ~1000/1600 computers
Chemical compounds: ZINC (Chembridge – 500,000; Drug like – 500,000)
Molecular docking: FlexX, Autodock
Target structures: PDB
Targets: Plasmepsin II (1lee, 1lf2, 1lf3), Plasmepsin IV (1ls5)
Grid infrastructure: EGEE
[Diagram: pipeline from hits, through screening using assays performed on living cells, to leads, clinical testing and drug]
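The figures quoted for the malaria data challenge can be cross-checked with a little arithmetic. The sketch below uses only numbers from the slides (80 CPU years, 6 weeks, and the 46 million ligands docked quoted on the Drug Discovery slide). The 365.25-day year and the function names are assumptions for the example.

```python
DAYS_PER_YEAR = 365.25      # assumed calendar convention
SECONDS_PER_DAY = 86_400


def average_busy_cpus(cpu_years, weeks):
    """Average number of CPUs that must be busy to deliver cpu_years in weeks."""
    return cpu_years * DAYS_PER_YEAR / (weeks * 7)


def cpu_seconds_per_docking(cpu_years, dockings):
    """Average CPU time spent on one docking, in seconds."""
    return cpu_years * DAYS_PER_YEAR * SECONDS_PER_DAY / dockings


if __name__ == "__main__":
    print(round(average_busy_cpus(80, 6)))                    # roughly 700 CPUs
    print(round(cpu_seconds_per_docking(80, 46_000_000), 1))  # under a minute each
```

Roughly 700 CPUs busy on average fits with the ~1000 to 1600 computers quoted for the challenge. On a single CPU the same work would take 80 years.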
Grid workflow
[Diagram: the user interface sends parameter settings, target structures and compounds sublists to computing elements at Site1 and Site2; storage elements hold the software and the compounds database; results come back as statistics and a compounds list.]
• FlexX license server :
– 3000 floating licenses given by BioSolveIT to SCAI
– Maximum number of used licenses was 1008
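The splitting of the compounds database into sublists, and the cap imposed by floating licenses, can be sketched as below. This is a simplified illustration and not the WISDOM production system: the function names, the round-robin assignment and the sublist size are assumptions. Only the figure of 3000 floating licenses comes from the slide.

```python
def make_sublists(compounds, size):
    """Cut the full compounds list into sublists of at most `size` entries."""
    return [compounds[i:i + size] for i in range(0, len(compounds), size)]


def assign_round_robin(sublists, sites):
    """Assign sublists to sites in turn (assumed policy for this sketch)."""
    plan = {site: [] for site in sites}
    for i, sublist in enumerate(sublists):
        plan[sites[i % len(sites)]].append(sublist)
    return plan


def licensed_slots(requested_jobs, floating_licenses=3000):
    """Docking jobs that can run at once: each running job holds one license."""
    return min(requested_jobs, floating_licenses)


if __name__ == "__main__":
    compounds = [f"compound-{n}" for n in range(10)]
    plan = assign_round_robin(make_sublists(compounds, 4), ["Site1", "Site2"])
    for site, work in plan.items():
        print(site, work)
    print(licensed_slots(1008))
```

With a reported peak of 1008 licenses in use out of 3000, the license server was never the limiting factor in this challenge.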
gPTM3D
3D Medical Image Analysis Software
• Scientific objectives
Interactive volume reconstruction on large radiological data.
PTM3D is an interactive tool for performing computer-assisted 3D
segmentation and volume reconstruction and measurement (RSNA 2004)
Reconstruction of complex organs (e.g. lung) or entire body from modern
CT-scans is involved in augmented reality use case e.g. therapy planning.
• Method
Starting from a hand-made rough
initialization, a snake-based algorithm
segments each slice of a medical volume.
3D reconstruction is achieved in parallel
by triangulating contours from consecutive
slices.
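The idea of triangulating contours from consecutive slices can be sketched as follows. This is a deliberately simplified illustration, not the PTM3D algorithm. It assumes both contours are closed, have the same number of points and start at matching positions, whereas real segmentation output would first need a point-matching step. The function name is hypothetical.

```python
def triangulate_slices(lower, upper):
    """Join two closed contours (lists of (x, y, z) points) with a band of triangles.

    Assumes len(lower) == len(upper) and that point i of one contour
    faces point i of the other. Returns 2 * n triangles for n points.
    """
    if len(lower) != len(upper) or len(lower) < 3:
        raise ValueError("contours must have the same length, at least 3 points")
    n = len(lower)
    triangles = []
    for i in range(n):
        j = (i + 1) % n  # wrap around to close the contour
        # Each quad (lower[i], lower[j], upper[j], upper[i]) is cut in two.
        triangles.append((lower[i], lower[j], upper[i]))
        triangles.append((lower[j], upper[j], upper[i]))
    return triangles


if __name__ == "__main__":
    square = [(0, 0), (1, 0), (1, 1), (0, 1)]
    slice0 = [(x, y, 0) for x, y in square]
    slice1 = [(x, y, 1) for x, y in square]
    print(len(triangulate_slices(slice0, slice1)))
```

Each pair of consecutive slices can be processed independently of the others, so the pairs can be spread over many CPUs.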
Key competitive advantages of grids
• Transparent access to distributed data
– Examples: Earth sciences, Life sciences
• Handling of huge datasets
– Particle physics, astrophysics, human sciences
• Large flexibility in computing resources
– Disaster management
– Avian flu, malaria challenges
• Synergy between the grid network and the human
network
Registered Collaborating Projects
25 projects have registered as of September 2007: web page
Infrastructures: geographical or thematic coverage
Applications: improved services for academia, industry and the public
Support Actions: key complementary functions
Collaborating infrastructures
[Map of collaborating infrastructures, with the annotation "Nothing there yet!"]
The Montpellier workshop
• Held in France on December 10-12, 2007
• Grid workshop to develop France-Africa collaboration
• Sponsored by CNRS and the « Share the knowledge » foundation
• Focus on African development via science and
excellence of African scientists
• Promote Internet connectivity and Grid nodes in Africa
• First actions selected: install two grid nodes in Africa
• South Africa and Senegal selected as the best places
to start
• Prepare the launch of a « EuroAfrica » FP7 program
EGEE-II to EGEE-III
• EGEE-III proposal currently under negotiation with European Commission
• Key objectives
– Expand/optimise existing EGEE infrastructure, include more resources and user communities
– Prepare migration from a project-based model to a sustainable federated infrastructure based on National Grid Initiatives
• 2 year period – spring 2008 to spring 2010
– No gap between EGEE-II and EGEE-III
• Similar consortium
– Now structured on a national basis (National Grid Initiatives/Joint Research Units)
Networking activities
NA1: Management
NA2: Dissemination & Business outreach
NA3: Training
NA4: Applications
NA5: International Coop. & Policy
Specific Service Activities
SA1: Operations
SA2: Networking Support
SA3: Integration, testing & Cert.
Joint Research Activities
JRA1: Middleware engineering
European Grid Initiative
• Need to prepare permanent, common Grid infrastructure
• Ensure the long-term sustainability of the European e-Infrastructure
independent of short project funding cycles
• Coordinate the integration and interaction between National Grid
Infrastructures (NGIs)
• Operate the production Grid infrastructure on a European level for a
wide range of scientific disciplines
Must be no gap in the support of the production grid
• EGI Design Study proposal approved by the
European Commission (started 1st September 2007)
• Supported by 30+ National Grid Initiatives (NGIs)
• 2 year project to prepare the setup and operation
of a new organizational model for a sustainable
pan-European grid infrastructure
• Federated model bringing together NGIs to build
a European organisation
• Well defined, complementary responsibilities
between NGIs and EGI
http://www.eu-egi.org
Characteristics of NGIs
Each NGI
• recognized national body with a single point-of-contact
• mobilise national funding and resources
• operate the national e-Infrastructure
• support user communities (application
independent, and open to new user communities
and resource providers)
• contribute and adhere to international standards
and policies
Responsibilities between NGIs and EGI are clearly
separated and complementary
37 European NGIs
+ Asia, US, Latin America
+ PRACE
+ OGF-Europe
+…
Grids and CNRS
CNRS is a multidisciplinary research
organisation with strong expertise in:
Computing sciences
Supercomputer services and operations (IDRIS)
Computing services to a very large user community especially
through IN2P3 computing center in Lyon
In 2000, CNRS decided to invest heavily
in Grid technology:
Project EUROGRID, precursor of DEISA, the « grid
for Supercomputers »
Project DATAGRID, precursor of EGEE, the « production
grid » based on distributed clusters
Grid5000, the grid for computer scientists
Short presentation of CNRS
« Institut des Grilles »
In 2007, recognition of the considerable importance of Grid-related
activity within CNRS: creation of the « Institut des Grilles »
Federate all activities in CNRS related to research on Grids, grids for
research and production grids
Better visibility
Better efficiency
Strengthen the links between these domains
Provide a single well identified Point of contact for national and
international collaborations
CNRS representative for all international contacts
Central core for the emerging French « National Grid Initiative »
Partnership for grid-related work with all major French research
organisations: CEA, CNES, INRIA, …
Outreach activities, evangelisation of new scientific communities,
training, …
Institut des Grilles composition
30 laboratories all over France:
APC, CC_IN2P3, CPPM, CREATIS, LIP, I3S, IBCP, IN2P3_adm,
IPGP, IPHC, IPNL, IPNO, IRISA, IRIT, LABRI, LAL, LAPP,
LIFL, LIG, LIP6, LLR, LORIA, LPC Clermont, LPNHE, LRI,
IPSL, LPSC, LSIT, Subatech, UREC
13 IN2P3 labs linked to EGEE/LCG
11 computing science labs
5 labs linked to various applications
Administrative support
Total of 350 people!
IdG objectives and means
Scientific coordination
Organization of the national foresight study on the
needs of the scientific community related to Grids
Call for proposals
Dialogue forum between production
grids and research grids
Interoperability GRID5000/EGEE
Grids Observatory
Middleware of the future
Training
Communication
The national foresight working
groups
Thematic working groups
Planet and Universe sciences; environmental sciences
Life sciences
Human sciences
Chemistry
Physics
Engineering and computing sciences
Subatomic physics (including astroparticles)
Transverse working groups
Data grids
Grids and supercomputers
Regional grids
Grids and very large research infrastructures
Grids and the user
Relationship with industry
Summary
Grids are all about sharing – they are a means of working with groups
around the world
EGEE operates the world’s largest multi-disciplinary grid infrastructure for
scientific research
In constant and significant production use
Phase III of EGEE has just started
Need to prepare the long-term
EGEE, collaborating projects, national grid initiatives and user
communities are working to define a model for a sustainable grid
infrastructure that is independent of short project cycles
In France, creation of the CNRS Grid Institute as the nucleus of the French
NGI to help implement this future!
Grids are ideal tools to promote scientific collaborations!
Summary of Informal discussion
Focus on practical actions towards 4
main goals:
Large user-oriented school in December
Creation of a local grid in South Africa
Official registration of a South African Grid
Node in EGEE
A few applications driven by South African
scientists running on EGEE
Practical steps
Overall Steering committee
SA: J. Eksteen, A. Gazendam
FR: A. Corval, V. Baron, G. Wormser
Step 1: Invite 3 SA sys admins to a
specific EGEE session to be held in Lyon
in September-October (preferred date
around Sep 15)
3 days of tutorial + two days of hands-on experience in a
Regional Operation Center (ROC)
To do list:
NAME the SA experts (SA)
FIX the date (FR)
Step 1 managers: D. Bouvet (FR), xxx (SA)
Step 2: Sys Admin School
Step 2: Sys Admin school organized in
SA (U. of Pretoria)
Attendance : ~15 seats (should cover all potential grid nodes
in SA)
Leading role to be taken by the 3 experts having travelled to
France
Preferred date : Oct 15
2 experts coming from France
Step 2 managers : D. Bouvet (FR), xxx
(SA)
To do list
Fix the date
Name the SA manager
Milestone 1 (1 SA Grid node fully
Practical steps for Milestone 1
Establish contacts related to Certification
Authority (Alice)
Establish formal link between the French
ROC and the SA EGEE Grid node
Step 3: User oriented Grid School in
South Africa
Proposed venue : Cape Town , coupled
with SCAW
Dates: 1st Week of December ?
Attendance: 100-150 people from SA
and other African countries
Milestone 2: Local SA grid up and
running with at least 3 sites
Milestone 3: 3 applications with strong
SA participation up and running in EGEE
Training team 5-6 people from France
Practical details for Step 3
Fix the date
Form Local Organizing committee
Contact SCAW
Publicize
Outreach, « big » political event
Select the target applications (WISDOM,
ATLAS, language translation?,
chemistry?)
Name the visiting team
Collect the training material
Funding issues
Hope that there are no show-stoppers
Multiple sources
French Embassy
Meraka Institute
Institut des Grilles
SA DST
….
Establish the needed budget for Step 1,
2 and 3
Conclusion
Very ambitious program!
If everything works as planned, one gets
at the end:
A running grid in SA visible in EGEE
A few applications with high impact
High visibility and many local users
Let’s do it!