Prof. Giuseppe Longo Department of Physical Sciences


Virtual organizations in astronomy and beyond
Tbilisi, March 28-30, 2007
Prof. Giuseppe Longo
Chair of Astrophysics - Department of Physical Sciences
University of Napoli Federico II – Italy
National Institute of Astrophysics – Napoli Unit
[email protected]
http://people.na.infn.it/~longo/
The Exponential Growth of Information in Astronomy
[Figure: logarithmic vertical axis from 0.1 to 1000 versus year, 1970 to 2000; two curves labelled "CCDs" and "Glass"]
Total area of 3m+ telescopes in the world in m², and total number of CCD pixels in megapixels, as a function of time. Growth over 25 years is a factor of 30 in glass and 3000 in pixels.
• Gigapixel arrays are a reality, hence optical and near-infrared surveys are becoming common
• Space mission archives are being federated
• Old datasets (from space- and ground-based instruments) are being federated
• Estimated 1 TB per day in 2008
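As a rough check on the growth factors quoted in the figure caption (30 in glass and 3000 in pixels over 25 years), the sketch below converts a total growth factor into an equivalent annual growth rate and a doubling time. It assumes smooth exponential growth over the whole period, which is a simplification; the function name is illustrative.

```python
import math

def exponential_growth(factor, years):
    """Return (annual growth rate, doubling time in years) for a total
    growth `factor` reached over `years`, assuming pure exponential growth."""
    annual_rate = factor ** (1.0 / years) - 1.0
    doubling_time = years * math.log(2.0) / math.log(factor)
    return annual_rate, doubling_time

# Figures quoted in the caption: 25 years, x30 in glass, x3000 in pixels.
glass_rate, glass_doubling = exponential_growth(30, 25)
pixel_rate, pixel_doubling = exponential_growth(3000, 25)

print(f"glass : {glass_rate:.1%} per year, doubling every {glass_doubling:.1f} years")
print(f"pixels: {pixel_rate:.1%} per year, doubling every {pixel_doubling:.1f} years")
```

Under that assumption, collecting area doubles roughly every five years and pixel count roughly every two.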
Astronomy, more than other sciences, is facing a major
data avalanche (… a true tsunami …)
Large survey projects
from ground and from
space
Distributed data repositories
Data are not where the users are
PetaBytes of data / week
(past, ongoing, future)
Massive numerical simulations
Distributed computing
(PB per simulation)
Data federation of massive data sets (MDS)
Adoption of standards and
common ontologies
Data analysis and interpretation
Need for a new generation of tools
(A.I. based) capable of working in a
distributed environment
International Virtual Observatory Alliance
GRID INFRASTRUCTURE
The distributed environment
• Once the VOs become operational, there will be no need
to have powerful local computing facilities
What are some of the goals of VOs?
• Federation of existing and new databases through the adoption of common
standards
• Network access to the databases
• To provide the user with user-friendly access to all federated data
• To allow the user to access distributed computing facilities and to exploit all
available data by moving the codes rather than the data (… data remain at the data
centers, where the expertise is)
• To open entirely new paths to the discovery process in astronomy (but not only!)
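The argument for moving the codes rather than the data can be made concrete with an idealised transfer-time estimate. The sketch below is only a back-of-envelope illustration: the 1 TB dataset matches the daily volume estimated earlier in the talk, while the 1 MB script size and the 100 Mbit/s link are hypothetical values chosen for the example.

```python
def transfer_time_seconds(size_bytes, link_bits_per_second):
    """Idealised transfer time: payload size divided by link speed.
    Ignores protocol overhead, latency and congestion."""
    return size_bytes * 8 / link_bits_per_second

# Illustrative assumptions, not figures from the talk (except the 1 TB scale):
DATASET_BYTES = 1e12      # 1 TB, about one day of data at the estimated 2008 rate
CODE_BYTES = 1e6          # a 1 MB analysis script (hypothetical)
LINK_BPS = 100e6          # a 100 Mbit/s connection (hypothetical)

data_time = transfer_time_seconds(DATASET_BYTES, LINK_BPS)
code_time = transfer_time_seconds(CODE_BYTES, LINK_BPS)

print(f"moving the data: {data_time / 3600:.1f} hours")
print(f"moving the code: {code_time:.2f} seconds")
```

With these numbers, shipping the dataset takes most of a day while shipping the script takes a fraction of a second, which is the asymmetry the bullet points to.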
• VOs are the most democratic tool ever implemented by any
scientific community.
• Data repositories are mostly public (either immediately or
after the observers' proprietary period)
• Data analysis and data mining tools are available to the
international community through a distributed computing
environment
• Everyone can contribute (either with new data or with new
software tools)
• Once the VO is implemented, new top-level science will
be at the fingertips of any competent scientist who has
minimal computing facilities and good access to the
WWW
What is being done in Napoli 1 – The surveys
VLT Survey Telescope
(Napoli,ESO)
P.I.: Prof. M. Capaccioli
2.5 m diameter - OPTICAL
1x1 sq deg f.o.v.
16k x 16k CCD mosaic
(optical)
New technology
Adaptive optics
0.2 arcsec PSF
Operational end 2007
100 GB raw data/night
Nobel laureate R. Giacconi visiting
the VST factory
VLT site, Cerro Paranal (Chile)
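The quoted figures (a 16k x 16k mosaic and 100 GB of raw data per night) can be related with a short back-of-envelope calculation. The sketch below assumes that "16k" means 16384 pixels per side and that raw pixels are stored as 16-bit integers; neither is stated on the slide, so the resulting frame count is only indicative.

```python
MOSAIC_SIDE_PIXELS = 16 * 1024   # "16k x 16k" read as 16384 x 16384 (assumption)
BYTES_PER_PIXEL = 2              # 16-bit raw pixels (assumption)
RAW_BYTES_PER_NIGHT = 100e9      # 100 GB raw data per night (from the slide)

pixels_per_frame = MOSAIC_SIDE_PIXELS ** 2
bytes_per_frame = pixels_per_frame * BYTES_PER_PIXEL
frames_per_night = RAW_BYTES_PER_NIGHT / bytes_per_frame

print(f"pixels per frame : {pixels_per_frame / 1e6:.0f} Mpix")
print(f"bytes per frame  : {bytes_per_frame / 1e9:.2f} GB")
print(f"frames per night : {frames_per_night:.0f}")
```

Under these assumptions, a single full-mosaic frame is about half a gigabyte, so 100 GB per night corresponds to somewhat under two hundred frames.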
What is being done in Napoli 2 – The detector
OmegaCAM
French-Dutch-Italian consortium
16k x 16k CCD mosaic array
Ready
Data processing pipeline
European FP6 network ASTROWISE
Real-time storage and processing of
the VST data
What is being done in Napoli 3 – The computing
Campus GRID
Department of Physical Sciences
Room 1G01
“Main GRID infrastructure room”
512 + 15 + 24 + 16 + 128 nodes
150 TB storage
(IBM, DEC Alpha, etc.)
Star-center infrastructure network cabinet
CDS
GARR
16 GBaud optical fiber backbone
Recently evolved into
PON - SCOPE
Department of Chemistry
Department of Mathematics and Applications
3.6 M€ (8.2 M€ total) for hardware (512
boards with 4 CPUs)
Financed by Italian Government
Operational end 2007
What is being done in Napoli 4 – The Data mining
Draco Project
building the GRID infrastructure
for the Italian VO
400 k€ - MIUR
COST Action 283 (EU)
Euro-VO, VO-Tech
European Virtual Observatory
Technological Infrastructures
European Infrastructures for VO
(UK, D, I, F, etc.) 6.6 M€ - EU
VO-Neural (Napoli lead)
Building Data Mining and Visualization
for Massive Data Sets in a Distributed
Environment
Complex parameter space
Parameter space of incredibly
high dimensionality (N>>100)
Example 1: panchromatic view of the universe
[Figure: the Crab Nebula imaged in X-ray, infrared, optical, and radio]
Crab Nebula: SN 1054 AD
Example 2: a new way to do conventional astronomy
Selection of quasar candidates from a 3-band photometric
survey
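One conventional way to select candidates from three-band photometry is a colour-colour cut: form two colours from the three magnitudes, model the stellar locus, and flag objects that lie far from it. The sketch below is a minimal illustration of that idea, not the selection procedure used in the talk. The band names (b1, b2, b3), the straight-line locus, the 0.3 mag threshold, and the toy catalogue are all hypothetical, and a real selection would use a more robust fit.

```python
def colours(obj):
    """Two colours from three magnitudes (band names b1, b2, b3 are placeholders)."""
    return obj["b1"] - obj["b2"], obj["b2"] - obj["b3"]

def fit_line(points):
    """Ordinary least-squares fit y = slope * x + intercept."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    slope = sxy / sxx
    return slope, my - slope * mx

def select_candidates(catalogue, threshold=0.3):
    """Names of objects whose second colour is offset by more than
    `threshold` mag (vertically) from the fitted straight-line locus."""
    points = [colours(obj) for obj in catalogue]
    slope, intercept = fit_line(points)
    return [obj["name"] for obj, (x, y) in zip(catalogue, points)
            if abs(y - (slope * x + intercept)) > threshold]

# Toy catalogue: ten "stars" on a straight locus plus one off-locus object.
catalogue = []
for i in range(10):
    x = 0.2 * i
    y = 0.5 * x + 0.1
    catalogue.append({"name": f"star{i}", "b1": 18.0 + x, "b2": 18.0, "b3": 18.0 - y})
catalogue.append({"name": "candidate", "b1": 18.9, "b2": 18.0, "b3": 18.45})

print(select_candidates(catalogue))
```

Because the locus is fitted to all objects, including the outliers, the cut only works when candidates are a small minority of the sample.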
Example: exploring a 3D Parameter Space
Given an arbitrary parameter space:
• Data Clusters
• Points between Data Clusters
• Isolated Data Clusters
• Isolated Data Groups
• Holes in Data Clusters
• Isolated Points
Nichol et al. 2001
Slide courtesy of Robert Brunner @ Caltech.
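Of the structures listed on this slide, isolated points are the simplest to look for automatically. The sketch below flags points whose distance to the nearest neighbour is much larger than is typical for the sample. It is a brute-force illustration, not the method of Nichol et al. or of the talk; the choice of k, the factor of 5, and the toy sample are assumptions.

```python
import math
import statistics

def kth_neighbour_distance(points, k=1):
    """Distance from each point to its k-th nearest neighbour (brute force)."""
    result = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        result.append(dists[k - 1])
    return result

def isolated_points(points, k=1, factor=5.0):
    """Indices of points whose k-th neighbour distance exceeds
    `factor` times the median of that distance over the sample."""
    d = kth_neighbour_distance(points, k)
    cutoff = factor * statistics.median(d)
    return [i for i, di in enumerate(d) if di > cutoff]

# Toy 3-D sample: a compact 3 x 3 x 3 grid "cluster" plus one distant point.
cluster = [(x * 0.1, y * 0.1, z * 0.1)
           for x in range(3) for y in range(3) for z in range(3)]
sample = cluster + [(5.0, 5.0, 5.0)]

print(isolated_points(sample))
```

The pairwise loop scales as the square of the number of points, so for massive data sets a spatial index or a distributed implementation would be needed.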
Example:
21-D parameter space
VO-Neural
Probabilistic Principal Surfaces
Negative ENtropy Clustering +
Dendrogram
• Multiwavelength, multi-epoch, multi-instrument data (federation of
databases), hence there is a strong need for a new generation of data
processing, data visualization and data-mining tools
• These tools must be largely based on Artificial Intelligence
• Interoperability is a must (PLASTIC is a standard)
THESE TOOLS ARE OF WIDE APPLICATION: bioinformatics, geophysics
(environment, stratigraphy, etc.), business (stock market, marketing
strategies, etc.). Therefore interdisciplinarity is a must!
Many problems to be solved:
• Missing data (new data models are needed)
• Parallelization of existing codes
• Raising awareness in the community through selected science
cases (astrophysics, bioinformatics, marketing, etc.)
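The missing-data problem shows up as soon as federated catalogues do not all measure the same quantities for every object. One simple workaround, shown here only as an illustration and not as the new data model the slide calls for, is a partial distance: compare two objects over the dimensions measured in both and rescale to the full dimensionality. The function name and the toy values are hypothetical.

```python
import math

def partial_distance(a, b):
    """Euclidean distance over the dimensions measured in both objects,
    rescaled to the full dimensionality; None marks a missing value.
    Returns None when the two objects share no measured dimension."""
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        return None
    squared = sum((x - y) ** 2 for x, y in shared)
    return math.sqrt(squared * len(a) / len(shared))

complete = (1.0, 2.0, 3.0, 4.0)
partial = (1.0, None, 5.0, 4.0)   # second quantity not measured (toy values)

print(partial_distance(complete, partial))
```

The rescaling implicitly assumes that the missing dimensions behave like the measured ones, which is exactly the kind of assumption a proper data model would have to justify.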
We (UK, F, I, D, USA, India) intend to pursue the above tasks using the
following instruments:
National funds and private companies
EU funds through new COST Action and ITN
Possibly through RI
US funds through NSF
Conferences and Schools for young students (dissemination is CRUCIAL)
NEW POTENTIAL PARTNERS ARE ENCOURAGED TO CONTACT ME:
[email protected]
Plate or digital archives of astronomical data
Other types of scientific data
Advanced programming and mathematical know-how