Px_UK_eScience

The Australian Virtual
Observatory
e-Science Meeting
School of Physics, March 2003
David Barnes
What is a Virtual Observatory?
• A Virtual Observatory (VO) is a distributed, uniform
interface to the data archives of the world’s major
astronomical observatories.
• A VO is explored with advanced data mining and
visualisation tools which exploit the unified interface
to enable cross-correlation and combined
processing of distributed and diverse datasets.
• VOs will rely on, and provide motivation for, the
development of national and international
computational and data grids.
Scientific motivation
• Understanding of astrophysical processes depends
on multi-wavelength observations and input from
theoretical models.
• As telescopes and instruments grow in complexity,
surveys generate massive databases which require
increasing expertise to comprehend.
• Theoretical modeling codes are growing in
sophistication to consume available compute time.
• Major advances in astrophysics will be enabled by
transparently cross-matching, cross-correlating and
inter-processing otherwise disparate data.
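As an illustration of the cross-matching mentioned above, here is a minimal sketch that pairs sources from two catalogues by angular separation. The catalogue layout (name, RA, Dec in degrees), the function names, the match radius and the positions are all assumptions made for this example; a real VO service would use an indexed search rather than this brute-force loop.

```python
import math

def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees (haversine formula); inputs in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    a = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(a))))

def cross_match(cat_a, cat_b, radius_deg):
    """For each source in cat_a, keep the nearest cat_b source within radius_deg."""
    matches = []
    for name_a, ra_a, dec_a in cat_a:
        best = None
        for name_b, ra_b, dec_b in cat_b:
            sep = angular_separation_deg(ra_a, dec_a, ra_b, dec_b)
            if sep <= radius_deg and (best is None or sep < best[1]):
                best = (name_b, sep)
        if best is not None:
            matches.append((name_a, best[0], best[1]))
    return matches

# Made-up positions, for illustration only (not taken from any survey).
radio = [("R1", 150.000, -30.000), ("R2", 200.500, -45.250)]
optical = [("O1", 150.001, -30.001), ("O2", 10.000, 5.000)]
matches = cross_match(radio, optical, radius_deg=0.01)
```

Here only R1 finds a counterpart (O1); R2 has no optical source inside the assumed 0.01 degree radius.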
Aus-VO in 2003
• “Phase A” funded AUD 260K by a 2003 ARC grant:
– The University of Melbourne
– The University of Sydney
– CSIRO Australia Telescope National Facility
– Anglo-Australian Observatory
• Funded common format on-line archive projects:
– HIPASS: HI spectral line and 1.4-GHz continuum survey
– SUMSS: 843 MHz continuum survey
– ATCA archive: spectral line and radio continuum images
– 2dFGRS: optical spectra of >200K southern galaxies
www.aus-vo.org
www.aus-vo.org/twiki
... thinking about the Aus-VO
Grid, having data nodes and
compute nodes...
GrangeNet: Grid and Next
Generation Network – a 10 Gbit
backbone
[Diagram: candidate Aus-VO Grid sites linked by GrangeNet. Labels on the
diagram: Parkes, Canberra, Adelaide, Melbourne, Sydney, Swinburne, ATNF/AAO,
MSO, VPAC, APAC; datasets 2dFGRS, RAVE, ATCA, HIPASS, SUMSS, Gemini?; each
site tagged with some of Data, CPU, CPU?, Theory, Theory?]
VO Interface & Portal
• Agreement with AstroGrid (UK e-Science
project) to be testers for their data
publication and portal creation code.
• We are collecting the necessary resources and
intend to have an AstroGrid-based portal
serving HIPASS catalogue data for
demonstration at the IAU General Assembly in
July 2003.
The MACHO Grid!
• MACHO: 8-yr lightcurves for >18
million stars
• ANU, APAC and MSO have the data
on mass store, and are working on a
VOTable XML description of the data
(metadata).
• Agreement with San Diego
Supercomputer Center to install a
storage resource broker (SRB) at
ANU, with a view to making the
MACHO data available on an
international Grid.
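To make the idea of a VOTable metadata description concrete, here is a minimal sketch that builds a small VOTable-style document with Python's standard library. The element nesting (VOTABLE, RESOURCE, TABLE, FIELD, DATA, TABLEDATA, TR, TD) follows the VOTable format; the table name, field names, units and values are hypothetical and are not the actual MACHO schema.

```python
import xml.etree.ElementTree as ET

def build_votable(table_name, fields, rows):
    """Build a VOTable-style element tree: FIELD metadata followed by TABLEDATA rows."""
    votable = ET.Element("VOTABLE", version="1.0")
    resource = ET.SubElement(votable, "RESOURCE")
    table = ET.SubElement(resource, "TABLE", name=table_name)
    for field in fields:
        # Each FIELD describes one column (name, datatype, unit, ...).
        ET.SubElement(table, "FIELD", attrib=field)
    data = ET.SubElement(table, "DATA")
    tabledata = ET.SubElement(data, "TABLEDATA")
    for row in rows:
        tr = ET.SubElement(tabledata, "TR")
        for value in row:
            ET.SubElement(tr, "TD").text = str(value)
    return votable

# Hypothetical lightcurve columns and values, for illustration only.
fields = [
    {"name": "star_id", "datatype": "char", "arraysize": "*"},
    {"name": "mjd", "datatype": "double", "unit": "d"},
    {"name": "mag", "datatype": "float", "unit": "mag"},
]
rows = [("example-0001", 49000.5, 18.25), ("example-0001", 49001.5, 18.31)]

votable = build_votable("lightcurve_sample", fields, rows)
xml_text = ET.tostring(votable, encoding="unicode")
```

The FIELD elements carry the metadata; the rows could equally be served from a separate binary file, with the XML describing how to read it.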
Grid-based Visualisation
• ATNF will build a Java
PixelCanvas so that
AIPS++ visualisation
applications can be
deployed as Web-Service and Grid-Service Java Applets.
• AIPS++ is modern,
open-source software
for reducing (radio)
astronomy data, comprising
1.6M lines of code.
Grid-based Volume Rendering
• Agreement between Melbourne and AstroGrid to develop our
existing distributed-data volume rendering code into a fully-fledged Grid-Service.
• Challenge is to interactively render a multi-GB cube at the IAU
GA 2003, using GridFTP to transfer the data volume from a
remote data warehouse to a remote rendering cluster.
[Plot: time to render a 512x512 view of a 1024x1024x1024 volume (seconds,
logarithmic axis from 1 to 1000) versus number of nodes (0 to 40)]
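The slides do not say which rendering algorithm the Melbourne code uses. As a sketch of the general distributed-data idea only, the example below assumes an axis-aligned maximum-intensity projection (MIP): the volume is split into slabs, each hypothetical node projects its own slab, and the partial images are composited with a per-pixel maximum. All function names and the tiny synthetic volume are made up for illustration.

```python
def split_into_slabs(volume, n_nodes):
    """Split a volume (indexed [z][y][x]) along z into contiguous, near-equal slabs."""
    nz = len(volume)
    bounds = [nz * i // n_nodes for i in range(n_nodes + 1)]
    return [volume[bounds[i]:bounds[i + 1]]
            for i in range(n_nodes) if bounds[i] < bounds[i + 1]]

def render_slab(slab):
    """Work done by one node: maximum-intensity projection of its slab along z."""
    ny, nx = len(slab[0]), len(slab[0][0])
    return [[max(plane[y][x] for plane in slab) for x in range(nx)]
            for y in range(ny)]

def composite(images):
    """Combine the partial images with a per-pixel maximum."""
    ny, nx = len(images[0]), len(images[0][0])
    return [[max(img[y][x] for img in images) for x in range(nx)]
            for y in range(ny)]

# Tiny synthetic 8x4x4 volume standing in for a multi-GB cube.
volume = [[[(x * 7 + y * 3 + z * 5) % 11 for x in range(4)]
           for y in range(4)] for z in range(8)]

slabs = split_into_slabs(volume, n_nodes=3)
partial_images = [render_slab(slab) for slab in slabs]  # one per node
image = composite(partial_images)
```

Because the maximum is associative, the composited image is identical to projecting the whole volume on a single node, which is what lets the per-slab work run independently.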
DataGrids for Aus-VO
• Australian archives range from ~10 GB to
~10 TB in processed (reduced) size.
• Providing just the processed images and
spectra on-line requires a distributed, high-bandwidth network of data servers – that is,
a DataGrid.
• Users may want some simple operations,
such as smoothing or filtering, applied at the
data server. This is a Virtual DataGrid.
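A minimal sketch of the Virtual DataGrid idea follows: the data server applies a simple operation (here boxcar smoothing) before returning the data, so only the derived product crosses the network. The handler name `serve_request`, the in-memory `archive` dictionary and the `width` parameter are assumptions for this example, not part of any Aus-VO interface.

```python
def boxcar_smooth(spectrum, width):
    """Running mean over `width` channels; the window is truncated at the edges."""
    if width < 1 or width % 2 == 0:
        raise ValueError("width must be a positive odd integer")
    half = width // 2
    n = len(spectrum)
    smoothed = []
    for i in range(n):
        window = spectrum[max(0, i - half):min(n, i + half + 1)]
        smoothed.append(sum(window) / len(window))
    return smoothed

def serve_request(archive, dataset_id, operation=None, **params):
    """Hypothetical server-side handler: return stored data, optionally processed."""
    data = archive[dataset_id]
    if operation is None:
        return list(data)
    if operation == "smooth":
        return boxcar_smooth(data, params.get("width", 3))
    raise ValueError("unsupported operation: %s" % operation)

# Made-up spectrum held "at the data server".
archive = {"spec1": [0.0, 0.0, 3.0, 0.0, 0.0]}
raw = serve_request(archive, "spec1")
smooth = serve_request(archive, "spec1", "smooth", width=3)
```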
ComputeGrids for Aus-VO
• More complex operations, requiring significant
processing, may also be applied:
– source detection and parameterisation
– reprocessing of raw or intermediate data
products with new calibration algorithms
– combined processing of raw, intermediate or
"final product" data from different archives
• These operations require a distributed, high-bandwidth network of computational nodes
– that is, a ComputeGrid.
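As a toy illustration of source detection and parameterisation, the sketch below finds contiguous runs of channels above a threshold in a 1-D spectrum and reports simple parameters for each. The threshold approach, the function name and the parameter set (peak, integrated flux, flux-weighted centroid) are assumptions chosen for brevity; production source finders are considerably more involved.

```python
def detect_sources(spectrum, threshold):
    """Find runs of channels above `threshold` and parameterise each run."""
    if threshold < 0:
        raise ValueError("threshold must be non-negative")
    sources = []
    start = None
    # A trailing sentinel below any threshold closes a run that reaches the last channel.
    for i, value in enumerate(list(spectrum) + [float("-inf")]):
        if value > threshold and start is None:
            start = i
        elif value <= threshold and start is not None:
            run = spectrum[start:i]
            total = sum(run)
            centroid = sum((start + k) * v for k, v in enumerate(run)) / total
            sources.append({
                "first_channel": start,
                "last_channel": i - 1,
                "peak": max(run),
                "integrated": total,
                "centroid": centroid,
            })
            start = None
    return sources

# Made-up spectrum with two features above a threshold of 1.0.
spectrum = [0.1, 0.2, 2.0, 4.0, 2.0, 0.1, 0.0, 3.0]
sources = detect_sources(spectrum, threshold=1.0)
```

On a ComputeGrid, a function like this would run on a computational node close to the data, with only the resulting source list returned to the user.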