“A wide-field astronomer in King Malcolm’s court”
or
My year as a NeSC Research Leader
Bob Mann
Institute for Astronomy and NeSC
University of Edinburgh
Outline of talk
My background
Duties of a NeSC Research Leader
Some of my highlights from the year
Astronomical testbed for edikt’s BinX project
Scientific Data Mining, Integration & Visualization
Sky Survey Database Design
Virtual Observatory as a Data Grid
Conclusions from the year
My background
“Generalist” astronomer
Theory and observations
X-ray, optical, infrared, submillimetre, radio
Formation/evolution of galaxies & clusters of galaxies
Member of Wide Field Astronomy Unit
Home of UK’s largest sky survey databases
Member of AstroGrid team
UK contribution to building an
international Virtual Observatory
Duties of a NeSC Research Leader
encouraging the uptake of Grid technologies in
Astronomy and related fields
encouraging visitors, with whom you have a research
overlap, to visit Edinburgh and work with you and other
local colleagues
organising and running research workshops
assisting with the development of new core Grid and
scientific database technologies
promoting NeSC within the Universities of Edinburgh and
Glasgow through, for example, personal presentations
and more widely at conferences and workshops.
…and all that in 0.5 FTE!
BinX astronomy testbed
What is BinX?
edikt project
see www.edikt.org/binx - download BinX v1.0!
XML language description of binary data files
library of tools for manipulating files
Why BinX and astronomy?
Two main data formats for tabular data:
VOTable (XML) and FITS binary tables
XML good for interoperability and transformation, but
verbose & lots of legacy data in FITS files
Want FITS some of the time & VOTable the rest
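The idea of an XML description of a binary file can be sketched in a few lines of Python. This is only an illustration of the concept, not BinX itself: the `record`/`field` element names, the `byteOrder` attribute, and the helper functions are all invented for this sketch and do not follow the real BinX schema.

```python
import struct
import xml.etree.ElementTree as ET

# Hypothetical layout description. Element and attribute names are invented
# for illustration and are NOT the real BinX schema.
LAYOUT_XML = """
<record byteOrder="big">
  <field name="ra" type="float64"/>
  <field name="dec" type="float64"/>
  <field name="mag" type="float32"/>
</record>
"""

# Map the (assumed) type names onto struct format codes.
TYPE_CODES = {"float64": "d", "float32": "f", "int32": "i", "int16": "h"}


def build_struct(layout_xml):
    """Turn the XML description into field names and a struct.Struct."""
    root = ET.fromstring(layout_xml)
    prefix = ">" if root.get("byteOrder") == "big" else "<"
    fields = root.findall("field")
    names = [f.get("name") for f in fields]
    fmt = prefix + "".join(TYPE_CODES[f.get("type")] for f in fields)
    return names, struct.Struct(fmt)


def read_records(data, layout_xml):
    """Decode a bytes object into a list of dicts, one per fixed-size record."""
    names, rec = build_struct(layout_xml)
    return [dict(zip(names, values)) for values in rec.iter_unpack(data)]


# Build two records in memory, then read them back using only the XML layout.
names, rec = build_struct(LAYOUT_XML)
raw = rec.pack(10.5, -3.25, 18.5) + rec.pack(200.0, 45.0, 21.0)
rows = read_records(raw, LAYOUT_XML)
```

The binary data stays compact, while the XML layout is what gets exchanged and transformed, which is the trade-off the slide describes between FITS and VOTable.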
BinX astronomy testbed (2)
VOTable ↔ FITS conversion with BinX
it works! - some performance improvements desirable
(use SAX, not DOM)
workable solution for astronomy & proof of concept for
edikt
Possible extensions to BinX
data extraction from binary files with XPath 1.0?
delivering SAX events from binary files to apps?
closer integration with databases - ELDAS?
RL time significantly improved interaction
Science drivers for GGF DFDL WG
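The "use SAX, not DOM" remark can be illustrated with Python's standard `xml.sax` module: rows are handed to a callback one at a time instead of the whole document being built in memory. This is a minimal sketch, not the BinX implementation; the `RowHandler` class is hypothetical, and the sample is a stripped-down fragment using VOTable's `TABLEDATA`/`TR`/`TD` element names without the rest of a real VOTable document.

```python
import xml.sax


class RowHandler(xml.sax.ContentHandler):
    """Streams table rows to a callback without building a document tree."""

    def __init__(self, on_row):
        super().__init__()
        self.on_row = on_row
        self.row = None   # cells of the row being read, or None outside a row
        self.cell = None  # text chunks of the cell being read, or None

    def startElement(self, name, attrs):
        if name == "TR":
            self.row = []
        elif name == "TD" and self.row is not None:
            self.cell = []

    def characters(self, content):
        # Text may arrive in several chunks, so collect and join later.
        if self.cell is not None:
            self.cell.append(content)

    def endElement(self, name):
        if name == "TD" and self.cell is not None:
            self.row.append("".join(self.cell).strip())
            self.cell = None
        elif name == "TR" and self.row is not None:
            self.on_row(self.row)
            self.row = None


SAMPLE = b"""<TABLEDATA>
<TR><TD>10.5</TD><TD>-3.25</TD></TR>
<TR><TD>200.0</TD><TD>45.0</TD></TR>
</TABLEDATA>"""

rows = []
xml.sax.parseString(SAMPLE, RowHandler(rows.append))
```

Here the callback simply appends to a list, but in a conversion tool it could write each row straight to a binary output, so memory use stays flat however large the table is.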
“Scientific Data Mining, Integration & Visualization”
Two-day workshop in October 2002
Focus for visit by Roy Williams (Caltech)
Fifty attendees - astronomy, atmospheric science, bioinformatics,
chemistry, digital libraries, engineering, environmental science,
experimental physics, marine sciences, oceanography, and statistics…plus
CS and software engineers
Report [UKeS-2002-06] with 12 recommendations
R5. A mechanism should be sought whereby the peer-reviewed publication of datasets can be made
part of the standard scientific process.
R8. A set of tutorials should be created and maintained, for introducing application scientists to new
key concepts in e-science.
Spawned e-science Data Mining SIG
now - want to discuss solutions, not just problems
“Sky Survey Database Design”
One-day workshop in April 2003
~10 people:
AstroGrid, UK wide field astronomy, IBM, Oracle
Identified spatial indexing in large databases as
a problem of interest beyond astronomy
Spawned research programme on spatial
indexing in sky survey databases - NeSC,
WFAU, IBM, Oracle and Microsoft
future applications to other spatially-indexed domains
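To make the spatial indexing problem concrete, here is a minimal sketch of one simple scheme: bucket sources into declination zones so that a cone search only inspects nearby buckets before applying an exact distance test. The slides do not say which indexing schemes the research programme considered, so the `ZoneIndex` class, its zone height, and the sample coordinates are assumptions made for illustration only.

```python
import math
from collections import defaultdict


def angular_sep_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees (haversine formula)."""
    r1, d1, r2, d2 = map(math.radians, (ra1, dec1, ra2, dec2))
    a = (math.sin((d2 - d1) / 2) ** 2
         + math.cos(d1) * math.cos(d2) * math.sin((r2 - r1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(a))))


class ZoneIndex:
    """Buckets sources by declination zone (illustrative only)."""

    def __init__(self, zone_height_deg=1.0):
        self.h = zone_height_deg
        self.zones = defaultdict(list)

    def _zone(self, dec):
        return math.floor((dec + 90.0) / self.h)

    def add(self, source_id, ra, dec):
        self.zones[self._zone(dec)].append((source_id, ra, dec))

    def cone_search(self, ra, dec, radius_deg):
        """Return sorted ids of sources within radius_deg of (ra, dec)."""
        lo = self._zone(max(dec - radius_deg, -90.0))
        hi = self._zone(min(dec + radius_deg, 90.0))
        hits = []
        for z in range(lo, hi + 1):  # only zones the cone can overlap
            for sid, sra, sdec in self.zones.get(z, ()):
                if angular_sep_deg(ra, dec, sra, sdec) <= radius_deg:
                    hits.append(sid)
        return sorted(hits)


index = ZoneIndex(zone_height_deg=1.0)
index.add("a", 10.0, 20.0)
index.add("b", 10.01, 20.005)
index.add("c", 190.0, -20.0)
index.add("d", 10.0, 20.99)
index.add("e", 10.0, 21.01)
near = index.cone_search(10.0, 20.0, 0.05)
```

A production database would also need to prune in right ascension and handle billions of rows on disk, which is where the difficulty lies; the sketch only shows why a coarse spatial key cuts down the candidates.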
“The Virtual Observatory as a Data Grid”
Three-day meeting in June/July 2003
Focus for visits by Jim Gray (Microsoft), Alex
Szalay (Johns Hopkins), Roy Williams (again!)
25 participants - Virtual Observatory, database and
data grid communities in UK, US, Europe
Report [UKeS-2003-03]
Spawning “SkyQuery-G”
take SkyQuery.Net WWW service for matching
astronomical sources, and make it a grid service, using
OGSA-DAI and/or ELDAS
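What "matching astronomical sources" means can be shown with a brute-force positional cross-match between two small catalogues. The slides do not describe SkyQuery's own matching algorithm, so this is not a reproduction of it: the `cross_match` function, the nearest-neighbour rule, and the tolerance are assumptions chosen to keep the sketch short.

```python
import math


def unit_vector(ra_deg, dec_deg):
    """Convert sky coordinates in degrees to a Cartesian unit vector."""
    ra, dec = math.radians(ra_deg), math.radians(dec_deg)
    return (math.cos(dec) * math.cos(ra),
            math.cos(dec) * math.sin(ra),
            math.sin(dec))


def cross_match(cat_a, cat_b, tol_arcsec):
    """Map each cat_a id to its nearest cat_b id within tol_arcsec, if any."""
    # An angle theta between unit vectors corresponds to a chord 2*sin(theta/2).
    max_chord = 2.0 * math.sin(math.radians(tol_arcsec / 3600.0) / 2.0)
    vecs_b = [(sid, unit_vector(ra, dec)) for sid, ra, dec in cat_b]
    matches = {}
    for sid_a, ra, dec in cat_a:
        va = unit_vector(ra, dec)
        best = None
        for sid_b, vb in vecs_b:
            chord = math.dist(va, vb)
            if chord <= max_chord and (best is None or chord < best[1]):
                best = (sid_b, chord)
        if best is not None:
            matches[sid_a] = best[0]
    return matches


# Hypothetical catalogues: (id, RA in degrees, Dec in degrees).
cat_a = [("a1", 10.0, 20.0), ("a2", 50.0, -5.0)]
cat_b = [("b1", 10.0002, 20.0001), ("b2", 120.0, 30.0)]
matches = cross_match(cat_a, cat_b, tol_arcsec=2.0)
```

The nested loop costs one comparison per pair of sources, which is why matching at survey scale depends on database-side spatial indexing and, in the grid setting, on moving the query to the data.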
Conclusions
RL positions valuable from applications side
long, steep learning curve for average scientist
RL positions valuable from “infrastructure” side
sustained involvement with user community
allows creation of realistic testbeds
Visitor(s)  Workshop  Report model works
Serious Concern
conceptual chasm between infrastructure builders and
application scientists
RL-type positions help bridge it, but what else?