Powering the Virtual Observatory


talk structure
• who are we ?
• what is a VO ?
• what are the challenges ?
• what is an e-project ?
Andy Lawrence, Garching, June 2002
AstroGrid
http://www.astrogrid.org.uk
Belfast
Cambridge
Edinburgh
Jodrell
Leicester
MSSL
RAL
AstroGrid
http://www.astrogrid.org.uk
Optical
Infrared
X-ray
Radio
Solar
Space Plasma
people
• PL = Andy Lawrence
• PS = Nic Walton
• PM = Tony Linde
slogans
• the archive is the sky (stolen from US-NVO)
• everybody can be a power user
• shift the results not the data
• a supercomputer on your desk
goals
• A working datagrid for key UK databases
– machines, middleware and user interface
• tools for simultaneous browsing
– plus advanced visualisation, links to spectra etc.
• tools for on-line data analysis
– complex queries, statistics, model fitting, cluster analysis etc
• system for uploading code
• resource discovery method
the project
• PPARC funded project Sept 2001-2004
– approx 6Meuro
• Grid technology development programme
– part of UK e-Science programme
– links to GridPP, MyGrid (Bio), OGSA (Globus)
• Stepping stone to Virtual Observatory
– partner in AVO
– working links with US-VO
methodology
• using Unified Process / UML
• architecture driven
• use case centric
• iterative
• open project approach
– open collaboration interactive web sites
– code sharing intended
– open source coding under consideration
status
• elaboration phase two-thirds finished
– science problems / use cases / architecture
– technology assessment reports
– collaborative web pages set up
• construction phase begins Sept 2002
• continuing R&D stream
– contribution to AVO
– collaboration with OGSA project
AstroGrid & AVO
• AstroGrid partner in AVO
– subset of work credited as AVO work
• CDS = V0.2
• demos early 2003 = V0.5
• AstroGrid late 2004 = V1
– fully working but functions limited as necessary
• AVO Phase B 2004+ = V2
collectivisation and democratisation
• thirty year trend towards communal organisation
– facility class (common-user) instruments
– central development of data reduction s/w
– calibrated archives with simple tools
– information services (Vizier, ADS, NED)
– large consortium projects (MACHO, 2dF, SDSS, UKIDSS, VISTA...)
• next steps
– inter-operable archives (joint queries)
– communal exploration and analysis tools (data mining)
– automated resource discovery (registry)
the Virtual Observatory concept
• Aim to make all archives speak the same language
– all searchable and analysable by the same tools
– all data sources accessible through a uniform interface
– all data held in distributed databases that appear as one
– archives form the Digital Sky
– eventual interface to real observatories
the archive is the sky
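
As a rough illustration of the "uniform interface" and "distributed databases that appear as one" points above, the following Python sketch puts two toy in-memory archives behind a single search method. Everything here is hypothetical (the class names, the `search_box` method, the sample sources); it is not taken from AstroGrid or any VO standard, and the positional match is a crude flat-sky box rather than a true cone search.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Source:
    """One catalogue entry (hypothetical minimal schema)."""
    name: str
    ra_deg: float
    dec_deg: float
    waveband: str


class InMemoryArchive:
    """Stands in for one data centre's archive."""

    def __init__(self, sources: Iterable[Source]) -> None:
        self._sources = list(sources)

    def search_box(self, ra_deg: float, dec_deg: float, half_width_deg: float) -> List[Source]:
        # Flat-sky box test: adequate for a toy example, not for real astrometry.
        return [
            s for s in self._sources
            if abs(s.ra_deg - ra_deg) <= half_width_deg
            and abs(s.dec_deg - dec_deg) <= half_width_deg
        ]


class FederatedArchive:
    """Presents several archives through the same interface as a single one."""

    def __init__(self, archives: Iterable[InMemoryArchive]) -> None:
        self._archives = list(archives)

    def search_box(self, ra_deg: float, dec_deg: float, half_width_deg: float) -> List[Source]:
        results: List[Source] = []
        for archive in self._archives:
            results.extend(archive.search_box(ra_deg, dec_deg, half_width_deg))
        return results


# Made-up sample data for two wavebands.
optical = InMemoryArchive([
    Source("opt-1", 10.0, 20.0, "optical"),
    Source("opt-2", 50.0, -5.0, "optical"),
])
xray = InMemoryArchive([Source("x-1", 10.02, 20.01, "x-ray")])

sky = FederatedArchive([optical, xray])
hits = sky.search_box(10.0, 20.0, 0.05)
hit_names = [s.name for s in hits]
```

The caller issues one query and does not need to know how many archives sit behind it, which is the sense in which the archives "appear as one".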
the Grid concept
• shared managed distributed resources
– documents + data + software + storage + cycles + expertise
• network : ability to pass messages
• web : transparent document system
• computational grid : transparent CPU
• datagrid : transparent data access and services
• information grid, knowledge grid ... ?
• Virtual Organisations ?
a supercomputer on your desktop
everybody can be a power user
obstacles to overcome
• sociology
• internet technology
• i/o bottleneck
• network bottleneck
obstacles to overcome (1)
• sociology
– need agreed formats for data, metadata, provenance
– need standardised semantics ("ontology")
• internet technology
– need protocols for publishing and exchanging data
– need registry for publishing service availability and semantics
– need method of transmitting authentication/authorisation
– need methods for managing distributed resources
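
The registry requirement above can be pictured with a minimal Python sketch in which data centres publish a record describing what they offer and a client discovers matching services. The record fields, the capability names, and the `example.org` endpoints are all assumptions made for illustration; no real registry schema is implied.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List


@dataclass(frozen=True)
class ServiceRecord:
    """What a data centre publishes about one service (hypothetical fields)."""
    name: str
    endpoint: str                 # placeholder URL, not a real service
    wavebands: FrozenSet[str]
    capabilities: FrozenSet[str]


class Registry:
    """Toy registry: publish records, then discover them by metadata."""

    def __init__(self) -> None:
        self._records: Dict[str, ServiceRecord] = {}

    def publish(self, record: ServiceRecord) -> None:
        # Re-publishing under the same name replaces the earlier record.
        self._records[record.name] = record

    def discover(self, waveband: str, capability: str) -> List[ServiceRecord]:
        matches = [
            r for r in self._records.values()
            if waveband in r.wavebands and capability in r.capabilities
        ]
        return sorted(matches, key=lambda r: r.name)


registry = Registry()
registry.publish(ServiceRecord(
    "optical-survey", "http://example.org/optical",
    frozenset({"optical"}), frozenset({"catalogue-query"}),
))
registry.publish(ServiceRecord(
    "xray-archive", "http://example.org/xray",
    frozenset({"x-ray"}), frozenset({"catalogue-query", "spectra"}),
))
registry.publish(ServiceRecord(
    "multi-band", "http://example.org/multi",
    frozenset({"optical", "infrared"}), frozenset({"catalogue-query", "image-cutout"}),
))

found = [r.name for r in registry.discover("optical", "catalogue-query")]
```

The lookup only works because every record uses the same vocabulary for wavebands and capabilities, which is why the agreed semantics listed under "sociology" matter as much as the protocol.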
obstacles to overcome (2)
• i/o bottleneck
– need database supercomputers
– need innovative search and analysis algorithms
• network bottleneck
– data centres must provide analysis service
– facility class analysis code needed
shift the results not the data
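
A small Python sketch of "shift the results not the data": the same statistic is obtained either by downloading every measurement and computing locally, or by asking the data centre to compute it and return only the answer. The `DataCentre` class and its methods are hypothetical, the fluxes are synthetic, and JSON string length is used only as a stand-in for network traffic.

```python
import json
import statistics
from typing import Iterable


class DataCentre:
    """Toy data centre holding a large table of flux measurements."""

    def __init__(self, fluxes: Iterable[float]) -> None:
        self._fluxes = list(fluxes)

    def download_all(self) -> str:
        # "Shift the data": serialise every measurement for the client.
        return json.dumps(self._fluxes)

    def mean_flux(self) -> str:
        # "Shift the results": analyse at the data centre, send a summary only.
        return json.dumps({
            "n": len(self._fluxes),
            "mean": statistics.fmean(self._fluxes),
        })


# Synthetic measurements: the values 0..99 repeated 1000 times each.
centre = DataCentre(float(i % 100) for i in range(100_000))

bulk = centre.download_all()
summary = centre.mean_flux()

local_mean = statistics.fmean(json.loads(bulk))   # computed on the client
remote_mean = json.loads(summary)["mean"]         # computed at the data centre

bytes_if_data_shipped = len(bulk)
bytes_if_result_shipped = len(summary)
```

Both routes give the same mean, but the second sends a few dozen characters instead of the whole table, which is the motivation for data centres providing an analysis service.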
grid geometry needed
• not P2P (like Napster)
• not a hierarchy (like LHC grid)
• service providers + users (like most commerce)
– some unplanned open use
– some registered use
– variety of access rights
two rivers
• academic / Globus
– remote log on
– identity/authentication/authorisation
– resource management
• commercial / W3C
– exchange of data (B2B)
– service description and publication
– "Web services" = XML + SOAP + WSDL (cf GLU)
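
To make the "XML + SOAP" part of the Web services line concrete, here is a hedged Python sketch that builds a SOAP 1.1 request envelope with the standard library. The envelope namespace is the standard SOAP 1.1 one; the service namespace, the `ConeSearch` operation, and its parameters are hypothetical and do not describe any actual AstroGrid or GLU interface. In a real deployment a WSDL document would describe the operation; none is shown here.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
ARCHIVE_NS = "http://example.org/archive"               # hypothetical service namespace


def build_request(ra_deg: float, dec_deg: float, radius_deg: float) -> bytes:
    """Return a SOAP envelope carrying a hypothetical ConeSearch request."""
    ET.register_namespace("soap", SOAP_ENV)
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    query = ET.SubElement(body, f"{{{ARCHIVE_NS}}}ConeSearch")
    for tag, value in (("ra", ra_deg), ("dec", dec_deg), ("radius", radius_deg)):
        ET.SubElement(query, f"{{{ARCHIVE_NS}}}{tag}").text = str(value)
    return ET.tostring(envelope, encoding="utf-8")


message = build_request(10.5, -20.25, 0.1)

# Parse the message back, as a receiving service would.
root = ET.fromstring(message)
parsed_body = root.find(f"{{{SOAP_ENV}}}Body")
parsed_query = parsed_body.find(f"{{{ARCHIVE_NS}}}ConeSearch")
parsed_ra = parsed_query.find(f"{{{ARCHIVE_NS}}}ra").text
```

Even this three-parameter request is wrapped in several layers of namespaced markup, which gives a feel for the "bulky" complaint raised against Web services later in the talk.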
Globus problems
• only half works
• data transfer primitive
– flat files only
• not a services-user structure
Web service problems
• one-to-one
• bulky
• no general auth/auth solution
• no accepted service registry solution
• no ontology solution
solutions ?
• Grid Services : OGSA project
– Globus + IBM + more
• Database access : OGSA-DAI project
– UK e-science programme
– MyGrid and AstroGrid = early adopters
• need Astro-registry solution
• need Astro-ontology service (cf UCDs)
• need Astro-QL
e-project
• open, interactive, collaborative web sites
– reading open to all
– registered users can post/edit items
• AstroGrid News
• AstroGrid Forum
• AstroGrid Wiki
– shared documents
– shared code
– collaborative development
FIN
Web DB access today
[diagram labels: DB engine; SQL; data; front end; CGI request; html; browser; web page; user]
Web service
[diagram labels: native data; web service; XML data; application; DB engine; SQL; XML request; anything; user]
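
One way to read the "Web service" picture is sketched below in Python: a thin service layer accepts an XML request, turns it into SQL against the database engine, and returns XML data that any application can consume, rather than HTML meant for a browser. The table layout, the `<query>`/`<maxMag>` request format, and the sample rows are assumptions for illustration; SQLite stands in for the archive's database engine.

```python
import sqlite3
import xml.etree.ElementTree as ET


def make_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with a few made-up sources."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sources (name TEXT, ra_deg REAL, dec_deg REAL, mag REAL)"
    )
    conn.executemany(
        "INSERT INTO sources VALUES (?, ?, ?, ?)",
        [
            ("A", 10.0, 20.0, 12.5),
            ("B", 11.0, 21.0, 14.0),
            ("C", 12.0, 22.0, 18.2),
        ],
    )
    return conn


def handle_request(conn: sqlite3.Connection, request_xml: str) -> str:
    """XML request in, SQL against the engine, XML data out."""
    request = ET.fromstring(request_xml)
    max_mag = float(request.findtext("maxMag"))
    rows = conn.execute(
        "SELECT name, ra_deg, dec_deg, mag FROM sources "
        "WHERE mag <= ? ORDER BY mag",
        (max_mag,),  # bound parameter, never string-formatted into the SQL
    ).fetchall()
    response = ET.Element("sources")
    for name, ra_deg, dec_deg, mag in rows:
        ET.SubElement(
            response, "source",
            name=name, ra=str(ra_deg), dec=str(dec_deg), mag=str(mag),
        )
    return ET.tostring(response, encoding="unicode")


conn = make_demo_db()
reply = handle_request(conn, "<query><maxMag>15.0</maxMag></query>")
reply_names = [el.get("name") for el in ET.fromstring(reply)]
```

Because the reply is structured data rather than a rendered page, the consumer can be a script, another service, or a desktop tool, which matches the "anything" label on the client side of the diagram.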