Interoperability of astronomy data bases Françoise Genova, CDS


Interoperability of
astronomy data bases
Françoise Genova, CDS
F. Genova, VO as a Data Grid, 2003/06/30
Astronomy
• A small discipline
• Few commercial constraints
• A long-term partnership to define exchange standards and links
FITS: early data exchange format
On-line information in astronomy: from observations to results
Object name >> position
An early example of interoperability: the name resolver
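The name-to-position idea can be illustrated with a minimal sketch. The lookup tables, the function names `normalize` and `resolve_name`, and the rounded J2000 coordinates are assumptions made for this example; an actual resolver such as SIMBAD queries a curated database of identifiers rather than a hard-coded dictionary.

```python
import re
from typing import Optional, Tuple

# Hypothetical local table: normalized name -> (RA, Dec) in degrees (J2000, rounded).
_POSITIONS = {
    "M31": (10.68, 41.27),
    "VEGA": (279.23, 38.78),
}

# Hypothetical alias table: several names can designate the same object.
_ALIASES = {
    "NGC224": "M31",
    "ALPHALYR": "VEGA",
}


def normalize(name: str) -> str:
    """Upper-case the name and drop whitespace so 'm 31' and 'M31' match."""
    return re.sub(r"\s+", "", name).upper()


def resolve_name(name: str) -> Optional[Tuple[float, float]]:
    """Return (RA, Dec) in degrees for a known object name, else None."""
    key = normalize(name)
    key = _ALIASES.get(key, key)  # map an alias to its primary name
    return _POSITIONS.get(key)
```

Calling `resolve_name("NGC 224")` and `resolve_name("m 31")` returns the same position, which is the property that lets independent services link to each other through object names.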
Evolving world
- Web Services
- Very large surveys
Simbad Name Resolver >> Sesame Web Service
Web Services as building blocks
The astronomy bibliographic network
Links between
– electronic journals
– the ADS bibliographic database
– on-line services (SIMBAD, NED)
– … archival data/published results
bibcode, e.g. 1999A&A...351.1003G
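The bibcode works as a link key because it is a fixed-width, 19-character string. A minimal sketch of splitting it into fields follows, assuming the standard layout YYYYJJJJJVVVVMPPPPA (year, journal, volume, qualifier, page, first-author initial, padded with dots); the function name `parse_bibcode` is hypothetical.

```python
from typing import Dict


def parse_bibcode(bibcode: str) -> Dict[str, str]:
    """Split a 19-character bibcode into its fixed-width fields.

    Assumed layout: YYYYJJJJJVVVVMPPPPA, with '.' used as padding.
    """
    if len(bibcode) != 19:
        raise ValueError(f"expected 19 characters, got {len(bibcode)}")
    return {
        "year": bibcode[0:4],
        "journal": bibcode[4:9].strip("."),    # left-justified, dot-padded
        "volume": bibcode[9:13].strip("."),    # right-justified, dot-padded
        "qualifier": bibcode[13].strip("."),   # empty when unused
        "page": bibcode[14:18].strip("."),
        "author_initial": bibcode[18],
    }
```

For the example on the slide, the fields come out as year 1999, journal A&A, volume 351, page 1003, and author initial G.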
References in an on-line article
→ links to ADS
→ quality check
ADS
- Abstracts
- Scanned articles
- Links to the original on-line paper
- Links to other, distributed information, e.g. original observations
- Also-read articles
ADS
From bibliography to the original observations
Search an author’s publications
HST archive
From observation to publication
A&A on line
From publication to SIMBAD and more
SIMBAD
Information about the object
Links to high-energy observations in HEASARC, to catalogues
Links from object names in journals, an early example: IBVS
Links to images and data
GCVS
Lessons learnt
• De facto standards
SIMBAD/NED, ADS, journals, archives
• Cooperation between all the actors + snowball effect
Journals, ADS, data centres, archive centres
• The community is trained (everyday tools)
• Links are easy to build, but contents and validation are fundamental
The role of experts remains fundamental for building value-added services
Data federation
Tabular data in astronomy
A common description for tabular data
– Reference catalogues
– Published tables
– Surveys
– Catalogues of observations in archives
ReadMe: physical organization, contents
ReadMe
Tables published in articles are usable data!
Links to observational data (images, spectra, time series), often distributed in observatory archives
A homogenized view of heterogeneous information, with links to data
Towards data integration
• XML
• VOTable > Roy’s talk
HEASARC Browse
Astrores in action!
With a touch of GLU
Générateur de Liens Uniformes (Uniform Link Generator)
• Resource dictionary (shared, distributed, hierarchical name space, clone management…), knows the query syntax
• Resolver: symbolic name > URL
• An early registry prototype with many of the required functionalities (conversions, failure tests, …)
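The resolver idea (symbolic name > URL, with clone management and failure tests) can be sketched as follows. This is not GLU's actual syntax or API: the dictionary layout, the resource name `name.resolver`, the `example.org` URLs, and the `is_alive` callback are all hypothetical and only illustrate the principle.

```python
from typing import Callable, Dict, List
from urllib.parse import quote

# Hypothetical resource dictionary: symbolic name -> URL templates of its clones.
# Each template encodes the query syntax of the service it points to.
RESOURCES: Dict[str, List[str]] = {
    "name.resolver": [
        "https://mirror1.example.org/resolve?name={arg}",
        "https://mirror2.example.org/resolve?name={arg}",
    ],
}


def resolve(symbolic_name: str, arg: str,
            is_alive: Callable[[str], bool] = lambda url: True) -> str:
    """Return the URL of the first clone that passes the availability test."""
    templates = RESOURCES.get(symbolic_name)
    if templates is None:
        raise KeyError(f"unknown resource: {symbolic_name}")
    for template in templates:
        url = template.format(arg=quote(arg))  # URL-encode the argument
        if is_alive(url):
            return url
    raise RuntimeError(f"no clone available for {symbolic_name}")
```

Because callers only know the symbolic name, a service can move or gain a mirror without any change on the client side; only the dictionary is updated.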
Observations from European archives
Inclusive: NASA missions (HST, Chandra), surveys, …
GLU tag resolver
Symbolic name >> URL
Data mining: the Uniform Content Descriptors
• A set of UCDs was first developed in the framework of the ESO/CDS Data Mining project to describe VizieR catalogue columns (100,000 columns, 1,300 UCDs)
• A name for each concept
• Assigned semi-automatically using column label, description and unit
• Used e.g. to check the coherence of information in tables
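A toy version of semi-automatic assignment is sketched below: keyword rules propose a UCD from the column label, description and unit, and anything unmatched is left for manual review. The rules, the function name `suggest_ucd`, and the choice of UCD strings (written in the style of the first-generation UCD names) are assumptions for illustration; the slides do not describe the actual assignment procedure.

```python
from typing import List, Optional, Tuple

# Hypothetical rules: (keywords expected in label/description, required unit, UCD).
# A required unit of None means "any unit is accepted".
RULES: List[Tuple[Tuple[str, ...], Optional[str], str]] = [
    (("right ascension",), "deg", "POS_EQ_RA_MAIN"),
    (("declination",), "deg", "POS_EQ_DEC_MAIN"),
    (("v magnitude",), "mag", "PHOT_JHN_V"),
]


def suggest_ucd(label: str, description: str, unit: str) -> Optional[str]:
    """Propose a UCD for a column, or None if a human should look at it."""
    text = f"{label} {description}".lower()
    for keywords, required_unit, ucd in RULES:
        keywords_match = all(k in text for k in keywords)
        unit_matches = required_unit in (None, unit)
        if keywords_match and unit_matches:
            return ucd
    return None  # no confident match: leave for manual assignment
```

A column described as a right ascension but carrying an unexpected unit gets no suggestion here, which is one simple way such a scheme can flag incoherent table metadata.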
UCD browser
UCD+units: conversion, selection
Select catalogues & target
Join on UCDs
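Joining on UCDs means matching columns by the concept they carry rather than by their label, then converting units so the values are comparable. A minimal sketch under stated assumptions: the two small catalogues, their labels and values, the conversion table, and the function names are all made up for this example.

```python
from typing import Dict, List, Set, Tuple

Column = Tuple[str, str, List[float]]  # (UCD, unit, values)

# How many of each unit make one degree; only the units this example needs.
UNITS_PER_DEGREE = {"deg": 1.0, "arcmin": 60.0, "arcsec": 3600.0}

# Two hypothetical catalogues with different labels and units for the same concept.
CAT_A: Dict[str, Column] = {
    "DEJ2000": ("POS_EQ_DEC_MAIN", "deg", [41.25]),
    "Vmag": ("PHOT_JHN_V", "mag", [3.4]),
}
CAT_B: Dict[str, Column] = {
    "dec": ("POS_EQ_DEC_MAIN", "arcmin", [2475.0]),
}


def shared_ucds(cat_a: Dict[str, Column], cat_b: Dict[str, Column]) -> Set[str]:
    """UCDs present in both catalogues: the candidate join columns."""
    return {c[0] for c in cat_a.values()} & {c[0] for c in cat_b.values()}


def values_in_degrees(catalogue: Dict[str, Column], ucd: str) -> List[float]:
    """Select the column tagged with `ucd` and convert its values to degrees."""
    for _label, (col_ucd, unit, values) in catalogue.items():
        if col_ucd == ucd:
            return [v / UNITS_PER_DEGREE[unit] for v in values]
    raise KeyError(ucd)
```

Here the columns `DEJ2000` and `dec` are recognised as the same quantity through their shared UCD, and the unit attached to each column is what makes the conversion possible.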
Colour excess characterizes carbon stars
http://cdsweb.u-strasbg.fr/UCD/
Extension of UCDs
• SDSS: 1,500 columns
– Manual verification needed
– Very few additional UCDs
• On-going:
– UCDs for VOX
– UCDs for IDHA data model
Add ‘Metadata’ branch
… FITS keywords are often not accurate
Validation, maintenance
• A new structure
• Prune the existing UCD tree
• UCD steering group
– Proposal in October 2003
– UCD V1.0
– Evolution mechanism
The future
• Other knowledge bases
– List of object names
– Journal keywords
– Thesaurus (built by librarians)…
Astronomy described from different points of view
Converging towards an ontology of astronomy?