CDI Controlled Vocabularies

Roy Lowry, Karen Vickers (BODC)
Michele Fichaut, Catherine Maillard (SISMER)
Reinhard Schwabe (DOD)
4 June 2003
Objective

To provide vocabularies to describe what was measured
Used to restrict CDI search hit count
Vocabulary dynamically generated from existing data/metadata systems
Therefore, bottom-up design rather than top-down
Scope

CDI requires three vocabularies
  Platform
  Instrument
  Parameter
Platform and instrument vocabularies developed by DOD
Parameter vocabulary developed by BODC and SISMER
Platform Vocabulary

GF3 did a fairly good job (except grids!)
Vocabulary based on this
How is this going to be distributed and/or maintained?
Instrument Vocabulary

Vocabulary describes either sample collection or in-situ measuring technique
Compatibility with ROSCOP taken into account
I think we now have an agreed list
Again, how is this to be maintained and distributed?
Parameter Vocabulary

Strategy to develop a set of parameter groups derived from data file parameter codes
Started calling these ‘keywords’ but the word implies a ‘top-down’ design approach
Settled on the name ‘Agreed Parameter Groupings’ (APG)
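The bottom-up strategy can be pictured as follows: the groupings offered to a user are whatever the parameter codes in the data files map to. This is only a minimal sketch. The parameter codes (P_TEMP and so on) and the function name are hypothetical placeholders, not entries from the BODC or SISMER dictionaries; only the grouping codes come from the slides.

```python
# Hypothetical parameter codes mapped to grouping codes from the slides.
PARAMETER_TO_GROUP = {
    "P_TEMP": "D025",        # Sea temperature and salinity
    "P_SALT": "D025",        # Sea temperature and salinity
    "P_NITRATE": "C040",     # Nutrients
    "P_FISH_COUNT": "B020",  # Fish
}


def groups_in_use(file_parameter_codes):
    """Return the sorted grouping codes covered by the parameter codes given.

    Codes missing from the dictionary are skipped rather than guessed.
    """
    return sorted(
        {
            PARAMETER_TO_GROUP[code]
            for code in file_parameter_codes
            if code in PARAMETER_TO_GROUP
        }
    )
```

Because the list is derived from the data, a grouping with no data behind it never appears in the search interface.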
Parameter Vocabulary

Initial APG set based on BODC and SISMER dictionaries
Parameter count in each group kept as uniform as possible
Facilitates a list box interface
Almost succeeded but species-linked parameters need further work
Further development possible with current groupings operational
Parameter Vocabulary

36 groupings mapped to the disciplines:
  Biology
  Chemistry
  Physical oceanography
  Geology and geophysics
  Meteorology and atmospheric chemistry
  Multidisciplinary
  Discipline independent
Discipline indicated by first byte of code
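The first-byte rule can be sketched as a simple lookup. The prefix letters below are read off the grouping codes listed on the following slides (B005, C040, D025 and so on). The table and function names are hypothetical, not part of the BODC dictionary.

```python
# Prefix letters inferred from the grouping codes on the slides.
DISCIPLINE_BY_PREFIX = {
    "B": "Biology",
    "C": "Chemistry",
    "D": "Physical oceanography",
    "G": "Geology and geophysics",
    "M": "Meteorology and atmospheric chemistry",
    "O": "Multidisciplinary",
    "Z": "Discipline independent",
}


def discipline_of(apg_code: str) -> str:
    """Return the discipline implied by the first character of an APG code."""
    if not apg_code:
        raise ValueError("empty APG code")
    prefix = apg_code[0].upper()
    if prefix not in DISCIPLINE_BY_PREFIX:
        raise ValueError(f"unknown discipline prefix in {apg_code!r}")
    return DISCIPLINE_BY_PREFIX[prefix]
```

For example, discipline_of("G012") gives "Geology and geophysics" without any further lookup. That convenience is also why moving a grouping to another discipline forces a change of code.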
Parameter Vocabulary

The groupings are:

Biology
  B005 Bacteria and viruses
  B015 Birds, mammals and reptiles
  B020 Fish
  B025 Microzooplankton
  B027 Other biological measurements
  B030 Phytoplankton
  B035 Pigments
  B040 Zoobenthos
  B045 Zooplankton
Parameter Vocabulary

The groupings are:

Chemistry
  C003 Amino acids
  C005 Carbon, nitrogen and phosphorus
  C010 Carbonate system
  C015 Dissolved gases
  C017 Fatty acids
  C020 Halocarbons (including freons)
  C025 Hydrocarbons
  C030 Isotopes
  C035 Metal concentrations
  C040 Nutrients
  C045 Other inorganic chemical measurements
  C050 Other organic chemical measurements
  C055 PCBs and organic micropollutants
Parameter Vocabulary

The groupings are:

Physical oceanography
  D005 Acoustics
  D010 Currents, sea level and waves
  D015 Optical properties
  D020 Other physical oceanographic measurements
  D025 Sea temperature and salinity
Geology and geophysics
  G005 Gravity, magnetics and bathymetry
  G010 Sediment properties
  G012 Sonar and seismics
  G015 Suspended particulate matter
Parameter Vocabulary

The groupings are:

Meteorology and atmospheric chemistry
  M005 Atmospheric chemistry
  M010 Meteorology
Multidisciplinary
  O005 Fluxes
  O010 Rate measurements (including production, excretion and grazing)
Discipline independent
  Z005 Administration and dimensions
APG Implementation

Groupings incorporated in BODC Oracle dictionary
Dynamic web interface including plain text descriptions to assist group mappings
System is fully dynamic
Problems

Biological entity properties
  Needs further subdivision
  Further work once BODC dictionary has been mapped to ITIS
Atmospheric chemistry
  Very uncomfortable about this
  Think through mappings of atmospheric pCO2
  Rename as ‘Other atmospheric gases’ and map to chemistry?
  Further work as BODC/BADC develop common controlled vocabulary for NERC Data Grid
Problems

Grouping codes
  Having discipline defined by first byte is a problem
  Remapping a grouping between disciplines (e.g. chemistry to multidisciplinary) involves recoding
  Recoding is an accident waiting to happen
  Can we drop this rule and manage mapping/ordering through explicit fields?
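One way to picture the "explicit fields" option is a record in which the code is an opaque identifier, with discipline and ordering stored separately. This is only a sketch under assumptions. The field names, the remap helper and the example move of C015 are hypothetical illustrations, not a description of the BODC dictionary schema.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Grouping:
    code: str        # opaque, stable identifier; never parsed for meaning
    title: str
    discipline: str  # explicit field instead of the first-byte rule
    sort_order: int  # explicit ordering within the discipline


def remap(grouping: Grouping, new_discipline: str, new_sort_order: int) -> Grouping:
    """Move a grouping to another discipline while keeping its code unchanged."""
    return replace(grouping, discipline=new_discipline, sort_order=new_sort_order)


# Hypothetical remapping from chemistry to multidisciplinary.
dissolved_gases = Grouping("C015", "Dissolved gases", "Chemistry", 40)
moved = remap(dissolved_gases, "Multidisciplinary", 30)
```

After the move the record still carries the code C015, so existing references to it stay valid and no recoding is needed.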
Problems

Multidisciplinary
  This is a ‘catch-all’ that dilutes search effectiveness
  Necessary because discipline to APG mapping is simple one to many
  Could be replaced by a many to many mapping
  Implications need to be considered for non-BODC systems and CDI interface design
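A many to many mapping can be sketched as a list of (discipline, grouping) links indexed in both directions. The links below are hypothetical. The slides do not say which disciplines a grouping such as O005 Fluxes would be attached to, and the names used are illustrative only.

```python
from collections import defaultdict

# Hypothetical links; one grouping may appear under several disciplines.
LINKS = [
    ("Chemistry", "C015"),
    ("Chemistry", "O005"),
    ("Physical oceanography", "O005"),
    ("Physical oceanography", "D025"),
]


def build_indexes(links):
    """Index (discipline, grouping code) pairs in both directions."""
    by_discipline = defaultdict(set)
    by_grouping = defaultdict(set)
    for discipline, code in links:
        by_discipline[discipline].add(code)
        by_grouping[code].add(discipline)
    return by_discipline, by_grouping


by_discipline, by_grouping = build_indexes(LINKS)
```

Here O005 is reachable from both Chemistry and Physical oceanography, so no separate catch-all discipline is needed to hold it. The cost is that an interface can no longer assume each grouping has exactly one parent discipline.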