CDI Controlled Vocabularies
Roy Lowry, Karen Vickers (BODC)
Michele Fichaut, Catherine Maillard (SISMER)
Reinhard Schwabe (DOD)
4 June 2003
Objective
To provide vocabularies to describe what was
measured
Used to restrict CDI search hit count
Vocabulary dynamically generated from
existing data/metadata systems
Therefore, bottom-up design rather than top-down
Scope
CDI requires three vocabularies
Platform
Instrument
Parameter
Platform and instrument vocabularies
developed by DOD
Parameter vocabulary developed by BODC
and SISMER
Platform Vocabulary
GF3 did a fairly good job (except grids!)
Vocabulary based on this
How is this going to be distributed and/or
maintained?
Instrument Vocabulary
Vocabulary describes either sample collection
or in-situ measuring technique
Compatibility with ROSCOP taken into
account
I think we now have an agreed list
Again, how is this to be maintained and
distributed?
Parameter Vocabulary
Strategy to develop a set of parameter
groups derived from data file parameter
codes
Started calling these ‘keywords’ but the word
implies a ‘top-down’ design approach
Settled on the name ‘Agreed Parameter
Groupings’
Parameter Vocabulary
Initial APG set based on BODC and SISMER
dictionaries
Parameter count in each group kept as
uniform as possible
Facilitates a list box interface
Almost succeeded but species-linked
parameters need further work
Further development is possible while the current
groupings are operational
Parameter Vocabulary
36 groupings mapped to the disciplines:
Biology
Chemistry
Physical oceanography
Geology and geophysics
Meteorology and atmospheric chemistry
Multidisciplinary
Discipline independent
Discipline indicated by first byte of code
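The first-byte rule can be sketched in Python as below. This is a minimal illustration, not the BODC implementation: the names DISCIPLINE_BY_PREFIX and discipline_of are hypothetical, and the letter assignments are inferred from the grouping codes listed on the slides that follow (in particular, M for meteorology and atmospheric chemistry and O for multidisciplinary are assumptions drawn from those listings).

```python
# Hypothetical lookup table: first character of an APG code -> discipline.
# Letter assignments are inferred from the grouping codes in this talk.
DISCIPLINE_BY_PREFIX = {
    "B": "Biology",
    "C": "Chemistry",
    "D": "Physical oceanography",
    "G": "Geology and geophysics",
    "M": "Meteorology and atmospheric chemistry",
    "O": "Multidisciplinary",
    "Z": "Discipline independent",
}


def discipline_of(apg_code: str) -> str:
    """Return the discipline implied by the first byte of an APG code."""
    if not apg_code:
        raise ValueError("empty APG code")
    try:
        return DISCIPLINE_BY_PREFIX[apg_code[0].upper()]
    except KeyError:
        raise ValueError(f"unknown discipline prefix in {apg_code!r}") from None
```

For example, discipline_of("B020") yields "Biology". The sketch also shows the weakness raised later in the talk: the discipline cannot change unless the code itself changes.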
Parameter Vocabulary
The groupings are:
Biology
B005 Bacteria and viruses
B015 Birds, mammals and reptiles
B020 Fish
B025 Microzooplankton
B027 Other biological measurements
B030 Phytoplankton
B035 Pigments
B040 Zoobenthos
B045 Zooplankton
Parameter Vocabulary
The groupings are:
Chemistry
C003 Amino acids
C005 Carbon, nitrogen and phosphorus
C010 Carbonate system
C015 Dissolved gases
C017 Fatty acids
C020 Halocarbons (including freons)
C025 Hydrocarbons
C030 Isotopes
C035 Metal concentrations
C040 Nutrients
C045 Other inorganic chemical measurements
C050 Other organic chemical measurements
C055 PCBs and organic micropollutants
Parameter Vocabulary
The groupings are:
Physical oceanography
D005 Acoustics
D010 Currents, sea level and waves
D015 Optical properties
D020 Other physical oceanographic measurements
D025 Sea temperature and salinity
Geology and geophysics
G005 Gravity, magnetics and bathymetry
G010 Sediment properties
G012 Sonar and seismics
G015 Suspended particulate matter
Parameter Vocabulary
The groupings are:
Meteorology and atmospheric chemistry
M005 Atmospheric chemistry
M010 Meteorology
Multidisciplinary
O005 Fluxes
O010 Rate measurements (including production,
excretion and grazing)
Discipline independent
Z005 Administration and dimensions
APG Implementation
Groupings incorporated in BODC Oracle
dictionary
Dynamic web interface including plain text
descriptions to assist group mappings
System is fully dynamic
Problems
Biological entity properties
Needs further subdivision
Further work once BODC dictionary has been mapped to
ITIS
Atmospheric chemistry
Very uncomfortable about this
Think through mappings of atmospheric pCO2
Rename as ‘Other atmospheric gases’ and map to
chemistry?
Further work as BODC/BADC develop common controlled
vocabulary for NERC Data Grid
Problems
Grouping codes
Having discipline defined by first byte is a problem
Remapping a grouping between disciplines (e.g.
chemistry to multidisciplinary) involves recoding
Recoding is an accident waiting to happen
Can we drop this rule and manage
mapping/ordering through explicit fields?
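One way to picture the explicit-field alternative is the sketch below. It is an assumption-laden illustration only: the Grouping record, its field names, the remap helper and the sort_order values are all hypothetical, and the move of C020 to multidisciplinary is an invented example rather than a proposed change.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Grouping:
    """Hypothetical APG record with discipline and ordering held explicitly."""
    code: str        # treated as an opaque, stable identifier
    title: str
    discipline: str  # explicit field, replacing the first-byte rule
    sort_order: int  # explicit display order within the discipline


def remap(grouping: Grouping, discipline: str, sort_order: int) -> Grouping:
    """Move a grouping to another discipline without touching its code."""
    return replace(grouping, discipline=discipline, sort_order=sort_order)


# Illustrative values only; the sort_order numbers are invented.
halocarbons = Grouping("C020", "Halocarbons (including freons)", "Chemistry", 60)
moved = remap(halocarbons, "Multidisciplinary", 30)
```

Here moved.code is still "C020", so data already tagged with that code need no recoding; only the mapping fields change.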
Problems
Multidisciplinary
This is a ‘catch-all’ that dilutes search
effectiveness
Necessary because discipline to APG mapping is
a simple one-to-many
Could be replaced by a many-to-many mapping
Implications need to be considered for non-BODC
systems and CDI interface design
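A many-to-many mapping could be held as a simple link table, as in the sketch below. The LINKS rows and the build_indexes helper are hypothetical; in particular, which disciplines each grouping would be linked to is invented here purely to show the shape of the data.

```python
from collections import defaultdict

# Hypothetical link table of (discipline, APG code) pairs.
# The assignments are illustrative, not an agreed mapping.
LINKS = [
    ("Biology", "O010"),
    ("Chemistry", "O010"),
    ("Chemistry", "O005"),
    ("Physical oceanography", "O005"),
    ("Chemistry", "C040"),
]


def build_indexes(links):
    """Build lookups in both directions from the link table."""
    by_discipline = defaultdict(set)
    by_grouping = defaultdict(set)
    for discipline, code in links:
        by_discipline[discipline].add(code)
        by_grouping[code].add(discipline)
    return by_discipline, by_grouping


by_discipline, by_grouping = build_indexes(LINKS)
```

With this shape a grouping such as O010 appears under every discipline it is linked to, so a separate catch-all discipline is no longer required; the cost is that any interface assuming one discipline per grouping would need to change.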