Data at risk of being lost
Knowledge base for growth and innovation in
ocean economy: assembly and dissemination of
marine data for seabed mapping
LOT 5 – BIOLOGY
WP4 Data archaeology and rescue
Within the GODAR (Global Oceanographic Data Archaeology and Rescue) project, the
scientific community agreed on definitions of the terms “data at risk of being
lost”, “data archaeology” and “rescue”, as follows:
"Data at risk of being lost":
(i) Held on a medium which will degrade physically in the coming years, or
(ii) Held on an electronic medium or in a format which will not be readable in the
coming years, or
(iii) Held by individuals or institutions which will not be able to keep together the
data itself and related metadata, or
(iv) Held on a medium which makes public access difficult in spite of its relevance for
today's research on global change.
"Data Archaeology" is the term used to describe the process of seeking out,
restoring, evaluating, correcting and interpreting historical datasets.
"Rescue" refers to the effort to save data at risk from being lost to the science
community.
The overall objective of Work Package 4 is to fill the
spatial and temporal gaps in EMODnet data availability by
implementing data archaeology and rescue activities.
Specific objectives:
• To identify historical data that are at risk of being lost
and mobilize the human resources for their archaeology
and rescue.
• To run a framework of small grants for their digitization,
standardization and quality control.
• To propose a mechanism for the networking of the
supporting community to ensure continuous inflow of
datasets in the future.
IBSS + HCMR, GBIF, VLIZ
Focus: Black Sea and Eastern Mediterranean Seas
(but will not be limited to this region)
Activity: gap analyses and inventory of historical data that are at risk of
being lost
[Figure: zooplankton data in EurOBIS/EMODnet, panels labelled 1976, 1986 and 1983]
• Marine biologists and other potential data holders in the
region will be contacted in order to provide the relevant
information.
• Search in the archives of the marine biology centers and
institutes will be performed in order to locate the
potential resources either in paper or electronic format.
(mainly IBSS)
To be discussed: Black Sea – IBSS, Mediterranean – HCMR (?), GBIF – (?)
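As an illustration of the gap analysis, the following is a minimal sketch, assuming each inventoried record can be reduced to a (region, year) pair. The function name, the sample records and the target period are hypothetical and are not taken from the EMODnet or EurOBIS holdings.

```python
from collections import defaultdict


def find_temporal_gaps(records, start_year, end_year):
    """Return, per region, the years in [start_year, end_year] with no records."""
    covered = defaultdict(set)
    for region, year in records:
        covered[region].add(year)
    wanted = range(start_year, end_year + 1)
    return {
        region: [y for y in wanted if y not in years]
        for region, years in covered.items()
    }


# Hypothetical sample for illustration only (not real holdings).
sample = [("Black Sea", 1976), ("Black Sea", 1983), ("Black Sea", 1986)]
gaps = find_temporal_gaps(sample, 1976, 1986)
print(gaps)
```

A region that has no records at all never appears in the result, so a real inventory would also need an explicit list of target regions.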
Activities: running the small grant system
A specialized working group will be organized by IBSS, HCMR and the
coordination board to evaluate datasets proposed for digitization and
standardization during the first phase.
A short-term, small-scale grant system will be implemented in order to
provide resources for the digitization, standardization and online publication
of new datasets from the Black Sea and Eastern Mediterranean.
Budget: to be discussed
EMODNET Biology preparatory actions
Phase II
9 individuals (scientists) and 3 digitizers
Phase I
Activities: running the small grant system
The digitization process will be divided into the following steps:
1. analysis of the dataset, to make sure that it is worth digitizing;
2. review of the existing databases to find out whether the dataset has already
been digitized;
3. if the dataset has already been digitized, a check of the completeness of the
digitized data and metadata fields to identify necessary additions;
4. digitization of all metadata information;
5. digitization of all information from the paper source without any deletion or
reduction;
6. use of external sources (cruise reports, publications, etc.) to find all
additional metadata that can be attached to the samples: responsible
persons, instruments, identification guides used, sampling and preservation
procedures, etc.;
…
7. loading of the raw data into a database table (all digitized fields should be
placed in the database without any changes or losses); the information in this
table should not be changed in the future;
8. scanning (or photographing) the paper holding the data and attaching the
image to the digitized data;
9. transfer of the primary data into the agreed format;
10. quality control;
11. dataset documentation;
12. submission of the files for loading into EMODnet.
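Steps 7, 9 and 10 can be sketched as follows. This is only an illustration under stated assumptions: the raw record is kept read-only, the standardized copy is derived from it, and quality control runs on the copy. The field names (modelled loosely on Darwin Core terms, which the slides do not name), the helper functions, the file name and the sample values are all hypothetical.

```python
from dataclasses import dataclass
from types import MappingProxyType


@dataclass(frozen=True)
class RawRecord:
    """One digitized row, stored exactly as transcribed (step 7)."""
    source_page: str            # reference to the scanned page (step 8)
    fields: MappingProxyType    # read-only view of the transcribed values


def load_raw(source_page, transcribed):
    # Copy the input so later edits to the caller's dict cannot leak in.
    return RawRecord(source_page, MappingProxyType(dict(transcribed)))


def to_agreed_format(raw):
    """Step 9: derive a standardized copy; the raw record is left untouched."""
    f = raw.fields
    return {
        "scientificName": f["species"].strip(),
        "year": int(f["year"]),
        "decimalLatitude": float(f["lat"]),
        "decimalLongitude": float(f["lon"]),
        "sourcePage": raw.source_page,
    }


def quality_control(rec):
    """Step 10: return a list of problems (an empty list means no problem found)."""
    problems = []
    if not rec["scientificName"]:
        problems.append("missing taxon name")
    if not -90 <= rec["decimalLatitude"] <= 90:
        problems.append("latitude out of range")
    if not -180 <= rec["decimalLongitude"] <= 180:
        problems.append("longitude out of range")
    return problems


# Hypothetical transcribed row, invented for illustration.
raw = load_raw(
    "journal_1952_p017.jpg",
    {"species": " Skeletonema costatum ", "year": "1952", "lat": "44.6", "lon": "33.5"},
)
record = to_agreed_format(raw)
issues = quality_control(record)
```

Keeping the raw values behind a read-only mapping is one way to honour the rule that the raw table should not be changed later; a database would more likely enforce this with permissions or an append-only table.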
Activities: implementation of a mechanism to
ensure continuous inflow of datasets in the
future
The proposed mechanism includes the following:
(a) workshops in which the managers and the users of the data are trained in
using the tools and services provided by the portal(s) from which the data
are made available;
(b) specific tools which notify data producers and managers on the use of their
datasets;
(c) development of a regional mechanism that will support continuous activities
in data rescue and archaeology of marine biological data;
(d) periodic compilation of statistics describing the gaps filled by the new
datasets;
(e) making the data more useful to local scientists in their research through
improved quality, accessibility and association with complementary
biological and environmental data;
(f) enabling local scientists to publish their data into OBIS and GBIF.
Deliverables
D4.1: Report on data availability and gap analyses for the Black Sea M9
D4.2: Description of identified historical datasets M12
D4.3: Report describing the datasets that are digitized, standardized and
mobilized into the system, including dataset documentation and QC
procedures applied M24
D4.4: Workshop report including description of mechanisms and
guidelines on mobilization of historical data into the systems M36
Overview of datasets
Institute of Biology of the Southern Seas – IBSS
## | Dataset title | Origin institute | Current format | Temporal cover | Geographic cover | Taxonomic cover | Size
3 | Phytoplankton database | IBSS, IMS METU, SIO RAS | Paradox database | 1968-1997 | Black Sea | phytoplankton | More than 50000 records
Preliminary inventory of IBSS archive phytoplankton section
## | Dataset title(1) | Origin institute | Current format | Temporal cover | Geographic cover | Taxonomic cover | Size(2), pages
1 | summary quantity and biomass tables | IBSS | paper | 1951, 1952, 1954, 1956 | Black Sea | phytoplankton | 84
2 | hydrobiological observations | IBSS | paper | 1952 | unknown | phytoplankton | 96
3 | expedition journal (incl. sample harvesting with nets, bathometer, on 24h stations) | IBSS | paper | 1952 | E and NW parts of the Black Sea, unknown, central part of the BS, sections: eastern part of the BS - Novorossijsk open sea, Hersonissos open sea, Tuzlo - Sulina, open sea Tuapse, phyllophora field, mouth of the Dnestr | phytoplankton | 176
4 | primary samples processing | IBSS | paper | 1952 | Sevastopol bay, unknown | phytoplankton | 180
5 | expedition journal (incl. sample harvesting with nets, bathometer, on 24h stations) | IBSS | paper | 1953 | Hersonissos cape - open sea, unknown, Bosphorus, Sukhumi bay, Kelaguri | phytoplankton | 189
The Southern Scientific Research Institute of Marine
Fisheries and Oceanography – YugNIRO
## | Dataset title | Origin institute | Current format | Temporal cover | Geographic cover | Taxonomic cover | Samples | Records per sample(1) | Size
1 | Zoobenthos | YUGNIRO | paper | 1986-2009 | Black Sea, Azov Sea | Zoobenthos | 425 | 10 | 4250
2 | Zooplankton | YUGNIRO | paper | 1963-1998 | Black Sea, Azov Sea | Zooplankton | 8000 | 15 | 120000
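The Size figures for the two YugNIRO datasets match the number of samples multiplied by the records per sample. A short check, assuming that this is how the sizes were estimated (the slides do not state it explicitly, and the dictionary layout is only illustrative):

```python
# Values copied from the YugNIRO overview; the layout is an assumption.
datasets = {
    "Zoobenthos": {"samples": 425, "records_per_sample": 10},
    "Zooplankton": {"samples": 8000, "records_per_sample": 15},
}

# Estimated number of records per dataset.
estimated_size = {
    name: d["samples"] * d["records_per_sample"]
    for name, d in datasets.items()
}
print(estimated_size)
```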
The Institute of Marine Sciences (IMS), Middle East Technical University – IMS METU
## | Dataset title | Origin institute | Current format | Temporal cover | Geographic cover | Taxonomic cover | Size
4 | Phytoplankton | IMS-METU | Paper, files, obsolete databases | 1985- | Black Sea, Mediterranean Sea (Aegean) | phytoplankton | unavailable
5 | Zooplankton | IMS-METU | Paper, files, obsolete databases | 1985- | Black Sea, Mediterranean Sea (Aegean) | Zooplankton | unavailable
Thank you