Data driven research in Earth and Environmental Sciences


Joining the Dots
Managing and identifying geolocated data by DOIs and IGSNs
Jens Klump | OCE Science Leader Earth Science Informatics
20 August 2014
MINERAL RESOURCES FLAGSHIP
TERENO
Terrestrial Environmental
Observatory
Managing data in environmental monitoring
What is TERENO?
• TERENO is an infrastructure initiative by the Helmholtz Association
to provide an environmental monitoring infrastructure for the
scientific community.
• Construction started in 2008, operation planned for 25 years.
• 4 regional observatories
• TERENO Northeast has
  • 8 study sites
  • 32 platforms
  • Approx. 35 M data entries from various sensors
  • More platforms being added
• The other three regional observatories are of similar scale.
3 | Joining the dots | Jens Klump
Regions of high climate vulnerability
Regions of high vulnerability
• Droughts
• Heat waves
• Floods
• Winter storms
• Loss of biodiversity
• Landslides
From: Rüdiger Glaser (2008), Klimageschichte Mitteleuropas: 1200 Jahre Wetter, Klima, Katastrophen
TERENO Regional Observatories
• Northeastern German Lowland Observatory
  • Coordination: GFZ
• Harz / Central German Lowland Observatory
  • Coordination: UFZ
• Eifel / Lower Rhine Valley Observatory
  • Coordination: FZJ
• Bavarian Alps / pre-Alps Observatory
  • Coordination: HMUG and KIT
TERENO Research Goals
Investigate interactions and feedbacks between different compartments:
• Atmosphere
• Terrestrial biosphere
• Terrestrial hydrosphere & pedosphere
Bridging the gap between measurement, model and management
TERENO Northeast
Combination of geoarchives with process observations
• Regional impacts of global change on near-natural terrestrial ecosystems and
landscapes in space and time
• Integrated system analysis of climate and landscape development, and of
process understanding
• Combination of real-time process observations (e.g. soil moisture, hydrology,
vegetation) and evaluation of geoarchives (lacustrine, colluvial, peat and soil records)
[Figure panels: Remote sensing, Field observation, Geoarchive]
TERENO data management
System architecture
TERENO data portal
Looking Ahead:
Future Directions
Data Driven Research in the Geological Sciences
Identifiers for software
• Like data and specimens, software should be identifiable
in a persistent way.
• Establish the missing link between papers and data.
• Make software recognisable as a scientific achievement.
• Make science more transparent and reproducible.
• Simply assigning a DOI to software is a good start but might not be
good enough.
• Again, we encounter the question of identity (version) and
location (repository).
www.sciforge-project.org
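
The distinction between identity (version) and location (repository) raised above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the class names `SoftwareVersion` and `LocationRegistry`, the identifier strings and the example URLs are hypothetical and do not describe any existing DOI or repository service. The idea shown is only that one persistent identifier names one exact version, while the places where copies of that version live may be many and may change.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class SoftwareVersion:
    """Identity: one persistent identifier names one exact version."""
    identifier: str  # hypothetical persistent identifier, not a real DOI
    name: str
    version: str


class LocationRegistry:
    """Location: maps an identifier to the places where copies are held."""

    def __init__(self) -> None:
        self._locations: Dict[str, List[str]] = {}

    def register(self, software: SoftwareVersion, location: str) -> None:
        # The same version may be mirrored in several repositories.
        self._locations.setdefault(software.identifier, []).append(location)

    def resolve(self, identifier: str) -> List[str]:
        # Return a copy so callers cannot modify the registry by accident.
        return list(self._locations.get(identifier, []))
```

In this sketch a new release would receive a new identifier, while moving a repository would only change the registered locations and leave the identifier untouched.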
Managing Data from Sensor Networks
Working with very large data sets
• Some data sets are too large
to be inspected in detail, or
even to be loaded on a
desktop PC.
• Example: How would one
check three years of
meteorological radar data for
anomalies?
• Data mining today mainly
involves numerical and textual
media.
• Processing will have to move
from the desktop to the cloud
for large data sets.
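
One way to check a data set that is too large to load at once is to stream it and keep only running statistics in memory. The sketch below is an illustration under stated assumptions, not the method used for the radar data mentioned above: the function name `flag_anomalies`, the threshold of four standard deviations and the warm-up length are all hypothetical choices, and real radar data would need a domain-specific definition of "anomaly". It uses Welford's online algorithm for the running mean and variance.

```python
import math
from typing import Iterable, Iterator, Tuple


def flag_anomalies(
    values: Iterable[float],
    threshold: float = 4.0,  # assumed cut-off in standard deviations
    warmup: int = 10,        # assumed number of values seen before flagging
) -> Iterator[Tuple[int, float]]:
    """Yield (index, value) for values far from the running mean.

    Only three numbers are kept in memory, so the input can be an
    arbitrarily long stream read from disk or from a remote service.
    """
    count = 0
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations (Welford's algorithm)
    for index, x in enumerate(values):
        if count >= warmup:
            std = math.sqrt(m2 / (count - 1))
            if std > 0 and abs(x - mean) > threshold * std:
                yield index, x
                continue  # keep the outlier out of the running statistics
        count += 1
        delta = x - mean
        mean += delta / count
        m2 += delta * (x - mean)
```

Because the function is a generator, the same code could run on a desktop for a small file or on a remote worker for a large one; only the source of `values` changes.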
Linked Data
1. Use URIs to denote things.
2. Use HTTP URIs so that these things
can be referred to and looked up
("dereferenced") by people and user
agents.
3. Provide useful information about the
thing when its URI is dereferenced,
using standards such as RDF and
SPARQL.
4. Include links to other related things
(using their URIs) when publishing
data on the Web.
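
Principles 2 and 3 can be sketched from the point of view of a user agent: it sends an HTTP request to the URI and asks for a machine-readable representation through the `Accept` header. The sketch below uses only the Python standard library. The function names `build_rdf_request` and `dereference` are hypothetical, and the choice of Turtle (`text/turtle`) as the requested RDF serialisation and of UTF-8 for decoding are assumptions; whether a given server honours the request depends on that server.

```python
import urllib.request

RDF_MEDIA_TYPE = "text/turtle"  # assumed serialisation; others exist


def build_rdf_request(uri: str) -> urllib.request.Request:
    """Prepare a request that asks for RDF instead of an HTML page."""
    if not uri.startswith(("http://", "https://")):
        raise ValueError("Linked Data identifiers should be HTTP(S) URIs")
    return urllib.request.Request(uri, headers={"Accept": RDF_MEDIA_TYPE})


def dereference(uri: str, timeout: float = 10.0) -> str:
    """Look up the URI and return the response body as text.

    Requires network access; the body is assumed to be UTF-8 encoded.
    """
    request = build_rdf_request(uri)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

A person visiting the same URI in a browser would normally receive an HTML page, while a user agent sending this header can receive RDF from servers that support content negotiation.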
Summary
• Persistent identifiers allow us to publish, cite and identify data,
specimens and software.
• Data publication is now becoming more common.
• The principles of data identification can also be used with
materials (e.g. IGSN) and software.
• Future publications might consist of elements linked by identifiers:
  • Interpretation (“Paper”)
  • Data
  • Materials
  • Software and workflows
• More and more data repositories offer APIs based on linked data.
• Future data “publication” will cater for both people and user
agents.
Thank you
Mineral Resources Flagship
Jens Klump
OCE Science Leader Earth Science
Informatics
t +61 8 6436 8828
w www.csiro.au