Data and dissertations Joachim Schöpfel Primož Južnič
Data and dissertations
Joachim Schöpfel
Primož Južnič
Hélène Prost
Cécile Malleret
Anna Češarek
Teja Koler-Povh
Data and dissertations
CONTEXT
GL17 - Dec 1-2, 2015
1. A framework for a collaborative data infrastructure
2. Additional funding
3. Measuring and rewarding data value
4. Training experts and broadening public understanding
5. Incentives for green technologies
6. High-level group for future planning
E-science
Context
https://www.kth.se/en/forskning/forskningsplattformar/ict/forskning/e-vetenskap-1.323973
eScience
• Data-driven science
• Data-intensive scientific discovery
– Data deluge
• 4th paradigm (Hey et al. 2009)
– Distributed computer and knowledge systems
– Information and communication technologies
– Large-scale and collaborative sciences and engineering
– Mathematical modelling, numerical analysis, visualization techniques, data mining…
– Integration of theories, simulations and experiments
• Scientific information has become a continuum
between publication and data
Research data
• « Recorded factual material commonly accepted
in the scientific community as necessary to
validate research findings »
US OMB Circular A-110
• « Re-usable research results, collected, observed
or created for purposes of analysis to produce
original results »
University of Edinburgh (cited by Burnham 2013)
– Large variety of formats, sources and types
– Data as material (input) v. Data as results (output)
Big data vs small data
[Diagram labels: big data (volume, variety, velocity); public data; small data; dark data; heterogeneous formats and content; individual projects; restricted open access]
How can the data listed in the appendices be valorised?
Data life-cycle
Data published:
• How?
• Where?
http://www.lancaster.ac.uk/library/rdm/plan/data-lifecycle/
Data publication
https://www.elsevier.com/connect/can-data-be-peer-reviewed
Publication and data
• Document as data
– Exploited as primary data source for TDM
• Data vehicle
– Supplementary materials of publication
• Gateway to data
– Publication contains links to data, integrated or
not in the text
Integration of DataVerse and OJS –
An example of the gateway function
http://journal.code4lib.org/articles/10989
Data and dissertations
THE CASE OF ETDS
The challenge of ETDs
Life-cycle management
http://educopia.org/research/electronic-theses-and-dissertations
The small data of ETDs
[Diagram labels: ETD; institutional repositories; research data; small data; smart data; small science; little science; dark science; small research collection; originality; linked to a scientific program; no commercial and public character]
ETDs and data
The potential of ETDs
• Contain the results of at least three years of
scientific work
• Variety and richness of appendices
• Availability in open access
• Contribute to eScience
Data and dissertations
EMPIRICAL RESULTS
Data management: practices & needs
Survey at Lille 3
83% on private computer
49% on professional computer
declare themselves responsible for data backup
Expressed needs
97%
Data management: practices & needs
The PhD students
• Response rate: 13%
• 63% are motivated to submit data
• The least experienced in the field of research data
• Preference for a local or institutional repository
• Interested in the ethical and legal issues
• Seek advice for the publication of data
Research data in dissertations
A French-Slovenian survey
780 dissertations analysed
The size of appendices
Sources of appendices
[Chart: sources of appendices by domain, with totals]
Typologies of appendices
[Chart: typologies of appendices by domain, with totals]
Link between text and data appendices
Data and dissertations
OBSERVATIONS
Text and data
Structure & presentation
It’s like bicycles!
Text and data
Text and appendices separated (1)
[Image labels: CD-ROM]
Text and data
Text and appendices separated (2)
Text and data
Text and data attached
• Appendices are inserted at the end of each chapter; other data are included in the text and classified
• No appendices: all data are included in the text, with or without classification
ETDs in engineering science
A case study from IR DRUGG
• All dissertations (100%) are archived in DRUGG.
• Period 2008-2014: 86 dissertations.
• 18,981 appendices integrated in the text of all 86 dissertations
• Plus 237 attached appendices in 28 dissertations
• All content is described by metadata
• Most data are in PDF format
DRUGG - typologies of appendices
• Equations: 9,286
• Images - drawings: 7,268
• Tables: 2,044
• Graphs - figures: 383
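The four counts on this slide can be cross-checked against the 18,981 integrated appendices reported for the DRUGG case study. The following is a minimal Python sketch, assuming the counts pair with the labels in the order in which they appear on the slide; the names APPENDIX_COUNTS and shares are illustrative and do not come from the study.

```python
# Counts as listed on the slide. The pairing of labels and counts is an
# assumption based on the order in which they appear.
APPENDIX_COUNTS = {
    "Equations": 9286,
    "Images - drawings": 7268,
    "Tables": 2044,
    "Graphs - figures": 383,
}


def shares(counts):
    """Return each category's share of the total, in percent (1 decimal)."""
    total = sum(counts.values())
    return {label: round(100 * n / total, 1) for label, n in counts.items()}


total = sum(APPENDIX_COUNTS.values())
print(total)  # 18981, matching the total given for the 86 dissertations

for label, pct in shares(APPENDIX_COUNTS).items():
    print(f"{label}: {pct}%")
```

Under this pairing the four categories add up exactly to the reported total, which suggests that no category is missing from the list.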
Photo? Map?
Figure!
The most frequent appendices in engineering
1. Equations (total = 9,286).
They appear in 65 of the 86 dissertations.
2. Figures (total = 7,591).
They appear in all 86 dissertations.
Particularity:
Maps and photos (of which there are many) are treated as figures!
Conclusion: The appendices are an important source of information in engineering.
Potential use & valorisation of data
Images
• Databases
Texts
• Lexical analysis, data mining
Historical data
• Prosopographies
Barriers to open data
• Incomplete, inadequate or missing description: datasets and/or individual data are not documented, or only incompletely
• Missing organisation: research data are presented without any structure or organisation, often together with other, non-reusable material, in a kind of information mash-up not suitable for further research
• Inadequate format: data and text are glued together in a PDF file instead of being separated and published in adequate file formats
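To illustrate what separating data from text and documenting it could look like in practice, here is a minimal Python sketch. It is not the authors' method. It writes a table to a CSV file and stores a short description next to it as JSON. The function name export_dataset, the file names, and the example rows are hypothetical placeholders and are not taken from the surveyed dissertations.

```python
import csv
import json
import tempfile
from pathlib import Path


def export_dataset(rows, fieldnames, description, out_dir):
    """Write rows to a CSV file and a JSON description file next to it."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    data_path = out_dir / "appendix_table.csv"   # hypothetical file name
    meta_path = out_dir / "appendix_table.json"  # hypothetical file name

    # The data go into a plain, reusable format instead of a PDF page.
    with data_path.open("w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)

    # Minimal documentation: what the file is and which columns it holds.
    metadata = {
        "description": description,
        "columns": list(fieldnames),
        "row_count": len(rows),
        "data_file": data_path.name,
    }
    meta_path.write_text(json.dumps(metadata, indent=2), encoding="utf-8")
    return data_path, meta_path


# Placeholder rows for illustration only, not data from the study.
example_rows = [
    {"sample_id": "S1", "measurement": 1.0},
    {"sample_id": "S2", "measurement": 2.5},
]

with tempfile.TemporaryDirectory() as tmp:
    data_file, meta_file = export_dataset(
        example_rows,
        ["sample_id", "measurement"],
        "Illustrative appendix table",
        tmp,
    )
    print(data_file.name, meta_file.name)
```

A real deposit would normally follow the metadata schema and file formats required by the target repository; the JSON file here only stands in for that documentation.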
Data and dissertations
RECOMMENDATIONS
Recommendations
ETDs and data
ETDs and data
Helping PhD students to manage data
Services
• Seminars, conferences, training
• Online resources (guidelines, FAQ...)
• Alert service
Education
• Legal and technical help (data management plan)
• Technical assistance for deposit
• Liaison with laboratories
• Mediation for deposit
• Partnership with networks and repositories
• Development of tools on the campus
Advice, assistance
Infrastructures
Data management at university
Principles
1. A discipline-specific approach
2. An integration into the doctoral education
3. A proposal of data management plans
4. Incentives for the digital deposit of research data
5. A contribution to the preservation and dissemination of data
Data and dissertations
THANK YOU!
Contact
[email protected]
[email protected]
References
http://www.citeulike.org/user/Schopfel/tag/gl17