Open Science is More than Open Access,
but what is it?
Susan Reilly
@skreilly
Executive Director
Brussels, October 2015
I’ll be talking about…
• LIBER & Open Science
• Definition of Open Science
• Building blocks of Open Science
• What’s it all for? Knowledge Discovery
LIBER: reinventing the library of the future
• Largest network of European research libraries: 410 in over 40
countries
Mission:
“To provide an information infrastructure to enable
research in LIBER institutions to be world class”
LIBER: Information Infrastructure for World Class
Research?
• Collaborative
• 90% of research papers are collaborative
• International
• 40% of French & German research outputs a result of international
collaboration
• Rate of citation grows as geographic extent of collaboration
increases
• Interdisciplinary
• Foundation of frontiers research
• Data intensive
• supports interdisciplinary exploration
• … and open
Higgs boson: 2012 Physics Letters B paper, 6235 citations
Libraries enabling Open Science
“We believe that the move towards openness will lead to
increased transparency, better quality research, a higher
level of citizen engagement, and will accelerate the pace of
scientific discovery through the facilitation of data-driven
innovation.”
http://libereurope.eu/wpcontent/uploads/2014/09/LIBER_Statement-on-openscience-final.pdf
Open Science Definition
“The conduction of science in a way that
others can collaborate and contribute, where
research data, lab notes and other research
processes are freely available, with terms that
allow reuse, redistribution and reproduction of
the research”
https://www.fosteropenscience.eu/foster-taxonomy/open-science-definition
The problem with defining Open Science
• Means is often confused with the end
• Ultimate goal is to be a transient term i.e.
Open Science = Science
• Aims to bring coherence and vision to a range
of different open activities e.g. open access,
open data, open software
Open Science = Diversity
• Key is changing practice and culture, which is
different for every stakeholder
Open Science Goals
• Transparency in experimental methodology, observation,
and collection of data
• Public availability and reusability of scientific data
• Public accessibility and transparency of scientific
communication
• Citizen engagement*
• Using web-based tools to facilitate scientific collaboration
Dan Gezelter, http://www.openscience.org/blog/?p=269
*EU
Move away from a 300-year-old model!
Work.
Finish.
Publish!
(Faraday)
To an Open Science Landscape
Collaboration
Open infrastructure
New forms of peer review
Research data management
Open access publishing
Policy
Massive Open Online Courses (MOOCs)
Alternative Metrics
Open data
Open science
Advocacy & training
Copyright & licensing
Open educational resources
Building block: open access
“We write to communicate an untenable situation facing the
Harvard Library. Many large journal publishers have made the
scholarly communication environment fiscally unsustainable
and academically restrictive.”
Harvard University Library, 2012
Moved beyond the tipping point: http://www.sciencemetrix.com/pdf/SM_EC_OA_Availability_20042011.pdf
It’s not just about the cost, it’s about the
value…
• Accessible: make it available in accessible formats (XML)
• Findable: put it in open and sustainable infrastructure
• Reusable: attach clear permissions statements/licences
Building block: open data
• Need to release the value of data
• Benefits:
• Jobs (Copernicus = 50,000 jobs)
• Research productivity (big bang)
• Help communities (flood hack)
• Cost of not sharing (bird flu)
1.7 million billion
bytes of data
every minute
X 34
Data must be…
• Open by default (G8, LERU)
• Usable by all
• Available
• Findable
• Interpretable
• Citable
• Curated/preserved
Building block: skills and
training
ReCODE Recommendation 10: Support the transition to open
research data through curriculum-development and
training. The transition to an open science paradigm where
research data plays a significant role requires training and
education for researchers and for data managers who support
open science. Courses for getting researchers and data
managers up to date with current relevant issues are
necessary, as well as the development of curricula that
contribute towards the development of data science and
information management as distinct and legitimate career
paths.
• Need to embed training in postgraduate education
• Invest in the development of the data professional
• Training provision as and when needed (importance of train the
trainer)
• Training and support for new tools and methods
Building block: infrastructure
• International
• Open
• Interoperable
• Cross disciplinary
• Facilitate collaboration
• Store & Share
• Sync & Exchange
• Replicate
• Compute
• Find
• Content
• TDM tools
• Workflows
• Standards
• Interoperability
Building block: advocacy…
• Advocate for roadmaps and policies that promote open
science at institutional and national level
• Advocate for changes in practice e.g. data citation, use
of CC licences
• Promote your Open Science project
• Engage new audiences
and incentives
• Need to change system of incentives and assessment
• Move away from journal based metrics
• Consider value and impact of ALL research outputs (data,
software…)
• Align assessment with institutional values
• Only a change of system of incentives will truly change
practice and culture
Building block: policy and legislation
• Legal clarity
• Interoperability (WIPO solution?)
• Ensure researchers have right to secondary
publication
• Standard open access licences
• CC BY and CC0/PD
Knowledge Discovery in the Digital Age
• Ultimate goal of text and data mining is to
extract high level knowledge from low level data
• Allows analysis across disciplines
• “Undiscovered public knowledge” (Swanson)
• Identifies patterns in the data to produce new
knowledge
• It’s not a new thing; digital information just
makes it a whole lot more powerful and
relevant!
Human Computers (1901)
453253424
"This above all:
to thine own self
be true".
http://marlowe-shakespeare.blogspot.nl/2009/02/on-mendenhall-and-compelling-evidence.html
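The digit string on this slide reads as the length of each word in the quotation, which is the kind of count the linked post attributes to Mendenhall's human counters. A minimal sketch of the same count in Python follows. The function name `word_lengths` and the simple letters-only tokenisation are assumptions made for illustration, not the method used in the original study.

```python
import re
from collections import Counter


def word_lengths(text):
    """Return the length of each word, ignoring punctuation.

    Assumption: a "word" is any unbroken run of ASCII letters.
    """
    return [len(word) for word in re.findall(r"[A-Za-z]+", text)]


quote = "This above all: to thine own self be true."

lengths = word_lengths(quote)

# One digit per word, in reading order (works here because no word
# in the quote is longer than nine letters).
signature = "".join(str(n) for n in lengths)

# How often each word length occurs: the raw material for a
# word-length profile of an author.
profile = Counter(lengths)

print(signature)
print(sorted(profile.items()))
```

Run over whole works rather than one line, the same profile gives a frequency curve per author that can be compared across texts.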
Use of words (2009)
Marsden J, Budden D, Craig H, Moscato P (2013) Language Individuation and Marker
Words: Shakespeare and His Maxwell's Demon. PLoS ONE 8(6): e66813.
doi:10.1371/journal.pone.0066813
Cancer diagnosis (2013)
http://theconversation.com/shakespeare-and-cancer-diagnoses-how-bard-can-it-be-15381
Malhotra A, Younesi E, Gurulingappa H, Hofmann-Apitius M (2013) ‘HypothesisFinder:’ A Strategy for the
Detection of Speculative Statements in Scientific Text. PLoS Comput Biol 9(7): e1003117.
doi:10.1371/journal.pcbi.1003117
Copyright v TDM
• Because it involves the copying of content in
order to convert it into machine readable format,
TDM may infringe copyright
• European Database Directive prohibits copying of
substantial parts of databases
• In the US, TDM is covered by fair use; other parts
of the world have a specific exception, e.g. Japan, UK
https://www.flickr.com/photos/apelad/304195427/
The problem with licences
• Permission culture: Why relicence? Can’t licence
everything!
• Not scalable or cost effective
• Will licence reflect how the researcher actually
performs TDM?
"ME 442 Permission" by Nina Paley - http://mimiandeunice.com/2011/08/30/permission-2/. Licensed under Creative Commons Attribution-Share Alike 3.0 via Wikimedia Commons http://commons.wikimedia.org/wiki/File:ME_442_Permission.png#mediaviewer/File:ME_442_Permission.png
Elsevier TDM Policy
• Access through API only
• Text only- no images, tables
• Researcher must register details
• Click-through licence
• Terms can change any time
• Reproducibility of results
1. INTELLECTUAL PROPERTY WAS NOT
DESIGNED TO REGULATE THE FREE FLOW OF
FACTS, DATA AND IDEAS, BUT HAS AS A KEY
OBJECTIVE THE PROMOTION OF RESEARCH
ACTIVITY
4. ETHICS AROUND THE USE OF
CONTENT MINING TECHNIQUES WILL
NEED TO CONTINUE TO EVOLVE IN
RESPONSE TO CHANGING TECHNOLOGY
• The Hague Declaration: http://thehaguedeclaration.com/the-hague-declaration-on-knowledge-discovery-in-the-digital-age/
• LERU Roadmap for Research Data http://www.leru.org/index.php/public/news/press-release-leruroadmap-for-research-data/
• EUDAT http://eudat.eu/
• Research Data Alliance https://rd-alliance.org/
• LIBER 10 Recommendations on Getting Started in RDM http://libereurope.eu/wpcontent/uploads/The%20research%20data%20group%202012%20v7%20final.pdf
• OpenAIRE https://www.openaire.eu/
• San Francisco Declaration http://www.ascb.org/dora-old/files/SFDeclarationFINAL.pdf