Editorial Review Quarter 1 - Open Access Scholarly

Open by default: Applying the right
license to research data in open access
journals
COASP, Budapest, 20 September 2012
Iain Hrynaszkiewicz
Publisher (Open Science), BioMed Central
[email protected]
BioMed Central open data initiatives
1. Data journals and article types
2. Open Data Award
3. Data deposition (repositories), citation, and linking
4. Data/workflow integration (LabArchives partnership)
5. Data licensing
6. Human subjects – confidentiality and consent
7. Guidance and best practice
8. Data formats and standards
Sources of data in OA journals
• Additional files (supplementary material) which
include data sets supporting reported results
• Bibliographic data including reference lists
• Numerical tables in main text of articles
• Data points underlying graphs
• Text-minable terms and other machine-harvestable information
• In other words, we’re all data publishers
JISC text mining report, March 2012
“Legal uncertainty, inaccessible information silos, lack of information and
lack of a critical mass are barriers to text mining within UKFHE.”
http://www.jisc.ac.uk/media/documents/publications/reports/2012/value-textmining.pdf
The Guardian, 23 May 2012:
“Bergman, Murray-Rust, Piwowar and countless other academics are prevented from
using the most modern research techniques because the big publishing companies
such as Macmillan, Wiley and Elsevier, which control the distribution of most of the
world's academic literature, by default do not allow text mining of the content that
sits behind their expensive paywalls.”
http://www.guardian.co.uk/science/2012/may/23/text-mining-research-toolforbidden
Hargreaves report, May 2011
“According to the Wellcome Trust, 87 per cent of the material housed in
UK’s main medical research database (UK PubMed Central) is unavailable
for legal text and data mining.”
http://www.ipo.gov.uk/ipreview-finalreport.pdf
Why does open data licensing matter?
• Open data is a means to do better science
more efficiently
• Licenses, copyright and IP are legal barriers to
data sharing and reuse
• Removing these barriers maximises potential for data reuse,
integration and discovery of new knowledge
“BioMed Central believes that the concept of open data, analogous to its policy on open
access to journals, goes beyond making data freely accessible. Data should also be free to
distribute, copy, re-format, and integrate into new research, without legal impediments”
BioMed Central’s draft position statement on open data. September 2010
http://blogs.openaccesscentral.com/blogs/bmcblog/resource/opendatastatementdraft.pdf
http://pantonprinciples.org/
“[P]eople mis-use copyright licenses on
uncopyrightable materials and data sets: the
confusion of the legal right of attribution in
copyright with the academic and professional
norm of citation of one's efforts.”
John Wilbanks, VP, Science, Creative Commons,
http://bit.ly/djl5Fa August 11, 2010
“...any restrictions on use should be strongly
resisted and we endorse explicit encouragement
of open sharing.”
Schofield et al.: Post-publication
sharing of data and tools. Nature 2009, 461:171.
“The data should be released in standardized
formats without intellectual property constraints.”
Conway PH, VanLare JM: Improving Access to
Health Care Data: The Open Government
Strategy. JAMA 2010;304(9):1007-1008.
http://www.isitopendata.org/
Copyright and data
If data = numerical representation of facts then
they are generally not copyrightable, but...
• Many levels of data/derived digital data
• Jurisdictional differences (e.g. US vs.
Australian law; EU database rights)
= ambiguity about legal status of content
Licenses and waivers for data
• Licenses are for asserting rights; waivers are
for giving them up
• Several licenses/waivers are compliant with
Open Knowledge definitions
http://opendefinition.org/licenses/
• “Attribution stacking” inherent in CC-BY
problematic for large/combined datasets
Ball A: How to License Research Data 2011
http://www.dcc.ac.uk/resources/how-guides/license-research-data
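The attribution-stacking problem can be illustrated with a minimal sketch. The Dataset class, the combine function and the example lab names are all hypothetical, and the model is deliberately simplified: it only counts how many parties a reuser would be legally obliged to credit after merging many sources.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dataset:
    name: str
    license: str             # "CC-BY" or "CC0" in this simplified sketch
    attributions: frozenset  # parties a reuser is legally required to credit


def combine(name, parts):
    """Merge datasets, carrying forward every attribution the parts require."""
    required = frozenset()
    for part in parts:
        if part.license == "CC-BY":
            # CC-BY obliges the reuser to credit each upstream contributor.
            required |= part.attributions
        # CC0 waives these rights, so nothing is carried forward.
    merged_license = "CC-BY" if required else "CC0"
    return Dataset(name, merged_license, required)


# 500 hypothetical source datasets, first under CC-BY, then under CC0.
by_sources = [
    Dataset(f"study-{i}", "CC-BY", frozenset({f"Lab {i}"})) for i in range(1, 501)
]
cc0_sources = [Dataset(f"study-{i}", "CC0", frozenset()) for i in range(1, 501)]

merged_by = combine("combined-by", by_sources)
merged_cc0 = combine("combined-cc0", cc0_sources)
print(len(merged_by.attributions), len(merged_cc0.attributions))
```

Under the waiver, the legal attribution list stays empty however many sources are combined; citing the sources remains an academic norm rather than a license condition.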
Why Creative Commons CC0?
• interoperability: CC0 is human and
machine-readable
• universality: CC0 is global and universal
and widely recognized
• simplicity: no need for humans to make,
and respond to, individual data requests
Schaeffer P: Why does Dryad use CC0?
http://blog.datadryad.org/2011/10/05/why-does-dryad-use-cc0/
http://creativecommons.org/publicdomain/zero/1.0/
CC0 use cases – LabArchives ELN
• BioMed Central authors entitled to
LabArchives’ electronic lab notebook with
100 MB of free storage (http://www.labarchives.com/bmc)
• Features include:
- Data publishing with DOI assignment
- Citable, linkable data supporting publications
- Reusable/integrate-able data with CC0
- Integrated manuscript submission to BMC journals
- Additional free storage (standard is 25 MB)
http://www.biomedcentral.com/about/supportingdata
LabArchives partnership
Implementing CC-BY-CC0 in journals –
why?
• Removes ambiguity about legal status of data
• Helps facilitate reuse, including text mining,
e.g. testing of analysis tools against data
harvested from journals
• Open bibliography – diversification and
democratization of impact measures
• Faster progress where lack of combinable
datasets is hampering research, e.g. EvoMRI
Hrynaszkiewicz I, Cockerill MJ: Open by default: a proposed copyright
license and waiver agreement for open access research and data
in peer-reviewed journals. BMC Research Notes 2012, 5:494
http://www.biomedcentral.com/1756-0500/5/494
Implementing CC-BY-CC0 in journals –
how?
• Specify a date from which the new license
would apply to data (CC-BY remains for other
content)
• Some relatively minor technical and
operational implications
• Cultural change may be the biggest challenge
• Public consultation with authors, editors,
funders and other stakeholders
Proposed new license statement
“© 2012 <Author> et al.
This is an Open Access article distributed under the terms of the
Creative Commons Attribution License
(http://creativecommons.org/licenses/by/2.0), which permits
unrestricted use, distribution, and reproduction in any medium,
provided the original work is properly cited.
Data included in this article, its reference list(s) and its additional
files, are distributed under the terms of the Creative Commons
Public Domain Dedication waiver
(http://creativecommons.org/publicdomain/zero/1.0/;
http://www.biomedcentral.com/about/access).”
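As a rough sketch of how a date-based switch to the combined statement might work in a production system, the function below appends the CC0 paragraph only for articles published on or after a cut-over date. The function name, the constant names and the cut-over date itself are assumptions made for illustration; the slides do not specify a date.

```python
from datetime import date

CC_BY_URL = "http://creativecommons.org/licenses/by/2.0"
CC0_URL = "http://creativecommons.org/publicdomain/zero/1.0/"
ACCESS_URL = "http://www.biomedcentral.com/about/access"

# Hypothetical cut-over date; the proposal does not name one.
DATA_WAIVER_START = date(2013, 1, 1)


def license_statement(first_author, published):
    """Build the license statement for an article published on `published`."""
    paragraphs = [
        f"© {published.year} {first_author} et al.",
        "This is an Open Access article distributed under the terms of the "
        f"Creative Commons Attribution License ({CC_BY_URL}), which permits "
        "unrestricted use, distribution, and reproduction in any medium, "
        "provided the original work is properly cited.",
    ]
    if published >= DATA_WAIVER_START:
        # From the cut-over date, data are additionally covered by CC0.
        paragraphs.append(
            "Data included in this article, its reference list(s) and its "
            "additional files, are distributed under the terms of the "
            "Creative Commons Public Domain Dedication waiver "
            f"({CC0_URL}; {ACCESS_URL})."
        )
    return "\n".join(paragraphs)


print(license_statement("Smith", date(2013, 6, 1)))
```

Articles published before the assumed date keep the CC-BY-only wording, so existing content is unaffected.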
But what do we mean by data?
• Definitions vary quite widely
• For implementation, general guidelines with
specific examples needed
• Examples in journal articles/additional files
include tabular data, XML, CSV, graphical
data points, bibliographic data (including
reference lists), RDF
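One way to turn general guidelines into specific, checkable examples is a lookup from file type to data category. The sketch below is illustrative only: the extension list and the function names are assumptions, and real additional files would need editorial judgement beyond a file suffix.

```python
from pathlib import PurePath

# Assumed mapping from file extension to the kind of data it usually holds.
DATA_EXTENSIONS = {
    ".csv": "tabular data",
    ".tsv": "tabular data",
    ".xml": "XML",
    ".rdf": "RDF",
    ".bib": "bibliographic data",
    ".ris": "bibliographic data",
}


def classify_additional_file(filename):
    """Return the data category for a file, or None if it is not clearly data."""
    suffix = PurePath(filename).suffix.lower()
    return DATA_EXTENSIONS.get(suffix)


def license_for(filename):
    """Suggest CC0 for recognised data files and CC-BY for everything else."""
    return "CC0" if classify_additional_file(filename) else "CC-BY"


for name in ["Additional_file_1.CSV", "references.bib", "figure_S2.pdf"]:
    print(name, classify_additional_file(name), license_for(name))
```

Anything the table does not recognise falls back to CC-BY, which matches the proposal that CC-BY remains in place for non-data content.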
Open by default – opt out
• Public domain with no copy/other rights is not
always possible, even for open access content
• Non-standard licenses needed, as already
happens for e.g. US government employees
• Few changes to standard procedures and
author behaviour needed for implementation
• New license only applies to content submitted
for publication
Questions, concerns?
• Will I risk loss of credit (citations)?
• Will I put competitors at an advantage?
• Will plagiarism be more likely?
• Will I lose any right to express wishes about
future uses of my data?
• ....?
Join the data debate
• How appropriate is public domain dedication
for data you (already) publish in journals?
• How do you define data – what data file types
do you commonly publish as additional files?
• How might removing legal restrictions on data
sharing benefit (or harm) your research?
• And, publishers: how adoptable is this
model?
http://blogs.biomedcentral.com/bmcblog/2012/09/10/put-the-open-in-open-data
Questions?
Iain Hrynaszkiewicz
Publisher (Open Science), BioMed Central
[email protected]
http://www.mendeley.com/profiles/iain-hrynaszkiewicz/
http://uk.linkedin.com/in/iainhz
@iainh_z
Attribution vs. citation
Activity | Attribution and/or citation
Printing an article for display at a conference | Attribution
Translating an article for publication in another journal | Attribution + citation
Paraphrasing a concept or finding within an article | Citation
Reusing a figure, table or graph | Attribution + citation
Publication of a reanalysis of data published as an additional file in a journal | Citation