Responses to questions
Research Data Access and Preservation Summit
Panel 2 - Promoting Re-Use of Scientific
Collections
Some responses to the questions posed...
John Harrison
SHAMAN Project
University of Liverpool
[email protected]
How do you handle organization of
collections today?
We created a highly structured hierarchy of
directories within our storage system (currently
iRODS)
Allows logical separation, while preserving the association, of:
Collection data
Supporting documentation (context, provenance)
System
Policies
Software code
Configurations, Workflows
Discovery mechanisms (indexes)
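A minimal sketch of such a hierarchy, for illustration only. It uses the local filesystem rather than iRODS, and every directory name is hypothetical: the slide lists the categories but not the real paths. Grouping policies, code, configurations and indexes under "system" is also an assumption about how the list is nested.

```python
from pathlib import Path
import tempfile

# Hypothetical directory names; the slide names the categories, not the paths.
# Nesting the last four categories under "system" is an assumption.
LAYOUT = {
    "collection_data": [],
    "supporting_documentation": ["context", "provenance"],
    "system": [
        "policies",
        "software_code",
        "configurations_workflows",
        "discovery_indexes",
    ],
}


def build_collection_tree(root: Path, name: str) -> list[Path]:
    """Create the hierarchy for one collection and return the directories made."""
    base = root / name
    created = []
    for top, children in LAYOUT.items():
        top_dir = base / top
        top_dir.mkdir(parents=True, exist_ok=True)
        created.append(top_dir)
        for child in children:
            child_dir = top_dir / child
            child_dir.mkdir(exist_ok=True)
            created.append(child_dir)
    return created


if __name__ == "__main__":
    # Build the tree in a throwaway directory and list what was created.
    with tempfile.TemporaryDirectory() as tmp:
        for directory in build_collection_tree(Path(tmp), "example_collection"):
            print(directory.relative_to(tmp))
```

Each category lives in its own directory (logical separation), while the shared collection root keeps the data, its documentation and the system material associated.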
What are the biggest issues with building
collections for new communities?
Scalability; quantity of data is increasing rapidly
More important to select and prioritize the data with the
most potential to be useful to future generations.
Mechanisms for identifying useful items in large
reference collections become more important.
When new communities access existing
data collections, what new access
capabilities are required?
It's difficult to generalize; depends a great deal on
expectations of the community in question.
Viewing the data will be essential for all communities
One important aspect of our approach has been to
develop a display technology, independent of the
originating application
Emulation, but with a layer of abstraction from the
operating system (Java Virtual Machine)
Provides a platform for development of new and
unforeseen capabilities for interaction with legacy
(potentially obsolete) file formats.
What level of description is required to meet
the expectations of new communities?
Impossible to say for certain. Expectations
evolve as technology develops.
Best we can do:
Rigidly adhere to the most stringent and well-documented
standards of today.
Preserve the means for future generations to
interpret these descriptions by preserving
documentation on the standard
Tag libraries + Schemas for XML
Ontologies
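One way to picture "preserving documentation on the standard" is to store each XML description next to the schema that defines it. The sketch below is an illustration, not the project's method: the namespace, the file names and the `make_package` helper are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace and file names, for illustration only.
NS = "http://example.org/ns/description"
XSI = "http://www.w3.org/2001/XMLSchema-instance"


def make_package(title: str) -> dict[str, str]:
    """Return file name -> content: a description plus the schema it uses."""
    ET.register_namespace("d", NS)
    ET.register_namespace("xsi", XSI)

    # The description records which schema it conforms to.
    record = ET.Element(f"{{{NS}}}record")
    record.set(f"{{{XSI}}}schemaLocation", f"{NS} description.xsd")
    ET.SubElement(record, f"{{{NS}}}title").text = title

    # The schema travels with the description, so the means of
    # interpreting the record are preserved alongside it.
    schema = (
        '<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" '
        f'targetNamespace="{NS}" elementFormDefault="qualified">'
        '<xs:element name="record"><xs:complexType><xs:sequence>'
        '<xs:element name="title" type="xs:string"/>'
        "</xs:sequence></xs:complexType></xs:element></xs:schema>"
    )
    return {
        "description.xml": ET.tostring(record, encoding="unicode"),
        "description.xsd": schema,
    }


if __name__ == "__main__":
    for name, content in make_package("Example collection").items():
        print(name)
        print(content)
```

The point of the design is that the description never leaves the archive without the document that explains its tags.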
Is long-term sustainability enabled through
re-purposing of collections?
Theoretically, yes; only time will tell for sure
Best chance of achieving sustainability is by using
open standards to describe:
Digital objects, their structure and associations
Metadata (digital objects and the archive as a
whole)
Data management policies and processes
Are there other driving purposes behind
promoting re-use of collections?
Data may provide insights into unforeseen
areas.
e.g. results of drug trials might inform future drug
development in the pharmaceutical industry
In such a highly regulated industry, the ability to get
back to raw data to ensure authenticity is very
important!
Which institutions can be approached for
sustaining re-purposed collections?
So far, it seems to be mainly memory institutions
that are looking at issues of digital preservation
(Libraries, Archives, Museums)
Anyone with significant data should be thinking
about issues surrounding preservation of their
knowledge/information assets.
In the future, funding bids should consider the
costs of preserving the results of their research.
I think inevitably many organizations will end up outsourcing digital preservation.