Responses to questions


Research Data Access and Preservation Summit
Panel 2 - Promoting Re-Use of Scientific
Collections
Some responses to the questions posed...
John Harrison
SHAMAN Project
University of Liverpool
How do you handle organization of
collections today?
We created a highly structured hierarchy of
directories within our storage system (currently
iRODS)
This allows logical separation, while keeping the association, of:
  Collection data
  Supporting documentation (context, provenance)
  System
    Policies
    Software code
    Configurations, Workflows
    Discovery mechanisms (indexes)
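
A minimal sketch of how such a hierarchy might be laid out. The slides do not give the actual directory names, so every name below (data, documentation, system and their children) is hypothetical, and the grouping of indexes under the system branch is an assumption. The sketch only computes paths; it does not talk to iRODS.

```python
from pathlib import PurePosixPath

# Hypothetical layout: these directory names are illustrative only and are
# not taken from the SHAMAN project.
LAYOUT = {
    "data": {},
    "documentation": {"context": {}, "provenance": {}},
    "system": {
        "policies": {},
        "code": {},
        "configurations": {},
        "workflows": {},
        "indexes": {},
    },
}


def collection_paths(root, layout=LAYOUT):
    """Return every directory path implied by the layout, parents first."""
    paths = []

    def walk(base, tree):
        for name, children in tree.items():
            path = base / name
            paths.append(str(path))
            walk(path, children)

    walk(PurePosixPath(root), layout)
    return paths


if __name__ == "__main__":
    for p in collection_paths("/archive/collection-001"):
        print(p)
```

Keeping each kind of material in its own branch under one collection root is what gives the logical separation, while the shared root keeps the pieces associated.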
What are the biggest issues with building
collections for new communities?

Scalability; quantity of data is increasing rapidly

More important to select and prioritize data with the
most potential to be useful to future generations.

Mechanisms for identifying useful items in large
reference collections become more important.
When new communities access existing
data collections, what new access
capabilities are required?

It's difficult to generalize; depends a great deal on
expectations of the community in question.

Viewing the data will be essential for all communities

One important aspect of our approach has been to
develop a display technology, independent of the
originating application

Emulation, but with a layer of abstraction from the
operating system (Java Virtual Machine)

Provides a platform for development of new and
unforeseen capabilities for interaction with legacy
(potentially obsolete) file formats.
What level of description is required to meet
the expectations of new communities?

Impossible to say for certain. Expectations
evolve as technology develops.

Best we can do:

Rigidly adhere to the most stringent and well
documented standards of today.

Preserve the means for future generations to
interpret these descriptions by preserving
documentation on the standard:
  Tag libraries + Schemas for XML
  Ontologies
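
One way to picture preserving documentation alongside descriptions is to have each metadata record point at the preserved copy of the standard it follows, and to check that the pointer resolves. The sketch below is only an illustration under assumptions: the element names (record, title, conformsTo), the documentation attribute, and the file path are all hypothetical and do not come from any particular metadata standard or from the SHAMAN project.

```python
import xml.etree.ElementTree as ET


def describe(title, standard_doc):
    """Build a tiny XML record that names the preserved documentation
    (for example a schema file) it is meant to be interpreted with."""
    record = ET.Element("record")
    ET.SubElement(record, "title").text = title
    ET.SubElement(record, "conformsTo", {"documentation": standard_doc})
    return ET.tostring(record, encoding="unicode")


def referenced_documentation(xml_text):
    """List the documentation paths a record refers to."""
    root = ET.fromstring(xml_text)
    return [el.get("documentation") for el in root.iter("conformsTo")]


def missing_documentation(xml_text, preserved):
    """Return referenced documentation that is not in the preserved set."""
    return [doc for doc in referenced_documentation(xml_text)
            if doc not in preserved]


if __name__ == "__main__":
    xml_text = describe("Trial results", "standards/example-schema.xsd")
    print(missing_documentation(xml_text, {"standards/example-schema.xsd"}))
```

A check like this could be run across an archive to flag descriptions whose standard is no longer documented locally.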
Is long-term sustainability enabled through
re-purposing of collections?


Theoretically, yes; only time will tell for sure.

Best chance of achieving sustainability is by using
open standards to describe:
  Digital objects, their structure and associations
  Metadata (digital objects and the archive as a whole)
  Data management policies and processes
Are there other driving purposes behind
promoting re-use of collections?

Data may provide insights into unforeseen
areas.

e.g. results of drug trials might inform future drug
development in the pharmaceutical industry
In such a highly regulated industry, the ability to get
back to raw data to ensure authenticity is very
important!
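
The slides do not say how authenticity of raw data is checked. A common approach, assumed here purely for illustration, is to record a cryptographic digest when the data is ingested and compare it again later. The function names below are hypothetical.

```python
import hashlib


def sha256_digest(data: bytes) -> str:
    """Return the SHA-256 digest of the data as a hex string."""
    return hashlib.sha256(data).hexdigest()


def is_unchanged(data: bytes, recorded_digest: str) -> bool:
    """Check that the data still matches the digest recorded at ingest."""
    return sha256_digest(data) == recorded_digest


if __name__ == "__main__":
    raw = b"raw trial data"
    recorded = sha256_digest(raw)          # stored when the data is ingested
    print(is_unchanged(raw, recorded))     # data as originally deposited
    print(is_unchanged(raw + b"!", recorded))  # data altered after deposit
```

A matching digest shows only that the bytes are unchanged since the digest was recorded; it says nothing about whether the data was correct when deposited.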
Which institutions can be approached for
sustaining re-purposed collections?
So far, it seems to be mainly memory institutions
that are looking at issues of digital preservation
(Libraries, Archives, Museums)

Anyone with significant data should be thinking
about issues surrounding preservation of their
knowledge/information assets.

In the future, funding bids should consider the
costs of preserving the results of their research.


I think inevitably many organizations will end up outsourcing digital preservation.