Presentation Title

Indiana University
Data Publishing Service
Stacy Kowalczyk
April 9, 2010
Questions
• Which phases of the data life cycle are
managed by your repository?
• How do data management requirements
differ across the data life cycle?
• What systems do you use to support the
data life cycle?
• Can you generalize the mechanisms used to
migrate data between different phases of the
data life cycle?
Data Publishing Service
• A new service of the IUScholarWorks institutional
repository and the Scholarly Data Services
• Providing data management support and data access
• Data will have a persistent URL so it can be linked to
publications
• The service will combine our DSpace repository with
IU’s Scholarly Data system (formerly known as
MDSS), a system that researchers already use
• Allows discovery over the Web
• Preservation – bit level
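Bit-level preservation is commonly verified with fixity checks: a checksum is recorded at deposit and recomputed later to confirm the bits have not changed. The sketch below illustrates the idea in Python. It assumes SHA-256 and ordinary file access; the function names `sha256_of` and `is_intact` are illustrative and are not part of DSpace or MDSS.

```python
import hashlib


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large datasets never sit fully in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_intact(path, recorded_checksum):
    """True if the file's current checksum matches the one recorded at deposit."""
    return sha256_of(path) == recorded_checksum.lower()
```

A repository could run `is_intact` on a schedule and flag any file whose checksum no longer matches the recorded value.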
Current Data Lifecycle Model Implementation
Scholarly Data Service
IU ScholarWorks
Data creation
research design
data management planning
data collection (surveying, experimentation, measuring etc.)
data checking and cleaning
Preservation of data
storage of data
migration to suitable format/medium
metadata creation
↓
Data analysis
analysis
derived data creation
creation of data documentation
↓
↓
Distribution/publication of data
↓
Re-use of data
by same researcher
by other researchers
End of research
research outputs
preparing data for preservation
http://www.data-archive.ac.uk/sharing/lifecycle.asp
Scholarly Data Service
• Massive Data Storage System
• Current system for research data storage
• Installed in 1998
• Based on IBM-developed High Performance
Storage System (HPSS) software
• It offers over 2.8 petabytes of disk- and
tape-based storage, distributed between the
Indianapolis and Bloomington campuses
Distributed between IUB and IUPUI
[Diagram, labels only: Bloomington users on the IUB campus network; Indianapolis users on the IUPUI campus network; an IUB subsystem and an IUPUI subsystem, each with a research network, HPSS movers, a SAN, disk arrays, and a tape library; HPSS core servers; a TCP/IP wide area network.]
Data Publishing in IUScholarWorks
• Discovery and access of datasets and related
publications through the IUScholarWorks
Repository service
• DSpace records that are searchable, indexed,
and harvested and available at stable URLs
• DSpace records that contain DSpace bitstreams
for small datasets
• DSpace records that link via stable URLs to
large datasets in IU MDSS
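The split described above (small datasets stored as DSpace bitstreams, large ones linked by stable URL to MDSS) can be sketched as a simple routing rule. This is only an illustration: the slides do not say what size counts as "small", so the 1 GiB cutoff, the `DatasetPlacement` type, the `place_dataset` function, and the base URL used in the usage note are all assumptions.

```python
from dataclasses import dataclass

# Hypothetical cutoff: the slides do not state what counts as a "small" dataset.
SMALL_DATASET_LIMIT_BYTES = 1024 ** 3


@dataclass
class DatasetPlacement:
    storage: str    # "dspace_bitstream" or "mdss_link"
    reference: str  # bitstream name, or stable URL pointing into MDSS


def place_dataset(name, size_bytes, mdss_base_url):
    """Decide whether a dataset is stored in the record or linked from it."""
    if size_bytes <= SMALL_DATASET_LIMIT_BYTES:
        # Small enough to attach directly to the DSpace record.
        return DatasetPlacement("dspace_bitstream", name)
    # Too large: the record carries only a stable URL to the copy in MDSS.
    url = f"{mdss_base_url.rstrip('/')}/{name}"
    return DatasetPlacement("mdss_link", url)
```

For example, `place_dataset("survey.csv", 20_000, "https://example.org/mdss")` would yield a bitstream placement, while a multi-terabyte file would yield a link.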
IUScholarWorks Data: Linking to MDSS and delivery via HTTP
Item record with URLs of datasets in MDSS
HTTP Server
hpssfs filesystem
MDSS web server
IU MDSS
Data Publishing in IUScholarWorks
• Facilitating the submission process for
both the researcher and collection
manager
• We facilitate the process for submitters via
the DSpace Configurable Submission
system
• We facilitate the data collection manager’s
process via steps in the DSpace workflow
system
IUScholarWorks Data: Item submission user interface
Phase 2, automated workflow
DSpace Configurable Submission System
Instructions and preparation
Describe item metadata form(s)
MDSS and dataset info/form
File upload step
Review step
Non-interactive processing steps
Update metadata
IU MDSS
Initiate MDSS actions (move datasets, etc.)
Query MDSS technical metadata (checksum, etc.)
Finalize/Accept
License
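The non-interactive processing steps in this workflow (initiate MDSS actions, query technical metadata, update the item metadata) might be chained as in the sketch below. The ordering is one plausible reading of the diagram, and `run_processing_steps`, `move_dataset`, and `query_technical_metadata` are hypothetical names: the two callables stand in for whatever MDSS integration the service actually uses.

```python
def run_processing_steps(item, move_dataset, query_technical_metadata):
    """Run the non-interactive steps in order and return an updated item.

    item is a dict with a "dataset" key and an optional "metadata" dict.
    move_dataset and query_technical_metadata are caller-supplied callables
    (hypothetical stand-ins for the real MDSS operations).
    """
    # 1. Initiate the MDSS action, e.g. move the dataset into place.
    location = move_dataset(item["dataset"])
    # 2. Query MDSS for technical metadata such as a checksum.
    technical = query_technical_metadata(location)
    # 3. Update the item's metadata without mutating the original item.
    updated = dict(item)
    updated["metadata"] = {
        **item.get("metadata", {}),
        "location": location,
        **technical,
    }
    return updated
```

Passing the MDSS operations in as arguments keeps the sketch testable with stubs and avoids assuming any particular MDSS client interface.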
Planning for a More Curated Life
Cycle Model
http://libraries.mit.edu/guides/subjects/data-management/cycle.html
July 17, 2015
Active and Social Curation
• Engage researchers during projects, not at
the end
• Use immediate benefits to drive automatic
capture and 'volunteering' of metadata
• Reduce costs by re-engineering curation
processes to leverage this rich metadata
and volunteered effort
Active Curation
Data Acquisition, Analysis and Simulation
Active Data Systems
Curation Boundary
Automated Curation
Workflow/Rule Engine
Metadata Management
Operates on Metadata, Content Objects and Trigger Events
DDI3, METS, PREMIS, MODS, DC, SensorML, OGC, …
Appraisal and Selection
OAIS Repository Federation
Ingest scripts: fixity, integrity, authentication, transformation
Scholarly Communication
Ingest, AIPs
Compound Objects - OAI-ORE
Trusted Digital Repository Federation (OAIS compliant)
Preservation Actions
Dissemination Packages
Wide-Area File System
Search, Browse, Annotation, Visualization Tools
Use, Reuse, Repurposing Tools
Contributor
User
Migration and Emulation Tools
Access Mechanisms and E-Scholarship Services