Transcript SIP

General Model of E-ARK Services
DLM Forum Members’ Meeting
10-11 June 2014, Athens
Istvan Alföldi
National Archives of Hungary
Agenda
E-ARK General Model
• Conceptual framework
• Used methodology
• Results (not final)
General Model of E-ARK Services
“The scope of the E-ARK service is to provide a reference
implementation, which integrates these currently non-interoperable
tools into a replicable and scalable, common
seamless workflow, allowing data owners and repositories
to flexibly select and use the components most relevant for
their specific situations. To achieve this, a set of common
interfaces and information package formats will be defined
by the E-ARK project and implemented using these tools.”
(E-ARK DoW Part B Finalised version 2.0 - B1.3)
OAIS Reference Model
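[Diagram: OAIS functional model with an added Pre-Ingest phase]
• Functional entities: Pre-Ingest, Ingest, Archival Storage, Data and Content Management, Preservation Planning, Administration, Access
• Actors: Producer, Consumer
• Other elements: e-Document Archive
• Information packages: SIP (into Ingest), AIP (into and out of Archival Storage), DIP (from Access to the Consumer)
• Consumer interactions with Access: queries, result sets, orders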
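The package flow named on the OAIS slide (a SIP is submitted to Ingest, an AIP is kept in Archival Storage, a DIP is delivered to the Consumer) can be illustrated with a minimal sketch. The names `Package`, `ingest` and `access` are hypothetical and chosen only for this illustration; they are not taken from OAIS or from any E-ARK tool.

```python
from dataclasses import dataclass, field


@dataclass
class Package:
    """Hypothetical stand-in for an OAIS information package."""
    kind: str                     # "SIP", "AIP" or "DIP"
    content: tuple                # names of the files carried by the package
    metadata: dict = field(default_factory=dict)


def ingest(sip: Package) -> Package:
    """Turn a submitted SIP into an AIP (assumed: same content, extra metadata)."""
    if sip.kind != "SIP":
        raise ValueError("ingest expects a SIP")
    return Package("AIP", sip.content, {**sip.metadata, "ingested": True})


def access(aip: Package, wanted: set) -> Package:
    """Derive a DIP that holds only the files the consumer ordered."""
    if aip.kind != "AIP":
        raise ValueError("access expects an AIP")
    selected = tuple(name for name in aip.content if name in wanted)
    return Package("DIP", selected, dict(aip.metadata))


# Usage: Producer -> Ingest -> Archival Storage -> Access -> Consumer
sip = Package("SIP", ("report.pdf", "table.xml"), {"producer": "agency"})
aip = ingest(sip)
dip = access(aip, {"report.pdf"})
print(dip.kind, dip.content)
```

The sketch only models the change of package type along the chain; real conversions would also restructure content and metadata.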
Conceptual Framework
Pre-Ingest
Ingest
Archival Storage
Preservation Planning
Administration
Data Management
Access
Conceptual Framework – Standards and Format Definitions
Pre-Ingest
Archival Storage
Preservation Planning
Administration
Ingest
Access
Data Management
Standards and Format Definitions
• OAIS, METS, ISAD(G), MoReq2010, PREMIS, SIARD, CMIS, BagIt, InterPARES 2, EAD, EAC-CPF, etc.
• Information package formats: E-ARK SIP, E-ARK AIP, E-ARK DIP
Conceptual Framework – Tools and Interfaces
Pre-Ingest
Archival Storage
Preservation Planning
Administration
Ingest
Access
Data Management
Tools and Interfaces (already established)
• Source systems: DMS, RMS, Archival Repository, Content and Records Management System (Alfresco, CMIS interface), DB
• E-ARK SIP: ET, DBExport, UAM, RODA-in
• E-ARK AIP: ESSArch Preservation Platform (EPP), Safety Deposit Box (SDB), RODA, The National Danish Bit Repository (NDBR)
• Scalable Computation / Staging Area: Lily, Apache Hadoop
• E-ARK DIP: SOFIA
Conceptual Framework – Tools and Interfaces
Pre-Ingest
Ingest
Archival Storage
Preservation Planning
Administration
Data Management
Access
Tools and Interfaces (to be developed)
• SIP Creation Tools
• SIP-AIP Conversion Component
• AIP-DIP Conversion Component
• Search, Access and Display Interface
• Data Mining Showcase
[The diagram also repeats the already established tools and interfaces: DMS, RMS, Archival Repository, Content and Records Management System (Alfresco, CMIS interface), DB; ET, DBExport, UAM, RODA-in; EPP, SDB, RODA, NDBR; Lily, Apache Hadoop; SOFIA]
Conceptual Framework – Processes and Use Cases
Process and Use Case Diagrams
[Diagram: example use cases GM-PI-6 "Create SIP" and GM-PI-7 "Start transfer to archive"; participant role: Producer; tools: SIP Creator (WP3), UAM, ESSArch, RODA; output: SIP]
Conceptual Framework – Processes and Use Cases
Process Diagrams
Conceptual Framework – Processes and Use Cases
Business Process Modeling Notation (BPMN)
Process Flow
Participant role
Activity
Conceptual Framework – Processes and Use Cases
Business Process Modeling Notation (BPMN)
Collapsed view
Conceptual Framework – Processes and Use Cases
Standard Process Diagrams (BPMN)
Detail view
Conceptual Framework – Processes and Use Cases
Use Case Diagrams
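[Diagram: example use cases GM-PI-6 "Create SIP" and GM-PI-7 "Start transfer to archive", annotated with the diagram elements]
• Participant role: Producer
• Activity: Create SIP; Start transfer to archive
• Used tool: SIP Creator (WP3), UAM, ESSArch, RODA
• Input / Output: SIP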
Conceptual Framework – Processes and Use Cases
[Diagram: example use cases GM-PI-6 "Create SIP" and GM-PI-7 "Start transfer to archive"; Producer; SIP Creator (WP3), UAM, ESSArch, RODA; SIP]
• Who does what?
• In what order?
• Using what kind of tool?
• Required input?
• Produced output?
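The five questions a use case diagram answers map naturally onto a small record type. The following is a minimal sketch, assuming a hypothetical `UseCase` class; the assignment of all four tools to GM-PI-6 and of the SIP as input to GM-PI-7 is read off the flattened diagram and is an assumption, not a statement of the General Model.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class UseCase:
    """Hypothetical record for one use case of the General Model."""
    identifier: str                                     # e.g. "GM-PI-6"
    activity: str                                       # what is done
    participant: str                                    # who does it
    tools: List[str] = field(default_factory=list)      # using what kind of tool
    inputs: List[str] = field(default_factory=list)     # required input
    outputs: List[str] = field(default_factory=list)    # produced output

    def summary(self) -> str:
        """One line that answers the questions from the slide."""
        tools = ", ".join(self.tools) or "none recorded"
        inputs = ", ".join(self.inputs) or "none recorded"
        outputs = ", ".join(self.outputs) or "none recorded"
        return (f"{self.identifier}: {self.participant} performs "
                f"'{self.activity}' (tools: {tools}; input: {inputs}; "
                f"output: {outputs})")


# List order stands for "in what order?"
workflow = [
    UseCase("GM-PI-6", "Create SIP", "Producer",
            tools=["SIP Creator (WP3)", "UAM", "ESSArch", "RODA"],  # assumed mapping
            outputs=["SIP"]),
    UseCase("GM-PI-7", "Start transfer to archive", "Producer",
            inputs=["SIP"]),                                        # assumed input
]

for use_case in workflow:
    print(use_case.summary())
```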
Conceptual Framework
Pre-Ingest
Ingest
Archival Storage
Preservation Planning
Administration
Data Management
Access
Recommended Practices
• To be added according to the experiences of the Pilots
Used Methodology and Schedule
Surveys
• Questionnaire 1 (11 February 2014)
  • Infrastructure overview
  • Process overview
  • Data overview
• Questionnaire 2 (3 March 2014)
  • Detailed process information
  • Detailed tool description
  • General high-level requirements
• Input data profile survey (30 May 2014)
Draft model and Discussion
• Draft model sent out (26 May 2014)
• Comments by e-mail
• E-ARK technical meeting (12-13 June 2014, Athens)
Final model and approval (July 2014)
Results – Processes and Use cases
Pre-Ingest
Results – Processes and Use cases
Pre-Ingest
[Use case diagram; participant role: Producer]
• GM-PI-1: Define SIP content
• GM-PI-2: Select data (with rules)
• GM-PI-3: Select data (manual)
• GM-PI-4: Extract data from DB
• GM-PI-5: Extract data from DMS/RMS
• GM-PI-6: Create SIP
• GM-PI-7: Start transfer to archive
• Tools and systems shown: DBExport, DB, ESSArch, Alfresco, DMS/RMS, RODA, SIP Creator (WP3), UAM
• Output: SIP
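The Pre-Ingest use cases can be written down as a lookup table and combined into one path through the phase. This is a sketch under an assumption: that GM-PI-2 and GM-PI-3 are alternative ways of selecting data and that GM-PI-4 and GM-PI-5 are alternative extraction sources. The slide does not state this branching explicitly, and the function name `pre_ingest_path` is hypothetical.

```python
# Use case identifiers and names as listed on the Pre-Ingest slide.
PRE_INGEST = {
    "GM-PI-1": "Define SIP content",
    "GM-PI-2": "Select data (with rules)",
    "GM-PI-3": "Select data (manual)",
    "GM-PI-4": "Extract data from DB",
    "GM-PI-5": "Extract data from DMS/RMS",
    "GM-PI-6": "Create SIP",
    "GM-PI-7": "Start transfer to archive",
}


def pre_ingest_path(selection: str, source: str) -> list:
    """Return the ordered use case identifiers for one producer scenario.

    Assumed branching: selection is "rules" or "manual",
    source is "DB" or "DMS/RMS". Unknown values raise KeyError.
    """
    select_step = {"rules": "GM-PI-2", "manual": "GM-PI-3"}[selection]
    extract_step = {"DB": "GM-PI-4", "DMS/RMS": "GM-PI-5"}[source]
    return ["GM-PI-1", select_step, extract_step, "GM-PI-6", "GM-PI-7"]


# Usage: a producer exporting a relational database with selection rules.
for identifier in pre_ingest_path("rules", "DB"):
    print(identifier, PRE_INGEST[identifier])
```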
Results – Processes and Use cases
Ingest
Results – Processes and Use cases
Ingest
[Use case diagram; participant role: Technical staff]
• GM-I-1: Upload SIP
• GM-I-2: Start AIP generation workflow
• Tools: E-ARK SIP → AIP tool, SDB, RODA, ESSArch PP
• Input: SIP (E-ARK SIP)
• Output: AIP (E-ARK AIP)
Results – Processes and Use cases
Data Management
Results – Processes and Use cases
Data Management
[Use case diagram; participant role: Archivist]
• GM-DM-1: Select records to export
• GM-DM-2: Manipulate records
• GM-DM-3: Start export workflow
• Other labels in the diagram: Archivee, CMIS, Lily
Results – Processes and Use cases
Access
Results – Processes and Use cases
Access
[Use case diagram; participant role: End user (Consumer)]
• GM-A-1: Search and select data
• GM-A-2: Request data
• GM-A-3: Request access from presentation platform
• GM-A-8: Download DIP
• Input: AIP (E-ARK AIP)
• Other labels in the diagram: Presentation / Data Mining Platform, Archivee
General Model – Work package overview
Work Package - OAIS Process cross reference table
Columns (E-ARK General Model): Pre-Ingest, Ingest, Archival Storage, Preservation, Data Management, Access
Rows (Work Package):
• WP 1: Project Co-Ordination
• WP 2: Use Cases and Pilots
• WP 3: Transfer of Records to Archives
• WP 4: Archival Records Preservation
• WP 5: Archival Records Access Services
• WP 6: Archival Storage, Services, and Integration
• WP 7: Evaluation & Assessment
• WP 8: Dissemination and Exploitation
[Table cell markings not preserved]
General Model – Pilot overview
Pilot - OAIS Process cross reference table
Columns (E-ARK General Model): Pre-Ingest, Ingest, Archival Storage, Preservation, Data Management, Access
Rows (Full-scale Pilot):
• Pilot 1 (Danish National Archives): SIP creation of relational databases
• Pilot 2 (National Archives of Norway): SIP creation and ingest of records
• Pilot 3 (National Archives of Estonia): Ingest from government agencies
• Pilot 4 (National Archives of Estonia, Estonian Business Archives): Business archives
• Pilot 5 (National Archives of Slovenia): Preservation and access to records with geodata
• Pilot 6 (KEEP SOLUTIONS): Seamless integration between a live document management system and a long-term digital archiving and preservation service
• Pilot 7 (National Archives of Hungary): Access to databases
Legend: Focus of the pilot; Elements also used/tried within the pilot
[Table cell markings not preserved]
Cross Reference Tables – Use case view
Use Case View
Columns: Use Case (E-ARK General Model); Pilots: Pilot 1 (DNA), Pilot 2 (NAN), Pilot 3 (NAE), Pilot 4 (EBA), Pilot 5 (NAS), Pilot 6 (KEEP), Pilot 7 (NAH); Work Package: WP3, WP4, WP5, WP6; Tools
Rows (Pre-Ingest):
• GM-PI-1: Define SIP content
• GM-PI-2: Select data (with rules)
• GM-PI-3: Select data (manual)
• GM-PI-4: Extract data from DB
• GM-PI-5: Extract data from DMS/RMS
• GM-PI-6: Create SIP
• GM-PI-7: Start transfer to archive
• GM-PI-8: SIP reception
• GM-PI-9: Validate SIP
• GM-PI-10: Enhance SIP
• GM-PI-11: Create fond(s)
Tools column entries, in order of appearance:
• DBExport tool
• DBExport tool
• ESSArch Tools
• ESSArch Tools, Noark, Alfresco, RODA
• DBExport tool, ESSArch Tools, SIP creation tools, RODA-in, UAM
• SIP to AIP conversion tools
• SDB, EPP, RODA, AIS
[The "x" and "?" cell markings of the table cannot be assigned to cells in this transcript]
Cross Reference Tables – Tools view
Tools View
Columns: Tool (E-ARK General Model); Description; Pilots: Pilot 1 (DNA), Pilot 2 (NAN), Pilot 3 (NAE), Pilot 4 (EBA), Pilot 5 (NAS), Pilot 6 (KEEP), Pilot 7 (NAH); Work Package: WP3, WP4, WP5, WP6
• Alfresco: Alfresco is the most used open source content and records management platform worldwide. While its main use is in content creation, management and access, the platform will also be used to pilot the E-ARK records export recommendations and to provide crucial source data for the E-ARK service.
• ESSArch Tools (ET): ET is a stand-alone application that allows producers and preservation organizations to prepare, create, deliver and receive SIPs.
• DBExport Tool: DBExport is used to create SIP packages in SIARD format based on the content of relational database systems and to provide a first level of integrity and technical checks.
• E-ARK SIP creation tools: SIP creation tools (to be developed in WP3) implement the final Pan-European SIP format based on the Alfresco platform, the ESSArch Tools (ET) suite and the DBExport tool.
• Universal Archiving Module (UAM): UAM allows data owners to prepare the transfer themselves: import data from records and content management systems; rearrange, classify and further describe the contents of the transfer; validate the transfer according to the rules established by the National Archives of Estonia; and finally create SIP packages to be transferred to the digital repository.
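The "first level of integrity checks" mentioned for SIP creation can be illustrated with a checksum manifest. This is a generic sketch, not the behaviour of DBExport or any other tool named above, and it does not produce SIARD or the E-ARK SIP format; the function names `build_manifest` and `verify_manifest` are hypothetical.

```python
import hashlib
from pathlib import Path


def build_manifest(package_dir) -> dict:
    """Map each file in the package (relative path) to its SHA-256 digest."""
    root = Path(package_dir)
    manifest = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            manifest[path.relative_to(root).as_posix()] = digest
    return manifest


def verify_manifest(package_dir, manifest: dict) -> list:
    """Return the sorted names of files that are missing, added or changed."""
    current = build_manifest(package_dir)
    names = set(manifest) | set(current)
    return sorted(n for n in names if manifest.get(n) != current.get(n))
```

A producer could build the manifest before transfer and the archive could call `verify_manifest` on reception; an empty list means the package content is unchanged.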
Thank you!