PPTX - Open Access Repository - Sci-GaIA

gLibrary 2.0 REST Platform
Antonio S. Calanducci – University of Catania - Italy ([email protected])
e-Research Summer Hackfest – Catania (Italy)
This project has received funding from the European Union’s Horizon 2020
research and innovation programme under grant agreement n° 654237
Outline
• Platform presentation & history
• Features
• Architecture
• Authentication & Authorization
• Deployment
• Under the hood
• How to use gLibrary
• gLibrary 1.0 vs gLibrary 2.0
• Reference
• Live demo
Introduction to gLibrary 2.0
• A service that provides access to existing data collections or lets you create new ones
• Exposes access to data collections via REST APIs and JSON
• RESTifies existing databases
• Supports both relational databases with a schema (MySQL, PostgreSQL, etc.) and non-relational, schema-less databases (MongoDB)
• Creation and management of new repositories and collections (i.e. of the REST APIs for them) is done at runtime via the gLibrary REST APIs
    • We can say that gLibrary provides REST APIs to create REST APIs :)
Terminology
• Repository: a way to group data collections together. These collections can be of different types, heterogeneous, or hosted on different remote servers. Another name for a repository is a project (or a database, in the RDBMS world). Generally a user is the owner/manager of a repository
• Collection: a set of documents or records. A collection can have a fixed schema (like a database table) or be schema-less (a JSON document)
• Item: a record or document. It is a set of key-value pairs in JSON format
• Replica (or Attachment): each item can optionally have an associated file, stored on one or more distributed storage servers
Examples:
• Repositories: “sci-gaia”, “my_newproject”, “unict”, “demo”
    e.g. /v2/repos/my_newproject
• Collections: “patients”, “activities”, “presentations”, “manuscripts”, “music”, “videos”, “invoices”, “running_jobs”, “staged_files”, etc.
    e.g. /v2/repos/my_newproject/videos
         /v2/repos/my_newproject/invoices
• Item: the details of a given “invoice”, “song” or “job”
    e.g. /v2/repos/demo/music/32
• Replica: the “pdf” file of an “invoice”, the “mp3” file of a “song”, the “txt” output file of a “job”
    e.g. /v2/repos/demo/invoices/1432/_replicas/i2jgi34jg34
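The path layout in the examples above can be captured in a small helper. This is only a sketch: the function name `resource_path` and its parameters are illustrative and not part of gLibrary; only the `/v2/repos` prefix and the `_replicas` segment are taken from the examples.

```python
from urllib.parse import quote

API_ROOT = "/v2/repos"  # path prefix used in the examples above


def resource_path(repo, collection=None, item_id=None, replica_id=None):
    """Build the URI path of a repository, collection, item or replica.

    Hypothetical helper: it only mirrors the nesting shown on the slide
    (repository > collection > item > replica).
    """
    if replica_id is not None and item_id is None:
        raise ValueError("a replica belongs to an item")
    if item_id is not None and collection is None:
        raise ValueError("an item belongs to a collection")
    parts = [API_ROOT, quote(repo, safe="")]
    if collection is not None:
        parts.append(quote(collection, safe=""))
    if item_id is not None:
        parts.append(quote(str(item_id), safe=""))
    if replica_id is not None:
        parts.extend(["_replicas", quote(str(replica_id), safe="")])
    return "/".join(parts)


print(resource_path("my_newproject"))
print(resource_path("demo", "music", 32))
print(resource_path("demo", "invoices", 1432, "i2jgi34jg34"))
```

The three calls reproduce the repository, item and replica paths listed above.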
gLibrary REST APIs to manage REST APIs over data sets
• We follow REST principles to manage resources, using HTTP verbs and proper URI paths:
    • GET to retrieve lists of collections, items and replicas
    • POST to create new repositories, collections, items and replicas
    • PUT to edit/update items, collections and replicas
    • DELETE to delete repositories, collections, items and replicas
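As a rough illustration of the verb mapping above, the following sketch builds (but does not send) HTTP requests with Python's standard library. The helper `build_request`, the action names and the example payload field `title` are assumptions made for illustration; the slides do not define request bodies. The Authorization header that gLibrary requires is left out to keep the example short.

```python
import json
from urllib.request import Request

BASE_URL = "http://glibrary.ct.infn.it:3500"  # server named in this deck

# HTTP verb for each kind of operation, as listed above.
VERBS = {"list": "GET", "create": "POST", "update": "PUT", "delete": "DELETE"}


def build_request(action, path, payload=None):
    """Return an unsent Request for the given action on a resource path."""
    data = None
    headers = {}
    if payload is not None:
        # JSON is the exchange format mentioned in the introduction.
        data = json.dumps(payload).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return Request(BASE_URL + path, data=data, headers=headers,
                   method=VERBS[action])


# Hypothetical payload: the field name "title" is illustrative only.
req = build_request("update", "/v2/repos/demo/music/32", {"title": "New title"})
print(req.get_method(), req.full_url)
```

Passing such a request to `urllib.request.urlopen` would send it; that step is omitted because it needs network access and a valid token.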
Features
• Creation of local datasets (on the gLibrary server) or remote ones (on MySQL, PostgreSQL, MongoDB)
• Support for both schema-less and fixed-schema collections
• Creation of collections from data in existing remote databases (queries are forwarded to the remote host)
• Creation of relations between collections (even of different types or belonging to different, remote databases)
• Powerful query syntax on the URL that offers limit, skip, where, like, logical operators, regexp, comparison and ordering
• User creation and login
• Permissions set per repository and per collection (Access Control Lists)
• Storage of assets on Grid (Disk Pool Manager) or Cloud (OpenStack Swift). Direct download/upload from the storage servers (no caching on the gLibrary server)
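The deck does not spell out the query syntax. Since gLibrary is built on LoopBack, the sketch below assumes LoopBack's `filter[...]` query-string convention (`filter[where]`, `filter[limit]`, `filter[skip]`, `filter[order]`); check the official gLibrary documentation before relying on it. The helper `query_url` and the field names `genre` and `year` are hypothetical.

```python
from urllib.parse import urlencode, parse_qsl


def query_url(path, where=None, limit=None, skip=None, order=None):
    """Append a LoopBack-style filter to a collection path (assumed syntax)."""
    params = []
    for field, condition in (where or {}).items():
        if isinstance(condition, dict):
            # Operator form, e.g. {"year": {"gt": 2000}}
            for op, value in condition.items():
                params.append((f"filter[where][{field}][{op}]", value))
        else:
            # Plain equality, e.g. {"genre": "jazz"}
            params.append((f"filter[where][{field}]", condition))
    if order is not None:
        params.append(("filter[order]", order))
    if limit is not None:
        params.append(("filter[limit]", limit))
    if skip is not None:
        params.append(("filter[skip]", skip))
    return path + ("?" + urlencode(params) if params else "")


url = query_url(
    "/v2/repos/demo/music",
    where={"genre": "jazz", "year": {"gt": 2000}},
    order="year DESC",
    limit=10,
    skip=20,
)
print(url)
```

`urlencode` percent-encodes the square brackets and spaces, which a server decodes back to the bracketed parameter names.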
gLibrary architecture
[Architecture diagram. Components shown:]
• Clients: browser, mobile apps
• glibrary.ct.infn.it server (local database / MongoDB)
• e-Infrastructure Resources
• Grid Storage (DPM): infn-se-03.ct.pi2s2.it
• Cloud Storage (Swift): cloud.recas.ba.infn.it
• Remote Databases (MySQL, PostgreSQL, MongoDB) running on VM
• Certificate token server
• User TrackingDB
Authentication & Authorization
• Authentication:
    • gLibrary provides APIs to create and sign in users.
    • Each call to the gLibrary REST APIs has to be authenticated. A valid, unexpired token has to be passed in the Authorization HTTP header of every request, e.g.:
      curl -H "Authorization: Fsw6tUVzNwp4ftzK4cb3WxwKkvMZ" http://glibrary.ct.infn.it:3500/v2/repos/
• Authorization:
    • Access Control Lists, with permissions (reading, creation, editing) for repositories and collections
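The same authenticated call can be prepared from a script. The sketch below uses Python's standard library and assumes the token has already been obtained through the gLibrary login API, which this deck does not detail; the helper name `authenticated_request` is illustrative.

```python
from urllib.request import Request


def authenticated_request(url, token, method="GET"):
    """Return an unsent Request carrying the token in the Authorization header."""
    if not token:
        raise ValueError("a valid token is required for every call")
    return Request(url, headers={"Authorization": token}, method=method)


# URL and token are the sample values from the curl example above.
req = authenticated_request(
    "http://glibrary.ct.infn.it:3500/v2/repos/",
    "Fsw6tUVzNwp4ftzK4cb3WxwKkvMZ",
)
print(req.get_method(), req.full_url)
print(req.get_header("Authorization"))
```

Sending it with `urllib.request.urlopen(req)` requires network access and a token that has not expired, so that step is left out here.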
Deployment
• The gLibrary server can be installed anywhere: on Windows, macOS or any Linux distribution
• Requirements:
    • an installation of Node.js (https://nodejs.org)
    • a local or remote MongoDB (https://www.mongodb.com)
• Install it from the source available at:
    • https://github.com/csgf/glibrary
    • (note: use the branch testv2.1)
    • installation instructions are provided at the link above
• or create an account on our server at
    • http://glibrary.ct.infn.it:3500
How to use gLibrary
• From the command line:
    • use cURL or Wget to integrate gLibrary into your own scripts (e.g. running on a VM or a Grid WN)
• From RIA web apps, using XMLHttpRequest or any wrapper on top of it (e.g. jQuery $.ajax())
• From any portal/CMS (e.g. Liferay, WordPress, Joomla, Drupal, etc.), as long as an HTTP client is available
• From mobile apps (Android, iOS and Windows Phone provide HTTP clients in their SDKs)
• From desktop applications
Under the hood
• gLibrary 2.0 is written in JavaScript and runs on Node.js
• It is based on the open source LoopBack framework from IBM:
    • http://loopback.io
• A MongoDB database is used to store its configuration settings for repositories, collections and replicas
• It uses Juggler (https://github.com/strongloop/loopback-datasource-juggler) as its ORM. Juggler has a modular architecture that allows connecting alternative datasources (SQLite, Oracle, SQL Server, Redis, DynamoDB, CouchDB, Firebird, etc.)
History (gLibrary 1.0 vs. gLibrary 2.0)
• The initial goal of gLibrary 1.0 was to be a simple, easy-to-use platform to store, organize, browse and retrieve digital assets in repositories on grid infrastructures
    • the “g” stands for Grid
    • built with Python/PHP, with AMGA as the metadata service
    • collections had a fixed schema; grid storage only
    • the APIs were not so “RESTy”
• gLibrary 2.0 is an evolution and has been rewritten from scratch
    • it’s a different product that can do anything gLibrary 1.0 can do, plus:
        • support for many storage back-ends
        • on-demand repository and collection creation
References
• Official documentation:
    • https://csgf.readthedocs.io/en/latest/glibrary/docs/glibrary2.html
• Source code and installation instructions:
    • https://github.com/csgf/glibrary
• Contacts:
    • [email protected] - [email protected][email protected] (Lead developer)
Live Demo
Summary and conclusions
• gLibrary 2.0 is an API platform
    • It provides REST APIs to create repositories, collections, items and replicas
    • It can expose datasets from local and remote databases
    • It can be easily integrated into any kind of application using HTTP requests
    • It supports relational and non-relational databases, and both Grid and Cloud storage servers
Thank you!
sci-gaia.eu
[email protected]