Database - Geodise

Data Management in Geodise
Jasmin Wason, Zhuoan Jiao and Marc Molinari
Engineering design and optimisation is a computationally intensive process where
data may be generated at different locations with different characteristics. Data is
traditionally stored in flat files with little descriptive metadata provided by the file
system. Our focus is on providing data management by leveraging existing
database tools that are not commonly used in engineering and making them
accessible to users of the system.
The main objectives are to provide:
  A data management service
    Securely store and retrieve data files and data structures to and from a distributed repository.
    Add technical and application-specific metadata so that data is easier to search for, locate and share.
  Metadata management services
    Web services provide API access to metadata held in relational and XML databases.
    Related data can be aggregated into groups.
  A familiar interface for engineers
    Engineers work with functions and variables rather than the underlying XML, SOAP, SQL, XPath, etc.
© Geodise Project, University of Southampton, 2003.
http://www.geodise.org/
Geodise Database Toolbox
Storage service
Allows applications to archive files, sent over GridFTP, to file systems, and to store
data structures as XML in databases. A location service maps logical data
identities to physical storage locations.
Example:
Archive data:
>> fileID = gd_archive('C:\input.dat');
Retrieve data:
>> gd_retrieve(fileID, 'E:\tmp')
ans = E:\tmp\input.dat
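The role of the location service can be illustrated with a minimal Python sketch. This is not the Geodise implementation: the class `LocationService`, its method names, and the ID format are hypothetical, and a local file copy stands in for the GridFTP transfer.

```python
import os
import shutil
import tempfile
import uuid


class LocationService:
    """Hypothetical map from logical file IDs to physical storage paths."""

    def __init__(self, store_dir):
        self.store_dir = store_dir
        self._locations = {}  # logical ID -> (physical path, original file name)

    def archive(self, local_path):
        original_name = os.path.basename(local_path)
        # Assumed ID format: sanitised file name plus a random UUID.
        file_id = f"{original_name.replace('.', '_')}_{uuid.uuid4()}"
        physical_path = os.path.join(self.store_dir, file_id)
        shutil.copyfile(local_path, physical_path)  # stand-in for a GridFTP upload
        self._locations[file_id] = (physical_path, original_name)
        return file_id

    def retrieve(self, file_id, target_dir):
        # Resolve the logical ID, then copy the file back under its original name.
        physical_path, original_name = self._locations[file_id]
        target_path = os.path.join(target_dir, original_name)
        shutil.copyfile(physical_path, target_path)
        return target_path


# Set up a scratch area with a "repository" and a retrieval directory.
work_dir = tempfile.mkdtemp()
store_dir = os.path.join(work_dir, "store")
target_dir = os.path.join(work_dir, "tmp")
os.makedirs(store_dir)
os.makedirs(target_dir)

source_path = os.path.join(work_dir, "input.dat")
with open(source_path, "w") as handle:
    handle.write("mesh data")

service = LocationService(store_dir)
file_id = service.archive(source_path)
restored_path = service.retrieve(file_id, target_dir)
print(file_id)
print(restored_path)
```

The caller only ever holds the logical ID, so the physical location could change without affecting client code.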
Metadata service
The data can be stored with additional descriptive information detailing technical
characteristics (e.g. format, size, date), ownership, and user-defined
application-domain-specific metadata.
Example:
Define metadata and archive file:
>> m.grids = 1;
>> m.turb_model = 'sa';
>> fileID = gd_archive('C:\input.dat', m);
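As an illustration of how technical and user-defined metadata might sit side by side in one record, here is a small Python sketch. The function `build_metadata` and the field names under `standard` other than `userID` are assumptions made for this example, not the toolbox's actual schema.

```python
import os
import tempfile
from datetime import datetime, timezone


def build_metadata(path, user_id, user_metadata):
    """Combine technical metadata read from the file with user-defined fields."""
    if "standard" in user_metadata:
        raise ValueError("'standard' is reserved for system-generated metadata")
    stat = os.stat(path)
    standard = {
        "userID": user_id,                                # ownership
        "format": os.path.splitext(path)[1].lstrip("."),  # file extension
        "size": stat.st_size,                             # size in bytes
        "date": datetime.fromtimestamp(stat.st_mtime, tz=timezone.utc).isoformat(),
    }
    # User-defined fields are kept at the top level, next to the standard block.
    return {"standard": standard, **user_metadata}


# Create a small example file and describe it.
example_dir = tempfile.mkdtemp()
example_path = os.path.join(example_dir, "input.dat")
with open(example_path, "w") as handle:
    handle.write("mesh data")

record = build_metadata(example_path, "me", {"grids": 1, "turb_model": "sa"})
print(record)
```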
Query service
Querying the metadata database with the gd_query function or the query GUI
helps users locate the data they need. Users only receive results for data they are
authorised to access.
Example:
>> r = gd_query('standard.userID = me & grids < 2');
>> gd_display(r)
standard.userID = me
standard.ID = input_dat_8a184899-ad2d-4055-aad9-a1
grids = 1
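The combination of metadata conditions and authorisation filtering can be sketched in Python as follows. This is a simplified stand-in, not how gd_query is implemented: conditions are passed as tuples instead of being parsed from a query string, and the `readers` field and the function names are hypothetical.

```python
import operator

# Comparison operators supported by this sketch.
OPS = {"=": operator.eq, "<": operator.lt, ">": operator.gt}


def get_field(record, dotted_name):
    """Look up a possibly nested field such as 'standard.userID'."""
    value = record
    for part in dotted_name.split("."):
        value = value[part]
    return value


def matches(record, conditions):
    """Return True if the record satisfies every (field, op, value) condition."""
    for field, op, expected in conditions:
        try:
            actual = get_field(record, field)
        except (KeyError, TypeError):
            return False  # a record without the field cannot match
        if not OPS[op](actual, expected):
            return False
    return True


def query(records, conditions, user):
    """Return matching records, restricted to those the user may read."""
    return [
        rec for rec in records
        if user in rec["readers"] and matches(rec, conditions)
    ]


records = [
    {"standard": {"userID": "me", "ID": "a"}, "grids": 1, "readers": {"me"}},
    {"standard": {"userID": "me", "ID": "b"}, "grids": 4, "readers": {"me"}},
    {"standard": {"userID": "other", "ID": "c"}, "grids": 1, "readers": {"other"}},
]

mine = query(records, [("standard.userID", "=", "me"), ("grids", "<", 2)], "me")
small = query(records, [("grids", "<", 2)], "me")
print([rec["standard"]["ID"] for rec in mine])
print([rec["standard"]["ID"] for rec in small])
```

In the second query, record `c` satisfies `grids < 2` but is not returned, because the requesting user is not among its readers.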
Authorisation service
An authenticated user may grant other users access rights to their data; this
information is stored in the authorisation database.
Example:
>> m.grids = 1;
>> m.access.users = {'userA', 'userB'};
>> m.access.groups = {'groupC'};
>> fileID = gd_archive('C:\input.dat', m);
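A minimal Python sketch of the kind of check such access rights imply is given below. The function `can_read`, the `owner` field, and the group membership table are hypothetical; the source does not describe how the authorisation database is structured.

```python
def can_read(user, record, group_members):
    """Decide whether a user may read a record (hypothetical rule set)."""
    if user == record["owner"]:
        return True  # owners can always read their own data
    access = record.get("access", {})
    if user in access.get("users", []):
        return True  # access granted to the user directly
    # Otherwise, access is granted if the user belongs to any permitted group.
    return any(
        user in group_members.get(group, ())
        for group in access.get("groups", [])
    )


# Assumed group membership table; in practice this would come from a database.
group_members = {"groupC": {"userD"}}

record = {
    "owner": "me",
    "grids": 1,
    "access": {"users": ["userA", "userB"], "groups": ["groupC"]},
}

for user in ["me", "userA", "userD", "userE"]:
    print(user, can_read(user, record, group_members))
```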
Data Management Implementation
To make file and metadata management services more usable for engineers, we have implemented
a Matlab toolbox for archiving, querying and retrieving data to and from a Geodise repository.
[Figure: architecture diagram. Client side: the Geodise Database Toolbox (Matlab functions), Java clients (CoG, Apache SOAP) and .NET. Grid side: a Globus server reached over GridFTP; a location service with its location database; an authorisation service with its authorisation database; and metadata archive & query services with their metadata database. The services are reached over SOAP.]
XML Toolbox
Enables the conversion of Matlab variables and structures from their proprietary format to
XML and vice versa in a transparent, easy-to-use way. The XML can then be transferred,
stored, and retrieved across the Grid.
Four functions: xml_save(), xml_load(), xml_format(), xml_parse()
Example XML produced by the Toolbox from a Matlab structure:
<struct xmlns="http://www.geodise.org/matlab.xsd" idx="0" fields="name id param">
  <char idx="1" name="name" size="1 14">eng_opt_design</char>
  <char idx="1" name="id" size="1 11">1022223-779</char>
  <struct idx="1" name="param" size="1 1" fields="width height material">
    <double idx="1" name="width" size="1 1">100</double>
    <double idx="1" name="height" size="1 1">87.9</double>
    <double idx="1" name="material" size="1 1">0.07244</double>
  </struct>
</struct>
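To make the idea of structure-to-XML conversion concrete, the following Python sketch serialises a nested dictionary into elements resembling the `struct`, `char` and `double` tags used by the XML Toolbox, and reads them back. It is a simplified analogue, not the toolbox's algorithm: the function names are hypothetical, the `idx` and `xmlns` attributes are omitted, and only strings, numbers and nested dictionaries are handled.

```python
import xml.etree.ElementTree as ET


def to_element(value, name=None):
    """Convert a dict, string or number into an XML element."""
    if isinstance(value, dict):
        elem = ET.Element("struct", {"fields": " ".join(value)})
        for key, child in value.items():
            elem.append(to_element(child, key))
    elif isinstance(value, str):
        elem = ET.Element("char", {"size": f"1 {len(value)}"})
        elem.text = value
    elif isinstance(value, (int, float)):
        elem = ET.Element("double", {"size": "1 1"})
        elem.text = repr(float(value))
    else:
        raise TypeError(f"unsupported type: {type(value).__name__}")
    if name is not None:
        elem.set("name", name)
    return elem


def from_element(elem):
    """Convert an element produced by to_element back into Python data."""
    if elem.tag == "struct":
        return {child.get("name"): from_element(child) for child in elem}
    if elem.tag == "char":
        return elem.text or ""
    if elem.tag == "double":
        return float(elem.text)
    raise ValueError(f"unexpected tag: {elem.tag}")


design = {
    "name": "eng_opt_design",
    "id": "1022223-779",
    "param": {"width": 100.0, "height": 87.9, "material": 0.07244},
}

xml_text = ET.tostring(to_element(design), encoding="unicode")
restored = from_element(ET.fromstring(xml_text))
print(xml_text)
print(restored == design)
```

Recording the type in the tag name and the dimensions in a `size` attribute is what allows the reader to rebuild the original variable without a separate schema.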
[Figure: workflow in four steps. (A) Generate file. (B) Archive: a local file path (data file) or a structure (XML) is sent to the file archive (filestore) or the metadata database (DB), and a filehandle is returned. (C) Query: a query string is sent to the metadata database, and matching XML is returned as structures. (D) Retrieve: a filehandle is used to fetch a data file to a local file path, or XML.]
The XML Toolbox developed by the GEM project.
Application of the XML Toolbox in Geodise.
Future Work
  Categorisation of metadata based on XML Schemas
    XML Schemas can be generated to describe user-defined metadata.
    Changes are made over time to metadata about a design.
    Use a comparison and merge tool to detect changes and alter the XML Schema.
    XML Schema work may include future integration of ontologies.
  Web query interface
    Auto-generation of a query interface based on XML Schemas.
    User certificate required for authentication.
  Computational steering with shared Matlab variables
    Temporary shared storage and update of Matlab variables.
  XML Toolbox
    Ability to read most XML files and convert these into Matlab struct variables and vice versa.
  Infrastructure
    Improved Web Service security and use of OGSA-DAI.
    Jython implementation of the client toolbox.