Framework Description
Federal University of Rio de Janeiro – COPPE/UFRJ
Author: Wladimir S. Meyer – Doctorate Student
Advisors: Jano Moreira de Souza – Ph.D.
Milton Ramos Ramirez – D.Sc.
1 / 18
Summary
Introduction
Framework Description
Motivation
Objectives
Related Works
Structure
Functioning
New functionalities added to Secondo
The Case Study
Final Considerations
2 / 18
Introduction
Motivation
The challenge of integrating spatial databases spread across a
computational grid
Objectives
Add new functionality to an extensible SDBMS so that it can
act as a platform for studying distributed spatial
databases in computational grids.
This platform should:
Be capable of interacting (by itself) with other analogous platforms
in a grid
Offer some level of transparency [Özsu and Valduriez 1999]:
• Data independence
• Network transparency
• Replication transparency
Be modular, so that work can focus only on the experiments being developed
Be capable of exchanging “specialized skills” (algebras in this
case)
3 / 18
Introduction
Related Works
The GGF Data Access and Integration Services Working Group (GGF DAIS-WG) produces many recommendations related to databases in
grids [OGSA-DAI-WSRF 05].
They are a set of interfaces and services to be implemented outside
the DBMS environment
Only relational, XML and file system data models are supported
The OGSA-DAI project implements many of the DAIS-WG
recommendations and offers a Java toolkit for clients
The OGSA-DQP project [Smith et al. 2002] uses OGSA-DAI to
support distributed queries over a grid. Only relational
databases benefit, and it does not support the new release of
OGSA-DAI based on WSRF.
4 / 18
Framework Description -
Structure
The framework is composed of:
A Spatial DBMS*: Secondo [Dieker and Güting 2000] was adopted
because of its modularity, formalism and extensibility. It was originally
intended for experimental purposes with spatial and spatio-temporal
data models [Güting et al. 2004].
A grid middleware: it offers several services that are used by the
SDBMS [Foster 2005]:
Job Manager Service (GRAM)
Reliable File Transfer Service (RFT)
Index Service (MDS)
Globus Toolkit 4 was chosen because of its web service approach and set of
powerful components.
A set of tools: added to provide some extra functionality,
such as:
Submitting queries to a set of servers
Discovering an algebra in another Secondo, based on algebra description files
Importing an algebra
(*) – when used with its spatial algebra
5 / 18
Framework Description -
Functioning
[Figure: a Central Index Service (MDS) holding the global schema and the fragments' map, and four Secondo servers (Secondo #1 to #4). Secondo #1, #2 and #3 are each labelled with an algebras' description file. A QUERY label marks the incoming query.]
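The figure names two pieces of information kept in the central index: a global schema and a fragments' map. A minimal sketch of how they could be modelled is given here, assuming a simple in-memory structure. The class `CentralIndex`, its methods, and the theme, attribute and server names are hypothetical and are not taken from the framework or from MDS.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class CentralIndex:
    """Toy stand-in for the information kept in the central index service."""

    # theme name -> attribute names (a very small "global schema")
    global_schema: Dict[str, List[str]] = field(default_factory=dict)
    # theme name -> servers that hold a copy of that fragment
    fragments_map: Dict[str, List[str]] = field(default_factory=dict)

    def register_fragment(self, theme, attributes, server):
        """Record that `server` holds the fragment for `theme`."""
        self.global_schema.setdefault(theme, list(attributes))
        servers = self.fragments_map.setdefault(theme, [])
        if server not in servers:
            servers.append(server)

    def locate(self, theme):
        """Return the servers holding `theme` (empty list if unknown)."""
        return list(self.fragments_map.get(theme, []))
```

Registering the same theme on two servers records a replicated fragment, so `locate` then returns both servers.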
6 / 18
Framework Description -
Functioning
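[Figure: after the QUERY arrives, the servers' status is requested through MDS. Shown: the Central Index Service (MDS) with the global schema and the fragments' map, Secondo #1 to #4, and MDS labels at the queried servers. A "Same fragments" label appears next to Secondo #3 and Secondo #4.]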
7 / 18
Framework Description -
Functioning
[Figure: responses to the status request are returned through MDS. Shown: the Central Index Service (MDS) with the global schema and the fragments' map, and Secondo #1 to #4.]
Each response reports:
CPU load
Total amount of memory
Total amount of free memory
Number of running processes
Number of active processes
Number of users logged in
Total amount of free space on hard disk
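The slides do not state how the status values are used to choose among servers that hold the same fragment. One possible policy is sketched here under that caveat: filter by minimum free memory and disk, then prefer the lowest CPU load. The `ServerStatus` record, the `pick_server` function and the tie-breaking order are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class ServerStatus:
    """Subset of the status values listed on the slide (names are hypothetical)."""
    name: str
    cpu_load: float
    free_memory_mb: int
    active_processes: int
    free_disk_mb: int


def pick_server(candidates, min_free_memory_mb=0, min_free_disk_mb=0):
    """Return the least-loaded candidate that meets the minimums, or None."""
    eligible = [
        s for s in candidates
        if s.free_memory_mb >= min_free_memory_mb
        and s.free_disk_mb >= min_free_disk_mb
    ]
    if not eligible:
        return None
    # Assumed ordering: lowest CPU load, then fewest active processes,
    # then most free memory.
    return min(
        eligible,
        key=lambda s: (s.cpu_load, s.active_processes, -s.free_memory_mb),
    )
```

With replicated fragments, the candidates would be the servers that the fragments' map lists for the requested theme.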
8 / 18
Framework Description -
Functioning
[Figure: Secondo #1 sends subqueries to the selected servers. Shown: the Central Index Service (MDS) with the global schema and the fragments' map, and Secondo #1 to #4.]
Secondo #1 generates a job description file and a
Secondo command file and submits them to the selected
nodes using GRAM.
The job description file can express a
multijob, meaning for example that the result
of one query must be transferred to another,
to be used in a second step.
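As a rough illustration of automatic job description generation, the sketch below builds either a single job or a multijob document from a list of Secondo command files. The element names (`job`, `multiJob`, `executable`, `argument`, `stdout`) follow the GT4 WS-GRAM job description format as recalled here and should be checked against the GT4 documentation. The executable path, the `-i` argument and the output file naming are assumptions. Target nodes and file staging are omitted, and nothing is submitted.

```python
import xml.etree.ElementTree as ET


def job_element(executable, arguments, stdout_path):
    """Build one <job> element (element names assumed, see lead-in)."""
    job = ET.Element("job")
    ET.SubElement(job, "executable").text = executable
    for arg in arguments:
        ET.SubElement(job, "argument").text = arg
    ET.SubElement(job, "stdout").text = stdout_path
    return job


def job_description(command_files,
                    executable="/opt/secondo/bin/SecondoTTYBDB"):  # hypothetical path
    """Return XML text: a single <job>, or a <multiJob> wrapping several jobs."""
    if not command_files:
        raise ValueError("at least one command file is required")
    jobs = [
        job_element(executable, ["-i", path], path + ".out")
        for path in command_files
    ]
    if len(jobs) == 1:
        root = jobs[0]
    else:
        root = ET.Element("multiJob")
        root.extend(jobs)
    return ET.tostring(root, encoding="unicode")
```

Calling `job_description(["q1.sec", "q2.sec"])` yields one `multiJob` document containing two `job` elements, one per command file.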
9 / 18
Framework Description -
Functioning
[Figure: results are returned as nested lists, transferred with RFT. Shown: the Central Index Service (MDS) with the global schema and the fragments' map, and Secondo #1 to #4.]
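Since results travel as nested lists, the receiving side needs to read them back into a data structure. The following is a minimal sketch of such a reader, assuming a simplified nested-list syntax with parentheses, numbers, double-quoted strings and bare symbols. It does not handle text atoms or escaped quotes, and it is not Secondo's own parser. The function name `parse_nested_list` is hypothetical.

```python
import re

TOKEN = re.compile(r'\(|\)|"[^"]*"|[^\s()"]+')
INT = re.compile(r'[+-]?\d+')
REAL = re.compile(r'[+-]?(\d+\.\d*|\.\d+|\d+)([eE][+-]?\d+)?')


def parse_nested_list(text):
    """Parse a simplified nested list into Python lists, numbers and strings.

    Symbols and quoted strings both become Python str in this sketch.
    """
    tokens = TOKEN.findall(text)
    pos = 0

    def read():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of input")
        tok = tokens[pos]
        pos += 1
        if tok == "(":
            items = []
            while True:
                if pos >= len(tokens):
                    raise ValueError("missing ')'")
                if tokens[pos] == ")":
                    pos += 1
                    return items
                items.append(read())
        if tok == ")":
            raise ValueError("unexpected ')'")
        if tok.startswith('"'):
            return tok[1:-1]
        if INT.fullmatch(tok):
            return int(tok)
        if REAL.fullmatch(tok):
            return float(tok)
        return tok  # bare symbol

    result = read()
    if pos != len(tokens):
        raise ValueError("trailing tokens after the list")
    return result
```

A result written as a type description followed by a value list parses into a two-element Python list, so the tuples can be taken from the second element.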
10 / 18
Framework Description -
Functioning
[Figure: Secondo #1 delivers the Result. Shown: the Central Index Service (MDS) with the global schema and the fragments' map, and Secondo #1 to #4.]
The returned results are aggregated to form a global result
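The slides do not describe how the aggregation is performed. One simple possibility is sketched here: concatenate the tuple lists returned by each server and drop exact duplicates, which could arise if replicated fragments were queried more than once. Both the function `aggregate_results` and the choice to remove duplicates are assumptions.

```python
def _freeze(value):
    """Turn nested lists into nested tuples so rows can be put in a set."""
    if isinstance(value, (list, tuple)):
        return tuple(_freeze(v) for v in value)
    return value


def aggregate_results(partial_results):
    """Merge per-server row lists into one list, keeping first occurrences.

    partial_results: iterable of lists of rows; each row is a list of values
    (possibly nested, e.g. geometry given as a list of coordinates).
    """
    seen = set()
    merged = []
    for rows in partial_results:
        for row in rows:
            key = _freeze(row)
            if key not in seen:
                seen.add(key)
                merged.append(row)
    return merged
```

If duplicate tuples are meaningful for a given query, the duplicate check would have to be dropped and the lists simply concatenated.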
11 / 18
Framework Description –
Modified Secondo
[Figure: architecture of the modified Secondo, adapted from [Ramirez 2001]; a legend marks the new functionalities.]
Components labelled in the figure: Graphical User Interface, Optimizer, Command processor, Query processor, Kernel, Alg 1, Alg 2, Alg 3, ..., Alg n, Storage Manager & tools, Global Query Plan Processor, Query Plan Maker, Query Execution Monitor, GRAM, MDS
Flows labelled in the figure: Global query, subqueries, Results, Global result, Global schema, Fragment Location, Resources status
Functions shown:
globalQueryPlanProcessor()
submitSubQueries()
requestGlobalSchema()
modifyGlobalSchema()
requestFragmentLocation()
updateFragmentLocation()
monitorResourcesStatus()
lookForAlgebras()
importAlgebra()
GRAM cli: submits activities (jobs) to the grid
MDS cli: discovers and monitors registered resources
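To show how the named functions might fit together, the sketch below strings them into one flow: look up the global schema, locate fragments, check resource status, submit subqueries and combine the results. Only the function names come from the slides. Their parameters, return values, the order of the calls, the in-memory `InMemoryGrid` class and the "lowest CPU load" choice are assumptions for illustration.

```python
class InMemoryGrid:
    """Hypothetical in-memory stand-in for the MDS and GRAM clients."""

    def __init__(self, schema, locations, cpu_load, fragments):
        self.schema = schema          # theme -> attribute names
        self.locations = locations    # theme -> servers holding the fragment
        self.cpu_load = cpu_load      # server -> current CPU load
        self.fragments = fragments    # (server, theme) -> rows stored there

    def requestGlobalSchema(self):
        return self.schema

    def requestFragmentLocation(self, theme):
        return self.locations[theme]

    def monitorResourcesStatus(self, servers):
        return {server: self.cpu_load[server] for server in servers}

    def submitSubQueries(self, assignments):
        # A real client would submit jobs through GRAM; here rows are read directly.
        return [self.fragments[(server, theme)]
                for theme, server in assignments.items()]


def globalQueryPlanProcessor(grid, themes):
    """Pick one server per requested theme and combine the returned rows."""
    schema = grid.requestGlobalSchema()
    assignments = {}
    for theme in themes:
        if theme not in schema:
            raise KeyError(f"theme not in global schema: {theme}")
        servers = grid.requestFragmentLocation(theme)
        status = grid.monitorResourcesStatus(servers)
        # Assumed policy: send the subquery to the least-loaded replica.
        assignments[theme] = min(servers, key=status.get)
    partial_results = grid.submitSubQueries(assignments)
    return [row for rows in partial_results for row in rows]
```

Because each theme is sent to a single replica in this sketch, the combined result contains no duplicates caused by replication.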
12 / 18
Framework Description –
New functionalities
Files generated automatically during a job submission:
Job description file – a file that specifies details about where and
how a job must be executed
Secondo Command file – specifies a set of commands to be run in
a Secondo server
Spatial select example (constructed with the spatial algebra, using R-tree algebra operators):
open database 28433;
create tempBox : rect;
update tempBox := [const rect value (-48.775 -48.771 -25.331 -25.339)];
let temp = drain_line creatertree [shape];
query temp drain_line windowintersect [tempBox] consume;
delete temp;
delete tempBox;
close database 28433;
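The command file above follows a fixed pattern, so it lends itself to generation from a few parameters. The sketch below reproduces that pattern as a template, assuming the database name, relation, indexed attribute and rectangle coordinates are the only variable parts. The function `spatial_select_commands` is hypothetical, and the command text simply mirrors the slide (including the operator spelling) without checking it against a Secondo installation.

```python
def spatial_select_commands(database, relation, attribute, box):
    """Return the text of a Secondo command file for a window query.

    box: four numbers, in the same order as the slide writes them
    inside the rect constant.
    """
    c1, c2, c3, c4 = box
    lines = [
        f"open database {database};",
        "create tempBox : rect;",
        f"update tempBox := [const rect value ({c1} {c2} {c3} {c4})];",
        f"let temp = {relation} creatertree [{attribute}];",
        f"query temp {relation} windowintersect [tempBox] consume;",
        "delete temp;",
        "delete tempBox;",
        f"close database {database};",
    ]
    return "\n".join(lines) + "\n"
```

Calling it with `28433`, `"drain_line"`, `"shape"` and the four coordinates from the slide reproduces the example file.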
13 / 18
The Case Study
To validate the proposed framework, a geographic database
prototype is being built as follows:
Composition:
• 4 computers, with Fedora Linux, as grid nodes,
• All machines running GT4 with GRAM, MDS and RFT services,
• All machines running a modified Secondo (Secondo-grid)
Distributed spatial database design:
Federated architecture with a Global Schema
Thematic fragmentation
The fragments can be replicated
All themes belong to the same region
[Figure: the themes Hydrography, Edification and Vegetation distributed over Secondo 1, Secondo 2, Secondo 3 and Secondo 4.]
14 / 18
The Case Study
Autonomy:
Moderate, because each Secondo must update the global
schema and fragments’ map when necessary
Nature of data:
Cartographic data supplied by the Directory of Geographic Service
(Brazilian Army)
Queries being implemented:
spatial select and spatial join
15 / 18
Final Considerations
This framework is being developed as a platform for
experimental purposes: performance isn’t its main focus
Many issues are not covered in the present work and are left
for future work: transaction control, an optimizer for
distributed queries, security, etc.
Modules of the framework that are running now:
• Registering and Monitoring modules: based on
global schema, fragments’ map, servers’ status
monitor and algebras’ description file
• Automatic generation of files: job description and
Secondo command file
• Submission of single queries with GRAM clients
16 / 18
Final Considerations
Next steps:
Complete the data transfer module using RFT
Implement multijob submission with complex queries
Complete the infrastructure to import algebras
17 / 18
Thank you!
18 / 18