A Unified Framework for the Semantic Integration of XML Databases


First IEEE International Conference on Digital Information Management (ICDIM)
A Unified Framework for
the Semantic Integration of
XML Databases
Doan Dai Duong and Le Thi Thu Thuy
{Duong_Dai.Doan, Thuy_Thi_Thu.Le}@unb.ca
The University of New Brunswick, Fredericton, NB, Canada
Presented by
Virendrakumar C. Bhavsar
December 06-08, 2006
1
Agenda

Introduction

XML Declarative Description (XDD)

Modeling of Data Components

Modelling of Processing Components

Conclusion
2
Introduction

General model of XML database integration
Step 1: Schema Integration
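[Figure: local data sources (RDS, OODS, ..., RDS) are converted to XML schemas (XML schema1, XML schema2, ..., XML schemaN). The XML Database Schema Integration System, which also draws on an ontology, produces the integrated XML schema and a set of mappings.]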
3
Step 2: Query Processing
[Figure: users submit a query over the integrated schema and receive integrated data; the data come from the local data sources, which hold records such as the following.]
<student source="A">
<Fname>Xuan</Fname>
<room>G26</room>
<nationality>Vietnam</nationality>
</student>
<student source="B">
<Fname>Phuoc</Fname>
<room>A12</room>
<nationality>Campuchia</nationality>
</student>
4
Proposed Integration Framework
XDD as underlying model
[Figure: proposed integration framework. Labels: database sources, XML database, XML Schema, integrated schema, metadata, integration system, XML query, XML data, user query, integrated data.]
Powerful

XDD supports all tasks of the framework

Input XML query, input XML data, output XML data
Rules, constraints, mappings
Metadata
Because it is based on the standard XML format, XDD ties all tasks of the framework together and makes the data easy to manipulate
Reduces the time and effort required of programmers and users, and reduces syntax errors
5
XML Declarative Description*


XML Declarative Description (XDD) is an XML-based information representation
Ordinary XML expressions (ground XML expressions) + variables = non-ground XML expressions
→ Enhances expressive power and allows implicit information to be represented

XML clauses have the form
H ← B1, …, Bm, C1, …, Cn
→ Able to express conditions and constraints
*Wuwongse, V., Anutariya, C., Akama, K., and Nantajeewarawat, E. XML Declarative Description (XDD): A
Language for the Semantic Web. IEEE Intelligent Systems, Vol. 16, No. 3, (2001) 54-65
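The relationship between ground and non-ground XML expressions can be illustrated with a minimal Python sketch. It is not the XDD machinery itself: it handles only string variables written as `$S:...` (the notation used in the later slides), the function names `match` and `_unify` are hypothetical, and the sample id value is made up for illustration.

```python
import xml.etree.ElementTree as ET


def _unify(pattern_value, ground_value, bindings):
    """Bind a $S: variable to a ground value, or compare two constants."""
    if pattern_value.startswith("$S:"):
        if pattern_value in bindings:
            return bindings[pattern_value] == ground_value
        bindings[pattern_value] = ground_value
        return True
    return pattern_value == ground_value


def match(pattern, ground, bindings=None):
    """Return the variable bindings that turn `pattern` (non-ground)
    into `ground`, or None when no such bindings exist."""
    bindings = dict(bindings or {})
    # Simplifying assumption: tags and attribute names must agree exactly.
    if pattern.tag != ground.tag or set(pattern.attrib) != set(ground.attrib):
        return None
    for key, value in pattern.attrib.items():
        if not _unify(value, ground.attrib[key], bindings):
            return None
    if not _unify((pattern.text or "").strip(), (ground.text or "").strip(), bindings):
        return None
    if len(pattern) != len(ground):
        return None
    for p_child, g_child in zip(pattern, ground):
        bindings = match(p_child, g_child, bindings)
        if bindings is None:
            return None
    return bindings


# Non-ground expression: an ordinary XML expression plus variables.
pattern = ET.fromstring(
    '<student id="$S:id"><name>$S:name</name><country>$S:country</country></student>'
)
# Ground expression: no variables (sample values are illustrative).
ground = ET.fromstring(
    '<student id="7"><name>Xuan</name><country>Vietnam</country></student>'
)
print(match(pattern, ground))
```

A single non-ground expression stands for every ground expression it can be instantiated to, which is how it can represent implicit information.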
6
Modeling of Data Components

XML Databases



Extension (actual data values): ground XML
expressions
Intension (schemas, logical specifications,
relationships, indexes and constraints): non-ground
XML expressions
XML Queries


Include a constructor, patterns, and filters
These correspond to the three parts (H, Bi, Cj) of an XDD rule
H ← B1, …, Bm, C1, …, Cn
7
Modeling of Data Components
[Figure: a query modelled by XDD, with its constructor, pattern, and filter parts labelled]
8
Query Execution
Example
Data source
<Student>
<name>John</name>
<nationality>Canadian</nationality>
<GPA>4</GPA>
<phone>234-7856</phone>
<ID>3224567</ID>
</Student>
<Student>
<name>Duong</name>
<nationality>Vietnamese</nationality>
<GPA>4.2</GPA>
<phone>456-3241</phone>
</Student>
Query
[Figure: query]
result1
<Student>
<name>John</name>
<nationality>Canadian</nationality>
<GPA>4</GPA>
</Student>
result2
<Student>
<name>Duong</name>
<nationality>Vietnamese</nationality>
<GPA>4.2</GPA>
</Student>
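The query execution example can be approximated with a short Python sketch built from the three query parts: a pattern, a filter, and a constructor. The query on the slide is not recoverable, so the pattern (name, nationality, GPA), the pass-through filter, the `<Students>` wrapper element, and the function name `run_query` are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Data source from the example, wrapped in a hypothetical <Students> root.
DATA = """<Students>
<Student><name>John</name><nationality>Canadian</nationality>
<GPA>4</GPA><phone>234-7856</phone><ID>3224567</ID></Student>
<Student><name>Duong</name><nationality>Vietnamese</nationality>
<GPA>4.2</GPA><phone>456-3241</phone></Student>
</Students>"""


def run_query(root, pattern_tags, keep, head_tag):
    """Evaluate one query: pattern (body), filter (constraint), constructor (head)."""
    results = []
    for record in root.iter("Student"):
        bindings = {}
        # Pattern: bind the listed children; other children are ignored.
        for tag in pattern_tags:
            child = record.find(tag)
            if child is None:
                break
            bindings[tag] = (child.text or "").strip()
        else:
            # Filter: keep only bindings that satisfy the condition.
            if keep(bindings):
                # Constructor: build the result from the bound values.
                out = ET.Element(head_tag)
                for tag in pattern_tags:
                    ET.SubElement(out, tag).text = bindings[tag]
                results.append(out)
    return results


root = ET.fromstring(DATA)
fields = ["name", "nationality", "GPA"]
for result in run_query(root, fields, lambda b: True, "Student"):
    print(ET.tostring(result, encoding="unicode"))
```

Replacing the filter with, say, `lambda b: float(b["GPA"]) > 4` would keep only the second record, which shows where constraints enter the evaluation.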
Modeling of Data Components

Mappings



Describe the correspondence between an object in the integrated schema and its corresponding objects in the local schemas
Support decomposing XML queries and converting data
Modeled by non-ground XML expressions
10
Sample of Mappings
[Figure: a sample mapping relating an object in the integrated schema to the corresponding objects in schema A and schema B]
11
Modelling of Processing Components
Schema Integration Component



The main task is to resolve conflicts between the schemas of the participating databases
Conflicts between the various schemas are resolved in a single pass (one-shot strategy)

Each local schema is treated as one large non-ground XML expression ($E variable)
12
Schema Integration Component

XDD can interactively process all schemas as $E expressions
[Figure: each local schema, held as an $E expression, is placed inside one <schema> element of the integrating schema]
<Integrating_schema>
<schema name="1">…</schema>
<schema name="2">…</schema>
…
<schema name="n">…</schema>
</Integrating_schema>
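As a rough illustration of the envelope above, the following Python sketch wraps n local schemas into one `Integrating_schema` element so that they can be handled together. The function name `build_integrating_schema` and the toy schema contents are hypothetical; real local schemas would be full XML Schema documents.

```python
import xml.etree.ElementTree as ET


def build_integrating_schema(local_schemas):
    """Place each local schema (a list of elements) in its own
    <schema name="i"> wrapper under a single <Integrating_schema> root."""
    root = ET.Element("Integrating_schema")
    for index, content in enumerate(local_schemas, start=1):
        wrapper = ET.SubElement(root, "schema", {"name": str(index)})
        wrapper.extend(content)
    return root


# Toy stand-ins for two local schemas (illustrative only).
schema_one = [ET.fromstring("<SOMstudent><name/><nation/></SOMstudent>")]
schema_two = [ET.fromstring("<SATstudent><fullname/><country/></SATstudent>")]

integrating = build_integrating_schema([schema_one, schema_two])
print(ET.tostring(integrating, encoding="unicode"))
```

Holding every schema under one root is what lets a one-shot strategy inspect all of them in a single pass instead of merging them pairwise.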
13
Schema Conflict Classification
Conflicts between schemas can be classified into four main kinds

Naming conflicts: synonyms, acronyms, homonyms
Constraint conflicts: occurring numbers of elements, fixed vs. default values, constraints of attributes
Structural conflicts: missing items conflicts, internal path discrepancy conflicts, aggregation conflicts, generalization/specification
Data type conflicts: disjoint or incompatible data types, compatible data types, IDREF and IDREFS
14
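Aggregation conflict
[Figure: one schema defines Professor with the children FName, MName, and LName; the other defines Professor with the single child Name. The union rule merges them into Professor with FName, MName, LName, and Name. The aggregation checking and data type constructing rule then creates a new data type, giving Professor a child Name that contains FName, MName, and LName.]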
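The two steps in the aggregation example can be sketched in Python as follows. This is only an illustration of the idea, not the XDD rules themselves: schemas are reduced to lists of child element names, and the knowledge that Name aggregates FName, MName, and LName is assumed to be supplied from outside (for instance by an ontology or a designer). The function names are hypothetical.

```python
def union_rule(children_a, children_b):
    """Merge the child elements of two definitions of the same element."""
    merged = list(children_a)
    for child in children_b:
        if child not in merged:
            merged.append(child)
    return merged


def aggregation_rule(children, aggregates):
    """Nest the parts under their aggregate when both are present.

    `aggregates` maps an aggregate element to its parts (assumed knowledge).
    Returns a dict: child name -> list of nested children.
    """
    applicable = {
        whole: parts
        for whole, parts in aggregates.items()
        if whole in children and all(part in children for part in parts)
    }
    absorbed = {part for parts in applicable.values() for part in parts}
    return {
        child: list(applicable.get(child, []))
        for child in children
        if child not in absorbed
    }


professor_a = ["FName", "MName", "LName"]
professor_b = ["Name"]

merged = union_rule(professor_a, professor_b)
print(merged)
print(aggregation_rule(merged, {"Name": ["FName", "MName", "LName"]}))
```

After the second step Professor has a single child Name whose new data type contains the three name parts, matching the final tree in the example.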
14
Query Decomposition

The main task: yield n local subqueries from a global query
Integrated schema
student: id, name, country, position, field
<student id="$S:id">
<name>$S:name</name>
<country>$S:country</country>
</student>
Schema for source A
SOMstudent: id, name, nation, program, position
<SOMstudent id="$S:id" source="A">
<name>$S:name</name>
<nation>$S:country</nation>
</SOMstudent>
Schema for source B
SATstudent: key, fullname, country, fieldStudy
<SATstudent key="$S:id" source="B">
<fullname>$S:name</fullname>
<country>$S:country</country>
</SATstudent>
16
Query Decomposition
A. Brief view
[Figure: a query and the mappings from global to local enter Query Decomposition, which outputs one subquery for each local source]
B. Solution
Input XML query
<student id="$S:id">
<name>$S:name</name>
<country>$S:country</country>
</student>
XML metadata
•XDD rules for transformation
Output XML queries
<SOMstudent id="$S:id" source="A">
<name>$S:name</name>
<nation>$S:country</nation>
</SOMstudent>
<SATstudent key="$S:id" source="B">
<fullname>$S:name</fullname>
<country>$S:country</country>
</SATstudent>
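The transformation from the input XML query to the output XML queries can be sketched in Python with a per-source renaming table. The table layout (`MAPPINGS`) and the function name `decompose` are assumptions made for this illustration; the framework itself expresses the transformation as XDD rules over XML metadata.

```python
import xml.etree.ElementTree as ET

# Hypothetical renaming tables derived from the schemas in the example:
# global name -> local name, for each source.
MAPPINGS = {
    "A": {
        "root": "SOMstudent",
        "attrs": {"id": "id"},
        "children": {"name": "name", "country": "nation"},
    },
    "B": {
        "root": "SATstudent",
        "attrs": {"id": "key"},
        "children": {"name": "fullname", "country": "country"},
    },
}


def decompose(global_query):
    """Produce one local subquery per source from a global query.

    A global name with no entry in the table raises KeyError; a fuller
    version would decide how to treat unmapped items.
    """
    subqueries = {}
    for source, mapping in MAPPINGS.items():
        local = ET.Element(mapping["root"])
        for attr, value in global_query.attrib.items():
            local.set(mapping["attrs"][attr], value)
        local.set("source", source)
        for child in global_query:
            ET.SubElement(local, mapping["children"][child.tag]).text = child.text
        subqueries[source] = local
    return subqueries


query = ET.fromstring(
    '<student id="$S:id"><name>$S:name</name><country>$S:country</country></student>'
)
for source, subquery in decompose(query).items():
    print(source, ET.tostring(subquery, encoding="unicode"))
```

The variables (`$S:id`, `$S:name`, `$S:country`) are carried over unchanged, so values bound at a local source can later be related back to the global query.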
16
Query Decomposition
Example
Stored mapping
<Mapping>
<student>
<country>$S:country</country>
</student>
<local>
<SATstudent source="B">
<country>$S:country</country>
</SATstudent>
<SOMstudent source="A">
<nation>$S:country</nation>
</SOMstudent>
</local>
</Mapping>
Mapping pattern
<Mapping>
<student>
<country>$S:country</country>
</student>
<local>$E:expression</local>
</Mapping>
Answer pattern
<answer>
$E:expression
</answer>
Step 1: The mapping pattern matches with the stored mapping
Step 2: $E:expression is bound to the content of <local>
Step 3: The mapping pattern infers the answer pattern
Step 4: The answer pattern results in the local queries for source A (SOMstudent) and source B (SATstudent)
<answer>
<SATstudent source="B">
<country>$S:country</country>
</SATstudent>
<SOMstudent source="A">
<nation>$S:country</nation>
</SOMstudent>
</answer>
Query Decomposition

Query decomposition uses a special mapping structure and applies XDD rules

Subqueries for the distributed data sources are produced simultaneously
Similarly, for data conversion, the extracted data are converted simultaneously to the global schema format
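Data conversion runs in the opposite direction to query decomposition, and a single mapping can serve both. The Python sketch below inverts a global-to-local table to convert a local record into the global schema format. The flat table `GLOBAL_TO_LOCAL`, the function name `to_global`, and the sample key value are assumptions for illustration, and the approach only works while each per-source table is one-to-one.

```python
import xml.etree.ElementTree as ET

# Hypothetical global -> local name tables, one per source.
GLOBAL_TO_LOCAL = {
    "A": {"student": "SOMstudent", "id": "id", "name": "name", "country": "nation"},
    "B": {"student": "SATstudent", "id": "key", "name": "fullname", "country": "country"},
}


def to_global(local_record):
    """Convert a record extracted from a local source to the global format."""
    source = local_record.get("source")
    # Invert the table so the same mapping is reused in the other direction.
    local_to_global = {
        local: glob for glob, local in GLOBAL_TO_LOCAL[source].items()
    }
    out = ET.Element(local_to_global[local_record.tag])
    for attr, value in local_record.attrib.items():
        if attr != "source":
            out.set(local_to_global[attr], value)
    for child in local_record:
        ET.SubElement(out, local_to_global[child.tag]).text = child.text
    return out


# Sample record from source B (the key value is made up).
record = ET.fromstring(
    '<SATstudent key="12" source="B">'
    "<fullname>Phuoc</fullname><country>Campuchia</country>"
    "</SATstudent>"
)
print(ET.tostring(to_global(record), encoding="unicode"))
```

Storing the correspondence once and reading it in both directions avoids keeping a second, redundant set of local-to-global mappings.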
19
Conclusion

XDD is used to model all data components and processing components of the XML database integration framework

Components of the system modeled by XDD can communicate and exchange data easily

A special structure for XDD-based bidirectional mappings is designed. It produces the information needed for both query decomposition and data conversion efficiently and avoids data redundancy
The framework can

Integrate n participating schemas

Decompose a query into n subqueries at a time
20