A Unified Framework for the Semantic Integration of XML Databases
First IEEE International Conference on Digital Information Management (ICDIM)
Doan Dai Duong and Le Thi Thu Thuy
{Duong_Dai.Doan, Thuy_Thi_Thu.Le}@unb.ca
The University of New Brunswick, Fredericton, NB, Canada
Presented by
Virendrakumar C. Bhavsar
December 06-08, 2006
Agenda
Introduction
XML Declarative Description (XDD)
Modeling of Data Components
Modeling of Processing Components
Conclusion
Introduction
General model of XML database integration
Step 1: Schema Integration
[Figure: the data sources (RDS, OODS, RDS) are converted to XML schema1, XML schema2, …, XML schemaN. The XML Database Schema Integration System, shown together with an Ontology, produces the integrated XML schema and a set of mappings.]
Step 2: Query Processing
[Figure: users pose a query against the integrated schema and receive integrated data. The integrated data is assembled from three local data boxes, each holding sample student records of the following form, tagged with a source attribute.]
<student source="A">
  <Fname>Xuan</Fname>
  <room>G26</room>
  <nationality>Vietnam</nationality>
</student>
<student source="B">
  <Fname>Phuoc</Fname>
  <room>A12</room>
  <nationality>Campuchia</nationality>
</student>
Proposed Integration Framework
XDD as underlying model
[Figure: the integration system with XDD as its underlying model. Labels: database sources, XML database, XML Schema, integrated schema, metadata, XML data, XML query, user query, integrated data.]
Powerful
XDD supports all tasks of the framework:
  Input XML queries, input XML data, and output XML data
  Rules, constraints, and mappings
  Metadata
Because it is based on the standard XML format, XDD ties all tasks of the framework together and makes data easy to manipulate
Reduces syntax errors and the time and effort required of programmers and users
XML Declarative Description*
XML Declarative Description (XDD) is an XML-based information representation
Ordinary XML expressions (ground XML expressions) + variables = non-ground XML expressions
  Variables enhance expressive power and allow implicit information to be represented
XML clauses have the form
  H ← B1, …, Bm, C1, …, Cn
  Clauses can express conditions and constraints
*Wuwongse, V., Anutariya, C., Akama, K., and Nantajeewarawat, E. XML Declarative Description (XDD): A
Language for the Semantic Web. IEEE Intelligent Systems, Vol. 16, No. 3, (2001) 54-65
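The distinction between ground and non-ground XML expressions, and the three-part clause form, can be illustrated with a minimal Python sketch. It is not part of XDD itself: the `Clause` class and the helper names are hypothetical, and the variable pattern assumes that variables are always written like `$S:name` or `$E:expression`, as they are on the later slides.

```python
import re
from dataclasses import dataclass, field

# Assumption: a variable is "$", one capital letter, ":", then a name,
# e.g. "$S:name" or "$E:expression" as written on the slides.
VARIABLE = re.compile(r"\$[A-Z]:\w+")


def variables(expr: str) -> set[str]:
    """Return the set of variables occurring in an XML expression."""
    return set(VARIABLE.findall(expr))


def is_ground(expr: str) -> bool:
    """A ground XML expression contains no variables."""
    return VARIABLE.search(expr) is None


@dataclass
class Clause:
    """Hypothetical container for a clause H <- B1..Bm, C1..Cn."""
    head: str                                             # H
    body: list[str] = field(default_factory=list)         # B1, ..., Bm
    constraints: list[str] = field(default_factory=list)  # C1, ..., Cn

    def is_fact(self) -> bool:
        """A clause with no body atoms and no constraints is a plain fact."""
        return not self.body and not self.constraints
```

Under these assumptions, `is_ground("<name>John</name>")` holds while `is_ground("<name>$S:name</name>")` does not, which mirrors the extension/intension split used for databases on the next slide.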
Modeling of Data Components
XML Databases
  Extension (actual data values): ground XML expressions
  Intension (schemas, logical specifications, relationships, indexes and constraints): non-ground XML expressions
XML Queries
  Include constructor, patterns, and filters
  Correspond to three parts (H, Bi, Cj) of XDD rule
  H ← B1, …, Bm, C1, …, Cn
Modeling of Data Components
[Figure: a query modeled by XDD, with its constructor, pattern, and filter parts labeled.]
Query Execution
Example
Data source
  <Student>
    <name>John</name>
    <nationality>Canadian</nationality>
    <GPA>4</GPA>
    <phone>234-7856</phone>
    <ID>3224567</ID>
  </Student>
  <Student>
    <name>Duong</name>
    <nationality>Vietnamese</nationality>
    <GPA>4.2</GPA>
    <phone>456-3241</phone>
  </Student>
Query
  [Figure: the query, shown on the slide as an image.]
result1
  <Student>
    <name>John</name>
    <nationality>Canadian</nationality>
    <GPA>4</GPA>
  </Student>
result2
  <Student>
    <name>Duong</name>
    <nationality>Vietnamese</nationality>
    <GPA>4.2</GPA>
  </Student>
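The query text is not available here, but the two results suggest that it matches each Student and constructs a new Student that keeps only name, nationality, and GPA. The Python sketch below imitates that behaviour with the standard library. It is only an approximation of pattern matching plus construction, not an XDD engine; the `<Students>` wrapper element, the `KEEP` tuple, and `run_query` are assumptions introduced for the illustration.

```python
import xml.etree.ElementTree as ET

# Assumption: the two records are wrapped in a single root element
# so that they can be parsed as one document.
DATA = """
<Students>
  <Student>
    <name>John</name><nationality>Canadian</nationality>
    <GPA>4</GPA><phone>234-7856</phone><ID>3224567</ID>
  </Student>
  <Student>
    <name>Duong</name><nationality>Vietnamese</nationality>
    <GPA>4.2</GPA><phone>456-3241</phone>
  </Student>
</Students>
"""

# Assumption: the query pattern requires these children and the
# constructor copies exactly these children into the result.
KEEP = ("name", "nationality", "GPA")


def run_query(xml_text):
    """Return one constructed <Student> per record that matches the pattern."""
    results = []
    for student in ET.fromstring(xml_text).iter("Student"):
        out = ET.Element("Student")
        for tag in KEEP:
            child = student.find(tag)
            if child is None:
                break  # a required child is missing: the pattern does not match
            ET.SubElement(out, tag).text = (child.text or "").strip()
        else:
            results.append(out)  # every required child was found
    return results


if __name__ == "__main__":
    for result in run_query(DATA):
        print(ET.tostring(result, encoding="unicode"))
```

Each returned element corresponds to one of result1 and result2: the phone and ID children are dropped because the constructor never mentions them.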
Modeling of Data Components
Mappings
  Describe the correspondence between an object in the integrated schema and its corresponding objects in the local schemas
  Support decomposing XML queries and converting data
  Are modeled by non-ground XML expressions
Sample of Mappings
[Figure: a sample mapping that relates an object in the integrated schema to the corresponding object in schema A and the corresponding object in schema B.]
Modeling of Processing Components
Schema Integration Component
The main task is to resolve conflicts between the schemas of the participating databases
Conflicts between the various schemas are resolved in a single pass (one-shot strategy)
Each local schema is treated as one large non-ground XML expression (an $E variable)
Schema Integration Component
XDD can interactively process all schemas as $E expressions
Each <schema> element below is handled as one $E expression:
<Integrating_schema>
  <schema name="1">…</schema>
  <schema name="2">…</schema>
  …
  <schema name="n">…</schema>
</Integrating_schema>
Schema Conflict Classification
Conflicts between schemas can be classified into four main kinds
Naming conflicts
  Synonyms
  Acronyms
  Homonyms
Structural conflicts
  Missing items conflicts
  Internal path discrepancy conflicts
  Aggregation conflicts
  Generalization/specification
Constraint conflicts
  Occurring numbers of elements
  Fixed vs. default values
  Constraints of attributes
Data type conflicts
  Disjoint or incompatible data types
  Compatible data types
  IDREF and IDREFS
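Aggregation conflict
[Figure: one schema describes Professor with the children FName, MName, and LName; the other describes Professor with a single child, Name. The union rule yields Professor with the children FName, MName, LName, and Name. The aggregation checking and data type constructing rule then creates a new data type, giving Professor with the child Name, which in turn contains FName, MName, and LName.]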
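The two rules named on this slide can be mimicked in a few lines of Python. This is only a sketch of the idea, not the XDD rules themselves: schemas are reduced to lists of child names, and the `AGGREGATES` table, which states that Name is made up of FName, MName, and LName, is an assumption standing in for whatever knowledge the aggregation check relies on.

```python
# Children of Professor in the two conflicting schemas.
SCHEMA_1 = ["FName", "MName", "LName"]
SCHEMA_2 = ["Name"]

# Assumption: the whole/part relationship is known in advance.
AGGREGATES = {"Name": ["FName", "MName", "LName"]}


def union_rule(children_a, children_b):
    """Merge two child lists, keeping order and dropping duplicates."""
    merged = list(children_a)
    merged += [c for c in children_b if c not in merged]
    return merged


def aggregation_rule(children, aggregates):
    """Nest parts under their whole when the whole and all parts are present.

    Returns a list of (element, nested_children) pairs.
    """
    active = {
        whole: parts
        for whole, parts in aggregates.items()
        if whole in children and all(p in children for p in parts)
    }
    absorbed = {p for parts in active.values() for p in parts}
    return [(c, list(active.get(c, []))) for c in children if c not in absorbed]


if __name__ == "__main__":
    merged = union_rule(SCHEMA_1, SCHEMA_2)
    print(merged)
    print(aggregation_rule(merged, AGGREGATES))
```

After the union step Professor has four sibling children; the aggregation step turns Name into a constructed type that contains the three name parts, which is the final tree shown in the figure.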
Query Decomposition
The main task is to derive n local subqueries from a global query
Integrated schema: student (id, name, country, position, field)
  <student id="$S:id">
    <name>$S:name</name>
    <country>$S:country</country>
  </student>
Schema for source B: SATstudent (key, fullname, country, fieldStudy)
  <SATstudent key="$S:id" source="B">
    <fullname>$S:name</fullname>
    <country>$S:country</country>
  </SATstudent>
Schema for source A: SOMstudent (id, name, nation, program, position)
  <SOMstudent id="$S:id" source="A">
    <name>$S:name</name>
    <nation>$S:country</nation>
  </SOMstudent>
Query Decomposition
A. Brief view
[Figure: the query and the mappings from global to local feed Query Decomposition, which produces one subquery for each local source.]
B. Solution
Input XML query
  <student id="$S:id">
    <name>$S:name</name>
    <country>$S:country</country>
  </student>
XML metadata
  XDD rules for transformation
Output XML queries
  <SOMstudent id="$S:id" source="A">
    <name>$S:name</name>
    <nation>$S:country</nation>
  </SOMstudent>
  <SATstudent key="$S:id" source="B">
    <fullname>$S:name</fullname>
    <country>$S:country</country>
  </SATstudent>
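The transformation from the input query to the two output queries can be sketched in Python as a renaming step driven by a per-source table. The `RULES` table and `decompose` function are hypothetical stand-ins for the XDD transformation rules, whose actual form is not shown on the slide; the element and attribute names are taken from the schemas for sources A and B.

```python
import xml.etree.ElementTree as ET

GLOBAL_QUERY = (
    '<student id="$S:id">'
    "<name>$S:name</name>"
    "<country>$S:country</country>"
    "</student>"
)

# Assumption: one renaming table per source, derived from the slide's schemas.
RULES = {
    "A": {"root": "SOMstudent",
          "attrs": {"id": "id"},
          "elems": {"name": "name", "country": "nation"}},
    "B": {"root": "SATstudent",
          "attrs": {"id": "key"},
          "elems": {"name": "fullname", "country": "country"}},
}


def decompose(global_query, rules=RULES):
    """Return {source: local query string} for every source in `rules`."""
    query = ET.fromstring(global_query)
    local_queries = {}
    for source, rule in rules.items():
        local = ET.Element(rule["root"])
        for attr, value in query.attrib.items():
            local.set(rule["attrs"][attr], value)    # rename the attribute
        local.set("source", source)                  # tag the target source
        for child in query:
            renamed = ET.SubElement(local, rule["elems"][child.tag])
            renamed.text = child.text                # variables are carried over
        local_queries[source] = ET.tostring(local, encoding="unicode")
    return local_queries


if __name__ == "__main__":
    for source, text in decompose(GLOBAL_QUERY).items():
        print(source, text)
```

Because the variables such as `$S:country` are copied unchanged, the same binding links the global query to every local subquery, which is what lets results from the sources be converted back to the integrated schema later.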
Query Decomposition
Example
Step 1: the mapping pattern
  <Mapping>
    <student>
      <country>$S:country</country>
    </student>
    <local>$E:expression</local>
  </Mapping>
matches with the stored mapping
  <Mapping>
    <student>
      <country>$S:country</country>
    </student>
    <local>
      <SATstudent source="B">
        <country>$S:country</country>
      </SATstudent>
      <SOMstudent source="A">
        <nation>$S:country</nation>
      </SOMstudent>
    </local>
  </Mapping>
Step 2: $E:expression binds to the content of <local>
Step 3: the mapping pattern infers to
  <answer>
    $E:expression
  </answer>
Step 4: which results in
  <answer>
    <SATstudent source="B">
      <country>$S:country</country>
    </SATstudent>
    <SOMstudent source="A">
      <nation>$S:country</nation>
    </SOMstudent>
  </answer>
The SOMstudent element is the local query for source A and the SATstudent element is the local query for source B
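The example can be imitated with a small Python sketch: look up the stored mapping whose global part has the same shape as the query, take the content of `<local>` as the value of `$E:expression`, and wrap it in `<answer>`. This uses plain structural equality rather than real XDD matching, and `same_shape` and `answer_for` are hypothetical helper names.

```python
import xml.etree.ElementTree as ET

MAPPING = """<Mapping>
  <student><country>$S:country</country></student>
  <local>
    <SATstudent source="B"><country>$S:country</country></SATstudent>
    <SOMstudent source="A"><nation>$S:country</nation></SOMstudent>
  </local>
</Mapping>"""


def same_shape(a, b):
    """Structural equality on tag, attributes, stripped text, and children."""
    return (
        a.tag == b.tag
        and a.attrib == b.attrib
        and (a.text or "").strip() == (b.text or "").strip()
        and len(a) == len(b)
        and all(same_shape(x, y) for x, y in zip(a, b))
    )


def answer_for(query_xml, mappings):
    """Return an <answer> element holding the local queries, or None."""
    query = ET.fromstring(query_xml)
    for text in mappings:
        mapping = ET.fromstring(text)
        pattern, local = mapping[0], mapping.find("local")
        if same_shape(query, pattern):      # the query matches the global part
            bound = list(local)             # value taken by $E:expression
            answer = ET.Element("answer")   # constructed result element
            answer.extend(bound)            # both local queries at once
            return answer
    return None


if __name__ == "__main__":
    result = answer_for(
        "<student><country>$S:country</country></student>", [MAPPING]
    )
    print(ET.tostring(result, encoding="unicode"))
```

Since a single mapping stores the local forms for every source, one successful match yields all subqueries together, with no need for a separate pass per source.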
Query Decomposition
A special mapping structure is used, and XDD rules are applied for query decomposition
Subqueries for the distributed data sources are produced simultaneously
Data conversion works the same way: the extracted data are converted simultaneously to the global schema format
Conclusion
XDD is used to model all data components and processing components of the XML database integration framework
Components of the system modeled by XDD can communicate and exchange data easily
A special structure for XDD-based bidirectional mappings is designed. It supplies the information needed for both query decomposition and data conversion efficiently, avoiding data redundancy
The framework can
  Integrate n participating schemas
  Decompose a query into n subqueries at a time