Validation of Mappings Between Data Models


Guillem Rull
Technical University of Catalonia (UPC)
Barcelona, Spain
The Motivation
▪ Mappings are key elements of any application that requires interaction between heterogeneous data.
▪ Considerable research effort has gone into automating the mapping creation process.
▪ However, all approaches require human feedback at some point to resolve semantic heterogeneities.
▪ It is thus necessary to be able to check whether the resulting mappings satisfy the expected needs and requirements. Little work has been done in this area.
The Research
▪ The main goal is to propose a method for testing whether a mapping satisfies certain desirable properties.
▪ We will extend the CQC method, which we have successfully applied to the validation of database schemas.
▪ Main steps:
1. Identify the relevant properties to validate.
2. Validate mappings against these properties in the context of relational databases.
3. Extend the previous results to mappings between different types of models (XML, OO, etc.).
4. Develop a tool that, given a mapping and its models, performs tests to check the desirable properties.
Current State of the Research
▪ Two important properties of mappings are defined in the literature: mapping inference and query answerability.
▪ We have also proposed and formalized two additional properties: mapping satisfiability and mapping losslessness.
• Mapping inference allows us to check for redundant mapping formulas.
• Mapping losslessness allows us to check whether certain data is captured by the mapping. It is a generalization of query answerability.
• Query answerability checks whether the exact answer to a query is preserved by the mapping.
• Mapping satisfiability allows us to ensure that the mapping contains no contradictions.
▪ We have proved that the four properties can be expressed in terms of query liveliness in a relational database.
• A query is not lively if it returns an empty answer for every database instance. Liveliness can be checked with the CQC method.
• We can define a new schema that combines the mapped models and incorporates the mapping in the form of additional constraints.
• Then, for each property, we can define a query whose liveliness determines whether the property holds.
▪ We are currently working on computing explanations for the cases in which the properties do not hold.
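The notion of query liveliness used above can be illustrated with a small brute-force search. This is only a sketch under stated assumptions: it is not the CQC method, and the names `is_lively`, `name_is_key`, `happy` and `impossible`, as well as the tiny value domain, are hypothetical. It enumerates small instances of a toy relation employees(name, happiness_degree) and looks for one that satisfies the schema constraint and yields a non-empty answer.

```python
from itertools import combinations, product


def is_lively(candidate_tuples, constraint, query, max_size=2):
    """Search small instances for a witness of liveliness.

    Returns an instance (a set of tuples) that satisfies `constraint`
    and on which `query` returns a non-empty answer, or None if no
    such instance exists within the explored bounds.
    """
    for size in range(max_size + 1):
        for instance in combinations(candidate_tuples, size):
            db = set(instance)
            if constraint(db) and query(db):
                return db
    return None


# Hypothetical toy domain: two names, two happiness degrees.
candidates = list(product(["ann", "bob"], [5, 20]))


def name_is_key(db):
    """Schema constraint (assumed for illustration): name is a key."""
    names = [name for name, _ in db]
    return len(names) == len(set(names))


def happy(db):
    """Employees with a happiness degree above 10."""
    return {(n, h) for n, h in db if h > 10}


def impossible(db):
    """A contradictory query: no tuple can satisfy both conditions."""
    return {(n, h) for n, h in db if h > 10 and h < 10}


witness = is_lively(candidates, name_is_key, happy)
no_witness = is_lively(candidates, name_is_key, impossible)
print("happy is lively, witness:", witness)
print("impossible has a witness:", no_witness is not None)
```

A bounded search of this kind can confirm that a query is lively by exhibiting a witness, but failing to find a witness within the bounds does not in general prove that the query is not lively.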
Example of Mapping
▪ Source model:
employees(name, category, happiness-degree)
categories(name, salary)
▪ Target model:
happy-employees(name, happiness-degree)
all-employees(name, salary)
▪ Mapping formulas:
1. Source query:
   select name, happiness-degree
   from employees
   where happiness-degree > 10
   Target query:
   select name, happiness-degree
   from happy-employees
2. Source query:
   select employees.name, salary
   from employees, categories
   where employees.category = categories.name
   Target query:
   select name, salary
   from all-employees
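To make the example concrete, the following sketch loads one sample instance into an in-memory SQLite database and checks whether it satisfies both mapping formulas. Several things here are assumptions rather than part of the example: the sample rows are invented, hyphens in table and column names are replaced by underscores so that they are valid SQL identifiers, and each formula is read as an equality between the answers of its two queries. The helper names `answers` and `formula_holds` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Source model
    CREATE TABLE employees(name TEXT, category TEXT, happiness_degree INTEGER);
    CREATE TABLE categories(name TEXT, salary INTEGER);
    -- Target model
    CREATE TABLE happy_employees(name TEXT, happiness_degree INTEGER);
    CREATE TABLE all_employees(name TEXT, salary INTEGER);

    -- Invented sample data, for illustration only
    INSERT INTO employees VALUES ('ann', 'senior', 12), ('bob', 'junior', 3);
    INSERT INTO categories VALUES ('senior', 5000), ('junior', 3000);
    INSERT INTO happy_employees VALUES ('ann', 12);
    INSERT INTO all_employees VALUES ('ann', 5000), ('bob', 3000);
""")

# Each mapping formula pairs a query over the source model
# with a query over the target model.
FORMULAS = [
    ("SELECT name, happiness_degree FROM employees "
     "WHERE happiness_degree > 10",
     "SELECT name, happiness_degree FROM happy_employees"),
    ("SELECT employees.name, salary FROM employees, categories "
     "WHERE employees.category = categories.name",
     "SELECT name, salary FROM all_employees"),
]


def answers(sql):
    """Return the answer of a query as a set of rows."""
    return set(conn.execute(sql).fetchall())


def formula_holds(source_sql, target_sql):
    """Assumption: a formula holds when both queries return the same rows."""
    return answers(source_sql) == answers(target_sql)


results = [formula_holds(src, tgt) for src, tgt in FORMULAS]
print(results)
```

Checking a single instance like this only tests whether that instance is consistent with the mapping. The properties discussed earlier (satisfiability, inference, losslessness, query answerability) quantify over all instances and therefore need a reasoning method rather than a data check.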