Merging Models Based on Given Correspondences
Rachel A. Pottinger
Philip A. Bernstein
Introduction
“A model is a formal description of a
complex application artifact, such as a
database schema, an application
interface, a UML model, an ontology, or
a message format. The problem of
merging such models lies at the core of
many meta data applications.”
Introduction
Combining models requires two steps:
Determining correspondences between two
models (Schema matching)
Merging the models based on those
correspondences
Determining correspondences is a major
topic of ongoing research and is not
covered in this paper
Model Management
Proposed by Bernstein in “Applying
Model Management to Classical Meta
Data Problems”
Operators:
Match
Merge
Apply
Diff
Model Management
Presented solution:
Merge (A, B, MapAB)
A & B models
MapAB = mapping of correspondences
Returns “duplicate-free union” of A & B
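The "duplicate-free union" contract can be illustrated at the level of element sets. This is a minimal sketch, not the operator from the paper: the function name `duplicate_free_union`, the `A.`/`B.` prefixes on element names, and the reduction of MapAB to a plain dictionary of equal elements are all assumptions made for illustration.

```python
def duplicate_free_union(a, b, map_ab):
    """Union of element sets a and b in which mapped elements appear once.

    map_ab maps an element of b to the element of a it is equal to
    (a deliberately simplified stand-in for MapAB).
    """
    result = set(a)
    for element in b:
        # An element of b that is mapped is represented by its counterpart in a.
        result.add(map_ab.get(element, element))
    return result


a = {"A.Actor", "A.ActorID"}
b = {"B.Actor", "B.Bio"}
map_ab = {"B.Actor": "A.Actor"}  # hypothetical correspondence

merged = duplicate_free_union(a, b, map_ab)
print(sorted(merged))
```

The two Actor elements collapse into one, while unmapped elements from either model are carried over unchanged.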
Example - Conflict
Conflict Resolution
Conflict resolution is independent of
representation
Existing similarities among solutions
offer an opportunity for abstraction
Buneman, Davidson, and Kosky (BDK)
algorithm
Uses pair-wise correspondences that have
“Is-a” and “Has-a” relationships
Representation of Models
Representation requires (at least) 3
meta-levels
Model = database schema, etc.
Meta-model = type definitions
Meta-meta-model = representation
language in which models and meta-models are expressed
Inputs: Merge (A, B, MapAB)
Two models: A & B
Mapping: MapAB
First-class models, elements and relationships
Mapping elements, origins of additional mapping
relationships
Non-mapping elements
Equality and similarity mapping elements
Optional designation of preferred model
Optional overrides for Merge behavior
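One possible in-memory shape for these inputs is sketched below. The class names `Element` and `Model`, the field names, the `"Mapping"` relationship kind, and the helper `mapped_targets` are hypothetical; only the idea that the mapping is itself a model, whose mapping elements are the origins of mapping relationships and are marked as equality or similarity, comes from the slides.

```python
from dataclasses import dataclass, field


@dataclass
class Element:
    elem_id: str
    name: str
    how_related: str = ""  # "Equality" or "Similarity" on mapping elements


@dataclass
class Model:
    elements: dict = field(default_factory=dict)       # elem_id -> Element
    relationships: list = field(default_factory=list)  # (kind, origin, destination)

    def add(self, elem_id, name, how_related=""):
        self.elements[elem_id] = Element(elem_id, name, how_related)

    def relate(self, kind, origin, destination):
        self.relationships.append((kind, origin, destination))


model_a = Model()
model_a.add("a1", "Actor")
model_a.add("a2", "ActorName")
model_a.relate("Has-a", "a1", "a2")

model_b = Model()
model_b.add("b1", "Actor")
model_b.add("b2", "FirstName")
model_b.add("b3", "LastName")
model_b.relate("Has-a", "b1", "b2")
model_b.relate("Has-a", "b1", "b3")

# The mapping is a first-class model with its own elements and relationships.
map_ab = Model()
map_ab.add("m1", "Actor", how_related="Equality")
map_ab.add("m2", "Name", how_related="Similarity")
for origin, dest in [("m1", "a1"), ("m1", "b1"),
                     ("m2", "a2"), ("m2", "b2"), ("m2", "b3")]:
    map_ab.relate("Mapping", origin, dest)


def mapped_targets(mapping, mapping_elem_id):
    """Elements of A or B that a given mapping element points at."""
    return [d for kind, o, d in mapping.relationships
            if kind == "Mapping" and o == mapping_elem_id]


print(mapped_targets(map_ab, "m2"))
```

Because the similarity element `m2` can point at more than two elements, this shape expresses correspondences that a list of element pairs cannot.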
Complicated Mapping
Non-mapping element
Similarity Mapping
Mapping Result
Result = “a schema that presents all the
information of the schemas being
merged, but no additional information”
Resulting model, G, satisfies Generic
Merge Requirements
Conflict Resolution
Conflicts categorized based on metalevel
Representation conflicts
Meta-model conflicts
Fundamental conflicts
Representation Conflicts
Occur when two models describe the same
concept in different ways
Example: Name represented as ActorName vs.
FirstName & LastName
Different possible outputs
Solutions:
Concepts are the same, based on equality mapping
elements
Concepts are related through meta-meta-model relationships
and elements, e.g. FirstName is a sub-element of ActorName
Concepts are related in a more complex fashion beyond the meta-meta-model representation, e.g. ActorName equals the
concatenation of FirstName and LastName
Meta-Model Conflicts
Merge result violates a meta-model-specific constraint
Example: if a SQL table and an XML database are merged
into a SQL model, there will be no concept
of a sub-column
EnforceConstraints operator requires
merge results to conform to a given
meta-model.
Fundamental Conflicts
Meta-meta-model conflicts
Merge result violates meta-meta-model
rules and cannot be considered a model
Fundamental Conflicts Example
Meta-meta-model rule: one-type
restriction
Merge allowed actions:
Specify an alternative function to apply for
each conflict resolution category
Resolve the conflict manually
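A one-type restriction check with a pluggable resolution function might look like the sketch below. It assumes the restriction means "each element has at most one type"; the function names, the `(element, type)` pair encoding, the ZipCode data, and the example resolver are all hypothetical and are not the paper's default behavior.

```python
from collections import defaultdict


def find_one_type_conflicts(type_of):
    """type_of: iterable of (element, type) pairs in the merged result.

    Returns {element: sorted list of types} for every element that ended
    up with more than one type.
    """
    types = defaultdict(set)
    for element, type_name in type_of:
        types[element].add(type_name)
    return {e: sorted(t) for e, t in types.items() if len(t) > 1}


def resolve(conflicts, resolver):
    """Apply a caller-supplied resolution function to each conflict."""
    return {element: resolver(element, types)
            for element, types in conflicts.items()}


# Hypothetical merged result: ZipCode was an Integer in one input model
# and a String in the other.
merged_type_of = [("ZipCode", "Integer"), ("ZipCode", "String"),
                  ("Name", "String")]

conflicts = find_one_type_conflicts(merged_type_of)
# Example resolver (an assumption): keep the alphabetically first type.
chosen = resolve(conflicts, lambda element, types: types[0])
print(conflicts, chosen)
```

Passing the resolver as an argument mirrors the first allowed action on the slide: a different function can be supplied per conflict category, while manual resolution remains possible by inspecting `conflicts` directly.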
Cardinality Constraints
Maximum and minimum occurrences of
relations often restricted
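A cardinality check over a merged result could be sketched as follows. The function name, the triple encoding of relationships, and the bounds used in the example are illustrative assumptions; the slides do not say how such constraints are stored or enforced.

```python
from collections import Counter


def cardinality_violations(relationships, kind, min_count, max_count, elements):
    """Return {element: count} for elements whose number of outgoing
    relationships of the given kind falls outside [min_count, max_count]."""
    counts = Counter(origin for k, origin, _ in relationships if k == kind)
    return {e: counts[e] for e in elements
            if not (min_count <= counts[e] <= max_count)}


# Hypothetical merged relationships: Actor gained a third Has-a after merging.
relationships = [
    ("Has-a", "Actor", "ActorID"),
    ("Has-a", "Actor", "FirstName"),
    ("Has-a", "Actor", "LastName"),
    ("Has-a", "Movie", "Title"),
]

violations = cardinality_violations(
    relationships, "Has-a", 1, 2, ["Actor", "Movie", "Bio"])
print(violations)
```

Merging can push an element over a maximum (Actor here) even when each input model satisfied the constraint on its own, which is why the check runs on the merged result.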
Acyclicity
Models often required to be acyclic
Cycles introduced in merging are
collapsed into a single element by
default
User can override default behavior
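Collapsing each cycle into a single element can be sketched with a strongly-connected-components pass. The version below uses Kosaraju's algorithm on a plain directed graph; the function name, the graph encoding, and the sample elements are assumptions, and the recursive traversal is only suitable for small inputs.

```python
def collapse_cycles(nodes, edges):
    """Collapse every cycle of a directed graph into one element.

    Returns (groups, collapsed_edges): groups maps each node to the
    frozenset of nodes it was merged with; collapsed_edges connects groups.
    """
    succ = {n: [] for n in nodes}
    pred = {n: [] for n in nodes}
    for u, v in edges:
        succ[u].append(v)
        pred[v].append(u)

    # Pass 1: record nodes in order of depth-first completion.
    order, seen = [], set()

    def visit(n):
        seen.add(n)
        for m in succ[n]:
            if m not in seen:
                visit(m)
        order.append(n)

    for n in nodes:
        if n not in seen:
            visit(n)

    # Pass 2: walk the reversed graph in reverse completion order; each
    # traversal collects exactly one strongly connected component.
    groups = {}

    def assign(n, members):
        groups[n] = None  # placeholder marks the node as visited
        members.add(n)
        for m in pred[n]:
            if m not in groups:
                assign(m, members)

    for n in reversed(order):
        if n not in groups:
            members = set()
            assign(n, members)
            group = frozenset(members)
            for m in members:
                groups[m] = group

    # Edges inside a group disappear; edges between groups are kept once.
    collapsed = {(groups[u], groups[v]) for u, v in edges
                 if groups[u] != groups[v]}
    return groups, collapsed


nodes = ["Person", "Employee", "Staff", "Address"]
edges = [("Person", "Employee"), ("Employee", "Person"),  # cycle of length 2
         ("Person", "Address"), ("Staff", "Person")]

groups, collapsed = collapse_cycles(nodes, edges)
print(sorted(groups["Person"]))
```

A user override of the default would replace the final collapsing step with some other treatment of each multi-node group.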
The Merge Algorithm
Initialize result G to null
Include Elements with equivalence
relation
Combine element properties
Combine and include relationships
Fundamental conflict resolution
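The steps above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the paper's algorithm: models are plain dictionaries, the mapping is reduced to a list of equal `(a_id, b_id)` pairs, element ids are assumed unique across both models, property clashes are settled in favor of a `preferred` model, and the fundamental conflict resolution step is left out.

```python
def merge(model_a, model_b, equalities, preferred="A"):
    """model_x: {"elements": {id: {prop: value}},
                 "relationships": [(kind, origin, destination)]}
    equalities: (a_id, b_id) pairs declared equal by the mapping."""
    # Step 1: start from an empty result G.
    g = {"elements": {}, "relationships": set()}

    # Step 2: group elements into equivalence classes with union-find.
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a_id, b_id in equalities:
        parent[find(b_id)] = find(a_id)

    # Step 3: combine properties; the preferred model is written last,
    # so its values win when both models set the same property.
    first, second = ((model_a, model_b) if preferred == "A"
                     else (model_b, model_a))
    for model in (second, first):
        for elem_id, props in model["elements"].items():
            g["elements"].setdefault(find(elem_id), {}).update(props)

    # Step 4: re-point relationships at the merged elements; storing them
    # in a set drops the duplicates this creates.
    for model in (model_a, model_b):
        for kind, origin, dest in model["relationships"]:
            g["relationships"].add((kind, find(origin), find(dest)))

    # Step 5 (fundamental conflict resolution) is omitted in this sketch.
    return g


model_a = {"elements": {"a1": {"Name": "Actor"}, "a2": {"Name": "ActorID"}},
           "relationships": [("Has-a", "a1", "a2")]}
model_b = {"elements": {"b1": {"Name": "Performer", "Comment": "from B"},
                        "b2": {"Name": "ActorID"}},
           "relationships": [("Has-a", "b1", "b2")]}
equalities = [("a1", "b1"), ("a2", "b2")]

g = merge(model_a, model_b, equalities)
g_pref_b = merge(model_a, model_b, equalities, preferred="B")
print(g["elements"]["a1"], g["relationships"])
```

With model A preferred, the merged Actor element keeps A's Name and gains B's Comment, and the two Has-a relationships collapse into one.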
Merge Steps
[Figure: merge steps shown on the elements Actor, ActorID, Bio (twice), ActorName, FirstName and LastName, plus a node labeled "Sim"; each element carries the properties ID, History, HowRelated, Name, etc.]
Contributions
Technical requirements for a generic merge
operator
Use of a first-class input mapping model,
enabling richer correspondences
Characterization of when Merge can be
automatic
Taxonomy of conflicts and a definition of
conflict resolution strategies
Experimental evaluation and results
Evaluation
Merged Foundational Model of Anatomy (FMA) and
GALEN Common Reference Model
FMA contains 895,307 elements and 2,032,020 relationships
GALEN contains 155,307 elements and 569,384 relationships
Significant structural differences
Mapping contained 6265 1-to-1 correspondences
Evaluation Goals:
Limited changes to Merge would be needed
Merge would function on models this large
The merged results would not be simply read from the
mapping (i.e., the conflicts anticipated would occur)
Evaluation
Few non-fundamental changes had to be made
Merging took approx. 20 hours
Merge results
1,045,411 elements with 9,096 duplicates
2,590,969 relationships
338 cycles, most of length 2, were found
1 cycle of length 18 was found
Merged correspondences:
3 element merges: 2344
3+ element merges: 623
1 element merges: 1215
Conclusions
Algorithm is well designed
Merge() is implemented in a generic way that
allows for different models
Definitions of conflict management are given
Implementation and execution were very
labor-intensive
Slow, 13 weeks of expert work, 20 hours of
processor time
Relies on other systems with unknown results
Questions?
Generic Merge Requirements
1. Element preservation
2. Equality preservation
3. Relationship preservation
4. Similarity preservation
5. Meta-meta-model constraint satisfaction
6. Extraneous item prohibition
7. Property preservation
8. Value preference