Transcript foreign key

Ingénierie des bases de données avancées - avril 2010
LIBD
Database Reverse Engineering
Jean-Luc Hainaut
Based on the keynote of WCRE'09: Legacy and Future of Database Reverse Engineering
University of Namur
Faculté d'informatique
LIBD - Laboratory of Database Application Engineering
www.info.fundp.ac.be/libd
2
LIBD

What is Database Reverse Engineering?

Where does it come from?

Where is it going?
Contents
LIBD

Context, cultures and history

Databases: facts and challenges

Database reverse engineering motivation: the migration process

Database reverse engineering methodology

Database reverse engineering: A case study

Database reverse engineering: New challenges

Conclusions

Appendix 1: Conceptualization of foreign keys
3
4
LIBD
Context, cultures and history
Context, cultures and history
Database Reverse Engineering
LIBD

A collection of methods and tools to help an organization determine the
structure, function, and meaning of its data” [Chikofsky96]

A collection of methods and tools to help an organization to
recover/rebuild the complete/partial specifications of its data
A goal generally not reachable for programs
5
Context, cultures and history
LIBD
6
The role of the database in data-centered business applications
communications
GUI
manager
users
Application
(business logic)
DMS
DB schema
DB data
Context, cultures and history
7
Programming community vs DB community
LIBD
program
program
DB
DB
program
DB
• Programs implement
services available to client
agents
• The database is made up of
all the data that describe the
current state of the business
• Oh yes, wouldn't it be wise
to store data somewhere?
• Oh yes, wouldn't it be nice to
write some programs to
process these data?
Context, cultures and history
8
Building a database
LIBD
User
requirements
Information
analysis
Conceptual
schema
Logical
design
Logical (RDB)
schema
Physical
design
Physical (DB2)
schema
Coding
SQL-DDL code
Context, cultures and history
LIBD
9
Building a database

Analysis: elaborating the conceptual schema of the database (ERA,
UML class diagrams)

Logical Design: producing the logical schema of the database (e.g.,
hierarchical, relational, object-relational),

Physical Design: producing the physical schema of the database
(e.g., IMS/DL1, DB2, Oracle, SQL Server),

Coding: producing the DDL code of the database
Should produce a complete, up to date,
documentation of the DB
Context, cultures and history
10
What next?
LIBD

Database maintenance

Program maintenance

Database evolution

Database/application migration

Data conversion

Database integration/federation
All require a complete, up to date, documentation of the DB
Context, cultures and history
LIBD
No documentation?
Rebuild it through Reverse Engineering
11
Context, cultures and history
LIBD
12
Short history : three periods
80's : Discovery of the concepts of DBRE
• Migrating CODASYL DBs, IMS DBs, standard files  relational technology
• Techniques

automated DDL code interpretation

some trivial heuristics to elicit missing constraints (e.g., foreign keys)
• Approach insufficient to recover the complete database schema (implicit
data structures and constraints implemented in the procedural code and in
user interfaces)
Context, cultures and history
LIBD
Short history : three periods
90's : Deepening techniques and methodologies
• Additional elicitation techniques to recover implicit constructs
• Importance of program code analysis in DB schema recovery
• Development of comprehensive methodologies and tools.
• Need for reverse engineering relational databases
13
Context, cultures and history
LIBD
Short history : three periods
00's : Widening the scope of DBRE
• XML as a data model, the web as an infinite DB, semistructured/unstructured data, semantics expressed through conceptual
schemas and ontologies,
• MD engineering, traceability (formal mapping),
• help in automating full migration: schema + data + program,
• explosion of poorly designed (web) databases, complexity and size of
corporate databases, skill shortage
• dynamic SQL
• ORM-based program development
14
15
LIBD
Databases: facts and challenges
Databases: facts and challenges
LIBD
16
Some facts (1)

a company may use more than 10 DMSs to implement its information system;

a new version of a DMS every 4 years,

a database may be used by several thousands of programs;

the schema of a large database may include more than 1,000 entity types and
20,000 attributes

some database schemas have got so large and complex that no single data
administrator can master them any longer;
Databases: facts and challenges
LIBD
17
Some facts (2)

description of one entity type and its attributes: from 1 to 100 pages

the functional documentation of a large database may (should) comprise more
than 5,000 pages;

the SQL-DDL code of a database (tables, constraints, indexes, triggers, checks,
etc.) may comprise 200,000 LOC (5,000 pages);

however, many databases have no documentation.
Databases: facts and challenges
LIBD
Some facts (3)

database schemas share some interesting properties with programs:

corrective, preventive and adaptive maintenance (no added value) of an
information system may require more than 50% of development effort;

maintenance and evolution are much more costly when no correct
documentation is available.
• bugs
• awkward design
• dead parts
• obscure sections
• (nearly) duplicated sections
• developed on obsolete platforms
• poorly documented (if ever)
18
19
LIBD
DBRE Motivation:
the migration process
Database engineering - The migration process
LIBD
20
Database/application migration

Porting a complete legacy application, or some of its components, on another,
generally more modern, platform.

For a database: changing its DMS. A popular example: migrating the legacy
database of a business application to a RDBMS.

Two main approaches :
• physical approach
• semantic approach
Database engineering - The migration process
21
Database/application migration (2)
LIBD

The physical, or one-to-one migration strategy is the cheapest but also the
worst approach since it deeply degrades the final structure.
Requires no documentation of the DB – Very popular
Physical (IDMS)
schema
Physical
extraction
IDMS-DDL code
Transform
Physical (DB2)
schema
Coding
SQL-DDL code
Database engineering - The migration process
22
Database/application migration (3)
LIBD

Semantic approach: based on an in-depth understanding of the DB structures.
Provides a high quality result. Strong basis for the future.
Requires a complete, up to date, documentation of the DB
Conceptualization
Logical (DBTG)
schema
Logical
extraction
Reverse
Engineering
Physical (IDMS)
schema
Physical
extraction
IDMS-DDLcode
code
IDMS-DDL
Conceptual
Conceptual
schema
schema
Logical
Logical
design
design
Physical
Physical
design
design
Logical(RDB)
(RDB)
Logical
schema
schema
Physical (DB2)
(DB2)
Physical
schema
schema
Coding
Coding
SQL-DDLcode
code
SQL-DDL
Database engineering - The migration process
LIBD
Migrating an undocumented data structure (1)

physical (one-to-one) migration
SELECT CLIENT ASSIGN TO "CUST.DAT"
ORGANIZATION IS INDEXED
RECORD KEY IS CUST_ID.
FD CUST-FILE.
01 CUSTOMER.
02 CUST-ID PIC X(12).
02 CUST-INFO PIC X(80).
02 CUST-HIST PIC X(1000).
=
CUSTOMER
CUST-ID: char (12)
CUST-INFO: char (80)
CUST-HIST: char (1000)
id: CUST-ID

Create table CUSTOMER(
CUST_ID char(12) not null,
CUST_INFO char(80) not null,
CUST_HIST char(1000) not null,
primary key (CUST_ID));
=
CUSTOMER
CUST_ID: char (12)
CUST_INFO: char (80)
CUST_HIST: char (1000)
id: CUST_ID
23
Database engineering - The migration process
LIBD
24
Migrating an undocumented data structure (2)

semantic migration (refinement)
SELECT CLIENT ASSIGN TO "CUST.DAT"
ORGANIZATION IS INDEXED
RECORD KEY IS CUST_ID.
FD CUST-FILE.
01 CUSTOMER.
02 CUST-ID PIC X(12).
02 CUST-INFO PIC X(80).
02 CUST-HIST PIC X(1000).
+

CUSTOMER
CUST-ID: char (12)
CUST-INFO: compound (70)
NAME: char (20)
ADDRESS: char (40)
STATUS: char (10)
CUST-HIST -PURCH[0-100] array: compound (10)
ITEM: num (5)
T OTAL: num (5)
id: CUST-ID
id(CUST -HIST -PURCH):
ITEM

CUSTOMER
CUST-ID: char (12)
CUST-INFO: com pound (70)
NAME: char (20)
ADDRESS: char (40)
STATUS: char (10)
id: CUST-ID
0-100
record
1-1
CUST-HIST-PURCH
Index: index (4)
ITEM: num (5)
TOTAL: num (5)
id: record.CUSTOMER
ITEM
id': record.CUSTOMER
Index
Database engineering - The migration process
LIBD
25
Migrating an undocumented data structure (3)

semantic migration (SQL translation)
CUSTOMER
CUST-ID: char (12)
CUST-INFO: com pound (70)
NAME: char (20)
ADDRESS: char (40)
STATUS: char (10)
id: CUST-ID
0-100
record
Create table CUSTOMER(
CUST_ID
char(12) not
CUST_NAME
char(28) not
CUST_ADDRESS char(60) not
CUST_STATUS char(2) not
primary key (CUST_ID));
1-1
CUST-HIST-PURCH
ITEM: num (5)
Index: index (4)
TOTAL: num (5)
id: record.CUSTOMER
ITEM
id': record.CUSTOMER
Index

null,
null,
null,
null,
Create table CUST_HIST_PURCH(
CUST_ID
char(12) not null,
ITEM
char(10) not null,
CINDEX
smallint not null check(CINDEX <= 100),
TOTAL
smallint not null,
primary key (CUST_ID,ITEM),
unique (CUST_ID,CINDEX),
foreign key (CUST_ID) reference CUSTOMER);

CUSTOMER
CUST_ID
CUS_NAME
CUS_ADDRESS
CUS_STATUS
id: CUST_ID
CUST_HIST_PURCH
CUST_ID
ITEM
CINDEX
TOTAL
id: CUST_ID
ITEM
id': CUST_ID
CINDEX
ref: CUST_ID
No m ore than
100 CUST_HIST_PURCH
per CUSTOMER
Database engineering - The migration process
LIBD
Migrating an undocumented data structure - Synthesis
physical migration
Create table CUSTOMER(
CUST_ID char(12) not null,
CUST_INFO char(80) not null,
CUST_HIST char(1000) not null,
primary key (CUST_ID));
Create table CUSTOMER(
CUST_ID
char(12) not
CUST_NAME
char(28) not
CUST_ADDRESS char(60) not
CUST_STATUS char(2) not
primary key (CUST_ID));
semantic migration
null,
null,
null,
null,
Create table CUST_HIST_PURCH(
CUST_ID
char(12) not null,
ITEM
char(10) not null,
CINDEX
smallint not null check(CINDEX <= 100),
TOTAL
smallint not null,
primary key (CUST_ID,ITEM),
unique (CUST_ID,CINDEX),
foreign key (CUST_ID) reference CUSTOMER);
26
Database engineering - The migration process
LIBD
Migrating an undocumented data structure (4)

new application: compute total sales per item
CUSTOMER
CUST-ID: char (12)
CUST-INFO: char (80)
CUST-HIST: char (1000)
id: CUST-ID
CUSTOMER
CUST_ID
CUS_NAME
CUS_ADDRESS
CUS_STATUS
id: CUST_ID


?
CUST_HIST_PURCH
CUST_ID
ITEM
CINDEX
TOTAL
id: CUST_ID
ITEM
id': CUST_ID
CINDEX
ref: CUST_ID
Select ITEM, sum(TOTAL)
from
CUST_HIST_PURCH
group by ITEM;
• where is the required information?
• clearly visible + documentation if needed
• how to extract it from the CUSTOMER table?
• just name the columns
• who will develop the (C, Java, VB) program?
• by any non expert
• … and when?
• immediately, 2 minutes
27
28
LIBD
Database reverse engineering:
a DMS-independent methodology
Database reverse engineering - Introduction
LIBD
29
DBRE: the ideal view
create table COUNTRY (
CNAME char(24) not null,
VOLUME numeric(12) not null,
primary key (CNAME));
create table EXPORT (
CNAME char(24) not null,
SNAME char(18) not null,
primary key (CNAME, SNAME),
foreign key (CNAME)
references COUNTRY,
foreign key (SNAME)
references SECTOR);
create table SECTOR (
SNAME char(18) not null,
INCOME numeric(14) not null,
primary key (SNAME));
SECTOR
Sector Name
Income
id: Sector Name
0-N
COUNTRY
Country Name
Volume
id:Country Name
export
0-N
Database reverse engineering - Introduction
LIBD
30
DBRE: the actual view
create table TBL_C (
IDC char(24) not null,
COL_2 numeric(12) not null);
create table TBL_X (
IDX char(42) not null);
create table TBL_S (
IDS char(18) not null,
COL_2 numeric(14) not null);
create unique index ID_C on
TBL_C(IDC);
create unique index ID_X on
TBL_X(IDX);
?
SECTOR
Sector Name
Income
id: Sector Name
0-N
COUNTRY
Country Name
Volume
id:Country Name
export
0-N
Database reverse engineering - Introduction
LIBD
Where to find the additional information?

DDL analysis or catalog extraction (35%)

Schema analysis (10%)

Data analysis (10%)

HMI analysis (10%)

Program analysis (35%)
31
Database reverse engineering - Introduction
What are the challenges?
LIBD



The DDL code (or the data
dictionary) gives us the explicit
data structures and constraints.
Less than 50%

Physical extraction
Many undeclared, therefore
hidden, implicit data structures
and constraints.
More than 50%

Logical extraction
Many complex, non-standard
constructs: difficult to interpret.

Conceptualization
32
DBRE - Methodology
LIBD
33
The DB-MAIN Methodology - The technical processes
DDL code
Documentation
Physical
extraction
Physical
schema
Logical
extraction
Logical
schema
Conceptualization
Conceptual
schema
Program code
Data
User interface
Reports
DBRE - Methodology
LIBD
34
DB-MAIN methodology: Physical extraction
create table TBL_C (
IDC char(24) not null,
COL_2 numeric(12) not null);
create table TBL_X (
IDX char(42) not null);
create table TBL_S (
IDS char(18) not null,
COL_2 numeric(14) not null);
create unique index ID_C on
TBL_C(IDC);
create unique index ID_X on
TBL_X(IDX);
step 1
TBL_X
IDX
id': IDX
acc
TBL_C
IDC
COL_2
id': IDC
acc
TBL_S
IDS
COL_2
DBRE - Methodology
LIBD
35
DB-MAIN methodology : Logical extraction
TBL_X
IDX
id': IDX
acc
TBL_C
IDC
COL_2
id': IDC
acc
TBL_S
IDS
COL_2
 cryptic names
 missing unique keys
 missing foreign keys
 concatenated columns
step 2
EXPORT
CNAME
SNAME
id: CNAME
SNAME
ref: CNAME
ref: SNAME
COUNTRY
NOMP
VOLUME
id: NOMP
SECTOR
SNAME
CA
id: SNAME
DBRE - Methodology
LIBD
36
DB-MAIN methodology : Conceptualization
EXPORT
CNAME
SNAME
id: CNAME
SNAME
ref: CNAME
ref: SNAME
COUNTRY
CNAME
VOLUME
id: CNAME
SECTOR
SNAME
INCOME
id: SNAME
 technology-dependent
constructs
 semantic interpretation
unclear (sort of!)
step 3
SECTOR
Sector Name
Income
id: Sector Name
0-N
COUNTRY
Country Name
Volume
id:Country Name
export
0-N
DBRE - Methodology
37
Project management
LIBD
Project
planning
Identification
Pilote
Evaluation
Stop/go
Source
management
Selection
Conditioning
Full project
Physical
extraction
Sch. analysis
Data analysis
Logical
extraction
Prog. analysis
Others
Conceptualization
DBRE - Methodology
LIBD
Analysis techniques for Logical extraction
1. Schema analysis
2. Data analysis
3. Programming pattern (clichés) analysis
4. Query analysis
5. Program dependency analysis
6. Program slicing
others
38
DBRE - Methodology
LIBD
39
Analysis techniques for Logical extraction
1. Schema analysis
CUSTOMER
CUST-ID: char (16)
CUST-NAME: char (32)
CUST-ADDRESS: char (120)
id: CUST-ID
CUSTOMER
CUST-ID: char (16)
CUST-NAME: char (32)
CUST-ADDRESS: char (120)
id: CUST-ID

ORDER
ORD-NUM: num (8)
ORD-CUST: num (16)
ORD-DATE: date (8)
id: ORD-NUM
ORDER
ORD-NUM: num (8)
ORD-CUST: num (16)
ORD-DATE: date (8)
id: ORD-NUM
ref: ORD-CUST
One of the main heuristics in the 80's
DBRE - Methodology
LIBD
40
Analysis techniques for Logical extraction
2. Data analysis
PATIENT
SS_NUMBER
REG_NUMBER
NAME
ADDRESS
id: SS_NUMBER
select REG_NUMBER, count(*) as N
from
PATIENT
group by REG_NUMBER
having count(*) > 1

PATIENT
SS_NUMBER
REG_NUMBER
NAME
ADDRESS
id: SS_NUMBER
id': REG_NUMBER
REG_NUMBER |
N
-----------|-----|
DBRE - Methodology
LIBD
41
Analysis techniques for Logical extraction
3. Programming pattern (clichés) analysis
Much can be learned on the DB structures by examining how programs use the data.
As a consequence, several code analysis techniques have been designed to extract
information on these data structures.
Four popular examples:
 pattern analysis
 DML statement analysis (e.g., SQL queries analysis)
 dependency analysis
 program slicing
DBRE - Methodology
LIBD
42
Analysis techniques for Logical extraction
3. Programming pattern (clichés) analysis
CUSTOMER
CUST-ID: char (16)
CUST-NAME: char (32)
CUST-ADDRESS: char (120)
id: CUST-ID
ORDER
ORD-NUM: num (8)
ORD-CUST: num (16)
ORD-DATE: date (8)
id: ORD-NUM
+
select * from CUSTOMER
where CUST_ID = :CN;
if SQLCODE = 0 then begin
insert into ORDER
values(:ON,:CN,:OD);
end;
CUSTOMER
CUST-ID: char (16)
CUST-NAME: char (32)
CUST-ADDRESS: char (120)
id: CUST-ID

ORDER
ORD-NUM: num (8)
ORD-CUST: num (16)
ORD-DATE: date (8)
id: ORD-NUM
ref: ORD-CUST
DBRE - Methodology
LIBD
43
Analysis techniques for Logical extraction
4. Query analysis
CUSTOMER
CUST-ID: char (16)
CUST-NAME: char (32)
CUST-ADDRESS: char (120)
id: CUST-ID
ORDER
ORD-NUM: num (8)
ORD-CUST: num (16)
ORD-DATE: date (8)
id: ORD-NUM
+
select *
from CUSTOMER, ORDER
where CUST_ID = ORD_CUS
CUSTOMER
CUST-ID: char (16)
CUST-NAME: char (32)
CUST-ADDRESS: char (120)
id: CUST-ID

ORDER
ORD-NUM: num (8)
ORD-CUST: num (16)
ORD-DATE: date (8)
id: ORD-NUM
ref: ORD-CUST
DBRE - Methodology
44
Analysis techniques for Logical extraction
LIBD
5. Program dependency/flow analysis
CUSTOMER
CUST-ID: char (16)
CUST-NAME: char (32)
CUST-ADDRESS: char (120)
id: CUST-ID
CUSTOMER
CUST-ID: char (16)
CUST-NAME: char (32)
CUST-ADDRESS: char (120)
id: CUST-ID
?
ORDER
ORD-NUM: num (8)
ORD-CUST: num (16)
ORD-DATE: date (8)
id: ORD-NUM
ORDER
ORD-NUM: num (8)
ORD-CUST: num (16)
ORD-DATE: date (8)
id: ORD-NUM

+
select
where
C = ""
insert
ORD_CUST into :OC from ORDER
ORD_NUM = :ON;
+ OC;
into CUSTOMER values(:C,'?','?');
CUSTOMER.CUST_ID
C

ORDER.ORD_CUST
OC
DBRE - Methodology
45
Analysis techniques for Logical extraction
LIBD
6. Program slicing

Considering program P, statement s in P, and variables V of P, the
slice (s,V) in P is the subset of P that contributes to the state of V at
point s [Weiser,84].

Example : identify all the statements that fill a record before it is stored
in the database. Idea: the data validation statements must be
somewhere in this slice.

Has been extended
[Horwitz,90].

Has been extended to complex DB programs and schema objects
[Cleve,06].
to
programs
that
comprise
procedures
Useful to reduce the exploration space when searching programs for
patterns, clichés, dependencies, etc.
DBRE - Methodology
The DB-MAIN Methodology (1)
LIBD

Developed by the Laboratory of Database Engineering (LIBD),
University of Namur, since 1993 (work started in 1989). Effort
of about 40 man-year.

Supported by the DB-MAIN CASE environment

Support DDL, schema, data, DML, program code and HMI
analysis techniques

Produces the enriched logical schema, the conceptual schema
and the inter-schema mappings
46
DBRE - Methodology
47
The DB-MAIN Methodology (2)
LIBD

Extendible: can accommodate new sources, new technologies, new
languages, new DBRE techniques.

Integrated with database design methodologies

Targets most Data manager: RPG, COBOL files, IMS/DL1, IDMS/IDS2,
RDB, ORDB, XML

Automatic generation of documentation and additional code (interschema mappings, wrappers, ETL)
48
LIBD
Database reverse engineering:
A case study
DBRE - Methodology
49
DB-MAIN - A case study
LIBD
Context:
Belgian car distributor - Spare parts management application
Objective: migrating an IDMS database to a RDBMS
Size:
medium size database (324 record types)
First trial : physical migration, outsourced
Result : failure, very poor performance, unreadable code, the migrated
system cannot be maintained any longer . . .
Second trial : semantic migration, in house (DB-MAIN methodology)
Result : success; satisfying performance, maintainable result; gave the
project leader the "IT manager of the year" award in 2002!
DBRE - Methodology
50
Methodology (physical migration)
LIBD
Physical (IDMS)
schema
Physical
extraction
IDMS-DDL code
Transform
Physical (DB2)
schema
Coding
SQL-DDL code
DBRE - Methodology
51
Methodology (semantic migration)
LIBD
Conceptualization
Conceptual
schema
Logical design
Logical (RDB)
schema
Physical (DB2)
schema
Logical (DBTG)
schema
Logical
extraction
Physical design
Physical (IDMS)
schema
Physical
extraction
Coding
IDMS DDL code
SQL-DDL code
DBRE - Physical schema (IDMS) -Excerpts
LIBD
52
DBRE - (Raw) Logical schema (IDMS) - Excerpts
LIBD
53
DBRE - Conceptual schema - Excerpts
LIBD
54
DBRE - Logical schema (Relational) - Excerpts
LIBD
55
56
LIBD
New challenges
New challenges
LIBD

New forms of data

New engineering approaches

Higher process automation

Standard problems getting more critical

Dynamic SQL

ORM-based program development
57
New challenges
58
1. New forms of data
LIBD
RDB and ORDB are no longer the ultimate data models
Emergence of semi-structured and unstructured data

XML: complex data are stored and transmitted in XML format

HTML: web pages contains valuable data; requires more interpretation
(conceptualization) effort

Text: no structure but rich semantic contents
Unifying structural data semantics (DB) and unstructured data semantics :
increasing role of ontologies
New challenges
59
2. New engineering approaches
LIBD
 Software engineering is going model-driven (MDE)
 "Reverse Engineering is Reverse Forward Engineering" [Baxter,1997]
 DBRE basically is model-driven, and, more important, transformational
 many DBRE activities can be modeled as schema transformations

for higher reliability and completeness

for formal traceability

for better automation
New challenges
60
3. Higher process automation
LIBD

Reverse engineering is a prerequisite to most evolution processes

If performed correctly, it provides inter-schema mappings as a byproduct

These mappings can be used to automate evolution processes

Example: migration/evolution of an application from a legacy DBMS to a
modern one:
 migrating the schema (conceptual approach) [Hick, 2001]

migrating the data [Hick, 2001]

migrating the application programs [Cleve, 2009]
New challenges
4. Standard problems getting more critical
LIBD

Explosion of poorly designed (web) databases

Increasing complexity and size of corporate databases

Skill shortage in legacy technologies
61
New challenges
62
5. Dynamic SQL
LIBD

In the 80's: major use of static SQL
OD = "11-08-2009";
exec SQL select C.CUST_ID, O.ORD_NUM into :CI, :ON
from
CUSTOMER C, ORDER O
where O.ORD_CUST = C.CUST_ID
and
O.ORD_DATE = :OD;

In the 00's, increasing use of dynamic SQL (ODBC, JDBC, PHP)
Connection conn; String query; Statement inst; ResultSet res;
inst = conn.createStatement();
OD = "11-08-2009";
query = "select CUST_ID, ORD_NUM "
+ "from
CUSTOMER C, ORDER O where CUST_NUM = "
+ "where O.ORD_CUST = C.CUST_ID and O.ORD_DATE = "
+ "'" + OD + "'";
res = inst.executeQuery(query);
CI = res.getString(1); ON = res.getInt(2);
New challenges
63
5. Dynamic SQL
LIBD

Much research has been devoted to static analysis of (static) SQL queries in
application programs.

However: static analysis of programs and queries is no longer sufficient.

Solution: to resort to dynamic analysis, but much more complex.

Aspect technology promising [Cleve,08]
New challenges
LIBD
64
6. ORM-based program development
"Object-relational mapping (ORM, O/RM, and O/R mapping) in computer
software is a programming technique for converting data between
incompatible type systems in relational databases and object-oriented
programming languages. This creates, in effect, a "virtual object database"
that can be used from within the programming language."
Wikipedia [Object-relational_mapping]
Oct. 2009
• Programs work on objects
• Data are stored in relational tables
• ORM API's map objects on table rows
• ... and conversely
complex, N:N,
mappings allowed
New challenges
LIBD
65
6. ORM-based program development
Application
program
Relational
DB schema
Application
program
Class
schema
O/R
Mapping
RDB data
standard architecture
Relational
DB schema
RDB data
ORM architecture
New challenges
LIBD
66
6. ORM-based program development
ORM API's are getting very popular:
C++:
LiteSQL, Debea, dtemplatelib, hiberlite, romic, SOCI, etc.
Delphi: Bold for Delphi, Macrobject DObject, InstantObjects, tiOPF, etc.
Java: Carbonado, Cayenne, CocoBase, Ebean, EJB 3.0, Enterprise Objects Framework,
Hibernate, iBATIS, JPM2Java, JPOX, Kodo, Object Relational Bridge OpenJPA,
SimpleORM, Spring, TopLink, Torque, GenORMous, etc.
.NET: ADO.NET Entity Framework, Atlas, Base One Foundation Component Library, BCSEi
ORM Code Generator, Business Logic Toolkit for .NET, Castle ActiveRecord,
DataObjects.Net, CocoBase, Devart LINQ to SQL, DevForce, Developer Express, ECO,
EntityORM, EntitySpaces, Euss, Habanero, iBATIS, Invist, LLBLGen Pro, LightSpeed,
Altova Mapforce, Neo, .netTiers, NConstruct, NHibernate, Opf3, ObjectMapper .NET,
Picasso, OpenAccess, TierDeveloper, Persistor.NET, Quick Objects, Sooda, Subsonic,
Wilson ORMapper, etc.
PHP: CakePHP, Coughphp, DABL, Data Shuffler, dbphp, Doctrine, dORM, EZPDO, Hormon,
LightOrm, Outle, pdoMap, PersistentObject, PHPSimpleDB, Propel, Rocks, Qcodo,
Redbean, Xyster, etc.
Python: Django, SQLAlchemy, SQLObject, Storm, etc.
Ruby: Active Record, Ruby on Rails, Datamapper, iBATIS, Sequel, etc.
Pearl: DBIx::Class, Rose::DB::Object, Fey::ORM, Jifty::DBI, DBIx::DataModel,
Data::ObjectDriver, Class::DBI, etc.
New challenges
LIBD
67
6. ORM-based program development
"Object/relational mapping is the Vietnam of Computer Science"
Ted Neward
New challenges
LIBD
6. ORM-based program development
"Object/relational mapping is the Vietnam of Computer Science"
"It represents a quagmire which starts well, gets more complicated as time
passes, and before long entraps its users in a commitment that has no
clear demarcation point, no clear win conditions, and no clear exit
strategy."
Ted Neward's Technical Blog
June 2006
http://blogs.tedneward.com/2006/06/26/The+Vietnam+Of+Computer+Science.aspx
68
New challenges
LIBD
69
6. ORM-based program development
Three approaches to ORM definition:
1. the database pre-exists : the class schema is ([semi-]automatically) derived
from the DB schema;
problem: the class schema imperfectly fits programmers view
2. the class schema pre-exists : the database is ([semi-]automatically) derived
from the class schema;
problems: the DB schema imperfectly fits database good practices and may not
meet the needs of other applications (the one-DB-per-AP syndrom inherited
from the 60's);
3. both class and database schemas pre-exist : the mapping is built manually;
problem: the mapping is built manually!
New challenges
LIBD
70
6. ORM-based program development
Some observations:

Both schemas can evolve independently without the mapping being maintained

Class schema can be very large

Database schema generally is very large

Class schemas are poorly documented

Database schemas are poorly documented (even more than it used to be)

Mappings are poorly documented
New challenges
LIBD
71
6. ORM-based program development
So, what's the point?
maintenance and evolution are more than "problematic"
Maintenance and evolution require
"a complete, up to date, documentation of the DB"
... and of the classes
... and of the mappings.
Most often, neither the class schema nor the (generally ill-designed) DB have a
decent documentation
So, maintenance and evolution require preliminary reverse engineering.
New challenges
LIBD
6. ORM-based program development
Example
a large WebLogic application accessing an Oracle database through EJB
(Enterprise Java Beans) entities defined on SQL views.
Problem: maintenance and evolution are getting painful.
Objective:
1. redocumentation of the Oracle database (base schema, views, checks,
triggers, stored procedures);
2. redocumentation of the EJB entities
3. redocumentation of the O/R mappings
72
New challenges
LIBD
6. ORM-based program development
DB physical extraction:
Tables:
835; 10,683columns;
Views:
1,024; 12,998 columns;
Foreign keys:
1,447;
Checks:
2,232;
Triggers:
978.
EJB physical extraction:
Entities:
244;
Relations:
305;
Fields:
3,819;
Methods:
642.
73
New challenges
LIBD
6. ORM-based program development
Mapping analysis:

several EJB entities are mapped on missing tables/views (obsolete
entities);

26% foreign keys declared in EJB entities are not declared in the DB
schema (inconsistencies, implicit foreign keys); EJB ahead of DB

when an EJB entity uses a table, it does not always declare its unique and
foreign keys (inconsistencies, program execution abnomalies); DB ahead
of EJB
we now understand why maintenance
and evolution were declared so painful!
74
New challenges
LIBD
6. ORM-based program development
Conclusion: EJB/DB inconsistencies have been found through the
reverse engineering of both schemas.
75
76
LIBD
Database reverse engineering:
Conclusions
Conclusions
LIBD
77
DBRE often is a critical process in information system management
 the database is the central component of every IS.
 any weakness in this component weakens the whole IS
In the reverse engineering process, the database is not an isolated component
 the DB cannot be understood independently of the programs
 the programs cannot be understood independently of the DB
Conclusions
LIBD
78
There is no fixed point in DBRE: new technologies make the documentation
problem more critical and more complex
 rapid website development may produce apocalyptic databases
 OO programming and particularly ORM
Risk of transforming DB knowledge into a niche culture.
 ORM makes the DB transparent
 for many application programmers, the DB is where « persistence services »
(probably) are located. We have a DBA for that. Not an easy guy.
 for many application programmers, Domain modeling consists in designing a
nice Java class system.
Contacts
79
LIBD
Laboratory: http://info.fundp.ac.be/libd
ReveR:
http://www.rever-sa.com
80
LIBD
81
LIBD
Appendix 2
Foreign key conceptualization
DBRE - Foreign keys
82
LIBD
Foreign keys probably are the most important constructs
both in the logical extraction and conceptualization processes.
Field practice shows that they are used to express a surprising large variety
of semantic patterns.
DBRE - Foreign keys
83
LIBD

Standard foreign keys and basic variants

Non standard foreign keys

Inclusion constraints

Complex foreign key patterns

Interval foreign keys

Pathological foreign keys
Excepts from
Jean-Luc Hainaut, Conceptual interpretation of foreign keys, DB-MAIN Technical report, May
2010, http://www.info.fundp.ac.be/~dbm/Documents/Publications-LIBD/TechnicalReports/Conceptual-interpretation-of-FK-(stand-alone).pdf
DBRE - Foreign keys
LIBD
84
Standard foreign keys and basic variants
• Standard foreign key
SHIPMENT
ShipID
Cus tom er
Item Code
OrderDate
ShipDate
id: ShipID
ref: Cus tom er
Item Code
OrderDate
ORDER
Cus tID
ProdNumber
ODate
Qty
id: Cus tID
ProdNumber
ODate

SHIPMENT
ShipID
1-1
ShipDate
id: ShipID
for
0-N
ORDER
Cus tID
ProdNum ber
ODate
Qty
id: Cus tID
ProdNum ber
ODate
DBRE - Foreign keys
LIBD
85
Standard foreign keys and basic variants
• Optional foreign key
VEHICLE
VehicID
Make
Model
Data
Driver[0-1]
id: VehicID
ref: Driver
SALESMAN
SManID
Nam e
Address
id: SManID

VEHICLE
VehicID
Make
Model
Data
id: VehicID
0-1
driven by
SALESMAN
SManID
0-N Nam e
Address
id: SManID
DBRE - Foreign keys
LIBD
86
Standard foreign keys and basic variants
• Optional, multi-components foreign key
STUDENT
StudID
Name
Option
Title[0-1]
Year[0-1]
id: StudID
ref: Title
Year
coex
ideal pattern
DISSERTATION
Title
Year
Advis or
id: Title
Year
DISSERTATION
Title
Year
Advis or
id: Title
Year

STUDENT
StudID
Name
Option
id: StudID
0-N
0-1
writes
DBRE - Foreign keys
LIBD
87
Standard foreign keys and basic variants
• Optional, multi-components foreign key
STUDENT
StudID
Nam e
Option
Title[0-1]
Year[0-1]
id: StudID
ref: Title
Year
DISSERTATION
Title
Year
Advis or
id: Title
Year
STUDENT
StudID
Nam e
Option
id: StudID

STUD_T
STUD_Y
Year
Title
STUD_TY
often found
ref: STUD_T.Title
STUD_Y.Year
DISSERTATION
Title
Year
Advis or
id: Title
Year
= ideal pattern
DBRE - Foreign keys
LIBD
88
Standard foreign keys and basic variants
• Total foreign key (total FK)
DETAIL
OrdNum
ProdNum
Qty
id: OrdNum
ProdNum
equ: OrdNum
ORDER
OrdNumber
OrdDate
id: OrdNumber
DETAIL

ProdNum
Qty
id: from.ORDER
ProdNum
ORDER
1-1
from
OrdNumber
1-N
OrdDate
id: OrdNumber
DBRE - Foreign keys
LIBD
89
Standard foreign keys and basic variants
• Identifying foreign key
VEHICLE
VehicID
Make
Model
Data
Driver
id: VehicID
id': Driver
ref
SALESMAN
SManID
Nam e
Address
id: SManID

VEHICLE
VehicID
Make
Model
Data
id: VehicID
1-1
driven by
SALESMAN
SManID
0-1 Nam e
Address
id: SManID
DBRE - Foreign keys
LIBD
90
Standard foreign keys and basic variants
• Identifying foreign key
CUSTOMER
Cus tID
Nam e
Address
id: Cus tID
C_STATISTICS
Cus tID
TotalAm ount
Las tBuyDate
id: Cus tID
ref

CUSTOMER
Cus tID
0-1
Name
Address
id: Cus tID
of
C_STATISTICS
1-1 TotalAmount
Las tBuyDate
id becomes implicit
DBRE - Foreign keys
LIBD
91
Standard foreign keys and basic variants
• Cyclic foreign key
replaces
PRODUCT
ProductCode
Nam e
Price
QtyonHand
Subs titute[0-1]
id: ProductCode
ref: Subs titute
s ubstitute
0-N
0-1
PRODUCT

ProductCode
Nam e
Price
QtyonHand
id: ProductCode
DBRE - Foreign keys
LIBD
92
Non standard foreign keys
• Secondary foreign key
INVOICE
InvNum
InvDate
Patient
id: InvNum
ref: Patient
PATIENT
RegNum ber
SSNum ber
Nam e
id: RegNum ber
id': SSNum ber
INVOICE

InvNum
InvDate
id: InvNum
1-1
to
PATIENT
RegNum ber
SSNum ber
0-N
Nam e
id: RegNum ber
id': SSNum ber
DBRE - Foreign keys
LIBD
93
Non standard foreign keys
• Secondary foreign key (to optional id)
PERFORMANCE
GolfLicNumber
Year
BestScore
id: Year
GolfLicNumber
equ: GolfLicNumber
PERSON
PID
Name
Address
GolfLicNumber[0-1]
id: PID
id': GolfLicNumber

PERFORMANCE
GolfLicNum ber
Year
BestScore
id: Year
GolfLicNum ber
equ: GolfLicNum ber
PERSON
PID
Nam e
Address
id: PID
GOLFER
GolfLicNum ber
id': GolfLicNum ber
DBRE - Foreign keys
LIBD
94
Non standard foreign keys
• Multi-target foreign key
SERVICE
EXPENSE
BUDGET
SERVICE
EXPENSE
BUDGET
ServiceID
Name
id: ServiceID
ExpNum
Date
Amount
Origin Y
X
id: ExpNum
ref: Origin
ref: X
BudgetID
Nature
Amount
id: BudgetID
ServiceID
Name
id: ServiceID
ExpNum
Date
Amount
Origin
id: ExpNum
ref: Origin
ref: Origin
BudgetID
Nature
Amount
id: BudgetID
=
seems simple but includes a complex constraint
DBRE - Foreign keys
LIBD
95
Non standard foreign keys
• Multi-target foreign key
SERVICE
EXPENSE
BUDGET
ServiceID
Name
id: ServiceID
ExpNum
Date
Amount
Origin Y
X
id: ExpNum
ref: Origin
ref: X
BudgetID
Nature
Amount
id: BudgetID

SERVICE
ServiceID
Name
id: ServiceID
0-N
by
EXPENSE
ExpNum
1-1 Date
Amount
id: ExpNum
BUDGET
BudgetID
Nature
Amount
id: BudgetID
0-N
1-1
on
for e  EXPENSE,
e.by.SERVICE.ServiceID = e.on.BUDGET.BudgetID

SERVICE
ServiceID
0-1
Name
id: ServiceID
of
BUDGET
BudgetID
1-1 Nature
Amount
id: BudgetID
EXPENSE
ExpNum
1-1
Date
Amount
id: ExpNum
0-N
on
for b  BUDGET,
b.BudgetID = b.of.SERVICE.ServiceID
DBRE - Foreign keys
LIBD
96
Non standard foreign keys
• Alternate foreign key
SERVICE
ServiceID
Nam e
id: ServiceID
EMPLOYEE
EmpID
Name
Address
id: EmpID
SERVICE
ServiceID
Name
id: ServiceID
VEHICLE
PlateNumber
Model
Date
Owner X
X
id: PlateNumber
ref: Owner
ref: X

owner
0-N
EMPLOYEE
EmpID
Nam e
Address
id: EmpID
owned
1-1
VEHICLE
PlateNumber
Model
Date
id: PlateNumber
DBRE - Foreign keys
LIBD
97
Non standard foreign keys
• Hierarchical foreign key to an ET
DEPARTMENT
DptNam e
Location
id: DptNam e
DEPARTMENT
DptName
Location
id: DptName
0-N
0-N
in
1-1
SERVICE
ServName
Budget
id: in.DEPARTMENT
ServName
EXPENSE
ExpID
Date
Amount
DptName
ServName
id: ExpID
ref: DptName
ServName

in
1-1
SERVICE
ServNam e
0-N
Budget
id: in.DEPARTMENT
ServNam e
by
EXPENSE
ExpID
1-1 Date
Am ount
id: ExpID
DBRE - Foreign keys
LIBD
98
Non standard foreign keys
• Hierarchical foreign key to a multivalued attribute
ORDER
OrdID
Date
id: OrdID
ORDER
OrdID
Date
id: OrdID
0-20
SHIPMENT
ShipNum ber
Date
OrdID
Item Code
Qty
id: ShipNum ber
ref: OrdID
Item Code
ORDER
OrdID
Date
Detail[0-20]
Item Code
O
Qty
O OrdID
id:
id(Detail):
id: O
Item Code
0-20

SHIPMENT
ShipNumber
Date
OrdID
Item Code
Qty
id: ShipNumber
ref: OrdID
Item Code
in
1-1
DETAIL
Item Code
Qty
id: in.ORDER
Item Code
in

1-1
DETAIL
Item Code
Qty
id: in.ORDER
Item Code
SHIPMENT
ShipNum ber
1-1
Date
Qty
id: ShipNum ber
0-N
for
DBRE - Foreign keys
LIBD
99
Non standard foreign keys
• Computed foreign key
PURCHASE

FISCAL-YEAR
Year
Budget
id: Year
FISCAL-YEAR
Year
Budget
id: Year
0-N
for
1-1
PurchID
Agent
Date
Am ount
id: PurchID
for pPURCHASE,
p.for.FISCAL-YEAR.Year = f(p.Date)
PURCHASE
PurchID
Agent
Date
Amount
id: PurchID
ref: f(Date)

FISCAL-YEAR
Year
Budget
id: Year
0-N
for
PURCHASE
PurchID
Agent
Date
1-1
Month
Day
Amount
id: PurchID
DBRE - Foreign keys
LIBD
100
Non standard foreign keys
• Computed foreign key
TAX-RATE
Country
Year
Rate
id: Country
Year
PURCHASE
PurchID
Customer
Year
Amount
id: PurchID
ref: Customer
ref: f(Customer)
Year
0-N
TAX-RATE
Country
Year
Rate
id: Country
Year
CITY
CityID
CityName
ContryName
id: CityID
CITY
CityID
CityNam e
ContryName
id: CityID
0-N
0-N
CUSTOMER
CustomerID
Name
City
id: CustomerID
ref: City

in
at_rate
1-1
1-1
PURCHASE
PurchID
Year
Amount
id: PurchID
CUSTOMER
Cus tomerID
Nam e
id: Cus tomerID
1-1
by

COUNTRY
ContryName
id: ContryName
0-N
in
in
1-1
1-1
TAX-RATE
CITY
Year
Rate
id: in.COUNTRY
Year
CityID
CityNam e
id: CityID
0-N
0-N
in
at_rate
1-1
1-1
PURCHASE
CUSTOMER
PurchID
Year
Amount
id: PurchID
Cus tom erID
Nam e
id: Cus tom erID
0-N
1-1
for pPURCHASE:
p.at_rate.TAX-RATE.Country
= p.by.CUSTOMER.in.CITY.CountryName
by
0-N
for pPURCHASE:
p.at.TAX-RATE.for.COUNTRY
= p.by.CUSTOMER.in.CITY.in.COUNTRY
DBRE - Foreign keys
LIBD
101
Non standard foreign keys
• Non-1NF foreign key
PERSON
BOOK
PERSON
PID
Name
Address
id: PID
BookID
Title
Borrowing[0-1]
Borrower
Date
id: BookID
ref: Borrowing.Borrower

BOOK
BookID
Title
Date[0-1]
id: BookID
coex: by.PERSON
Date
PID
Nam e
Address
id: PID
0-N
0-1

PERSON
PID
Nam e
Address
id: PID

BOOK
BookID
Title
id: BookID
PERSON
PID
Nam e
Addres s
id: PID
0-1
0-N
of
Borrowing
Date
1-1
0-N
BOOK
BookID
Title
id: BookID
by
0-1
Borrowing
Date
1-1
by
DBRE - Foreign keys
LIBD
102
Non standard foreign keys
• Non-1NF foreign key
CUSTOMER
ORDER
Cus tID
Name
Address
Order[0-N]
id: Cus tID
ref: Order[*]
OrdID
Date
Detail[0-20]
Item Code
Qty
id: OrdID
ref: Detail[*].ItemCode
ITEM
Item ID
Name
UnitPrice
id: Item ID

CUSTOMER
Cus tID
Nam e
Address
id: Cus tID
0-N
place
0-N
ORDER
ITEM
OrdID
Date
id: OrdID
Item ID
Nam e
UnitPrice
id: Item ID
0-20
of
1-1
0-N
Detail
Qty
id: ref.ITEM
of.ORDER
ref
1-1
DBRE - Foreign keys
LIBD
103
Inclusion constraints
• Inclusion constraint
OFFER
Supplier
Item
id: Supplier
Item
ORDER
OrderID
Supplier
Item
Qty
Cus tomer
id: OrderID
incl: Supplier
Item
SUPPLY
Supplier
Item
Qty
Price
id: Supplier
Item
Qty
gr: Supplier
Item
1-N

ORDER
OrderID
Supplier
Item
Cus tom er
id: OrderID
ref: Supplier
Item
of
1-1
SUPPLY
Qty
Price
id: of.OFFER
Qty
DBRE - Foreign keys
LIBD
104
Inclusion constraints
• Domain sharing
CHAIN
Name: ChainName
id: Name
OFFER
Item: char (12)
Chain: ChainName
Price: num (5)
id: Item
Chain
SHOP
Name: ChainName
Town: char (32)
Size: num (6)
id: Name
Town
0-N

by
1-1
OFFER
Item: char (1)
Price: num (5)
id: by.CHAIN
Item
0-N
of
1-1
SHOP
Town: char (32)
Size: num (6)
id: of.CHAIN
Town
DBRE - Foreign keys
LIBD
105
Complex foreign key patterns
• Conditional foreign key
STUDENT
StudID
Name
Country
School
id: StudID
cond_ref: School
SCHOOL
SchoolName
Address
Category
id: SchoolName
STUDENT
StudID
Nam e
Country
BelgianSchool[0-1]
ForeignSchool[0-1]
id: StudID
ref: BelgianSchool
exact-1: BelgianSchool
ForeignSchool

for s  STUDENT:
s.Country = "Belgium"  s.School  SCHOOL.SchoolName
SCHOOL
SchoolName
Address
Category
id: SchoolName
for s  STUDENT: s.Country = "Belgium"
 s.BelgiumSchool  SCHOOL.SchoolName

STUDENT
StudID
Nam e
id: StudID
SCHOOL
SchoolNam e
Address
Category
id: SchoolNam e
P
0-N
STUDENT
StudID
Name
id: StudID

SCHOOL
SchoolName
Address
Category
id: SchoolName
P
FOREIGN
BELGIAN
Country
School
School
1-1
from
for f  FOREIGN: f.Country  "Belgium"
FOREIGN
Country
ForeignSchool
BELGIAN
BelgianSchool
ref: BelgianSchool
for f  FOREIGN: f.Country  "Belgium"
DBRE - Foreign keys
LIBD
106
Complex foreign key patterns
• Overlapping identifier - foreign key
DETAIL
ORDER
OrderNbr
LineNbr
ItemCode
Qty
id: OrderNbr
LineNbr
ref: OrderNbr
OrderNbr
Cus tomer
Date
id: OrderNbr
ORDER
OrderNbr
Cus tom er
Date
id: OrderNbr

DETAIL
LineNbr
Item Code
Qty
id: in.ORDER
LineNbr
0-N
1-1
in
DBRE - Foreign keys
LIBD
107
Complex foreign key patterns
• Overlapping identifier - foreign key
DETAIL
ITEM
OrderID
Item Code
Date
Qty
id: OrderID
Item Code
ref: Item Code
Date
Item Code
Date
Price
id: Item Code
Date

DETAIL
OrderID
Item Code
Item Code_R
Date
Qty
id: OrderID
Item Code
ref: Item Code_R
Date
ITEM
Item Code
Date
Price
id: Item Code
Date
for d  DETAIL:
d.ItemCode = d.ItemCode_R

ITEM
Item Code
Date
Price
id: Item Code
Date
DETAIL
for d  DETAIL:
d.ItemCode = d.ref.ITEM.ItemCode
OrderID
Item Code
1-1
Qty
id: OrderID
Item Code
0-N
ref
DBRE - Foreign keys
LIBD
108
Complex foreign key patterns
• Overlapping foreign keys
INVOICE
OrderNumber
InvoiceNumber
Date
Amount
id: OrderNumber
InvoiceNumber
LINE-of-INVOICE
OrderNumber
InvoiceNumber
LineNumber
ItemCode
Qty
Amount
id: OrderNumber
InvoiceNumber
LineNumber
ref: OrderNumber
InvoiceNumber
ref: OrderNumber
ItemCode
LINE-of-ORDER
OrderNumber
ItemCode
Qty
id: OrderNumber
ItemCode

INVOICE
OrderNum ber
InvoiceNum ber
Date
Am ount
id: OrderNum ber
InvoiceNum ber
LINE-of-ORDER
OrderNum ber
Item Code
Qty
id: OrderNum ber
Item Code
0-N
0-N
from
1-1
LINE-of-INVOICE
LineNum ber
Qty
Am ount
id: from .INVOICE
LineNum ber
for
1-1
for l  LINE-of-INVOICE:
l.from.INVOICE.OrderNumber
= l.for.LINE-of-ORDER.OrderNumber
DBRE - Foreign keys
LIBD
109
Complex foreign key patterns
• Overlapping foreign keys
ORDER
0-N
INVOICE
OrderNum ber
InvoiceNum ber
Date
Am ount
id: OrderNum ber
InvoiceNum ber
LINE-of-ORDER
OrderNum ber
Item Code
Qty
id: OrderNum ber
Item Code
0-N
from
1-1
LINE-of-INVOICE
LineNum ber
Qty
Am ount
id: from .INVOICE
LineNum ber
from
1-1
1-1
INVOICE

0-N
for
InvoiceNumber
Date
Amount
id: for.ORDER
InvoiceNumber
0-N
0-N
for
from
1-1
1-1
for l  LINE-of-INVOICE:
l.from.INVOICE.OrderNumber
= l.for.LINE-of-ORDER.OrderNumber
OrderNumber
id: OrderNumber
LINE-of-ORDER
Item Code
Qty
id: from.ORDER
Item Code
0-N
LINE-of-INVOICE
LineNumber
Qty
Amount
id: from.INVOICE
LineNumber
for
1-1
for l  LINE-of-INVOICE:
l.from.INVOICE.for.ORDER
= l.for.LINE-of-ORDER.from.ORDER
DBRE - Foreign keys
LIBD
110
Complex foreign key patterns
• Non-minimal FK

REGISTRATION
Student
Subject
Lecturer
id: Student
Subject
ref: Subject
Lecturer
LECTURE
Subject
Lecturer
id: Subject
Lecturer
id': Lecturer
REGISTRATION
Student
1-1
Subject
id: Student
Subject
for
LECTURE
Lecturer
0-N
Subject
id: Lecturer
for r  REGISTRATION: r.Subject = r.for.LECTURE.Subject
gives
1-1

id: SUBJECT
LECTURER
LECTURER
0-N
SUBJECT
regis tration
0-N
id: STUDENT
SUBJECT
incl: LECTURER
SUBJECT
0-N
STUDENT
0-N
DBRE - Foreign keys
LIBD
111
Complex foreign key patterns
• Partially reciprocal foreign keys
COUNTRY
CITY
CountryName
Capital
Area
id: CountryName
ref: Capital
CountryName
CityName
Country
Population
id: CityName
Country
ref: Country

COUNTRY
CountryName
Capital
Area
id: CountryName
id': Capital
CountryName
ref
CITY
CityName
Country
Population
id: CityName
Country
equ: Country

in
1-N
COUNTRY
CountryName
Area
id: CountryName
gr: COUNTRY
CITY
1-1
capital
1-1
incl: CITY
COUNTRY
0-1
CITY
CityName
Population
id: in.COUNTRY
CityName
DBRE - Foreign keys
LIBD
112
Complex foreign key patterns
• Inverse foreign keys
CUSTOMER
Cus tID
Nam e
Address
Orders [0-N]
id: Cus tID
id': Orders [*]
ref
inv: Orders [*]
ORDER
OrderID
Date
Cus tom er
id: OrderID
ref: Cus tom er
inv: Cus tom er

CUSTOMER
Cus tID
Nam e
Address
id: Cus tID
ORDER
OrderID
Date
Cus tom er
id: OrderID
ref: Cus tom er
DBRE - Foreign keys
113
Temporal (interval?) foreign keys
LIBD
H_EMPLOYEE
CODE
s tart
end
NAME
STATUS
ADDRESS
PROJECT
id: CODE
s tart
tref: PROJECT
s tart
H_PROJECT
TITLE
s tart
end
THEME
BUDGET
id: TITLE
s tart

«valid»
EMPLOYEE
CODE
NAME
«valid» STATUS
«valid» ADDRESS
«valid» PROJECT
id: CODE
1-1
«valid»
works on
0-N
«valid»
PROJECT
TITLE
«valid» THEME
«valid» BUDGET
id: TITLE
DBRE - Foreign keys
LIBD
114
Pathological foreign keys
• Loosely matching foreign keys
The domains of the foreign key and of the target key are comparable in some way
DETAIL
DET-ID: compound (20)
ORD-ID: num (8)
ITEM-CODE: char (12)
QTY: num (6)
AMOUNT: num (8,2)
id: DET-ID
SHIPMENT
SHIP-NUM: num (8)
DETAIL: char (20)
AMOUNT: num (8,2)
id: SHIP-NUM
ref: DETAIL
CUSTOMER
CUSTOMER
CUSTOMER
CUST-ID: num (10)
Nam e: char (24)
id: CUST-ID
CUST-ID: num (10)
Nam e: char (24)
id: CUST-ID
CUST-ID: num (10)
Nam e: char (24)
id: CUST-ID
ORDER
INV-NUM: num (8)
CUST: num (8)
DATE-ORD: date (8)
AMOUNT: num (8,2)
id: INV-NUM
ref: CUST
ORDER
INV-NUM: num (8)
CUST: num (12)
DATE-ORD: date (8)
AMOUNT: num (8,2)
id: INV-NUM
ref: CUST
ORDER
INV-NUM: num (8)
CUST: char (12)
DATE-ORD: date (8)
AMOUNT: num (8,2)
id: INV-NUM
ref: CUST
DBRE - Foreign keys
LIBD
115
Pathological foreign keys
• 99% correct foreign key
CUSTOMER
Cus tID
Nam e
id: Cus tID
CUSTOMER
CATEGORY
Cus tID
Nam e
Category
id: Cus tID
99%ref: Category
CatNam e
Rebate
id: CatNam e
CATEGORY
CatNam e
Rebate
id: CatNam e
0-N

of
P
1-1
CUSTOMER(wrong)
Category[0-1]
CUSTOMER(correct)
for c  CUSTOMER(wrong):
c.Category  CATEGORY.CatName
DBRE - Foreign keys
LIBD
116
Pathological foreign keys
• Transitive foreign key (non-redundant)
CUSTOMER
Cus tID
Nam e
id: Cus tID
INVOICE
InvID
Cus tID
OrdNum
Date
Am ount
id: InvID
ref: Cus tID
ref: Cus tID
OrdNum
CUSTOMER
ORDER
Cus tID
OrdNum
Date
id: Cus tID
OrdNum
ref: Cus tID
Cus tID
Nam e
id: Cus tID

INVOICE
InvID
Cus tID
OrdNum
Date
Am ount
id: InvID
ref: Cus tID
OrdNum
ORDER
Cus tID
OrdNum
Date
id: Cus tID
OrdNum
ref: Cus tID
DBRE - Foreign keys
LIBD
117
Pathological foreign keys
• Transitive foreign key (redundant)
DEPARTMENT
DepartID
Nam e
id: DepartID
CUSTOMER
Cus tID
Nam e
Agent
Departm ent
id: Cus tID
ref: Departm ent
ref: Agent
EMPLOYEE
Em pID
Nam e
Depart
id: Em pID
ref: Depart
DEPARTMENT
DepartID
Nam e
id: DepartID

CUSTOMER
Cus tID
Nam e
Agent
id: Cus tID
ref: Agent
EMPLOYEE
EmpID
Nam e
Depart
id: EmpID
ref: Depart
DBRE - Foreign keys
LIBD
118
Pathological foreign keys
• Partly optional foreign key
STUDENT
StudID
Name
Diss ert[0-1]
Year
id: StudID
ref: Diss ert
Year
DISSERTATION
Title
Year
Advis or
id: Title
Year
STUDENT
StudID
Nam e
Year
id: StudID
DISSERTATION
Title
Year
Advis or
id: Title
Year

LAST-YEAR-STUDENT
Diss ert
ref: Diss ert
STUDENT.Year

STUDENT
StudID
Nam e
Year
id: StudID
DISSERTATION
Title
Year
Advis or
id: Title
Year
0-N
for s  LAST-YEAR-STUDENT:
s.Year = s.writes.DISSERTATION.Year
LAST-YEAR-STUDENT
1-1
writes
DBRE - Foreign keys
LIBD
119
Pathological foreign keys
• Embedded foreign key
ORDER
Sender
OrdNum
Date
Am ount
id: Sender
OrdNum
INVOICE
InvNum
Date
Am ount
Cus tomer
Order
id: InvNum
ref: Cus tomer
Order
ref: Cus tomer
CUSTOMER
Cus tID
Name
Address
id: Cus tID

ORDER
Sender
OrdNum
Date
Amount
id: Sender
OrdNum
CUSTOMER
Cus tID
Name
Address
id: Cus tID
INVOICE
CUST-ORDER
id: ORDER.Sender
ORDER.OrdNum
ref: ORDER.Sender
InvNum
Date
Amount
Cus tomer
Order
id: InvNum
ref: Cus tomer
ref: Cus tomer
Order
DBRE - Foreign keys
LIBD
120
Pathological foreign keys
• Embedded foreign key
ORDER
Sender
OrdNum
Date
Amount
id: Sender
OrdNum
CUSTOMER
Cus tID
Name
Address
id: Cus tID
INVOICE
CUST-ORDER
id: ORDER.Sender
ORDER.OrdNum
ref: ORDER.Sender
InvNum
Date
Amount
Cus tomer
Order
id: InvNum
ref: Cus tomer
ref: Cus tomer
Order
ORDER

CUSTOMER
Sender
OrdNum
Date
Amount
id: Sender
OrdNum
Cus tID
Nam e
Address
id: Cus tID
0-N
from
1-1
INVOICE
CUST-ORDER
0-N
for
1-1
InvNum
Date
Amount
id: InvNum
DBRE - Foreign keys
LIBD
121
Pathological foreign keys
• Reflexive foreign key
A
A1
A2
id: A1
ref
A

A1
A2
id: A1