Sequeda_SQL_Moving_Target

Download Report

Transcript Sequeda_SQL_Moving_Target

SQL Databases are a Moving Target
Juan F. Sequeda – [email protected]
Syed Hamid Tirmizi – [email protected]
Daniel P. Miranker – [email protected]
Department of Computer Sciences
The University of Texas at Austin
My favorite slide
SEMANTIC WEB
DATABASES
2
The Semantic Web Vision
Semantic
Query Engine
User Layer: Users and agents
interact with this layer and query the
ontologies
Local
ontology
Local
ontology
Local
ontology
DB
DB
DB
Ontology Layer: defines the
semantic representation of RDB.
Inference
Database Layer: Data that needs to
accessed semantically
3
So what should be done?
• Make it easy for existing databases to generate
Semantic Web content.
Research Problem(s):
Generate Ontologies from Database Content
• (many systems already make RDF from database data)
• Mapping database restrictions to OWL
– People don’t live past 120 years old
• Mining database content for additional domain knowledge
– Professors earn more than teaching assistants
4
Finally: Can the Semantic Web Work?
Semantic
Query Engine
• We see a big problem:
– Who builds this?
– Who generates these
mappings?
Local
ontology
Local
ontology
Local
ontology
DB
DB
DB
5
The Semantic Web Vision
Semantic Web
Query Engines
generate
DB
Schema/
Metadata
Table
Content
Local
ontology
RDF
SQL-query
translate
6
Our Position
• SQL has semantics
– SQL has been evolving  it is a moving target!
– SQL DDL can be used to generate local ontologies
– A Greatest Common Denominator like OWL-DL
can aid with data integration
7
Why are SQL Databases Moving Targets?
• In the beginning we had… Relational Model
Student(Juan, 22)
• SQL86-89 came out with Table Definitions
CREATE TABLE employee (
name VARCHAR(100),
age INTEGER)
• SQL92 added data integrity Constraints
CHECK, PRIMARY KEY, FOREIGN KEY,
UNIQUE
• SQL99 added Triggers
8
SQL DDL to Ontologies
• Requirements
– To create the ontology automatically, we need to compare the
technologies
– We need to identify constructs with similar semantics
• Analysis
– Our analysis leads to a layer cake for SQL with corresponding
layers in the Semantic Web stack
• Implementation
– We express our transformations as FOL and use SQL BNF
as a guideline for completeness.
9
Relational Model to RDF
• Relational Model
– Employee(name, age)
– T1: Employee (Juan, 21)
• RDF
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:ex="http://www.example.com/#">
<rdf:Description rdf:about="http://www.example.com/employee">
<ex:name>Juan</ex:name>
<ex:age>21</ex:age>
</rdf:Description>
</rdf:RDF>
10
Table Definition to RDFS
• Table Definition
CREATE TABLE employee (
name VARCHAR(100),
age INTEGER)
• RDFS
11
Table Definition to RDFS
<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
<rdfs:Class rdf:ID="employee">
Table Name
<rdfs:comment>Employee</rdfs:comment>
<rdfs:subClassOf
rdf:resource="http://www.w3.org/1999/02/22-rdf-syntaxns#Resource"/>
</rdfs:Class>
Attributes
<rdf:Property rdf:ID="name">
<rdfs:comment>Name of Employee</rdfs:comment>
<rdfs:domain rdf:resource="#employee"/>
<rdfs:range rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#Literal"/>
</rdf:Property>
<rdf:Property rdf:ID="age">
<rdfs:comment>Age of Employee</rdfs:comment>
<rdfs:domain rdf:resource="#employee"/>
<rdfs:range rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#Literal"/>
</rdf:Property>
</rdf:RDF>
Data type
12
SQL to RDFS
• Tables  rdfs:Class
• Columns  rdf:Property
– Table  rdfs:Domain
– Datatype  rdfs:Range
• Foreign Keys  rdf:Property
– Table  rdfs:Domain
– Referencing Table  rdfs:Range
13
SQL to RDFS
SQL BNF
rdfs:Class
<table definition> ::= CREATE [ <table
scope> ] TABLE <table name> <table
contents source> [ … ]
rdf:Property
rdfs:domain
rdfs:range
<column definition> ::= <column name>
<data type> [NOT NULL] [...]
rdf:Property
rdfs:domain
rdfs:range
<referential constraint definition>::=
FOREIGN KEY <left paren> <referencing
columns> <right paren> REFERENCES
<referenced table and columns> [...]
Shared
Semantics
Tables are
equivalent to
classes
Attributes
become
properties
having the
domain as the
current table
and the range as
the datatype.
Foreign Keys
are relationships
14
SQL to OWL
CREATE TABLE employee(
employee_id INTEGER PRIMARY KEY,
employee_ssn VARCHAR(11) UNIQUE,
employee_name VARCHAR(100) NOT NULL,
employee_salary INTEGER NOT NULL,
employee_type CHAR(8) CHECK ( employee_type IN
('TEMP', 'FULLTIME', 'CONTRACT'))
dept INTEGER FOREIGN KEY (dept) REFERENCES
department (dept_id))
CREATE TABLE department(
dept_id INTEGER PRIMARY KEY,
dept_name VARCHAR(100) NOT NULL,
manager INTEGER FOREIGN KEY (manager) REFERENCES
employee (employee_id))
15
SQL to OWL
<owl:Class rdf:ID="Department"/>
<owl:ObjectProperty rdf:ID="dept">
<rdfs:domain rdf:resource="#Employee"/>
<rdfs:range rdf:resource="#Department"/>
</owl:ObjectProperty>
Table Name
Foreign Key
Attribute
<owl:DatatypeProperty rdf:ID="age">
<rdf:type rdf:resource="&owl;FunctionalProperty"/>
<rdfs:domain rdf:resource="#Employee"/>
Data type
<rdfs:range rdf:resource="&xsd;int"/>
</owl:DatatypeProperty>
16
SQL to OWL
<owl:DatatypeProperty rdf:ID="employee_type">
Enumerated
<rdfs:domain rdf:resource="#Employee"/>
Check
<rdfs:range>
Constraint
<owl:DataRange>
<owl:oneOf>
<rdf:List>
<rdf:first rdf:datatype="&xsd;string">Temp</rdf:first>
<rdf:rest>
<rdf:List>
<rdf:first rdf:datatype="&xsd;string">Fulltime</rdf:first>
<rdf:rest>
<rdf:List>
<rdf:first rdf:datatype="&xsd;string">Contract</rdf:first>
<rdf:rest rdf:resource="&rdf;nil"/>
</rdf:List>
</rdf:rest>
</rdf:List>
</rdf:rest>
</rdf:List>
</owl:oneOf>
</owl:DataRange>
</rdfs:range>
</owl:DatatypeProperty>
17
SQL to OWL
SQL BNF
Shared Semantics
<table definition>
::=
CREATE [
<table scope> ] TABLE <table name>
<table contents source> [...]
A table can be an Object
Property if acts like a
binary relation, or else it
is a Class
<check constraint
definition>
::=
CHECK <left
paren> <search condition> <right
paren>
<in predicate>
::=
<row value
expression> [ NOT ] IN <in
predicate value>
The enumerated check
constraint represents
owl:oneOf
owl:DatatypeProperty
owl:FunctionalProperty
owl:maxCardinality
owl:cardinlity
<column definition>
::=
<column
name> <data type> [NOT NULL] [...]
An attribute is equivalent to a
Datatype Property with
their domain and range.
Depending if the attribute
is NOT NULL or not, it
can have
owl:maxCardinality or
owl:cardinality.
owl:ObjectProperty
rdfs:domain
rdfs:range
owl:FunctionalProperty
<referential constraint
definition>
::=
FOREIGN
KEY <left paren> <referencing
columns> <right paren>
REFERENCES <referenced
table and columns> [...]
Foreign Keys connect to
relations, therefore they
act as Object Properties
with their respective
domain and range.
owl:Class
owl:ObjectProperty
owl:oneOf
18
Rules: SQL to OWL
• Binary Relation: a relation that only has two foreign
keys (single or composite) referencing two relations.
Re l (r )  FK ( xtr, r , xt, t )  FK ( xsr, r , xs, s)  Attr( y, r )  y  xtr  xsr  Bin Re l (r , s, t )
• Class: A relation that is not a binary relation is a class.
Re l (r )  Bin Re l (r , x, y )  Class (r )
• Object Property
– A binary relation is an object property
Bin Re l (r , s, t )  Class ( s)  Class (t )  ObjP(r , s, t )  ObjP(r , t , s)  Inv (r , r )
– A foreign key that references another relation is an object
property, whose domain is the current relation and range is
the relation that the foreign key references
PK ( y, s)  FK ( x, r , y, s)  Class (r )  Class ( s)  ObjP( x, r , s)  FuncP( x)
PK ( y, s)  FK ( x, r , y, s)  Uni( x)  Class (r )  Class ( s)  ObjP( x, r , s )  FuncP( x)  ObjP( x , s, r )  FuncP( x )  Inv ( x , x)
19
Rules: SQL to OWL
• Data type Property: If an attribute is not an object
property, then it is a data type property. Its domain is
the current relation and the range is the data type.
Re l (r )  Class (r )  Attr( x, r )  FK ( x, r , y, z )  DTP ( x, r , type( x))  FuncP( x)  MaxCard ( x,1)
Re l (r )  Class (r )  Attr( x, r )  FK ( x, r , y, z )  NN ( x)  DTP ( x, r , type( x))  FuncP( x)  Card ( x,1)
Re l (r )  Class (r )  Attr( x, r )  FK ( x, r , y, z )  CheckList ( x)  DTP ( x, r , type( x)  list ( x))  FuncP( x)  MaxCard ( x,1)
20
SQL to Rules
• Triggers are business rules and it can not be expressed
in OWL
• Should think about how this could be mapped to the
rule layer of the Semantic Web
And the final layer cake…
21
SQL Layer Cake
22
Greatest Common Denominator for Data Integration
• All ontologies generated by SQL should have similar
semantic power to facilitate better data integration
– Use of different technologies may cause problems in data
integration
– The constructs we use has OWL-DL as GCD
• OWL-DL is what SW people like the most.
– Reasoning and Inference
– Decidability
– Computational Completeness
• Therefore OWL-DL should be the target for SQL to
Semantic Web transformations.
23
24
Summary
• SQL has semantics
– SQL has been evolving  it is a moving target!
– SQL DDL can be used to generate local ontologies
– A Greatest Common Denominator like OWL-DL
can aid with data integration
25
Thank You
26
Other Slides…
27
Discussion Topics
• Hierarchy and Inheritance
– SQL DDL doesn’t have it
– How do database people model it
– Not clear how to get the semantics
• CHECK Constraint
– Has more semantics that OWL can handle
– CHECK (value >0 AND value < 360)
• OWL Constraints
– allValuesFrom
someValuesFrom
• URI
28
Discussion Topics: Hierarchy and Inheritance
• Hierarchy and Inheritance
– Integrate information that is spread across several relations
(vertical partitioning) and can be either
• Integrated in one concept
• Inheritance
But how do you decide!
– Key Equality and Data Inclusion: two relations that share the
same primary key and the child relation’s primary key is also a
foreign key
Project
PK
Software
Project
PK, FK
29
Discussion Topics: Hierarchy and Inheritance
• How do you represent Hierarchy and Inheritance
in database?
– Hierarchy: Hand - Finger
Hand
Finger
PK
FK
?
– Inheritance: Person - Student
Person
PK
Student
PK
PK, FK
30
Discussion Topics: someValuesFrom vs allValuesFrom
A(id, x)
B(id, y)
C(A_id, B_id)
• Can we say that A and B are classes? Yes
• Can we say that C represents an object property? Yes
• Can we say that B can only be connected to A using the
object property? ¿?
• Can we say anything about the cardinality of this
relation? ¿?
31
Discussion Topics: CHECK Constraint
• Embodies semantics and rules at the same time
• Enumerated CHECK constraint  owl:one of
• But what about:
– CHECK (value >0 AND value < 360)
32
Translation BNF: SQL to OWL
<sql-owl statement> ::= <class defintion> <property table>| <object property
definition>
<class definition> ::= CREATE TABLE <table name>
<object property definition> :: = <object definition> <object property table>
<object definition> ::= CREATE TABLE <table name>
<object property table> ::= <left paren> <object property references> <object
property references> <right paren>
<property table>::= <left paren> <property> [ { <comma> <property}...]
<property> ::= <object property>
|<datatype property>
33