Transcript slides
W3C Workshop on
RDF Access to Relational Databases
25-26 October, 2007 — Boston, MA, USA
D2RQ
Lessons Learned
Christian Bizer
Richard Cyganiak
Freie Universität Berlin
Chris Bizer, Richard Cyganiak: D2RQ – Lessons Learned (25.10.2007)
The D2RQ Platform
2002: D2R MAP
dump relational databases as RDF
based on an expressive declarative mapping language
2004: D2RQ
RDQL/SPARQL to SQL query rewriting
Jena and Sesame API
2006: D2R Server
SPARQL, Linked Data access over the Web
Tested with Oracle, MySQL, and PostgreSQL
Should work with any SQL-92 compatible database
GNU GPL license, 4600 downloads (150 per month)
Outline
1. D2RQ Mapping Language
2. D2RQ Architecture and Interfaces
3. Areas for Future Community Work
   1. RDF Access to Relational Databases
   2. The Web Perspective
The D2RQ Mapping Language
Declarative language to express mappings between a given RDF
schema and a given relational database schema.
Class Map
Table Author:
ID   first   last    email
12   Chris   Bizer   [email protected]
map:Author_ClassMap a d2rq:ClassMap;
d2rq:class foaf:Person;
d2rq:uriPattern
"/people/@@Author.ID@@".
http://www4.wiwiss.fu-berlin/d2rServer/people/12
rdf:type foaf:Person .
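How a d2rq:uriPattern turns a database row into a resource URI can be sketched in a few lines of Python. This is an illustration only, not D2RQ's implementation: the function name expand_pattern, the BASE_URI constant (copied from the example URI on the slide), and the dictionary row format are assumptions made for this sketch.

```python
import re

# Assumed server base URI, copied from the example URI on the slide.
BASE_URI = "http://www4.wiwiss.fu-berlin/d2rServer"

def expand_pattern(pattern, row):
    """Replace every @@Table.column@@ placeholder with the row's value."""
    return re.sub(r"@@([^@]+)@@", lambda m: str(row[m.group(1)]), pattern)

# One row of the Author table, keyed by "Table.column" (hypothetical format).
row = {"Author.ID": 12, "Author.first": "Chris", "Author.last": "Bizer"}

# The class map yields one rdf:type triple per row.
subject = BASE_URI + expand_pattern("/people/@@Author.ID@@", row)
triple = (subject, "rdf:type", "foaf:Person")
print(triple)
```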
Property Bridge
Table Author:
ID   first   last    email
12   Chris   Bizer   [email protected]
map:email_PropertyBridge a d2rq:PropertyBridge;
d2rq:belongsToClassMap map:Author_ClassMap;
d2rq:property foaf:name;
d2rq:pattern
"@@Author.first@@ @@Author.last@@".
http://www4.wiwiss.fu-berlin/d2rServer/people/12
foaf:name "Chris Bizer" .
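One way to picture how a d2rq:pattern could be evaluated inside the database is to compile it into a SQL string concatenation. The sketch below uses SQLite and its || operator; other databases may need a different concatenation function. The helper name pattern_to_sql is hypothetical, and the SQL that D2RQ actually generates may look different.

```python
import re
import sqlite3

def pattern_to_sql(pattern):
    """Turn '@@T.c@@ text @@T.d@@' into a SQL concatenation expression."""
    # re.split with one capture group alternates: literal, column, literal, ...
    parts = re.split(r"@@([^@]+)@@", pattern)
    terms = []
    for i, part in enumerate(parts):
        if i % 2 == 1:
            terms.append(part)  # column reference, used as-is
        elif part:
            terms.append("'" + part.replace("'", "''") + "'")  # quoted literal
    return " || ".join(terms)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Author (ID INTEGER, first TEXT, last TEXT)")
conn.execute("INSERT INTO Author VALUES (12, 'Chris', 'Bizer')")

expr = pattern_to_sql("@@Author.first@@ @@Author.last@@")
sql = f"SELECT Author.ID, {expr} FROM Author"
names = conn.execute(sql).fetchall()
print(sql)
print(names)
```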
Joins
Table Author:
ID   name    email
12   Chris   [email protected]

Table Papers:
ID    title        confID
312   D2R Server   132

Table Rel_Authors_Papers:
AuthorID   PaperID
12         312
map:author_PropertyBridge a d2rq:PropertyBridge;
d2rq:belongsToClassMap :PapersClassMap;
d2rq:property dc:creator;
d2rq:refersToClassMap :PeopleClassMap;
d2rq:join "Author.ID=Rel_Authors_Papers.AuthorID";
d2rq:join "Rel_Authors_Papers.PaperID=Papers.ID".
http://www4.wiwiss.fu-berlin/d2rServer/docs/312 dc:creator
http://www4.wiwiss.fu-berlin/d2rServer/people/12 .
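The effect of the two d2rq:join conditions can be reproduced with a plain SQL join. The sketch below loads the three example tables into an in-memory SQLite database (the email column is left out) and emits the dc:creator triple. The BASE constant and the /docs/ and /people/ URI patterns are taken from the example URIs on the slides; the query itself is an assumption about what a rewriter might produce, not D2RQ's actual output.

```python
import sqlite3

# Assumed server base URI, copied from the example URIs on the slides.
BASE = "http://www4.wiwiss.fu-berlin/d2rServer"

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Author (ID INTEGER, name TEXT);
    CREATE TABLE Papers (ID INTEGER, title TEXT, confID INTEGER);
    CREATE TABLE Rel_Authors_Papers (AuthorID INTEGER, PaperID INTEGER);
    INSERT INTO Author VALUES (12, 'Chris');
    INSERT INTO Papers VALUES (312, 'D2R Server', 132);
    INSERT INTO Rel_Authors_Papers VALUES (12, 312);
""")

# The two join conditions become the WHERE clause of a single query.
rows = conn.execute("""
    SELECT Papers.ID, Author.ID
    FROM Author, Rel_Authors_Papers, Papers
    WHERE Author.ID = Rel_Authors_Papers.AuthorID
      AND Rel_Authors_Papers.PaperID = Papers.ID
""").fetchall()

# Each result row yields one triple: paper dc:creator person.
triples = [(f"{BASE}/docs/{paper}", "dc:creator", f"{BASE}/people/{author}")
           for paper, author in rows]
print(triples)
```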
Other Features of the Mapping Language
Conditional mappings
Value translation tables
Extensible with arbitrary value translation functions
Performance hints
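What a conditional mapping and a value translation table do to a row can be sketched as follows. Everything here is hypothetical and chosen only for illustration: the type and published columns, the ex: terms, and the helper map_row. It does not use D2RQ's mapping syntax or API.

```python
# Hypothetical translation table: database codes -> RDF values.
TRANSLATION = {"J": "ex:JournalArticle", "C": "ex:ConferencePaper"}

# Hypothetical rows of a Papers table with two extra columns.
rows = [
    {"ID": 312, "type": "C", "published": 1},
    {"ID": 313, "type": "J", "published": 0},
]

def map_row(row):
    """Return the triples for one row, or [] if the condition fails."""
    if not row["published"]:  # conditional mapping: skip unpublished rows
        return []
    subject = f"/docs/{row['ID']}"
    # Value translation: the database code is replaced by an RDF term.
    return [(subject, "rdf:type", TRANSLATION[row["type"]])]

triples = [t for row in rows for t in map_row(row)]
print(triples)
```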
D2RQ Architecture and Interfaces
Performance and Limitations
Performance is fine with databases containing a few million
records.
Dumps, Linked Data and HTML interface usually no problem.
Simple SPARQL queries usually fine.
Complex SPARQL queries (OPTIONAL, FILTER, LIMIT) sometimes slow.
Due to limitations of the implementation. Will improve with future
releases.
Limitations
No support for Named Graphs
Read only. No support for CREATE/DELETE/UPDATE
No support for inference
Areas for Future Community Work
RDF Access to Relational Databases
With Virtuoso, DartGrid, SPASQL, SquirrelRDF, Relational.OWL,
D2RQ, and … there are various suitable solutions around.
Compare the Expressivity of Mapping Languages
People need weird mappings and fixups for database design anti-patterns.
We need an accepted mapping benchmark which reflects this.
First approach: THALIA testbed.
Compare the Performance of the different Implementations
We need an accepted performance benchmark.
Future Community Work seen from the Web Perspective
Mapping relational databases to RDF is a local problem and its
technical realization matters little from the Web perspective.
What people really want are
expressive and fast queries
over an integrated view
on an unbounded number of data sources (the Web)
expressed via simple user interfaces.
We should aim at providing answers to the well-known but
hard data integration questions arising from this scenario.
Testbed: The Linking Open Data Cloud
Federation versus Replication
[Figure: the data sources FOAF, DBpedia, geonames, and SIOC, connected by RDF links]
1. Virtual Integration via SPARQL Query Federation
DARQ (HU Berlin)
Complicated and slow.
2. Materialized Integration via Crawling
Zitgist (Zitgist), SWSE (DERI), Swoogle (UMBC), Watson (Open University)
Fast, but requires huge RDF repositories.
Worked for HTML, worked for RSS, so why not for RDF?
3. Materialization On-the-Fly
Crawl only data that is needed while answering the query.
Semantic Web Client Library (FU Berlin), SWIC (University of London)
Works, but is really slow.
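The third approach, materialization on-the-fly, can be illustrated with a toy example that dereferences only the URIs a query actually touches. The WEB dictionary stands in for HTTP lookups, and the names dereference and names_of_friends are hypothetical; this is not the API of the Semantic Web Client Library or SWIC.

```python
# Hypothetical in-memory stand-in for the Web: URI -> triples served there.
WEB = {
    "ex:chris": [("ex:chris", "foaf:knows", "ex:richard"),
                 ("ex:chris", "foaf:name", "Chris Bizer")],
    "ex:richard": [("ex:richard", "foaf:name", "Richard Cyganiak")],
    "ex:unrelated": [("ex:unrelated", "foaf:name", "Not needed")],
}
fetched = []  # records which URIs were actually dereferenced

def dereference(uri):
    """Simulate an HTTP lookup of a URI and return its triples."""
    fetched.append(uri)
    return WEB.get(uri, [])

def names_of_friends(person):
    """Answer the query: person foaf:knows ?f . ?f foaf:name ?n"""
    graph = list(dereference(person))
    friends = [o for s, p, o in graph if s == person and p == "foaf:knows"]
    for friend in friends:  # fetch only the URIs the query touches
        graph.extend(dereference(friend))
    return [o for s, p, o in graph if s in friends and p == "foaf:name"]

result = names_of_friends("ex:chris")
print(result, fetched)
```

Each query costs one lookup per URI it touches, which is one plausible reason the approach works but is slow.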
Data Source Discovery and Description
1. Registry-based Discovery
Registries collect links or data source descriptions.
- Example: Ping the Semantic Web
Work on data source descriptions
- DARQ, SADDLE
2. Link-based Discovery
Discovering RDF data by following RDF Links.
Worked fine on the classic HTML Web, so why not for the Semantic Web?
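Link-based discovery amounts to a graph traversal over RDF links. A minimal sketch, assuming a hypothetical in-memory link graph (LINKS) in place of real HTTP requests and a hypothetical function name discover:

```python
from collections import deque

# Hypothetical link graph: document URI -> URIs it points to via RDF links.
LINKS = {
    "http://a.example/foaf": ["http://b.example/data", "http://c.example/data"],
    "http://b.example/data": ["http://a.example/foaf", "http://d.example/data"],
    "http://c.example/data": [],
    "http://d.example/data": [],
}

def discover(seed, max_docs=100):
    """Breadth-first traversal of RDF links starting from one seed URI."""
    seen, queue = {seed}, deque([seed])
    order = []
    while queue and len(order) < max_docs:
        uri = queue.popleft()
        order.append(uri)
        for target in LINKS.get(uri, []):
            if target not in seen:  # avoid revisiting documents
                seen.add(target)
                queue.append(target)
    return order

found = discover("http://a.example/foaf")
print(found)
```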
Schema Mapping
Still no clear answers to:
How to express mappings between different RDF vocabularies?
How to publish and search for such mappings on the Web?
RDF Schema and OWL are insufficient in practice to express
mappings.
Maybe the upcoming Rule Interchange Format (RIF) could
provide a solution?
Conclusion
We should look at which parts of the Semantic
Web puzzle are still missing to make RDF-based data
integration work at Web scale!
This talk is online at
http://sites.wiwiss.fu-berlin.de/suhl/bizer/pub/Bizer-Cyganiak-D2RQ-slides.pdf