Rajashree Deka
Tetherless World Constellation
Rensselaer Polytechnic Institute
The majority of data underpinning the Web are
stored in Relational Databases (RDB).
Advantages:
Secure and scalable architecture.
Efficient storage.
Reliability.
Disadvantages:
Difficult to share data across large organizations
where different database schemata are used.
Most importantly, there is no check on semantics.
As the Semantic Web matures, there is a growing need
for RDF applications to access the content of legacy
databases.
Compared to RDB, RDF is:
More expressive.
More easily processed and interpreted.
Easily reasoned over by software agents.
Need a way to make data in RDBMS available as
RDF.
In order to generate Semantic Web content from an
RDB, Tim Berners-Lee proposed a very direct
mapping:
Each table in the RDB is an RDF class.
Each field (column) name is an RDF property.
Each record is an RDF node - an instance of the RDF
class and so can play the role of a subject or an object
in an RDF statement.
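The three rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a tool from the talk: the base URI `http://example.org/`, the `name` column and the sample row are made up, and the in-memory SQLite database stands in for a real RDB.

```python
import sqlite3

BASE = "http://example.org/"  # hypothetical base URI, not from the slides
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def direct_mapping(conn, table, key):
    """Yield (subject, predicate, object) triples for every record in `table`."""
    cursor = conn.execute(f"SELECT * FROM {table}")
    columns = [d[0] for d in cursor.description]
    class_uri = f"{BASE}{table}"                  # table -> RDF class
    for row in cursor:
        record = dict(zip(columns, row))
        subject = f"{BASE}{table}/{record[key]}"  # record -> RDF node
        yield (subject, RDF_TYPE, class_uri)
        for column, value in record.items():
            if value is not None:                 # NULL values produce no triple
                # column name -> RDF property
                yield (subject, f"{BASE}{table}#{column}", value)


# Made-up sample data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dataset (dataset_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO dataset VALUES (1, 'CTD casts')")

triples = list(direct_mapping(conn, "dataset", "dataset_id"))
for triple in triples:
    print(triple)
```

Each record yields one rdf:type triple plus one triple per non-NULL column.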
Semi-automatic generation of ontology from RDB
Read all records, export as RDF triples.
Mappings are direct; complex mappings do not usually
appear.
Need to convert to RDF regularly.
Does not allow the population of an existing ontology –
a BIG limitation!
Map existing RDB to an existing ontology
Customize mapping according to existing ontology.
Complex mappings can be implemented.
D2RQ provides an integrated environment for accessing
the content of non-RDF, relational databases as
virtual, read-only RDF graphs.
Using D2RQ we can:
Query a non-RDF database using SPARQL queries.
Access information in a non-RDF database using the
Jena API or the Sesame API.
Access the content of the database as Linked Data over
the Web.
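As a rough sketch of the first point, the snippet below builds the GET request that the SPARQL Protocol defines for a query. The endpoint URL `http://localhost:2020/sparql` is an assumption about a locally running D2R Server, and the query is only an example. Nothing is sent over the network.

```python
from urllib.parse import urlencode

# Assumed address of a local D2R Server SPARQL endpoint; adjust as needed.
ENDPOINT = "http://localhost:2020/sparql"

# Example query: list resources together with their rdfs:label.
QUERY = """
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?dataset ?label
WHERE { ?dataset rdfs:label ?label }
"""


def sparql_request_url(endpoint, query):
    """Return the SPARQL Protocol GET URL that carries `query`."""
    return endpoint + "?" + urlencode({"query": query.strip()})


url = sparql_request_url(ENDPOINT, QUERY)
print(url)

# To run the query against a live server (not executed here):
#   import urllib.request
#   request = urllib.request.Request(
#       url, headers={"Accept": "application/sparql-results+json"})
#   print(urllib.request.urlopen(request).read().decode())
```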
D2RQ mapping language: describes the relation between ontology and RDB.
D2RQ engine: uses mappings to rewrite Jena and Sesame API calls to SQL queries.
D2R server: provides a Linked Data view, an HTML view for debugging and a SPARQL Protocol endpoint over the database.
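To give a feel for what "rewriting to SQL" means, here is a toy sketch. It is not the D2RQ engine's actual algorithm or data structures: the `MAPPING` dictionary, the `rewrite_pattern` function and the `vocab:dataset_name` property are hypothetical, and only a single triple pattern with a variable subject is handled.

```python
# Hypothetical, heavily simplified stand-in for a mapping:
# RDF property -> (table, key column, value column).
MAPPING = {
    "vocab:dataset_dataset_id": ("dataset", "dataset_id", "dataset_id"),
    "vocab:dataset_name": ("dataset", "dataset_id", "name"),
}


def rewrite_pattern(predicate, obj=None):
    """Rewrite the triple pattern (?s, predicate, obj) into SQL.

    `obj` is None when the object is a variable, or a constant to match.
    Returns (sql, params) suitable for a parameterized query.
    """
    table, key, column = MAPPING[predicate]
    sql = f"SELECT {table}.{key}, {table}.{column} FROM {table}"
    params = ()
    if obj is not None:
        # A constant object becomes a WHERE condition on the mapped column.
        sql += f" WHERE {table}.{column} = ?"
        params = (obj,)
    return sql, params


print(rewrite_pattern("vocab:dataset_name"))
print(rewrite_pattern("vocab:dataset_name", "CTD casts"))
```

The point of the sketch is that no RDF is stored anywhere: each pattern is answered by a query against the live database.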
D2RQ mapping language formally defined by
http://www4.wiwiss.fu-berlin.de/bizer/d2rq/0.1/
D2RQ namespace is defined by
http://www.wiwiss.fu-berlin.de/suhl/bizer/D2RQ/0.1#
Database compatibility:
Oracle
MySQL
PostgreSQL
Microsoft SQL Server
ODBC data sources (e.g. Microsoft Access) - mapping
generator and automatic detection of column types do not
work.
Two command-line tools (only on Windows and Unix
systems):
Mapping generator:
Analyzes database schema.
Generates a default mapping file.
Resultant D2RQ map is an RDF document in N3 format.
Mapping can be used as-is or can be customized.
Dump script:
Writes the content of the RDB into a single RDF file.
Supported syntaxes are "RDF/XML" (the default),
"RDF/XML-ABBREV", "N3", "N-TRIPLE".
Ontology is mapped to a database schema using:
d2rq:ClassMap: represents a class or a group of
similar classes in the ontology and specifies how
instances of the class are identified.
d2rq:PropertyBridge: each ClassMap has a set of
PropertyBridges which specify how the properties
of an instance are created.
# Table dataset (default mapping)
map:dataset a d2rq:ClassMap;
d2rq:dataStorage map:database;
d2rq:uriPattern
"dataset/@@dataset.dataset_id@@";
d2rq:class vocab:dataset;
d2rq:classDefinitionLabel "dataset";
.
map:dataset__label a d2rq:PropertyBridge;
d2rq:belongsToClassMap map:dataset;
d2rq:property rdfs:label;
d2rq:pattern "dataset #@@dataset.dataset_id@@";
.
map:dataset_dataset_id a d2rq:PropertyBridge;
d2rq:belongsToClassMap map:dataset;
d2rq:property vocab:dataset_dataset_id;
d2rq:propertyDefinitionLabel "dataset dataset_id";
d2rq:column "dataset.dataset_id";
d2rq:datatype xsd:int;
.
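The d2rq:uriPattern and d2rq:pattern values in this mapping contain @@table.column@@ placeholders that are filled in from each record. A minimal Python sketch of that substitution follows. The function name `expand_pattern` and the sample record are made up, and the sketch ignores any value encoding a real engine may apply when building URIs.

```python
import re

# Matches one @@table.column@@ placeholder.
PLACEHOLDER = re.compile(r"@@([^@]+)@@")


def expand_pattern(pattern, row):
    """Replace each @@table.column@@ placeholder with the value from `row`."""
    return PLACEHOLDER.sub(lambda match: str(row[match.group(1)]), pattern)


row = {"dataset.dataset_id": 42}  # made-up record

print(expand_pattern("dataset/@@dataset.dataset_id@@", row))   # dataset/42
print(expand_pattern("dataset #@@dataset.dataset_id@@", row))  # dataset #42
```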
# Table dataset (customized mapping)
map:dataset a d2rq:ClassMap;
d2rq:dataStorage map:database;
d2rq:uriPattern "http://escience.rpi.edu/ontology/BCODMO/bcodmo/2/0/DeploymentDatasetCollection_@@dataset.dataset_id@@";
d2rq:class bcodmo:DeploymentDatasetCollection;
d2rq:classDefinitionLabel "DeploymentDatasetCollection";
.
map:seeAlsoStatement a d2rq:PropertyBridge;
d2rq:belongsToClassMap map:dataset;
d2rq:property rdfs:seeAlso;
d2rq:uriPattern "http://osprey.bcodmo.org/dataset.cfm?id=@@dataset.dataset_id@@&flag=view";
.
map:hasIdentifier a d2rq:PropertyBridge;
d2rq:property bcodmo:hasIdentifier;
d2rq:belongsToClassMap map:dataset;
d2rq:column "dataset.dataset_id";
d2rq:datatype xsd:int;
.
map:dataset_dataset_id a d2rq:PropertyBridge;
d2rq:belongsToClassMap map:dataset;
d2rq:property bcodmo:hasParameter;
d2rq:refersToClassMap map:parameters;
d2rq:propertyDefinitionLabel "dataset dataset_id";
d2rq:join "dataset.dataset_id = dataset_parameters.dataset_id";
d2rq:join "dataset_parameters.parameters_id = parameters.parameters_id";
.
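The two d2rq:join conditions in the last bridge link dataset to parameters through the dataset_parameters table. The sketch below shows the equivalent SQL join on a made-up in-memory SQLite database. The table and column names come from the mapping, while the rows and the exact SQL are illustrative assumptions, not what D2RQ necessarily generates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dataset (dataset_id INTEGER PRIMARY KEY);
CREATE TABLE parameters (parameters_id INTEGER PRIMARY KEY);
CREATE TABLE dataset_parameters (dataset_id INTEGER, parameters_id INTEGER);
-- Made-up rows, for illustration only.
INSERT INTO dataset VALUES (1), (2);
INSERT INTO parameters VALUES (10), (11);
INSERT INTO dataset_parameters VALUES (1, 10), (1, 11), (2, 11);
""")

# Each d2rq:join condition becomes one equality in the WHERE clause.
JOIN_SQL = """
SELECT dataset.dataset_id, parameters.parameters_id
FROM dataset, dataset_parameters, parameters
WHERE dataset.dataset_id = dataset_parameters.dataset_id
  AND dataset_parameters.parameters_id = parameters.parameters_id
ORDER BY dataset.dataset_id, parameters.parameters_id
"""

pairs = conn.execute(JOIN_SQL).fetchall()
for dataset_id, parameters_id in pairs:
    # Each result row corresponds to one bcodmo:hasParameter statement.
    print(f"dataset {dataset_id} bcodmo:hasParameter parameters {parameters_id}")
```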
Customization is straightforward when a class in
the ontology is represented by a table in the
database.
Mapping is complicated, and sometimes not
possible, when a class in the ontology corresponds
to a record in a database table rather than to a
table.
Define primary keys wherever possible and
create indexes.
Indicate join directions in d2rq:join conditions.
Set d2rq:autoReloadMapping to false
when automatic reloading is not needed.
Use hint properties:
d2rq:valueMaxLength
d2rq:valueRegex
d2rq:valueContains
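One way to see why hint properties can help is that they let an engine rule out a column without querying the database. The sketch below illustrates that idea only. The `HINTS` table, the `could_match` function and the `dataset.url` column are hypothetical, and the real D2RQ engine may use the hints differently.

```python
import re

# Hypothetical hints for two columns, mirroring the three hint properties.
HINTS = {
    "dataset.dataset_id": {"valueMaxLength": 6, "valueRegex": r"^[0-9]+$"},
    "dataset.url": {"valueContains": "bcodmo.org"},
}


def could_match(column, value):
    """Return False when the hints prove `value` cannot occur in `column`."""
    hints = HINTS.get(column, {})
    if "valueMaxLength" in hints and len(value) > hints["valueMaxLength"]:
        return False
    if "valueRegex" in hints and not re.search(hints["valueRegex"], value):
        return False
    if "valueContains" in hints and hints["valueContains"] not in value:
        return False
    return True  # no hint rules the value out, so the database must be asked


print(could_match("dataset.dataset_id", "1234"))          # True
print(could_match("dataset.dataset_id", "not-a-number"))  # False
```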
Performs reasonably well with basic triple patterns,
but performance deteriorates when SPARQL features
such as OPTIONAL, FILTER and LIMIT are used.
Does not have reasoning capability. Reasoning can
be added by using the D2RQ engine within Jena.
Integration of multiple databases or other data
sources using D2RQ alone is not possible.
Read-only, cannot perform INSERT, DELETE or
UPDATE operations.
Cannot handle complicated database structures
such as views.
Virtuoso RDF View:
Uses table to class and column to predicate
approach.
RDB data are represented as virtual RDF graphs.
Customization of mapping possible.
Triplify:
Maps HTTP-URI requests to relational database
queries expressed in SQL.
No SPARQL support.
R2O:
XML based declarative mapping language.
DartGrid Semantic Web toolkit:
Provides a visual tool to define mapping.
RDBToOnto:
User-oriented tool that creates a static mapping (RDF
dump).
Asio Semantic Bridge for Relational Databases
(SBRD) and Automapper:
Uses table to class approach.
Prof. Peter Fox
Patrick West
Eric Rozell
Ankesh Khandelwal
Evan Patton