RDF and triplestores
CMSC 461
Michael Wilson
Reasoning
Relational databases allow us to reason about data that is organized in a specific way
Data that models specific relationships
Data that is very cleanly structured
What other reasoning methods are available to us?
Metadata
“Data about data”
Data that describes other data
Gives context
Example metadata:
Image EXIF data (geolocation, rotation, etc.)
User statistics
Last saved information in a file
What’s so important?
The context that we gather from metadata often allows us to understand a much greater picture
Can correlate and tie metadata together
Calculate statistics on metadata
Understand trends
Infinite possibilities
The depth of metadata
Many systems have their own way of storing metadata
Database tables may be organized to house specific metadata
This does not lend itself well to discovering new types of metadata
Person may have age, DOB
Later want to add new types (friends, Facebook ID, Twitter ID, etc.)
Metadata structures
RDF
Resource Description Framework
OWL
Web Ontology Language
Ontology: an established vocabulary to describe knowledge within a domain
RDF is more widely used
Schemas
RDF and other structured metadata formats allow us to establish a common language to describe different sorts of metadata
We can make schemas that describe
Social media
Physical location
Job details
Moreover, we can tie them all to one subject
Doesn’t require database reorganization
Why is that cool?
What this means is that we can tie any arbitrary sets of data together with very little work on our part
We make a schema that describes a new domain, and staple that information onto an existing subject
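To make the "stapling" idea concrete, here is a minimal sketch in plain Python. It assumes triples are stored as simple tuples; the subject `person:alice`, the `person:` and `social:` prefixes, and all values are hypothetical and chosen only for illustration.

```python
# All identifiers and values below are made up for illustration.
triples = {
    ("person:alice", "person:age", 30),
    ("person:alice", "person:dateOfBirth", "1994-05-01"),
}


def describe(subject, triples):
    """Collect every predicate/object pair recorded for one subject."""
    return {p: o for s, p, o in triples if s == subject}


# Later a new "social" vocabulary appears. Nothing is reorganized;
# the new facts are simply added next to the old ones.
triples.add(("person:alice", "social:twitterId", "@alice"))
triples.add(("person:alice", "social:friendOf", "person:bob"))

print(describe("person:alice", triples))
```

In a relational design the same change would typically mean adding columns or a new table, which is the reorganization the slide says we can avoid.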
Triples
Within these schemas, data is conceptually organized as
<subject> <predicate> <object>
Subject
The subject of the expression
Predicate
The relationship between the subject and object
Object
The direct object of the expression
These expressions are called “triples”
Triple examples
Examples?
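A few possible triples, written as a small Python sketch. The `Triple` type and every value here are illustrative assumptions, not part of RDF itself; real RDF identifies subjects and predicates with IRIs rather than bare strings.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One <subject> <predicate> <object> statement."""
    subject: str
    predicate: str
    obj: str


# Hypothetical example statements.
examples = [
    Triple("Cairo", "isCapitalOf", "Egypt"),
    Triple("Egypt", "isInContinent", "Africa"),
    Triple("photo42.jpg", "hasRotation", "90"),
]

for t in examples:
    print(f"<{t.subject}> <{t.predicate}> <{t.obj}>")
```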
Storing triples
Since we are often interested in large amounts of data, we need to think about how to store them
Triplestores
Pretty obvious
What do these give us over doing something like storing the information in a database?
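As a rough intuition for what a triplestore does, the sketch below is a toy in-memory store that answers pattern queries with wildcards. The class name `TinyTripleStore` and its methods are hypothetical; production triplestores add persistence, several indexes, and a query language on top of this idea.

```python
from collections import defaultdict


class TinyTripleStore:
    """Toy in-memory store, for illustration only."""

    def __init__(self):
        self._triples = set()
        self._by_subject = defaultdict(set)  # subject -> triples about it

    def add(self, s, p, o):
        triple = (s, p, o)
        self._triples.add(triple)
        self._by_subject[s].add(triple)

    def match(self, s=None, p=None, o=None):
        """Yield triples matching the pattern; None acts as a wildcard."""
        if s is not None:
            candidates = self._by_subject.get(s, set())
        else:
            candidates = self._triples
        for ts, tp, to in candidates:
            if (p is None or tp == p) and (o is None or to == o):
                yield (ts, tp, to)


store = TinyTripleStore()
store.add("Cairo", "isCapitalOf", "Egypt")
store.add("Paris", "isCapitalOf", "France")
store.add("Egypt", "isInContinent", "Africa")

# Every "isCapitalOf" statement, regardless of subject or object.
print(sorted(store.match(p="isCapitalOf")))
```

Note that any predicate can be added at any time without declaring it first, which is one answer to the question on this slide.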
Triplestore querying
Triplestores can also be queried
SQL is more limited for the kinds of queries we’d like to be able to make
SPARQL
The acronym stands for:
SPARQL Protocol and RDF Query Language
SPARQL
SPARQL is a SQL-like query language
Allows us to query on the various schemas we have assigned to our subjects
SPARQL queries can look surprisingly readable
SPARQL example
PREFIX abc: <http://example.com/exampleOntology#>
SELECT ?capital ?country
WHERE {
  ?x abc:cityname ?capital ;
     abc:isCapitalOf ?y .
  ?y abc:countryname ?country ;
     abc:isInContinent abc:Africa .
}
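To show what the query above is asking for, here is a hand-written Python sketch that evaluates the same graph pattern over a small list of tuples. This is not how a SPARQL engine works internally; the data, the node names (`city1`, `country1`, ...), and the helper functions are assumptions made up for the example, and the `abc:` prefix is dropped for brevity.

```python
# Made-up sample data; predicate names mirror those in the query.
data = [
    ("city1", "cityname", "Cairo"),
    ("city1", "isCapitalOf", "country1"),
    ("country1", "countryname", "Egypt"),
    ("country1", "isInContinent", "Africa"),
    ("city2", "cityname", "Paris"),
    ("city2", "isCapitalOf", "country2"),
    ("country2", "countryname", "France"),
    ("country2", "isInContinent", "Europe"),
]


def objects(subject, predicate):
    """All objects o such that (subject, predicate, o) is in the data."""
    return [o for s, p, o in data if s == subject and p == predicate]


def african_capitals():
    results = []
    for x, p, y in data:
        if p != "isCapitalOf":  # ?x abc:isCapitalOf ?y
            continue
        if "Africa" not in objects(y, "isInContinent"):  # ?y abc:isInContinent abc:Africa
            continue
        for capital in objects(x, "cityname"):  # ?x abc:cityname ?capital
            for country in objects(y, "countryname"):  # ?y abc:countryname ?country
                results.append((capital, country))
    return results


print(african_capitals())  # [('Cairo', 'Egypt')]
```

The shared variables `?x` and `?y` in the query play the role of the loop variables `x` and `y` here: they tie the four patterns together into one join.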
Querying power
Using SPARQL, you can make extremely deep, powerful queries and reason very intuitively on the data present in a triplestore
Organizing data this way allows computers to actually be able to reason on data as well
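One simple form of machine reasoning is deriving new triples from existing ones by applying a rule. The sketch below assumes a single hypothetical rule (a capital lies in the same continent as its country); the rule, the function name, and the data are illustrative only, and real systems express such rules through ontologies and dedicated reasoners.

```python
facts = {
    ("Cairo", "isCapitalOf", "Egypt"),
    ("Egypt", "isInContinent", "Africa"),
}


def infer_city_continents(facts):
    """Assumed rule: if X isCapitalOf Y and Y isInContinent Z, then X isInContinent Z."""
    derived = set()
    for city, p1, country in facts:
        if p1 != "isCapitalOf":
            continue
        for subj, p2, continent in facts:
            if subj == country and p2 == "isInContinent":
                derived.add((city, "isInContinent", continent))
    return derived - facts  # keep only statements that are actually new


new_facts = infer_city_continents(facts)
print(new_facts)  # {('Cairo', 'isInContinent', 'Africa')}
```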
Caveats
All this tech is SUPER new
All tied very heavily into the Semantic Web
Basically, the idea is to introduce a system like this into the web at large
Metadata stored about web pages, so computers can reason about them
Much of this is a moving target
Not a whole lot of production applications using this stuff yet
Tools
There are a few triplestore servers and other tools you can use
Jena
Apache project
Framework that allows for Semantic Web concepts to be employed
Can query using SPARQL
Jena can use Postgres in the background
More tools
RDFLib
https://github.com/RDFLib
Python library for RDF
Can run entirely in memory
Good for experimentation purposes and more