TDT44 Semantic web chapter 2: RDF

Resource Description Framework
Espen Albert
Presentation structure
• Scenario
• Problem and solution
• Concepts
• RDF/XML
• Turtle
Scenario throughout presentation
• You are a quality engineer for Nikon, which produces cameras.
• Assignment
• Read reviews from the web, and summarize people's opinions
• Gather information about products → deliver feedback to producers
• How do we write a computer program to gather all
these reviews?
Problem: The web is not
understandable to a machine.
• Internet today: how do we combine data?
• Domain specific, cannot easily interchange information
• Different review sites have different formats
• Represent camera (Nikon D300, D300, Nikon_D300, etc.)
• Represent features (zoom, price range, weight, usability, etc.)
• Represent rating (5 stars, 1-10, etc.)
• Represent reviewer (name, Facebook, nickname, etc.)
• Etc.
• What if it was easy?
• Scenario: You ask the internet what people say about Nikon
cameras, and you get back “clean” data: only reliable reviews, with
a rating scale you like and all pros and cons listed. Job
done…
Solution – Semantic web
RDF
• Where in relation to the semantic web?
• Base layer for representing information
• Story
• WHO?
• 1999: Resource Description Framework specification by W3C
• 2004: the updated RDF specifications
• Consequence of semantic web
• Describe web resources → describe any resource
• 6 documents. See book and W3C if you are interested in the specifics
• Definition RDF – Resource Description Framework
• A standard published by W3C that can be used to represent distributed information/knowledge in a
way that computer applications can use and process in a scalable manner.
• A language enabling computers and humans to describe entities, both abstract and real, over the web in
a format with meaningful, unambiguous data
• A framework to standardize communication for humans and machines so everyone can understand
• Components of solution
• Need: Express any information anyone can think of
• Why we use triples
• Need: Mechanism to connect distributed information over the Web / Have a way of talking about the
same/unique concepts
• Why we use vocabularies and URIs
Concepts RDF
• Statement/triple
• Subject|Predicate|Object
• Resource|Property|Value
• URI – Uniform Resource Identifier
• General identifier for abstract concepts, human beings, products, physical locations and URLs
• Examples
• URL relation to URI?
• URLs can be retrieved directly from the web; URLs are a subset of URIs.
• Change
• Only an identifier → required to return content, because of the Linked Data project
• 2 types
• Slash = normal URI
• Hash (aka URI reference, URIref) = normal URI + # + fragment identifier
• Hash preferred because it is a dereferenceable URI with no need for a content negotiation mechanism; slash
could have different file types
• Vocabulary
• Set of URIs with a common prefix
• RDF Model
• RDF Graph - Collection of statements
• Statements – Set of triples
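The hash/slash distinction and the "common prefix" idea of a vocabulary can be sketched with Python's standard library. This is only an illustration: the namespace http://example.org/camera# and the term names are assumptions, not taken from the slides.

```python
from urllib.parse import urldefrag

# Hypothetical vocabulary namespace (an assumption, not from the slides).
MY_CAMERA = "http://example.org/camera#"

hash_uri = MY_CAMERA + "Nikon_D300"
slash_uri = "http://example.org/camera/Nikon_D300"

# A client drops the fragment before requesting the document, so every
# hash URI of the vocabulary is served by the same document.
document, fragment = urldefrag(hash_uri)
print(document, fragment)

# A slash URI has no fragment; the server is asked for that exact path.
print(urldefrag(slash_uri).fragment == "")

# A vocabulary is a set of URIs sharing a common prefix.
terms = [MY_CAMERA + name for name in ("Nikon_D300", "weighs", "DSLR")]
print(all(t.startswith(MY_CAMERA) for t in terms))
```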
Statements
• Examples
• Nikon_D300 weighs 0.6kg
• What are the subject, predicate and object?
• S: Nikon_D300
• P: Weighs
• O: 0.6kg
• Nikon_D300 is manufactured by Nikon
• What are the subject, predicate and object?
• S: Nikon_D300
• P: manufactured by
• O: Nikon
• Which of the three (S, P, O) must be a URI?
• Predicate. A subject can be a “blank node”. An object can be a
“literal” or “blank node”
• How do we serialize this? RDF/XML, Turtle, Notation3 (N3), N-Triples
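Before any serialization, the two example statements and the position rules above can be modelled in plain Python. This is a minimal sketch; the class names and the myCamera namespace URI are assumptions made for illustration.

```python
from typing import NamedTuple

MY_CAMERA = "http://example.org/camera#"  # hypothetical namespace


class URI(str):
    """A resource identifier."""


class Literal(str):
    """A plain value such as "0.6kg"."""


class BlankNode(str):
    """A node without a global identifier."""


class Triple(NamedTuple):
    subject: object
    predicate: object
    obj: object


def is_valid(t: Triple) -> bool:
    """Predicate must be a URI; subject may be a URI or blank node;
    object may be a URI, blank node or literal."""
    return (isinstance(t.subject, (URI, BlankNode))
            and isinstance(t.predicate, URI)
            and isinstance(t.obj, (URI, BlankNode, Literal)))


nikon = URI(MY_CAMERA + "Nikon_D300")
graph = {
    Triple(nikon, URI(MY_CAMERA + "weighs"), Literal("0.6kg")),
    Triple(nikon, URI(MY_CAMERA + "manufactured_by"),
           URI("http://www.dbpedia.org/resource/Nikon")),
}

# A literal in the predicate position breaks the rule.
bad = Triple(nikon, Literal("weighs"), Literal("0.6kg"))
print(all(is_valid(t) for t in graph), is_valid(bad))
```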
RDF/XML 1
• Nikon_D300 weighs 0.6kg
• RDF/XML uses the RDF vocabulary found at
http://www.w3.org/1999/02/22-rdf-syntax-ns#
with rdf as prefix
• xmlns:prefix="vocabulary URI"
• Statement: <rdf:Description rdf:about="MUST BE FULL URI">
<predicate> literal </predicate> </rdf:Description>
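A filled-in version of the template, read back with Python's standard XML parser, might look like the sketch below. The myCamera namespace URI is an assumption; the slides do not give it. This treats RDF/XML as ordinary XML and only handles this one simple shape.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# "Nikon_D300 weighs 0.6kg"; the myCamera namespace URI is hypothetical.
rdf_xml = """<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:myCamera="http://example.org/camera#">
  <rdf:Description rdf:about="http://example.org/camera#Nikon_D300">
    <myCamera:weighs>0.6kg</myCamera:weighs>
  </rdf:Description>
</rdf:RDF>"""

root = ET.fromstring(rdf_xml)
triples = []
for description in root.findall(f"{{{RDF}}}Description"):
    subject = description.get(f"{{{RDF}}}about")
    for prop in description:
        # ElementTree spells a qualified name as "{namespace}local";
        # dropping the braces gives the predicate URI.
        predicate = prop.tag.replace("{", "").replace("}", "")
        triples.append((subject, predicate, prop.text))

print(triples)
```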
RDF/XML 2
• Nikon_D300 manufactured_by Nikon
• Fill in the missing part:
• Nikon_D300 and manufactured_by are found in the myCamera
vocabulary
Nikon is identified by: "http://www.dbpedia.org/resource/Nikon"
• Note difference from previous:
Statement: <rdf:Description rdf:about="MUST BE FULL URI">
<predicate> literal </predicate> </rdf:Description>
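The difference from the literal template is presumably that the object is now a resource: in RDF/XML a resource object is given with the rdf:resource attribute on an empty property element instead of as element text. A small sketch, again assuming a hypothetical myCamera namespace URI:

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# "Nikon_D300 manufactured_by Nikon"; myCamera namespace is hypothetical.
rdf_xml = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:myCamera="http://example.org/camera#">
  <rdf:Description rdf:about="http://example.org/camera#Nikon_D300">
    <myCamera:manufactured_by rdf:resource="http://www.dbpedia.org/resource/Nikon"/>
  </rdf:Description>
</rdf:RDF>"""

root = ET.fromstring(rdf_xml)
prop = root.find(f"{{{RDF}}}Description")[0]

# If rdf:resource is present the object is a resource, otherwise a literal.
resource = prop.get(f"{{{RDF}}}resource")
kind = "resource" if resource is not None else "literal"
obj = resource if resource is not None else prop.text
print(kind, obj)
```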
RDF/XML 3
• Nikon_D300 is_a DSLR (Digital single-lens reflex)
• Fill in the missing part:
• All resources are found in the myCamera vocabulary
RDF/XML 4
• Nikon_D300 is_a DSLR (Digital single-lens reflex)
• Switch to using rdf:type instead
• Alternative
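One alternative that RDF/XML allows is the typed node element, where the class name replaces rdf:Description; whether this is the alternative the slide showed is an assumption. The sketch below checks that both spellings give the same rdf:type triple. The myCamera namespace URI and the helper name type_triples are hypothetical.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
HEADER = ('<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" '
          'xmlns:myCamera="http://example.org/camera#">')

# Explicit rdf:type property.
explicit = HEADER + """
  <rdf:Description rdf:about="http://example.org/camera#Nikon_D300">
    <rdf:type rdf:resource="http://example.org/camera#DSLR"/>
  </rdf:Description>
</rdf:RDF>"""

# Typed node element: the element name itself is the class.
typed_node = HEADER + """
  <myCamera:DSLR rdf:about="http://example.org/camera#Nikon_D300"/>
</rdf:RDF>"""


def type_triples(xml_text):
    """Collect (subject, rdf:type, class) triples from top-level nodes."""
    found = set()
    for node in ET.fromstring(xml_text):
        subject = node.get(f"{{{RDF}}}about")
        if node.tag != f"{{{RDF}}}Description":
            cls = node.tag.replace("{", "").replace("}", "")
            found.add((subject, RDF + "type", cls))
        for prop in node.findall(f"{{{RDF}}}type"):
            found.add((subject, RDF + "type", prop.get(f"{{{RDF}}}resource")))
    return found


print(type_triples(explicit) == type_triples(typed_node))
```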
RDF/XML 5
• Combining statements for one subject
• Nikon_D300 is a DSLR
Nikon_D300 manufactured_by Nikon
RDF/XML 6
• Nesting statements
RDF/XML 7 literals
• Nikon_D300 weighs 0.6kg
• “0.6kg” is called a literal.
• Other examples “5 stars”, “Espen”, “0,33”
• None of these statements mean the
same:
• These are equal
RDF/XML 8
• rdf:value
Nikon_D300 weighs 0.6kg
• Note line 16: No rdf:about. This is called a blank node. The
camera has a weight that has two properties: value and unit.
Could have used rdf:nodeID
• Blank nodes are used to
• Model n-ary relationships
• Describe some class that will
never be used outside the graph
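The weight example can be sketched as three triples that meet in a blank node. The property names myCamera:weight and myCamera:unit and the namespace URI are assumptions; only rdf:value comes from the RDF vocabulary.

```python
from itertools import count

MY_CAMERA = "http://example.org/camera#"  # hypothetical namespace
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
_ids = count(1)


def blank_node():
    """Return a fresh label that is only meaningful inside this graph."""
    return f"_:b{next(_ids)}"


weight = blank_node()
graph = [
    (MY_CAMERA + "Nikon_D300", MY_CAMERA + "weight", weight),  # hypothetical property
    (weight, RDF + "value", "0.6"),
    (weight, MY_CAMERA + "unit", "kg"),                        # hypothetical property
]

# Follow the blank node to read the structured value back.
details = {p: o for s, p, o in graph if s == weight}
print(details[RDF + "value"], details[MY_CAMERA + "unit"])
```

The blank node has no URI, so nothing outside this graph can refer to the weight; that is the trade-off for not having to mint an identifier for it.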
RDF/XML 9
http://www.w3.org/2001/XMLSchema
• rdf:datatype for literals
Nikon_D300 weighs 0.6kg, Nikon_D300 model “D300”
• Note lines 14a and 17a: Both datatypes are written with the full URI. Note also the two
properties they are used for: the custom myCamera:model, and rdf:value
• rdf:parseType used instead of rdf:Description
• Avoidance of full URI
Turtle
• Most popular of the alternatives to RDF/XML
• Syntax also similar to SPARQL
• ttl = http://www.w3.org/2008/turtle#turtle
• Format
• Example
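The basic Turtle shape is a list of @prefix declarations followed by "subject predicate object ." lines. The sketch below writes that shape from URI triples; the myCamera namespace URI, the dbpedia prefix name and the helper names are assumptions, and the code does not check that local names are legal Turtle.

```python
PREFIXES = {
    "myCamera": "http://example.org/camera#",  # hypothetical namespace
    "dbpedia": "http://www.dbpedia.org/resource/",
}


def term(uri):
    """Abbreviate a URI to prefix:local when a prefix matches, else <uri>."""
    for prefix, namespace in PREFIXES.items():
        if uri.startswith(namespace):
            return f"{prefix}:{uri[len(namespace):]}"
    return f"<{uri}>"


def to_turtle(triples):
    """Emit prefix declarations, then one statement per line."""
    lines = [f"@prefix {p}: <{ns}> ." for p, ns in PREFIXES.items()]
    lines += [f"{term(s)} {term(p)} {term(o)} ." for s, p, o in triples]
    return "\n".join(lines)


doc = to_turtle([("http://example.org/camera#Nikon_D300",
                  "http://example.org/camera#manufactured_by",
                  "http://www.dbpedia.org/resource/Nikon")])
print(doc)
```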
Turtle 2
• Literals language:
• Datatype:
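In Turtle a language tag is attached with @ and a datatype with ^^. A small formatter can illustrate both forms; the function name is hypothetical and only backslashes and double quotes are escaped, so values with line breaks are out of scope for this sketch.

```python
XSD = "http://www.w3.org/2001/XMLSchema#"


def literal(value, lang=None, datatype=None):
    """Format a Turtle literal with either a language tag or a datatype."""
    if lang and datatype:
        raise ValueError("use either lang or datatype, not both")
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    text = f'"{escaped}"'
    if lang:
        return f"{text}@{lang}"
    if datatype:
        return f"{text}^^<{datatype}>"
    return text


print(literal("D300", datatype=XSD + "string"))
print(literal("camera", lang="en"))
print(literal("0.6kg"))
```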
Turtle 3
• @base can also be used to change the in-scope base URI
• What is the URI?
• subj1? http://liyangyu.com/ns0/subj1
• subj2? http://liyangyu.com/ns0/foo/subj2
• subj3? http://liyangyu.com/ns0/foo/bar#subj3
• obj4? http://liyangyu.com/ns1/obj4
• a to replace rdf:type
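The four answers can be reproduced with ordinary relative-URI resolution. The Turtle directives shown in the comments are an assumed reconstruction of the missing slide example; only the four resulting URIs come from the slide.

```python
from urllib.parse import urljoin

# Assumed directive: @base <http://liyangyu.com/ns0/> .
base = "http://liyangyu.com/ns0/"
subj1 = urljoin(base, "subj1")      # <subj1>

# Assumed directive: @base <foo/> .  (resolved against the previous base)
base = urljoin(base, "foo/")
subj2 = urljoin(base, "subj2")      # <subj2>

# Assumed directive: @prefix : <bar#> .  so :subj3 stands for bar#subj3.
# Joining prefix and local name first gives the same result here.
subj3 = urljoin(base, "bar#subj3")

# Assumed directive: @prefix : <http://liyangyu.com/ns1/> .
# An absolute prefix ignores the base.
obj4 = urljoin(base, "http://liyangyu.com/ns1/" + "obj4")

for uri in (subj1, subj2, subj3, obj4):
    print(uri)
```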
Turtle 4
• Multiple statements for subject with same predicate use “,”
• Multiple statements for subject with different predicate use “;”
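The two abbreviations can be sketched as a grouping step: objects that share subject and predicate are joined with ",", and predicates that share a subject are joined with ";". The terms are already-abbreviated names; reviewed_by and the reviewer names are made up for the example.

```python
from collections import defaultdict


def group(triples):
    """Write triples of abbreviated terms using ';' and ','."""
    by_subject = defaultdict(lambda: defaultdict(list))
    for s, p, o in triples:
        by_subject[s][p].append(o)
    blocks = []
    for s, predicates in by_subject.items():
        parts = [f"{p} {', '.join(objects)}" for p, objects in predicates.items()]
        blocks.append(f"{s} " + " ;\n    ".join(parts) + " .")
    return "\n".join(blocks)


doc = group([
    ("myCamera:Nikon_D300", "a", "myCamera:DSLR"),
    ("myCamera:Nikon_D300", "myCamera:manufactured_by", "dbpedia:Nikon"),
    ("myCamera:Nikon_D300", "myCamera:reviewed_by", "myCamera:reviewer_1"),  # hypothetical
    ("myCamera:Nikon_D300", "myCamera:reviewed_by", "myCamera:reviewer_2"),  # hypothetical
])
print(doc)
```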
Turtle 5
Summary
• Need: Express any information anyone can think of
• Why we use triples
• Need: Mechanism to connect distributed
information over the Web / Have a way of talking
about the same/unique concepts
• Why we use vocabularies, and URIs
• RDF
• What is your definition of RDF?
Bonus: Linked data IT-dagene
• Job hunting at IT-dagene
• How many companies have heard of Linked data?
• 0
• The Brønnøysund registers (Brønnøysundregistrene).
• 18 registers: how much redundancy is there in those
databases?
• E.g. if you are registered in 18 registers and change your phone
number, do you have to update the number in all of the
registers?
Bonus: Linked data IT-dagene
• Finn.no
• Is there an ontology at Finn? Does every item on Torget
point to a model type? "Laptop" is the most specific
category…
• DIPS - health sector, "One citizen, one record" ("En innbygger, en journal")
• Linked data. Ontology of diseases. Forms of treatment.
Decision support system.
Bonus: Linked data IT-dagene
• Lindbak Retail Systems
• XXL, Coop, Nille
• Solutions for retail chains.
• Have a lot of data on what people buy, but had never heard
of linked data?
Are there any Norwegian companies?
Partner of socus.no
No obligations
• Meet once a month to discuss ideas?
• Startup…
• Feel free to get in touch: [email protected]
RDF/XML 10
• Avoid writing rdf:about="full URI"
• In-scope base URI: since the example is located at
http://ww.liyangyu.com/rdf/review.rdf
a relative reference such as #Nikon_D300 resolves to the full URI:
http://ww.liyangyu.com/rdf/review.rdf#Nikon_D300
• To change in-scope base URI:
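In RDF/XML the in-scope base URI is changed with the xml:base attribute. The sketch below resolves a relative rdf:about first against the document location and then against an xml:base; the replacement base http://example.org/camera and the helper name subject_uri are assumptions.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
XML_BASE = "{http://www.w3.org/XML/1998/namespace}base"
DOCUMENT_URL = "http://ww.liyangyu.com/rdf/review.rdf"  # location given on the slide


def subject_uri(xml_text, document_url):
    """Resolve the first node's rdf:about against the in-scope base URI."""
    root = ET.fromstring(xml_text)
    # xml:base on the root element overrides the document's own location.
    base = root.get(XML_BASE, document_url)
    return urljoin(base, root[0].get(f"{{{RDF}}}about"))


default = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description rdf:about="#Nikon_D300"/>
</rdf:RDF>"""

# Hypothetical replacement base URI.
rebased = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xml:base="http://example.org/camera">
  <rdf:Description rdf:about="#Nikon_D300"/>
</rdf:RDF>"""

print(subject_uri(default, DOCUMENT_URL))
print(subject_uri(rebased, DOCUMENT_URL))
```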
RDF/XML 11
• Collection of literals or resources
• rdf:Bag → Unordered
• rdf:Seq → Ordered
• rdf:Alt → Alternatives (non-duplicates), e.g. list of alternative
stores selling x
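Whatever the container type, it boils down to one rdf:type triple plus numbered membership properties rdf:_1, rdf:_2 and so on. A minimal sketch; the store names, the blank node label and the helper name container are made up.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
MY_CAMERA = "http://example.org/camera#"  # hypothetical namespace


def container(node, kind, members):
    """Return the triples for an rdf:Bag, rdf:Seq or rdf:Alt container."""
    if kind not in ("Bag", "Seq", "Alt"):
        raise ValueError(kind)
    triples = [(node, RDF + "type", RDF + kind)]
    # Members hang off the numbered properties rdf:_1, rdf:_2, ...
    triples += [(node, f"{RDF}_{i}", m) for i, m in enumerate(members, start=1)]
    return triples


stores = container("_:stores", "Alt",
                   [MY_CAMERA + "store_a", MY_CAMERA + "store_b"])
for triple in stores:
    print(triple)
```

The three kinds use identical triples apart from the type; ordering or "pick one" semantics is only a hint to the application reading the graph.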
RDF/XML 12
• Reification – description of statements
• Example:
• The creator of the statement is liyang
• http://www.purl.org/metadata/dublin-core# - Dublin Core vocabulary used for properties of a given
document
• Problems
• No built-in mechanism to understand that the URI myCamera:statement_01 is
created for a statement
• Up to the RDF application to decide which statement the reification belongs to
• Specifying subject, predicate, object could match multiple statements if we are
in a global context
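Reification can be sketched as four triples describing the statement, plus whatever is said about it. The myCamera namespace URI, the reified claim and the property name creator in the Dublin Core namespace are assumptions made for the example.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC = "http://www.purl.org/metadata/dublin-core#"
MY_CAMERA = "http://example.org/camera#"  # hypothetical namespace


def reify(statement_uri, triple):
    """Describe a triple as a resource of type rdf:Statement."""
    s, p, o = triple
    return [
        (statement_uri, RDF + "type", RDF + "Statement"),
        (statement_uri, RDF + "subject", s),
        (statement_uri, RDF + "predicate", p),
        (statement_uri, RDF + "object", o),
    ]


claim = (MY_CAMERA + "Nikon_D300", MY_CAMERA + "weighs", "0.6kg")
statement = MY_CAMERA + "statement_01"

graph = reify(statement, claim)
# "The creator of the statement is liyang" (property name assumed).
graph.append((statement, DC + "creator", "liyang"))

# The reification describes the claim but does not assert it.
print(claim in graph, len(graph))
```

Nothing in these triples ties statement_01 to one particular occurrence of the claim, which is the matching problem listed above.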