Semantic Web

2012 INTERNATIONAL ASIAN SUMMER SCHOOL IN LINKED DATA
IASLOD 2012, August 13-17, 2012, KAIST, Daejeon, Korea
General Introduction
for Semantic Web and Linked Open Data
Hideaki Takeda
National Institute of Informatics
takeda@nii.ac.jp
Hideaki Takeda / National Institute of
Semantic Web and Linked Data
• Semantic Web
– What is Semantic Web
– How to realize Semantic Web
• Metadata
• RDF
• RDFS
• OWL
• Linked Data
– What is Linked Data?
– The State-of-the-Art of Linked Data
• Linking Open Data (LOD)
– How to use Linked Data
• Linked Data Browser
• Linked Data Search Engine
• Linked Data Applications
– How to use RDF
• RDFa
– SPARQL
Semantic Web
The Aim of The Semantic Web
• "The Semantic Web is an extension of the current web in
which information is given well-defined meaning, better
enabling computers and people to work in cooperation."
The Semantic Web, Scientific American, May 2001, Tim Berners-Lee, James Hendler and Ora Lassila
• The Semantic Web is a vision: the idea of having data on
the web defined and linked in a way that it can be used by
machines not just for display purposes, but for automation,
integration and reuse of data across various applications.
http://www.w3.org/2001/sw/
Semantic Web
• Enables various kinds of information exchange via the Web
– Automation
– Integration
– Re-use of data
Next Generation Web?
• Evolution of Web
– HTML: Web for Display
– XML: Web with Syntax
– ?? : Web with Semantics
• Why should we embed semantics into the Web?
– From: Web for humans
– To: Web for humans and machines (cf. Web for machines)
A brief introduction of XML
• Limitation of HTML
– Display markup and text structure are mixed chaotically
• e.g.,
– <h3></h3> should be used for “the third-level heading”, but is often
used just for a bigger font
– <b></b> specifies “bold”, not “emphasis”.
– Fixed structure
• e.g.,
– If you need <h7></h7>…
– I need a structure just for my data
<h1> A list of lectures</h1>
<h2> Knowledge Sharing Systems</h2>
<h3> Lecturer: Hideaki Takeda</h3>
<h3>Wednesday 3rd</h3>
XML
• XML (eXtensible Markup Language)
– Can define original tags
– Represents logical structures of data
• DTD
– Does not include style information
• XSLT
<lecturelist>
<lecture>
<title id="1234">Knowledge Sharing Systems</title>
<lecturer>Hideaki Takeda</lecturer>
<schedule>
<week>Wednesday</week>
<time>3rd</time>
</schedule>
</lecture>
...
</lecturelist>
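A minimal sketch of what "logical structure" buys a program, assuming Python's standard `xml.etree.ElementTree` module. The `<schedule>` element is closed and the attribute value quoted here so that the fragment parses; the variable names are illustrative only.

```python
import xml.etree.ElementTree as ET

# The lecture list from the slide, made well-formed
# (<schedule> closed, attribute value quoted).
LECTURES_XML = """
<lecturelist>
  <lecture>
    <title id="1234">Knowledge Sharing Systems</title>
    <lecturer>Hideaki Takeda</lecturer>
    <schedule>
      <week>Wednesday</week>
      <time>3rd</time>
    </schedule>
  </lecture>
</lecturelist>
"""

root = ET.fromstring(LECTURES_XML)

# Tags name the data rather than its appearance, so each field
# can be addressed directly by its path.
lectures = [
    {
        "title": lec.findtext("title"),
        "lecturer": lec.findtext("lecturer"),
        "week": lec.findtext("schedule/week"),
        "time": lec.findtext("schedule/time"),
    }
    for lec in root.findall("lecture")
]
print(lectures)
```

The same extraction from the HTML version would have to guess which `<h3>` holds the lecturer and which holds the schedule.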
Why is XML not sufficient?
<person>
<name> Hideaki Takeda</name>
<age> 20</age>
</person>
<個人>
<名前>Hideaki Takeda</名前>
<年齢> 20</年齢>
</個人>
• What is specified by “person” and “name”?
• Are “name” and “名前” the same?
• Is this description sufficient as a description of a “person”?
• …
• In short, syntax alone cannot solve these problems
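The point can be illustrated with a small sketch, assuming Python's standard `xml.etree.ElementTree`. The `SAME_AS` table is a hypothetical hand-written mapping, not part of either document; it stands in for the shared vocabulary that XML itself does not provide.

```python
import xml.etree.ElementTree as ET

english = ET.fromstring(
    "<person><name>Hideaki Takeda</name><age>20</age></person>")
japanese = ET.fromstring(
    "<個人><名前>Hideaki Takeda</名前><年齢>20</年齢></個人>")

def fields(elem):
    """Return {tag: text} for the direct children of an element."""
    return {child.tag: child.text for child in elem}

# A parser sees only tag names; nothing says "name" and "名前" agree.
print(fields(english) == fields(japanese))

# Hypothetical mapping: the shared meaning lives outside the XML.
SAME_AS = {"名前": "name", "年齢": "age"}
normalized = {SAME_AS.get(tag, tag): text
              for tag, text in fields(japanese).items()}
print(normalized == fields(english))
```

Both documents are well-formed, yet only the externally supplied mapping makes them comparable.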
Architecture for the Semantic Web
Tim Berners-Lee http://www.w3.org/2002/Talks/09-lcs-sweb-tbl/
How to describe “meaning”?
• Need to describe “information on information”
– In computers, the “meaning of something” is a description (the “meaning”)
attached to another description (the “something”)
– Metadata
• Data about data
• Need an architecture for common understanding
– Syntax (language or scheme)
– Vocabulary (ontology)
Metadata
• What is metadata?
– Data about data
– What one can say about any information object
• What is described as metadata?
– Content relates to what the object contains or is about,
and is intrinsic to an information object.
– Context indicates the who, what, why, where, how aspects
associated with the object's creation and is extrinsic to an
information object.
– Structure relates to the formal set of associations within or
among individual information objects and can be intrinsic
or extrinsic
“Setting the Stage”, Anne J. Gilliland-Swetland, in Introduction to Metadata: Pathways to Digital
Information, Murtha Baca (ed.), Getty Information Institute.
Metadata
• Metadata to individual information objects
– Bibliography, Dublin Core
• Metadata to part or structure of information objects
– Drawings, RDF, RDFS, OWL
[Figure: a tractor annotated with metadata. Whole object: Type: tractor, Owner: Taro, Product year: 2002. Parts: Body, Wheel, and Axis (connects body to wheel).]
A Layer model for Semantic Web
• RDF (Resource Description Framework)
– The most primitive model for metadata description
• SVO model
• Entity-Relation Model
• Semantic net
• RDF Schema
– Addition of “concept” to RDF
• class-subclass, constraints
• OWL
– More general concept description language
• Logical consistency
• Various class expressions
• Various constraints
• DAML-S
– Descriptions on processes
Tim Berners-Lee http://www.w3.org/2002/Talks/09-lcs-sweb-tbl/
RDF (Resource Description Framework)
• A framework to describe metadata
• Separation of model and syntax
• W3C Recommendation (2004)
RDF Model
• Element
– Resource:
• URI (Uniform Resource Identifier)
• Literal (string)
– Need not be accessible via the Web
– Property:
• Attribute when describing resources
• URI or Literal just as Resource
– Statement: triple of resource, property, and
resource
RDF model
• Statement
– Creator of http://www-kasm.nii.ac.jp/~takeda is “Hideaki Takeda”
• Structure
– Resource (subject): http://www-kasm.nii.ac.jp/~takeda
– Property (predicate): Creator
– Value (object): “Hideaki Takeda”
[Figure: http://www-kasm.nii.ac.jp/~takeda (Resource) --Creator (Property)--> “Hideaki Takeda” (Value)]
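As a rough illustration, the statement above can be held in a three-field record. This is only a sketch: the `Statement` type and the `describe` helper are names invented here, and "Creator" is kept as written on the slide, whereas a real RDF graph would use a property URI.

```python
from typing import NamedTuple

class Statement(NamedTuple):
    subject: str    # the resource being described (a URI)
    predicate: str  # the property
    obj: str        # the value: a URI or a literal

stmt = Statement(
    subject="http://www-kasm.nii.ac.jp/~takeda",
    predicate="Creator",          # slide label, not a full property URI
    obj='"Hideaki Takeda"',       # quoted to mark it as a literal
)

def describe(s: Statement) -> str:
    """Read a triple back as the sentence form used on the slide."""
    return f"{s.predicate} of {s.subject} is {s.obj}"

print(describe(stmt))
```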
RDF model
• Creator of http://www-kasm.nii.ac.jp/~takeda is http://www.nii.ac.jp/staffid/123456, which
has name “Hideaki Takeda” and email “takeda@nii.ac.jp”.
[Figure: http://www-kasm.nii.ac.jp/~takeda --Creator--> http://www.nii.ac.jp/staffid/123456, which has name “Hideaki Takeda” and email “takeda@nii.ac.jp”]
RDF model
• Creator of http://www-kasm.nii.ac.jp/~takeda has name
“Hideaki Takeda” and email “takeda@nii.ac.jp”.
[Figure: http://www-kasm.nii.ac.jp/~takeda --Creator--> (node without a URI), which has name “Hideaki Takeda” and email “takeda@nii.ac.jp”]
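The two variants of the graph, one where the creator has a URI and one where it does not, can be sketched as lists of triples. The blank-node label `_:b0` and the helper names are assumptions of this sketch, and only the name property is shown.

```python
PAGE = "http://www-kasm.nii.ac.jp/~takeda"
STAFF = "http://www.nii.ac.jp/staffid/123456"

# Variant 1: the creator is identified by a URI.
named = [
    (PAGE, "Creator", STAFF),
    (STAFF, "name", "Hideaki Takeda"),
]

# Variant 2: the creator has no URI; "_:b0" is an arbitrary
# local label for the unnamed node.
anonymous = [
    (PAGE, "Creator", "_:b0"),
    ("_:b0", "name", "Hideaki Takeda"),
]

def objects(graph, subject, predicate):
    """All values o such that (subject, predicate, o) is in the graph."""
    return [o for s, p, o in graph if s == subject and p == predicate]

def creator_names(graph, page):
    """Follow Creator, then name, starting from the page."""
    return [name
            for creator in objects(graph, page, "Creator")
            for name in objects(graph, creator, "name")]

print(creator_names(named, PAGE))
print(creator_names(anonymous, PAGE))
```

Both graphs answer the question "what is the creator's name?" identically; only the first lets other data sets refer to the creator.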
RDF syntax
• Creator of http://www-kasm.nii.ac.jp/~takeda is “Hideaki Takeda”
[Figure: http://www-kasm.nii.ac.jp/~takeda (Resource) --Creator (Property)--> “Hideaki Takeda” (Value)]
<?xml version="1.0"?>
<rdf:RDF
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:dc="http://dublincore.org/2001/08/14/dces#">
<rdf:Description rdf:about="http://www-kasm.nii.ac.jp/~takeda">
<dc:Creator>Hideaki Takeda</dc:Creator>
</rdf:Description>
</rdf:RDF>
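<!-- Variant with a resource as the value: rdf:resource takes a URI, not a literal string -->
<rdf:RDF
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:dc="http://dublincore.org/2001/08/14/dces#">
<rdf:Description rdf:about="http://www-kasm.nii.ac.jp/~takeda">
<dc:Creator rdf:resource="http://www.nii.ac.jp/staffid/123456" />
</rdf:Description>
</rdf:RDF>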
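To show that the XML is only one serialization of the underlying triple, here is a minimal sketch that reads the literal-valued form back into (subject, predicate, object) using Python's standard `xml.etree.ElementTree`. It assumes the namespace-qualified `rdf:about` attribute and handles only `rdf:Description` elements with literal property values; it is not a full RDF/XML parser.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

RDF_XML = """<?xml version="1.0"?>
<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:dc="http://dublincore.org/2001/08/14/dces#">
  <rdf:Description rdf:about="http://www-kasm.nii.ac.jp/~takeda">
    <dc:Creator>Hideaki Takeda</dc:Creator>
  </rdf:Description>
</rdf:RDF>
"""

def triples(rdf_xml):
    """Yield (subject, predicate, literal) for each property element."""
    root = ET.fromstring(rdf_xml)
    for desc in root.findall(f"{{{RDF_NS}}}Description"):
        subject = desc.get(f"{{{RDF_NS}}}about")
        for prop in desc:
            # ElementTree spells qualified names as "{namespace}local";
            # the property URI is the two parts concatenated.
            predicate = prop.tag.replace("{", "").replace("}", "")
            yield (subject, predicate, prop.text)

result = list(triples(RDF_XML))
print(result)
```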
RDFS (RDF Schema)
• Stronger knowledge representation model
– RDF: ER model, semantic net
– RDF Schema: Frame model, object-oriented
paradigm
• Minimal definition
• Property-centered approach
• RDFS is defined as an extension of RDF
• RDFS defines the vocabulary used in RDF descriptions
RDFS
• Class Definition
– rdfs:Resource
– rdfs:Class
– rdf:Property
– rdfs:ConstraintProperty
– rdfs:Literal
• Property Definition
– rdf:type
– rdfs:subClassOf
– rdfs:subPropertyOf
– rdfs:comment
– rdfs:label
– rdfs:seeAlso
– rdfs:isDefinedBy
• ConstraintProperty Definition
– rdfs:range
– rdfs:domain
RDFS Structure by RDF
Resource Description Framework (RDF) Schema Specification 1.0
http://www.w3.org/TR/2000/CR-rdf-schema-20000327/
RDF Schema
• rdfs:Class
• rdfs:subClassOf
– Defines a more specific (detailed) class
– Multiple superclasses allowed
– Transitivity
• rdf:type
– Indicates an instance of a class
• rdf:Property
– Attribute
– Range
• Only one
• No cardinality
– Domain
• Multiple (or)
• rdfs:subPropertyOf
– Defines a more specific (detailed) property
– Transitivity
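Transitivity of rdfs:subClassOf, combined with multiple superclasses, amounts to computing a reachability closure. A minimal sketch follows; the class names `Student` and `Agent` are hypothetical and chosen only to give the hierarchy two levels and a branch.

```python
def superclasses(cls, sub_class_of):
    """All classes reachable from cls via rdfs:subClassOf."""
    seen = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for parent in sub_class_of.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen

# Hypothetical hierarchy: direct rdfs:subClassOf links only.
# A class may have several direct superclasses.
SUB_CLASS_OF = {
    "Student": {"Person"},
    "Person": {"Animal", "Agent"},
}

print(sorted(superclasses("Student", SUB_CLASS_OF)))
```

An instance of `Student` is therefore also an instance of `Person`, `Animal`, and `Agent`, although only the direct links were stated.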
RDF Schema
[Figure: graph of the schema. Classes: Animal, Person, MaritalStatus, Integer. Properties: maritalStatus, ssn, age. Instances of MaritalStatus: Married, Single, Widowed, Divorced. Comments: “The class of person”, “Social Security Number”. Edge labels: s = rdfs:subClassOf, t = rdf:type, d = rdfs:domain, r = rdfs:range.]
<rdf:RDF xml:lang="en"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
<rdfs:Class rdf:ID="Person">
<rdfs:comment>The class of people.</rdfs:comment>
<rdfs:subClassOf rdf:resource="http://www.w3.org/2000/03/example/classes#Animal"/>
</rdfs:Class>
<rdf:Property rdf:ID="maritalStatus">
<rdfs:range rdf:resource="#MaritalStatus"/>
<rdfs:domain rdf:resource="#Person"/>
</rdf:Property>
<rdf:Property rdf:ID="ssn">
<rdfs:comment>Social Security Number</rdfs:comment>
<rdfs:range rdf:resource="http://www.w3.org/2000/03/example/classes#Integer"/>
<rdfs:domain rdf:resource="#Person"/>
</rdf:Property>
<rdf:Property rdf:ID="age">
<rdfs:range rdf:resource="http://www.w3.org/2000/03/example/classes#Integer"/>
<rdfs:domain rdf:resource="#Person"/>
</rdf:Property>
<rdfs:Class rdf:ID="MaritalStatus"/>
<MaritalStatus rdf:ID="Married"/>
<MaritalStatus rdf:ID="Divorced"/>
<MaritalStatus rdf:ID="Single"/>
<MaritalStatus rdf:ID="Widowed"/>
</rdf:RDF>
Resource Description Framework (RDF) Schema Specification 1.0
http://www.w3.org/TR/2000/CR-rdf-schema-20000327/
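The domain and range declarations in this schema are not only documentation; they license inferences about the types of the things a property connects. A minimal sketch of that idea, with the schema copied into two dictionaries by hand and a hypothetical individual `#taro` that does not appear in the original example:

```python
# Domain and range declarations from the schema, keyed by property.
DOMAIN = {"maritalStatus": "Person", "ssn": "Person", "age": "Person"}
RANGE = {"maritalStatus": "MaritalStatus", "ssn": "Integer", "age": "Integer"}

def infer_types(triples):
    """Derive rdf:type statements from rdfs:domain and rdfs:range."""
    inferred = set()
    for s, p, o in triples:
        if p in DOMAIN:
            # Whatever has property p belongs to p's domain class.
            inferred.add((s, "rdf:type", DOMAIN[p]))
        if p in RANGE:
            # Whatever is a value of p belongs to p's range class.
            inferred.add((o, "rdf:type", RANGE[p]))
    return inferred

# "#taro" is a hypothetical individual used only for this sketch.
data = [("#taro", "maritalStatus", "#Married")]
print(sorted(infer_types(data)))
```

Nothing in `data` says that `#taro` is a person; that follows from the schema alone.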
OWL (Web Ontology Language)
• More general knowledge representation
• Based on Description Logics
• Features
– Class
• Necessary condition / necessary and sufficient condition
• Class expression:
– Constraint by property
» Like slot definition of a class
» Type constraint (all/some), cardinality, typed cardinality
– Logical operation of classes: union, intersection, negation
– Property
• Multiple ranges and domains
• Specifying meta-property
– Import of definitions
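In Description Logics a class denotes a set of individuals and a property denotes a set of pairs, so the class expressions listed above can be read as set operations. The sketch below evaluates them in one small, entirely hypothetical interpretation (all individuals, classes, and the `HAS_PET` property are invented here). An OWL reasoner works over all possible interpretations under the open-world assumption, which this closed toy model does not attempt.

```python
# One hypothetical finite interpretation.
DOMAIN_OF_DISCOURSE = {"alice", "bob", "rex", "tama"}
CLASSES = {
    "Person": {"alice", "bob"},
    "Pet": {"rex", "tama"},
    "Student": {"alice"},
    "Dog": {"rex"},
}
HAS_PET = {("alice", "rex"), ("bob", "tama")}  # property = set of pairs

# Logical operations on classes.
union = CLASSES["Person"] | CLASSES["Pet"]
intersection = CLASSES["Person"] & CLASSES["Student"]
negation = DOMAIN_OF_DISCOURSE - CLASSES["Person"]

def some_values_from(prop, cls):
    """Individuals with at least one prop-value in cls ("some")."""
    return {s for s, o in prop if o in cls}

def all_values_from(prop, cls, domain):
    """Individuals whose prop-values are all in cls ("all").

    Individuals with no prop-value at all satisfy this vacuously.
    """
    return {x for x in domain
            if all(o in cls for s, o in prop if s == x)}

dog_owners = some_values_from(HAS_PET, CLASSES["Dog"])
only_dogs = all_values_from(HAS_PET, CLASSES["Dog"], DOMAIN_OF_DISCOURSE)
print(dog_owners, only_dogs)
```

The vacuous case is the usual surprise: `rex` and `tama` have no pets, so they satisfy the "all" constraint.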
Linked Data
Linked Data
• What is Linked Data?
• The State-of-the-Art of Linked Data
– Linking Open Data (LOD)
• How to use Linked Data
– Linked Data Browser
– Linked Data Search Engine
– Linked Data Applications
• How to use RDF
– RDFa
– SPARQL
Architecture for the Semantic Web

The world of classes (Ontologies)

The world of instances
(Linked Data)
Tim Berners-Lee http://www.w3.org/2002/Talks/09-lcs-sweb-tbl/
Layers of Semantic Web
• Ontology
– Descriptions on classes
– RDFS, OWL
– Challenges for ontology building
• Ontology building is difficult by nature
– Consistency, comprehensiveness, logicality
• Alignment of ontologies is more difficult
Descriptions on classes
Ontology
Descriptions on instances
Linked Data
Tim Berners-Lee http://www.w3.org/2002/Talks/09-lcs-sweb-tbl/
Layers of Semantic Web
• Linked Data
– Descriptions on instances (individuals)
– RDF + (RDFS, OWL)
– Pros for Linked Data
• Easy to write (mainly fact description)
• Easy to link (fact to fact link)
– Cons for Linked Data
• Difficult to describe complex structures
• Still need for class description (-> ontology)
Descriptions on classes
Ontology
Description on instances
Linked Data
Tim Berners-Lee http://www.w3.org/2002/Talks/09-lcs-sweb-tbl/
Linked Data

• Linked Data is “Web of Data”
– Data published as RDF
– Can be referred to from outside
• The four rules for Linked Data
Linked Data
• The four rules for Linked Data
– Use URIs as names for things
• Give a URI to every object in the world!
– Use HTTP URIs so that people can look up those names.
• Don’t use URNs
– When someone looks up a URI, provide useful information, using the
standards (RDF, SPARQL)
• Provide machine-readable data for each URI
– Include links to other URIs, so that they can discover more things.
• Make data linked together just like the Web
Linked Data, TBL, http://www.w3.org/DesignIssues/LinkedData.html
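Rules two and three can be made concrete with a sketch of the client side: an ordinary HTTP GET on the name itself, with an `Accept` header asking for RDF rather than HTML. The example uses Python's standard `urllib.request`; the URI `http://dbpedia.org/resource/Tokyo` and the media type `application/rdf+xml` are assumptions chosen for illustration, and the request is built but not sent.

```python
import urllib.request

def linked_data_request(uri, accept="application/rdf+xml"):
    """Build (but do not send) a GET request asking for RDF."""
    return urllib.request.Request(uri, headers={"Accept": accept})

req = linked_data_request("http://dbpedia.org/resource/Tokyo")
print(req.get_method(), req.full_url, req.get_header("Accept"))

# To actually look the name up (requires network access):
# with urllib.request.urlopen(req) as resp:
#     rdf_xml = resp.read().decode("utf-8")
```

The returned RDF would in turn contain further HTTP URIs, which is what rule four asks for.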
Linking Open Data (LOD)
• The project to collect published Linked Data
• Major Linked Data
• (Translated from the original resources)
– DBpedia (Wikipedia): 270 Million Triples
– Geonames: Geo names and their latitudes and longitudes, 93 Million Triples
– MusicBrainz: Music
– WordNet: Dictionary
– DBLP bibliography: Bibliography for technical papers, 28 Million Triples
– US Census Data: 1 Billion Triples
• (Crawling)
– FOAF (Friend Of A Friend)
• (Wrapper)
– Flickr Wrapper
LOD Cloud
(Linking Open Data)
http://dbpedia.org/page/Tokyo
http://en.wikipedia.org/wiki/Tokyo
How to use Linked Data
[Figure: three ways to use Linked Data, each operating on linked “Things”: Linked Data Browser, Linked Data Mashup, Linked Data Search Engine]
Linked Data Browser
• Browse linked data just as browsing web pages
– Show RDF data
– Prompt links to follow
• System/Service
– Marbles
• Display data by following links
– Tabulator
• Firefox plugin/online
• Adding information in a single page
– Sig.ma
• Showing RDF resources which can be operated
Tabulator
Linked Data Search Engine
• Search RDF data with crawled data set
– Swoogle
– Sindice
– Watson
http://sindice.com/
How to use Linked Data
• Semantic Data Mash-up Applications
– SemaPlorer
• http://btc.isweb.uni-koblenz.de/
– DBpedia Mobile
• http://wiki.dbpedia.org/DBpediaMobile
– Bio2RDF
• http://bio2rdf.org/
DBpedia Mobile
Bio2RDF
• Search LOD in bioscience
• Translate data into RDF if it is not already in RDF
RDFa
• Add extra structured content to the (X)HTML
pages
– adds new (X)HTML/XML attributes
• “RDF in attributes”
– Programs can extract those and turn into RDF
– Flexibility for using Literals and URI resources
Principles of RDFa
• RDF contents are defined through XML
attributes (no elements)
• XML/HTML tree structure is used
• Various attributes are defined by RDFa
– Some attributes (@href, @rel) are also reused
• The text content can also be reused
Examples
http://example.com/alice/posts/trouble_with_bob
<div xmlns:dc="http://purl.org/dc/elements/1.1/">
<h2 property="dc:title">The trouble with Bob</h2>
<h3 property="dc:creator">Alice</h3>
...
</div>
In N3
<http://example.com/alice/posts/trouble_with_bob>
<http://purl.org/dc/elements/1.1/title> "The trouble with Bob";
<http://purl.org/dc/elements/1.1/creator> "Alice" .
<div xmlns:dc="http://purl.org/dc/elements/1.1/">
<div about="/alice/posts/trouble_with_bob">
<h2 property="dc:title">The trouble with Bob</h2>
<h3 property="dc:creator">Alice</h3>
...
</div>
<div about="/alice/posts/jos_barbecue">
<h2 property="dc:title">Jo's Barbecue</h2>
<h3 property="dc:creator">Eve</h3>
...
</div>
...
</div>
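How a program might "extract those and turn into RDF" can be sketched with Python's standard `html.parser`. This toy extractor, whose class name `TinyRDFa` is invented here, understands only `@about` and `@property` with text content. It does not expand prefixes such as `dc:`, resolve relative URIs, or handle `@rel`, `@href`, `@typeof`, or `@content`, so it is far from a conforming RDFa processor.

```python
from html.parser import HTMLParser

VOID = {"img", "br", "hr", "meta", "link", "input"}  # no end tag

class TinyRDFa(HTMLParser):
    """Collect (subject, property, text) from @about and @property."""

    def __init__(self, base_subject):
        super().__init__()
        self.subjects = [base_subject]  # current subject per open element
        self.pending = None             # (subject, property) awaiting text
        self.triples = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        # @about sets a new subject; otherwise inherit from the parent.
        subject = attrs.get("about", self.subjects[-1])
        if tag not in VOID:
            self.subjects.append(subject)
        if "property" in attrs:
            self.pending = (subject, attrs["property"])

    def handle_endtag(self, tag):
        self.pending = None
        if tag not in VOID and len(self.subjects) > 1:
            self.subjects.pop()

    def handle_data(self, data):
        if self.pending and data.strip():
            self.triples.append((*self.pending, data.strip()))
            self.pending = None

HTML = """
<div xmlns:dc="http://purl.org/dc/elements/1.1/">
  <div about="/alice/posts/trouble_with_bob">
    <h2 property="dc:title">The trouble with Bob</h2>
    <h3 property="dc:creator">Alice</h3>
  </div>
  <div about="/alice/posts/jos_barbecue">
    <h2 property="dc:title">Jo's Barbecue</h2>
    <h3 property="dc:creator">Eve</h3>
  </div>
</div>
"""

parser = TinyRDFa(base_subject="")
parser.feed(HTML)
for triple in parser.triples:
    print(triple)
```

The stack of subjects mirrors the rule that the XML/HTML tree structure is used: a property applies to the nearest enclosing `@about`.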
<div about="/alice/posts/trouble_with_bob">
<h2 property="dc:title">The trouble with Bob</h2>
The trouble with Bob is that he takes much better photos than I do:
<div about="http://example.com/bob/photos/sunset.jpg">
<img src="http://example.com/bob/photos/sunset.jpg" />
<span property="dc:title">Beautiful Sunset</span>
by
<span property="dc:creator">Bob</span>.
</div>
</div>
<div typeof="foaf:Person" xmlns:foaf="http://xmlns.com/foaf/0.1/">
<p property="foaf:name"> Alice Birpemswick </p>
<p> Email: <a rel="foaf:mbox" href="mailto:alice@example.com">alice@example.com</a></p>
<p> Phone: <a rel="foaf:phone" href="tel:+1-617-555-7332">+1 617.555.7332</a> </p>
</div>
<div xmlns:foaf="http://xmlns.com/foaf/0.1/" about="#me" rel="foaf:knows">
<ul>
<li typeof="foaf:Person">
<a property="foaf:name" rel="foaf:homepage" href="http://example.com/bob">Bob</a>
</li>
<li typeof="foaf:Person">
<a property="foaf:name" rel="foaf:homepage" href="http://example.com/eve">Eve</a>
</li>
<li typeof="foaf:Person">
<a property="foaf:name" rel="foaf:homepage" href="http://example.com/manu">Manu</a>
</li>
</ul>
</div>
Using RDFa
• RDF Validator
– http://validator.w3.org/
• RDFa Distiller
– http://www.w3.org/2007/08/pyRdfa/
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML+RDFa 1.0//EN"
"http://www.w3.org/MarkUp/DTD/xhtml-rdfa-1.dtd">
<html xmlns="http://www.w3.org/1999/xhtml"
xmlns:foaf="http://xmlns.com/foaf/0.1/"
xmlns:dc="http://purl.org/dc/elements/1.1/"
version="XHTML+RDFa 1.0" xml:lang="en">
<head>
<title>John's Home Page</title>
<base href="http://example.org/john-d/" />
<meta property="dc:creator" content="Jonathan Doe" />
<link rel="foaf:primaryTopic"
href="http://example.org/john-d/#me" />
</head>
<body about="http://example.org/john-d/#me">
<h1>John's Home Page</h1>
<p>My name is <span property="foaf:nick">John D</span>
and I like
<a href="http://www.neubauten.org/" rel="foaf:interest"
xml:lang="de">Einsturzende Neubauten</a>.
</p>
<p>
My <span rel="foaf:interest"
resource="urn:ISBN:0752820907">favorite
book is the inspiring <span about="urn:ISBN:0752820907">
<cite property="dc:title">Weaving the Web</cite> by
<span property="dc:creator">Tim Berners-Lee</span></span>
</span>
</p>
</body>
</html>
<http://example.org/john-d/> <http://xmlns.com/foaf/0.1/primaryTopic> <http://example.org/john-d/#me>.
<http://example.org/john-d/> <http://purl.org/dc/elements/1.1/creator> "Jonathan Doe"@en.
<http://example.org/john-d/#me> <http://xmlns.com/foaf/0.1/nick> "John D"@en.
<http://example.org/john-d/#me> <http://xmlns.com/foaf/0.1/interest> <http://www.neubauten.org/>.
<http://example.org/john-d/#me> <http://xmlns.com/foaf/0.1/interest> <urn:ISBN:0752820907>.
<urn:ISBN:0752820907> <http://purl.org/dc/elements/1.1/title> "Weaving the Web"@en.
<urn:ISBN:0752820907> <http://purl.org/dc/elements/1.1/creator> "Tim Berners-Lee"@en.
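The extracted triples above are plain lines of text, which makes them easy to post-process. A minimal sketch that reads them with a regular expression and groups them by subject follows. The pattern covers only the shapes that occur in this listing (URI subjects and predicates, objects that are URIs or simple literals with an optional language tag) and is an assumption of this sketch, not a complete N-Triples grammar.

```python
import re
from collections import defaultdict

NT = """
<http://example.org/john-d/> <http://xmlns.com/foaf/0.1/primaryTopic> <http://example.org/john-d/#me>.
<http://example.org/john-d/> <http://purl.org/dc/elements/1.1/creator> "Jonathan Doe"@en.
<http://example.org/john-d/#me> <http://xmlns.com/foaf/0.1/nick> "John D"@en.
<http://example.org/john-d/#me> <http://xmlns.com/foaf/0.1/interest> <http://www.neubauten.org/>.
<http://example.org/john-d/#me> <http://xmlns.com/foaf/0.1/interest> <urn:ISBN:0752820907>.
<urn:ISBN:0752820907> <http://purl.org/dc/elements/1.1/title> "Weaving the Web"@en.
<urn:ISBN:0752820907> <http://purl.org/dc/elements/1.1/creator> "Tim Berners-Lee"@en.
"""

NT_LINE = re.compile(
    r'^<(?P<s>[^>]*)>\s+<(?P<p>[^>]*)>\s+'
    r'(?:<(?P<o_uri>[^>]*)>|"(?P<o_lit>[^"]*)"(?:@(?P<lang>[A-Za-z-]+))?)'
    r'\s*\.$'
)

def parse(lines):
    """Yield (subject, predicate, object) for each line that matches."""
    for line in lines:
        m = NT_LINE.match(line.strip())
        if m:
            obj = m["o_uri"] if m["o_uri"] is not None else m["o_lit"]
            yield (m["s"], m["p"], obj)

by_subject = defaultdict(list)
for s, p, o in parse(NT.splitlines()):
    by_subject[s].append((p, o))

for subject, properties in by_subject.items():
    print(subject, len(properties))
```

Grouping by subject recovers the three things the page talks about: the page itself, John, and the book.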
Summary
• Linked Data is the practical application of
Semantic Web
– The bottom-up approach
– Postpone the ontology issue
• A technological solution for data sharing
– Data science
– Open Government