Semantic Web

The Semantic Web
Schedule for this evening
• Review of the survey
– Summary. Discussion if wanted
• Some other ways to move content from
place to place
– FTP
– OAI – PMH
• Then, the Semantic Web
– An introduction to things to come
Survey
• Summary on Word document
• Responses and any comments
Other ways to move materials in
the Internet
• FTP – File Transfer Protocol
– One of the oldest of the Internet protocols
– Originally, command line interface
– Now, many GUI versions
• Host must run an FTP server that
listens on port 21 (the default control port;
port 20 carries the data in active mode)
• Client requests a session, user logs in,
issues a sequence of commands
including get and put.
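The session described above can be sketched with Python's standard ftplib module. This is only an illustration, not part of the original demonstration: the host name, login, and file names are placeholders.

```python
from ftplib import FTP


def download_file(ftp, remote_name, local_path):
    """Fetch one file over a logged-in FTP session (the client's 'get')."""
    with open(local_path, "wb") as out:
        # RETR is the protocol-level command behind "get".
        ftp.retrbinary(f"RETR {remote_name}", out.write)


def upload_file(ftp, local_path, remote_name):
    """Send one file to the server (the client's 'put')."""
    with open(local_path, "rb") as src:
        # STOR is the protocol-level command behind "put".
        ftp.storbinary(f"STOR {remote_name}", src)


def example_session():
    # Hypothetical host and credentials, for illustration only.
    with FTP("ftp.example.org") as ftp:      # client requests a session
        ftp.login("anonymous", "guest@example.org")   # user logs in
        download_file(ftp, "README.txt", "README.txt")
```

The helper functions take the session object as an argument, so the same code works with any object that offers retrbinary and storbinary.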
Brief demonstration
Open Archives Initiative
• Generally oriented toward sharing
information about resources in
collections accessible on the Internet
• There is a protocol for sharing
– Based on XML so we will look at that first
Semantic Web
• Semantics refers to meaning.
• The semantic web aims to have enough
information about a resource available that
a program can use resources as if the
program could understand what the
resources are.
– Of course, the program does not really
“understand” in the human sense.
– However, if it has enough information, it can
follow rules and behave in ways that are
consistent with understanding what it is
working with.
Markup
• HTML is a markup language
– not the first, by any means
• Tags in HTML give clues to the reader
(browser or other program) about what to
do in displaying or presenting the marked
text.
– emphasize, make stand out (like a title or
section head), break
– Some allowance for meta tags
• HTML has been stretched beyond its
original design
XML
• Simplified version of SGML
– Language for defining languages (markup
languages)
– HTML is now XHTML and is an XML
language
– XML allows you to make up your own
descriptive language
Metadata
• Critical part of the description of content
and resources
• What does metadata look like?
• Metadata is data about data
– Information about a resource, encoded in
the resource or associated with the
resource.
• The language of metadata: XML
– eXtensible Markup Language
XML
• XML is a markup language
• XML describes features
• There is no standard set of XML tags; you define your own
• Use XML to create a resource type
• Separately develop software to interact
with the data described by the XML
codes.
Source: tutorial at w3schools.com
XML rules
• Easy rules, but very strict
• First line is the version and character set
used:
– <?xml version="1.0" encoding="ISO-8859-1"?>
• The rest is user defined tags
• Every tag has an opening and a closing
Element naming
• XML elements must follow these naming
rules:
– Names can contain letters, numbers, and other
characters
– Names must not start with a number or
punctuation character
– Names must not start with the letters xml (or XML
or Xml ..)
– Names cannot contain spaces
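The naming rules above can be checked mechanically. The following is a simplified sketch: it accepts only ASCII letters, digits, hyphens, underscores, and periods, while the XML specification allows a wider range of characters. The function name is made up for this example.

```python
import re

# Simplified pattern: a letter or underscore first, then letters, digits,
# periods, underscores, or hyphens. ASCII only, which is narrower than
# what the XML specification permits.
_NAME_PATTERN = re.compile(r"[A-Za-z_][A-Za-z0-9._-]*")


def is_valid_element_name(name):
    """Return True if `name` satisfies the naming rules listed above."""
    if name.lower().startswith("xml"):
        return False          # names starting with xml / XML / Xml are reserved
    return _NAME_PATTERN.fullmatch(name) is not None


print(is_valid_element_name("ingredient-name"))   # True
print(is_valid_element_name("2nd-course"))        # False: starts with a number
print(is_valid_element_name("recipe title"))      # False: contains a space
print(is_valid_element_name("xmlRecipe"))         # False: starts with "xml"
```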
Elements and attributes
• Use elements to describe data
• Use attributes to present information
that is not part of the data
– For example, the file type or some other
information that would be useful in
processing the data, but is not part of the
data.
Repeating elements
• Naming an element means it appears
exactly once.
• Name+ means it appears one or more
times
• Name* means it appears 0 or more
times.
• Name? means it appears 0 or 1 time.
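The four occurrence indicators can be read as minimum and maximum counts. A small sketch of that reading follows; the table and function names are invented for the example.

```python
# Occurrence indicator -> (minimum, maximum) number of appearances.
# None means "no upper limit".
OCCURRENCE = {
    "":  (1, 1),      # Name   : exactly once
    "+": (1, None),   # Name+  : one or more times
    "*": (0, None),   # Name*  : zero or more times
    "?": (0, 1),      # Name?  : zero or one time
}


def count_allowed(count, indicator=""):
    """Return True if an element appearing `count` times satisfies the indicator."""
    low, high = OCCURRENCE[indicator]
    return count >= low and (high is None or count <= high)


print(count_allowed(0, "*"))   # True
print(count_allowed(0, "+"))   # False
print(count_allowed(2, "?"))   # False
```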
Parts of an XML document
• Elements
– The components of an XML document
– Some contain other parts, some are empty
• Ex in HTML: “br” or “table”; in XML: “ingredient”
• Attributes
– Information about elements, not data
• Ex in HTML: “src=”; in XML: “scale=”
• Entities
– Special characters or strings with pre-assigned meaning
• Ex in HTML: &nbsp; for non-breaking space
• PCDATA
– Parsed character data: text that will be parsed and interpreted by
the reader. Tags and entities will be expanded and used in
presentation.
• CDATA
– Character data: text that will not be parsed and interpreted. It
will be displayed exactly as provided.
Note: The HTML examples are familiar; the XML examples are made up and
depend on the specific XML scheme used.
Using XML - an example
Define the fields of a recipe collection:
<?xml version="1.0" encoding="ISO-8859-1"?>
<recipe>
<recipe-title> </recipe-title>
<ingredient-list>
<ingredient>
<ingredient-amount> </ingredient-amount>
<ingredient-name> </ingredient-name>
</ingredient>
</ingredient-list>
<directions>
</directions>
</recipe>
ISO 8859 is a character set.
See http://www.bbsinc.com/iso8859.html
Processing the XML data
• How do we know what to do with the
information in an XML file?
– Document Type Definition (DTD)
• Put in the same file as the data -- immediate
reference
• Put a reference to an external description
• Provides the definition of the legitimate content
for each element
Document Type Definition
<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE recipe [
<!ELEMENT recipe (recipe-title, ingredient-list, directions)>
<!ELEMENT recipe-title (#PCDATA)>
<!-- The * means ingredient repeats 0 or more times within the list -->
<!ELEMENT ingredient-list (ingredient)*>
<!ELEMENT ingredient (ingredient-amount, ingredient-name)>
<!ELEMENT ingredient-amount (#PCDATA)>
<!ELEMENT ingredient-name (#PCDATA)>
<!ELEMENT directions (#PCDATA)>
]>
External reference to DTD:
<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE recipe SYSTEM "recipe.dtd">
<recipe>
<recipe-title> Meringue cookies</recipe-title>
<ingredient-list>
<ingredient>
<ingredient-amount>3 </ingredient-amount>
<ingredient-name> egg whites</ingredient-name>
</ingredient>
<ingredient>
<ingredient-amount> 1 cup</ingredient-amount>
<ingredient-name> sugar</ingredient-name>
</ingredient>
<ingredient>
<ingredient-amount>1 teaspoon </ingredient-amount>
<ingredient-name> vanilla</ingredient-name>
</ingredient>
<ingredient>
<ingredient-amount>2 cups </ingredient-amount>
<ingredient-name>mini chocolate chips </ingredient-name>
</ingredient>
</ingredient-list>
<directions>Beat the egg whites until stiff. Stir in sugar, then vanilla. Gently fold in
chocolate chips. Place in warm oven at 200 degrees for an hour. Alternatively, place in
an oven at 350 degrees. Turn oven off and leave overnight.
</directions>
</recipe>
• Not the way that I want to see a recipe in a magazine!
• What could we do with a large collection of such entries?
• How would we get the information entered into a collection?
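One possible answer to what a program could do with such entries: because every field is tagged, software can pull out exactly the parts it needs. The sketch below uses Python's standard xml.etree.ElementTree module on a shortened copy of the recipe. The DOCTYPE line is left out because this parser does not validate against a DTD, and the function name is made up for the example.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the recipe, without the DOCTYPE line.
RECIPE_XML = """
<recipe>
  <recipe-title> Meringue cookies</recipe-title>
  <ingredient-list>
    <ingredient>
      <ingredient-amount>3 </ingredient-amount>
      <ingredient-name> egg whites</ingredient-name>
    </ingredient>
    <ingredient>
      <ingredient-amount> 1 cup</ingredient-amount>
      <ingredient-name> sugar</ingredient-name>
    </ingredient>
  </ingredient-list>
  <directions>Beat the egg whites until stiff.</directions>
</recipe>
"""


def read_recipe(xml_text):
    """Return (title, [(amount, name), ...]) with surrounding spaces removed."""
    root = ET.fromstring(xml_text.strip())
    title = root.findtext("recipe-title", default="").strip()
    ingredients = [
        (item.findtext("ingredient-amount", default="").strip(),
         item.findtext("ingredient-name", default="").strip())
        for item in root.iter("ingredient")
    ]
    return title, ingredients


title, ingredients = read_recipe(RECIPE_XML)
print(title)
for amount, name in ingredients:
    print(f"{amount}: {name}")
```

With a large collection, the same function could feed a shopping list, a search by ingredient, or a magazine-style layout.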
Spot Check
• Design an XML schema for an
application of your choice. Keep it
simple.
• Examples -- address book, TV program
listing, DVD collection, …
• Work in pairs and discuss your choice
and your solution
Another example
• A paper with content encoded with XML:
http://tecfaseed.unige.ch/staf18/modules/ePBL/uploads/proj3/paper81.xml
• First few lines:
<?xml version="1.0" encoding="ISO-8859-1"?>
<?xml-stylesheet href="ePBLpaper11.css" type="text/css"?>
<?xml-stylesheet href="ePBLpaper11.xsl" type="text/xsl"?>
<!DOCTYPE paper SYSTEM "ePBLpaper11.dtd">
<paper id="proj3">
<info>
<title>Standards E-learning and their possible support for a rich
pedagogic approach in a 'Integrated Learning' context</title>
<authors>
<author>
<firstname>Rodolophe</firstname>
<familyname>Borer</familyname>
<homepageurl>http://tecfa.unige.ch/perso/staf/borer/</homepageurl>
<email/>
</author>
</authors>
"ePBLpaper11.dtd" shown on next slide
<?xml version="1.0" encoding="ISO-8859-1" ?>
<!-- _________ _____________________ -->
<!-- ePBL-project DTD for student project management & specification -->
<!-- Copyright: (2004) [email protected] -->
<!-- http://tecfa.unige.ch/~paraskev/ -->
<!-- Daniel K. Schneider -->
<!-- http://tecfa.unige.ch/tecfa-people/schneider.html -->
<!-- Created: 13/11/2002 (based on EVA_pm grammar) -->
<!-- Updated: 07/05/2004 -->
<!-- VERSIONS -->
<!-- v1.1 Adaptations to use with Morphon xml editor and addition of IDs -->
<!-- ____________________ -->
<!-- _ ENTITY DECLARATIONS ______ -->
<!ENTITY % foreign-dtd SYSTEM "ibtwsh6_ePBL.dtd">
%foreign-dtd;
<!ENTITY % id "id ID #IMPLIED">
<!-- ______ MAIN ELEMENT _________ -->
<!ELEMENT project (name, authors, date, updated,
goal, state-of-the-art, research-development-questions, methodology, workpackages ) >
<!ELEMENT name (#PCDATA )>
<!ELEMENT date (#PCDATA )>
<!ELEMENT authors (#PCDATA )>
<!ELEMENT updated (#PCDATA )>
<!ELEMENT goal (title, description )>
<!ELEMENT state-of-the-art %vert.model;>
<!ATTLIST state-of-the-art %id;>
<!ELEMENT research-development-questions (question
)+>
<!ELEMENT question (title, description )>
<!ELEMENT methodology %vert.model;>
<!ATTLIST methodology %id;>
<!ELEMENT workpackages (workpackage )+>
<!ELEMENT workpackage (planning, objectives,
deliverables )>
<!ATTLIST workpackage %id;>
<!ELEMENT objectives (objective )+>
<!ELEMENT objective (title, description )>
<!ELEMENT deliverables (deliverable )+>
<!ELEMENT deliverable (url, title, description )>
<!ELEMENT url (#PCDATA )>
<!ELEMENT planning (from, to, progress )>
<!ELEMENT from (#PCDATA )>
<!ELEMENT to (#PCDATA )>
<!ELEMENT progress (#PCDATA )>
<!-- ________________________ -->
<!ELEMENT title (#PCDATA )>
<!ATTLIST title %id;>
<!ELEMENT description %vert.model;>
<!-- _______________________ -->
Source: http://tecfa.unige.ch/staf/staf-j/vuilleum/staf18/p6/ (no longer there)
Resource sharing
• On your projects, you had to go looking for
the materials that you need
• You look at the site, see what is there,
consider how it could be used in your
project.
• On a large scale, that does not work so
well.
• It would be nice to query a site and ask
what is there that might be of interest to
us.
Distributed Resources, Multiple Services
[Diagram: several data providers feeding a single service provider]
• One service provider gathers information about data and uses
it to provide services (search, browse, compare, etc.)
Open Archives Initiative (OAI)
• Web-based
– Uses HTTP to communicate between sites
• Centralized server
– Services provided from a site that has
already gathered the information it needs
for those services from a distributed
collection of sites.
OAI PMH
• Interoperability through Metadata Exchange
• The Open Archives Initiative Protocol for
Metadata Harvesting (OAI-PMH) is a low-barrier
mechanism for repository interoperability. Data
Providers are repositories that expose structured
metadata via OAI-PMH. Service Providers then
make OAI-PMH service requests to harvest that
metadata. OAI-PMH is a set of six verbs or
services that are invoked within HTTP.
http://www.openarchives.org/pmh/
OAI PMH verbs
• Identify
• ListMetadataFormats
• ListSets
• ListIdentifiers
• ListRecords
• GetRecord
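Because the verbs are invoked within HTTP, a request is just a URL with a verb argument and any further arguments in the query string. The sketch below builds such URLs with Python's standard library. The base URL is a placeholder and the function name is invented for the example. The verb names are spelled as the OAI-PMH 2.0 specification defines them, with capitals at each word.

```python
from urllib.parse import urlencode

# The six OAI-PMH verbs, spelled as in the protocol specification.
OAI_VERBS = {"Identify", "ListMetadataFormats", "ListSets",
             "ListIdentifiers", "ListRecords", "GetRecord"}


def build_oai_request(base_url, verb, **arguments):
    """Return the request URL for one OAI-PMH verb and its arguments."""
    if verb not in OAI_VERBS:
        raise ValueError(f"unknown OAI-PMH verb: {verb}")
    query = urlencode({"verb": verb, **arguments})
    return f"{base_url}?{query}"


# Hypothetical repository address, for illustration only.
BASE = "http://archive.example.org/oai"
print(build_oai_request(BASE, "Identify"))
print(build_oai_request(BASE, "ListRecords",
                        metadataPrefix="oai_dc", set="biology"))
# "from" is a Python keyword, so it has to be passed through a dict:
print(build_oai_request(BASE, "ListIdentifiers",
                        metadataPrefix="oai_dc", **{"from": "2002-12-01"}))
```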
Open Archives Initiative Protocol for
Metadata Harvesting -- OAI-PMH
[Diagram: a Harvester (Service Provider) sends an HTTP request carrying an
OAI verb to a Repository (Metadata Provider), which returns an HTTP
response in XML]
• OAI PMH defines an interface between the Harvester and any
number of Repositories
• Implemented as CGI, ASP, PHP, or other
• Any system may serve as a harvester, repository, or both
OAI - PMH components
• Service Providers and Data Providers
• Requests and Responses
http://www.oaforum.org/tutorial/english/page3.htm#section3
Records
• Metadata of a resource.
• Three parts
– Header (required)
• Identifier (required: 1 only)
• Datestamp (required: 1 only)
• setSpec elements (optional: 0, 1, or more)
• Status attribute for deleted item
– Metadata (required)
• XML encoded metadata with root tag, namespace
• Repositories must support Dublin Core, other formats
optional
– “About” statement (optional)
• Rights statements
• Provenance statements
Dublin Core elements
see: http://dublincore.org/documents/dces/
• Title
• Creator: Entity primarily responsible for making content of the resource
• Subject - C
• Description
• Publisher: Entity making the resource available
• Contributor: Contributor to content of the resource
• Date: YYYY-MM-DD
• Type - C: Ex: collection, dataset, event, image
• Format - C: What is needed to display or operate the resource.
• Identifier: Unambiguous ID
• Source
• Language: Standards RFC 3066, ISO639
• Relation: Ref. to related resource
• Coverage - C: Space, time, jurisdiction.
• Rights: Rights Management information
C = controlled vocabulary recommended.
Identifiers
• Globally unique identifier
• Valid URI
– Examples
• oai:<archiveId>:<recordId>
• oai:etd.vt.edu:etd-1234567890
– Must resolve to one item
• No duplicates
• No reuse of previously used identifiers
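The oai:<archiveId>:<recordId> pattern is easy to take apart in code. A minimal sketch follows; the function name is made up, and only the three-part shape shown above is checked, not the full identifier syntax.

```python
def parse_oai_identifier(identifier):
    """Split 'oai:<archiveId>:<recordId>' into (archive_id, record_id)."""
    parts = identifier.split(":", 2)    # the record id may itself contain ':'
    if len(parts) != 3 or parts[0] != "oai" or not parts[1] or not parts[2]:
        raise ValueError(f"not an oai identifier: {identifier!r}")
    return parts[1], parts[2]


print(parse_oai_identifier("oai:etd.vt.edu:etd-1234567890"))
```

For the example identifier from the slide, this gives the archive etd.vt.edu and the record etd-1234567890.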
Datestamps
• Date of last modification of a record
– Used only for harvesting (meta metadata?)
• Mandatory for each item in the repository
• Two levels of granularity possible
– YYYY-MM-DD
– YYYY-MM-DDThh:mm:ssZ
• T separates the date from the time; the trailing Z marks the
time zone, which must be UTC (GMT)
• Allows harvesting incrementally -- get only
what is new since last visit
– Accessed by arguments from and until
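A harvester that remembers when it last visited can turn that moment into a value for the from argument. The sketch below formats a time at either granularity, converting to UTC first. The function name and the "day"/"second" labels are invented for the example.

```python
from datetime import datetime, timezone


def oai_datestamp(moment, granularity="day"):
    """Format a timezone-aware datetime as an OAI-PMH datestamp in UTC."""
    utc = moment.astimezone(timezone.utc)
    if granularity == "day":
        return utc.strftime("%Y-%m-%d")
    return utc.strftime("%Y-%m-%dT%H:%M:%SZ")


last_visit = datetime(2011, 11, 13, 2, 1, 52, tzinfo=timezone.utc)
print(oai_datestamp(last_visit))             # 2011-11-13
print(oai_datestamp(last_visit, "second"))   # 2011-11-13T02:01:52Z
```

The result would be passed as the from argument of the next ListRecords or ListIdentifiers request, using whichever granularity the repository reports.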
The OAI-PMH verbs
• Each requests a specific response from a
data repository
Identify
• Function: Description of the archive
• Example: http://www.language-archives.org/cgi-bin/olaca3.pl?verb=Identify
• Parameters: none
• Errors/exceptions:
– badArgument (there should not be any)
• Response format:
Element            Example                              Ordinality ‡
repositoryName     My Archive                           1
baseURL            http://archive.org/oai               1
protocolVersion    2.0                                  1
earliestDatestamp  1999-01-01                           1
deletedRecord      no, transient, persistent            1
granularity        YYYY-MM-DD, YYYY-MM-DDThh:mm:ssZ     1
adminEmail         [email protected]                    +
compression        deflate, compress                    *
description        oai-identifier, eprints, friends, …  *
‡ Ordinality: 1 = mandatory, 1 only; + = mandatory, 1 or more; * = optional, 0 or more
Actual response from
http://www.language-archives.org/cgi-bin/olaca3.pl?verb=Identify
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/
http://www.openarchives.org/OAI/2.0/OAI-PMH.xsd">
<responseDate>2011-11-13T02:01:52Z</responseDate>
<request verb="Identify">http://www.language-archives.org/cgi-bin/olaca3.pl</request>
<Identify>
<repositoryName>OLAC Aggregator</repositoryName>
<baseURL>http://www.language-archives.org/cgi-bin/olaca3.pl</baseURL>
<protocolVersion>2.0</protocolVersion>
<adminEmail>[email protected]</adminEmail>
<earliestDatestamp>1900-01-01</earliestDatestamp>
<deletedRecord>no</deletedRecord>
<granularity>YYYY-MM-DD</granularity>
<!-- maybe later <compression>identity</compression> -->
<description>
<oai-identifier xmlns="http://www.openarchives.org/OAI/2.0/oai-identifier"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai-identifier
http://www.openarchives.org/OAI/2.0/oai-identifier.xsd">
<scheme>oai</scheme>
<repositoryIdentifier>OLACA.language-archives.org</repositoryIdentifier>
<delimiter>:</delimiter>
<sampleIdentifier>oai:ethnologue.com:aaa</sampleIdentifier>
</oai-identifier>
</description>
<description>
<olac-archive xmlns="http://www.language-archives.org/OLAC/1.1/olac-archive"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" type="institutional"
xsi:schemaLocation="http://www.language-archives.org/OLAC/1.1/olac-archive
http://www.language-archives.org/OLAC/1.1/olac-archive.xsd" currentAsOf="2011-10-31">
<archiveURL>http://www.language-archives.org/archive_records/</archiveURL>
<participant name="Steven Bird" role="Curator" email="[email protected]"/>
<participant name="Gary Simons" role="Curator" email="[email protected]"/>
<participant name="Haejoong Lee" role="Administrator"
email="[email protected]"/>
<institution>Open Language Archives Community</institution>
<institutionURL>http://www.language-archives.org/</institutionURL>
<shortLocation>Philadelphia, U.S.A.</shortLocation>
<location/>
<synopsis>
This repository contains all records from OLAC-registered archives. It is intended to be
used by services which do not want to harvest individual OLAC archives.
</synopsis>
<access>
Metadata may be used only subject to the access permissions given by the individual
archives.
</access>
</olac-archive>
</description>
</Identify>
</OAI-PMH>
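A harvester would normally read such a response with an XML parser instead of by eye. The sketch below uses Python's standard xml.etree.ElementTree on a shortened copy of the response above. Every element sits in the OAI-PMH 2.0 namespace, so lookups need a namespace map. The function name is made up for the example.

```python
import xml.etree.ElementTree as ET

# Namespace declared on the root element of every OAI-PMH 2.0 response.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}

# Shortened copy of the Identify response shown above.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <responseDate>2011-11-13T02:01:52Z</responseDate>
  <request verb="Identify">http://www.language-archives.org/cgi-bin/olaca3.pl</request>
  <Identify>
    <repositoryName>OLAC Aggregator</repositoryName>
    <baseURL>http://www.language-archives.org/cgi-bin/olaca3.pl</baseURL>
    <protocolVersion>2.0</protocolVersion>
    <earliestDatestamp>1900-01-01</earliestDatestamp>
    <deletedRecord>no</deletedRecord>
    <granularity>YYYY-MM-DD</granularity>
  </Identify>
</OAI-PMH>"""


def summarize_identify(xml_text):
    """Return selected fields of an Identify response as a dictionary."""
    root = ET.fromstring(xml_text)
    identify = root.find("oai:Identify", OAI_NS)
    fields = ("repositoryName", "protocolVersion", "earliestDatestamp",
              "deletedRecord", "granularity")
    return {name: identify.findtext(f"oai:{name}", namespaces=OAI_NS)
            for name in fields}


for key, value in summarize_identify(SAMPLE_RESPONSE).items():
    print(f"{key}: {value}")
```

The granularity and earliestDatestamp values tell a harvester how to phrase later from and until arguments.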
ListMetadataFormats
• Function: retrieve available metadata formats
from archive
• Example: archive.org/oai-script?verb=ListMetadataFormats&identifier=oai:HUBerlin.de:3000218
• Parameters: identifier (optional)
• Errors/exceptions:
– badArgument
– idDoesNotExist
– noMetadataFormats
Response to http://www.language-archives.org/cgi-bin/olaca3.pl?verb=ListMetadataFormats
<OAI-PMH xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/
http://www.openarchives.org/OAI/2.0/OAI-PMH.xsd">
<responseDate>2006-10-17T01:58:06Z</responseDate>
<request verb="ListMetadataFormats">http://www.language-archives.org/cgi-bin/olaca3.pl</request>
<ListMetadataFormats>
<metadataFormat>
<metadataPrefix>olac</metadataPrefix>
<schema>http://www.language-archives.org/OLAC/1.0/olac.xsd</schema>
<metadataNamespace>http://www.language-archives.org/OLAC/1.0/</metadataNamespace>
</metadataFormat>
<metadataFormat>
<metadataPrefix>olac_display</metadataPrefix>
<schema>http://www.language-archives.org/OLAC/1.0/olac.xsd</schema>
<metadataNamespace>http://www.language-archives.org/OLAC/1.0/</metadataNamespace>
</metadataFormat>
<metadataFormat>
<metadataPrefix>oai_dc</metadataPrefix>
<schema>http://www.openarchives.org/OAI/2.0/oai_dc.xsd</schema>
<metadataNamespace>http://www.openarchives.org/OAI/2.0/oai_dc/</metadataNamespace>
</metadataFormat>
</ListMetadataFormats>
</OAI-PMH>
ListSets
• Function: retrieve set structure of a repository
• Example: archive.org/oai-script?verb=ListSets
• Parameters: resumptionToken (exclusive)
• Errors/exceptions:
– badArgument
– badResumptionToken
– noSetHierarchy
Sets are optional and are used to divide a
repository into separate units that will be of
interest to different harvesters.
ListIdentifiers
• Function: abbreviated form of ListRecords, retrieve only
headers
• Example: archive.org/oai-script?verb=ListIdentifiers&metadataPrefix=oai_dc&from=2002-12-01
• Parameters:
– from (optional)
– until (optional)
– metadataPrefix (required)
– set (optional)
– resumptionToken (exclusive)
• Errors/exceptions:
– badArgument
– badResumptionToken
– cannotDisseminateFormat
– noRecordsMatch
– noSetHierarchy
ListRecords
• Function: harvest records from a repository
• Example: archive.org/oai-script?verb=ListRecords&metadataPrefix=oai_dc&set=biology
• Parameters:
– from (optional)
– until (optional)
– metadataPrefix (required)
– set (optional)
– resumptionToken (exclusive)
• Errors/exceptions:
– badArgument
– badResumptionToken
– cannotDisseminateFormat
– noRecordsMatch
– noSetHierarchy
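The resumptionToken argument is marked exclusive because, when a repository splits a long list over several responses, the follow-up request carries only the verb and the token. A minimal harvesting loop might look like the sketch below. The fetch parameter is a stand-in for whatever sends the HTTP request and returns the response text, and the function name is made up for the example.

```python
import xml.etree.ElementTree as ET

# Every OAI-PMH 2.0 element lives in this namespace.
OAI = "{http://www.openarchives.org/OAI/2.0/}"


def harvest_identifiers(fetch, metadata_prefix):
    """Collect record identifiers from ListRecords, following resumption tokens.

    `fetch` is a caller-supplied function that takes a dict of request
    arguments and returns the XML response as text.
    """
    arguments = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    identifiers = []
    while True:
        root = ET.fromstring(fetch(arguments))
        for header in root.iter(f"{OAI}header"):
            identifiers.append(header.findtext(f"{OAI}identifier"))
        token = (root.findtext(f".//{OAI}resumptionToken") or "").strip()
        if not token:
            break   # a missing or empty token means the list is complete
        # Exclusive argument: the token replaces all other arguments.
        arguments = {"verb": "ListRecords", "resumptionToken": token}
    return identifiers
```

In practice fetch could wrap urllib.request.urlopen, and a test can pass a function that returns canned XML pages.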
GetRecord
• Function: retrieve an individual metadata record
from a repository
• Example:
archive.org/oai-script?verb=GetRecord&identifier=oai:HUBerlin.de:3000218&metadataPrefix=oai_dc
• Parameters:
– identifier (required)
– metadataPrefix (required)
• Errors/exceptions:
– badArgument
– cannotDisseminateFormat
– idDoesNotExist
Interoperability
• The goal: communication, without human
intervention, between information sources
– Books that “talk to each other”
• Live links for references
• Knowledge of how to find relevant resources
when needed
• Ability to query other information locations
Protocols
• Precise rules for interactions between
independent processes
– Format of the messages
• Both structure and content
– Specified behavior in response to specific messages
• Many ways to accomplish the same result, but
both sides must have the same understanding
of the rules of engagement.
Spot Check
• Make up a protocol
• Suppose we wanted a kind of command and
control protocol so that a master site could
cause a satellite site to clear the screen that is
displayed to the web.
• We want the response to be prompt
• We want the satellite site to confirm receipt of
the command and to notify the master when
the site screen has been cleared.
• It should be possible to accomplish this with
messages between the two sites and an action
at the satellite site.
The Semantic Web
• Some of these slides come from Lee
Giles
– Who, in turn, credits Jim Hendler, Carl
Lagoze, Jayavel Shanmugasundaram, Sara
Cohen, Jonathan Mamou, Yaron Kanza,
Mark Sapossnek, Yehoshua Sagiv, Frank van
Harmelen
Beyond XML
• Building with XML, new languages have
emerged to
– Describe content, and things in general
– Relationships between things
– Attributes (characteristics) of things
• The semantic web requires that things
be described in sufficient detail that
autonomous processes can discover
useful things and use them properly
Motivation for the Semantic Web
• Search engines
– concepts, not keywords
– semantic narrowing/widening of queries
• Shopbots
– semantic interchange, not screenscraping
• E-commerce
– Negotiation, catalogue mapping, personalization
• Web Services
– Need semantic characterizations to find them
• Navigation
– by semantic proximity, not hardwired links
• .....
Example
• Try these queries with Google:
– Distance between Paris and Madrid
• Google returns:
Distance between Madrid spain and Paris france
www.mapcrow.info/Distance_between_Madrid_SP_and_Paris_FR.html
COORDINATES +. TOTAL DISTANCE. Madrid, SP, -3.6833 40.4000. Paris, FR, 2.3333
48.8667. Miles: 654.57. Kilometers: 1053.40. Bearing: NE. Madrid, SPAIN ...
– (The) Largest city of France
• Google returns: France – Largest City: Paris
– (The) Largest city of Spain
• Google returns: Spain – Largest City: Madrid
• Now, try these with Google:
– Distance between largest city of France and largest city of Spain
– Distance between “largest city of France” and “largest city of
Spain”
– And worst, Distance between “the largest city of France” and
“the largest city of Spain” – No result returned by Google!
• Actually now shows a link to several versions of these slides!
Semantic Web Stack
http://www.w3.org/DesignIssues/diagrams/sw-stack-2005.png
RDF and OWL
• Resource Description Framework (RDF)
• Web Ontology Language (OWL)
So why not just use XML?
• No agreement on:
– structure
• is country an object? a class? an attribute? a relation? something else?
• what does nesting mean?
– vocabulary
• is country the same as nation?
<country name="Netherlands">
<capital name="Amsterdam">
<areacode>020</areacode>
</capital>
</country>
<nation>
<name>Netherlands</name>
<capital>Amsterdam</capital>
<capital_areacode>
020
</capital_areacode>
</nation>
• Are the above XML documents the same?
• Do they convey the same information?
• Is that information machine-accessible?
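To see why shared syntax alone is not enough, consider what a program needs in order to read the capital and area code from each of the two documents. The sketch below uses Python's standard xml.etree.ElementTree, and the function names are made up for the example. Each document needs its own hand-written extraction code, because nothing in the XML says that country and nation mean the same thing.

```python
import xml.etree.ElementTree as ET

COUNTRY_DOC = """<country name="Netherlands">
  <capital name="Amsterdam">
    <areacode>020</areacode>
  </capital>
</country>"""

NATION_DOC = """<nation>
  <name>Netherlands</name>
  <capital>Amsterdam</capital>
  <capital_areacode>
    020
  </capital_areacode>
</nation>"""


def facts_from_country(xml_text):
    """Names are attributes; the area code is nested inside the capital."""
    root = ET.fromstring(xml_text)
    capital = root.find("capital")
    return (root.get("name"), capital.get("name"),
            capital.findtext("areacode").strip())


def facts_from_nation(xml_text):
    """Everything is a child element of the root."""
    root = ET.fromstring(xml_text)
    return (root.findtext("name").strip(), root.findtext("capital").strip(),
            root.findtext("capital_areacode").strip())


print(facts_from_country(COUNTRY_DOC))
print(facts_from_country(COUNTRY_DOC) == facts_from_nation(NATION_DOC))   # True
```

A person wrote both functions knowing what the tags mean. A program that had only the XML could not tell that the two documents carry the same facts.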
“2nd aim of Semantic Web”:
Data integration
– Unstructured and semi-structured sources (document
collections, message traffic, web pages, ...)
– Sensors, programs, services
– Structured data without an explicit data
schema (non-local databases, data tables,
charts and reports, ...)
– Non-Text collections (image, video, sound, ...)
– Streams of data
Must specify the structure of data resources..
2nd aim of Semantic Web:
Data integration
... so a processor can tell how the
"attributes" and "values" are related
– What is required vs. optional?
– How many values for a particular attribute?
– What attributes are keys for other
attributes?
– Which attributes are necessarily related to
other attributes and in what way?
– How do the attributes (and values) in one
data source map to attributes and values
describing another source?
Stack of languages
• XML:
– Surface syntax, no semantics
• XML Schema:
– Describes structure of XML documents
• RDF:
– Datamodel for “relations” between “things”
• RDF Schema (RDFS):
– RDF Vocabulary Definition Language
• OWL:
– A more expressive
Vocabulary Definition Language
Semantic web languages today
• Today there are three semantic web
languages
– RDF – Resource Description Framework
http://www.w3.org/RDF/
– DAML+OIL – DARPA Agent Markup Language
http://www.daml.org/ (deprecated)
– OWL – Web Ontology Language
http://www.w3.org/2001/sw/
• OWL Lite
• OWL DL
• OWL Full
RDF is the first Semantic Web
language
• RDF is a simple language for building graph-based
representations
• Three views of the same RDF data model:
– Graph: good for human viewing
– XML encoding: good for machine processing
<rdf:RDF ……..>
<….>
<….>
</rdf:RDF>
– Triples: good for reasoning
stmt(docInst, rdf_type, Document)
stmt(personInst, rdf_type, Person)
stmt(inroomInst, rdf_type, InRoom)
stmt(personInst, holding, docInst)
stmt(inroomInst, person, personInst)
The RDF Data Model
• An RDF document is an unordered collection of
statements, each with a subject, predicate and object (aka
triples)
• A triple can be thought of as a labelled arc in a graph
• Statements describe properties of web resources
• A resource is any object that can be pointed to by a URI:
– a document, a picture, a paragraph on the Web, …
– E.g., http://umbc.edu/~ypeng/F07671.html
– a book in the library, a real person (?)
– isbn://5031-4444-3333
– …
• Properties themselves are also resources (URIs)
RDF without a Schema
• Object -> Attribute -> Value triples
– Example: pers05 -> Author-of -> ISBN...
• Objects are web resources
• A value is again an object:
– triples can be linked
– data model = graph
– Example: pers05 -> Author-of -> ISBN... -> Publby -> MIT
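The linked triples can be held in a plain set of tuples and followed like edges of a graph. This is a toy sketch, not a real RDF library: the node and property labels are copied from the slide as written (including the elided "ISBN..."), and the function names are invented for the example.

```python
# Each triple is (object, attribute, value); labels are taken from the slide.
TRIPLES = {
    ("pers05", "Author-of", "ISBN..."),
    ("ISBN...", "Publby", "MIT"),
}


def values(triples, subject, attribute):
    """Return every value linked to `subject` by `attribute`."""
    return {v for s, a, v in triples if s == subject and a == attribute}


def publishers_of_author(triples, person):
    """Follow two linked triples: person -> book -> publisher."""
    return {publisher
            for book in values(triples, person, "Author-of")
            for publisher in values(triples, book, "Publby")}


print(publishers_of_author(TRIPLES, "pers05"))   # {'MIT'}
```

The second hop works only because the value of the first triple is itself an object that other triples can describe.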
What does RDF Schema add?
• Defines vocabulary for RDF
• Organizes this vocabulary in a
typed hierarchy
– Class, subClassOf, type
– Property, subPropertyOf
– domain, range
• Example graph:
– Author subClassOf Person
– Reader subClassOf Person
– communicatesTo has domain Author and range Reader
– Frank type Author
– Lynda type Reader
– Frank communicatesTo Lynda
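The schema vocabulary lets a program derive facts that were never stated. The sketch below implements a small, simplified subset of RDF Schema reasoning (domain, range, and subClassOf only) over plain tuples. It assumes the usual reading of the example: Author and Reader are subclasses of Person, and communicatesTo has domain Author and range Reader. The function name is made up for the example.

```python
# Schema statements, assuming the usual reading of the example graph.
SCHEMA = {
    ("Author", "subClassOf", "Person"),
    ("Reader", "subClassOf", "Person"),
    ("communicatesTo", "domain", "Author"),
    ("communicatesTo", "range", "Reader"),
}

# The only stated fact about individuals.
DATA = {("Frank", "communicatesTo", "Lynda")}


def infer_types(schema, data):
    """Return the set of (resource, class) pairs that follow from the data."""
    types = {(s, o) for s, p, o in data if p == "type"}
    # domain types the subject of a property; range types its value.
    for s, p, o in data:
        for prop, kind, cls in schema:
            if prop == p and kind == "domain":
                types.add((s, cls))
            elif prop == p and kind == "range":
                types.add((o, cls))
    # subClassOf: an instance of a class is an instance of its superclasses.
    changed = True
    while changed:
        changed = False
        for resource, cls in list(types):
            for sub, kind, sup in schema:
                if kind == "subClassOf" and sub == cls and (resource, sup) not in types:
                    types.add((resource, sup))
                    changed = True
    return types


for resource, cls in sorted(infer_types(SCHEMA, DATA)):
    print(f"{resource} type {cls}")
```

From the single statement that Frank communicatesTo Lynda, the program concludes that Frank is an Author, Lynda is a Reader, and both are Persons.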
Which Semantic Web?
• Version 1:
"Semantic Web as Web of Data" (TBL)
• recipe:
expose databases on the web,
use XML, RDF, integrate
• metadata from:
– expressing DB schema semantics
in machine interpretable ways
• enable integration and unexpected re-use
Which Semantic Web?
• Version 2:
“Enrichment of the current Web”
• recipe:
Annotate, classify, index
• metadata from:
– automatically producing markup:
named-entity recognition,
concept extraction, tagging, etc.
• enable personalization, search, browse,..
Which Semantic Web?
• Version 1:
“Semantic Web as Web of Data”
• Version 2:
“Enrichment of the current Web”
→ Different use-cases
→ Different techniques
→ Different users
The Evolving Web
[Timeline diagram, from the base upward]
• 1990: Foundation of the Current Web (DOCUMENTS)
– HyperText Transfer Protocol
– HyperText Markup Language
• 2000: Self-Describing Documents
– eXtensible Markup Language
– Resource Description Framework
• 2010: Web of Knowledge (DATA/PROGRAMS)
– Proof, Logic and Ontology Languages
– Shared terms/terminology
– Machine-Machine communication
Berners-Lee, Hendler; Nature, 2001