Transcript Slide 1
Towards Semantically-Interlinked
Online Communities
John G. Breslin, Andreas Harth, Uldis Bojars, and Stefan Decker
Digital Enterprise Research Institute (DERI)
Galway, Ireland
2008. 11. 26
Summarized and presented by Seungseok Kang
Introduction
• SIOC (pronounced “shock”)
– Semantically-Interlinked Online Communities
– Interconnect online communities on the web
• Bulletin boards
• Weblogs
• Mailing lists
– Facilitate the location of related and relevant information
• Cross-site querying
• Topic-related searches
• Importing of SIOC data into other systems
• By the way, why do we discuss SIOC?
– It’s an example of Semantic Web 2.0
• Semantic Web efforts are mainly directed towards interlinking applications
• Web 2.0 is about providing user applications
– These are not mutually exclusive!
– Successful example?
What is Web 2.0?
• http://en.wikipedia.org/wiki/Web_2.0
– “Web 2.0 … has … come to refer to what some people describe
as a second phase of architecture and application development
for the World Wide Web.”
• Web 2.0 focuses include:
– The Web as a platform for social and collaborative exchange
– Reusable community contributions
– Subscriptions to information, news, data flows, services
– Mass-publishing using web-based social software
For Communication and Collaboration
What is Semantic Web?
• Tim Berners-Lee (2001)
– “An extension of the current Web in which information is given
well-defined meaning, better enabling computers and people to
work in cooperation.”
– Allowing the Web to reach its full potential
– The next generation of the Web
W3C’s Semantic Web Stack
Commonly quoted problems
• Ontologies are difficult to create and are not used
– Not worth the effort
• Annotation is expensive
– Regular user won’t bother
• Metadata provides no benefits
– No consumers
• Standards are too complicated
– Developers don’t understand Description Logics
One of the Answers is…
• Metaweb
– Combination of Web 2.0 and Semantic Web
– Social semantic information spaces
Source : John Breslin, Stefan Decker, “Semantic Web 2.0: Creating Social Semantic Information Spaces”, 2006, DERI
Social Semantic Information Spaces
Creating Semantic Web Data:
Semantic Interlinking of Online Community Sites
Source : Stefan Decker’s talk, “Towards Semantic Web 2.0 – Creating Social Semantic Information Spaces”, 2006, DERI
Evolution of Online Communities
• Online community sites:
– Provide a valuable source of information
– May contain rich meta-information
– But are isolated from one another:
• Many sites discussing complementary topics
• Next steps:
– Connect sites together
– Add more value:
• Let other sites know more about the structure and contents
• Make more use of tagging and semantic metadata
• So, we need SIOC!
What is SIOC?
• Semantically-Interlinked Online Communities
– Connecting forums and posts from many types of online communities
(blogs, forums, mailing lists, etc.)
– Interesting possibilities:
• Distributed linked conversations
• Decentralised discussion channels and communities
• Typical Usage Scenario
– A user is searching for information on installing broadband on a
Linux-based PC
• Post A discusses local ISPs on a bulletin board
• Post B compares broadband models on Usenet
• Post C details how to install broadband on Linux in a mailing list post
– Previously the user had to traverse three sites to find the relevant
information
– By using SIOC, a search for broadband on the bulletin board will also
yield the relevant text from the interlinked Usenet and mailing list posts
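The usage scenario above can be sketched in a few lines of Python. This is only an illustration under assumptions: the `Post` record, the site labels, and the `related` field are hypothetical stand-ins for interlinked SIOC data, not part of the SIOC vocabulary.

```python
from dataclasses import dataclass, field


@dataclass
class Post:
    post_id: str
    site: str                                    # hypothetical site label
    text: str
    related: list = field(default_factory=list)  # ids of interlinked posts


def search(posts, site, keyword):
    """Return ids of matching posts on `site`, followed by posts interlinked with them."""
    by_id = {p.post_id: p for p in posts}
    hits = [p for p in posts
            if p.site == site and keyword.lower() in p.text.lower()]
    result = []
    for hit in hits:
        for pid in [hit.post_id, *hit.related]:
            if pid in by_id and pid not in result:
                result.append(pid)
    return result


posts = [
    Post("A", "bulletin-board", "Local ISPs offering broadband", related=["B", "C"]),
    Post("B", "usenet", "Comparison of broadband models"),
    Post("C", "mailing-list", "How to install broadband on Linux"),
]

# One search on the bulletin board also surfaces the interlinked posts.
print(search(posts, "bulletin-board", "broadband"))  # ['A', 'B', 'C']
```

Without the `related` links, the same search would return only post A, and the user would have to visit the other two sites separately.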
Challenges of SIOC
• Adoption of SIOC by community sites
– By using concepts that can be easily understood by site administrators
– By providing properties that are automatically created by end users
• How best to use SIOC with existing ontologies
– By mapping and interfacing to commonly-used ontologies
• Dublin Core
• FOAF
• How SIOC will scale
– Keep the scaling challenge in mind (……)
Source : Stefan Decker’s talk, “Towards Semantic Web 2.0 – Creating Social Semantic Information Spaces”, 2006, DERI
SIOC Ontology
• Main concepts in online communities
SIOC Ontology (cont’d)
• Main classes
– Site
• Location of an online
community
– Forum
• A channel or discussion area
on which posts are made
– Post
• An article or message
posted by a user to a forum
– Event
• A virtual or real-world event
with a single or multiple
participants
– Group
• A set of members or users
of a community site who
have a common role
– User
• A person who is a member
of an online community
Overview of classes and properties used in SIOC
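As a rough illustration of how the main classes relate, the sketch below models four of them as Python dataclasses. The field names (`url`, `forums`, `posts`, `creator`) are assumptions chosen for readability rather than actual SIOC property names, and Event and Group are left out for brevity.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class User:
    """A person who is a member of an online community."""
    name: str


@dataclass
class Post:
    """An article or message posted by a user to a forum."""
    title: str
    creator: Optional[User] = None


@dataclass
class Forum:
    """A channel or discussion area on which posts are made."""
    name: str
    posts: List[Post] = field(default_factory=list)


@dataclass
class Site:
    """Location of an online community."""
    url: str
    forums: List[Forum] = field(default_factory=list)


alice = User("alice")
forum = Forum("linux-help", [Post("Broadband on Linux", alice)])
site = Site("http://example.org/", [forum])

# Walk the containment hierarchy: Site -> Forum -> Post.
titles = [p.title for f in site.forums for p in f.posts]
print(titles)  # ['Broadband on Linux']
```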
SIOC Ontology (cont’d)
• Important properties
– Topic
• Assigning topics to SIOC primitives
• Applies to most of the concepts defined
– Views
• The number of times a particular post or user profile has been viewed
– Has_sibling
• Copied posts from one forum to another relevant forum
• Non-identical twins that share most characteristics but differ in some
manner
– Closed
• The date and time that the post or forum was closed
– Has_creator
• Links a post to the user profile of its author
• Used to follow the link from a post to its creator and locate the
other posts by the same person
– Knows
• Showing the structure of social networks inside the community sites
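To make the role of these properties concrete, here is a minimal sketch that stores statements as (subject, property, object) triples and follows has_creator from a post to its author and back to the author's other posts. The identifiers (`post1`, `alice`, and so on) and the helper functions are made up for illustration.

```python
# Each statement is a (subject, property, object) triple; all identifiers are made up.
TRIPLES = [
    ("post1", "has_creator", "alice"),
    ("post2", "has_creator", "bob"),
    ("post3", "has_creator", "alice"),
    ("post1", "has_sibling", "post3"),
    ("alice", "knows", "bob"),
]


def objects(subject, prop):
    """All objects o such that (subject, prop, o) is stated."""
    return [o for s, p, o in TRIPLES if s == subject and p == prop]


def subjects(prop, obj):
    """All subjects s such that (s, prop, obj) is stated."""
    return [s for s, p, o in TRIPLES if p == prop and o == obj]


def other_posts_by_same_author(post):
    """Follow has_creator to the author, then back to that author's other posts."""
    result = []
    for creator in objects(post, "has_creator"):
        result.extend(s for s in subjects("has_creator", creator) if s != post)
    return result


print(other_posts_by_same_author("post1"))  # ['post3']
print(objects("alice", "knows"))            # ['bob']
```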
Mapping
• For exchanging community instance data
– Leveraging the instance data that is already available
– SIOC provides mappings in RDFS and OWL
• To allow the import and export of SIOC instances in different
vocabularies
Selected SIOC mappings
• Mappings in SIOC are not restricted to ontologies
– Mapping from XML documents, RSS format, or Email into the
SIOC ontology using XSL stylesheets
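The slide describes XSL stylesheets for this kind of mapping. As a stand-in, the sketch below performs a comparable RSS-to-post mapping with Python's standard `xml.etree.ElementTree` instead of XSL. The sample feed and the output keys (`forum`, `title`, `link`, `content`) are assumptions for illustration, not actual SIOC property names.

```python
import xml.etree.ElementTree as ET

# A hypothetical RSS 2.0 feed with a single item.
RSS = """<rss version="2.0"><channel>
  <title>Example blog</title>
  <item>
    <title>Installing broadband on Linux</title>
    <link>http://example.org/post/1</link>
    <description>Step-by-step notes.</description>
  </item>
</channel></rss>"""


def rss_to_posts(rss_text):
    """Map each RSS <item> to a post-like record; the channel plays the role of the forum."""
    channel = ET.fromstring(rss_text).find("channel")
    forum = channel.findtext("title")
    posts = []
    for item in channel.findall("item"):
        posts.append({
            "forum": forum,
            "title": item.findtext("title"),
            "link": item.findtext("link"),
            "content": item.findtext("description"),
        })
    return posts


for post in rss_to_posts(RSS):
    print(post["forum"], "|", post["title"], "|", post["link"])
```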
Exchanging Instances
• Core use of SIOC
– Exchange of instance data between sites
– Wrappers to existing tools
• Legacy systems
• Web-based systems
– Document-based wrapping
• Mirror data in RDF store
– Disseminating solution for newly-developed applications
• Native RDF store
How can SIOC be disseminated?
• Legacy systems
– Email, IRC, Usenet, and so on…
– Need to employ protocol wrappers for legacy protocols to HTTP
– Example : Email wrapper
• Export
– Accepts a conjunctive query over HTTP GET and returns the results in
SIOC
– The query is parsed and translated into IMAP4 to send to the original
data store
– Original data store returns the result in RFC822 format
– The result is translated back into RDF and returned to the original caller via HTTP
• Import
– Receives sioc:Posts via HTTP PUT
– Posts are translated into the RFC822 format
– Wrapper returns a status code indicating that the addition of data was
completed correctly
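The translation steps of the email wrapper can be sketched with Python's standard `email` package. This covers only the RFC822-to-post and post-to-RFC822 conversions. The HTTP and IMAP4 legs are omitted, and the dictionary keys (`title`, `creator`, `created`, `content`) are illustrative assumptions rather than the wrapper's real output format.

```python
from email import message_from_string
from email.message import EmailMessage

# A hypothetical message as it might come back from the mail store.
RAW = (
    "From: alice@example.org\n"
    "Subject: Broadband on Linux\n"
    "Date: Wed, 26 Nov 2008 10:00:00 +0000\n"
    "\n"
    "Use the pppoe package.\n"
)


def rfc822_to_post(raw):
    """Export direction: turn an RFC822 message into a post-like record."""
    msg = message_from_string(raw)
    return {
        "title": msg["Subject"],
        "creator": msg["From"],
        "created": msg["Date"],
        "content": msg.get_payload().strip(),
    }


def post_to_rfc822(post):
    """Import direction: turn a post-like record back into an RFC822 message."""
    msg = EmailMessage()
    msg["Subject"] = post["title"]
    msg["From"] = post["creator"]
    msg.set_content(post["content"])
    return msg.as_string()


post = rfc822_to_post(RAW)
print(post["title"], "by", post["creator"])
print(post_to_rfc822(post))
```

A real wrapper would also need to handle multipart messages and character encodings, which this sketch ignores.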
How can SIOC be disseminated? (cont’d)
• Web-based systems
– Bulletin boards, weblogs, social networking sites, and so on…
– Create SIOC export modules for popular open-source discussion
systems
– Infecting the Web Infrastructure:
• During the next upgrade cycle, gigabytes of community data
become available
– Initial versions of SIOC metadata exporters created for:
• Content management system (Drupal)
– http://sioc-project.org/drupal
• Bulletin board system (phpBB) [in progress]
• Blogging system (WordPress)
– http://sioc-project.org/wordpress
• French blogging system (DotClear)
– http://sioc-project.org/dotclear
How can SIOC be disseminated? (cont’d)
• Sample SIOC exports from WordPress
How can SIOC be disseminated? (cont’d)
• Mirror data in RDF store
– Allowing querying of the information that sites publish in flat
files
– Replicate the information in a data store that can process
queries
– Queries are answered from the replica
– The replica is updated either by a crawler or by the original site
• Native RDF store
– For newly architected sites that can make use of a native RDF
repository to store their data
• Jena2, Sesame, Redland, and so on
– Just use their APIs for importing and exporting
straightforwardly!
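The mirroring idea can be illustrated with a toy in-memory replica. This is a sketch under assumptions: the `Replica` class and its methods are hypothetical and do not reflect the APIs of Jena2, Sesame, or Redland.

```python
class Replica:
    """A toy mirror of the triples that sites publish in flat files."""

    def __init__(self):
        self.triples = set()

    def update(self, site_triples):
        """Called by a crawler, or pushed by the original site."""
        self.triples |= set(site_triples)

    def query(self, s=None, p=None, o=None):
        """Return triples matching the pattern; None acts as a wildcard."""
        return sorted(
            t for t in self.triples
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)
        )


replica = Replica()
# First crawl of a hypothetical site.
replica.update([
    ("post1", "topic", "broadband"),
    ("post1", "has_creator", "alice"),
])
# A later crawl adds a new post; re-crawled triples are not duplicated.
replica.update([
    ("post1", "topic", "broadband"),
    ("post2", "topic", "broadband"),
])

# Queries are answered from the replica, not from the original site.
print(replica.query(p="topic", o="broadband"))
```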
Using SIOC Data
• Then, how can SIOC data be used?
Source : Stefan Decker’s talk, “Towards Semantic Web 2.0 – Creating Social Semantic Information Spaces”, 2006, DERI
Browsing
• Navigating the SIOC Data
– Use a mapping from SIOC to a data format
– Use existing RDF browser
Querying
• Posing structural queries against the collected data
– Until now, SIOC has considered querying one community site in isolation
– With an infrastructure for distributing queries in place, queries can be
routed across similar community sites under different site management
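A structural query of this kind can be sketched as a conjunctive query over triples, where terms starting with "?" are variables. The data, the property names as plain strings, and the two helper functions are assumptions for illustration. A real deployment would more likely use an RDF query language.

```python
# Hypothetical collected data as (subject, property, object) triples.
TRIPLES = [
    ("post1", "topic", "broadband"),
    ("post1", "has_creator", "alice"),
    ("post2", "topic", "broadband"),
    ("post2", "has_creator", "bob"),
    ("post3", "topic", "linux"),
    ("post3", "has_creator", "alice"),
]


def match(pattern, triple, binding):
    """Try to extend `binding` so that `pattern` equals `triple`; return None on failure."""
    new = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # A variable must keep the same value everywhere it appears.
            if new.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return new


def conjunctive_query(patterns, triples):
    """Return all variable bindings that satisfy every pattern."""
    bindings = [{}]
    for pattern in patterns:
        extended = []
        for binding in bindings:
            for triple in triples:
                candidate = match(pattern, triple, binding)
                if candidate is not None:
                    extended.append(candidate)
        bindings = extended
    return bindings


# "Which posts are about broadband, and who wrote them?"
query = [("?post", "topic", "broadband"), ("?post", "has_creator", "?who")]
for row in conjunctive_query(query, TRIPLES):
    print(row["?post"], row["?who"])
```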
Querying (cont’d)
• An example of SIOC querying
Locating Related Information
• For preparing the data in advance at creation time of a
post
• SIOC queries the network of community sites to find
related posts when
– New post is created in a community site
– SIOC information is available
• After the related resources are received
– Community site stores the resulting information using a related_to
property
– Information about the resources the article links to is also
extracted from a post body and stored in a links_to property
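The annotation step above can be sketched as follows. The regular expression, the `annotate` helper, and the dictionary layout are assumptions for illustration. Only the property names related_to and links_to come from the slide.

```python
import re

# A deliberately simple URL pattern; real link extraction would parse the markup.
URL_RE = re.compile(r"https?://[^\s<>\"']+")


def annotate(post, related):
    """Return a copy of `post` with links_to and related_to filled in at creation time."""
    annotated = dict(post)
    annotated["links_to"] = URL_RE.findall(post["content"])
    annotated["related_to"] = list(related)
    return annotated


new_post = {
    "title": "Broadband on Linux",
    "content": "See http://example.org/isp and http://example.net/modems for details.",
}
# `related` stands in for the results returned by querying other community sites.
related = ["http://usenet.example/post/B", "http://lists.example/post/C"]

annotated = annotate(new_post, related)
print(annotated["links_to"])
print(annotated["related_to"])
```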
Conclusion
• Presenting the SIOC ontology and mapping to and from
other vocabularies
• Providing an upgrade path that allows a gradual
migration from existing systems to semantically-enabled
sites
• Developing a prototype SIOC exporter and browser
• In the future…
– Exploiting intra-site and inter-site characteristics to guide
query routing in a P2P-like environment
Discussion
• Some technical problems
– We have to consider the types of SIOC exporter
• Naver blog’s service format differs from that of Tistory.com
• We need to build a separate SIOC exporter for the service format of each
original service type (whether or not they are all blog services)
• In addition, translating crawled data from the original sources into the
SIOC data format consumes a lot of resources
• At the bottom…
– SIOC is said to be one of the successful cases of merging Web
2.0 and the Semantic Web…
– The SIOC project was started in Oct. 2004
• It is still being developed now
– and in practice it still does not offer any attractive contributions to
social community services or blog service providers
– Where’s the problem?
• SIOC?
• Service vendors
• the vision of “Semantic Web 2.0”?
• Semantic Web?