Lecture8 - The University of Texas at Dallas
Knowledge Management,
Semantic Web and
Social Networking
Introduction to the
Semantic Web
Dr. Bhavani Thuraisingham
June 2010
3/22/2017 11:49
13-2
Outline of Part 1
0 Today’s web to tomorrow’s web
0 Semantic web
0 XML, RDF, Ontologies, OWL, Rules
0 Ontology Engineering
0 Vision
Semantic Web: Overview
0 According to Tim Berners-Lee, the Semantic Web supports
- Machine readable and understandable web pages
- Enterprise application integration
- Nodes and links that essentially form a very large
database
Premise:
Semantic Web Technologies = XML, RDF, Ontologies, Rules
Applications: Web Database Management, Web Services,
Information Integration
Today’s Web to Semantic web
0 Today’s web
- High recall, low precision: Searches return too many web
pages, many of them not relevant; Sometimes low recall
- Results sensitive to vocabulary: Different words, even if
they mean the same thing, do not result in the same web
pages; Results are single web pages, not linked web
pages
0 Semantic web
- Machine understandable web pages
- Activities on the web such as searching with little or no
human intervention
- Solutions to the problems faced by today’s web
- Retrieving appropriate web pages regardless of the vocabulary used
Knowledge Management and Personal Agents
0 Knowledge Management
- Corporate needs: Searching, extracting and maintaining information,
uncovering hidden dependencies, viewing information
- Semantic web for knowledge management
= Organizing knowledge, automated tools for maintaining knowledge,
question answering, querying multiple documents, controlling
access to documents
0 Agents
- John is the president of a company. He needs to have surgery. With
the current web he has to check each web page for relevant information
and make decisions depending on the information provided
- With the semantic web, the agent will retrieve all the relevant
information, synthesize the information, ask John if needed, and then
present the various options to John and make recommendations
E-commerce
0 Business to Consumer
- Users shopping on the web; wrapper technology is used to extract
information about user preferences etc. and display the products
- Use of semantic web: Develop software agents that can interpret privacy
requirements, pricing and product information and display timely and
correct information to the user; also provide information about the
reputation of shops
- Future: negotiation on behalf of the user
0 Business to Business
- Organizations work together and carry out transactions such as
collaborating on a product, supply chains etc. Today’s web lacks
standards for data exchange
- Use of semantic web: XML is a big improvement, but need to agree on
vocabulary. Future will be the use of ontologies to agree on meanings
and interpretations
Some aspects of semantic web
0 Explicit Metadata
- Metadata is data about data; Need metadata to be explicitly specified so
that different groups and organizations will know what is on the web
- Using metadata, one can then carry out various activities such as
searching, integration and executing actions
- Metadata specification languages include XML, RDF, OWL
0 Semantic web vs Artificial Intelligence
- Goal of Artificial Intelligence is to build an intelligent agent exhibiting
human-level intelligence; Goal of the semantic web is to assist the
humans in their day to day online activities
0 Logic and Reasoning
- Logic can be used to specify facts as well as rules; New facts are
derived from existing facts based on the inference rules; Description
Logic is the type of logic that has been developed for semantic web
applications
Layered Approach: Tim Berners Lee’s Vision
www.w3c.org
Semantic Web and Its Applications
[Figure: Tim Berners-Lee’s technology stack, listed top to bottom]
Applications: Web Services, Information Integration, Information Sharing
Logic, Proof and Trust
Rules/Query
RDF, Ontologies
XML, XML Schemas
URI, UNICODE
Layered Architecture for Dependable
Semantic Web at UTD
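0 Adapted from Tim Berners-Lee’s description of the Semantic Web
[Figure: layered stack, listed top to bottom, with SECURITY and PRIVACY
running vertically alongside all layers and Other Services shown beside
the stack]
Logic, Proof and Trust
Rules/Query
RDF, Ontologies
XML, XML Schemas
URI, UNICODE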
0 Some Challenges: Security and Privacy cut across all layers;
Integration of Services; Composability
What is XML all about?
0 XML is needed due to the limitations of HTML and
complexities of SGML
0 It is an extensible markup language specified by the W3C
(World Wide Web Consortium)
0 Designed to make the interchange of structured documents
over the Internet easier
0 Key to XML used to be Document Type Definitions (DTDs)
- Defines the role of each element of text in a formal model
0 XML schemas have now become critical to specify the
structure
- XML schemas are also XML documents
Example XML Document
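[Figure: tree diagram of an example XML document. Node labels, in the
order extracted: Year: 2002; Asset report; Assets; Dept; Patents;
Name: U. Of X; Equipment; Other assets; Funds; Patent; news; Name: CS;
Expenses; Contracts; Grants; ID, Author, title]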
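A tree like the one on this slide maps naturally onto nested XML elements. The sketch below is only an illustration in Python: the element names and the nesting are assumptions inferred from the node labels (the exact hierarchy of the original diagram is not recoverable here), and the only values used are the ones shown on the slide.

```python
import xml.etree.ElementTree as ET

# Hypothetical serialization of the asset report tree. Element names and
# nesting are assumed for illustration; they are not taken from the lecture.
ASSET_REPORT = """
<assetReport year="2002" name="U. Of X">
  <dept name="CS">
    <assets>
      <patents/>
      <equipment/>
      <otherAssets/>
    </assets>
    <funds>
      <contracts/>
      <grants/>
    </funds>
    <expenses/>
  </dept>
</assetReport>
"""

root = ET.fromstring(ASSET_REPORT)

# Attributes carry the labelled values; child elements carry the structure.
year = root.get("year")
dept_names = [dept.get("name") for dept in root.findall("dept")]
asset_kinds = [child.tag for child in root.find("dept/assets")]

print(year, dept_names, asset_kinds)
```

Because the document is well-formed, any XML parser can walk it without knowing what the tags mean, which is exactly the limitation the RDF slide addresses next.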
RDF
0 Resource Description Framework is the essence of the
semantic web
0 XML cannot be used to specify semantics
0 Example:
- Professor is a subclass of Academic Staff
- Professor inherits all properties of Academic Staff
0 RDF was specified so that the inadequacies of XML could be
handled; RDF uses XML Syntax
0 RDF Concepts
- Basic Model
= Resources, Properties and Statements
- Container Model
= Bag, Sequence and Alternative
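The basic model can be sketched with plain tuples: each statement is a (resource, property, value) triple, and the Professor example becomes a subclass statement from which new type statements follow. This is a minimal Python illustration, not an RDF library; the `ex:` names and the individual `ex:alice` are hypothetical, and only `rdf:type` and `rdfs:subClassOf` are standard vocabulary terms.

```python
# Statements as (resource, property, value) triples.
# The "ex:" identifiers are made up for this example.
triples = {
    ("ex:Professor", "rdfs:subClassOf", "ex:AcademicStaff"),
    ("ex:alice", "rdf:type", "ex:Professor"),
}


def infer_types(statements):
    """Apply the subclass rule until no new rdf:type statements appear."""
    inferred = set(statements)
    changed = True
    while changed:
        changed = False
        for (s, p, o) in list(inferred):
            if p != "rdf:type":
                continue
            for (sub, q, sup) in list(inferred):
                if q == "rdfs:subClassOf" and sub == o:
                    new = (s, "rdf:type", sup)
                    if new not in inferred:
                        inferred.add(new)
                        changed = True
    return inferred


closure = infer_types(triples)
print(("ex:alice", "rdf:type", "ex:AcademicStaff") in closure)
```

The derived statement says that anything typed as a Professor is also Academic Staff, which is the kind of meaning that XML syntax alone does not carry.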
Ontology
0 RDF has issues also
- Cannot express several other properties such as Union,
Intersection, relationships, etc
0 Need a richer language; Ontology languages were developed
by the semantic web community for this purpose
0 What are ontologies?
- Common definitions for any entity, person or thing
- Several ontologies have been defined and are available for
use
- Defining common ontology for an entity is a challenge
- Mappings have to be developed for multiple ontologies
- Specific languages have been developed for ontologies
OWL: Background
0 It’s a language for ontologies and relies on RDF
0 DARPA (Defense Advanced Research Projects Agency)
developed the early language DAML (DARPA Agent Markup
Language)
0 Europeans developed OIL (Ontology Inference Layer)
0 DAML+OIL combines both and was the starting point for OWL
0 OWL was developed by W3C
0 OWL Features
- Subclass relationship; Class membership; Equivalence of
classes
- Consistency checking (e.g., detecting the contradiction: x is
an instance of A, A is a subclass of B, x is not an instance of B)
- Three types of OWL: OWL-Full, OWL-DL, OWL-Lite
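The consistency example on this slide can be checked mechanically. The following Python sketch is an assumption-laden toy, not an OWL reasoner: the dictionaries and function names are hypothetical, and it handles only subclass links, positive class membership and explicit non-membership.

```python
def superclasses(cls, subclass_of):
    """Return cls plus every class reachable through subclass links."""
    seen, stack = set(), [cls]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(subclass_of.get(current, ()))
    return seen


def find_inconsistencies(subclass_of, instance_of, not_instance_of):
    """List (individual, class) pairs that are both entailed and denied."""
    clashes = []
    for individual, classes in instance_of.items():
        entailed = set()
        for cls in classes:
            entailed |= superclasses(cls, subclass_of)
        for denied in not_instance_of.get(individual, ()):
            if denied in entailed:
                clashes.append((individual, denied))
    return clashes


# The slide's example: x is an instance of A, A is a subclass of B,
# and x is asserted not to be an instance of B.
subclass_of = {"A": ["B"]}
instance_of = {"x": ["A"]}
not_instance_of = {"x": ["B"]}

clashes = find_inconsistencies(subclass_of, instance_of, not_instance_of)
print(clashes)
```

A real reasoner covers far more constructs; the point here is only that the contradiction follows from the subclass relationship.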
Why Rules?
0 RDF is built on XML and OWL is built on RDF
0 We can express subclass relationships in RDF; additional
relationships can be expressed in OWL
0 However reasoning power is still limited in OWL
0 Hence the need for rules, and for a markup language for rules
so that machines can process them
0 Examples: SWRL, RuleML
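To make the role of rules concrete, here is a small Python sketch of a single rule applied to a fact base. The rule (a parent's brother is an uncle) is a commonly used illustration of something that class hierarchies alone do not capture; the predicate names, the individuals and the tuple encoding are all hypothetical and are not SWRL or RuleML syntax.

```python
# Facts as (predicate, subject, object); all names are made up.
facts = {
    ("hasParent", "mary", "john"),
    ("hasBrother", "john", "peter"),
}


def apply_uncle_rule(known):
    """hasParent(x, y) and hasBrother(y, z) => hasUncle(x, z)."""
    derived = set(known)
    for (p1, x, y) in known:
        if p1 != "hasParent":
            continue
        for (p2, y2, z) in known:
            if p2 == "hasBrother" and y2 == y:
                derived.add(("hasUncle", x, z))
    return derived


result = apply_uncle_rule(facts)
print(("hasUncle", "mary", "peter") in result)
```

A single pass is enough here because the rule body never mentions hasUncle; a general rule engine would repeat until no new facts appear.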
What is Ontology Engineering?
0 Tools and Techniques to
- Create Ontologies, Specify Ontologies, Maintain
Ontologies, Query Ontologies, Evolve Ontologies, Reuse
Ontologies
0 Much of the research is focusing on developing ontologies
using tools from multiple heterogeneous data sources
0 Essentially extracting concepts and expanding on concepts
from the data sources
0 Uses combination of data integration, metadata extraction,
and machine learning techniques
0 E.g. Clustering of concepts, Classification of concepts etc.
Vision
0 Semantic Web technologies represent and reason about the data on
the web
- Databases, Weblogs, Blogs, Chats, FOAF, Images, Video, etc.
0 Social Networks are extracted from semantic web data using
reasoning and data mining
0 Social network analysis analyzes social networks using data mining
and other reasoning techniques and extracts nuggets
0 The nuggets are used for effective knowledge management
Outline of Part II
0 This unit describes the relationship between Social Networks
and Semantic Web
0 FOAF
0 FLINK (Peter Mika, Free University)
0 Extracting social networks from Semantic Web Data
(Tim Finin et al, UMBC; Jennifer Golbeck, U of MD)
0 Convergence and Vision
0 Reference: P. Mika, Semantic Web and Social Networks,
Springer, 2008
Semantic Social Networks
0 The latest breed of social networking services combines social networks
with the sharing of content such as bookmarks, documents, photos,
reviews.
0 The use of Semantic Web technology facilitates distributed control.
- The friend-of-a-friend (FOAF) project is a first attempt at a formal,
machine processable representation of user profiles and friendship
networks (unlike Friendster and similar sites, which have central
control).
- FOAF profiles are created and controlled by the individual user and
shared in a distributed fashion.
- http://www.foaf-project.org
FOAF
0 The Friend of a Friend (FOAF) project is creating a Web of
machine-readable pages describing people, the links between
them and the things they create and do; it is a contribution to
the linked information system known as the Web.
0 FOAF defines an open, decentralized technology for
connecting social Web sites, and the people they describe.
0 FOAF is part of a shift towards a Web where we can choose
the sites and tools we like, without being cut off from friends
who made different choices.
0 FOAF lets you share and inter-connect information from
diverse sources, move it around, and use it in unexpected
new ways.
FOAF Example
<foaf:Person rdf:about="#me"
    xmlns:foaf="http://xmlns.com/foaf/0.1/"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <foaf:name>Dan Brickley</foaf:name>
  <foaf:mbox_sha1sum>241021fb0e6289f92815fc210f9e9137262c252e</foaf:mbox_sha1sum>
  <foaf:homepage rdf:resource="http://danbri.org/" />
  <foaf:img rdf:resource="/images/me.jpg" />
</foaf:Person>
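Since a FOAF profile is ordinary RDF/XML, a program can read it with a standard XML parser. The Python sketch below parses the profile from this slide using only the standard library; the `xmlns:rdf` declaration is included so that the fragment parses on its own, and the variable names are illustrative. A full RDF toolkit would normally be used instead of raw XML parsing.

```python
import xml.etree.ElementTree as ET

FOAF = "http://xmlns.com/foaf/0.1/"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# The profile from the slide, with the rdf namespace declared explicitly.
PROFILE = """
<foaf:Person rdf:about="#me"
    xmlns:foaf="http://xmlns.com/foaf/0.1/"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <foaf:name>Dan Brickley</foaf:name>
  <foaf:mbox_sha1sum>241021fb0e6289f92815fc210f9e9137262c252e</foaf:mbox_sha1sum>
  <foaf:homepage rdf:resource="http://danbri.org/" />
  <foaf:img rdf:resource="/images/me.jpg" />
</foaf:Person>
"""

person = ET.fromstring(PROFILE)

# ElementTree addresses namespaced names as "{namespace-uri}local-name".
about = person.get(f"{{{RDF}}}about")
name = person.findtext(f"{{{FOAF}}}name")
homepage = person.find(f"{{{FOAF}}}homepage").get(f"{{{RDF}}}resource")

print(about, name, homepage)
```

Note that the profile publishes a SHA-1 hash of the mailbox rather than the address itself, so the person can be identified without exposing the email address.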
Semantic Social Networks
Semantic Web researchers and their connections
across the globe.
Semantic Social Networks
[Figure: Social network of a Semantic Web researcher]
FLINK (Peter Mika, Free University)
0 Flink, the system developed at Free University (The Netherlands), is one
of the early semantic social networks that exploits FOAF for the
purposes of social intelligence.
- Social intelligence is considered to be the semantics-based
integration and analysis of social knowledge extracted from
electronic sources under diverse ownership or control. In our case,
these sources ...
0 Flink extracts knowledge about the social networks of the community
and consolidates what is learned using a common semantic
representation, namely FOAF.
FLINK Architecture
[Figure: Architecture of Flink]
FLINK Architecture
0 The architecture of Flink can be divided into three layers concerned with metadata
acquisition, storage and visualization.
0 The acquisition layer of the system concerns the acquisition of metadata from sources
such as HTML pages from the web, FOAF profiles from the Semantic Web, public
collections of emails and bibliographic data.
0 The web mining component of Flink employs a co-occurrence analysis technique. The
web mining component also performs the additional task of finding topic interests, i.e.
associating researchers with certain areas of research.
0 The middle layer is responsible for storing and enhancing metadata through reasoning.
0 Inference is another major task of the middle layer. Sesame (we can also use JENA)
applies the RDF closure rules to the data at upload time. This feature can be extended
by defining domain-specific inference rules in Sesame’s custom rule language.
0 The third layer is the browsing and visualization layer. The user interface of Flink is a
pure Java web application based on the Model-View-Controller (MVC) paradigm.
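Co-occurrence analysis can be illustrated with a small Python sketch. The slide does not say which measure is used, so the Jaccard coefficient below is an assumption chosen as one common option; the page sets, researcher names and threshold are hypothetical.

```python
from itertools import combinations

# Hypothetical data: identifiers of pages on which each name appears.
pages_mentioning = {
    "alice": {"p1", "p2", "p3", "p4"},
    "bob": {"p2", "p3", "p4", "p5"},
    "carol": {"p6"},
}


def jaccard(a, b):
    """Share of pages mentioning both names among pages mentioning either."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cooccurrence_ties(mentions, threshold):
    """Return name pairs whose co-occurrence score reaches the threshold."""
    ties = {}
    for first, second in combinations(sorted(mentions), 2):
        score = jaccard(mentions[first], mentions[second])
        if score >= threshold:
            ties[(first, second)] = score
    return ties


ties = cooccurrence_ties(pages_mentioning, threshold=0.5)
print(ties)
```

Pairs that pass the threshold become candidate ties in the extracted social network; the same idea, applied to a name and a topic keyword, gives a rough way to associate researchers with research areas.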
Social Network Analysis on Semantic Web Data
0 Social network analysis for Flink augments the web mining
task with finding which people belong to which groups (called
GROUP DETECTION)
0 It also examines the associations and links between people: What is the
relationship between John and James? Are they just friends or do
they have a romantic relationship? Do they often travel together?
0 Semantic web reasoning tools (e.g., based on OWL, RDF and SWRL)
may be used to reason and extract the nuggets.
Group Detection
0 A large community often breaks up into a set of closely knit groups of
individuals, woven together more loosely by the occasional
interaction across groups.
0 Based on this theory, SNA offers a number of clustering algorithms for
identifying communities based on network data. Alternatively, the
subgroups may be identified by the researcher using additional
attribute data on the ...
0 Peter Mika’s research uses interactive clustering software
provided as a sample with the JUNG Java toolkit for SNA. This
software allows the user to cluster a network using an edge-betweenness
clusterer and visualize the results.
0 As an example, a group of researchers from the AIFB Institute of the
University of Karlsruhe quickly emerges as a single cluster of the
network.
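The idea behind edge-betweenness clustering can be sketched in a few dozen lines of Python. This is a stand-in written for illustration, not the JUNG API: edges that lie on many shortest paths are assumed to be the loose links between groups, so the highest-scoring edge is removed a fixed number of times and the remaining connected components are reported as groups. The graph, the function names and the `edges_to_remove` parameter are hypothetical.

```python
from collections import deque


def edge_betweenness(adj):
    """Shortest-path edge betweenness for an unweighted, undirected graph."""
    score = {}
    for source in adj:
        # Breadth-first search counting shortest paths from source.
        order = []
        preds = {v: [] for v in adj}
        sigma = dict.fromkeys(adj, 0)
        dist = dict.fromkeys(adj, -1)
        sigma[source], dist[source] = 1, 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate path shares from the farthest nodes back to source.
        delta = dict.fromkeys(adj, 0.0)
        for w in reversed(order):
            for v in preds[w]:
                share = sigma[v] / sigma[w] * (1 + delta[w])
                edge = frozenset((v, w))
                score[edge] = score.get(edge, 0.0) + share
                delta[v] += share
    return score


def components(adj):
    """Return the connected components as a list of node sets."""
    seen, groups = set(), []
    for start in adj:
        if start in seen:
            continue
        group, stack = set(), [start]
        while stack:
            v = stack.pop()
            if v not in group:
                group.add(v)
                stack.extend(adj[v] - group)
        seen |= group
        groups.append(group)
    return groups


def cluster(adj, edges_to_remove):
    """Remove the highest-betweenness edge repeatedly, then split."""
    adj = {v: set(ns) for v, ns in adj.items()}  # work on a copy
    for _ in range(edges_to_remove):
        score = edge_betweenness(adj)
        if not score:
            break
        a, b = max(score, key=score.get)
        adj[a].discard(b)
        adj[b].discard(a)
    return components(adj)


# Two tightly knit triangles joined by a single bridge (c-d).
edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d"),
         ("d", "e"), ("e", "f"), ("d", "f")]
graph = {}
for u, v in edges:
    graph.setdefault(u, set()).add(v)
    graph.setdefault(v, set()).add(u)

groups = cluster(graph, edges_to_remove=1)
print(sorted(sorted(g) for g in groups))
```

In the example the bridge carries every shortest path between the two triangles, so removing that one edge separates the network into its two groups.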
Linking Social Networks with FOAF
0 One of the core goals of the Semantic Web is to store data in distributed
locations, and use ontologies and reasoning to aggregate it.
0 Social networking is a large movement on the web, and social networking
data using the Friend of a Friend (FOAF) vocabulary makes up a
significant portion of all data on the Semantic Web.
0 Many traditional web-based social networks share their members’
information in FOAF format.
0 While this is by far the largest source of FOAF online, there is no
information about whether the social network models from each network
overlap to create a larger unified social network model, or whether they
are simply isolated components.
0 Researchers at the U of MD have studied the intersection of FOAF data
found in many online social networks. Using the semantics of the FOAF
ontology and applying Semantic Web reasoning techniques, they show
that a significant percentage of profiles can be merged from multiple
networks.
Extracting Social Networks
0 Extracting a social network from noisy, real world data is a
challenging task, even if the information is already encoded in RDF
using well defined ontologies.
0 The process consists of three steps: discovering instances of
foaf:Person, merging information about unique individuals, and
linking persons through various social relation properties such as
foaf:knows.
Extracting Social Networks (Tim Finin)
0 A critical problem is determining whether two foaf:Person instances
denote the same person. The semantics of FOAF vocabulary
suggests several heuristics to answer this question:
- Named URI. Non-anonymous individuals using the same URI
denote the same person.
- Inverse-functional properties. Inverse functional properties
such as foaf:mbox and foaf:homepage identify unique
individuals. Other properties, such as foaf:name and foaf:nick,
while not strictly inverse functional, can be used in practice in
conjunction with other properties like foaf:phone to identify
individuals with high probability.
- Semantic equality. When two or more values of an inverse
functional property co-exist in the same individual’s description,
they are semantically equivalent as identifying the same
individual.
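The merging step can be sketched with a union-find structure: two descriptions are joined whenever they share a value of an inverse-functional property, and the joins chain transitively. This Python sketch is a simplified illustration under stated assumptions: the record layout, the identifiers and the example.org values are hypothetical, each record holds at most one value per property, and the probabilistic name-based heuristic is left out.

```python
# Hypothetical foaf:Person descriptions harvested from different documents.
descriptions = [
    {"id": "doc1#p", "foaf:name": "J. Smith",
     "foaf:mbox": "mailto:js@example.org"},
    {"id": "doc2#p", "foaf:mbox": "mailto:js@example.org",
     "foaf:homepage": "http://example.org/~js"},
    {"id": "doc3#p", "foaf:homepage": "http://example.org/~js"},
    {"id": "doc4#p", "foaf:name": "J. Smith"},
]

INVERSE_FUNCTIONAL = ("foaf:mbox", "foaf:homepage")


def merge_persons(records):
    """Group record ids that share an inverse-functional property value."""
    parent = {r["id"]: r["id"] for r in records}

    def find(x):
        # Follow parent links to the group representative, halving paths.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    first_seen = {}
    for record in records:
        for prop in INVERSE_FUNCTIONAL:
            if prop in record:
                key = (prop, record[prop])
                if key in first_seen:
                    parent[find(record["id"])] = find(first_seen[key])
                else:
                    first_seen[key] = record["id"]

    groups = {}
    for record in records:
        groups.setdefault(find(record["id"]), set()).add(record["id"])
    return sorted(sorted(group) for group in groups.values())


merged = merge_persons(descriptions)
print(merged)
```

The second description carries both a mailbox and a homepage, so it links the first and third descriptions even though they share no property directly. The fourth description is left alone because a matching foaf:name by itself is not treated as identifying.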
Convergence
0 Semantic web data includes databases, files, web logs, blogs,
emails, etc.
0 Data mining applied to semantic web data, together with the
reasoning capabilities of the semantic web, results in social
networks
0 Data mining applied to social networks extracts the nuggets
0 Nuggets together with additional semantic web data such as
ontologies result in knowledge
0 Knowledge is utilized to improve the effectiveness of an
organization
Convergence
[Figure: convergence diagram. Box labels:]
Semantic Web Data/Reasoning: XML, RDF, OWL; e.g., databases, blogs, email
Data Management/Data Mining/Data Analytics
Social Networks/Analysis
Knowledge Management
Vision
0 Improved technologies for data representation
- Data will include structured and unstructured databases,
emails, blogs, files, relationships, video, images, audio,
tags, links, ...
0 Improved tools for reasoning
0 Improved tools for data mining/data analytics
0 Improved tools for social network extraction
0 Improved tools for knowledge extraction
0 Improved tools for knowledge management
0 We call the above Information Analytics