Lecture13 - The University of Texas at Dallas

Building and Analyzing Social Networks
Case Studies of Semantic Social Network Analysis
Dr. Bhavani Thuraisingham
February 22, 2013
4/8/2016 07:51
23-2
Outline
0 Reference: P. Mika, Social Networks and the Semantic Web,
Springer, 2008: Chapters 7, 8, 9, 10
0 Evaluation of Web-based Social Network Extraction
0 Semantic-based Social Network Analysis in the Sciences
0 Ontologies in Folksonomy Systems
0 How have Semantic Social Networks benefitted communities
Evaluation of Web-based Social Network Extraction: Chapter 7
0 Survey methods and electronic data extraction
0 Empirical Study
0 Data Collection
0 Preparing the data
0 Optimizing goodness of fit
0 Comparison across methods and networks
0 Predicting the goodness of fit
0 Evaluation through analysis
Differences between survey methods and electronic data extraction
0 Differences in what is measured
- Challenge is to extract data from the web that reflects the
real world
0 Errors introduced by the extraction methods
- Homonyms (different people sharing the same name)
0 Errors introduced by the survey data collection
- Impossible to get all of the data for analysis
Context of the Empirical Study
0 Collected network data of 123 researchers at the Vrije Universiteit Amsterdam
0 Human subject approval is needed
0 Department organization
0 Different from the semantic web community, which has a
common research interest
Data Collection
0 Collected personal and social information from online
surveys
0 Multiple page survey
0 Questions such as
- Who do you know from the list of names?
- Who are similar to you?
0 The results of the survey were represented in RDF, and
Sesame was used to manage the database
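As a rough illustration of how survey answers could be written out as RDF, the sketch below serializes "Who do you know?" answers as N-Triples using the FOAF `knows` property. The respondent names, the `http://example.org/person/` namespace, and the helper `to_ntriples` are assumptions made for illustration; the slides do not give the actual survey schema.

```python
FOAF_KNOWS = "http://xmlns.com/foaf/0.1/knows"
BASE = "http://example.org/person/"  # hypothetical namespace, not from the study


def to_ntriples(answers):
    """Turn {respondent: [people they say they know]} into sorted N-Triples lines."""
    lines = []
    for respondent, known in answers.items():
        for other in known:
            lines.append(f"<{BASE}{respondent}> <{FOAF_KNOWS}> <{BASE}{other}> .")
    return sorted(lines)


# Made-up survey answers, for illustration only.
answers = {"alice": ["bob", "carol"], "bob": ["alice"]}
triples = to_ntriples(answers)
for line in triples:
    print(line)
```

A file of such lines could then be loaded into an RDF store for querying.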
Preparing the Data
0 Remove all non-respondents; only 79 of the 123 researchers responded
0 Build the network: nodes and edges
- Advice seeking, Advice giving, Friendships, Troubled
relationships, Similarity, etc.
- Nodes after non-respondents removed, Nodes with edges,
Edges, Edges after non-respondents removed, etc.
0 Handle incomplete and inconsistent data
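The pruning step can be sketched as follows. This is a minimal example that assumes a tie is kept only when both endpoints responded; the names and ties are invented, and `prune_non_respondents` is a hypothetical helper, not the procedure used in the study.

```python
def prune_non_respondents(edges, respondents):
    """Keep only edges whose two endpoints both answered the survey."""
    kept = {(a, b) for a, b in edges if a in respondents and b in respondents}
    nodes = {person for edge in kept for person in edge}  # nodes that still have edges
    return nodes, kept


# Made-up advice-seeking ties; "dan" did not respond.
edges = {("ann", "ben"), ("ben", "cat"), ("cat", "dan")}
respondents = {"ann", "ben", "cat"}

nodes, kept = prune_non_respondents(edges, respondents)
print(len(nodes), "nodes with edges,", len(kept), "edges after pruning")
```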
Optimizing goodness of fit
0 Need to prune the network
- Minimal number of pages one must have on the web
- Minimal number of relationships
- PhD students may have fewer than 1000 pages, while a professor
may have over 10000 pages
- What is the appropriate parameter for filtering?
- What is the similarity between the survey network and the
extracted network?
0 Extract relationships
- Information retrieval task
- Precision and Recall
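Treating relationship extraction as an information retrieval task, the survey network plays the role of the reference set and the web-extracted network the role of the retrieved set. The sketch below computes precision and recall over undirected edges; the example ties and the function names are assumptions for illustration.

```python
def undirected(edges):
    """Represent each tie as a frozenset so (a, b) and (b, a) compare equal."""
    return {frozenset(edge) for edge in edges}


def precision_recall(extracted, survey):
    """Precision and recall of extracted ties against the survey ties."""
    ext, ref = undirected(extracted), undirected(survey)
    hits = len(ext & ref)  # ties found by both methods
    precision = hits / len(ext) if ext else 0.0
    recall = hits / len(ref) if ref else 0.0
    return precision, recall


# Made-up networks, for illustration only.
survey = [("ann", "ben"), ("ben", "cat"), ("ann", "cat")]
extracted = [("ben", "ann"), ("cat", "dan")]

p, r = precision_recall(extracted, survey)
print(f"precision={p:.2f} recall={r:.2f}")
```

Raising the filtering thresholds typically trades recall for precision, which is why the parameters have to be tuned against the survey data.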
Comparison across methods and networks
0 Use more than one method for analysis
0 Select parameters for each method separately
0 Methods selected by the author are:
- Co-occurrence analysis
- Average precision
0 Determine which method produces better precision and recall
0 Data mining techniques such as different association rule
mining methods can also be used
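The slides name co-occurrence analysis without giving a formula. One common choice, assumed here, is the Jaccard coefficient: the number of pages mentioning both people divided by the number of pages mentioning either. The page IDs, names, and the threshold value below are invented for illustration.

```python
from itertools import combinations


def jaccard_tie_strength(pages_a, pages_b):
    """Pages mentioning both people divided by pages mentioning either."""
    union = pages_a | pages_b
    return len(pages_a & pages_b) / len(union) if union else 0.0


def extract_edges(mentions, threshold):
    """Return {(a, b): strength} for every pair at or above the threshold."""
    edges = {}
    for a, b in combinations(sorted(mentions), 2):
        strength = jaccard_tie_strength(mentions[a], mentions[b])
        if strength >= threshold:
            edges[(a, b)] = strength
    return edges


# Made-up data: person -> set of IDs of web pages that mention them.
mentions = {"ann": {1, 2, 3, 4}, "ben": {3, 4, 5}, "cat": {6}}
edges = extract_edges(mentions, threshold=0.2)
print(edges)
```

The threshold is the kind of parameter that would be selected separately for each method before comparing their precision and recall.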
Predicting the goodness of fit
0 Challenge is to determine the closeness of a person’s real-world
network to his/her online network
0 Need to measure the similarity between the personal network
and the survey network
0 Attributes considered include number of relations mentioned,
age of the individual, number of years spent at the university,
etc.
0 Some observations
- The more often a respondent is mentioned by others, the
higher the precision of extraction
- Survey attributes did not impact the result
Evaluation through analysis
0 Networks from surveys or web are used as raw data to carry
out complex data analysis
0 Author has concluded that a 100% match is not required for
obtaining relevant results
0 Most network measures are statistical aggregates and are
therefore robust to missing or incorrect information
Semantic-based Social Network Analysis in the Sciences: Chapter 8
0 Context
0 Methodology
- Data acquisition
- Representation, storage and reasoning
- Visualization and analysis
0 Results
- Descriptive analysis
- Structural and cognitive effects on scientific performance
Context
0 Community of Researchers working on semantic web
0 Community defined using the ISWC conference authors
0 Objective
- Study the community, the contributions they are making,
the interactions between them so that semantic web
research is enhanced
Methodology
0 Combine existing methods of web mining and extraction
from publications and emails with semantic web-based data
storage, aggregation and reasoning with social network data
0 Flink supports data collection, storage and visualization of
social networks
0 Methodology consists of
- Data Acquisition
- Data Representation, Storage and Reasoning
- Visualization and Analysis
Methodology
0 Data Acquisition
- Four types of knowledge sources: HTML pages, FOAF
profiles, public emails, and bibliographical data
- Web mining component of Flink extracts social networks
from the data, calculates the strength of ties between
individuals, and associates individuals with domain concepts
0 Representation, Storage and Reasoning
- Data in RDF format
- Reasoning with ontologies; Ontology matching
0 Visualization and Analysis
- Browse the social network through the web interface
- Compute statistics
Results
0 Descriptive Analysis
- Who are the major players in semantic web research
- Central figures: Ian Horrocks, Frank van Harmelen,
Deborah McGuinness, Jim Hendler
0 Structural and cognitive effects on scientific performance
- Discussion of how the structure of the network affects
scientific performance
- Dense interconnected networks vs. sparse networks
= Dense networks may mean closer ties
= Sparse networks may mean diversity
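To make "dense" and "sparse" concrete, one standard measure is network density: the number of ties present divided by the number of ties possible. The sketch below assumes an undirected network without self-loops; the two toy networks are invented for illustration.

```python
def density(num_nodes, edges):
    """Density of an undirected simple graph: actual ties / possible ties."""
    possible = num_nodes * (num_nodes - 1) / 2
    # Ignore self-loops and count (a, b) and (b, a) as the same tie.
    unique = {frozenset(edge) for edge in edges if edge[0] != edge[1]}
    return len(unique) / possible if possible else 0.0


# Four researchers who have all co-authored with each other.
dense = density(4, [("a", "b"), ("a", "c"), ("a", "d"),
                    ("b", "c"), ("b", "d"), ("c", "d")])
# Four researchers split into two isolated pairs.
sparse = density(4, [("a", "b"), ("c", "d")])
print(f"dense={dense:.2f} sparse={sparse:.2f}")
```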
Ontologies in Folksonomy Systems: Chapter 9
0 A folksonomy is a system of classification derived from the
practice and method of collaboratively creating and managing
tags to annotate and categorize content; this practice is also
known as collaborative tagging, social classification, social
indexing, and social tagging
0 Topics covered
- Tripartite model of ontologies
- Case Studies
- Evaluation
Tripartite model of ontologies
0 Folksonomy allows users to describe a set of shared objects
with a set of keywords
0 Networks of folksonomies are modeled as a tripartite graph
with hyperedges
0 In a social tagging system users tag objects with concepts,
creating a ternary association between the user, the concept,
and the object
0 Ultimately the tagging system is represented by a collection
of ontologies
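The tripartite model can be sketched as a list of (user, concept, object) hyperedges that is then "folded" into a simpler two-mode or one-mode network. The example below folds the hypergraph into a weighted tag-to-tag network, where the weight counts the objects on which two tags co-occur. The annotations and the function name are assumptions for illustration; other foldings (for example, linking tags through shared users) are equally possible.

```python
from collections import Counter
from itertools import combinations

# Each hyperedge is one tagging act: (user, tag, object). Made-up data.
annotations = [
    ("u1", "semweb", "page1"),
    ("u1", "rdf", "page1"),
    ("u2", "rdf", "page1"),
    ("u2", "ontology", "page2"),
    ("u3", "semweb", "page2"),
]


def fold_tags_by_object(annotations):
    """Link two tags once for every object that carries both of them."""
    tags_per_object = {}
    for _user, tag, obj in annotations:
        tags_per_object.setdefault(obj, set()).add(tag)
    weights = Counter()
    for tags in tags_per_object.values():
        for pair in combinations(sorted(tags), 2):
            weights[pair] += 1
    return weights


tag_network = fold_tags_by_object(annotations)
print(dict(tag_network))
```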
Case Studies
0 Ontology emergence in del.icio.us
- Del.icio.us is a social bookmarking tool
- Users manage personal collections of links to web sites
and describe those links
- Ontologies are used to represent the bookmarks and
descriptions
0 Community based ontology extraction from web pages
- Actor-concept-instance ontology
- Web pages of a person and the topic of interest
- Flink is used to represent and reason about the
ontologies
Evaluation
0 How do you evaluate the results from constructing the
ontologies and reasoning about the ontologies?
0 Which ontologies are better?
0 Need to consult the community to validate the results
- Emailed the set of researchers and asked them to answer
the questions
- Not all of them responded
- Apply methods discussed in Chapter 7
How have Semantic Social Networks benefitted communities: Chapter 10
0 Katrina PeopleFinder
0 Second Life
Katrina PeopleFinder
0 Katrina, one of the worst hurricanes in US History
0 Thousands of people were displaced
0 Through the semantic social network Katrina PeopleFinder, a
network of the people was constructed and the associations
between them determined
0 The results were used to connect relatives and friends
Second Life
0 Second Life is an online virtual world developed by Linden Lab. It was
launched on June 23, 2003. A number of free client programs, or Viewers,
enable Second Life users to interact with each other through avatars (also
called Residents). Residents can explore the world (known as the grid), meet
other residents, socialize, participate in individual and group activities, and
create and trade virtual property and services with one another. Second Life
is intended for people aged 16 and over.
0 Built into the software is a three-dimensional modeling tool based on simple
geometric shapes that allows residents to build virtual objects. There is also
a procedural scripting language, Linden Scripting Language, which can be
used to add interactivity to objects. Sculpted prims (sculpties), mesh,
textures for clothing or other objects, animations, and gestures can be
created using external software and imported. The Second Life Terms of
Service provide that users retain copyright for any content they create, and
the server and client provide simple digital rights management functions.