Web 2.0 Project Presentation
Analysis of Online Hate Communities in Social Networks
Presented by: Ruchi Bhindwale
OUTLINE
• Introduction
• Related Work
• Analysis
• Our Approach
• Data Preprocessing
• Graph Creation
• Manual Mining Results
• Advantages/Disadvantages
• Conclusion
Introduction
Web 2.0
Blogosphere
Social Networking Sites
Hate Groups
Related Work
• Social networks are often represented as a graph
• Approaches to identify communities
o Co-citation Analysis
o Hidden Markov Model
o Content Analysis
Analysis
• One supporter and many opponents
• 98% were in the categories “Countries and Regional” and “Religion and Belief”
• Not all communities with a hate title have posts with hate content
• Such communities contained foreign-language words
Our Approach
• Combination of content (text) mining and graph mining.
• Text mining is employed to deal with the posts, while graph mining considers the communication pattern within these communities.
Data Preprocessing
Select communities related to country and politics
Mine the community titles for a “hate” keyword
Consider only those communities with a substantial number of members
Mine the thread titles to select relevant posts
Consider only those posts with a substantial number of replies
Text mine the posts to assess their hate content
Represent the communication as a graph
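The community-level filtering steps above can be sketched as follows. This is a minimal illustration, not the presenter's implementation: the `Community` record, the keyword list, the category names, and the member threshold are all assumptions, since the slides do not state what counts as a “substantial” number of members.

```python
from dataclasses import dataclass

# Assumed values for illustration only; the slides give no concrete
# keyword list, category set, or membership threshold.
HATE_KEYWORDS = ("hate",)
RELEVANT_CATEGORIES = frozenset({"Countries and Regional", "Religion and Belief"})


@dataclass
class Community:
    title: str
    category: str
    members: int


def select_communities(communities, min_members=100,
                       categories=RELEVANT_CATEGORIES,
                       keywords=HATE_KEYWORDS):
    """Keep communities that are in a relevant category, have a hate
    keyword in the title, and have at least min_members members."""
    selected = []
    for community in communities:
        in_category = community.category in categories
        title = community.title.lower()
        has_keyword = any(keyword in title for keyword in keywords)
        if in_category and has_keyword and community.members >= min_members:
            selected.append(community)
    return selected
```

The same pattern (keyword match on the title, then a minimum-size cut-off) would apply one level down, to thread titles and reply counts.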
Rules for generating nodes and
edges
• Each member is a node.
• A directed edge between two nodes for each message posted by one member and addressed to another member in a particular discussion thread.
• A self-loop edge for the member who creates a new hate thread.
• A message not addressed to anybody is considered as addressed to the creator of the thread.
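A minimal sketch of these four rules, assuming each thread is given as a dictionary with a `creator` and a list of `(author, addressee)` reply pairs, where `addressee` is `None` when the message names nobody. The input layout and the function name `build_graph` are hypothetical; the slides do not describe the data format.

```python
from collections import defaultdict


def build_graph(threads):
    """Build a directed multigraph of members from discussion threads.

    Returns (nodes, edges), where edges maps (source, target) to the
    number of messages sent from source to target.
    """
    nodes = set()
    edges = defaultdict(int)
    for thread in threads:
        creator = thread["creator"]
        nodes.add(creator)
        # Rule 3: creating a new thread adds a self-loop on the creator.
        edges[(creator, creator)] += 1
        for author, addressee in thread["messages"]:
            # Rule 4: an unaddressed message goes to the thread creator.
            target = addressee if addressee is not None else creator
            # Rule 1: every member who appears becomes a node.
            nodes.add(author)
            nodes.add(target)
            # Rule 2: one directed edge per message, author -> addressee.
            edges[(author, target)] += 1
    return nodes, dict(edges)
```

Note that under these rules an unaddressed reply by the creator in their own thread also lands on the creator's self-loop.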
Weighting scheme
• Weights are assigned to edges according to the degree of hate content of the corresponding messages.
• Positive weight for a message that supports the topic of the community and negative weight for one that opposes it.
• Different weight values are assigned, e.g. 1 for normal, 2 for high, and 3 for very high hate or anti-hate content.
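The weighting scheme amounts to a signed magnitude, which can be sketched as below. The level labels and the function name `edge_weight` are assumptions for illustration; how a message is judged to be normal, high, or very high is not specified in the slides and is left to the caller.

```python
# Magnitudes taken from the example values on the slide.
LEVELS = {"normal": 1, "high": 2, "very high": 3}


def edge_weight(level, supports_topic):
    """Return the signed weight of one message.

    level: "normal", "high", or "very high" (degree of hate or
           anti-hate content, judged elsewhere).
    supports_topic: True if the message supports the community's topic,
                    False if it opposes it.
    """
    magnitude = LEVELS[level]
    return magnitude if supports_topic else -magnitude
```

For example, `edge_weight("high", True)` gives 2 and `edge_weight("very high", False)` gives -3.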
Graph Characteristics
• Reveals two communities inside one community: one that supports the community and one that opposes it.
• Very little communication inside these sub-communities.
• Easy to identify the members who spread hate heavily by the weight of the edges going out from the node corresponding to that member.
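One way to read these characteristics off a weighted graph is to sum the signed weights of each member's outgoing edges. The sketch below assumes edges are given as `(source, target, weight)` triples; splitting members by the sign of their total is a simplification chosen here, not necessarily the method used in the presentation.

```python
from collections import defaultdict


def outgoing_weight(weighted_edges):
    """Total signed weight of the edges leaving each member."""
    totals = defaultdict(int)
    for source, _target, weight in weighted_edges:
        totals[source] += weight
    return dict(totals)


def split_and_rank(weighted_edges):
    """Split members into supporters (positive total) and opponents
    (negative total), each ranked from heaviest to lightest."""
    totals = outgoing_weight(weighted_edges)
    supporters = sorted((m for m, w in totals.items() if w > 0),
                        key=lambda m: -totals[m])
    opponents = sorted((m for m, w in totals.items() if w < 0),
                       key=lambda m: totals[m])
    return supporters, opponents
```

The first entries of `supporters` are then the members with the heaviest outgoing hate content, and `opponents` holds the anti-hate side of the same community.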
Manual Mining Results
• 25 communities were selected
• The resulting set obtained was manually validated
• ASU MS 2006
• Microsoft Corporation
• Cricket Fans
• Linux Kernel Programmers
• We hate India
• USA Democrats
• Communism
• Hate Israel
• Data Mining and KDD
• We hate exams
• Hate Pakistan
• Brad Pitt Fan club
• For those who hate idol worship
• Hate Indian Muslims
• Buddhism
Step 1 (Select Category)
We hate India
Hate Israel
We hate exams
Communism
USA Democrats
Hate Pakistan
For those who hate Idol worship
Hate Indian Muslims
Buddhism
ASU MS 2006
Microsoft Corporation
Cricket Fans
Linux Kernel Programmers
Data Mining and KDD
Brad Pitt Fan club
Step 2
• We hate India
• Hate Israel
• Hate Pakistan
• For those who hate Idol worship
• Hate Indian Muslims
Step 3
• Communism
• USA Democrats
• Buddhism

• We hate India
• Hate Pakistan
• For those who hate Idol worship
• Hate Indian Muslims

• Hate Israel
Step 4 (Number of threads)
• We hate India
• Hate Pakistan
• Hate Indian Muslims
• For those who hate Idol worship
The Graph
Advantages and Disadvantages of the approach
• The approach clearly reveals the basic communication pattern in a hate community.
• It can easily identify the members who spread hate.
• It is difficult to measure the degree of hate content, as judging hate content tends to be very subjective.
• It is not easy to determine to whom a particular message is addressed in an ongoing discussion when the addressee is not explicitly cited.
Conclusion
• Hate communities targeting a country or a religion usually contain a large amount of offensive content.
• For social networking websites that let users create communities and discussion boards inside them, detecting hate communities has become very important.
• We have tried to give a model to analyze such offensive hate communities.
Thanks to Nitin and Lei