paper_wherefore

Download Report

Transcript paper_wherefore

Wherefore Art Thou R3579X? Anonymized Social Networks, Hidden
Patterns, and Structural Stenography

A social network occurs anywhere there is
social interaction between people.

Examples include Email, instant messaging,
Facebook, blogging trackbacks, coauthor
networks

The structure of social networks can be
interesting
How are friendships usually
structured? Are there hubs, such as
Heather, who connect separate
networks? How many degrees of
Kevin Bacon?
We can investigate these questions
if we have the data to mine.

For our examples, we will use a network of
emails sent between users.
John
Vertex

Mary
Directed edge
Vertex
How do we protect users’ privacy while still
releasing the data for research?

Remove any identifiable information, such as
name and other attributes.

Randomly rename the vertices
R3579X
R73313

Convert directed edges to undirected edges.
This increases the complexity and makes it
harder to attack.
R3579X
R73313
Undirected edge

Let’s say you want to know if
two vertices are connected on
the graph.

All the identifying info has been
removed, so how do we do it?

An active attack involves the adversary
creating vertices in the graph before the
graph is released

The adversary will create edges between the
vertices in a fashion that it can then recognize
later on in when the graph is released

We create k new vertices around 2*(log n)
where n is the total number of vertices

We create new do – d1 edges between these
new vertices and the other ones in the graph

Then, we randomly create edges between
these new nodes with independent
probability of 1/2

Given the graph, how do we find the
subgraph that we created?

Create a search tree, pruning the tree based
on the properties of our subgraph, such as
the number of degrees of our new vertices
Mike
John
Mary
Zoe
Tom
John
Mike
k1
k5
k2
k4
Zoe
Mary
k3
Tom
John
Mike
k1
k5
k2
k4
Zoe
Mary
k3
Tom
John
Mike
k1
k5
k2
k4
Zoe
Mary
k3
Tom
ZXCV
ASDF
DFG
WER
UYT
ASD
BNM
HGF
JKL
QWER
ZXCV
ASDF
k1
k5
k2
k4
BNM
QWER
k3
JKL
John
ASDF
k1
k5
k2
k4
BNM
Mary
k3
JKL

The paper proves that the search tree does
not grow too large and that the algorithm
displays good performance

Also, it proves that the subgraph is unique so
that we don’t identify the wrong subgraph

They simulate an attack on LiveJournal
friendship links. They create the accounts on
the website, make the connections, and then
crawl the site and anonymize the data

The network has 4.4 million nodes and 77
million edges

Only needs sqrt(log(n)) new nodes to attack
the graph

However, it’s much more computationally
intensive and less practical in the real world,
although it takes less nodes

It’s a lot like an active attack, except you
don’t create new nodes, instead you
collaborate with your friends and find
yourselves in the graph

However, because you did not specifically
target certain people, you may not be able to
identify other people when you find yourself

We cannot rely on anonymization to ensure
privacy in social networks

Possible improvements: add noise to the data
by adding/removing random edges