Measurement-driven Modeling and Design of Internet

OSN Research As If Sociology Mattered
Krishna P. Gummadi
Networked Systems Research Group
MPI-SWS
OSN research today
• Computational sociology: A natural sciences approach
– Gather and analyze OSN data to study problems in sociology
– Sociologists today use pretty sophisticated computing tools
• Social computing: An engineering approach
– Build systems that support / leverage human social interactions
– But, we tend to treat human behavior as annoying noise
• rather than leverage insights from sociology
This talk
• Argues that insights from sociology can help design
better systems
• Example 1: Dunbar’s number
– The case for decentralized content sharing in OSNs
• Example 2: Group attachment theory
– How social network-based Sybil defenses do or don’t work
Example 1: Dunbar’s number
• Limits the # of stable social relationships a user can have
– To less than a couple of hundred
– Linked to size of neo-cortex region of the brain
– Observed throughout history since hunter-gatherer societies
• Also observed repeatedly in studies of OSN user activity
– Users might have a large number of contacts
– But they regularly interact with fewer than a couple of hundred of them
User generated content sharing over OSNs
• A very popular activity over Facebook
– UGC like pictures, videos, and wall posts
• Facebook is building massive datacenters to support UGC
– Uses Akamai to deliver it
• But, most of Facebook’s UGC is of personal nature
– Pictures and videos of family and social events
• Content popularity would be limited by Dunbar’s number!
• Do we really need datacenters & CDNs to share this UGC?
Why not share personal UGC from homes?
• Advantage: Regain control over personal data sharing
– Better control over what you share & with whom you share it
• Concerns:
– Can we get good performance?
• Yes, due to Dunbar’s limit on popularity
– Can we get good availability?
• Yes, using always-on and always-connected gateways
• They are cheap and low-power
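
The performance claim can be checked with a back-of-envelope calculation. The sketch below is illustrative only: the audience bound of 200 follows the "couple of hundred" figure above, while the 2 MB item size and the one-day fetch window are assumptions, not measurements from the talk.

```python
def required_uplink_mbps(audience, item_megabytes, window_seconds):
    """Average upstream rate (Mbit/s) needed if every member of the
    audience fetches one item once within the given time window."""
    total_megabits = audience * item_megabytes * 8
    return total_megabits / window_seconds


ONE_DAY = 24 * 3600  # assumed window over which viewers fetch the item

# Personal content: audience capped by Dunbar's number (assumed 200 viewers).
personal = required_uplink_mbps(audience=200, item_megabytes=2.0,
                                window_seconds=ONE_DAY)

# Hypothetical viral content with a million viewers, for contrast.
viral = required_uplink_mbps(audience=1_000_000, item_megabytes=2.0,
                             window_seconds=ONE_DAY)

print(f"personal: {personal:.3f} Mbit/s, viral: {viral:.1f} Mbit/s")
```

Under these assumptions the personal item needs a small fraction of a megabit per second on average, while the viral item would need far more than a home uplink typically offers. Bursty access patterns would raise the peak requirement.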
Example 2: Group attachment theory
• Explains how humans join and relate to groups
• Common-bond based groups
– Membership based on inter-personal ties, e.g., family or kinship
– Necessarily small, but tightly-knit and cohesive
• Common-identity based groups
– Membership based on self-interest or shared interest
– Could be larger, but become less cohesive with scale
OSN graphs and groups
• Most OSN graphs include all manner of links
• Can extract bond groups from graph structure
– By looking for highly clustered communities of nodes
• But, not identity groups
– Loosely-knit, they merge into the rest of the network
• Result: A size limit on detectable graph communities
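
One simple way to quantify "highly clustered" is the local clustering coefficient: the fraction of a node's neighbor pairs that are themselves linked. The sketch below is a minimal illustration on a made-up toy graph, not the community-detection method used in the talk; the node names and the `local_clustering` helper are hypothetical.

```python
from itertools import combinations


def local_clustering(graph, node):
    """Fraction of a node's neighbor pairs that are directly linked."""
    neighbors = graph[node]
    if len(neighbors) < 2:
        return 0.0
    linked = sum(1 for u, v in combinations(neighbors, 2) if v in graph[u])
    possible = len(neighbors) * (len(neighbors) - 1) // 2
    return linked / possible


# Toy undirected graph (adjacency sets): a, b, c, d form a tightly-knit
# bond-style group; h is a loose hub whose contacts do not know each other.
graph = {
    "a": {"b", "c", "d"},
    "b": {"a", "c", "d"},
    "c": {"a", "b", "d", "h"},
    "d": {"a", "b", "c"},
    "h": {"c", "x", "y", "z"},
    "x": {"h"},
    "y": {"h"},
    "z": {"h"},
}

for name in ("a", "c", "h"):
    print(name, local_clustering(graph, name))
```

Members of the tight group score high, while the hub scores zero even though it has as many links as any other node. This is why loosely-knit identity groups are hard to separate from the rest of the network by structure alone.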
Sybil attack
• A fundamental problem in distributed systems
• Attacker creates many fake/sybil identities
• Many cases of real-world attacks: Digg, YouTube
Automated Sybil attack on YouTube for $147!
Defending against Sybil attacks
• Traditional solutions rely on central trusted authorities
– Runs counter to open membership policies of OSNs
• Recent proposals leverage social networks
– Key Insight: Social links are hard to acquire in abundance
– Look for small cuts in the graph
– Conversely, look for communities around known trusted nodes
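
The "community around a trusted node" view can be sketched as a greedy local expansion that keeps adding neighbors while the community's conductance (boundary edges relative to its volume) keeps dropping. This is a simplified illustration under assumed names (`build_graph`, `conductance`, `community_around`) and a made-up toy graph; it is not the algorithm of any specific published scheme.

```python
from itertools import combinations


def build_graph(edges):
    """Build an undirected graph as a dict of adjacency sets."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)
    return graph


def conductance(graph, members):
    """Cut edges divided by the smaller of the two sides' total degree."""
    cut = sum(1 for u in members for v in graph[u] if v not in members)
    inside = sum(len(graph[u]) for u in members)
    outside = sum(len(graph[u]) for u in graph) - inside
    smaller = min(inside, outside)
    return cut / smaller if smaller else 1.0


def community_around(graph, trusted):
    """Greedily grow a community from a trusted node while conductance falls."""
    members = {trusted}
    best = conductance(graph, members)
    while True:
        frontier = {v for u in members for v in graph[u]} - members
        if not frontier:
            break
        # Sorted so that ties are broken deterministically.
        candidate = min(sorted(frontier),
                        key=lambda v: conductance(graph, members | {v}))
        score = conductance(graph, members | {candidate})
        if score >= best:
            break  # crossing here would not tighten the community
        members.add(candidate)
        best = score
    return members


# Toy graph: an honest clique and a Sybil clique joined by one attack edge.
honest = ["h1", "h2", "h3", "h4"]
sybils = ["s1", "s2", "s3", "s4"]
edges = (list(combinations(honest, 2)) + list(combinations(sybils, 2))
         + [("h4", "s1")])
graph = build_graph(edges)

print(sorted(community_around(graph, "h1")))
```

In this toy case the expansion stops at the single attack edge, because admitting a Sybil would raise conductance. With more attack edges, or with loosely-knit honest regions, the boundary becomes far less clear.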
Links difficult to create
Lots of research activity recently
All schemes analyse the graph structure to isolate Sybils
SybilGuard [SIGCOMM’06]
SybilLimit [Oakland’08]
Ostra [NSDI’08]
SumUp [NSDI’09]
SybilInfer [NDSS’09]
Whanau [NSDI’10]
MobID [INFOCOM’10]
• Each optimized under assumptions
about the graph structure
– E.g., graphs are fast-mixing
• Each evaluated on different datasets
• Comparative evaluations yield
inconsistent results
Sybil resilience & group attachment theory
• Sybil schemes find bond groups around a trusted node
• But, these are only a fraction of all honest nodes
• Bond groups are hard for Sybils to infiltrate
• Not the case with identity groups
Implications
• Graph structure can identify nodes that are non-Sybils
• But, it cannot identify nodes that are Sybils
• Most nodes cannot be classified into either category
• Does this imply Sybil schemes are useless?
– No, they can be used conservatively to find content from
people you trust
Summary
• OSN system designers should look to leverage
insights from sociology
• Presented two examples where some very basic
knowledge of sociology proved useful
• Lots more ways to leverage sociology in the future
– Can we leverage strength of ties to set privacy policies or
prioritize updates from friends?