Enforced Community Standards
For Research on Users of the Tor
Anonymity Network
Christopher Soghoian
McCoy et al (’08): Shining a light on dark places
 The researchers operated a Tor exit server.
 During a 4-day period in December 2007, they logged the first 150 bytes of each
network packet that passed through their server.
 This revealed the kind of traffic and the specific sites that users were
visiting.
 The researchers also ran a Tor entry server for 15 days.
 They gathered source IP addresses of users in order to later geo-locate
them.
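As a rough illustration of the collection technique described above (a minimal sketch under assumptions, not the authors' actual tooling), the snippet below shows how a packet capture can be truncated to the first 150 bytes of each packet using the scapy library; the interface name "eth0" and the log file name are placeholders.

# Minimal sketch, not the authors' code: truncated packet logging with scapy.
# Assumes scapy is installed; "eth0" and "capture.log" are placeholder names.
from scapy.all import sniff

SNAP_LEN = 150  # keep only the first 150 bytes of each packet, as described above

def log_truncated(pkt):
    # bytes(pkt) yields the raw packet; slicing enforces the 150-byte cap
    truncated = bytes(pkt)[:SNAP_LEN]
    with open("capture.log", "ab") as f:
        # prefix each record with its 2-byte length so the log can be parsed later
        f.write(len(truncated).to_bytes(2, "big") + truncated)

# store=False keeps packets out of memory once the callback has run
sniff(iface="eth0", prn=log_truncated, store=False)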
McCoy et al (’08): Shining a light on dark places
 The researchers did not seek or obtain prior legal analysis of their work.
 They did speak informally, for a few minutes, with a law professor at their university, who they
said told them that “that area of the law is ill defined.”
 Based on this, they decided that it was “unnecessary to follow up with other lawyers.”
 The researchers also did not consult with an Institutional Review Board.
 “We were advised that it wasn't necessary,” one of the researchers said, adding that the
IRB review process is “used more in medical and psychology research at our
university,” and was not generally consulted for computer science projects.
Community response to McCoy et al
 The researchers did not receive a warm welcome after presenting their
work at the Privacy Enhancing Technologies Symposium.
 When questioned by an audience member after the presentation, the
researchers admitted that they had retained a copy of the logged Tor traffic,
and further, that it was not held on an encrypted storage device.

 This disclosure was met with boos from the audience, even after the
researchers stressed that the data was kept in a “secure” location.
McCoy et al and their university IRB
 After some bad press (ahem), the University of Colorado quickly launched an
investigation into the research. Within days, the researchers were cleared.
 “Based on our assessment and understanding of the issues involved in your
work, our opinion was that by any reasonable standard, the work in
question was not classifiable as human subject research, nor did it involve
the collection of personally identifying information. While the underlying
issues are certainly interesting and complex, our opinion is that in this case,
no rules were violated by your not having subjected your proposed work to
prior [IRB] scrutiny. Our analysis was confined to this [IRB] issue.”
Castelluccia et al (’10): Private Information
Disclosure from Web Searches
 In this project, the researchers discovered that with stolen session cookies
(captured via open WiFi networks), it was possible to reconstruct users’
search history.
 In addition to demonstrating the flaw, the researchers also sought to
determine the degree to which users are vulnerable (how many users
conduct web searches when “logged in” to a search engine and how many
have enabled Google's Web History feature).
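To make the underlying weakness concrete: a session cookie sent over an unencrypted connection can simply be replayed by anyone who observes it, and the server then treats the replayed request as coming from the victim's session. The following minimal sketch is hypothetical (the cookie value, header, and URL are placeholders, not details from the paper) and uses the Python requests library to illustrate the replay step.

# Hypothetical sketch of session-cookie replay; not the authors' tool.
# Assumes the 'requests' library is installed; all values below are placeholders.
import requests

sniffed_cookie = "SID=PLACEHOLDER_VALUE"  # a cookie observed on an open WiFi network

resp = requests.get(
    "http://example.com/history",        # placeholder URL, not a real endpoint
    headers={"Cookie": sniffed_cookie},  # the server sees the victim's session
)
print(resp.status_code)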
Castelluccia et al (’10): Private Information
Disclosure from Web Searches
 In order to determine this information, the researchers collected data via
three different methods:
 First, network traces for the 500-600 daily users at their own research
center were collected and analyzed. These accounts were not actively
attacked. The researchers merely engaged in passive analysis of these
network traces.
 Second, the researchers received opt-in consent from 10 users, whose
Google session cookies the researchers sniffed, and then used to actively
reconstruct the individuals’ search history information.
Castelluccia et al (’10): Private Information
Disclosure from Web Searches
 Third, the researchers established a rogue Tor exit server.
 During the one-week period in which the researchers collected data from
the Tor network, 1,803 distinct Google users were observed, 46% of whom
were logged into their accounts.
 For each of these logged-in users, the researchers used the sniffed Google
session cookies and attempted to access the users' first and last name;
locations searched using Google Maps (along with the “default location”,
when available); blogs followed using Google Reader; full Web History
(when accessible without re-entering credentials); finance portfolio; and
bookmarks.
 The researchers stressed that their research application did not store any
individual users' data. Only aggregate statistical information was retained.
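As an illustration of that aggregate-only design (a sketch under assumptions, not the authors' actual application), the snippet below processes each hypothetical per-user observation in memory and retains nothing but counters.

# Sketch of aggregate-only retention; the field names are assumptions, not the authors' schema.
from collections import Counter

stats = Counter()

def process_user(record):
    # 'record' is a hypothetical per-user observation; it is examined in memory only.
    stats["observed_users"] += 1
    if record.get("logged_in"):
        stats["logged_in_users"] += 1
    if record.get("web_history_accessible"):
        stats["web_history_accessible"] += 1
    # the record goes out of scope here; only the aggregate counts are kept

# Example usage with made-up data:
process_user({"logged_in": True, "web_history_accessible": False})
print(dict(stats))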
Privacy of colleagues > Tor users
 In their paper, the researchers describe why they sought consent to engage
in active attacks against the 10 users who volunteered for their study:
 “it would have been otherwise impossible to conduct our study on
uninformed users without incurring legal and ethical issues.”
 It is unclear why it was legally and ethically acceptable to violate the privacy of Tor
users.
 It is also noteworthy that they were willing to engage in active attacks against
Tor users, but only passive sniffing on their employer’s network.
Analysis of these two studies
 McCoy et al. specifically sought to learn more about users of the Tor
network, whereas Castelluccia et al. simply used Tor users’ network activity to
assist in drawing broader conclusions about general Internet behavior.
 The fact that Castelluccia et al. performed only passive network monitoring
on their own colleagues but actively attacked the accounts of Tor users likely
indicates that the researchers knew they were engaging in morally and
ethically dubious behavior.
 If there were no problems with what they were doing, why would they not
do it to their friends and colleagues, yet be willing to do it to users who
had specifically signaled a desire to protect their own privacy?
Analysis of these two studies
 Neither research team submitted their study to an IRB
 McCoy et al did not believe they had to, while Castelluccia et al. did not have an IRB at
their research institution.
 Castelluccia et al. specifically designed their research tool to analyze
individual users' data in-memory, and only retained aggregate statistical
data.
 McCoy et al. retained individual users' browsing data, and performed
statistical analysis of it after the fact.
Many other studies since
 These are not the only academic studies to collect data via Tor exit servers.
 There have been several others since (one even won a best paper award).
 Unless something is done, this trend will likely continue.
Our publishing system offers skewed incentives
 Some of the attendees at PETS 2010 did not respond positively to the study.
 When the community response to the ’08 PETS paper by McCoy et al was raised,
the researchers stated that they were not at PETS ’08 and had no idea that it
had resulted in such criticism. How would they have known? Where was it
documented?
 The negative community response is not mentioned on the PETS ’08 site or in the ACM Portal.
 McCoy et al’s own personal home pages and CVs do not mention it.
 The very same thing applies to the paper from PETS ’10.
 To the casual observer, it looks like the community approves of this kind of
research (since the papers are accepted at conferences with low acceptance rates).
We are a guild. It is time to flex our power.
Establishing a standard for acceptable research
 My ideas:
 Research should be focused on users of the Tor network
 Not just using Tor as an easy way to get data on Internet users’ browsing activity.
 Minimize user data collection and retention.
 The research should be legal in the country where it is performed.
 This may mean no rogue academic exit servers in the US
 Research studies should be vetted by an IRB, if one exists
Enforcing this standard
 The academic community needs to pick a reasonable, common-sense
standard.
 Then, they need to enforce it.
 Papers that do not meet such standards should not be accepted by any of
the respected conferences.
 Example: SOUPS now requires researchers to include a section on how
ethical concerns were addressed, including consulting with an IRB.
 If SOUPS can do it, PETS and Oakland can and should do so too.