Transcript read - CUHK

SpyAware: Investigating the Privacy Leakage Signatures in App Execution Traces
Hui Xu, Yangfan Zhou, Cuiyun Gao, Yu Kang, Michael R. Lyu
1
Private Data Is Valuable
Big Data
Machine Learning
Recommendation
2
Is a Leakage Legitimate?
Depends on:
† User Preference
† Software Functionality
3
How to Handle the Leakage?
Principle: Privacy Awareness
† Users should be informed when a leakage happens.
† Disposing of the app as if it were malware is inappropriate.
[Illustration: an alert reads "Your SMS has been leaked!!!"; the user thinks "Maybe I should remove the app."]
4
Privacy Leakage Definition
[Figure: Source -> Read Behavior -> Privacy-Sensitive Data -> Send Behavior -> Sink. The read behavior together with the send behavior constitutes a Privacy Leakage.]
5
Industrial Solutions
They only control read behaviors!
http://getandroidstuff.com/best-free-android-permission-management-apps-privacy-control/
6
Research Solutions
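Taint Analysis
[Figure: two panels. With taint analysis: taint propagation follows the sensitive data from the Read instruction through Variable1 and Variable2 to the data passed to the Send instruction, so the leakage is confirmed ("Leakage Happens"). Without taint analysis: only the Read and Send instructions are visible and the path between them is unknown, so the leakage cannot be confirmed ("Leakage ???").]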
7
TaintDroid
Approach: dynamic taint analysis (tracks the data flow
during runtime)
Usability Issues: portability (a new OS), overhead
W. Enck, et al. TaintDroid: an information-flow tracking system for realtime privacy monitoring on smartphones[J]. ACM Transactions on Computer Systems (TOCS), 2014
8
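To make the idea of dynamic taint analysis concrete, here is a toy sketch. It is not TaintDroid's implementation: the three-operation instruction format (`read`, `assign`, `send`) and the function name `find_leaks` are hypothetical, chosen only to show how taint follows data from a source to a sink.

```python
# Toy illustration of taint propagation; the instruction format is hypothetical.
def find_leaks(instructions):
    """Return the variables that were sent while carrying sensitive data."""
    tainted = set()
    leaks = []
    for op, *args in instructions:
        if op == "read":        # ("read", dst): dst receives sensitive data
            tainted.add(args[0])
        elif op == "assign":    # ("assign", dst, src): taint follows the data
            dst, src = args
            if src in tainted:
                tainted.add(dst)
            else:
                tainted.discard(dst)
        elif op == "send":      # ("send", src): src reaches a sink
            if args[0] in tainted:
                leaks.append(args[0])
    return leaks


trace = [
    ("read", "variable1"),
    ("assign", "variable2", "variable1"),
    ("assign", "data", "variable2"),
    ("send", "data"),
]
print(find_leaks(trace))
```

Without the `tainted` set, the sketch would only see that a read and a send occurred, not whether the sent value derives from the sensitive one.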
Our Inspiration & Hypothesis
Hypothesis: Some correlation exists between privacy
leakage behaviors and app execution traces.
Approach of Data Analytics: Transform data to insight.
[Figure: the observable phenomenon (app execution traces) is mapped to the hidden incident (spyware behavior). Four example traces S1 to S4 each show instructions before a Read (Pre I1 to Pre I3) and after it (Pos I1 to Pos I3); two of the traces also contain a Send. Two traces are labeled spyware and two are labeled benign.]
9
What Instructions Are Helpful?
System Call: widely used on Linux platforms
† Pros: It contains all the information about program execution.
† Cons: It is low-level and difficult to interpret.
Binder Call: newly introduced in Android OS
† Pros: It carries semantic information and is easy to interpret.
† Cons: It only traces inter-process communication.
10
Trace the Instructions with a Profiler
† To trace system calls: strace
† To trace binder calls:
a) Inject a payload into the target app process with ptrace.
b) The custom payload decodes binder calls.
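For the system call side, attaching strace to a running process can be sketched as follows. This is only an illustration: the helper names `strace_command` and `trace`, the output path, and the fixed tracing duration are assumptions, and running it requires strace to be installed along with permission to trace the target process.

```python
import subprocess


def strace_command(pid, out_path):
    """Build an strace invocation (helper name is hypothetical).

    -f follows child processes and threads, -p attaches to a running
    process, -o writes the trace to a file.
    """
    return ["strace", "-f", "-p", str(pid), "-o", out_path]


def trace(pid, out_path, seconds):
    """Attach strace for a fixed time, then stop it."""
    proc = subprocess.Popen(strace_command(pid, out_path))
    try:
        proc.wait(timeout=seconds)
    except subprocess.TimeoutExpired:
        proc.terminate()
        proc.wait()
```

Calling `trace(1234, "/tmp/app.trace", 60)` would record one minute of system calls from process 1234; both values are placeholders.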
11
Overall Framework
Statistical Pattern Recognition
[Figure: two phases.
Training Phase: Apps -> Profiler -> Binder Call and System Call traces; TaintDroid acts as the leakage indicator that separates Spyware Samples from Benign Samples -> Feature Extractor -> Trainer -> Models.
Detection Phase: App -> Profiler -> Binder Call and System Call traces -> Profile Sample -> Feature Extractor -> Classifier -> Result.]
12
Android Binder Call
#   Type            Data
1   BC_TRANSACTION  ****android.app.IActivityManager**********%*com.sec.multiwindow.MW_TOUCH_DETECTED***********************`*********mw_x****e*****mw_action*********mw_y*********
2   BR_REPLY        ****
3   BC_TRANSACTION  ****android.content.IContentProvider****GET_system****sound_effects_enabled***
4   BR_REPLY        ****
5   BC_TRANSACTION  **"*android.gui.DisplayEventConnection**
6   BR_REPLY        **$*********value*****0*
7   BC_TRANSACTION  ****android.app.IActivityManager************com.android.contacts****
8   BR_REPLY        **$*********value*****0*
9   BC_TRANSACTION  ****android.content.IContentProvider****'*content://com.android.contacts/contacts*****_id***********************
10  BR_REPLY        ****0*com.android.providers.contacts.ContactsProvider2****com.android.providers.contacts******************com.android.providers.contacts****************com.android.providers.contacts******android.process.acore*************#*/system/app/SecContactsProvider.apk*#*/system/app/SecContactsProvider.apk*-*/data/data/com.android.providers.contacts/lib*****!*/system/framework/sec_feature.jar*+*/data/user/0/com.android.providers.contacts*********************android.process.acore*********contacts;com.android.contacts***android.permission.READ_CONTACTS**!*android.permission.WRITE_CONTACTS*********.***********
11  BR_REPLY        ****************_id*********B***************************************f***
[Slide annotations: Binder Instance, Details, Access Contacts]
13
Detect Read Behaviors
Signature                                                  Data Type
android.os.IServiceManager****iphonesubinfo                IMEI, ICCID
content://com.android.contacts/                            Contact List
android.content.IContentProvider + com.android.contacts    Contact List
content://sms/                                             SMS
content://call_log/                                        Call History
content://browser/bookmarks                                Browser History
android.account.IAccountManager                            Account
android.os.IServiceManager****location                     Location
android.location.ILocationManager***gps                    Location
android.location.ILocationManager***network                Location
android.location.ILocationManager***passive                Location
android.media.IMediaRecorder                               Mic
android.gui.Sensor                                         Accelerometer
android.hardware.Camera                                    Camera
14
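The signature table can be applied to decoded binder data with plain substring checks. The sketch below is one possible reading, not the authors' tool: it assumes each signature is a set of fragments that must all appear in the data string, and the names `READ_SIGNATURES` and `detect_read_types` are hypothetical.

```python
# Signature fragments copied from the slide; treating a signature as
# "all fragments must appear in the data" is an assumption.
READ_SIGNATURES = [
    (("android.os.IServiceManager", "iphonesubinfo"), "IMEI, ICCID"),
    (("content://com.android.contacts/",), "Contact List"),
    (("android.content.IContentProvider", "com.android.contacts"), "Contact List"),
    (("content://sms/",), "SMS"),
    (("content://call_log/",), "Call History"),
    (("content://browser/bookmarks",), "Browser History"),
    (("android.account.IAccountManager",), "Account"),
    (("android.os.IServiceManager", "location"), "Location"),
    (("android.location.ILocationManager", "gps"), "Location"),
    (("android.location.ILocationManager", "network"), "Location"),
    (("android.location.ILocationManager", "passive"), "Location"),
    (("android.media.IMediaRecorder",), "Mic"),
    (("android.gui.Sensor",), "Accelerometer"),
    (("android.hardware.Camera",), "Camera"),
]


def detect_read_types(binder_data):
    """Return the data types whose signature matches, without duplicates."""
    found = []
    for fragments, data_type in READ_SIGNATURES:
        if all(f in binder_data for f in fragments) and data_type not in found:
            found.append(data_type)
    return found


contacts_query = ("****android.content.IContentProvider****'*"
                  "content://com.android.contacts/contacts*****_id***")
print(detect_read_types(contacts_query))
```

The contacts query matches two rows of the table, which is why the result is de-duplicated by data type.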
Binder Call-based Features
Approach:
a) Use BC_TRANSACTION; discard BR_REPLY.
b) Strip details and retain the destination instance name.
c) Choose discriminative instances.
† Leakage happens automatically when a new activity starts:
android.app.IActivityManager
† Network communication is generally performed in a standalone thread:
android.app.IApplicationThread
† Apps may check the current network connection status before communicating:
android.net.IConnectivityManager, android.net.wifi.IWifiManager
† Messenger is a common way to pass events or values between threads:
android.os.IMessenger
† Leakage may happen when an app is querying the server:
com.android...view.IInputMethodManager
15
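The three steps can be sketched as a small feature extractor. Several things here are assumptions: records are `(type, data)` pairs, the request records are labeled BC_TRANSACTION as in the earlier trace table (the `kept_type` parameter can be changed), the instance name is taken to be the first dotted identifier in the data, and each feature is a simple count. The last instance on the slide is elided in the source, so it is left out rather than guessed.

```python
import re
from collections import Counter

# First dotted identifier in the decoded data, e.g. android.app.IActivityManager.
INSTANCE_RE = re.compile(r"[A-Za-z_]\w*(?:\.[A-Za-z_]\w*)+")

# Instances named on the slide (the elided IInputMethodManager path is omitted).
FEATURE_INSTANCES = [
    "android.app.IActivityManager",
    "android.app.IApplicationThread",
    "android.net.IConnectivityManager",
    "android.net.wifi.IWifiManager",
    "android.os.IMessenger",
]


def instance_name(data):
    """Strip the details and keep only the destination instance name."""
    match = INSTANCE_RE.search(data)
    return match.group(0) if match else None


def binder_features(records, kept_type="BC_TRANSACTION"):
    """Count how often each chosen instance is the target of a kept record."""
    counts = Counter(
        instance_name(data) for rtype, data in records if rtype == kept_type
    )
    return [counts[name] for name in FEATURE_INSTANCES]


# The first two records come from the trace table; the last two are made up.
records = [
    ("BC_TRANSACTION", "****android.app.IActivityManager************com.android.contacts****"),
    ("BR_REPLY", "**$*********value*****0*"),
    ("BC_TRANSACTION", "****android.net.IConnectivityManager****"),
    ("BC_TRANSACTION", "****android.app.IActivityManager****"),
]
print(binder_features(records))
```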
System Call-based Features
Approach:
a) Strip the parameters and retain the name.
b) Calculate the document frequency (DF) of system calls.
Low DF: the call rarely occurs.
High DF: the call is not discriminative.
Features: 13 system calls with DF ranging from 0.06 to 0.22
16
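A minimal sketch of the two steps, assuming strace-style text lines as input and treating each sample as one "document". The slide only reports that the chosen calls have DF between 0.06 and 0.22; using those two values as selection thresholds is an assumption, as are all function names.

```python
import re

# First lowercase identifier directly followed by "(", e.g. open( or sendto(.
SYSCALL_RE = re.compile(r"\b([a-z_][a-z0-9_]*)\(")


def syscall_name(line):
    """Strip the parameters and keep only the system call name."""
    match = SYSCALL_RE.search(line)
    return match.group(1) if match else None


def document_frequency(samples):
    """Map each system call to the fraction of samples that contain it."""
    counts = {}
    for lines in samples:
        names = {syscall_name(line) for line in lines} - {None}
        for name in names:
            counts[name] = counts.get(name, 0) + 1
    return {name: c / len(samples) for name, c in counts.items()}


def select_features(df, low=0.06, high=0.22):
    """Keep calls that are neither too rare nor too common."""
    return sorted(name for name, value in df.items() if low <= value <= high)


# Ten made-up samples: every one calls read, only one calls sendto.
samples = [['read(3, "", 4096) = 0'] for _ in range(10)]
samples[0].append('sendto(5, "x", 1, 0, NULL, 0) = 1')
df = document_frequency(samples)
print(select_features(df))
```

In this toy input `read` appears in every sample (DF 1.0) and is dropped as non-discriminative, while `sendto` (DF 0.1) is kept.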
Extract Features for Each Sample
Terms:
† A sample: We split the sequence of instructions into samples according to touch operations.
† A sample is suspicious if it includes at least one read behavior according to the binder calls.
Steps:
a) Judge whether a sample is suspicious.
b) Discard the sample if it is not suspicious.
c) Extract features only for suspicious samples.
Reason: Android apps are UI-oriented.
17
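The sampling steps can be sketched as follows. How a touch operation is recognized in the trace is not stated on the slide, so the sketch takes the test as a caller-supplied predicate; the usage example assumes, purely for illustration, that a record containing `MW_TOUCH_DETECTED` marks a touch and that `content://sms/` marks a read.

```python
def split_into_samples(records, is_touch):
    """Start a new sample at every record that marks a touch operation."""
    samples, current = [], []
    for record in records:
        if is_touch(record) and current:
            samples.append(current)
            current = []
        current.append(record)
    if current:
        samples.append(current)
    return samples


def suspicious_samples(samples, is_read):
    """Keep only the samples that contain at least one read behavior."""
    return [s for s in samples if any(is_read(r) for r in s)]


# Made-up records; both predicates are illustrative assumptions.
records = [
    "com.sec.multiwindow.MW_TOUCH_DETECTED first",
    "android.app.IActivityManager",
    "content://sms/ query",
    "com.sec.multiwindow.MW_TOUCH_DETECTED second",
    "android.os.IMessenger",
]
samples = split_into_samples(records, lambda r: "MW_TOUCH_DETECTED" in r)
suspicious = suspicious_samples(samples, lambda r: "content://sms/" in r)
print(len(samples), len(suspicious))
```

Only the suspicious samples would then be passed to the feature extraction step.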
Experimental Settings
Goal: Discriminate whether a suspicious sample indicates
a privacy leakage.
Baseline: TaintDroid
App set: 100 top ranking apps from Google Play
Method: We manually run each app for a few minutes;
we don’t use Monkey because of registration issues.
[Figure: each suspicious profile is classified as Leak or No Leak.]
18
Experimental Apps
App                               DevID  Location
com.wochacha                      Leak   Leak
jp.naver.line.android             Read   0
cn.com.fetion                     Leak   0
com.chinamobile.contacts.im       Leak   Leak
com.tencent.pb                    Read   0
com.sina.weibo                    Leak   0
com.airbnb.android                0      Leak
com.booking                       0      Leak
com.tripadvisor.tripadvisor       0      Leak
com.musixmatch.android.lyrify     Read   Read
com.soundcloud.android            Read   0
de.motain.iliga                   0      Read
com.sankuai.meituan               Leak   Leak
com.easygame.marblelegend         Read   0
com.zillow.android.zillowmap      Leak   Read
com.evernote                      Read   Read
com.eico.weico                    Leak   Read
com.netease.newsreader            Leak   Leak
com.zhaopin.social                Leak   Leak
com.sohu.newsclient               Leak   Leak
com.cubic.autohome                Leak   Leak
com.soufun.app                    Leak   0
com.yahoo...weather               Read   Leak
com.moji.mjweather                Leak   Leak
com.starbucks.hk                  Read   0
org.coursera.android              Leak   0
com.wonder                        Leak   Leak
com.babytree.apps.lama            Leak   0
com.skyscape.android.ui           Leak   0
com.epocrates                     Leak   0
com.ebay.mobile                   Read   Leak
com.sirma.mobile.bible.android    Leak   Leak
com.sinyee.babybus.feeling        Leak   Leak
com.etermax.preguntados.lite      0      Read
com.ss.android.article.news       Leak   Leak
com.dianping.v1                   Leak   Leak
com.yahoo...im                    Read   Leak
com.dolphin.browser.express.web   Read   Leak
org.mozilla.firefox               0      Read
com.ksmobile.cb                   Read   0
com.droidware.uninstallmaster     Read   Leak
com.lingualeo.android             0      Read
com.baidu.news                    Leak   Leak
com.tencent.news                  Leak   0
com.ifeng.news2                   Leak   0
com.wumo                          Leak   Leak
com.quanleimu.activity            Leak   Leak
com.pccw.finance                  Leak   0
com.trello                        0      Leak
sg.bigo                           Leak   Leak
com.axonlabs.hkbus                0      Read
com.tranzmate                     Read   Leak
org.wikipedia                     0      Leak
com.ijinshan.kbatterydoctor_en    Leak   0
com.groupon                       Leak   Leak
com.coupons.ciapp                 0      Leak
com.nextmedia                     Leak   Leak
com.Qunar                         Leak   Leak
cn.kuwo.kwmusichd                 Leak   0
com.banjo.android                 Leak   Leak
com.kayak.android                 Leak   Leak
net.skyscanner.android.main       0      Read
com.ik.flightherofree             0      Leak
com.flightview.flightview_free    Leak   Leak
cn.bluesky.chinesechess           Leak   Leak
com.happiplay.baccarat            Leak   0
me.soundwave.soundwave            0      Leak
com.thefancy.app                  0      0
com.wanelo.android                Read   Leak
com.mobilesrepublic.appy          Read   Leak
com.nytimes.android               Read   0
com.bigduckgames.flowbridges      0      Leak
DevID Leakage: 347 suspicious profiles from 56 apps, 139 spyware behaviors
Location Leakage: 171 suspicious profiles from 51 apps, 51 spyware behaviors
19
Experimental Result
Using Support Vector Machine and Cross Validation
            Positive  Negative  Total  Accuracy
Dev ID
  True      59        175       234    67.4%
  False     33        80        113
Location
  True      21        113       134    78.4%
  False     7         30        37

Naïve guesser with prior distribution knowledge
                Dev ID                  Location
                Accuracy  F1-Measure    Accuracy  F1-Measure
Naïve Guesser   59.6%     0%            70.2%     0%
SVM             67.4%     50.6%         78.4%     53.1%
The results justify the existence of correlation between spyware behaviors and app execution traces.
20
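As a worked check, accuracy and F1-measure can be computed from the reported counts. The sketch reads the table as true/false positives and negatives, which is an interpretation of the slide layout. Note that an F1 value computed from pooled counts need not reproduce the reported F1 figures exactly, for example if those were averaged over cross-validation folds (also an assumption).

```python
def accuracy(tp, tn, fp, fn):
    """Fraction of samples classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


def f1_measure(tp, fp, fn):
    """Harmonic mean of precision and recall; 0 when nothing is detected."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Counts as read from the results table.
dev_id = dict(tp=59, tn=175, fp=33, fn=80)
location = dict(tp=21, tn=113, fp=7, fn=30)

print(round(100 * accuracy(**dev_id), 1))    # 67.4
print(round(100 * accuracy(**location), 1))  # 78.4

# A guesser that always answers "no leak" never yields a true positive.
print(f1_measure(tp=0, fp=0, fn=51))         # 0.0
```

The zero F1-measure shows why accuracy alone flatters the naive guesser: it is right whenever there is no leak but never detects one.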
Summary
† Spyware awareness is an appropriate way for
combating privacy leakage.
† Detecting privacy leakage precisely is difficult: it requires a dynamic taint analysis approach.
† We propose to discriminate privacy leakage events through app execution traces, which include binder calls and system calls.
† We design a set of tools, and justify the correlation
between privacy leakage events and app execution
traces through real-world experiments.
21
Future Work
† Improve the performance by:
• Investigating in-app signatures
• Trying more complicated features
† Analyze the insights from the result:
• Understand more about the traces.
† Improve our profiler and method by:
• Considering multi-process
• Considering cross-app leakage
† Develop and deploy such a tool for real-world usage.
22