Smart Phone-Based Sensor Mining
www.cis.fordham.edu/wisdm or wisdmproject.com
Gary M. Weiss
Comp & Info Science Dept
Fordham University
[email protected]
Data Mining:
Extraction of knowledge from data via automated
methods
Smart phone sensor mining:
Extraction of useful knowledge from the data
generated by smart phone sensors
1/11/2012
Gary M. Weiss
ICCS 2012
What sensors are found on smart phones?
Audio sensor (microphone)
Image sensor (camera, video recorder)
Tri-Axial Accelerometer
Location sensor (GPS, cell tower, WiFi)
Infrared proximity sensor; Light sensor
Magnetic compass; Temperature sensor; Touch sensor
Virtual/calculated sensors:
▪ Proximity (via light), gravity, orientation, gyroscope
Learning about smart phone users
Security requires understanding how devices are used
The main focus of this talk is not security but what can
be learned about smart phone users
Smart phone based biometric identification
Can be considered a security application
Many news stories about abuses
Apps to spy on your spouse; iPhone location fiasco
Activity recognition (what are you doing?)
Are you walking, jogging, sitting, standing, etc.?
Biometric identification (who are you?)
Are you John Smith?
Trait identification (who are you, at a different level?)
Are you male? Are you tall? What do you weigh?
Data miners want to learn everything about you
Somehow that info will be useful
Develop useful apps, marketing leads, etc.
Many positive uses
▪ That is why NSF provided WISDM with funding for activity
recognition from the “Health and Well Being” program
But there are obvious issues with privacy and abuse
Approach to Predictive Data Mining
1. Collect labeled (sensor) training data
2. Apply data mining method to build predictive model
3. Apply predictive model to future unlabelled data
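The three steps above can be sketched as follows. This is only an illustration, not the WISDM pipeline: the two features per example (mean and standard deviation of acceleration magnitude), the data values, and the function names are all hypothetical, and a plain 1-nearest-neighbor rule stands in for the data mining method.

```python
import math

# Step 1: labeled training data. Each example is a hypothetical feature
# vector (mean, std. dev. of acceleration magnitude) plus an activity label.
training_data = [
    ((9.8, 0.2), "standing"),
    ((9.9, 0.3), "standing"),
    ((10.5, 2.5), "walking"),
    ((10.7, 2.8), "walking"),
    ((12.0, 6.0), "jogging"),
    ((12.4, 6.5), "jogging"),
]


def build_model(examples):
    """Step 2: an instance-based learner simply stores the training examples."""
    return list(examples)


def predict(model, features):
    """Step 3: give an unlabeled example the label of its nearest stored example."""
    nearest = min(model, key=lambda example: math.dist(example[0], features))
    return nearest[1]


model = build_model(training_data)
unlabeled = [(9.7, 0.25), (10.6, 2.6), (12.2, 6.2)]
predictions = [predict(model, x) for x in unlabeled]
print(predictions)
```

A real system would compute many more features per sample and would evaluate the model on data held out from training.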
Why is it useful?
Context-sensitive applications
▪ Context influences handling of phone calls or music to play
Health applications
▪ Track activity levels or detect falls in elderly
Approaches to activity recognition
Use multiple accelerometers
Use custom devices (pedometer, FitBit)
Our approach: use existing smart phones
Accelerometer data from Android phone (gravity included) for:
Walking, Jogging, Climbing Stairs, Lying Down, Sitting, Standing
Impersonal (Universal) Model
Single Model trained and used for everyone
Data Mining Method: Instance Based Learning (WEKA IB3)
72.4% Accuracy
Rows: Actual Class; columns: Predicted Class

              Walking  Jogging  Stairs  Sitting  Standing  Lying Down
Walking          2209       46     789        2         4           0
Jogging            45     1656     148        1         0           0
Stairs            412       54     869        3         1           0
Sitting            10        0      47      553        30         241
Standing            8        0      57        6       448           3
Lying Down          5        1       7      301        13         131
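The 72.4% figure can be checked directly from the impersonal-model confusion matrix: accuracy is the sum of the diagonal (correct predictions) divided by the total count. A minimal sketch, with rows taken as the actual class and columns as the predicted class; the helper names are hypothetical.

```python
labels = ["Walking", "Jogging", "Stairs", "Sitting", "Standing", "Lying Down"]

# Impersonal-model confusion matrix: rows = actual class, columns = predicted class.
confusion = [
    [2209, 46, 789, 2, 4, 0],
    [45, 1656, 148, 1, 0, 0],
    [412, 54, 869, 3, 1, 0],
    [10, 0, 47, 553, 30, 241],
    [8, 0, 57, 6, 448, 3],
    [5, 1, 7, 301, 13, 131],
]


def accuracy(matrix):
    """Fraction of all examples that lie on the diagonal."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


def per_class_recall(matrix):
    """For each actual class, the fraction of its examples predicted correctly."""
    return [row[i] / sum(row) for i, row in enumerate(matrix)]


print(f"accuracy: {accuracy(confusion):.1%}")
for label, recall in zip(labels, per_class_recall(confusion)):
    print(f"{label}: {recall:.1%}")
```

The per-class breakdown shows where the errors concentrate: Stairs is often confused with Walking, and Sitting with Lying Down.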
Personal Model: Model Built per User
Data Mining Method: Instance Based Learning (WEKA IB3)
98.4% Accuracy
Rows: Actual Class; columns: Predicted Class

              Walking  Jogging  Stairs  Sitting  Standing  Lying Down
Walking          3033        1      24        0         0           0
Jogging             4     1788       4        0         0           0
Stairs             42        4    1292        1         0           0
Sitting             0        0       4      870         2           6
Standing            5        0      11        1       509           0
Lying Down          4        0       8        7         0         442
Identification based on physical/behavioral traits
Fingerprints, DNA, iris, gait, etc.
Biometrics for everyone
Equipment smaller & cheaper (sensors + processing)
▪ Laptops currently perform face recognition
Gait-based recognition
Most work is camera-based
Some applications
device security, customization & personalization
Used for identification and authentication
Identification means predicting identity from pool of
users (36 in initial study and 200 in recent study)
Authentication is a binary class prediction
▪ Is it you or an imposter?
We evaluate walking and other activities as well as
unclassified activities
Predictions made on individual 10 sec. samples
but also combine “votes” to exploit larger samples
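Combining "votes" can be sketched as taking the most frequent per-sample prediction over a longer stretch of data. This is a minimal illustration only; the user IDs and the function name are hypothetical, and the per-sample predictions would come from whatever classifier is used on each 10 second sample.

```python
from collections import Counter


def identify_by_vote(sample_predictions):
    """Return the identity predicted most often across the individual samples.

    On a tie, Counter.most_common keeps the identity that was seen first.
    """
    counts = Counter(sample_predictions)
    user, _ = counts.most_common(1)[0]
    return user


# Hypothetical predictions for six consecutive 10 second samples from one person.
per_sample = ["user_07", "user_07", "user_12", "user_07", "user_03", "user_07"]
print(identify_by_vote(per_sample))
```

Individual samples may be misclassified, but as long as the correct identity is the most common prediction, the combined result is still correct.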
Accuracy (%), based on 10 second test samples

             Unclassified   Walk    Jog     Up   Down
J48                  72.2   84.0   83.0   65.8   61.0
Neural Net           69.5   90.9   92.2   63.3   54.5
Straw Man             4.3    4.2    5.0    6.5    4.7

Based on most frequent prediction for 5-10 minutes of data

             Unclassified   Walk    Jog       Up   Down
J48                 36/36  36/36  31/32    31/31  28/31
Neural Net          36/36  36/36  32/32  28.5/31  25/31
Authentication results even better (~90% with 10 sec samples)
Recent unpublished results demonstrate 100% accuracy with 200 users!
Soft biometrics: traits can aid with biometrics
As data miners we want to know everything
about a person
Marketing applications: ads based on sex
Inferred weight to predict calories burned
Normally think about traits as being:
Unchanging: race, skin color, eye color, etc.
Slow changing: Height, weight, etc.
But want to know everything about a person:
What they wear, how they feel, if they are tired, etc.
Have never seen this goal for mobile sensor mining
Work in early stages
Data initially collected from ~70 people, now 200
Accelerometer and survey data
Survey data includes anything we could think of that
might somehow be predictable
▪ Sex, height, weight, age, race, handedness, disability
▪ Shoe size, footwear type, size of heels, type of clothing
▪ # hours academic work , # hours exercise
Too few subjects to investigate all factors
▪ Many were not predictable (maybe with more data)
Results for IB3 classifier. For height and weight, middle categories removed.

Sex: Accuracy 71.2%
          Male  Female
Male        31       7
Female      12      16

Height: Accuracy 83.3%
          Short  Tall
Short        15     2
Tall          5    20

Weight: Accuracy 78.9%
          Light  Heavy
Light        13      2
Heavy         7     17
Security policies vary widely by OS & platform
Symbian requires properly signed keys to remove
restrictions on using certain APIs
iPhone apps have relatively strict oversight
Android OS has few restrictions and Marketplace
has essentially no oversight or restrictions
▪ WISDM project has had no problem tapping into sensors
and transmitting results. Just pay $25 for account.
Android notifies user of services
SYSTEM PERMISSIONS FOR WISDM SensorCollector
▪ Coarse location, fine location, internet access, keep from
sleeping, modify/delete USB storage
Applications routinely access sensitive services
Fandango: fine GPS location, read phone state &
identity, modify/delete USB storage, internet access
Angry Birds: identical permissions!
Notifications probably next to useless given this!
Even legitimate applications have to be
concerned with privacy & security
WISDM will encrypt data in transit, encrypt on
phone, include secure accounts & passwords, etc.
Need to ensure that any aggregated info is made
public only if it cannot be traced to an individual
Good Policies:
Make it clear what you are monitoring and storing
Provide application level control for the user
▪ Allow user to turn on/off monitoring of specific sensors
▪ If they use an option to upload the information to Facebook
then there is little privacy!
Since legitimate and illegitimate apps function
alike, no easy way to distinguish them
Could try to use only certified apps, but quite limiting
WISDM is building & deploying the Actitracker
service to track your activities in real time and
display them via a web-based interface
Useful health information and thus supported by
NSF Grant & Google faculty research award
Actitracker.com is online and should have basic
functionality shortly
WISDM research group
Current Members
▪ Anthony Alcaro, Alex Armero, Shaun Gallagher, Andrew
Grosner, Margo Flynn, Jeff Lockhart, Paul McHugh, Luigi
Patruno, Tony Pulickal, Greg Rivas, Priscilla Twum, Bethany
Wolff, Zach Wyhowanec, Jack Xue
Key Former Members
▪ Jennifer Kwapisz, Sam Moore, Shane Skowron, Alvan Wong
Funders: NSF, Google, and Fordham
For more information go to wisdmproject.com