Smart Phone-Based Sensor Mining


Gary M. Weiss
Fordham University
[email protected]

Smart phones are ubiquitous
 As of the 4th quarter of 2010, smart phone sales outpaced PC sales
 We carry them everywhere at almost all times

Smart phones are powerful
 Increasing processing power and storage space
 Filled with sensors

Smart phones include the following sensors:
▪ Tri-Axial Accelerometer
▪ Location sensor (GPS, cell tower, WiFi)
▪ Audio sensor (microphone), Image sensor (camera)
▪ Proximity, light, temperature, magnetic compass
5/17/2012
Gary M. Weiss
Einstein
2

Data mining: application of computational methods
to extract knowledge from data
 Most data mining involves inferring predictive models,
often for classification


Sensor mining: application of computational
methods to extract knowledge from sensor data
Supervised machine learning
 Obtain labeled time-series training data
 Create examples described by generated features
 Build model to predict example’s label
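The three supervised-learning steps above can be illustrated with a minimal sketch. It assumes a 3-nearest-neighbor classifier as a stand-in for whatever learner is actually used; the two-value feature vectors and their numbers are invented for illustration, and `knn_predict` is a hypothetical helper name.

```python
from collections import Counter
import math

def knn_predict(train, query, k=3):
    """Predict a label for `query` by majority vote among its k nearest examples.

    `train` is a list of (feature_vector, label) pairs.
    """
    neighbors = sorted(train, key=lambda ex: math.dist(ex[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

# Step 1 and 2: labeled examples described by generated features.
# Here each example is (mean acceleration, standard deviation); values are made up.
train = [
    ((9.8, 0.10), "standing"),
    ((9.7, 0.20), "standing"),
    ((9.9, 0.15), "standing"),
    ((10.5, 3.0), "walking"),
    ((10.8, 3.4), "walking"),
    ((11.0, 2.8), "walking"),
]

# Step 3: use the model to predict the label of a new example.
print(knn_predict(train, (10.6, 3.1)))  # walking
```

A nearest-neighbor model needs no explicit training phase, which keeps the sketch short; any classifier that maps a feature vector to a label fits the same outline.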

Three years ago we started what is now the WISDM project
 Began with a focus on activity recognition
▪ Determine what a user is doing based on accelerometer data
 Moved to an Android-based smartphone platform
 Expanded to other applications
▪ Biometric identification
▪ Identifying user characteristics (soft biometrics)
▪ Mining GPS data (project starting with Bronx Zoo)
 Current focus on Actitracker
▪ Track user activities and present info to user via the web as a
health app (NSF “Health and Well-Being” grant)


Based on Android Smartphones but could be
extended to other mobile devices
Client/Server architecture
 Smartphones are the client (they run our app)
 We have a dedicated server
 Right now raw data is sent to the server and
processing occurs there
 Data can be streamed or sent on demand
 In the future, more responsibility will move to the phone
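The difference between streaming raw data and sending it on demand can be sketched on the client side as a small buffer. This is only an illustration under assumptions: the class name `SampleBuffer`, the JSON payload layout, and the `send` callback are hypothetical and are not taken from the WISDM app.

```python
import json

class SampleBuffer:
    """Hypothetical client-side buffer for raw accelerometer samples.

    batch_size=1 approximates streaming (every sample is sent at once);
    batch_size=None holds samples until flush() is called (on demand).
    """

    def __init__(self, send, batch_size=1):
        self.send = send              # callable that transmits one payload string
        self.batch_size = batch_size
        self.pending = []

    def add(self, t_ms, x, y, z):
        """Record one raw sample and transmit if the batch is full."""
        self.pending.append({"t": t_ms, "x": x, "y": y, "z": z})
        if self.batch_size is not None and len(self.pending) >= self.batch_size:
            self.flush()

    def flush(self):
        """Send everything collected so far as one JSON payload."""
        if self.pending:
            self.send(json.dumps({"samples": self.pending}))
            self.pending = []

# Usage: collect three samples, then send them on demand.
sent = []
on_demand = SampleBuffer(sent.append, batch_size=None)
for i in range(3):
    on_demand.add(i * 50, 0.1, 9.8, 0.3)
on_demand.flush()
print(len(sent))  # 1
```

Because the client only forwards raw samples, all feature generation and classification stay on the server, which matches the architecture described above.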

Web Interface
 Users can access their data via a web interface
▪ Accessible from smartphone or full-screen computer

Security
 Secure logins and encrypted data

Resource Issues: Power
 Power is an issue if we collect GPS data, and possibly if
we collect data 24x7, but not for periodic data
collection



Measures acceleration along 3 spatial axes
Detects/measures gravity (orientation matters)
Measurement range typically -2g to +2g
 Okay for most activities but falling yields higher values
 Range & sensitivity may be adjustable

Sampling rates ~20-50 Hz
 One study found that 20 Hz is required for activity recognition
 The WISDM project found that it could not reliably sample faster
than 20 Hz (one sample every 50 ms), and this may impact activity recognition
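The numbers on this slide can be checked with a little arithmetic. The sketch below converts a sampling rate to a sample period, counts the samples in a window, and flags readings at the edge of a ±2g range. It assumes readings are in m/s² (as Android reports them) and uses 9.81 m/s² for g; the helper names are hypothetical.

```python
import math

G = 9.81  # approximate standard gravity in m/s^2 (assumption for this sketch)

def sample_period_ms(rate_hz):
    """Time between samples, in milliseconds."""
    return 1000.0 / rate_hz

def samples_per_window(rate_hz, window_s):
    """Number of samples collected in a window of window_s seconds."""
    return int(rate_hz * window_s)

def resultant(x, y, z):
    """Magnitude of the acceleration vector; about 1g when the phone is at rest."""
    return math.sqrt(x * x + y * y + z * z)

def is_saturated(x, y, z, limit_g=2.0):
    """True if any axis is at or beyond the sensor's measurement range."""
    limit = limit_g * G
    return any(abs(v) >= limit for v in (x, y, z))

print(sample_period_ms(20))        # 50.0
print(samples_per_window(20, 10))  # 200
```

At 20 Hz a 10-second window holds 200 samples per axis. A reading flagged by `is_saturated` (for example during a fall) means the true acceleration may have been larger than the value recorded.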

Activity Recognition
 Identify the activity a user is performing (walking,
jogging, sitting, etc.)

Biometric Identification
 Identify a user based on prior accelerometer data
collected from that user

Trait Identification
 Identify characteristics of a user (height, weight, age)
based on accelerometer data

Context-sensitive applications
 Handle phone calls differently depending on context
 Play music to suit your activity
 New & innovative apps to make phones smarter

Tracking & Health applications
 Track overall activity levels & generate fitness profiles
 Care of elderly
▪ Detect dangerous situations (like falling)
▪ Warn if someone with Alzheimer’s wanders outside of a given area

Accelerometer data from Android phone
 Walking
 Jogging
 Climbing Stairs
 Lying Down
 Sitting
 Standing
[Plots of accelerometer data for the activities listed above; one recovered axis label reads “Z axis”]



Six activities: walking, jogging, stairs, sitting,
standing, lying down
Labeled data collected from over 50 users
Data transformed via 10-second windows
 Accelerometer data sampled (x,y,z) every 50ms
 Features (per axis):
▪ average, standard deviation, average absolute difference from mean,
average resultant acceleration, binned distribution, time between peaks
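The transformation from one 10-second window of raw samples into a feature vector can be sketched as follows. Several details are assumptions made for illustration: 10 bins per axis (chosen because 3 axes × 14 per-axis features + 1 resultant gives 43), a simplified peak finder, and all function names. The actual WISDM feature definitions may differ.

```python
import math
from statistics import mean, pstdev

SAMPLE_MS = 50   # from the slide: one (x, y, z) sample every 50 ms
N_BINS = 10      # assumption: 10 equal-width bins per axis

def binned_distribution(values, n_bins=N_BINS):
    """Fraction of samples falling in each equal-width bin between min and max."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0   # avoid zero width for a constant signal
    counts = [0] * n_bins
    for v in values:
        counts[min(int((v - lo) / width), n_bins - 1)] += 1
    return [c / len(values) for c in counts]

def time_between_peaks_ms(values, frac=0.9):
    """Simplified heuristic: mean gap between local maxima near the top of the range."""
    lo, hi = min(values), max(values)
    threshold = lo + frac * (hi - lo)
    peaks = [i for i in range(1, len(values) - 1)
             if values[i] >= threshold
             and values[i - 1] < values[i] >= values[i + 1]]
    if len(peaks) < 2:
        return 0.0
    gaps = [b - a for a, b in zip(peaks, peaks[1:])]
    return mean(gaps) * SAMPLE_MS

def axis_features(values):
    """14 features for one axis: mean, SD, mean abs. deviation, 10 bins, peak gap."""
    m = mean(values)
    return ([m, pstdev(values), mean(abs(v - m) for v in values)]
            + binned_distribution(values)
            + [time_between_peaks_ms(values)])

def window_features(xs, ys, zs):
    """Feature vector for one window: 3 x 14 per-axis features + average resultant."""
    avg_resultant = mean(math.sqrt(x * x + y * y + z * z)
                         for x, y, z in zip(xs, ys, zs))
    return axis_features(xs) + axis_features(ys) + axis_features(zs) + [avg_resultant]

# Synthetic 10-second window (200 samples): a 1 Hz oscillation on x, constants elsewhere.
xs = [math.sin(2 * math.pi * i / 20) for i in range(200)]
ys = [0.0] * 200
zs = [9.8] * 200
print(len(window_features(xs, ys, zs)))  # 43
```

Each window becomes one labeled example, so the classifier never sees the raw time series directly.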

The 43 features are used to build a classifier
 WEKA data mining suite used, multiple techniques
 Personal, universal, hybrid models built


Architecture (for now) uses “dumb” client
Basis of the soon-to-be-released Actitracker service
 Provides web-based view of activities over time
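The three model types differ only in whose labeled data is used for training. The sketch below assumes the usual reading of the terms: a personal model trains on the target user's own data, a universal model trains on other users' data only, and a hybrid model trains on both. The record layout and the function name `training_sets` are hypothetical.

```python
def training_sets(examples, target_user, personal_train_ids):
    """Build the training set for each model type for one target user.

    examples: list of dicts with keys 'id', 'user', 'features', 'label'.
    personal_train_ids: ids of the target user's examples reserved for training
    (the rest of that user's examples are kept for testing).
    """
    own = [e for e in examples
           if e["user"] == target_user and e["id"] in personal_train_ids]
    others = [e for e in examples if e["user"] != target_user]
    return {
        "personal": own,            # only the target user's data
        "universal": others,        # everyone except the target user
        "hybrid": own + others,     # both sources combined
    }

# Tiny made-up data set: two users, feature vectors elided to a single number.
examples = [
    {"id": 1, "user": "A", "features": [0.1], "label": "walking"},
    {"id": 2, "user": "A", "features": [0.2], "label": "sitting"},
    {"id": 3, "user": "B", "features": [0.3], "label": "walking"},
    {"id": 4, "user": "B", "features": [0.4], "label": "jogging"},
]
sets = training_sets(examples, target_user="A", personal_train_ids={1})
print({name: len(rows) for name, rows in sets.items()})
# {'personal': 1, 'universal': 2, 'hybrid': 3}
```

In every case the model is tested on data from the target user that was not used for training, so the comparison isolates the effect of personalization.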

WISDM results are shown for several settings
 Personal, universal, and hybrid models
 Most results are aggregated over all users, but a few are
per user to show how performance varies by user
 Results for 6 activities (the ones shown in the plots)
Confusion matrix, 72.4% accuracy (rows: actual class; columns: predicted class)

             Walking  Jogging  Stairs  Sitting  Standing  Lying Down
Walking         2209       46     789        2         4           0
Jogging           45     1656     148        1         0           0
Stairs           412       54     869        3         1           0
Sitting           10        0      47      553        30         241
Standing           8        0      57        6       448           3
Lying Down         5        1       7      301        13         131
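The 72.4% figure can be recomputed from the confusion matrix counts. The sketch below treats rows as actual classes and columns as predicted classes, and also computes per-class accuracy and the accuracy of a straw-man strategy that always predicts the most common class. The function names are hypothetical.

```python
LABELS = ["Walking", "Jogging", "Stairs", "Sitting", "Standing", "Lying Down"]

# Rows: actual class; columns: predicted class (counts from the 72.4% matrix).
CONFUSION = [
    [2209,   46, 789,   2,   4,   0],
    [  45, 1656, 148,   1,   0,   0],
    [ 412,   54, 869,   3,   1,   0],
    [  10,    0,  47, 553,  30, 241],
    [   8,    0,  57,   6, 448,   3],
    [   5,    1,   7, 301,  13, 131],
]

def overall_accuracy(m):
    """Correct predictions (the diagonal) divided by all predictions."""
    return sum(m[i][i] for i in range(len(m))) / sum(map(sum, m))

def per_class_accuracy(m):
    """For each actual class, the fraction of its records predicted correctly."""
    return [row[i] / sum(row) for i, row in enumerate(m)]

def straw_man_accuracy(m):
    """Accuracy of always predicting the most frequent actual class."""
    return max(sum(row) for row in m) / sum(map(sum, m))

print(f"overall: {overall_accuracy(CONFUSION):.1%}")      # overall: 72.4%
print(f"straw man: {straw_man_accuracy(CONFUSION):.1%}")  # straw man: 37.7%
for label, acc in zip(LABELS, per_class_accuracy(CONFUSION)):
    print(f"{label}: {acc:.1%}")
```

The per-class values show where the errors concentrate: lying down is correct only 28.6% of the time here because most of those records are predicted as sitting.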
Confusion matrix, 98.4% accuracy (rows: actual class; columns: predicted class)

             Walking  Jogging  Stairs  Sitting  Standing  Lying Down
Walking         3033        1      24        0         0           0
Jogging            4     1788       4        0         0           0
Stairs            42        4    1292        1         0           0
Sitting            0        0       4      870         2           6
Standing           5        0      11        1       509           0
Lying Down         4        0       8        7         0         442
% of Records Correctly Classified

                   Personal              Universal        Straw
              IB3    J48     NN      IB3    J48     NN     Man
Walking      99.2   97.5   99.1     72.4   77.3   60.6    37.7
Jogging      99.6   98.9   99.9     89.5   89.7   89.9    22.8
Stairs       96.5   91.7   98.0     64.9   56.7   67.6    16.5
Sitting      98.6   97.6   97.7     62.8   78.0   67.6    10.9
Standing     96.8   96.4   97.3     85.8   92.0   93.6     6.4
Lying Down   95.9   95.0   96.9     28.6   26.2   60.7     5.7
Overall      98.4   96.6   98.7     72.4   74.9   71.2    37.7

Biometrics concerns unique identification
based on physical or behavioral traits
 Hard biometrics involves traits that are sufficient
to uniquely identify a person
▪ Fingerprints, DNA, iris, etc.
 Soft biometric traits are not sufficiently
distinctive, but may help
▪ Physical traits: Sex, age, height, weight, etc.
▪ Behavioral traits: gait, clothes, travel patterns, etc.

Numerous accelerometer-based systems exist that
use dedicated and/or multiple sensors
 See related work section of Cell Phone-Based
Biometric Identification for details

Possible uses:
▪ Phone security (e.g., to automatically unlock phone)
▪ Automatic device customization
▪ To better track people for shared devices
▪ Perhaps for secondary level of physical security

Same setup as WISDM activity recognition
 Same data collection, feature extraction, WEKA, …

Used for identification and authentication
 Identification: predicting identity from pool of users
 Authentication is binary class prediction problem

Evaluate single and mixed activities
 Evaluate using 10 sec. and several min. of test data
▪ Longer samples are classified with the “Most Frequent Prediction”

Results based on 36 users
 But results hold up in preliminary experiments with 200 users
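The “Most Frequent Prediction” strategy for longer samples can be sketched as a majority vote over the predictions made for each 10-second window. The user names and prediction list are invented for illustration, and the function name is hypothetical.

```python
from collections import Counter

def most_frequent_prediction(window_predictions):
    """Return the identity predicted most often across a sequence of windows."""
    return Counter(window_predictions).most_common(1)[0][0]

# Five minutes of data split into 10-second windows gives 30 predictions.
# Some individual windows are wrong, but the vote recovers the right user.
predictions = ["user_07"] * 22 + ["user_12"] * 5 + ["user_31"] * 3
print(len(predictions))                       # 30
print(most_frequent_prediction(predictions))  # user_07
```

Voting explains why accuracy can rise sharply with more data: the longer sample is identified correctly as long as the right user wins a plurality of the windows, even if many single-window predictions are wrong.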
Identification results, based on 10 second test samples

             Aggregate   Walk    Jog     Up   Down   Aggregate (Oracle)
J48               72.2   84.0   83.0   65.8   61.0                 76.1
Neural Net        69.5   90.9   92.2   63.3   54.5                 78.6
Straw Man          4.3    4.2    5.0    6.5    4.7                  4.3

Identification results, based on most frequent prediction for 5-10 minutes of data

             Aggregate    Walk     Jog        Up    Down   Aggregate (Oracle)
J48              36/36   36/36   31/32     31/31   28/31                36/36
Neural Net       36/36   36/36   32/32   28.5/31   25/31                36/36

Authentication results:
 Positive authentication of a user
▪ 10 second sample: ~85%
▪ Most frequent class over 5-10 min: 100%
 Negative Authentication of a user (an imposter)
▪ 10 second sample: ~96%
▪ Most frequent class over 5-10 min: 100%


Can do remarkably well with short amounts
of accelerometer data (10s – 2 min)
Since we can distinguish between the ways
different people walk, we may be able to
distinguish between different gaits

Data collected from ~70 people (now over 200)
 Accelerometer and survey data
 Survey data includes anything we could think of that
might somehow be predictable
▪ Sex, height, weight, age, race, handedness, disability
▪ Type of area grew up in {rural, suburban, urban}
▪ Shoe size, footwear type, size of heels, type of clothing
▪ Number of hours of academic work, number of hours of exercise
 Too few subjects to investigate all factors
▪ Many were not predictable (maybe with more data)
Sex (accuracy 71.2%)
           Male  Female
Male         31       7
Female       12      16

Height (accuracy 83.3%)
          Short    Tall
Short        15       2
Tall          5      20

Weight (accuracy 78.9%)
          Light   Heavy
Light        13       2
Heavy         7      17

Results for IB3 classifier. For height and weight, middle categories removed.

A wide open area for data mining research
 A marketer’s dream



Clear privacy issues
Room for creativity & insight for finding traits
Probably many interesting commercial and
research applications
 Imagine diagnosing back problems via your
mobile phone via gait analysis …

Can collect accelerometer data from patients
 On demand or in the background
 Data transmitted wirelessly or stored on the
phone for periodic download

Can extend study beyond gait
 Can monitor overall activity levels
 Can monitor daily routine

Facilitate quantitative analysis of gait
 “Fourth, although experienced clinicians assessed
gait, quantitative analysis of gait might be more
reliable” (Verghese et al. 2002)
 Accelerometer data can provide basis for gait
classification
 Can use data mining to learn a classifier for gait
▪ Just need carefully selected training data
▪ Yields consistent measure





Can look at other neurological diseases
besides non-Alzheimer’s dementia
Can try to track progression of Alzheimer’s
Note that we can monitor daily routine, travel, etc.
Smartphone can also administer surveys,
record video, provide voice prompts, etc.
Besides diagnosis, can assist people suffering
from these diseases

Gary Weiss
 Fordham University, Bronx NY 10458
 [email protected]
 http://storm.cis.fordham.edu/~gweiss/

WISDM Information
 http://www.cis.fordham.edu/wisdm/
▪ WISDM papers available: click “About” then “Publications”
 By end of summer Actitracker will allow you to track
your activities via our Android app (actitracker.com)

WISDM research group
 Current Active Members
▪ Linna AI*, Shaun Gallagher*, Andrew Grosner*, Margo
Flynn, Jeff Lockhart*, Paul McHugh*, Tony Pulickal*, Greg
Rivas*, Isaac Ronan*, Priscilla Twum, Bethany Wolff
* Working full-time on the project at Fordham over the summer
Available from: http://www.cis.fordham.edu/wisdm/publications
Kwapisz, J.R., Weiss, G.M., and Moore, S.A. 2010. Activity recognition using cell phone
accelerometers, Proceedings of the Fourth International Workshop on Knowledge Discovery from
Sensor Data, 10-18.

Kwapisz, J.R., Weiss, G.M., and Moore, S.A. 2010. Cell phone-based biometric identification,
Proceedings of the IEEE Fourth International Conference on Biometrics: Theory, Applications and
Systems.

Lockhart, J.W., Weiss, G.M., Xue, J.C., Gallagher, S.T., Grosner, A.B., and Pulickal, T.T. 2011. Design
considerations for the WISDM smart phone-based sensor mining architecture, Proceedings of
the Fifth International Workshop on Knowledge Discovery from Sensor Data, San Diego, CA.

Weiss, G.M., and Lockhart, J.W. 2011. Identifying user traits by mining smart phone accelerometer
data, Proceedings of the Fifth International Workshop on Knowledge Discovery from Sensor Data.

Weiss, G.M., and Lockhart, J.W. 2012. The impact of personalization on smartphone-based
activity recognition, Proceedings of the AAAI-12 Workshop on Activity Context Representation:
Techniques and Languages, Toronto, Canada.