Limitations with Activity Recognition Methodology and Data


LIMITATIONS WITH
ACTIVITY RECOGNITION
METHODOLOGY AND
DATA SETS
Gary Weiss
Fordham University
Jeffrey Lockhart
Cambridge University
Work supported by
National Science
Foundation Grant No.
1116124.
GENESIS OF THIS WORK
 Our WISDM (Wireless Sensor Data Mining) Lab
has been working on activity recognition for
several years
 Have focused on building and deploying a real-world
system called Actitracker
 Recent work has focused on implementing,
analyzing, and using different types of models
 When comparing our AR work with other work we
identified several key issues in methodology,
which also impact the resulting data sets
9/13/2014
HASCA 2014
OVERVIEW
Identify some methodological issues and
resulting impact on data sets
 Make people aware of these issues
 Propose mechanisms for addressing these issues
 Largest focus on model type but many other factors
are considered
 Ultimate goal is to generate more diverse data
sets and precisely label underlying assumptions
FACTORS IMPACTING ACTIVITY RECOGNITION
Model Type:
 Personal, Impersonal, Hybrid
Collection Method:
 Fully Natural, Semi-Natural, Laboratory
Data
 Number of Subjects
 Population (college, elderly, etc.)
 Traits (height, weight, income, education, …)
 Activities (running, jogging, standing, …)
 Duration (1 hour of data …)
FACTORS IMPACTING ACTIVITY RECOGNITION (CONTINUED)
 Sensors
 Type: accelerometer, gyroscope, barometer
 Sampling rate: 20 Hz, 50 Hz, …
 Number of sensors
 Location of sensors (pocket, belt, wrist, …)
 Orientation (facing up, down, in, out)
 Features
 Raw features
 Transformed, Window Size
 Results
 Accuracy
 Consistency
ANALYSIS OF AR RESEARCH
 We examined 34 published AR papers
 Many were smartphone-based
 Several papers cover multiple data sets and thus
38 data sets were analyzed
 Several papers utilized multiple model types and
hence 47 distinct models were analyzed
 Detailed analysis published in Lockhart’s MS thesis:
 Benefits of Personalized Data Mining Approaches to
Human Activity Recognition with Smartphone Sensor Data
 A table describes each of the factors listed on the prior two
slides for each data set
 Summary information described in this presentation
BACKGROUND ON MODEL TYPE
 Personal Models
 Model based on labeled data from intended user
 Requires new users to provide training data
 Our AR results show high accuracy (~98%)
 Impersonal/Universal Models
 Model based on a panel of representative users
 No training phase required; works “out of the box”
 Our AR results show modest performance (~76%)
 Hybrid Models
 Model based on panel of users that includes intended user
 Requires a training phase for user
 Our AR results much closer to personal models even though
panel includes dozens of users (~95%)
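The three model types differ only in whose labeled data ends up in the training set. The following is a minimal sketch of that distinction, not the WISDM Lab's actual code: the record layout `(user_id, features, label)`, the function names, and the toy data are all assumptions made for illustration.

```python
# Hypothetical toy data: (user_id, features, label). Not from any real data set.
records = [
    ("u1", [0.1], "walk"), ("u1", [0.9], "jog"), ("u1", [0.2], "walk"), ("u1", [0.8], "jog"),
    ("u2", [0.3], "walk"), ("u2", [0.7], "jog"), ("u2", [0.2], "walk"), ("u2", [0.9], "jog"),
    ("u3", [0.1], "walk"), ("u3", [0.6], "jog"), ("u3", [0.3], "walk"), ("u3", [0.8], "jog"),
]

def personal_split(records, user, train_fraction=0.5):
    """Personal model: train and test only on the intended user's own data."""
    own = [r for r in records if r[0] == user]
    cut = int(len(own) * train_fraction)
    return own[:cut], own[cut:]

def impersonal_split(records, user):
    """Impersonal model: train on the panel, excluding the intended user entirely."""
    train = [r for r in records if r[0] != user]
    test = [r for r in records if r[0] == user]
    return train, test

def hybrid_split(records, user, train_fraction=0.5):
    """Hybrid model: train on the panel plus part of the intended user's data."""
    own_train, own_test = personal_split(records, user, train_fraction)
    panel = [r for r in records if r[0] != user]
    return panel + own_train, own_test

personal_train, personal_test = personal_split(records, "u1")
impersonal_train, impersonal_test = impersonal_split(records, "u1")
hybrid_train, hybrid_test = hybrid_split(records, "u1")
```

In this sketch the only difference between the impersonal and hybrid training sets is the presence of the intended user's own records, which is exactly the factor the accuracy figures above turn on.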
ISSUES WITH HYBRID MODELS
 Our results show that personal models perform very
well with only small amounts of data per activity
 Little practical need for hybrid models given need for training
 Why are hybrid models often used in research
papers?
 Simple experimental setup: use cross validation on single data
set. No need to carefully partition the data.
 With n users, personal and impersonal models require n
separate partitions
 Often assumed that hybrid models approximate impersonal
models and are treated as such
 In actuality they are much closer to personal models
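To make the point about experimental setup concrete, here is a small sketch, under assumed names and made-up data, of why plain cross validation on a pooled data set produces hybrid models: every test fold shares users with its training folds. Holding out one user at a time needs n partitions for n users but keeps the test user out of training.

```python
def pooled_k_fold(records, k):
    """Plain k-fold over pooled records, ignoring which user produced each one."""
    folds = [records[i::k] for i in range(k)]
    for i in range(k):
        train = [r for j, fold in enumerate(folds) if j != i for r in fold]
        yield train, folds[i]

def leave_one_user_out(records):
    """One partition per user: train on everyone else, test on the held-out user."""
    for held_out in sorted({user for user, _ in records}):
        train = [r for r in records if r[0] != held_out]
        test = [r for r in records if r[0] == held_out]
        yield train, test

def shared_users(train, test):
    """Users whose data appears on both sides of a split."""
    return {user for user, _ in train} & {user for user, _ in test}

# Hypothetical data: 3 users with 6 labeled windows each.
records = [(f"user{u}", f"window{w}") for u in range(3) for w in range(6)]

pooled_leaks = [bool(shared_users(tr, te)) for tr, te in pooled_k_fold(records, k=3)]
grouped_leaks = [bool(shared_users(tr, te)) for tr, te in leave_one_user_out(records)]
num_partitions = len(list(leave_one_user_out(records)))
```

Here every pooled fold mixes the test users into training (a hybrid setup), while the per-user partitions never do (an impersonal setup).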
ISSUE 1: MODEL TYPE
 Hybrid model is the most popular, and authors often claim results are generalizable to new users (not true)
 In 10 of the 19 hybrid cases there were 10 or fewer users, so these are even closer to personal models (we had 59)
 Couldn’t determine model type in 6 cases; serious methodological issue
 In 53% of the cases we claim methodological issues (40% + 13%)

Analysis of 47 models from 38 data sets:

Model Type    Count   Percentage
Personal      12      26%
Impersonal    10      21%
Hybrid        19      40%
Unknown       6       13%
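The percentages reported for Issue 1 can be reproduced from the counts. This short check uses only the four counts given on the slide; the variable names are illustrative.

```python
# Counts of model types among the 47 models analyzed (from the slide).
counts = {"Personal": 12, "Impersonal": 10, "Hybrid": 19, "Unknown": 6}
total = sum(counts.values())

# Percentage of all models for each type, rounded to the nearest whole percent.
percent = {name: round(100 * n / total) for name, n in counts.items()}

# Share of models flagged as methodologically problematic: hybrid plus unknown.
flagged = round(100 * (counts["Hybrid"] + counts["Unknown"]) / total)
```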
ISSUE 2: # SUBJECTS & DIVERSITY
 Number of subjects often small
 11 studies had fewer than 5; 12 had fewer
than 10
 HASC 2010 & 2012 more users but little
data per user
 Impacts ability to evaluate
performance
 Our results show impersonal models are
very inconsistent across users
 4 studies evaluated universal models
with fewer than 8 users; only 2 had at
least 30
 Populations should also be diverse but
many studies focus on college
students; personal info should also be
provided (height, weight, etc.)
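One way to expose inconsistency across users is to report accuracy per user rather than a single pooled figure. The sketch below assumes a hypothetical `results` mapping from user id to (predicted, actual) pairs; the users and labels are invented and do not reflect the 59-user results.

```python
from statistics import mean, pstdev

def per_user_accuracy(results):
    """Accuracy for each user, given user id -> list of (predicted, actual) pairs."""
    return {
        user: sum(pred == actual for pred, actual in pairs) / len(pairs)
        for user, pairs in results.items()
    }

def summarize(accuracies):
    """Spread of per-user accuracy: a pooled mean alone would hide the weak users."""
    values = list(accuracies.values())
    return {
        "mean": mean(values),
        "min": min(values),
        "max": max(values),
        "std": pstdev(values),
    }

# Hypothetical predictions from an impersonal model for three held-out users.
results = {
    "userA": [("walk", "walk"), ("jog", "jog"), ("sit", "sit"), ("walk", "walk")],
    "userB": [("walk", "jog"), ("jog", "jog"), ("sit", "sit"), ("walk", "walk")],
    "userC": [("walk", "jog"), ("sit", "stand"), ("sit", "sit"), ("jog", "walk")],
}

accuracies = per_user_accuracy(results)
summary = summarize(accuracies)
```

With only a handful of users, the minimum, maximum, and standard deviation are themselves unreliable, which is one reason small subject counts limit what can be said about impersonal models.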
[Figure: Distribution of impersonal model performance across 59 users]
ISSUE 3: COLLECTION METHODOLOGY
 Many possible distinctions but 3 main categories:
 Fully natural: normal daily activities
 Semi-natural: operate in normal environment but may be
directed (e.g., asked to walk for 5 minutes)
 Laboratory: structured tasks in a controlled environment
 Type of collection environment should be
documented since this impacts results and ability
to replicate
 We have released an AR data set that is semi-natural and
our Actitracker data set that is fully natural (except for
self-training phase)
ISSUE 4: SENSORS
Type of sensor and number of sensors
 Usually provided: not an issue
Location
 Precise location and orientation are often not specified
 Our results indicate these factors are important
 For smartphone, which pants pocket?
 How is it oriented? Mine is almost always down and in (i.e., screen
facing thigh).
ISSUE 5: FEATURES & FEATURE GENERATION
 Usually little choice in how to
represent raw features except for
sampling rate
 Raw sensor data transformed into
multivariate records using sliding
window and summary features
 Half of studies don’t report window size
 Vast majority of smartphone AR
research only uses basic statistics
 These yield good results that appear to be
competitive with more complex features
(e.g., based on FFT info)
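The transformation from raw samples to summary records can be sketched as follows. This is an illustration under stated assumptions, not the released WISDM transformation script: the 20 Hz rate, the 2-second window, the non-overlapping windows, and the choice of mean and standard deviation per axis are all placeholders.

```python
from statistics import mean, pstdev

def window_features(samples, window_size):
    """Summarize raw (x, y, z) samples as one record per non-overlapping window."""
    records = []
    # Step by a full window so windows do not overlap; a trailing partial window is dropped.
    for start in range(0, len(samples) - window_size + 1, window_size):
        window = samples[start:start + window_size]
        record = {}
        for axis, name in enumerate("xyz"):
            values = [sample[axis] for sample in window]
            record[f"mean_{name}"] = mean(values)
            record[f"std_{name}"] = pstdev(values)
        records.append(record)
    return records

SAMPLING_RATE_HZ = 20  # assumed for illustration
WINDOW_SECONDS = 2     # assumed for illustration

# Synthetic accelerometer samples standing in for raw sensor data.
samples = [(float(i % 4), 1.0, -float(i)) for i in range(100)]
features = window_features(samples, SAMPLING_RATE_HZ * WINDOW_SECONDS)
```

Because the window size determines how many samples feed each statistic, two studies using the same raw data but different windows produce different transformed data sets, which is why an unreported window size hinders replication.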
[Figure: Distribution of window sizes for the 52% of studies that report this info]
FEATURES & FEATURE GENERATION
It is important that all AR data sets release:
 Raw data
 Transformed data, or a script to generate the transformed data
 Descriptions of higher-level features are often not sufficiently well
specified
 Our datasets include raw and transformed data
sets and recently we also released the
transformation scripts
 Interestingly, researchers found inconsistencies between
our raw and transformed data and helped us identify
several bugs
WISDM ACTIVITY RECOGNITION DATA SETS
Two main data sets
 Activity Prediction
 36 users with semi-natural data collection
 All data is labeled with activity
 Actitracker Data
 Data from our publicly available Actitracker app
 Data set will be updated periodically
 Fully natural data collection with semi-natural data
collection for self-training data
 Self-training data is labeled; remaining data is not labeled
 Available from: http://www.cis.fordham.edu/wisdm/dataset.php
CONCLUSIONS
 All activity recognition research should clearly
describe relevant factors and describe
experimental methodology
 Propose a list of factors/issues to include
 Many existing studies do not provide important
information
 Highlight role of model type
 Show that many studies do not specify model type or use
hybrid models
 Hybrid models are inappropriate in most cases and many
studies assume they approximate impersonal/universal
models, which is contradicted by our research
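One possible way to act on the proposed list of factors is a structured description that accompanies each study or data set, so that unreported factors are visible at a glance. The sketch below is hypothetical: the class name, field names, and example values are assumptions chosen to mirror the factors listed earlier in the talk, not a format the authors define.

```python
from dataclasses import dataclass, fields
from typing import List, Optional

@dataclass
class ARStudyDescription:
    """Hypothetical checklist of factors an AR study should report."""
    model_type: Optional[str] = None           # personal, impersonal, or hybrid
    collection_method: Optional[str] = None    # fully natural, semi-natural, laboratory
    num_subjects: Optional[int] = None
    population: Optional[str] = None           # e.g., college students, elderly
    activities: Optional[List[str]] = None
    sensor_types: Optional[List[str]] = None
    sampling_rate_hz: Optional[float] = None
    sensor_location: Optional[str] = None      # e.g., which pocket
    sensor_orientation: Optional[str] = None
    window_size_seconds: Optional[float] = None

def unreported(description):
    """Names of the factors the study left unspecified."""
    return [f.name for f in fields(description) if getattr(description, f.name) is None]

# Made-up example of a sparsely documented study.
study = ARStudyDescription(
    model_type="hybrid",
    num_subjects=8,
    sensor_types=["accelerometer"],
    sampling_rate_hz=20.0,
)
missing = unreported(study)
```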
ACKNOWLEDGEMENTS
Material based on Jeff Lockhart’s MS Thesis
Activity Recognition research was supported by
all WISDM Lab members
Funding provided by NSF Grant 1116124
MORE INFORMATION
 Information available from wisdmproject.com
 Papers available under “About: Publications” tab
 Includes Jeff’s MS Thesis
 Jeff Lockhart, Gary Weiss (2014). The Benefits of
Personalized Smartphone-Based Activity Recognition Models,
In Proc. SIAM International Conference on Data Mining,
Society for Industrial and Applied Mathematics, Philadelphia,
PA, 614-622.
 Info about our app available from actitracker.com
 App available for download from Google Play
Feel free to download our data sets and ask us
about our data