Limitations with Activity Recognition Methodology and Data
LIMITATIONS WITH
ACTIVITY RECOGNITION
METHODOLOGY AND
DATA SETS
Gary Weiss
Fordham University
Jeffrey Lockhart
Cambridge University
Work supported by
National Science
Foundation Grant No.
1116124.
GENESIS OF THIS WORK
Our WISDM (Wireless Sensor Data Mining) Lab
has been working on activity recognition for
several years
Have focused on building and deploying a real-world
system called Actitracker
Recent work has focused on implementing,
analyzing, and using different types of models
When comparing our AR work with other work we
identified several key issues in methodology,
which also impact the resulting data sets
9/13/2014
HASCA 2014
OVERVIEW
Identify some methodological issues and
resulting impact on data sets
Make people aware of these issues
Propose mechanisms for addressing these issues
Largest focus on model type but many other factors
are considered
Ultimate goal is to generate more diverse data
sets and precisely label underlying assumptions
FACTORS IMPACTING ACTIVITY RECOGNITION
Model Type:
Personal, Impersonal, Hybrid
Collection Method:
Fully Natural, Semi-Natural, Laboratory
Data
Number of Subjects
Population (college, elderly, etc.)
Traits (height, weight, income, education, …)
Activities (running, jogging, standing, …)
Duration (1 hour of data …)
FACTORS IMPACTING ACTIVITY RECOGNITION
Sensors
Type: accelerometer, gyroscope, barometer
Sampling rate: 20 Hz, 50 Hz, …
Number of sensors
Location of sensors (pocket, belt, wrist, …)
Orientation (facing up, down, in, out)
Features
Raw features
Transformed, Window Size
Results
Accuracy
Consistency
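One way to make these factors explicit is to ship a small machine-readable description with each data set. The sketch below is only an illustration: the class name `StudyDescription`, its field names, and the example values are hypothetical and do not describe any real study or the WISDM data sets.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class StudyDescription:
    """Hypothetical checklist of the factors listed on these slides."""
    model_type: Optional[str] = None          # personal / impersonal / hybrid
    collection_method: Optional[str] = None   # fully natural / semi-natural / laboratory
    num_subjects: Optional[int] = None
    population: Optional[str] = None
    activities: Optional[List[str]] = None
    sensor_types: Optional[List[str]] = None
    sampling_rate_hz: Optional[float] = None
    sensor_location: Optional[str] = None
    sensor_orientation: Optional[str] = None
    window_seconds: Optional[float] = None


def unreported(study: StudyDescription) -> List[str]:
    """Return the names of the factors that were left unspecified."""
    return [f.name for f in fields(study) if getattr(study, f.name) is None]


# Made-up example values, for illustration only.
example = StudyDescription(
    model_type="hybrid",
    num_subjects=8,
    sensor_types=["accelerometer"],
    sampling_rate_hz=20.0,
)
missing = unreported(example)
print("Unreported factors:", missing)
```

A check like `unreported` could be run before releasing a data set so that omitted factors are at least a deliberate choice.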
ANALYSIS OF AR RESEARCH
We examined 34 published AR papers
Many were smartphone-based
Several papers cover multiple data sets and thus
38 data sets were analyzed
Several papers utilized multiple model types and
hence 47 distinct models were analyzed
Detailed analysis published in Lockhart’s MS thesis:
Benefits of Personalized Data Mining Approaches to
Human Activity Recognition with Smartphone Sensor Data
A table describes each of the factors listed on the prior two
slides for each data set
Summary information described in this presentation
BACKGROUND ON MODEL TYPE
Personal Models
Model based on labeled data from intended user
Requires new users to provide training data
Our AR results show high accuracy (~98%)
Impersonal/Universal Models
Model based on a panel of representative users
No training phase required; works “out of the box”
Our AR results show modest performance (~76%)
Hybrid Models
Model based on panel of users that includes intended user
Requires a training phase for user
Our AR results are much closer to personal models (~95%) even
though the panel includes dozens of users
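The three model types differ only in whose data ends up in the training set. The sketch below illustrates that distinction under stated assumptions: the record format `(user_id, window_index)`, the function names, and the choice of a simple contiguous split of the intended user's data are all hypothetical, not the procedure used in our experiments.

```python
from typing import Dict, List, Tuple

# Hypothetical record: (user_id, window_index). Real records would also
# carry features and an activity label.
Record = Tuple[str, int]
Data = Dict[str, List[Record]]


def personal_split(data: Data, user: str, n_train: int):
    """Personal model: train and test on the intended user's data only."""
    own = data[user]
    return own[:n_train], own[n_train:]


def impersonal_split(data: Data, user: str):
    """Impersonal model: train on all other users, test on the intended user."""
    train = [r for u, recs in data.items() if u != user for r in recs]
    return train, list(data[user])


def hybrid_split(data: Data, user: str, n_train: int):
    """Hybrid model: train on the panel plus part of the intended user's data."""
    panel, _ = impersonal_split(data, user)
    own_train, own_test = personal_split(data, user, n_train)
    return panel + own_train, own_test


# Toy data: three users with ten records each.
data = {u: [(u, i) for i in range(10)] for u in ("u1", "u2", "u3")}
p_train, p_test = personal_split(data, "u1", 6)
i_train, i_test = impersonal_split(data, "u1")
h_train, h_test = hybrid_split(data, "u1", 6)
print(len(p_train), len(i_train), len(h_train))
```

Note that only the impersonal split keeps the test user entirely out of training, which is what a claim of generalizing to new users requires.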
ISSUES WITH HYBRID MODELS
Our results show that personal models perform really
well with only small amounts of data per activity
Little practical need for hybrid models, since they also require user training
Why are hybrid models often used in research
papers?
Simple experimental setup: use cross validation on single data
set. No need to carefully partition the data.
With n users, personal and impersonal models require n
separate partitions
Often assumed that hybrid models approximate impersonal
models and are treated as such
In actuality they are much closer to personal models
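The point about experimental setup can be checked directly: ordinary cross validation over pooled data places the same users in both training and test folds, whereas a leave-one-user-out scheme (one partition per user) does not. The following is a minimal sketch on toy data; the helper names and the record format `(user_id, window_index)` are assumptions made for illustration.

```python
import random


def pooled_kfold(records, k, seed=0):
    """Plain k-fold cross validation over pooled records (a hybrid setup)."""
    rng = random.Random(seed)
    shuffled = records[:]
    rng.shuffle(shuffled)
    folds = [shuffled[i::k] for i in range(k)]
    for i in range(k):
        train = [r for j, fold in enumerate(folds) if j != i for r in fold]
        yield train, folds[i]


def leave_one_user_out(records):
    """One partition per user: that user is held out (an impersonal setup)."""
    for held_out in sorted({u for u, _ in records}):
        train = [r for r in records if r[0] != held_out]
        test = [r for r in records if r[0] == held_out]
        yield train, test


def shared_users(train, test):
    """Users that appear in both the training and the test partition."""
    return {u for u, _ in train} & {u for u, _ in test}


# Toy data: 5 users with 40 records each.
records = [(f"user{u}", i) for u in range(5) for i in range(40)]
pooled_overlap = [len(shared_users(tr, te)) for tr, te in pooled_kfold(records, k=10)]
louo_overlap = [len(shared_users(tr, te)) for tr, te in leave_one_user_out(records)]
print("pooled k-fold, shared users per fold:", pooled_overlap)
print("leave-one-user-out, shared users per fold:", louo_overlap)
```

Any nonzero count in the pooled case means the evaluated model has already seen data from its test users, so its accuracy should not be reported as an impersonal result.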
ISSUE 1: MODEL TYPE
Analysis of 47 models from 38 data sets
Model Type    Count    Percentage
Personal      12       26%
Impersonal    10       21%
Hybrid        19       40%
Unknown       6        13%
Hybrid model most popular and authors often claim results
generalizable to new users (not true)
In 10 of 19 cases 10 or fewer users so even closer to
personal models (we had 59)
Couldn’t determine model type in 6 cases; serious
methodological issue
53% of the cases we claim methodological issues (40% + 13%)
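The percentages follow directly from the counts. As a quick check (the variable names here are just for illustration):

```python
# Counts of model types among the 47 models analyzed (from the table).
counts = {"Personal": 12, "Impersonal": 10, "Hybrid": 19, "Unknown": 6}
total = sum(counts.values())

# Share of each model type, rounded to the nearest whole percent.
percentages = {name: round(100 * n / total) for name, n in counts.items()}

# Cases flagged as methodologically problematic: hybrid plus unknown.
flagged = round(100 * (counts["Hybrid"] + counts["Unknown"]) / total)

print(total, percentages, flagged)
```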
ISSUE 2: # SUBJECTS & DIVERSITY
Number of subjects often small
11 studies had fewer than 5; 12 had fewer
than 10
HASC 2010 & 2012 more users but little
data per user
Impacts ability to evaluate
performance
Our results show impersonal models are
very inconsistent across users
4 studies evaluated universal models
with fewer than 8 users; only 2 had at
least 30
Populations should also be diverse but
many studies focus on college
students; personal info should also be
provided (height, weight, etc.)
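Inconsistency across users is only visible when accuracy is reported per user rather than as a single pooled number. The sketch below shows one way to compute this; the function name, the tuple format, and the tiny prediction list are made up for illustration and are not our results.

```python
from collections import defaultdict
from statistics import mean, pstdev


def per_user_accuracy(predictions):
    """predictions: iterable of (user_id, true_label, predicted_label)."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for user, truth, predicted in predictions:
        totals[user] += 1
        hits[user] += int(truth == predicted)
    return {user: hits[user] / totals[user] for user in totals}


# Made-up predictions for two hypothetical users.
predictions = [
    ("a", "walk", "walk"), ("a", "jog", "jog"),
    ("a", "sit", "sit"), ("a", "stand", "stand"),
    ("b", "walk", "jog"), ("b", "jog", "jog"),
    ("b", "sit", "stand"), ("b", "stand", "stand"),
]
acc = per_user_accuracy(predictions)
overall = mean(list(acc.values()))    # average of the per-user accuracies
spread = pstdev(list(acc.values()))   # how much users differ from each other
print(acc, overall, spread)
```

With only a handful of users, the spread estimate itself is unreliable, which is one reason small subject counts limit what can be concluded about impersonal models.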
Figure: Distribution of impersonal model performance across 59 users
ISSUE 3: COLLECTION METHODOLOGY
Many possible distinctions but 3 main categories:
Fully natural: normal daily activities
Semi-natural: operate in normal environment but may be
directed (e.g., asked to walk for 5 minutes)
Laboratory: structured tasks in a controlled environment
Type of collection environment should be
documented since this impacts results and ability
to replicate
We have released an AR data set that is semi-natural and
our Actitracker data set that is fully natural (except for
self-training phase)
ISSUE 4: SENSORS
Type of sensor and number of sensors
Usually provided: not an issue
Location
Precise location and orientation are often not specified
Our results indicate these factors are important
For smartphone, which pants pocket?
How is it oriented? Mine is almost always down and in (i.e., screen
facing thigh).
ISSUE 5: FEATURES & FEATURE GENERATION
Usually little choice in how to
represent raw features except for
sampling rate
Raw sensor data transformed into
multivariate records using sliding
window and summary features
Half of studies don’t report window size
Vast majority of smartphone AR
research only uses basic statistics
These yield good results, which appear to be
competitive with more complex features
(e.g., based on FFT info)
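The sliding-window transformation described above can be sketched as follows. This is an illustration, not our released transformation script: the 10-second window, the non-overlapping step, the particular statistics, and the synthetic samples are all assumptions (20 Hz is one of the example sampling rates mentioned earlier).

```python
from statistics import mean, pstdev


def sliding_windows(samples, window_size, step):
    """Yield consecutive windows of raw (x, y, z) samples."""
    for start in range(0, len(samples) - window_size + 1, step):
        yield samples[start:start + window_size]


def basic_features(window):
    """Summarize one window with basic per-axis statistics."""
    xs, ys, zs = zip(*window)
    features = {}
    for name, axis in (("x", xs), ("y", ys), ("z", zs)):
        features[f"{name}_mean"] = mean(axis)
        features[f"{name}_std"] = pstdev(axis)
        features[f"{name}_min"] = min(axis)
        features[f"{name}_max"] = max(axis)
    return features


SAMPLING_RATE_HZ = 20   # example rate; actual rates vary by study
WINDOW_SECONDS = 10     # assumed window length, for illustration only
window_size = SAMPLING_RATE_HZ * WINDOW_SECONDS

# Synthetic accelerometer samples: 30 seconds of (x, y, z) readings.
samples = [(float(i % 7), float(i % 5), 9.8) for i in range(600)]
records = [
    basic_features(w)
    for w in sliding_windows(samples, window_size, step=window_size)
]
print(len(records), sorted(records[0]))
```

Because both the window length and the step change the resulting records, reporting them is necessary for anyone trying to reproduce the transformed data.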
Figure: Distribution of window sizes for the 52% of studies
that report this info
FEATURES & FEATURE GENERATION
Important that all AR data sets:
Release raw data
Release transformed data or a script to generate the transformed data
Descriptions of higher level features are often not sufficiently well
specified
Our data sets include both raw and transformed
data, and recently we also released the
transformation scripts
Interestingly, researchers found inconsistencies between
our raw and transformed data and helped us identify
several bugs
WISDM ACTIVITY RECOGNITION DATA SETS
Two main data sets
Activity Prediction
36 users with semi-natural data collection
All data is labeled with activity
Actitracker Data
Data from our publicly available Actitracker app
Data set will be updated periodically
Fully natural data collection with semi-natural data
collection for self-training data
Self-training data is labeled; remaining data is not labeled
Available from: http://www.cis.fordham.edu/wisdm/dataset.php
CONCLUSIONS
All activity recognition research should clearly
describe relevant factors and describe
experimental methodology
Propose a list of factors/issues to include
Many existing studies do not provide important
information
Highlight role of model type
Show that many studies do not specify model type or use
hybrid models
Hybrid models are inappropriate in most cases and many
studies assume they approximate impersonal/universal
models, which is contradicted by our research
ACKNOWLEDGEMENTS
Material based on Jeff Lockhart’s MS Thesis
Activity Recognition research was supported by
all WISDM Lab members
Funding provided by NSF Grant 1116124
MORE INFORMATION
Information available from wisdmproject.com
Papers available under “About: Publications” tab
Includes Jeff’s MS Thesis
Jeff Lockhart, Gary Weiss (2014). The Benefits of
Personalized Smartphone-Based Activity Recognition Models,
In Proc. SIAM International Conference on Data Mining,
Society for Industrial and Applied Mathematics, Philadelphia,
PA, 614-622.
Info about our app available from actitracker.com
App available for download from Google Play
Feel free to download our data sets and ask us
about our data