Transcript: Slide 1
Semantic Modeling
Elaborating Sensor Data using Temporal and Spatial
Commonsense Reasoning
+
Mining Models of Human Activities from the Web
Intelligence-Based System Applications
November 2006, Jun-Ki Min (민준기)
Agenda
B. Morgan and P. Singh, “Elaborating Sensor Data using Temporal and Spatial Commonsense Reasoning,” BSN 2006.
The Problem Space
LifeNet: A First-Person Model
The Plug Sensor Network
M. Perkowitz, et al., “Mining Models of Human Activities from the Web,” WWW 2004.
Introduction
Proposed Technique
Evaluation
Summary and Future Work
The Problem Space
Two distinct directions for research
Human-out (This paper)
Telephone
Technology-in (Much sensor network research)
Text messaging on cell phones
Three topics
LifeNet probabilistic human model
The Plug sensor network
An experimental design for evaluation of the LifeNet
learning method
LifeNet : A First-Person Model
First-person common-sense inference model
OpenMind Common Sense, ConceptNet, The PlaceLab data,
Honda’s indoor common sense data
Attempts to anticipate and predict what humans do in the
world
All of the reasoning in LifeNet is based on probabilistic
propositional logic
“I am washing my hair” before “my hair is clean”
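A minimal sketch of how a temporal relation such as the one above could be stored and queried. This is not LifeNet's actual representation: the `TemporalRule` class, the `predict_later` helper, and the probability value 0.8 are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class TemporalRule:
    """'earlier' tends to hold before 'later', with some probability."""
    earlier: str
    later: str
    probability: float  # illustrative value, not taken from LifeNet


# Hypothetical rule base built from the slide's example.
RULES = [TemporalRule("I am washing my hair", "my hair is clean", 0.8)]


def predict_later(observed: str, rules: List[TemporalRule]) -> List[Tuple[str, float]]:
    """Return propositions expected to hold after `observed`, most probable first."""
    hits = [(rule.later, rule.probability) for rule in rules if rule.earlier == observed]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


print(predict_later("I am washing my hair", RULES))
```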
The Plug Sensor Network
Used both for learning common sense and for
recognizing and predicting human behavior
Monitors how individuals interact with their physical
environment
Nine sensor modalities, including sound, vibration,
brightness, current, wall voltage, and acceleration
Agenda
B. Morgan and P. Singh, “Elaborating Sensor Data using Temporal and Spatial Commonsense Reasoning,” BSN 2006.
The Problem Space
LifeNet: A First-Person Model
The Plug Sensor Network
M. Perkowitz, et al., “Mining Models of Human Activities from the Web,” WWW 2004.
Introduction
Proposed Technique
Evaluation
Summary and Future Work
Introduction: Recognizing Human Activities
Applications include activity-based actuation
Dimming lights when a video is being watched
Providing directions for someone using unfamiliar facilities
etc.
Ubiquitous, proactive, disappearing computing
Computers have to understand people’s needs by observing
their physical activities (and to act autonomously)
The cost of developing recognition infrastructure is too high
Even small classes of activities are hard to recognize
A broadly applicable system should be general-purpose
and easy to use
Motivation
Introduction
Vision based systems
None have reported detecting more than tens of activities in
practice
The features robustly detectable from vision are coarse
They represent relationships between “blobs” in the image
rather than specific objects
Each activity is expensive to model
Learning of the models
The developers define the structure of the possible models
System tunes the parameters of the model based on examples
from the user
The user is expected to label the patterns
The variety of activities is quite restricted
Proposed Technique
RFID (Radio Frequency Identification)
Cheap: postage-stamp-sized tags at about forty cents each
Wireless and battery free
Activity modeling
Define an activity in terms of the probability and sequence
of the objects involved
Generate the models by translating textual definitions
Structured like recipes
Produced automatically by mining appropriate web sites
Mining models is part of a larger activity recognition system,
PROACT (Proactive Activity Toolkit)
Usage Model
Proposed Technique
Assumes that interesting objects in the environment
contain RFID tags (tens ~ hundreds)
Making a database entry mapping the tag ID to a name
Within a few years, many household objects may be RFID-tagged
before purchase, thus eliminating the overhead of tagging
Medium-range readers (Tag-detecting Gloves) and
Long-range readers (Run robots, Carts, …)
PROACT uses the sequence and timing of the objects involved to
deduce what activity is happening
Likelihood of various activities, details of those activities,
degree of certainty, etc…
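A rough sketch of the usage model above: a small database maps tag IDs to object names, and raw tag reads become a time-ordered sequence of named objects with the gaps between them. The tag IDs, the `TagRead` type, and the `to_object_events` helper are hypothetical and are not part of PROACT.

```python
from typing import Dict, Iterable, List, NamedTuple, Tuple


class TagRead(NamedTuple):
    timestamp: float  # seconds since the start of the trace
    tag_id: str


# Hypothetical database entries mapping tag IDs to object names.
TAG_NAMES: Dict[str, str] = {
    "E200-0001": "cup",
    "E200-0002": "kettle",
    "E200-0003": "tea bag",
}


def to_object_events(reads: Iterable[TagRead],
                     tag_names: Dict[str, str] = TAG_NAMES) -> List[Tuple[str, float]]:
    """Return (object name, seconds since previous known object) in time order."""
    events = []
    previous_time = None
    for read in sorted(reads):  # NamedTuples sort by timestamp first
        name = tag_names.get(read.tag_id)
        if name is None:
            continue  # tag with no database entry: ignore it
        gap = 0.0 if previous_time is None else read.timestamp - previous_time
        events.append((name, gap))
        previous_time = read.timestamp
    return events


reads = [TagRead(12.0, "E200-0001"), TagRead(3.0, "E200-0002"), TagRead(5.0, "FFFF-0000")]
print(to_object_events(reads))
```

The resulting sequence of objects and gaps is the kind of input an inference engine could consume.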
System Overview
Proposed Technique
PROACT provides an activity viewer for debugging
Real-time view of activities in progress
The sensor data seen
How the belief in each activity changes with the data
Inference Engine converts the activity models produced
by the mining engine into Dynamic Bayesian Networks
D. Patterson, L. Liao, D. Fox, H. Kautz, “Inferring High-Level
Behavior from Low-Level Sensors,” Ubicomp 2003.
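To illustrate how belief in each activity can change as sensor data arrives, here is a deliberately simplified sketch. It is a single-step Bayesian update over activities, not the Dynamic Bayesian Network inference that PROACT uses; the `update_beliefs` function, the `floor` parameter, and all probability values are assumptions.

```python
from typing import Dict

# Hypothetical per-activity object probabilities (illustrative values only).
OBJECT_PROBS: Dict[str, Dict[str, float]] = {
    "making tea": {"cup": 0.3, "kettle": 0.6},
    "brushing teeth": {"toothbrush": 0.9, "cup": 0.2},
}


def update_beliefs(beliefs: Dict[str, float],
                   object_probs: Dict[str, Dict[str, float]],
                   observed_object: str,
                   floor: float = 0.01) -> Dict[str, float]:
    """Multiply each prior by P(object | activity) and renormalize.

    `floor` is a small assumed probability for objects a model does not mention,
    so one unexpected object does not drive a belief to exactly zero.
    """
    unnormalized = {
        activity: prior * object_probs[activity].get(observed_object, floor)
        for activity, prior in beliefs.items()
    }
    total = sum(unnormalized.values())
    return {activity: value / total for activity, value in unnormalized.items()}


beliefs = {"making tea": 0.5, "brushing teeth": 0.5}
beliefs = update_beliefs(beliefs, OBJECT_PROBS, "kettle")
print(beliefs)
```

Seeing the kettle shifts most of the belief toward "making tea". A real DBN would also account for step order and duration.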
Sensors and Models
Proposed Technique
Sensors
Use two different kinds of RFID readers
Long-range reader (mobile robot): map the location of objects
Short-range reader (glove): determine the objects that are
touched
Models
Each model (activity) is a sequence of steps s_1, ..., s_n
Each step s_i has an optional duration t_i and a set of objects
o_ij involved, each with a probability p_ij
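The model structure described above can be written down as a small data structure. This is a sketch, not PROACT's own format: the class and field names are assumptions, and apart from the 0.3 for "cup" (the slides' making-tea example) the probabilities and the duration are made up.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Step:
    """One step s_i of an activity."""
    objects: Dict[str, float]         # object o_ij -> probability p_ij
    duration: Optional[float] = None  # optional duration t_i, in seconds


@dataclass
class ActivityModel:
    """An activity: a label plus an ordered sequence of steps s_1 ... s_n."""
    label: str
    steps: List[Step] = field(default_factory=list)


making_tea = ActivityModel(
    label="making tea",
    steps=[
        Step(objects={"kettle": 0.6, "water": 0.5}, duration=180.0),  # illustrative values
        Step(objects={"cup": 0.3, "tea bag": 0.4}),                   # no duration given
    ],
)
print(making_tea)
```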
The Model Extractor
Proposed Technique
Builds formal models of activities using directions
Directions are written in natural language by humans
How-to (ehow.com), recipes (epicurious.com), training
manuals, protocols, etc.
Syntactic structure of directions
1. A title t for the activity
2. A textual list of steps r_1, ..., r_m. Each step r_i has:
Possibly a special keyword delimiting a duration d_i
What to do during the step: a subset of the objects and a duration
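As a sketch of the "special keyword delimiting duration" idea, the snippet below pulls a duration out of a step of the form "... for N minutes". The regular expression, the unit table, and the function name are assumptions; the slides do not say which keywords the model extractor actually looks for.

```python
import re
from typing import Optional

# Assumed pattern: the keyword "for" followed by a number and a time unit.
DURATION = re.compile(r"\bfor (\d+) (second|minute|hour)s?\b")
SECONDS_PER_UNIT = {"second": 1, "minute": 60, "hour": 3600}


def extract_duration_seconds(step_text: str) -> Optional[int]:
    """Return the duration mentioned in a step, in seconds, or None if absent."""
    match = DURATION.search(step_text.lower())
    if match is None:
        return None
    amount, unit = int(match.group(1)), match.group(2)
    return amount * SECONDS_PER_UNIT[unit]


print(extract_duration_seconds("Steep the tea bag for 3 minutes."))
print(extract_duration_seconds("Pour the water into the cup."))
```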
Proposed Technique
Converting Directions to Activity Models
Key steps
1. Labeling
Set label of the mined model to title of the directions
2. Parsing steps
Duration: Gaussian with mean = d, stdev = S(d, i, l)
Objects O_i and probabilities P
3. Tagged object filtering
For example (Google conditional probabilities),
[“making tea”] has 24,200 matches, and
[“making tea” cup] has 7,340 matches, so the
conditional probability of a cup being involved in
making tea is 7,340/24,200 ≈ 0.3
Functions
Object
Object extraction: WordNet ontology
Noun-phrase extraction: QTag tagger
Probability
Fixed probabilities
Google conditional probabilities (GCP)
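The Google conditional probability (GCP) is a ratio of two search hit counts. The sketch below reproduces the arithmetic with the slide's counts for "making tea" and "cup"; the function name is an assumption, and the hit counts are passed in directly rather than fetched from a search engine.

```python
def google_conditional_probability(activity_hits: int, activity_and_object_hits: int) -> float:
    """Estimate P(object | activity) as hits(activity + object) / hits(activity)."""
    if activity_hits <= 0:
        raise ValueError("activity_hits must be positive")
    return activity_and_object_hits / activity_hits


# Counts from the slide: ["making tea"] -> 24,200 and ["making tea" cup] -> 7,340.
p_cup = google_conditional_probability(24_200, 7_340)
print(round(p_cup, 2))
```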
Example
Proposed Technique
Evaluation
Mined models
ehow.com: 2300 directions
ffts.com: 400 recipes
epicurious.com: 18,600 recipes
Three strategies to approximate comprehensive
evaluation
Human activity-trace recognition
Activities of Daily Living (ADLs)
Inter-corpus consistency
Cookie-making recipes
Intra-corpus distinguishability
Distinguishability within activity domains
Distinguishability
Evaluation
Human and inter-corpus trace recognition
ADLs domain
Many objects were untagged, many tag readings were missed, and activities were interleaved
Models were not perfect
Cookie domain
The same recipe can be structured quite differently
For some of the recipes, there is no counterpart in the other
corpus
Impact of techniques on accuracy
Evaluation
ADLs
The domain is fairly sparse, with many activities involving only
a few objects
Cookie domain
Each activity model involves many more objects
Evaluation
Impact of techniques on compactness
Summary and Future Work
Introduced the idea of mining activity-detection models
from the web
Future work
Perform a more comprehensive evaluation
Improve the effectiveness of mined models
Include location
Handle synonymous words
Synsets (collections of synonymous words) can be extracted
from WordNet