Powerpoint template for scientific posters (Swarthmore


Mining Sensor Data in Smart Environment for Temporal Activity Prediction
Vikramaditya R. Jakkula & Diane J. Cook
Washington State University
First International Workshop on Knowledge Discovery from Sensor Data (Sensor-KDD '07)
Temporal Relations
Introduction
We use Allen’s temporal relations [3] to identify temporal relations among the resident's activities of daily life.
Allen’s relations form the basic representation of temporal intervals; combined with constraints, they become a powerful method of expressing the expected temporal ordering of events in a smart environment.
In this poster we consider the problem of activity prediction based on the discovery and application of temporal relations.
“It is common to describe
scenarios using time intervals
rather than time points”
- James F. Allen
Food
Water
Smart Home Goals:
Figure 3. Smart home scenario illustrated using temporal relations.
Maximum Comfort and Security
Reminder Assistance
• Reminder system based on temporal relations.
Anomaly Detection
• If pills are to be taken “After” food, a violation of this ordering can be detected.
Temporal Relations Data
Interval Data
Algorithm: Temporal Interval Analyzer
Input: data timestamp, event name and state
Repeat
    While [Event && Event + 1 found]
        Find paired “ON” or “OFF” events in data to determine temporal range.
        Read next event and find temporal range.
        Identify relation type between event pair from possible relation types (see Table 1).
        Record relation type and related data.
        Increment Event Pointer
Loop until End of Input.
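The interval-pairing part of the Temporal Interval Analyzer above might be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: it assumes an ON event opens an interval and the next OFF event from the same sensor closes it, and that timestamps are month/day/year. The names `parse_ts` and `extract_intervals` and the sample events are hypothetical.

```python
from datetime import datetime

# Assumed timestamp layout (month/day/year, 12-hour clock), as in the poster samples.
TS_FORMAT = "%m/%d/%Y %I:%M:%S %p"


def parse_ts(text):
    """Parse a timestamp such as '3/3/2003 11:18:00 AM'."""
    return datetime.strptime(text, TS_FORMAT)


def extract_intervals(events):
    """Pair ON/OFF events per sensor into (sensor_id, start, end) intervals.

    events: iterable of (timestamp_text, sensor_id, state) in time order.
    Assumption: ON opens an interval, the next OFF of that sensor closes it.
    """
    open_since = {}   # sensor_id -> start time of the currently open interval
    intervals = []
    for ts_text, sensor, state in events:
        ts = parse_ts(ts_text)
        if state.upper() == "ON":
            open_since.setdefault(sensor, ts)  # keep the earliest unmatched ON
        elif state.upper() == "OFF" and sensor in open_since:
            intervals.append((sensor, open_since.pop(sensor), ts))
    return intervals


# Hypothetical events for illustration only (not taken from the datasets).
events = [
    ("3/3/2003 11:18:00 AM", "E16", "ON"),
    ("3/3/2003 11:20:00 AM", "G12", "ON"),
    ("3/3/2003 11:22:00 AM", "G12", "OFF"),
    ("3/3/2003 11:23:00 AM", "E16", "OFF"),
]
intervals = extract_intervals(events)
```

Intervals are emitted in the order they close, so a short interval nested inside a longer one appears first.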
Timestamp              Sensor State  Sensor ID
3/3/2003 11:18:00 AM   OFF           E16
3/3/2003 11:23:00 AM   ON            G12
Identify Time Intervals
Date        Sensor ID  Start Time  End Time
03/02/2003  G11        01:44:00    01:48:00
03/02/2003  G19        02:57:00    01:48:00
Associated Temporal Relations
Date time              Sensor ID  Temporal Relation  Sensor ID
3/3/2003 12:00:00 AM   G12        DURING             E16
3/3/2003 12:00:00 AM   E16        BEFORE             I14
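Identifying the relation type between a pair of intervals could look like the sketch below. It follows the standard definitions of Allen's 13 relations; the function name `allen_relation`, the tuple layout, and the use of plain numbers for time are assumptions made for illustration, and each interval is assumed to satisfy start < end.

```python
def allen_relation(a, b):
    """Return the Allen relation of interval a relative to interval b.

    a, b: (start, end) tuples with start < end (assumed, not checked).
    """
    a_start, a_end = a
    b_start, b_end = b
    if a_end < b_start:
        return "BEFORE"
    if b_end < a_start:
        return "AFTER"
    if a_end == b_start:
        return "MEETS"
    if b_end == a_start:
        return "MET-BY"
    if a_start == b_start and a_end == b_end:
        return "EQUALS"
    if a_start == b_start:
        return "STARTS" if a_end < b_end else "STARTED-BY"
    if a_end == b_end:
        return "FINISHES" if a_start > b_start else "FINISHED-BY"
    if b_start < a_start and a_end < b_end:
        return "DURING"
    if a_start < b_start and b_end < a_end:
        return "CONTAINS"
    # Remaining case: the intervals partially overlap.
    return "OVERLAPS" if a_start < b_start else "OVERLAPPED-BY"


# Hypothetical intervals, in minutes since midnight.
relation = allen_relation((104, 108), (100, 120))
```

The same function works with `datetime` values, since it only relies on comparisons.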
Step 2: Association rule generation using Weka
Use the Apriori algorithm in Weka [2] to generate the best rules for a given minimum support and confidence.
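To illustrate what minimum support and minimum confidence mean for rules of the form "Activity, Relation ==> Activity", here is a small pure-Python sketch. It is not Weka's Apriori and does not call any Weka API; it simply counts rule occurrences exhaustively. The function name `mine_rules`, the record layout, and the sample records are hypothetical.

```python
from collections import Counter


def mine_rules(records, min_support=0.01, min_confidence=0.5):
    """Score rules (activity, relation) ==> other_activity.

    records: list of (activity, relation, other_activity) tuples.
    Returns (antecedent, consequent, support, confidence) tuples,
    sorted by confidence and then support, best first.
    """
    total = len(records)
    antecedent_counts = Counter((a, r) for a, r, _ in records)
    rule_counts = Counter(records)
    rules = []
    for (a, r, b), count in rule_counts.items():
        support = count / total                        # share of all records
        confidence = count / antecedent_counts[(a, r)]  # share of antecedent
        if support >= min_support and confidence >= min_confidence:
            rules.append(((a, r), b, support, confidence))
    rules.sort(key=lambda rule: (rule[3], rule[2]), reverse=True)
    return rules


# Hypothetical records for illustration only.
records = [
    ("C11", "CONTAINS", "A14"),
    ("C11", "CONTAINS", "A14"),
    ("C11", "CONTAINS", "B9"),
    ("D15", "FINISHES", "D9"),
]
rules = mine_rules(records)
```

Raising `min_support` or `min_confidence` shrinks the returned list, which is the trade-off explored in the experiments.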
Association Rule Mining on Real Data
The major goal of the MavHome project is to design an environment that acts as an intelligent agent and can acquire information about the resident and the environment in order to adapt the environment to the residents and meet the goals of comfort and efficiency.
The sensor network consists of around 100 sensors, including motion, device, light, pressure, humidity, and other sensors.
It is a unified project incorporating varied AI techniques, cross-disciplinary with mobile computing, databases, multimedia, and others.
Figure 2.
Temporary Need Analysis
• If the oven is used for turkey, is turkey at home?
Improve Prediction
• Increase predictive accuracy by incorporating additional temporal information.
Real & Synthetic Datasets.
Real Dataset (Sample):
3/2/2003 12:40:0 AM, (Studio E) E9 OFF
3/2/2003 2:40:0 AM, (Living Room) H9 ON
3/2/2003 2:40:0 AM, (Living Room) H9 OFF
3/2/2003 6:4:0 AM, (Living Room) H9 OFF
3/3/2003 3:43:0 AM, (Studio C) C14 ON
3/3/2003 3:43:0 AM, (Studio C) C15 ON
3/3/2003 3:43:0 AM, (Studio C) C13 ON
Synthetic Dataset (Sample):
2/1/2006 10:02:00 AM, off, oven
2/1/2006 11:00:00 AM, on, lamp
2/1/2006 11:11:00 AM, off, thermostat
2/1/2006 12:02:00 PM, off, lamp
2/1/2006 12:35:00 PM, off, cooker
2/1/2006 1:30:00 PM, on, lamp
2/1/2006 2:02:00 PM, off, fan
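Reading the two sample formats above might be sketched as follows. The layouts are inferred from the sample lines only; the month/day/year date order, the regular expression, and the function names `parse_real` and `parse_synthetic` are assumptions for illustration.

```python
import re
from datetime import datetime

# Assumed layout: month/day/year with a 12-hour clock, e.g. '3/2/2003 6:4:0 AM'.
TS_FORMAT = "%m/%d/%Y %I:%M:%S %p"

# Real dataset line: '<timestamp>, (<room>) <sensor id> <ON|OFF>'
REAL_LINE = re.compile(
    r"^(?P<ts>[^,]+),\s*\((?P<room>[^)]+)\)\s*(?P<sensor>\S+)\s+(?P<state>ON|OFF)$"
)


def parse_real(line):
    """Return (timestamp, room, sensor_id, state) for a real-dataset line."""
    match = REAL_LINE.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognized line: {line!r}")
    timestamp = datetime.strptime(match["ts"].strip(), TS_FORMAT)
    return timestamp, match["room"], match["sensor"], match["state"]


def parse_synthetic(line):
    """Return (timestamp, device, state) for a synthetic-dataset line."""
    ts_text, state, device = (part.strip() for part in line.split(","))
    return datetime.strptime(ts_text, TS_FORMAT), device, state.upper()


real_event = parse_real("3/2/2003 6:4:0 AM, (Living Room) H9 OFF")
synthetic_event = parse_synthetic("2/1/2006 12:02:00 PM, off, lamp")
```

Normalizing the state to upper case lets both datasets feed the same interval-extraction step.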
Parameter Setting
Dataset    No. of Days  No. of Events  No. of Intervals Identified  Size of Data
Synthetic  60           8              1729                         106KB
Real       60           17             1623                         104KB
[Chart: Minimum Support, Minimum Confidence, and No. Best Rules]
No.  Minimum Support  Minimum Confidence  No. Best Rules
1    0                0.5                 100
2    0.01             0.5                 6
3    0.02             0.5                 2
4    0.05             0.5                 1
Temporal Relation
Before
After
During
Contains
Overlaps
Overlapped-By
Meets
Met-by
Starts
Started-By
Finishes
Finished-By
Equals
Dataset                    Accuracy %  Error %
Real (without rules)       55          45
Synthetic (without rules)  64          36
Real (with rules)          56          44
Synthetic (with rules)     69          31
Real data showed a 1.86% and synthetic data a 7.81% improvement in prediction.
Good model for offline prediction of multiple events.
Cannot adapt to an online, dynamic model of the environment.
Online Model: Enhance existing ALZ prediction [4].
Prediction_C := P(C|P) := P(C|P)SEQ + P(C|P)TEM/Global - (α * P(C|P)TEM),
where α = |#CPHRASE| / |#CGLOBAL|.
Unique and new approach.
Real data had a 1.86% and synthetic data a 7.81% improvement.
Larger datasets will be incorporated.
The extended model includes direct application of temporal-relation-based probability to calculate the prediction.
Future work includes expanding the set of temporal relations with additional ones, such as until, since, next, and so forth, to create a richer collection of useful temporal relations.
[1] G. Michael Youngblood, Lawrence B. Holder, and Diane J. Cook. Managing
Adaptive Versatile Environments. Proceedings of the IEEE International
Conference on Pervasive Computing and Communications, 2005.
[2] Ian H. Witten, Eibe Frank. 2005. Data Mining: Practical Machine Learning
Tools and Techniques, 2nd Edition. Morgan Kaufmann, San Francisco.
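Table 1 marks, in the order the relations are listed (Before through Equals): X against Before, Contains, Overlaps, and Meets; no X against the other nine relations.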
Results:
Literature cited
Usable
\[
P(Z \mid Y) = \frac{|\mathrm{After}(Y,Z)| + |\mathrm{During}(Y,Z)| + |\mathrm{OverlappedBy}(Y,Z)| + |\mathrm{MetBy}(Y,Z)| + |\mathrm{Starts}(Y,Z)| + |\mathrm{StartedBy}(Y,Z)| + |\mathrm{Finishes}(Y,Z)| + |\mathrm{FinishedBy}(Y,Z)| + |\mathrm{Equals}(Y,Z)|}{|Y|}
\]
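The evidence calculation above can be sketched in a few lines of Python. This is an illustration under assumptions: each observed relation is stored as a (first, relation, second) triple meaning Relation(first, second), and |Y| is passed in as the number of occurrences of Y. The names `USABLE_RELATIONS` and `evidence` and the sample triples are hypothetical.

```python
# The nine relations that appear in the numerator of P(Z|Y).
USABLE_RELATIONS = {
    "AFTER", "DURING", "OVERLAPPED-BY", "MET-BY", "STARTS",
    "STARTED-BY", "FINISHES", "FINISHED-BY", "EQUALS",
}


def evidence(relations, y, z, y_count):
    """Estimate P(Z|Y) from observed relation triples.

    relations: iterable of (first, relation, second) triples, read as
               Relation(first, second) (assumed orientation).
    y_count:   |Y|, the number of occurrences of Y (assumed meaning).
    """
    if y_count == 0:
        return 0.0
    hits = sum(
        1
        for first, relation, second in relations
        if first == y and second == z and relation in USABLE_RELATIONS
    )
    return hits / y_count


# Hypothetical observations for illustration only.
observed = [
    ("pills", "AFTER", "food"),
    ("pills", "BEFORE", "food"),   # not counted: Before is not in the sum
    ("pills", "DURING", "food"),
    ("lamp", "AFTER", "food"),     # not counted: different Y
]
p_food_given_pills = evidence(observed, "pills", "food", y_count=4)
```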
Association Rule Mining on Synthetic Data
Allen’s 13 Relations [3]
Equation to calculate evidence using
Probability of occurrence:
Conclusions
Figure 1. MavHome Smart Home Architecture [1]
[1] Get the current predicted output and check for any rule that satisfies it. If one exists, proceed; otherwise go to the next predicted output.
[2] Check the relation; if the evidence calculated by the evidence equation P(Z|Y) is greater than Mean + 2 * Std. Dev., add this to the predicted output.
[3] If the relation is After, the evidence accumulates until it is greater than Mean + 2 * Std. Dev.
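The three steps above might be sketched as follows. This is one possible reading, not the authors' code: it assumes the mean and standard deviation are taken over a history of evidence values, uses the population standard deviation, and accumulates After evidence per consequent activity. The function name `enhance_prediction`, the rule tuple layout, and the sample values are hypothetical.

```python
from collections import defaultdict
from statistics import mean, pstdev


def enhance_prediction(predicted, rules, evidence_history):
    """Extend a predicted activity with activities supported by temporal rules.

    predicted:        activity output by the base predictor.
    rules:            (antecedent, relation, consequent, evidence) tuples.
    evidence_history: past evidence values used for Mean and Std. Dev.
    """
    threshold = mean(evidence_history) + 2 * pstdev(evidence_history)
    enhanced = [predicted]
    cumulative_after = defaultdict(float)  # per-consequent After evidence
    for antecedent, relation, consequent, value in rules:
        if antecedent != predicted:
            continue  # step 1: only rules that match the predicted output
        if relation == "AFTER":
            # step 3 (assumed reading): After evidence accumulates
            cumulative_after[consequent] += value
            value = cumulative_after[consequent]
        if value > threshold and consequent not in enhanced:
            enhanced.append(consequent)  # step 2: add to the prediction
    return enhanced


# Hypothetical values for illustration only.
history = [0.1, 0.1, 0.1, 0.1, 0.5]
rules = [
    ("oven", "DURING", "cooker", 0.9),
    ("oven", "AFTER", "lamp", 0.3),
    ("oven", "AFTER", "lamp", 0.3),
    ("fan", "DURING", "lamp", 0.9),
    ("oven", "DURING", "thermostat", 0.2),
]
result = enhance_prediction("oven", rules, history)
```

With these values the threshold is about 0.5, so a single After observation of 0.3 is not enough, but two of them together are.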
Raw Sensor Data
Maintenance
• If the cooker is broken, should we call emergency or a normal repair?
Enhancement to the Prediction.
DATA SET
Why Temporal Relations?
Data Collection
Environment
Step 3: Temporal Rules
Pseudocode: Temporal Rules Enhanced Prediction.
temporal intervals
Raw Sensor
Data
Food “Contains” water
or
Water “Before” pills
or
Food “Meets” pills
or
Food “Contains” water “before” pills
Cost Effective and Reliable
Step 1: Process raw data to form
Time Interval 
Pills
Adapt to Needs
Experimentation & Results
[Chart: Minimum Support, Minimum Confidence, and No. of Best Rules Found]
No.  Minimum Support  Minimum Confidence  No. of Best Rules Found
1    0                0.5                 100
2    0.01             0.5                 10
3    0.02             0.5                 5
4    0.05             0.5                 3
Because the datasets are small, we use the top rules generated with a minimum confidence of 0.5 and a minimum support of 0.01.
A confidence level above 0.5 and a support above 0.05 could not be used, as they did not yield any viable rules.
Sample of the best rules observed in a real smart
environment dataset:
Activity=C11 Relation=CONTAINS 36 ==> Activity=A14 36
Activity=D15 Relation=FINISHES 32 ==> Activity=D9 32
Activity=D15 Relation=FINISHESBY 32 ==> Activity=D9 32
Activity=C14 Relation=DURING 18 ==> Activity=B9 18
[3] James F. Allen and George Ferguson. Actions and Events in Interval Temporal Logic. Technical Report 521, July 1994.
[4] K. Gopalratnam & D. J. Cook (2004). Active LeZi: An Incremental Parsing
Algorithm for Sequential Prediction. International Journal of Artificial
Intelligence Tools. 14(1-2):917-930.
Acknowledgments
This work is supported by NSF
grant IIS-0121297.
Contact Us:
Vikramaditya R. Jakkula
[email protected]
Diane J. Cook
[email protected]