Database Issues in Smart Homes

Pervasive Intelligent Environments
Spring 2004
CRESCENT
TCU Dept. of Computer Science
Topics: Lecture 1
• What’s being done
• What do you need it for?
• Issues
• Where’s the data come from? Data sources
• DB Communication
• How do we store the data?
• Storing LOTS of data:
– Data warehouses
• Now we’ve got it, what do we do with it? Looking ahead
• Next time: examples, more troubles…
DB in Smart Environments
• EasyLiving (Microsoft): Relational
• Hal (MIT Artificial Intelligence Laboratory)
• Home Automation (IBM)
• Smart Home for Health Monitoring (Medical Automation Research Center)
• House_n (MIT)
• IIB (Trinity College Dublin)
• i-LAND (Ambiente)
• Interactive Workspaces (Stanford University): XML DBMS: LORE
• MavHome (University of Texas at Arlington): Active, distributed
UTA MavHome DB
• Active
– Reactive & proactive (e.g., to predict)
• Distributed
• Information collection agents
– Rules
• Local Agent: what data they need to collect
• Distributed: coordinate overall monitoring of collected information
• Continuous monitoring of events
• Extension of SNOOP
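The reactive side of an active database is commonly written as event-condition-action (ECA) rules: when an event is stored, every rule whose condition holds runs its action. The sketch below illustrates only that general idea. The event fields, the rule name, and the `ActiveStore` class are hypothetical and are not taken from MavHome or SNOOP.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Event = Dict[str, str]  # hypothetical event shape: {"device": ..., "state": ...}


@dataclass
class Rule:
    """One event-condition-action rule."""
    name: str
    condition: Callable[[Event], bool]
    action: Callable[[Event], str]


class ActiveStore:
    """Stores every event and fires matching rules as events arrive."""

    def __init__(self) -> None:
        self.events: List[Event] = []
        self.rules: List[Rule] = []

    def add_rule(self, rule: Rule) -> None:
        self.rules.append(rule)

    def insert(self, event: Event) -> List[str]:
        """Record the event, then return the result of every rule it triggers."""
        self.events.append(event)
        return [r.action(event) for r in self.rules if r.condition(event)]


store = ActiveStore()
store.add_rule(Rule(
    name="stove_left_on",  # illustrative rule, not from the slides
    condition=lambda e: e["device"] == "Stove" and e["state"] == "ON",
    action=lambda e: "schedule stove check",
))

fired = store.insert({"device": "Stove", "state": "ON"})
quiet = store.insert({"device": "Stereo", "state": "ON"})
```

Here the stove event triggers the rule and the stereo event does not, but both are kept for later analysis.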
Microsoft EasyLiving DB (2002)
• Relational
– Fast & robust, but awkward for some data
• World Model DB describes:
– Computing devices
– People and their personal preferences/settings
– Services
– Rooms and doorways
• Serves as an abstraction layer between sensors and the applications that use data from sensors
– e.g. adding new sensors requires no change to applications
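One way to picture the abstraction layer is a relational world model that sensors write to and applications only read through a view. The sketch below uses Python's built-in `sqlite3`. The table names, columns, sensor labels, and timestamps are assumptions made up for illustration and do not reflect EasyLiving's actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE room   (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    -- Any kind of sensor may report a location here.
    CREATE TABLE location_report (
        person_id INTEGER REFERENCES person(id),
        room_id   INTEGER REFERENCES room(id),
        sensor    TEXT,
        seen_at   TEXT
    );
    -- Applications query only this view: latest known room per person.
    CREATE VIEW current_location AS
        SELECT p.name AS person, r.name AS room
        FROM location_report lr
        JOIN person p ON p.id = lr.person_id
        JOIN room   r ON r.id = lr.room_id
        WHERE lr.seen_at = (SELECT MAX(seen_at) FROM location_report
                            WHERE person_id = lr.person_id);
""")
conn.execute("INSERT INTO room VALUES (1, 'Kitchen'), (2, 'Den')")
conn.execute("INSERT INTO person VALUES (1, 'Rigel')")
# Two different (hypothetical) sensor types feed the same table.
conn.execute("INSERT INTO location_report VALUES (1, 1, 'camera', '2004-02-15T09:00:00')")
conn.execute("INSERT INTO location_report VALUES (1, 2, 'badge_reader', '2004-02-15T09:05:00')")


def where_is(name: str) -> str:
    """Application-side query; it never mentions a sensor type."""
    row = conn.execute(
        "SELECT room FROM current_location WHERE person = ?", (name,)
    ).fetchone()
    return row[0] if row else "unknown"
```

Because `where_is` reads only the view, a new sensor type just adds rows to `location_report` and the application code stays the same.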
Stanford Interactive Workspace
• Uses LORE
– A semi-structured XML DB system
• Still available, but work stopped in 2000
– Data stored is a catalog of (index to)
• documents, images, 3-D models, application-specific domain models
What do you need it for?
• Kitchen
• Entertainment
• General (many uses)
– When does Molly usually come home?
– Where is Rigel now?
– What’s the rain forecast?
Issues
• Data source
– Local (sensors, input devices)
– Outside (weather forecast)
• Data quality
• Data volume
• Data lifetime
– Do you save images once info is extracted (e.g. Ian walked in front door at 2:13pm)?
• Data representation
– Relational is awkward for some data
Data input
• LOTS AND LOTS OF DATA
– Required for good prediction, decision making
• Inputs from
– Sensors
– Bar code / RF readers
– Voice
– PC keyboard
• Sensors
• Recording media choices
Sensor Databases
• UTA IT Lab and Diane Cook
– sensor-generated data collection, management, analysis, triggering
– continuous queries, stream query processing
• Sharma Chakravarthy’s work
– Active databases
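A continuous query is registered once and re-evaluated as each reading arrives, instead of being run once over stored data. The sketch below shows that idea with a sliding-window average and a trigger condition. The class name, window size, readings, and the 78-degree limit are hypothetical, and this is not the UTA system's interface.

```python
from collections import deque
from typing import Deque


class WindowedAverage:
    """Continuous query: average of the last `size` readings, updated per arrival."""

    def __init__(self, size: int) -> None:
        self.window: Deque[float] = deque(maxlen=size)

    def push(self, reading: float) -> float:
        self.window.append(reading)  # the oldest reading drops out automatically
        return sum(self.window) / len(self.window)


def too_warm(avg: float, limit: float = 78.0) -> bool:
    """Trigger condition evaluated on each new query result."""
    return avg > limit


query = WindowedAverage(size=3)
results = [query.push(t) for t in (70.0, 74.0, 78.0, 82.0, 86.0)]
alerts = [too_warm(a) for a in results]
```

Only the last reading pushes the windowed average above the limit, so only the last entry in `alerts` is true.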
Real Sensor Data Input
• 9/8/2002 2:0:1 AM~A5 (Coffee Maker) ON
• 9/8/2002 1:6:59 AM~A9 (A/C) ON
• 9/8/2002 3:58:52 AM~A0 (Stereo) ON
• 9/8/2002 5:57:0 AM~A2 (Kitchen Light) ON
• 9/8/2002 3:1:42 AM~A5 (Coffee Maker) OFF
• 9/8/2002 7:8:3 AM~A3 (Stove) ON
• 9/8/2002 12:54:52 PM~A10 (Bathroom Light) ON
• 9/8/2002 4:58:5 AM~A0 (Stereo) OFF
• 9/8/2002 8:1:20 AM~A3 (Stove) OFF
• 9/8/2002 9:6:10 AM~A8 (Computer) ON
• 9/8/2002 10:8:19 AM~A4 (Bathtub Heater) ON
• 9/8/2002 11:9:4 AM~A0 (Stereo) ON
• 9/8/2002 9:4:5 AM~A8 (Computer) OFF
• 9/8/2002 10:9:4 AM~A4 (Bathtub Heater) OFF
• 9/8/2002 2:2:5 PM~A10 (Bathroom Light) OFF
• 9/8/2002 2:52:37 PM~A0 (Stereo) OFF
• 9/8/2002 4:2:0 PM~A9 (A/C) OFF
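Before records like these can be stored or queried, each one has to be split into typed fields. The sketch below assumes a record is a date followed by `time~id (name) state` on one line. The `DeviceEvent` type and `parse_event` function are hypothetical names introduced for illustration.

```python
import re
from datetime import datetime
from typing import NamedTuple


class DeviceEvent(NamedTuple):
    when: datetime
    device_id: str
    device_name: str
    is_on: bool


# Date, time, "~", device id, "(name)", state.
LINE = re.compile(
    r"(?P<date>\d+/\d+/\d+)\s+(?P<time>\d+:\d+:\d+ [AP]M)~"
    r"(?P<id>A\d+) \((?P<name>[^)]+)\) (?P<state>ON|OFF)"
)


def parse_event(line: str) -> DeviceEvent:
    m = LINE.search(line)
    if m is None:
        raise ValueError(f"unrecognized log line: {line!r}")
    # strptime accepts the unpadded fields used in the log ("2:0:1").
    when = datetime.strptime(f"{m['date']} {m['time']}", "%m/%d/%Y %I:%M:%S %p")
    return DeviceEvent(when, m["id"], m["name"], m["state"] == "ON")


event = parse_event("9/8/2002 2:0:1 AM~A5 (Coffee Maker) ON")
```

Parsing the timestamp into a `datetime` also makes it possible to sort events, which matters here because the records are not listed in chronological order.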
Simulated Sensor Input
11/15/2001 7:3:53 AM (BedRoom Alarm) A9 ON
11/15/2001 7:4:2 AM (Bath Shower) A11 ON
11/15/2001 7:4:8 AM (Bath BathDisplay) A10 ON
11/15/2001 7:4:8 AM (Bath L4) A4 ON
11/15/2001 7:4:45 AM (Kitchen CoffeePot) A8 ON
11/15/2001 7:4:47 AM (Kitchen KitchenDisplay) A12 OFF
11/15/2001 7:4:55 AM (Kitchen KitchenDisplay) A12 ON
11/15/2001 7:4:47 AM (LivingRoom Thermostat) A16 ON
11/15/2001 7:4:49 AM (Kitchen L3) A3 ON
11/15/2001 7:4:50 AM (Garage/Patio Locks) A17 OFF
11/15/2001 9:29:59 AM (Yard Sprinklers) A14 ON
11/15/2001 9:29:59 AM (LivingRoom JanitorRobot) A13 ON
11/15/2001 6:59:53 PM (Garage/Patio Locks) A17 ON
Media Viewing Data
Watching Events
Date | Day | Mood | Start | End | Device | Program name | Type | Comments | Others | Rating
020302 | Su | normal | 1330 | 1600 | T | nba basketball | sports | dallas mavericks go team | none | 5
020302 | Su | normal | 1700 | 2100 | t | super bowl | sports | gotta watch the commercials | Dad | 5
020402 | m | normal | 1900 | 2000 | t | boston public | drama | hot teachers | none | 5
020402 | m | normal | 2000 | 2100 | t | ally mcbeal | drama | funny lawyers | none | 4
020402 | m | normal | 2300 | 100 | V | WWF RAW | wrestling | testosterone | none | 5
020502 | t | normal | 2100 | 2200 | t | philly | drama | hot lawyers | none | 4
020602 | w | bored | 1830 | 2200 | t | nba basketball | sports | GO MAVS | none | 5
020702 | th | tired | 1900 | 2100 | t | wwf smackdown | wrestling | its me soap | none | 5
020702 | th | tired | 2100 | 2200 | t | ER | drama | good show | none | 4
020802 | f | excited | 1900 | 2230 | t | olympics | sports | gotta watch | none | 4
020902 | sa | excited | 1900 | 2230 | t | olympics | sports | gotta watch | none | 4
021002 | su | ecstatic | 1500 | 1800 | t | NBA allstar game | sports | gotta see what happens | none | 3
012802 | M | normal | 1900 | 2000 | T | Boston Public | Drama | hot chicks teaching | none | 5
012802 | M | normal | 2000 | 2100 | T | Ally McBeal | Drama | hot chicks lawyering | none | 5
What data to collect?
• Digital Silhouettes (Predictive Networks)
– Predicting web surfing behavior ($$$)
• Microsoft (2002) tracks TV viewing preferences
– 140 data items for each user
• Demographics (50)
– Subcategories within gender, age, income, education, occupation, and race
• Content preferences (90)
– golf, music, yoga
Communication with the DB
• Agent communication languages
– KQML
– FIPA
• XML
• SOAP
• UPnP (upnp.org)
• For more information, slides 11-26 of
– personal.tcu.edu/~lburnell/SE/SmartHomeAgents.zip
KQML Examples
• Turn the TV on to channel 5
– (sendCommandToDevice :deviceName TV :type ask :command
(alterSettings :isOn 1 :channel 5))
• Can embed into an event
– (event :year 2001 :month October :dayOfMonth 15 :hour 15
:minute 45 :command (sendCommandToDevice :deviceName TV
:type ask :command (alterSettings :isOn 1 :channel 5)))
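Messages in this style are nested lists of `:keyword value` pairs, so they can be generated from ordinary data structures. The sketch below reproduces only the keyword/value style of the examples above. The `Message` class is a hypothetical helper, not part of any KQML library, and it does not cover the full KQML syntax.

```python
from typing import Mapping, Union

Value = Union[str, int, "Message"]


class Message:
    """A message name plus ordered :keyword value pairs."""

    def __init__(self, name: str, **fields: Value) -> None:
        self.name = name
        self.fields: Mapping[str, Value] = fields

    def __str__(self) -> str:
        parts = [self.name]
        for key, value in self.fields.items():
            parts.append(f":{key} {value}")  # nested messages render recursively
        return "(" + " ".join(parts) + ")"


tv_on = Message(
    "sendCommandToDevice",
    deviceName="TV",
    type="ask",
    command=Message("alterSettings", isOn=1, channel=5),
)
# The same command wrapped in a timed event.
timed = Message("event", year=2001, month="October", dayOfMonth=15,
                hour=15, minute=45, command=tv_on)
```

Calling `str(tv_on)` yields the single-line form of the first example, and `str(timed)` embeds that command inside the event.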
Data Warehouses
• An organization-wide snapshot of data, typically used for decision-making
• Evolved via consultants, RDBMS vendors, and startup companies.
– All had something to prove; to "differentiate their product".
– Researchers making progress cleaning up the BIG mess they created
• A DBMS that runs decision-making queries efficiently is sometimes called a "Decision Support System" (DSS)
– OLAP (on-line analytical processing) is one class of DSS queries
• DSS systems and warehouses are typically separate from the on-line transaction processing (OLTP) system
• Data Mart
– a mini-warehouse: typically a DSS for one aspect or branch of a company, with lots of relatively homogeneous data (i.e. a straight DSS)
02.15.04 from http://redbook.cs.berkeley.edu/lec28.html
Warehouse/DSS properties
• Very large: 100 gigabytes to many terabytes
• Tends to include historical data
• Workload: mostly complex queries that access lots of data and do many scans, joins, and aggregations. Tend to look for "the big picture".
• Updates pumped to warehouse in batches (overnight)
• Data may be heavily summarized and/or consolidated in advance (must be done in batches too, must finish overnight).
– Research work has been done (e.g. "materialized views"), but it addresses only a small piece of the problem.
02.15.04 from http://redbook.cs.berkeley.edu/lec28.html
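Summarizing in advance means a batch job computes aggregates once so that later queries read a small table instead of rescanning the detail rows. The sketch below imitates that with a hand-maintained summary table in `sqlite3`, which has no built-in materialized views. The table names and sample rows are assumptions made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE device_event (day TEXT, device TEXT, state TEXT)")
conn.execute("CREATE TABLE daily_on_count (day TEXT, device TEXT, times_on INTEGER)")
conn.executemany(
    "INSERT INTO device_event VALUES (?, ?, ?)",
    [
        ("2002-09-08", "Stereo", "ON"),
        ("2002-09-08", "Stereo", "OFF"),
        ("2002-09-08", "Stereo", "ON"),
        ("2002-09-08", "Stove", "ON"),
    ],
)
conn.commit()


def refresh_daily_summary(db: sqlite3.Connection) -> None:
    """Batch job: rebuild the pre-aggregated table from the detail rows."""
    with db:  # one transaction, so the summary is replaced atomically
        db.execute("DELETE FROM daily_on_count")
        db.execute("""
            INSERT INTO daily_on_count
            SELECT day, device, COUNT(*)
            FROM device_event
            WHERE state = 'ON'
            GROUP BY day, device
        """)


refresh_daily_summary(conn)
summary = conn.execute(
    "SELECT device, times_on FROM daily_on_count ORDER BY device"
).fetchall()
```

Decision-support queries then read `daily_on_count`; the cost is that the summary is only as fresh as the last batch run.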
Data Warehouses
• Data Cleaning
– Data Migration: simple transformation rules (replace "gender" with "sex")
– Data Scrubbing: use domain-specific knowledge (e.g. zip codes) to modify data. Try parsing and fuzzy matching from multiple sources.
– Data Auditing: discover rules and relationships (or signal violations thereof). Not unlike data mining.
• Data Loading
– Can take a very long time! (Sorting, indexing, summarization, integrity constraint checking, etc.) Parallelism a must.
– Full load: like one big transaction (xact); the change from old data to new is atomic.
– Incremental loading ("refresh") makes sense for big warehouses, but the transaction model is more complex: the load has to be broken into lots of transactions that are committed periodically to avoid locking everything. Need to be careful to keep metadata & indices consistent along the way.
02.15.04 from http://redbook.cs.berkeley.edu/lec28.html
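The difference between a full load and an incremental load comes down to where the commits fall. The sketch below breaks a load into small batches and commits each one separately, using `sqlite3`. The table, the batch size, and the helper names `batches` and `incremental_load` are hypothetical.

```python
import sqlite3
from itertools import islice
from typing import Iterable, Iterator, List, Tuple

Row = Tuple[str, str, str]


def batches(rows: Iterable[Row], size: int) -> Iterator[List[Row]]:
    """Yield successive lists of at most `size` rows."""
    it = iter(rows)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def incremental_load(db: sqlite3.Connection, rows: Iterable[Row], batch_size: int) -> int:
    """Load rows in many small transactions; return the number of commits."""
    commits = 0
    for chunk in batches(rows, batch_size):
        with db:  # each batch commits on its own, so locks are held briefly
            db.executemany("INSERT INTO device_event VALUES (?, ?, ?)", chunk)
        commits += 1
    return commits


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE device_event (day TEXT, device TEXT, state TEXT)")
new_rows = [("2002-09-08", f"A{i}", "ON") for i in range(10)]
commit_count = incremental_load(conn, new_rows, batch_size=4)
loaded = conn.execute("SELECT COUNT(*) FROM device_event").fetchone()[0]
```

A full load would instead wrap the whole loop in a single `with db:` block, trading shorter lock times for atomicity.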
Looking Ahead
• Using the data we have
– Prediction
– Decision making
– Problem Solving
– Getting better over time…
• Reinforcement learning
• Updating
– Bayesian networks
– Neural networks
– Rules and cases
Looking Ahead: Data Mining & Prediction
• Find patterns
– Verify user supplied patterns
– Generate patterns
• Sequences – HARD!
• Noise
• Missing data
Decision Making: Bayes Nets
• What assumptions and methods allow us to turn observations into causal knowledge, and how can even incomplete causal knowledge be used in planning and prediction to influence and control our environment? *
• One solution: Bayesian nets
– a.k.a. Bayes nets, Bayesian networks, belief networks
* From “Causation, Prediction, and Search, 2nd Edition”, Spirtes, Glymour & Scheines
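As a small illustration of how a Bayes net turns an observation into a belief, the sketch below applies Bayes' rule to a two-node network in which being home influences whether a light is on. The network and all probabilities are made-up values for illustration; real networks have many more nodes and use general inference algorithms.

```python
# Two-node network: Home -> LightOn. All numbers are illustrative assumptions.
P_HOME = 0.6                              # prior P(home)
P_LIGHT_GIVEN = {True: 0.7, False: 0.1}   # P(light on | home), P(light on | away)


def posterior_home(light_on: bool) -> float:
    """P(home | observed light state), computed by enumerating both causes."""

    def likelihood(home: bool) -> float:
        p_on = P_LIGHT_GIVEN[home]
        return p_on if light_on else 1.0 - p_on

    joint_home = P_HOME * likelihood(True)
    joint_away = (1.0 - P_HOME) * likelihood(False)
    return joint_home / (joint_home + joint_away)


belief = posterior_home(light_on=True)
```

Seeing the light on raises the belief that someone is home above the 0.6 prior, and seeing it off lowers the belief below the prior.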
Problem Solving
• Rule-based systems
• Case-based reasoning
• Neural networks
• Influence diagrams
Looking Ahead: Reinforcement Learning
• "RL is learning what to do --- how to map situations to actions --- so as to maximize a numerical reward signal. The learner is not told which actions to take, as in most machine learning, but instead must discover which actions yield the most reward by trying them." From Reinforcement Learning: An Introduction.
• MDP & semi-MDP: assumptions about how the world can be described, including that you don’t have to remember the past (the current state is enough).
• Agents in a state can choose actions to take in an environment.
– Choice (decision) is rewarded or punished
– Agent learns to make better choices
• Model can be stored in a database. May have many states/actions/probabilities to store.
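To make the last point concrete, the sketch below keeps a table of state/action values in `sqlite3` and updates it with one Q-learning step per observed reward. Q-learning is used here only as one well-known RL algorithm; the slides do not say which method is intended. The state names, action names, rewards, and constants are hypothetical.

```python
import sqlite3

ALPHA, GAMMA = 0.5, 0.9  # learning rate and discount factor (illustrative values)

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE q (state TEXT, action TEXT, value REAL, PRIMARY KEY (state, action))"
)


def get_q(state: str, action: str) -> float:
    row = conn.execute(
        "SELECT value FROM q WHERE state = ? AND action = ?", (state, action)
    ).fetchone()
    return row[0] if row else 0.0  # unseen pairs start at 0


def best_value(state: str) -> float:
    """Highest stored value for the state (0 if nothing is stored yet)."""
    row = conn.execute("SELECT MAX(value) FROM q WHERE state = ?", (state,)).fetchone()
    return row[0] if row[0] is not None else 0.0


def update(state: str, action: str, reward: float, next_state: str) -> float:
    """One Q-learning step: move Q(s, a) toward reward + GAMMA * max Q(s', .)."""
    old = get_q(state, action)
    new = old + ALPHA * (reward + GAMMA * best_value(next_state) - old)
    conn.execute("INSERT OR REPLACE INTO q VALUES (?, ?, ?)", (state, action, new))
    return new


first = update("evening", "lights_on", reward=1.0, next_state="night")
second = update("evening", "lights_on", reward=1.0, next_state="night")
```

Each rewarded choice moves the stored value closer to the reward, and the table grows by one row per state/action pair, which is why storage becomes a concern when there are many states and actions.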
More information
• Filip Perich, Anupam Joshi, Tim Finin, and Yelena Yesha, “On Data Management in Pervasive Computing Environments,” IEEE Transactions on Knowledge and Data Engineering, October 12, 2003
– http://ebiquity.umbc.edu/v2.1/_file_directory_/papers/3.pdf
• Fundamentals of Database Systems, 4th edition. Elmasri and Navathe.
• http://mavhome.uta.edu/publications.html
• Reinforcement learning
– http://www.aaai.org/Pathfinder/html/reinf.html
– http://reinforcementlearning.ai-depot.com/Tutorials.html