preferenceQuery - Department of Electrical Engineering

Download Report

Transcript preferenceQuery - Department of Electrical Engineering

Foundations of Preferences in Database
Systems
Author: Werner Kieling
Institute of Computer Science
University of Augsburg
Germany
Presented by:
Haoyuan Wang
Contents
Introduction
Practical Work
Preference Queries
Preference Algebra
Preference Engineering
Basics of Preferences
An Online Algorithm for Skyline Queries
Introduction
(Motivation)
We have the Exact World




Delivering exact dream
objects.
Wishes satisfied completely or
not at all.
Uses hard constraints.
Already investigated, such as:
SQL, E-R modeling, XML.
We Need the Real World
 If not perfect match, worse
alternative acceptable.
 Requires best possible match
making.
 Requires soft constraints.
 Preference-driven choice lagged
behind
SELECT * FROM car
WHERE make = 'Opel'
PREFERRING (category = 'roadster' ELSE
category <> 'passenger'
AND price AROUND 40000
AND HIGHEST(power))
CASCADE color = 'red'
CASCADE LOWEST(mileage);
Introduction
Approaches to cope these two worlds:
Cooperative Database System:

Not comprehensive.
Exact world
Real world
Preference Database Systems:
Intuitive semantics.
 Concise mathematical foundation.
 Constructive and extensible preference model.
 Conflicts of preferences do not cause system failure.
 Declarative preference query languages.

Basics of Preferences
What is preference ?
• Intuitively: “ I like A better than B”.
• Mathematically: relation which is in strict partial order.
irreflexive, transitive
a
b
c
d
• Our definition: Preference P=(A, <P).
A: a set of attribute names.
<p  dom(A)  dom(A)
a strict partial order referring to attribute names A.
<P
A={A1, A2, …, Ak}
A={A1, A2, …, Ak}
Basics of Preferences
Better-than graph G
Level 1:
val1
val3
Level 2:
val2
val4
Level 3:
val5
val6
val7

x <P y, if y is a predecessor of x in G.

Values in G without predecessor are maximal, being at level 1.

x is on level j if the longest path from x to maxmal has j-1 edges.

if no directed path between x and y, x and y are not ranked.
Basics of Preferences
Special cases of preferences
Chain preference
If for all x,
y∈dom(A), x≠y:
x <P y ∨ y <P x.
All values are
ranked in chain
preference.
Anti-Chain preference
S = (S, ),
given any set of
values S.
Any set, including
dom(A) can be
converted into an
anti-chain.
Dual preference
P=(A, P) reverse
the order on P, x
<P y iff y<P x.
Any P can be
completely
reversely ranked
Subset preference
Given P=(A, P),
every Sdom(A)
induces a subset
reference P =(S,
< P ), if for any x,
yS: x P y iff
x<P y.
For any subset of
A, <P can apply.
Preference Engineering
Inductive Construction of Preference-Nonnumerical base preference
POS preference: POS(A, POS-set)
P is a POS preference, if:
x <P y iff x POS-set  yPOS-set
dom(A)
POS-set
Example
POS(transmission, {automatic})
Intuitive definition:
A desired value should be in
a finite set of favorites
POSdom(A). If this
infeasible, better than
getting nothing, any other
value from dom(A) is
acceptable.
Implication:
all value in POS-set are
maximal, all value not in
POS-set are at level 2 and
worse than all POS-set
values.
Preference Engineering
Inductive Construction of Preference-Nonnumerical base preference
NEG preference: NEG(A, NEG-set) Intuitive definition:
P is a NEG preference, if:
x <P y iff y NEG-set  xNEG-set
dom(A)
A desired value should not
be in a finite set of dislikes
NEG-set. If this infeasible,
better than getting nothing,
any other value from NEGset is acceptable.
Implication:
NEG-set
Example
NEG(make, {Ferrari})
all value not in NEG-set are
maximal, all value in NEGset are at level 2 and worse
than all maximal values.
Preference Engineering
Inductive Construction of Preference-Nonnumerical base preference
POS/NEG preference:
POS/NEG(A, POS-set, NEG-set)
P is a POS preference, if:
x <P y iff ( x NEG-set  y NEGset)(x NEG-set  x POS-set 
yPOS-set)
dom(A)
NEG-set
POS-set
Intuitive definition:
A desired value should be
one in a finite set of
favorites. Otherwise it
should not be any from a
finite set of disjoint dislikes.
If this infeasible, better than
getting nothing, any other
value from dislikes is
acceptable.
Example
POS/NEG(color, {yellow};
{gray})
Preference Engineering
Inductive Construction of Preference-Nonnumerical base preference
POS/POS preference:
POS/POS(A, POS1-set, POS2-set)
x <P y iff ( x POS2-set  y 
POS1-set)(x POS1-set  x
POS2-set  yPOS2-set)  (x
POS1-set  x POS2-set 
yPOS1-set)
Intuitive definition:
A desired value should be in a
finite set of favorites POS1-set.
Otherwise, it should be from a
disjoint finite set of
alternatives POS2-set. If this
infeasible, better than getting
nothing, any other value is
acceptable.
Example
dom(A)
POS2-set
POS1-set
POS/POS(category, {cabrio};
{roadster})
Preference Engineering
Inductive Construction of Preference-Numerical base preference
AROUND preference: AROUND(A, z)
Given z dom(A), for all v dom(A) .
Define distance(v,z) :=abs(v-z)
Intuitive definition:
The desired value should
be z. If this infeasible,
values with shortest
distance from z are
acceptable.
P is called AROUND preference, if:
x <P y iff distance(x,z)>distance(y,z)
z
v
Example
AROUND(price, 40000)
Preference Engineering
Inductive Construction of Preference-Numerical base preference
BETWEEN preference: BETWEEN(A,[low, up])
Given z[low, up]dom(A)dom(A), for all v
dom(A) .
Define distance(v,[low, up]) :=
if v [low, up] then 0 else
if v <low then low-v else v-up
P is called BETWEEN preference, if:
x <P y iff:
distance(x, [low,up] ) > distance(y, [low, up])
Intuitive definition:
The desired value
should be between the
bounds of an interval.
If this infeasible,
values with shortest
distance from the
bounds are acceptable.
Example
v
v
low
up
BETWEEN(mileage,
[20000, 30000])
Preference Engineering
Inductive Construction of Preference-Numerical base preference
LOWEST, HIGHEST preference:
LOWEST(A), HIGHEST(A)
P is called LOWEST preference, if
x <P y iff x > y
P is called HIGHEST preference, if
x <P y iff x < y
Intuitive definition:
The desired value should
be as low (high) as
possible.
Example
LOW(price)
HIGHEST(power)
Preference Engineering
Inductive Construction of Preference-Numerical base preference
SCORE preference: SCORE(A, f)
P is called SCORE preference, if for some x, y dom(A):
x <P y iff f(x)<f(y)
Note: <p can be applied after the score function f gives a value.
Preference Engineering
Inductive Construction of Preference-Complex preference
Pareto preference:
P=(A1A2, <P1 P2)
Given P1=(A1, <P1) and P2=(A2,
<P2) and P2=(A2, <P2), for x, y
dom(A1)dom(A2), define:
x < P1 P2 y iff
(x1 <P1 y1  (x2 <p y2  x2=y2)) 
(x2 <P2 y2  (x1 <P1 y1 x1=y1))
Intuitive definition:
P1 and P2 are considered as
equally important
preferences. In order for
x=(x1, x2) to be better than
y=(y1, y2), it is not tolerable
that x is worse than y in any
x i.
Preference Engineering
Inductive Construction of Preference-Complex preference
Pareto preference example
For dom(A1)=dom(A2)=dom(A3)=integer and
P1 :=AROUND(A1, 0),
P2 :=LOWEST(A2),
P3 :=HIGHEST(A3),
P4 :=({A1, A2, A3}), <P4) :=(P1P2)P3
let’s study a subset preference of P4 for the following set:
R(A1, A2, A3)={val1:(-5, 3, 4), val2:(-5, 4, 4), val3:(5, 1, 8), val4:(5, 6, 6),
val5:(-6, 0, 6), val6:(-6, 0, 4), val7:(6, 2, 7)}
The ‘better-than’ graph of P4 for subset R can be obtained by
performing exhaustive ‘better-than’ checks:
Level 1:
val1
val3
Level 2:
val2
val4
val5
val7
val6
Preference Engineering
Inductive Construction of Preference-Complex preference
Prioritized preference: Intuitive definition:
P=(A1A2, <P1& P2)
Given P1=(A1, <P1) and
P2=(A2, <P2) and
P2=(A2, <P2), for x, y
dom(A1)dom(A2),
define:
P2 is respected only when P1 does not mind.
Example:
Let’s revisit Example 1, now studying:
P8 = ({A1, A2}, <P8) := P1&P2
R(A1, A2, A3)={val1:(-5, 3), val2:(-5, 4), val3:(5,
1), val4:(5, 6), val5:(-6, 0), val6:(-6, 0), val7:(6,
2)}
The ‘better-than’ graph of P8 for subset R
is this:
Level 1:
val1
val3
(x1 <P1 y1  (x1=y1  Level 2:
x2 <P y2)
Level 3:
val2
val4
x < P1& P2 y iff
val5
val6
val7
Preference Engineering
Inductive Construction of Preference-Complex preference
Writing a preference query - a used-car scenario
1. Write wish-list
Julia wants to buy a used car for herself and her friend Leslie, she wishes:
“My favorite car is Cabrio, but roadster is also good, I like an automatic
car and it should have a horsepower of about 100, these issues are equally
important to me, but color is the most important, it should not be gray, I
do not care too much about price, but since it is a used car, the lower, the
better”.
Julia goes to vendor Michael. Michael wishes : “ Clients’ wishes are
always more important than mine, I like to sell older cars, the have higher
commission” .
Julia also needs to ask Leslie, Leslie wishes: “I agree with Julia, I
convinced Julia money should matter as much as color, I like blue, if not
available, please not gray and red”
Preference Engineering
Inductive Construction of Preference-Complex preference
2. Convert wish-list to
preference query
terms
Julia:
P1:=POS/POS(category, {cabrio};{roadster})
P2:=POS(transmission, {automatic})
P3:=AROUND(horsepower, 100)
P4:=LOWEST(price)
P5:=NEG(color, {gray})
Michael:
P6:=HIGHEST(year-of-construction)
P7:=HIGHEST(commission)
Leslie:
P8:=POS/NEG(color, {blue};{gray, red})
3. Use Pareto and
Prioritization to add
each query terms
Q1=({color, category, transmission,
horsepower, price},
<Q1) :=P5&((P1P2P3)&P4)
Q2=({color, category, transmission,
horsepower, price, year-ofconstruction,
commission},<Q2) :=(Q1&P6)&P7
=((P5&((P1P2P3)&P4))&P6)&
P7
Q2=({color, category, transmission,
horsepower, price, year-ofconstruction,
commission},<Q2) :=(Q1&P6)&P7=
(((P5  P8 
P4)&(P1P2P3))&P6))&P7
Evaluation of Preference
Queries
How a preference query like the following is evaluated:
Decomposition of queries
Q2=({color, category, transmission, horsepower, price, year-of-construction,
commission},<Q2) :=(Q1&P6)&P7=(((P5  P8  P4)&(P1P2P3))&P6))&P7

Decomposition of ‘&’ and ‘’
[P1&P2](R)= [P1] [P2 groupby A1](R)
[P1P2](R)= ([P1](R)  [P2 groupby A1](R)) 
([P2](R)  [P1 groupby A2](R))  YY(P1&P2, P2& P1)R

Decomposition of ‘+’ and ‘’
[P1+P2](R)= [P1](R) [P2](R)
[P1P2](R)= [P1](R) [P2](R) YY(P1, P2)R
Evaluation of Preference
Queries
BMO query model R
Database
preference P
Assume P=(A, <P), where A=(A1, A2, …, Ak).
a) Each R[A]  dom(A) defines a subset preference, called a database
preference and denoted by:
PR=(R[A], <P)
b) Tuple tR is a perfect match in a database set R, if:
t[A] max(P)  t[A] R.
Note: preference query performs a match-making between the stated
preference and the database preference.
[P](R)={tR | t[A] max(PR )}
Note: [P](R)
evaluates P against database set R by retrievingall maximal
R
values from P
Shooting Stars in the Sky:
An Online Algorithm for Skyline Queries
Authors:
Donald Kossmann
Frank Ramsak
Steen Rost
Technische Universit. at M. unchen
Orleansstr. 34
81667 Munich
Germany
A Online Algorithm for Skyline Queries
Contents:

Skyline Queries

The NN Algorithm for 2-Dimensional Skylines


An Example for 2-Dimensional Skylines
The NN Algorithm for d-Dimensional Skylines
Skyline Queries
Retrieves all interesting
points.

Helps user get a big
picture of interesting
options.

If users moves, skyline
should be re-computed,
user’s choice based on
user’s location.

The NN Algorithm for 2- Dimensional
Skylines
Input: Data set D
Distance function f (Euclidean distance)
/* Initialization: the whole data space needs to be inspected*/
T ={(, )}
/* Loop: iterate until all regions have been investigated*/
WHILE (T 0) Do
(mx, my)=takeElement(T)
IF( boundedNNSearch(O, D, (mx, my), f)) THEN
( nx, ny)= boundedNNSearch(O, D, (mx, my), f))
T=T{(nx, my), (mx, ny)}
OUTPUT n
ENDIF
ENDWHILE
An Example for 2-Dimensional Skylines
The NN Algorithm for d-Dimensional
Skylines
Approaches to Deal With Duplicates

Laisser-faire

propagate

Merge

Fine-grained Partitioning

Hybrid Approaches
Foundations of Preferences in Database
Systems
Shooting Stars in the Sky:
An Online Algorithm for Skyline Queries
Authors:
Donald Kossmann
Frank Ramsak
Steen Rost
Technische Universit. at M. unchen
Orleansstr. 34
81667 Munich
Germany