preferenceQuery - Department of Electrical Engineering
Download
Report
Transcript preferenceQuery - Department of Electrical Engineering
Foundations of Preferences in Database
Systems
Author: Werner Kieling
Institute of Computer Science
University of Augsburg
Germany
Presented by:
Haoyuan Wang
Contents
Introduction
Practical Work
Preference Queries
Preference Algebra
Preference Engineering
Basics of Preferences
An Online Algorithm for Skyline Queries
Introduction
(Motivation)
We have the Exact World
Delivering exact dream
objects.
Wishes satisfied completely or
not at all.
Uses hard constraints.
Already investigated, such as:
SQL, E-R modeling, XML.
We Need the Real World
If not perfect match, worse
alternative acceptable.
Requires best possible match
making.
Requires soft constraints.
Preference-driven choice lagged
behind
SELECT * FROM car
WHERE make = 'Opel'
PREFERRING (category = 'roadster' ELSE
category <> 'passenger'
AND price AROUND 40000
AND HIGHEST(power))
CASCADE color = 'red'
CASCADE LOWEST(mileage);
Introduction
Approaches to cope these two worlds:
Cooperative Database System:
Not comprehensive.
Exact world
Real world
Preference Database Systems:
Intuitive semantics.
Concise mathematical foundation.
Constructive and extensible preference model.
Conflicts of preferences do not cause system failure.
Declarative preference query languages.
Basics of Preferences
What is preference ?
• Intuitively: “ I like A better than B”.
• Mathematically: relation which is in strict partial order.
irreflexive, transitive
a
b
c
d
• Our definition: Preference P=(A, <P).
A: a set of attribute names.
<p dom(A) dom(A)
a strict partial order referring to attribute names A.
<P
A={A1, A2, …, Ak}
A={A1, A2, …, Ak}
Basics of Preferences
Better-than graph G
Level 1:
val1
val3
Level 2:
val2
val4
Level 3:
val5
val6
val7
x <P y, if y is a predecessor of x in G.
Values in G without predecessor are maximal, being at level 1.
x is on level j if the longest path from x to maxmal has j-1 edges.
if no directed path between x and y, x and y are not ranked.
Basics of Preferences
Special cases of preferences
Chain preference
If for all x,
y∈dom(A), x≠y:
x <P y ∨ y <P x.
All values are
ranked in chain
preference.
Anti-Chain preference
S = (S, ),
given any set of
values S.
Any set, including
dom(A) can be
converted into an
anti-chain.
Dual preference
P=(A, P) reverse
the order on P, x
<P y iff y<P x.
Any P can be
completely
reversely ranked
Subset preference
Given P=(A, P),
every Sdom(A)
induces a subset
reference P =(S,
< P ), if for any x,
yS: x P y iff
x<P y.
For any subset of
A, <P can apply.
Preference Engineering
Inductive Construction of Preference-Nonnumerical base preference
POS preference: POS(A, POS-set)
P is a POS preference, if:
x <P y iff x POS-set yPOS-set
dom(A)
POS-set
Example
POS(transmission, {automatic})
Intuitive definition:
A desired value should be in
a finite set of favorites
POSdom(A). If this
infeasible, better than
getting nothing, any other
value from dom(A) is
acceptable.
Implication:
all value in POS-set are
maximal, all value not in
POS-set are at level 2 and
worse than all POS-set
values.
Preference Engineering
Inductive Construction of Preference-Nonnumerical base preference
NEG preference: NEG(A, NEG-set) Intuitive definition:
P is a NEG preference, if:
x <P y iff y NEG-set xNEG-set
dom(A)
A desired value should not
be in a finite set of dislikes
NEG-set. If this infeasible,
better than getting nothing,
any other value from NEGset is acceptable.
Implication:
NEG-set
Example
NEG(make, {Ferrari})
all value not in NEG-set are
maximal, all value in NEGset are at level 2 and worse
than all maximal values.
Preference Engineering
Inductive Construction of Preference-Nonnumerical base preference
POS/NEG preference:
POS/NEG(A, POS-set, NEG-set)
P is a POS preference, if:
x <P y iff ( x NEG-set y NEGset)(x NEG-set x POS-set
yPOS-set)
dom(A)
NEG-set
POS-set
Intuitive definition:
A desired value should be
one in a finite set of
favorites. Otherwise it
should not be any from a
finite set of disjoint dislikes.
If this infeasible, better than
getting nothing, any other
value from dislikes is
acceptable.
Example
POS/NEG(color, {yellow};
{gray})
Preference Engineering
Inductive Construction of Preference-Nonnumerical base preference
POS/POS preference:
POS/POS(A, POS1-set, POS2-set)
x <P y iff ( x POS2-set y
POS1-set)(x POS1-set x
POS2-set yPOS2-set) (x
POS1-set x POS2-set
yPOS1-set)
Intuitive definition:
A desired value should be in a
finite set of favorites POS1-set.
Otherwise, it should be from a
disjoint finite set of
alternatives POS2-set. If this
infeasible, better than getting
nothing, any other value is
acceptable.
Example
dom(A)
POS2-set
POS1-set
POS/POS(category, {cabrio};
{roadster})
Preference Engineering
Inductive Construction of Preference-Numerical base preference
AROUND preference: AROUND(A, z)
Given z dom(A), for all v dom(A) .
Define distance(v,z) :=abs(v-z)
Intuitive definition:
The desired value should
be z. If this infeasible,
values with shortest
distance from z are
acceptable.
P is called AROUND preference, if:
x <P y iff distance(x,z)>distance(y,z)
z
v
Example
AROUND(price, 40000)
Preference Engineering
Inductive Construction of Preference-Numerical base preference
BETWEEN preference: BETWEEN(A,[low, up])
Given z[low, up]dom(A)dom(A), for all v
dom(A) .
Define distance(v,[low, up]) :=
if v [low, up] then 0 else
if v <low then low-v else v-up
P is called BETWEEN preference, if:
x <P y iff:
distance(x, [low,up] ) > distance(y, [low, up])
Intuitive definition:
The desired value
should be between the
bounds of an interval.
If this infeasible,
values with shortest
distance from the
bounds are acceptable.
Example
v
v
low
up
BETWEEN(mileage,
[20000, 30000])
Preference Engineering
Inductive Construction of Preference-Numerical base preference
LOWEST, HIGHEST preference:
LOWEST(A), HIGHEST(A)
P is called LOWEST preference, if
x <P y iff x > y
P is called HIGHEST preference, if
x <P y iff x < y
Intuitive definition:
The desired value should
be as low (high) as
possible.
Example
LOW(price)
HIGHEST(power)
Preference Engineering
Inductive Construction of Preference-Numerical base preference
SCORE preference: SCORE(A, f)
P is called SCORE preference, if for some x, y dom(A):
x <P y iff f(x)<f(y)
Note: <p can be applied after the score function f gives a value.
Preference Engineering
Inductive Construction of Preference-Complex preference
Pareto preference:
P=(A1A2, <P1 P2)
Given P1=(A1, <P1) and P2=(A2,
<P2) and P2=(A2, <P2), for x, y
dom(A1)dom(A2), define:
x < P1 P2 y iff
(x1 <P1 y1 (x2 <p y2 x2=y2))
(x2 <P2 y2 (x1 <P1 y1 x1=y1))
Intuitive definition:
P1 and P2 are considered as
equally important
preferences. In order for
x=(x1, x2) to be better than
y=(y1, y2), it is not tolerable
that x is worse than y in any
x i.
Preference Engineering
Inductive Construction of Preference-Complex preference
Pareto preference example
For dom(A1)=dom(A2)=dom(A3)=integer and
P1 :=AROUND(A1, 0),
P2 :=LOWEST(A2),
P3 :=HIGHEST(A3),
P4 :=({A1, A2, A3}), <P4) :=(P1P2)P3
let’s study a subset preference of P4 for the following set:
R(A1, A2, A3)={val1:(-5, 3, 4), val2:(-5, 4, 4), val3:(5, 1, 8), val4:(5, 6, 6),
val5:(-6, 0, 6), val6:(-6, 0, 4), val7:(6, 2, 7)}
The ‘better-than’ graph of P4 for subset R can be obtained by
performing exhaustive ‘better-than’ checks:
Level 1:
val1
val3
Level 2:
val2
val4
val5
val7
val6
Preference Engineering
Inductive Construction of Preference-Complex preference
Prioritized preference: Intuitive definition:
P=(A1A2, <P1& P2)
Given P1=(A1, <P1) and
P2=(A2, <P2) and
P2=(A2, <P2), for x, y
dom(A1)dom(A2),
define:
P2 is respected only when P1 does not mind.
Example:
Let’s revisit Example 1, now studying:
P8 = ({A1, A2}, <P8) := P1&P2
R(A1, A2, A3)={val1:(-5, 3), val2:(-5, 4), val3:(5,
1), val4:(5, 6), val5:(-6, 0), val6:(-6, 0), val7:(6,
2)}
The ‘better-than’ graph of P8 for subset R
is this:
Level 1:
val1
val3
(x1 <P1 y1 (x1=y1 Level 2:
x2 <P y2)
Level 3:
val2
val4
x < P1& P2 y iff
val5
val6
val7
Preference Engineering
Inductive Construction of Preference-Complex preference
Writing a preference query - a used-car scenario
1. Write wish-list
Julia wants to buy a used car for herself and her friend Leslie, she wishes:
“My favorite car is Cabrio, but roadster is also good, I like an automatic
car and it should have a horsepower of about 100, these issues are equally
important to me, but color is the most important, it should not be gray, I
do not care too much about price, but since it is a used car, the lower, the
better”.
Julia goes to vendor Michael. Michael wishes : “ Clients’ wishes are
always more important than mine, I like to sell older cars, the have higher
commission” .
Julia also needs to ask Leslie, Leslie wishes: “I agree with Julia, I
convinced Julia money should matter as much as color, I like blue, if not
available, please not gray and red”
Preference Engineering
Inductive Construction of Preference-Complex preference
2. Convert wish-list to
preference query
terms
Julia:
P1:=POS/POS(category, {cabrio};{roadster})
P2:=POS(transmission, {automatic})
P3:=AROUND(horsepower, 100)
P4:=LOWEST(price)
P5:=NEG(color, {gray})
Michael:
P6:=HIGHEST(year-of-construction)
P7:=HIGHEST(commission)
Leslie:
P8:=POS/NEG(color, {blue};{gray, red})
3. Use Pareto and
Prioritization to add
each query terms
Q1=({color, category, transmission,
horsepower, price},
<Q1) :=P5&((P1P2P3)&P4)
Q2=({color, category, transmission,
horsepower, price, year-ofconstruction,
commission},<Q2) :=(Q1&P6)&P7
=((P5&((P1P2P3)&P4))&P6)&
P7
Q2=({color, category, transmission,
horsepower, price, year-ofconstruction,
commission},<Q2) :=(Q1&P6)&P7=
(((P5 P8
P4)&(P1P2P3))&P6))&P7
Evaluation of Preference
Queries
How a preference query like the following is evaluated:
Decomposition of queries
Q2=({color, category, transmission, horsepower, price, year-of-construction,
commission},<Q2) :=(Q1&P6)&P7=(((P5 P8 P4)&(P1P2P3))&P6))&P7
Decomposition of ‘&’ and ‘’
[P1&P2](R)= [P1] [P2 groupby A1](R)
[P1P2](R)= ([P1](R) [P2 groupby A1](R))
([P2](R) [P1 groupby A2](R)) YY(P1&P2, P2& P1)R
Decomposition of ‘+’ and ‘’
[P1+P2](R)= [P1](R) [P2](R)
[P1P2](R)= [P1](R) [P2](R) YY(P1, P2)R
Evaluation of Preference
Queries
BMO query model R
Database
preference P
Assume P=(A, <P), where A=(A1, A2, …, Ak).
a) Each R[A] dom(A) defines a subset preference, called a database
preference and denoted by:
PR=(R[A], <P)
b) Tuple tR is a perfect match in a database set R, if:
t[A] max(P) t[A] R.
Note: preference query performs a match-making between the stated
preference and the database preference.
[P](R)={tR | t[A] max(PR )}
Note: [P](R)
evaluates P against database set R by retrievingall maximal
R
values from P
Shooting Stars in the Sky:
An Online Algorithm for Skyline Queries
Authors:
Donald Kossmann
Frank Ramsak
Steen Rost
Technische Universit. at M. unchen
Orleansstr. 34
81667 Munich
Germany
A Online Algorithm for Skyline Queries
Contents:
Skyline Queries
The NN Algorithm for 2-Dimensional Skylines
An Example for 2-Dimensional Skylines
The NN Algorithm for d-Dimensional Skylines
Skyline Queries
Retrieves all interesting
points.
Helps user get a big
picture of interesting
options.
If users moves, skyline
should be re-computed,
user’s choice based on
user’s location.
The NN Algorithm for 2- Dimensional
Skylines
Input: Data set D
Distance function f (Euclidean distance)
/* Initialization: the whole data space needs to be inspected*/
T ={(, )}
/* Loop: iterate until all regions have been investigated*/
WHILE (T 0) Do
(mx, my)=takeElement(T)
IF( boundedNNSearch(O, D, (mx, my), f)) THEN
( nx, ny)= boundedNNSearch(O, D, (mx, my), f))
T=T{(nx, my), (mx, ny)}
OUTPUT n
ENDIF
ENDWHILE
An Example for 2-Dimensional Skylines
The NN Algorithm for d-Dimensional
Skylines
Approaches to Deal With Duplicates
Laisser-faire
propagate
Merge
Fine-grained Partitioning
Hybrid Approaches
Foundations of Preferences in Database
Systems
Shooting Stars in the Sky:
An Online Algorithm for Skyline Queries
Authors:
Donald Kossmann
Frank Ramsak
Steen Rost
Technische Universit. at M. unchen
Orleansstr. 34
81667 Munich
Germany