Evaluating (deterministic) BT Search Algorithms


Evaluation of (Deterministic) BT
Search Algorithms
Foundations of Constraint Processing
CSCE421/821, Spring 2009
www.cse.unl.edu/~choueiry/S09-421-821/
All questions to [email protected]
Berthe Y. Choueiry (Shu-we-ri)
Avery Hall, Room 360
[email protected]
Tel: +1(402)472-5444
Foundations of Constraint Processing, Spring 2009
February 20, 2009
Evaluation to BT Search
1
Outline
• Evaluation of (deterministic) BT search algorithms [Dechter, 6.6.2]
– CSP parameters
– Comparison criteria
– Theoretical evaluations
– Empirical evaluations
CSP parameters
• Binary: n, a, p1, t; Non-binary: n, a, p1, k, t
• Number of variables: n
• Domain size: a, d
• Degree of a variable: deg
• Arity of the constraints: k
• Constraint tightness: t = |forbidden tuples| / |all tuples|
• Proportion of constraints (a.k.a., constraint density, constraint probability):
p1 = e / emax, e is number of constraints
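As a small worked example, the parameters of a binary CSP can be computed as below. This is a sketch, not course code: the function names are made up for illustration, and it assumes a binary CSP with a uniform domain size a, so that emax = n(n-1)/2 and each constraint ranges over a^2 tuples.

```python
def max_constraints(n: int) -> int:
    """emax for a binary CSP: the number of distinct pairs of variables."""
    return n * (n - 1) // 2


def constraint_density(n: int, e: int) -> float:
    """p1 = e / emax, where e is the number of constraints."""
    return e / max_constraints(n)


def constraint_tightness(forbidden: int, a: int, k: int = 2) -> float:
    """t = |forbidden tuples| / |all tuples|; a**k tuples for arity k (assumed uniform domains)."""
    return forbidden / a ** k


print(max_constraints(10))         # 45
print(constraint_density(10, 9))   # 0.2
print(constraint_tightness(4, 4))  # 0.25
```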
Comparison criteria
1. Number of nodes visited (#NV)
• Every time you call label
2. Number of constraint checks (#CC)
• Every time you call check(i,j)
3. CPU time
• Be as honest and consistent as possible
4. Number of backtracks (#BT)
• Every un-assignment of a variable in unlabel
5. Some specific criterion for assessing the quality of the improvement proposed
Presentation of values:
• Descriptive statistics of criterion: average, median, mode, max, min
• (qualified) run-time distribution
• Solution-quality distribution
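To make the counters concrete, here is a minimal sketch of chronological backtracking instrumented with #NV, #CC and #BT. It is an illustration under assumptions, not the label/unlabel code referred to above: it is recursive, it counts one node per value tried for a variable, one check per call to check(i, j), and one backtrack each time a variable is un-assigned because a deeper variable ran out of values. The names Counters and solve and the conflicts data structure are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Counters:
    nv: int = 0  # nodes visited: one per value tried (assumed convention)
    cc: int = 0  # constraint checks: one per call to check(i, j)
    bt: int = 0  # backtracks: one per un-assignment after a deeper dead end


def solve(domains, conflicts, counters):
    """Find the first solution with a static variable ordering, or return None.

    domains: list of value lists, one per variable.
    conflicts: dict mapping (i, j), i < j, to a set of forbidden (vi, vj) pairs.
    """
    assignment = []

    def check(i, j):
        counters.cc += 1
        return (assignment[i], assignment[j]) not in conflicts.get((i, j), set())

    def search(var):
        if var == len(domains):
            return True
        for value in domains[var]:
            counters.nv += 1
            assignment.append(value)
            # all() stops at the first failed check, so only real checks are counted
            if all(check(past, var) for past in range(var)):
                if search(var + 1):
                    return True
                counters.bt += 1  # the next variable is exhausted: un-assign var
            assignment.pop()
        return False

    return list(assignment) if search(0) else None


not_equal = {(0, 0), (1, 1)}
stats = Counters()
print(solve([[0, 1]] * 3, {(0, 1): not_equal, (1, 2): not_equal}, stats))  # [0, 1, 0]
print(stats)  # Counters(nv=4, cc=4, bt=0)
```

Whatever convention is chosen, the same one has to be applied to every algorithm being compared.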
Theoretical evaluations
• Comparing NV and/or CC
• Common assumptions:
– for finding all solutions
– static/same orderings
Empirical evaluation: data sets
• Use real-world data (anecdotal evidence)
• Use benchmarks
– csplib.org
– Solver competition benchmarks
• Use randomly generated problems
– Various models of random generators
– Guaranteed with a solution
– Uniform or structured
Empirical evaluations: random problems
• Various models exist (use Model B)
– Models A, B, C, E, F, etc.
• Vary parameters: <n, a, t, p1>
– Number of variables: n
– Domain size: a, d
– Constraint tightness: t = |forbidden tuples| / |all tuples|
– Proportion of constraints (a.k.a., constraint density, constraint probability): p1 = e / emax
• Issues:
– Uniformity
– Difficulty (phase transition)
– Solvability of instances (for incomplete search techniques)
Model B
1. Input: n, a, t, p1
2. Generate n nodes
3. Generate a list of n(n-1)/2 tuples of all combinations of 2 nodes
4. Choose e elements from the above list as constraints between the n nodes, where e = p1 · n(n-1)/2
5. If the graph is not connected, throw it away and go back to step 4, else proceed
6. Generate a list of a^2 tuples of all combinations of 2 values
7. For each constraint, randomly choose from the list the number of tuples that guarantees tightness t for the constraint
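The steps above can be sketched in Python as follows. This is one possible reading, not a reference implementation: it assumes the tuples chosen in step 7 are the forbidden ones (round(t · a^2) per constraint), that e is rounded to the nearest integer, and that generation gives up after max_tries disconnected graphs. The names model_b, is_connected, rng and max_tries are hypothetical.

```python
import itertools
import random


def is_connected(n, edges):
    """Return True if the graph on nodes 0..n-1 with the given edges is connected."""
    neighbors = {v: set() for v in range(n)}
    for i, j in edges:
        neighbors[i].add(j)
        neighbors[j].add(i)
    seen, frontier = {0}, [0]
    while frontier:
        for w in neighbors[frontier.pop()]:
            if w not in seen:
                seen.add(w)
                frontier.append(w)
    return len(seen) == n


def model_b(n, a, t, p1, rng=None, max_tries=1000):
    """Return {(i, j): set of forbidden (vi, vj) pairs} for a random binary CSP."""
    rng = rng or random.Random()
    pairs = list(itertools.combinations(range(n), 2))           # step 3
    e = round(p1 * len(pairs))                                   # e = p1 * emax
    value_pairs = list(itertools.product(range(a), repeat=2))   # step 6
    n_forbidden = round(t * len(value_pairs))                    # assumed reading of step 7
    for _ in range(max_tries):
        scopes = rng.sample(pairs, e)                            # step 4
        if is_connected(n, scopes):                              # step 5
            break
    else:
        raise ValueError("no connected constraint graph found; p1 may be too low")
    return {scope: set(rng.sample(value_pairs, n_forbidden)) for scope in scopes}


csp = model_b(n=10, a=4, t=0.25, p1=0.4, rng=random.Random(0))
print(len(csp))  # 18 constraints, each with 4 forbidden pairs
```

Passing a seeded random.Random makes an instance reproducible, which helps when instances must be stored and reused across algorithms.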
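Phase transition [Cheeseman et al. ‘91]
[Figure: cost of solving (vertical axis) plotted against the order parameter (horizontal axis), with regions labeled "Mostly solvable problems" and "Mostly un-solvable problems" on either side of the critical value of the order parameter]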
• Significant increase of cost around critical value
• In CSPs, order parameter is constraint tightness & ratio
• Algorithms compared around phase transition
Tests
• Fix n, a, p1 and
– Vary t in {0.1, 0.2, …,0.9}
• Fix n, a, t and
– Vary p1 in {0.1, 0.2, …,0.9}
• For each data point (for each value of t/p1)
– Generate (at least) 50 instances
– Store all instances
• Make measurements
– #NV, #CC, CPU time, #messages, etc.
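A test set following this plan could be assembled as sketched below for the case where t varies; the case where p1 varies is symmetric. The function build_test_set, its parameters, and the stand-in generator toy_generate are hypothetical. A real experiment would plug in an actual instance generator and write the instances to disk so every algorithm runs on the same ones.

```python
import random


def build_test_set(generate, n, a, p1, tightness_values, per_point=50, seed=0):
    """Return {t: [instance, ...]} with per_point instances for each value of t."""
    rng = random.Random(seed)  # fixed seed so the stored set can be regenerated
    return {
        t: [generate(n, a, t, p1, rng) for _ in range(per_point)]
        for t in tightness_values
    }


def toy_generate(n, a, t, p1, rng):
    """Stand-in generator (assumption): records the parameters and a random id only."""
    return {"n": n, "a": a, "t": t, "p1": p1, "id": rng.randrange(10**6)}


tightness_values = [i / 10 for i in range(1, 10)]  # 0.1, 0.2, ..., 0.9
test_set = build_test_set(toy_generate, n=20, a=5, p1=0.3,
                          tightness_values=tightness_values)
print(len(test_set), len(test_set[0.5]))  # 9 50
```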
Comparing two algorithms A1 and A2
• Store all measurements in Excel
• Use Excel, R, SAS, etc. for statistical measurements
• Use the t-test, paired test
• Comparing measurements
– A1, A2 are significantly different
• Comparing ln measurements
– A1 is significantly better than A2

Instance   #CC: A1   #CC: A2   ln(#CC): A1   ln(#CC): A2
i1         100       200       …             …
i2         …         …         …             …
i3         …         …         …             …
…
i50        …         …         …             …

For Excel: Microsoft button, Excel Options, Add-Ins, Analysis ToolPak, Go, check the box for Analysis ToolPak, Go. Install…
t-test in Excel
• Using ln values
– p ← ttest(array1,array2,tails,type)
• tails=1 or 2
• type=1 (paired)
– t ← tinv(p,df)
• degrees of freedom = #instances – 1 for the paired test
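For readers without Excel, the statistic behind a paired t-test on ln values can be computed directly. The sketch below uses only the Python standard library: t is the mean of the per-instance differences ln(#CC of A1) - ln(#CC of A2) divided by its standard error, with n - 1 degrees of freedom for n paired instances. The function name and the sample measurements are made up for illustration.

```python
import math
import statistics


def paired_t_on_logs(cc_a1, cc_a2):
    """Return (t, df) for a paired t-test on ln(#CC) of A1 versus A2."""
    if len(cc_a1) != len(cc_a2):
        raise ValueError("paired samples must have the same length")
    diffs = [math.log(x) - math.log(y) for x, y in zip(cc_a1, cc_a2)]
    n = len(diffs)
    std_error = statistics.stdev(diffs) / math.sqrt(n)  # sample standard deviation
    return statistics.mean(diffs) / std_error, n - 1


# Hypothetical #CC measurements on the same five instances.
a1 = [100, 120, 90, 150, 110]
a2 = [200, 210, 160, 330, 190]
t, df = paired_t_on_logs(a1, a2)
print(t < 0, df)  # True 4: A1 performed fewer checks on these made-up data
```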
t-test with 95% confidence
• One-tailed test
– Interested in direction of change
– When t > 1.645, A1 is larger than A2
– When t < -1.645, A2 is larger than A1
– When -1.645 ≤ t ≤ 1.645, A1 and A2 do not differ significantly
– |t|=1.645 corresponds to p=0.05 for a one-tailed test
• Two-tailed test
– Although it tells direction, not as accurate as the one-tailed test
– When t > 1.96, A1 is larger than A2
– When t < -1.96, A2 is larger than A1
– When -1.96 ≤ t ≤ 1.96, A1 and A2 do not differ significantly
– |t|=1.96 corresponds to p=0.05 for a two-tailed test
• p=0.05 is a US Supreme Court ruling: any statistical analysis needs
to be significant at the 0.05 level to be admitted in court
Computing the 95% confidence interval
• The t test can be used to test the equality of the means
of two normal populations with unknown, but equal,
variance.
• We usually use the t-test
• Assumptions
– Normal distribution of data
– Sampling distribution of the mean approaches a normal distribution (holds when #instances ≥ 30)
– Equality of variances
Sampling distribution: distribution calculated from all possible samples of a given size drawn from a given population
Alternatives to the t test
• To relax the normality assumption, a non-parametric
alternative to the t test can be used, and the usual
choices are:
– for independent samples, the Mann-Whitney U test
– for related samples, either the binomial test or the Wilcoxon
signed-rank test
• To test the equality of the means of more than two
normal populations, an Analysis of Variance can be
performed
• To test the equality of the means of two normal
populations with known variance, a Z-test can be
performed
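As an illustration of a non-parametric alternative for related samples, the sketch below implements an exact two-sided binomial test on the signs of the paired differences (often called the sign test). It assumes that this is the binomial test meant above, that smaller measurements are better, and that ties are dropped. The function name and the sample data are hypothetical.

```python
from math import comb


def sign_test_p_value(sample1, sample2):
    """Exact two-sided binomial test on the signs of paired differences."""
    wins = sum(x < y for x, y in zip(sample1, sample2))    # sample1 smaller
    losses = sum(x > y for x, y in zip(sample1, sample2))  # sample2 smaller
    n = wins + losses                                      # ties are dropped
    if n == 0:
        return 1.0
    k = min(wins, losses)
    # Probability of k or fewer of the rarer sign when both signs are equally likely.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Made-up data: A1 needs fewer checks than A2 on all 10 instances.
a1 = [100, 120, 90, 150, 110, 95, 130, 105, 140, 115]
a2 = [200, 210, 160, 330, 190, 180, 260, 215, 300, 220]
print(sign_test_p_value(a1, a2))  # 0.001953125
```

Only the direction of each difference is used, so no normality assumption is needed, at the cost of ignoring how large the differences are.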
Alerts
• For choosing the value of t in general, check
http://www.socr.ucla.edu/Applets.dir/T-table.html
• For a sound statistical analysis
– consult the Help Desk of the Department of Statistics
at UNL
– held at least twice a week at Avery Hall.
• Acknowledgments: Dr. Makram Geha, Department of Statistics @ UNL. All errors are mine.