Evaluating & Comparing (Deterministic) BT Search Algorithms
Evaluation of (Deterministic) BT Search Algorithms
Foundations of Constraint Processing
CSCE421/821, Fall 2014
www.cse.unl.edu/~choueiry/F14-421-821/
All questions to Piazza
Berthe Y. Choueiry (Shu-we-ri)
Avery Hall, Room 360
Foundations of Constraint Processing
Evaluation to BT Search
Outline
• Evaluation of (deterministic) BT search algorithms [Dechter, 6.6.2]
– CSP parameters
– Comparison criteria
– Theoretical evaluations
– Empirical evaluations
CSP parameters
• Binary: n, a, p1, t; Non-binary: n, a, p1, k, t
• Number of variables: n
• Domain size: a, d
• Degree of a variable: deg
• Arity of the constraints: k
• Constraint tightness: t = |forbidden tuples| / |all tuples|
• Proportion of constraints (a.k.a. constraint density, constraint probability): p1 = e / emax, where e is the number of constraints
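The two ratios above can be computed directly. The following is a minimal sketch, assuming a binary CSP (so emax = n(n-1)/2) in which all domains have the same size a; the function names are hypothetical and chosen only for illustration.

```python
def max_constraints(n: int) -> int:
    """e_max for a binary CSP: the number of distinct pairs of variables."""
    return n * (n - 1) // 2


def density(e: int, n: int) -> float:
    """Constraint density p1 = e / e_max (assumes binary constraints)."""
    return e / max_constraints(n)


def tightness(forbidden: int, a: int, k: int = 2) -> float:
    """Tightness t = |forbidden tuples| / |all tuples|, with a**k tuples in total.

    Assumes every variable in the constraint's scope has domain size a.
    """
    return forbidden / a ** k


p1 = density(e=30, n=10)          # 30 constraints out of 45 possible pairs
t = tightness(forbidden=20, a=5)  # 20 forbidden tuples out of 25
print(f"p1 = {p1:.3f}, t = {t:.2f}")
```

For a non-binary constraint, the same tightness function applies with k set to the arity.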
Comparison criteria
1. Number of nodes visited (#NV)
• Every time you call label
2. Number of constraint checks (#CC)
• Every time you call check(i,j)
3. CPU time
• Be as honest and consistent as possible
4. Number of backtracks (#BT)
• Every un-assignment of a variable in unlabel
5. Some specific criterion for assessing the quality of the improvement proposed
Presentation of values:
• Descriptive statistics of criterion: average, median, mode, max, min
• (Qualified) run-time distribution
• Solution-quality distribution
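To make the counting criteria concrete, here is a minimal sketch of chronological backtracking instrumented with the three counters. It is a recursive illustration, not the label/unlabel formulation referred to above: it assumes #NV is incremented once per value tried, #CC once per constraint consulted, and #BT once per un-assignment. The names Counters and solve, and the constraint representation (a dict from variable pairs to sets of allowed value pairs), are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class Counters:
    nv: int = 0  # nodes visited: one per value tried
    cc: int = 0  # constraint checks: one per constraint consulted
    bt: int = 0  # backtracks: one per un-assignment of a variable


def solve(domains, constraints, counters):
    """Find one solution by chronological backtracking, or return None.

    domains: list of value lists, one per variable (static ordering 0..n-1).
    constraints: dict mapping (i, j) with i < j to the set of allowed pairs.
    """
    n = len(domains)
    assignment = []

    def consistent(j, value):
        # Check the candidate value against every already-assigned variable.
        for i in range(j):
            allowed = constraints.get((i, j))
            if allowed is None:
                continue
            counters.cc += 1
            if (assignment[i], value) not in allowed:
                return False
        return True

    def extend(j):
        if j == n:
            return True
        for value in domains[j]:
            counters.nv += 1
            if consistent(j, value):
                assignment.append(value)
                if extend(j + 1):
                    return True
                assignment.pop()  # un-assign variable j
                counters.bt += 1
        return False

    return list(assignment) if extend(0) else None


# Example: 3-coloring of a triangle (all pairs must differ).
differ = {(x, y) for x in range(3) for y in range(3) if x != y}
triangle = {(0, 1): differ, (0, 2): differ, (1, 2): differ}
stats = Counters()
solution = solve([[0, 1, 2]] * 3, triangle, stats)
print(solution, stats)
```

Keeping the counters in one object makes it easy to report several criteria for the same run and to reset them between instances.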
Theoretical evaluations
• Comparing NV and/or CC
• Common assumptions:
– for finding all solutions
– static/same orderings
Empirical evaluation: data sets
• Use real-world data (anecdotal evidence)
• Use benchmarks
– csplib.org
– Solver competition benchmarks
• Use randomly generated problems
– Various models of random generators
– Guaranteed with a solution
– Uniform or structured
Empirical evaluations: random problems
• Various models exist (use Model B)
– Models A, B, C, E, F, etc.
• Vary parameters: <n, a, t, p1>
– Number of variables: n
– Domain size: a, d
– Constraint tightness: t = |forbidden tuples| / |all tuples|
– Proportion of constraints (a.k.a. constraint density, constraint probability): p1 = e / emax
• Issues:
– Uniformity
– Difficulty (phase transition)
– Solvability of instances (for incomplete search techniques)
Model B
1. Input: n, a, t, p1
2. Generate n nodes
3. Generate a list of the n(n-1)/2 tuples of all combinations of 2 nodes
4. Choose e elements from the above list as constraints between the n nodes
5. If the graph is not connected, throw it away and go back to step 4; else proceed
6. Generate a list of the a^2 tuples of all combinations of 2 values
7. For each constraint, randomly choose a number of tuples from the list to guarantee tightness t for the constraint
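The steps above can be sketched as follows. This is an illustrative reading of the procedure, not a reference implementation: it assumes e = round(p1 · n(n-1)/2), that the tuples chosen in step 7 are the forbidden ones (round(t · a²) per constraint), and it adds a hypothetical max_tries guard so the rejection loop of step 5 always terminates.

```python
import itertools
import random


def is_connected(n, edges):
    """Depth-first search from node 0; True if all n nodes are reached."""
    adjacent = {v: set() for v in range(n)}
    for i, j in edges:
        adjacent[i].add(j)
        adjacent[j].add(i)
    seen, stack = {0}, [0]
    while stack:
        for w in adjacent[stack.pop()]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return len(seen) == n


def model_b(n, a, t, p1, rng=None, max_tries=10_000):
    """Return {(i, j): set of forbidden (vi, vj) pairs} for a random binary CSP."""
    if rng is None:
        rng = random.Random()
    pairs = list(itertools.combinations(range(n), 2))         # step 3
    e = round(p1 * len(pairs))
    if e < n - 1:
        raise ValueError("too few constraints for a connected graph")
    tuples = list(itertools.product(range(a), repeat=2))      # step 6
    n_forbidden = round(t * a * a)
    for _ in range(max_tries):
        edges = rng.sample(pairs, e)                          # step 4
        if is_connected(n, edges):                            # step 5
            break
    else:
        raise RuntimeError("no connected graph found; increase p1 or max_tries")
    # Step 7: the same number of forbidden tuples for every constraint.
    return {edge: set(rng.sample(tuples, n_forbidden)) for edge in edges}


instance = model_b(n=10, a=5, t=0.4, p1=0.5, rng=random.Random(0))
print(len(instance), "constraints")
```

Passing an explicit random.Random object makes each instance reproducible from its seed, which helps when instances must be stored and reused across algorithms.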
Phase transition
[Figure: cost of solving plotted against the order parameter, showing a region of mostly solvable problems, a region of mostly unsolvable problems, and the critical value of the order parameter] [Cheeseman et al. '91]
• Significant increase of cost around critical value
• In CSPs, order parameter is constraint tightness & ratio
• Algorithms compared around phase transition
Tests
• Fix n, a, p1 and
– Vary t in {0.1, 0.2, …,0.9}
• Fix n, a, t and
– Vary p1 in {0.1, 0.2, …,0.9}
• For each data point (for each value of t/p1)
– Generate (at least) 50 instances
– Store all instances
• Make measurements
– #NV, #CC, CPU time, #messages, etc.
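The test protocol above can be organized as a small parameter sweep. The sketch below is one possible arrangement under stated assumptions: generate and measure are hypothetical callables supplied by the experimenter (for instance, a random-problem generator and a function that runs an algorithm and returns its #CC), and each instance gets its own seeded random generator so that stored instances can be regenerated.

```python
import random


def sweep(generate, measure, values, instances_per_point=50, seed=0):
    """For each parameter value, generate instances, store them, and measure.

    generate(value, rng) -> instance   (hypothetical, user-supplied)
    measure(instance)    -> number     (hypothetical, user-supplied)
    """
    stored, results = {}, {}
    for value in values:
        stored[value], results[value] = [], []
        for k in range(instances_per_point):
            # One reproducible random stream per (value, instance index).
            rng = random.Random(f"{seed}:{value}:{k}")
            instance = generate(value, rng)
            stored[value].append(instance)
            results[value].append(measure(instance))
    return stored, results


T_VALUES = [i / 10 for i in range(1, 10)]  # 0.1, 0.2, ..., 0.9

# Toy stand-ins, only to show the calling convention.
stored, results = sweep(
    generate=lambda t, rng: (t, rng.random()),
    measure=lambda instance: instance[0] * 2,
    values=T_VALUES,
)
print({t: len(rows) for t, rows in results.items()})
```

Running every algorithm under comparison on the same stored instances is what makes a paired statistical test applicable afterwards.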
Comparing two algorithms A1 and A2
• Store all measurements in Excel
• Use Excel, R, SAS, etc. for statistical measurements
• Use the t-test, paired test
• Comparing measurements
– A1, A2 are significantly different
• Comparing ln measurements
– A1 is significantly better than A2
[Table: one row per instance i1, i2, i3, …, i50, with columns #CC (A1, A2) and ln(#CC) (A1, A2); the sample entries shown are 100 and 200]
For Excel: Microsoft button, Excel Options, Add-Ins, Analysis ToolPak, Go, check the box for Analysis ToolPak, Go. Install…
t-test in Excel
• Using ln values
– p = ttest(array1, array2, tails, type)
• tails = 1 or 2
• type = 1 (paired)
– t = tinv(p, df)
• degrees of freedom (df) = #instances – 1 for the paired test (#pairs – 1)
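Outside Excel, the paired t statistic on ln values can be computed with a few lines of standard-library Python. This is a minimal sketch, assuming the two lists hold strictly positive measurements (such as #CC) for the same instances in the same order; the function name is hypothetical.

```python
import math
import statistics


def paired_t_on_logs(a1, a2):
    """Paired t statistic computed on ln(a1[i]) - ln(a2[i]).

    A positive t means the A1 measurements tend to be larger than A2's.
    Requires at least two pairs and differences that are not all equal.
    """
    if len(a1) != len(a2):
        raise ValueError("paired test needs one measurement per instance for each algorithm")
    diffs = [math.log(x) - math.log(y) for x, y in zip(a1, a2)]
    mean = statistics.fmean(diffs)
    sd = statistics.stdev(diffs)  # sample standard deviation
    return mean / (sd / math.sqrt(len(diffs)))


cc_a1 = [100, 200, 400, 800]  # made-up #CC values, for illustration only
cc_a2 = [50, 150, 100, 400]
print(round(paired_t_on_logs(cc_a1, cc_a2), 3))
```

Taking logarithms before differencing compares the algorithms by ratio rather than by absolute difference, which limits the influence of a few very hard instances.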
t-test with 95% confidence
• One-tailed test
– Interested in direction of change
– When t > 1.645, A1 is larger than A2
– When t < -1.645, A2 is larger than A1
– When -1.645 ≤ t ≤ 1.645, A1 and A2 do not differ significantly
– |t| = 1.645 corresponds to p = 0.05 for a one-tailed test
• Two-tailed test
– Although it tells direction, not as accurate as the one-tailed test
– When t > 1.96, A1 is larger than A2
– When t < -1.96, A2 is larger than A1
– When -1.96 ≤ t ≤ 1.96, A1 and A2 do not differ significantly
– |t| = 1.96 corresponds to p = 0.05 for a two-tailed test
• p=0.05 is a US Supreme Court ruling: any statistical analysis needs
to be significant at the 0.05 level to be admitted in court
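The thresholds 1.645 and 1.96 are the large-sample (normal) limits of the t critical values; for small degrees of freedom the exact t value is somewhat larger and is best read from a t table. The sketch below reproduces the two thresholds with the standard library and applies the decision rule from the lists above; the function names are hypothetical.

```python
from statistics import NormalDist


def critical_value(alpha=0.05, tails=1):
    """Large-sample critical value: the (1 - alpha/tails) normal quantile."""
    if tails not in (1, 2):
        raise ValueError("tails must be 1 or 2")
    return NormalDist().inv_cdf(1 - alpha / tails)


def compare(t, alpha=0.05, tails=1):
    """Apply the decision rule to a t statistic computed as A1 minus A2."""
    c = critical_value(alpha, tails)
    if t > c:
        return "A1 is larger than A2"
    if t < -c:
        return "A2 is larger than A1"
    return "A1 and A2 do not differ significantly"


print(round(critical_value(tails=1), 3), round(critical_value(tails=2), 2))
print(compare(1.8, tails=1))
print(compare(1.8, tails=2))
```

The same statistic t = 1.8 is significant under the one-tailed rule but not under the two-tailed one, which is why the choice of test should be fixed before looking at the data.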
Computing the 95% confidence interval
• The t test can be used to test the equality of the means
of two normal populations with unknown, but equal,
variance.
• We usually use the t-test
• Assumptions
– Normal distribution of data
– Sampling distribution of the mean approaches a normal distribution (holds when #instances ≥ 30)
– Equality of variances
Sampling distribution: distribution calculated from all possible samples
of a given size drawn from a given population
Alternatives to the t test
• To relax the normality assumption, a non-parametric
alternative to the t test can be used, and the usual
choices are:
– for independent samples, the Mann-Whitney U test
– for related samples, either the binomial test or the Wilcoxon
signed-rank test
• To test the equality of the means of more than two
normal populations, an Analysis of Variance can be
performed
• To test the equality of the means of two normal
populations with known variance, a Z-test can be
performed
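As an illustration of the binomial test for related samples, the sketch below implements the sign test: it counts on how many instances each algorithm has the smaller measurement and asks how likely such a split is if both outcomes were equally probable. It is a minimal two-sided version that discards ties; the function name is hypothetical.

```python
from math import comb


def sign_test(a1, a2):
    """Two-sided sign test p-value for paired measurements (ties discarded)."""
    wins = sum(x < y for x, y in zip(a1, a2))    # A1 has the smaller value
    losses = sum(x > y for x, y in zip(a1, a2))  # A2 has the smaller value
    n = wins + losses
    k = min(wins, losses)
    # Probability of at most k successes in n fair coin flips.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Made-up measurements: A1 is smaller on all 10 instances.
a1 = [10, 12, 9, 14, 11, 13, 8, 15, 10, 12]
a2 = [20, 18, 17, 21, 19, 25, 16, 22, 30, 24]
print(sign_test(a1, a2))
```

The sign test uses only the direction of each difference, so it makes no normality assumption but has less power than the Wilcoxon signed-rank test, which also uses the ranks of the differences.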
Alerts
• For choosing the value of t in general, check
http://www.socr.ucla.edu/Applets.dir/T-table.html
• For a sound statistical analysis
– consult the Help Desk of the Department of Statistics
at UNL
– held at least twice a week at Avery Hall.
• Acknowledgments: Dr. Makram Geha, Department of
Statistics @ UNL. All errors are mine.