Neighborhood Interchangeability (NI) for Non-Binary CSPs & Application to Databases
Anagh Lal
M.S. thesis defense, April 21, 2005
Constraint Systems Laboratory
Computer Science & Engineering
University of Nebraska-Lincoln
Research supported by NSF CAREER award #0133568 and by Maude Hammond Fling Faculty Research Fellowship.
Main contributions
CSPs
1. Interchangeability: An algorithm for neighborhood interchangeability (NI) in non-binary CSPs
2. Dynamic bundling: Integrating NI + backtrack search for solving non-binary CSPs
3. Exploratory: Towards detecting substitutability
Databases
1. A new model of the join query as a CSP
2. A new sorting-based bundling algorithm
3. A new sort-merge join algorithm that produces bundled tuples
4. Exploratory: Application to materialized views
Outline
• Background
• Neighborhood Interchangeability (NI) for non-binary CSPs
• Empirical evaluations
• Database algorithms based on dynamic bundling
• Conclusions & future work
Constraint Satisfaction Problem
• Given P = (V, D, C)
– V: set of variables
– D: set of their domains
– C: set of constraints restricting the acceptable combinations of values for variables
– Solution is a consistent assignment of values to variables
• Query: find 1 solution, all solutions, etc.
• Examples: SAT, scheduling, product configuration
• NP-complete in general
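The definition P = (V, D, C) can be made concrete with a small sketch. The instance below is hypothetical (the variables X, Y, Z and their constraints are illustrative and are not the example from the slide's figure), and the brute-force enumeration is only meant to show what "consistent assignment" means, not how a solver works.

```python
from itertools import product

# Hypothetical toy instance of P = (V, D, C); not the slide's example.
variables = ["X", "Y", "Z"]
domains = {"X": {1, 2, 3}, "Y": {1, 2, 3}, "Z": {1, 2}}
# Each constraint is a pair (scope, predicate over the values of the scope).
constraints = [
    (("X", "Y"), lambda x, y: x < y),
    (("Y", "Z"), lambda y, z: y != z),
]

def is_consistent(assignment):
    """True if a complete assignment satisfies every constraint."""
    return all(pred(*(assignment[v] for v in scope)) for scope, pred in constraints)

def all_solutions():
    """Enumerate every solution by brute force (exponential; illustration only)."""
    for values in product(*(sorted(domains[v]) for v in variables)):
        assignment = dict(zip(variables, values))
        if is_consistent(assignment):
            yield assignment

solutions = list(all_solutions())
```

Asking for one solution corresponds to taking the first item of the generator; asking for all solutions corresponds to exhausting it.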
[Figure: example constraint graph over variables V1, V2, V3, V4 with domains {d}, {a, b, d}, {a, b, c}, and {c, d, e, f}]
Systematic search
• Basic mechanism
– DFS & backtracking (BT)
– Variable being instantiated: current variable
– Uninstantiated variables: future variables
– Instantiated variables: past variables
• Constraint propagation
– Remove values inconsistent with constraints
– Forward checking filters domains of future variables given the instantiation of current variable
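A minimal sketch of backtrack search with forward checking, assuming binary constraints given as predicates. The function names, the data layout, and the toy instance are assumptions made for illustration, not the implementation used in the thesis.

```python
def forward_check(var, value, domains, constraints, future):
    """Filter the domains of future variables given var=value.

    constraints maps an ordered pair (x, y) to a predicate ok(vx, vy).
    Returns the filtered future domains, or None if some domain is wiped out.
    """
    new_domains = {}
    for other in future:
        allowed = domains[other]
        if (var, other) in constraints:
            ok = constraints[(var, other)]
            allowed = {w for w in allowed if ok(value, w)}
        elif (other, var) in constraints:
            ok = constraints[(other, var)]
            allowed = {w for w in allowed if ok(w, value)}
        if not allowed:
            return None
        new_domains[other] = allowed
    return new_domains

def fc_search(order, domains, constraints, assignment=None):
    """Depth-first backtrack search with forward checking; yields all solutions."""
    assignment = assignment or {}
    if len(assignment) == len(order):
        yield dict(assignment)
        return
    current = order[len(assignment)]        # current variable
    future = order[len(assignment) + 1:]    # future variables
    for value in sorted(domains[current]):
        filtered = forward_check(current, value, domains, constraints, future)
        if filtered is None:
            continue                        # dead end: try the next value
        assignment[current] = value
        yield from fc_search(order, {**domains, **filtered}, constraints, assignment)
        del assignment[current]             # backtrack

# Hypothetical toy instance.
domains = {"X": {1, 2, 3}, "Y": {1, 2, 3}, "Z": {1, 2}}
constraints = {("X", "Y"): lambda x, y: x < y, ("Y", "Z"): lambda y, z: y != z}
solutions = list(fc_search(["X", "Y", "Z"], domains, constraints))
```

Because every future domain has already been filtered against the past variables, any value left in the current variable's domain is consistent with the partial assignment, so no backward check is needed.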
Value interchangeability [Freuder, ‘91]
Equivalent values in the domain of a variable
[Figure: the example constraint graph over variables V1, V2, V3, V4 with domains {d}, {a, b, d}, {a, b, c}, and {c, d, e, f}]
• Full Interchangeability (FI):
– d, e, f interchangeable for V2 in any solution
• Neighborhood Interchangeability (NI):
– Efficiently approximates FI
– Finds e, f but misses d
– Discrimination tree DT(Vx)
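To illustrate NI in the binary case, the sketch below groups the values of a variable by the set of (neighbor, value) pairs they are consistent with. It uses hashing of these signatures rather than an explicit discrimination tree, which should yield the same partition. The variables A and B and the constraint are hypothetical; the slide's own example cannot be reproduced here because the constraints in its figure are not recoverable.

```python
from collections import defaultdict

def neighborhood_interchangeable_sets(var, domains, constraints):
    """Partition domains[var] into NI sets.

    constraints maps an ordered pair (x, y) to a set of allowed (vx, vy) pairs.
    Two values land in the same set when they are consistent with exactly the
    same (neighbor, value) pairs.
    """
    groups = defaultdict(set)
    for value in domains[var]:
        signature = set()
        for (x, y), allowed in constraints.items():
            if x == var:
                signature |= {(y, w) for (v, w) in allowed if v == value}
            elif y == var:
                signature |= {(x, v) for (v, w) in allowed if w == value}
        groups[frozenset(signature)].add(value)
    return sorted(groups.values(), key=min)

# Hypothetical instance: A has one neighbor B.
domains = {"A": {"c", "d", "e", "f"}, "B": {"a", "b"}}
constraints = {
    ("A", "B"): {("c", "a"), ("d", "a"), ("d", "b"),
                 ("e", "a"), ("e", "b"), ("f", "a"), ("f", "b")},
}
ni_sets = neighborhood_interchangeable_sets("A", domains, constraints)
```

In this instance d, e, and f are supported by the same values of B, so they form one NI set, while c stands alone.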
• Dynamic bundling [Our group, ‘01]
– Dynamically identifies NI
– Finds fatter solutions than BT & static bundling
– Never less efficient than BT & static bundling
• Solution-bundle size = product of the bundle sizes of the variables = 1 × 3 × 1 × 2 = 6
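The bundle-size arithmetic can be checked with a short sketch. The bundle below is hypothetical: only the sizes 1, 3, 1, 2 come from the slide, while the variable names and values are illustrative.

```python
from itertools import product
from math import prod

# Hypothetical solution bundle with per-variable bundle sizes 1, 3, 1, 2.
bundle = {"A": {"a"}, "B": {"d", "e", "f"}, "C": {"d"}, "D": {"b", "c"}}

# Size of the solution bundle: product of the sizes of the variable bundles.
size = prod(len(values) for values in bundle.values())

# Expanding the bundle gives the individual solutions it represents.
names = list(bundle)
expanded = [dict(zip(names, combo))
            for combo in product(*(sorted(bundle[v]) for v in names))]
```

A single bundle thus stands for `size` individual solutions without listing them.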
Phase transition [Cheeseman et al. ‘91]
• Significant increase of cost around critical value
• In CSPs, order parameter is constraint tightness & ratio
• Algorithms compared around phase transition
[Figure: cost of solving versus order parameter; the critical value of the order parameter separates mostly solvable problems from mostly un-solvable problems]
Non-binary CSPs
Constraint  C1          C2      C3          C4
Variable    V  V1 V2    V  V3   V2 V3 V4    V1 V4
            1  1  3     1  3    1  2  1     1  1
            1  3  3     2  3    1  2  2     2  2
            2  1  3     3  2    2  2  1     3  1
            2  3  3     4  2    2  2  2
            3  1  1     4  2    3  1  1
            3  2  2     6  1
            4  1  1
            4  2  2
            5  3  2
            6  3  2
[Figure: constraint network with constraints C1, C2, C3, C4 and variables V, V1, V2, V3, V4; the domain of V is {1, 2, 3, 4, 5, 6} and the domain of each of V1, V2, V3, V4 is {1, 2, 3}]
• Scope(Cx): the set of variables involved in Cx
• Arity(Cx): size of scope
Computing NI for non-binary CSPs is not a trivial extension from binary CSPs
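As a small illustration of scope and arity, the sketch below records the scopes of C1 to C4 as read from the table header on this slide. The helper names are assumptions made for illustration.

```python
# Scopes read off the slide's table header; the tuples themselves are omitted.
scopes = {
    "C1": ("V", "V1", "V2"),
    "C2": ("V", "V3"),
    "C3": ("V2", "V3", "V4"),
    "C4": ("V1", "V4"),
}

def arity(constraint):
    """Arity(Cx): the size of the scope of Cx."""
    return len(scopes[constraint])

def constraints_on(variable):
    """Names of the constraints whose scope includes the variable."""
    return sorted(name for name, scope in scopes.items() if variable in scope)

def neighbors(variable):
    """Variables that share at least one constraint with the given variable."""
    return sorted({v for scope in scopes.values() if variable in scope
                   for v in scope if v != variable})
```

Here V participates in one ternary and one binary constraint, so its neighborhood spans V1, V2, and V3.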
CSP parameters
• n: number of variables
• a: domain size
• t: constraint tightness, the ratio of the number of disallowed tuples over all possible tuples
• deg: degree of a variable
• c_k: number of constraints of arity k
• p_k = c_k / C(n, k): constraint ratio, where C(n, k) is the number of k-element subsets of the n variables
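The two ratios can be computed directly from their definitions. The sketch below assumes that the tuples listed for C1 in the deck are its allowed tuples and uses the five-variable example network; the function names are illustrative.

```python
from math import comb

def tightness(allowed_tuples, domain_sizes):
    """t: number of disallowed tuples over the number of possible tuples."""
    possible = 1
    for size in domain_sizes:
        possible *= size
    return (possible - len(set(allowed_tuples))) / possible

def constraint_ratio(c_k, n, k):
    """p_k: number of constraints of arity k over the C(n, k) possible scopes."""
    return c_k / comb(n, k)

# C1 over (V, V1, V2), assuming the listed tuples are the allowed ones.
C1 = [(1, 1, 3), (1, 3, 3), (2, 1, 3), (2, 3, 3), (3, 1, 1),
      (3, 2, 2), (4, 1, 1), (4, 2, 2), (5, 3, 2), (6, 3, 2)]
t_c1 = tightness(C1, domain_sizes=(6, 3, 3))   # 44 of 54 tuples disallowed
p_3 = constraint_ratio(c_k=2, n=5, k=3)        # two ternary constraints, five variables
```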
Outline
• Background
• Neighborhood Interchangeability (NI) for non-binary CSPs
– Non-binary discrimination tree (nb-DT)
• Empirical evaluations
• Database algorithms based on dynamic bundling
• Conclusions & future work
NI for non-binary CSPs
1. Building an nb-DT for each constraint
– Determines the NI sets of variable given constraint
2. Intersecting partitions from nb-DTs
– Yields NI sets of V (partition of D_V)
3. Processing paths in nb-DTs
– Gives, for free, updates necessary for forward checking
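Step 2 can be sketched as a pairwise intersection of the blocks of two partitions. The input partitions are the leaf labels shown for nb-DT(V, C1) and nb-DT(V, C2) in this slide's figure; the function name is an assumption.

```python
def intersect_partitions(p, q):
    """Coarsest partition that refines both p and q (non-empty pairwise intersections)."""
    blocks = [a & b for a in p for b in q if a & b]
    return sorted(blocks, key=min)

# Partitions of the domain of V induced by C1 and by C2, as labelled in the figure.
from_c1 = [{1, 2}, {5, 6}, {3, 4}]
from_c2 = [{1, 2}, {3, 4}, {6}, {5}]
ni_sets_of_v = intersect_partitions(from_c1, from_c2)
```

Values 5 and 6 are equivalent with respect to C1 but not C2, so they end up in separate NI sets of V.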
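[Figure: the constraint network over V, V1, V2, V3, V4 with constraints C1, C2, C3, C4, where the domain of V is {1, 2, 3, 4, 5, 6}, and two discrimination trees: nb-DT(V, C1) with leaves {1, 2}, {5, 6}, {3, 4}; nb-DT(V, C2) with leaves {1, 2}, {3, 4}, {6}, {5}]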
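Building an nb-DT: nb-DT(V, C1)
C1
V  V1 V2
1  1  3
1  3  3
2  1  3
2  3  3
3  1  1
3  2  2
4  1  1
4  2  2
5  3  2
6  3  2
Domain of V: 1, 2, 3, 4, 5, 6
[Figure: nb-DT(V, C1). Each path from the root is a sequence of tuples over V1 and V2, and its end node carries an annotation listing values of V. Path (<V1 1>, <V2 3>), (<V1 3>, <V2 3>) has annotation {1, 2}; path (<V1 1>, <V2 1>), (<V1 2>, <V2 2>) has annotation {3, 4}; path (<V1 3>, <V2 2>) has annotation {5, 6}]
O(deg · a^(k+1) · (1 − t))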
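The construction can be sketched as a trie keyed by the tuples that support each value of V. This is a simplified reading of the slide, using nested dictionaries and hypothetical function names; the nb-DT in the thesis may store additional annotations.

```python
C1 = [(1, 1, 3), (1, 3, 3), (2, 1, 3), (2, 3, 3), (3, 1, 1),
      (3, 2, 2), (4, 1, 1), (4, 2, 2), (5, 3, 2), (6, 3, 2)]

def build_nb_dt(domain, tuples, position=0):
    """Discrimination tree for the variable at `position` of the constraint's scope.

    Each edge is labelled by a tuple over the other variables of the scope;
    node["values"] holds the domain values whose path ends at that node.
    """
    root = {"children": {}, "values": set()}
    for value in sorted(domain):
        # Tuples over the other variables that support `value`, in canonical order.
        path = sorted(t[:position] + t[position + 1:]
                      for t in tuples if t[position] == value)
        node = root
        for step in path:
            node = node["children"].setdefault(step, {"children": {}, "values": set()})
        node["values"].add(value)
    return root

def ni_sets(node):
    """Collect the annotations of the tree: values sharing a node are NI."""
    found = [node["values"]] if node["values"] else []
    for child in node["children"].values():
        found.extend(ni_sets(child))
    return found

tree = build_nb_dt(domain={1, 2, 3, 4, 5, 6}, tuples=C1)
partition = sorted(ni_sets(tree), key=min)
```

Values 1 and 2 follow the same path, (1, 3) then (3, 3), so they share an annotation. A value with no supporting tuple would stay at the root.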
Bundling = Search + NI
• Benefits of bundling
1. Bundles solutions
2. Bundles no-goods
• Dynamic bundling (DynBndl)
– Re-computes NI during search
– Yields larger bundles, boosts effects of bundling
• Skeptics’ objection to DynBndl
– Costly & not worthwhile
• We show that the converse holds
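A rough sketch of the idea, restricted to binary constraints for brevity: at each level the values of the current variable are regrouped according to how they filter the current domains of the future variables, and search branches on groups instead of single values. All names and the toy instance are assumptions; this is not the DynBndl implementation from the thesis, which works on non-binary constraints through nb-DTs.

```python
from math import prod

def supports(var, value, other, domains, allowed):
    """Values of `other` compatible with var=value; unconstrained pairs allow everything."""
    if (var, other) in allowed:
        return frozenset(w for w in domains[other] if (value, w) in allowed[(var, other)])
    if (other, var) in allowed:
        return frozenset(w for w in domains[other] if (w, value) in allowed[(other, var)])
    return frozenset(domains[other])

def dyn_bundle_search(order, domains, allowed, bundle=None):
    """Yield solution bundles: dicts mapping each variable to a set of values."""
    bundle = bundle or {}
    if len(bundle) == len(order):
        yield dict(bundle)
        return
    current, future = order[len(bundle)], order[len(bundle) + 1:]
    # Re-compute the partition of the current domain against the current future domains.
    groups = {}
    for value in sorted(domains[current]):
        signature = tuple(supports(current, value, f, domains, allowed) for f in future)
        groups.setdefault(signature, set()).add(value)
    for signature, values in groups.items():
        if any(not s for s in signature):
            continue  # a future domain is wiped out: the whole group is a no-good bundle
        filtered = dict(zip(future, signature))  # forward checking, obtained for free
        bundle[current] = values
        yield from dyn_bundle_search(order, {**domains, **filtered}, allowed, bundle)
        del bundle[current]

# Hypothetical toy instance: X < Y and Y != Z, given as sets of allowed pairs.
domains = {"X": {1, 2, 3}, "Y": {1, 2, 3}, "Z": {1, 2}}
allowed = {("X", "Y"): {(1, 2), (1, 3), (2, 3)},
           ("Y", "Z"): {(1, 2), (2, 1), (3, 1), (3, 2)}}
bundles = list(dyn_bundle_search(["X", "Y", "Z"], domains, allowed))
total = sum(prod(len(v) for v in b.values()) for b in bundles)
```

In this toy instance the five solutions are returned as three bundles, and the group signature doubles as the forward-checking update for the future variables.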
[Figure: search tree over V, V1, V2, V3, V4 whose nodes are bundles of values, showing a solution bundle and a no-good bundle]
Advantages of DynBndl
• We exploit nb-DTs for forward checking
• DynBndl versus FC (BT + forward checking)
– Finding all solutions: theoretically best
– Finding first solution: empirical evidence
DynBndl yields multiple, robust solutions for less cost
Outline
• Background
• Neighborhood Interchangeability (NI) for non-binary CSPs
• Empirical evaluations
• Database algorithms based on dynamic bundling
• Conclusions & future work
Empirical evaluations
• DynBndl versus FC (BT + forward checking)
• Experiments
– Effect of varying tightness
– In the phase-transition region
  • Effect of varying domain size
  • Effect of varying constraint ratio (CR)
• Randomly generated problems, Model B
• ANOVA to statistically compare performance of DynBndl and FC with varying t
• t-distribution for confidence intervals
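For readers unfamiliar with Model B, the sketch below generates a random instance with a fixed number of constraints of arity k and a fixed number of disallowed tuples per constraint. Extending Model B to arity k in this way, and every name and parameter used, are assumptions made for illustration; the generator used for the experiments may differ.

```python
import random
from itertools import combinations, product

def model_b(n, a, k, c_k, t, seed=0):
    """Random CSP: exactly c_k constraints of arity k, each with tightness t.

    Returns a dict mapping each scope (a tuple of variable indices) to its
    list of allowed tuples. Domains are range(a) for every variable.
    """
    rng = random.Random(seed)
    # Choose exactly c_k distinct scopes among the C(n, k) possible ones.
    scopes = rng.sample(list(combinations(range(n), k)), c_k)
    all_tuples = list(product(range(a), repeat=k))
    n_disallowed = round(t * len(all_tuples))
    constraints = {}
    for scope in scopes:
        # Choose exactly n_disallowed tuples to forbid for this constraint.
        disallowed = set(rng.sample(all_tuples, n_disallowed))
        constraints[scope] = [tup for tup in all_tuples if tup not in disallowed]
    return constraints

instance = model_b(n=6, a=4, k=2, c_k=5, t=0.25)
```

Fixing the counts exactly, rather than drawing each constraint and tuple independently with a probability, keeps every instance at the requested tightness and constraint ratio.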
Experimental set-up
• Generated 16 data sets
– n = {20, 30} × a = {10, 15} × {CR1, CR2, CR3, CR4}
– 9 to 12 values for t in [25%, 75%]
– 1,000 instances per tightness value
• Performance measurements
– FBS, size of the first solution bundle
– NV, number of nodes visited in the search tree
– CC, number of constraints checked
– CPU time
Analysis: Varying tightness
• Low tightness
– Large FBS
  • 33 at t=0.35
  • 2254 (Dataset #13, t=0.35)
– Small additional cost
• Phase transition
– Multiple solutions present
– Maximum no-good bundling causes max savings in CPU time, NV, & CC
• High tightness
– Problems mostly unsolvable
– Overhead of bundling minimal
Analysis: Varying domain size
• Increasing a in phase-transition
– FBS increases: More chances for symmetry
– CPU time decreases: more bundling of no-goods
Increasing a (n=30)
CR     Improv (CPU) %     FBS
       a=10    a=15       a=10    a=15
CR1    33.3    34.3       5.5     11.9
CR2    28.6    33.0       5.0     5.5
CR3    29.8    31.7       3.6     5.0
CR4    28.4    31.6       1.2     1.4
Because the benefits of DynBndl increase with increasing domain size, DynBndl is particularly interesting for database applications where large domains are typical
Outline
• Background
• Neighborhood Interchangeability (NI) for non-binary CSPs
• Empirical evaluations
• Database algorithms based on