Constraint-satisfaction problems (and how to solve them)🔗ℹ

Matthew Butterick <mb@mbtype.com>

package: csp

This package is in development. I make no commitment to maintaining the public interface documented below.

Simple solvers for simple constraint-satisfaction problems. It uses the forward-checking + conflict-directed backjumping algorithm described in Hybrid Algorithms for the Constraint Satisfaction Problem by Patrick Prosser. Plus other improvements of my own devising.

1 Installation & usage🔗ℹ

At the command line:

raco pkg install csp

After that, you can update the package like so:

raco pkg update csp

Import into your program like so:

(require csp)

2 Introduction🔗ℹ

A constraint-satisfaction problem (often shortened to CSP) has two ingredients. The first is a set of variables, each associated with a set of possible values (called its domain). The other is a set of constraints — a fancy word for rules — that describe relationships among the variables.

When we select a value for each variable, we have what’s known as an assignment or a state. Solving a CSP means finding an assignment that satisfies all the constraints. A CSP may have any number of solution states (including zero).

Even if the name is new, the idea of a CSP is probably familiar. For instance, many brain teasers — like Sudoku or crosswords or logic puzzles — are really just constraint-satisfaction problems. (Indeed, you can use this package to ruin all of them.)

When the computer solves a CSP, it’s using an analogous process of deductive reasoning to eliminate impossible assignments, eventually converging on a solution (or determining that no solution exists).

3 So this is the ultimate tool for the lazy programmer?🔗ℹ

It allows us to describe a problem to the computer in higher-level terms than we usually do. That can be helpful when we have no idea how to create a specialized algorithm, or we just don’t feel like it.

But there’s still some finesse and artistry involved in setting up the CSP, especially its constraints. In general, a CSP with more constraints will converge on a solution faster. Furthermore, since we’re not just lazy but also impatient, we usually want our answer in a few seconds, not tomorrow or next week. So it’s usually worth spending a little extra effort to specify the constraints as carefully as we can, to maximize our chances of getting an answer in a reasonable time.

4 First example🔗ℹ

Suppose we wanted to find Pythagorean triples with sides between 1 and 29, inclusive.

First we create a new CSP called triples, using make-csp:

> (require csp)
> (define triples (make-csp))

We use CSP variables to represent the values in the triple. We insert each one with add-var!, where each variable has a symbol for its name and a list of values for its domain:

> (add-var! triples 'a (range 1 30))
> (add-var! triples 'b (range 1 30))
> (add-var! triples 'c (range 1 30))

Then we need our constraint. We make a function called valid-triple? that tests three values to see if they qualify as a Pythagorean triple. Then we insert this function as a constraint using add-constraint!, passing as arguments 1) the function we want to use for the constraint, and 2) a list of variable names that the constraint applies to.

> (define (valid-triple? x y z)
(= (expt z 2) (+ (expt x 2) (expt y 2))))
> (add-constraint! triples valid-triple? '(a b c))

Notice that the argument names used within the constraint function (x y z) have nothing to do with the CSP variable names that are passed to the function '(a b c). This makes sense — we might want constraints that apply the same function to different groups of CSP variables. What’s important is that the arity of the constraint function matches the number of variable names, and that the variable names are ordered correctly (the first variable will become the first argument to the constraint function, and so on).

Finally we call solve, which finds a solution (if it exists):

> (solve triples)
'((a . 12) (b . 9) (c . 15))

Perhaps we’re curious to see how many of these triples exist. We use solve* to find all 20 solutions:

> (solve* triples)
'(((a . 12) (b . 9) (c . 15))
  ((a . 12) (b . 5) (c . 13))
  ((a . 12) (b . 16) (c . 20))
  ((a . 10) (b . 24) (c . 26))
  ((a . 9) (b . 12) (c . 15))
  ((a . 24) (b . 10) (c . 26))
  ((a . 24) (b . 7) (c . 25))
  ((a . 8) (b . 6) (c . 10))
  ((a . 8) (b . 15) (c . 17))
  ((a . 7) (b . 24) (c . 25))
  ((a . 6) (b . 8) (c . 10))
  ((a . 21) (b . 20) (c . 29))
  ((a . 5) (b . 12) (c . 13))
  ((a . 20) (b . 21) (c . 29))
  ((a . 20) (b . 15) (c . 25))
  ((a . 4) (b . 3) (c . 5))
  ((a . 3) (b . 4) (c . 5))
  ((a . 15) (b . 8) (c . 17))
  ((a . 15) (b . 20) (c . 25))
  ((a . 16) (b . 12) (c . 20)))

“But some of those solutions are just multiples of others, like 3–4–5 and 6–8–10.” True. Suppose we want to ensure that the values in each solution have no common factors. We add a new coprime? constraint:

> (require math/number-theory)
> (add-constraint! triples coprime? '(a b c))

We solve* again to see the reduced set of 10 results:

> (solve* triples)
'(((a . 12) (b . 5) (c . 13))
  ((a . 24) (b . 7) (c . 25))
  ((a . 8) (b . 15) (c . 17))
  ((a . 7) (b . 24) (c . 25))
  ((a . 21) (b . 20) (c . 29))
  ((a . 5) (b . 12) (c . 13))
  ((a . 20) (b . 21) (c . 29))
  ((a . 4) (b . 3) (c . 5))
  ((a . 3) (b . 4) (c . 5))
  ((a . 15) (b . 8) (c . 17)))

“But really there’s only five unique solutions — the values for a and b are swapped in the other five.” Fair enough. We might say that this problem is symmetric relative to variables a and b, because they have the same domains and are constrained the same way. We can break the symmetry by adding a constraint that forces a to be less than or equal to b:

> (add-constraint! triples <= '(a b))
> (solve* triples)
'(((a . 8) (b . 15) (c . 17))
  ((a . 7) (b . 24) (c . 25))
  ((a . 5) (b . 12) (c . 13))
  ((a . 20) (b . 21) (c . 29))
  ((a . 3) (b . 4) (c . 5)))

Now our list of solutions doesn’t have any symmetric duplicates.

By the way, what if we had accidentally included c in the last constraint?

> (add-constraint! triples <= '(a b c))
> (solve* triples)
'(((a . 8) (b . 15) (c . 17))
  ((a . 7) (b . 24) (c . 25))
  ((a . 5) (b . 12) (c . 13))
  ((a . 20) (b . 21) (c . 29))
  ((a . 3) (b . 4) (c . 5)))

Nothing changes. Why not? Because of the existing valid-triple? constraint, c is necessarily going to be larger than a and b. So it always meets this constraint too. It’s good practice to not duplicate constraints between the same sets of variables — the “belt and suspenders” approach just adds work for no benefit.

We should use solve* with care. It can’t finish until the CSP solver examines every possible assignment of values in the problem, which can be a big number. Specifically, it’s the product of the domain sizes of each variable, which in this case is 29 × 29 × 29 = 24,389. This realm of possible assignments is also known as the CSP’s state space. We can also get this number from state-count:

> (state-count triples)
24389

It’s easy for a CSP to have a state count in the zillions. For this reason we can supply solve* with an optional argument that will only generate a certain number of solutions:

> (time (solve* triples))
cpu time: 2 real time: 2 gc time: 0
'(((a . 8) (b . 15) (c . 17))
  ((a . 7) (b . 24) (c . 25))
  ((a . 5) (b . 12) (c . 13))
  ((a . 20) (b . 21) (c . 29))
  ((a . 3) (b . 4) (c . 5)))
> (time (solve* triples 2))
cpu time: 0 real time: 0 gc time: 0
'(((a . 8) (b . 15) (c . 17)) ((a . 7) (b . 24) (c . 25)))

Here, the answers are the same. But the second call to solve* finishes sooner, because it quits as soon as it’s found two solutions.

Of course, even when we use ordinary solve, we don’t know how many assignments it will have to try before it finds a solution. If the problem is impossible, even solve will have to visit the entire state space before it knows for sure. For instance, let’s see what happens if we add a constraint that’s impossible to meet:

> (add-constraint! triples = '(a b c))
> (solve triples)
#f

Disappointing but accurate.

The whole example in one block:

(require csp)

(define triples (make-csp))

(add-var! triples 'a (range 1 30))
(add-var! triples 'b (range 1 30))
(add-var! triples 'c (range 1 30))

(define (valid-triple? x y z)
(= (expt z 2) (+ (expt x 2) (expt y 2))))
(add-constraint! triples valid-triple? '(a b c))

(require math/number-theory)
(add-constraint! triples coprime? '(a b c))

(add-constraint! triples <= '(a b))

(solve* triples)

5 Interlude🔗ℹ

“Dude, are you kidding me? I can write a much shorter loop to do the same thing—"

> (for*/list ([a (in-range 1 30)]
              [b (in-range 1 30)]
              #:when (<= a b)
              [c (in-range 1 30)]
              #:when (and (coprime? a b c) (valid-triple? a b c)))
             (map cons '(a b c) (list a b c)))
'(((a . 3) (b . 4) (c . 5))
  ((a . 5) (b . 12) (c . 13))
  ((a . 7) (b . 24) (c . 25))
  ((a . 8) (b . 15) (c . 17))
  ((a . 20) (b . 21) (c . 29)))

Yes, I agree that in this toy example, the CSP approach is overkill. The variables are few enough, the domains small enough, and the constraints simple enough, that a loop is more concise. Also, with only 24,389 possibilities in the state space, this sort of brute-force approach is cheap & cheerful.

6 Second example🔗ℹ

But what about a more complicated problem — like a Sudoku? A Sudoku has 81 squares, each of which can hold the digits 1 through 9. The goal in Sudoku is to fill the grid so that no row, no column, and no “box” (a 3 × 3 subgroup of cells) has a duplicate digit. About 25 of the squares are filled in at the start, so the size of the state space is therefore:

> (expt 9 (- 81 25))
273892744995340833777347939263771534786080723599733441

Well over a zillion, certainly. Let’s optimistically suppose that the 3.7GHz processor in your computer takes one cycle to check an assignment. There are 31,557,600 seconds in a year, so the brute-force method will only take this many years:

> (define states (expt 9 (- 81 25)))
> (define states-per-second (* 3.7 1000000000.0))
> (define seconds-per-year 31557600)
> (/ states states-per-second seconds-per-year)
2.3457127986588646e+36

"sudoku.rkt"
#lang racket
(require csp)

(define (make-base-sudoku)
  (define sudoku (make-csp))

  (define cells (range 81))
  (add-vars! sudoku cells (range 1 10))

  (for ([i 9])
    (define row-cells (filter (λ (cell) (= (quotient cell 9) i)) cells))
    (add-all-diff-constraint! sudoku row-cells)

    (define col-cells (filter (λ (cell) (= (remainder cell 9) i)) cells))
    (add-all-diff-constraint! sudoku col-cells))

  (define box-starts '(0 3 6 27 30 33 54 57 60))
  (define box-offsets '(0 1 2 9 10 11 18 19 20))
  (for ([start box-starts])
    (add-all-diff-constraint! sudoku (map (curry + start) box-offsets)))

  sudoku)

(define (make-sudoku-board . strs)
  (define sudoku (make-base-sudoku))
  (define vals (for*/list ([str (in-list strs)]
                           [c (in-string str)]
                           #:unless (memv c '(#\- #\|)))
                 (string->number (string c))))
  (for ([(val vidx) (in-indexed vals)]
        #:when val)
    (add-constraint! sudoku (curry = val) (list vidx)))
  sudoku)

(current-inference forward-check)
(current-select-variable mrv-degree-hybrid)
(current-order-values shuffle)
(current-node-consistency #t)
(current-arity-reduction #t)

(solve (make-sudoku-board
        "  8|   | 45"
        "   | 8 |9  "
        "  2|4  |   "
        "-----------"
        "5  |  1|76 "
        " 1 | 7 | 8 "
        " 79|5  |  1"
        "-----------"
        "   |  7|4  "
        "  7| 6 |   "
        "65 |   |3  "))

7 Another interlude🔗ℹ

“Dude, are you serious? The JMAXX Sudoku Solver runs three to four times faster—”

Yes, I agree that an algorithm custom-tailored to the problem will likely beat the CSP solver, which is necessarily general-purpose.

But let’s consider the labor involved. To write something like the JMAXX Sudoku Solver, we’d need a PhD in computer science, and the time to explain not just the rules of Sudoku to the computer, but the process for solving a Sudoku.

By contrast, when we use a CSP, all we need are the rules. The CSP solver does the rest. In this way, a CSP gives us an alternative, simpler way to explain Sudoku to the computer, just like regular expressions are an alternate way of expressing string patterns. And if the CSP solver is half a second slower, that seems like a reasonable tradeoff.

Daring minds might even consider a CSP solver to be a kind of domain-specific language.

8 Making & solving CSPs🔗ℹ

The variables in a CSP, and the possible values (aka the domains) of each, are usually determined by the problem itself. So when we create a CSP, there are really only two areas of artistry and finesse: the choice of constraints and the choice of solver (and related solver settings). It’s usually pretty easy to try different solvers & settings on a trial-and-error basis. So ultimately, most of the programming effort in CSPs comes down to designing constraints.

What makes a good list of constraints? In general, our goal is to use constraints to tell the solver things that are true about our CSP so that it can converge on a solution as fast as possible. Given that most CSP search spaces are as vast and barren as the Mojave, our constraints are often not just the difference between a fast solution and a slow one, but whether the solver can finish in a non-boring amount of time.

As it traverses the search space, the solver is constantly trying out partial assignments and learning about which avenues are likely to be fruitful. To that end, it wants to be able to avoid descending into parts of the search space that will be dead ends. For that reason, the most powerful tools for the CSP auteur are constraints relating two variables (aka two-arity constraints). Two-arity constraints can be checked early in the search process, and help the solver eliminate useless parts of the search space quickly. Two-arity constraints also work with inference functions like forward-check and ac-3. Once one of the variables is assigned, a two-arity constraint can be reduced to a one-arity constraint, which cooperates with node consistency (see current-node-consistency).

Say it with me again: two-arity constraints! OK, you got it.

Some other tips:

The golden rule is to use as many two-arity constraints as necessary. If you can express your CSP using nothing but two-arity constraints, so much the better.
Constraints with fewer variables are generally preferable to those with more variables. In programming we call this idea arity; in CSP solving it’s known as degree.
More constraints are better than fewer if the extra constraints use fewer variables. That is, lower-degree constraints are enough of a win that even if the lower-degree constraint overlaps with a higher-degree constraint, it’s still better to include it. Why? Because it lets the solver eliminate fruitless parts of the search tree by considering fewer variables.
Corollary: if a higher-degree constraint can be completely expressed in terms of lower-degree constraints, then do that, and get rid of the higher-degree constraint altogether. For an example, see add-pairwise-constraint!.
In cases where the constraints have the same degree, and they completely overlap in terms of what they prove, use the fewest possible consrtaints. For an example, see add-transitive-constraint!.

By the way, in terms of the program itself, it doesn’t matter what order you add the constraints. The CSP solver will visit them in whatever way it needs to.

procedure
(make-csp [vars constraints]) → csp?
vars : (listof var?) = null
constraints : (listof constraint?) = empty

Create a new CSP. Variables and constraints can be added to the CSP by passing them as arguments. Or you can create an empty CSP and then add variables and constraints imperatively (e.g., with add-var! or add-constraint!).

procedure
(add-var! prob name [domain]) → void?
  prob : csp?
  name : var-name?
  domain : (or/c (listof any/c) procedure?) = empty
procedure
(add-vars! prob names [domain]) → void?
  prob : csp?
  names : (listof var-name?)
  domain : (or/c (listof any/c) procedure?) = empty

Imperatively add a new variable called name to the CSP with permissible values listed in domain. The solution to a CSP is a list of pairs where each variable has been assigned a value from its domain.

add-vars! is the same, but adds multiple variables that have the same domain.

procedure
(add-constraint! prob func names [func-name]) → void?
  prob : csp?
  func : procedure?
  names : (listof var-name?)
  func-name : (or/c #false var-name?) = #f
procedure
(add-constraints! prob func namess [func-name]) → void?
  prob : csp?
  func : procedure?
  namess : (listof (listof var-name?))
  func-name : (or/c #false var-name?) = #f

Imperatively add a new constraint. The constraint applies the function func to the list of variable names given in names. The return value of func does not need to be a Boolean, but any return value other than #false is treated as if it were #true.

add-constraints! is the same, but adds the constraint func to each list of variable names in namess (which is therefore a list of lists of variable names).

procedure
(add-all-diff-constraint! prob
[ names]
#:same equal-proc) → void?
  prob : csp?
  names : (listof var-name?) = (map var-name (csp-vars prob))
  equal-proc : equal?

Imperatively add an “all diff” constraint, which is a pairwise (compose1 not equal?) constraint. A equality function other than equal? can be passed via the #:same argument. There is nothing special about using this function vs. applying the constraint manually.

procedure
(add-pairwise-constraint! prob
func
names
[ func-name]) → void?
  prob : csp?
  func : procedure?
  names : (listof var-name?)
  func-name : (or/c #false var-name?) = #f

Similar to add-constraint!, but it takes a two-arity procedure func and adds it as a constraint between each pair of names in names.

Why? CSPs are more efficient with lower-arity constraints (roughly, because you can rule out invalid values sooner). So usually, decomposing a larger-arity constraint into a group of smaller ones is a good idea.

For instance, suppose you have three variables, and you want them to end up holding values that are coprime. Your constraint function is coprime?. This function is variadic (meaning, it can take any number of arguments) so you could use add-constraint! like so:

(add-constraint! my-csp coprime? '(a b c))

But because the comparison can be done two at a time, we could write this instead:

(add-pairwise-constraint! my-csp coprime? '(a b c))

Which would be equivalent to:

(add-constraint! my-csp coprime? '(a b))
(add-constraint! my-csp coprime? '(b c))
(add-constraint! my-csp coprime? '(a c))

Still, add-pairwise-constraint! doesn’t substitute for thoughtful constraint design. For instance, suppose instead we want our variables to be strictly increasing. This time, our constraint function is <:

(add-constraint! my-csp < '(a b c))

And we could instead write:

(add-pairwise-constraint! my-csp < '(a b c))

Which would become:

(add-constraint! my-csp < '(a b))
(add-constraint! my-csp < '(b c))
(add-constraint! my-csp < '(a c))

This isn’t wrong, but if (< a b) and (< b c), then by transitivity, (< a c) is necessarily true. So pairwise expansion results in more constraints than we need, which in turn can make the search slower than it could be. In these situations, add-transitive-constraint! is the better choice.

procedure
(add-transitive-constraint! prob
func
names
[ func-name]) → void?
  prob : csp?
  func : procedure?
  names : (listof var-name?)
  func-name : (or/c #false var-name?) = #f

Similar to add-pairwise-constraint!, but adds the constraint between every sequential pair of names in names (not every possible pair).

For instance, consider this use of add-pairwise-constraint!:

(add-pairwise-constraint! my-csp < '(a b c d))

This applies the constraint between every possible pair, so the result is equivalent to:

(add-constraint! my-csp < '(a b))
(add-constraint! my-csp < '(a c))
(add-constraint! my-csp < '(a d))
(add-constraint! my-csp < '(b c))
(add-constraint! my-csp < '(b d))
(add-constraint! my-csp < '(c d))

This isn’t wrong, but as any seventh grader could tell you, it’s overkill. < is a transitive relation, therefore if it’s true that (< a b) and (< b c), it’s necessarily also true that (< a c). So there’s no need to apply a separate constraint for that.

This is the behavior we get from add-transitive-constraint!. For instance if we instead write this:

(add-transitive-constraint! my-csp < '(a b c d))

The constraint is applied between every sequential pair, so the result is equivalent to:

(add-constraint! my-csp < '(a b))
(add-constraint! my-csp < '(b c))
(add-constraint! my-csp < '(c d))

Same truth in half the constraints.

procedure
(make-var-names prefix vals [suffix]) → (listof symbol?)
  prefix : string?
  vals : (listof any/c)
  suffix : string? = ""

Helper function to generate mass quantities of variable names. The prefix and (optional) suffix strings are wrapped around each value in vals, and converted to a symbol.

> (make-var-names "foo" (range 6) "bar")
'(foo0bar foo1bar foo2bar foo3bar foo4bar foo5bar)
> (make-var-names "col" (range 10))
'(col0 col1 col2 col3 col4 col5 col6 col7 col8 col9)

procedure
(solve prob) → (or/c #false (listof (cons/c symbol? any/c)))
prob : csp?

Return a solution for the CSP, or #false if no solution exists.

procedure
(solve* prob [count]) → (listof (listof (cons/c symbol? any/c)))
prob : csp?
count : natural? = +inf.0

Return all the solutions for the CSP. If there are none, returns null. The optional count argument returns a certain number of solutions (or fewer, if not that many solutions exist)

syntax
(in-solutions prob count)

Iterator form for use with for loops that incrementally returns solutions to prob, up to a maximum of count.

9 Sideshows🔗ℹ

procedure
(state-count prob) → natural?
prob : csp?

Number of possible variable assignments for prob, otherwise known as the state space. This is the product of the domain sizes of each variable. So a CSP that assigns five variables, each of which can have the values "a-z", has a state count of (expt 5 26) = 1490116119384765625.

procedure
(csp->graph prob) → graph?
prob : csp?

Create an undirected graph (using Racket’s graph library) where each CSP variable is represented in the graph as a vertex, and each constraint between any pair of variables is represented as an edge.

procedure
(csp->graphviz prob) → string?
prob : csp?

Produce a Graphviz representation of the CSP that can be rendered into a beautiful diagram.

10 Parameters🔗ℹ

parameter
(current-select-variable) → (or/c #false procedure?)
(current-select-variable val) → void?
val : (or/c #false procedure?)
= #f

Next variable that the CSP solver will attempt to assign a value to. If #false, solver just picks the first unassigned variable.

parameter
(current-order-values) → (or/c #false procedure?)
(current-order-values val) → void?
val : (or/c #false procedure?)
= #f

Procedure that orders the remaining values in a domain. Default is #false, which means that the domain values are tried in their original order. If bad values are likely to be clustered together, it can be worth trying shuffle for this parameter, which randomizes which value gets chosen next. Shuffling is also helpful in CSPs where all the variable values must be different (because otherwise, the values for every variable are tried in the same order, which means that the search space is front-loaded with failure).

parameter
(current-inference) → (or/c #false procedure?)
(current-inference val) → void?
val : (or/c #false procedure?)
= forward-check

Current inference rule used by the solver. If #false, solver uses no-inference. Default is forward-check.

parameter
(current-solver) → procedure?
(current-solver val) → void?
val : procedure?
= backtracking-solver

Current solver algorithm used to solve the CSP. Default is backtracking-solver.

parameter
(current-decompose) → (or/c #false procedure?)
(current-decompose val) → void?
val : (or/c #false procedure?)
= #t

Whether the CSP will be decomposed into independent subproblems (if possible), because smaller CSPs are typically easier to solve than larger ones (and then the component solutions are reassembled into a larger solution).

parameter
(current-thread-count) → (or/c #false natural?)
(current-thread-count val) → void?
val : (or/c #false natural?)
= 4

Number of threads used by the min-conflicts-solver.

parameter
(current-node-consistency) → (or/c #false procedure?)
(current-node-consistency val) → void?
val : (or/c #false procedure?)
= #f

Whether node consistency is applied. Node consistency is helpful for certain CSPs, but not others, so it is #false by default.

Helpful for which CSPs? Node consistency means that for any one-arity (aka unary) constraints on a variable, we can filter out any domain values that don’t satisfy the constraint, thereby reducing the size of the search space. So if the CSP starts with unary constraints, and the constraints foreclose certain values, node consistency can be useful. The cost of node consistency is proportional to the number of values in the domain (because all of them have to be tested).

Node consistency tends to be especially helpful in CSPs where all the assignment values have to be different, and even more so where the variables all have the same domain (say, 100 variables, each with a value between 0 and 99 inclusive). In a case like this, any assignment to one variable means that value can no longer be used by any other variable. Node consistency will remove these values from the other variable domains, thereby pruning the search space aggressively.

parameter
(current-arity-reduction) → (or/c #false procedure?)
(current-arity-reduction val) → void?
val : (or/c #false procedure?)
= #t

Whether constraints are reduced in arity where possible. This usually helps, so the default is #true.

Why does it help? Because lower-arity constraints tend to be faster to test, and the solver can use node consistency on one-arity constraints (see current-node-consistency).

For instance, suppose we have variables representing positive integers a and b and the constraint says (< a b). Further suppose that b is assigned value 5. At that point, this constraint can be expressed instead as the one-arity function (< a 5). This implies that there are only four possible values for a (namely, '(1 2 3 4))). If node consistency is active, the domain of a can immediately be checked to see if it includes any of those values. But none of this is possible if we don’t reduce the arity.

11 Solvers🔗ℹ

Pass these functions to current-solver.

procedure
(backtracking-solver prob) → generator?
prob : csp?

The default solver. Conducts an exhaustive, deterministic search of the state space. Backtracking means that when the solver reaches a dead end in the search space, it unwinds to the last successful variable assignment and tries again. The details of its behavior are modified by current-select-variable, current-inference, and current-node-consistency.

The advantage of the backtracking solver: it proceeds through the search space in a systematic manner. If there is a solution, the backtracking solver will find it. Eventually.

The disadvantage: the same. Some search spaces are so huge, and the solutions so rare, that concentrating the effort on searching any particular branch is likely to be futile. For a more probabilistic approach, try min-conflicts-solver.

procedure
(min-conflicts-solver prob [max-steps]) → generator?
prob : csp?
max-steps : exact-positive-integer? = 100

An alternative solver. Begins with a random assignment and then tries to minimize the number of conflicts (that is, constraint violations), up to max-steps (which defaults to 100). In essence, this is a probabilistic hill-climbing algorithm, where the solver makes random guesses and then tries to nudge those guesses toward the correct answer.

I like to imagine the solver flying above the search space with a planeload of paratroopers, who are dropped into the search territory. Each of them tries to walk from the place they land (= the initial random assignment) toward a solution.

It’s a little weird that this works at all, but it does. Sometimes even better than the backtracking-solver, because the minimum-conflicts solver is “sampling” the search space at many diverse locations. Whereas the backtracking-solver can get stuck in a fruitless area of the search space, the minimum-conflicts solver keeps moving around.

Of course, to avoid getting stuck, the minimum-conflicts solver has to abandon guesses that aren’t panning out. Hence the max-steps argument, which controls the number of steps the solver takes on a certain attempt before giving up.

The other parameter that affects this solver is current-thread-count, which defaults to 4. The solver is multithreaded in the sense that it pursues multiple solutions simultaneously. This way, if one thread finds a solution earlier, it will not be blocked by the others.

12 Selecting the next variable🔗ℹ

Pass these functions to current-select-variable.

procedure
(mrv-degree-hybrid prob) → (or/c #false var?)
prob : csp?

Selects next variable for assignment by choosing the one with the fewest values in its domain (aka minimum remaining values or mrv; see also minimum-remaining-values) and largest number of constraints (aka degree; see also max-degree). The idea is that this variable is likely to fail more quickly than others, so we’d rather trigger that failure as soon as we can (in which case we know we need to explore a different part of the state space).

procedure
(minimum-remaining-values prob) → (or/c #false var?)
prob : csp?

Selects next variable for assignment by choosing the one with the fewest values in its domain.

procedure
(max-degree prob) → (or/c #false var?)
prob : csp?

Selects next variable for assignment by choosing the one with the largest number of constraints.

13 Inference🔗ℹ

Pass these functions to current-inference.

procedure
(forward-check prob name) → csp?
prob : csp?
name : var-name?

Used for inference when current-inference is not otherwise set. Forward checking determines whether the assignment to name necessarily causes another variable domain to become empty. How? It examines the remaining two-arity constraints that link variable name to an unassigned variable. For each of these constraints, it plugs in the new value for name and checks that the other variable still has values in its domain that can meet the constraint. If not, the assignment to name must fail. Forward checking can discover failures faster than backtracking alone.

procedure
(ac-3 prob name) → csp?
prob : csp?
name : var-name?

Applies the AC-3 arc-consistency algorithm. Similar to forward checking, but checks farther ahead. For that reason, it will usually take longer. (It is not necessarily better, however.)

Specifically: following a new variable assignment, AC-3 examines all constraints that link exactly two unassigned variables. It checks that each variable has at least one value in its domain that can be paired with the other to satisfy the constraint (this pair comprises the eponymous arc). If no such pair exists, then the constraint can never be satisfied, so the new variable assignment must fail.

“So AC-3 is a superset of forward-check?” Yes. Both techniques examine two-arity constraints after variable name has been assigned a value. Forward checking, however, only examines two-arity functions that include variable name in the constraint. Whereas AC-3 checks all two-arity functions (even those that don’t include name).

In this way, AC-3 can detect inconsistencies that forward checking would miss. For instance, consider a CSP with three variables a b and c, and three constraints ab, ac, and ab. We assign a value to a. Forward checking would then check constraints ab and ac, perhaps removing values from the domains of b and c to be consistent with the new value of a. These domain reductions, however, might be inconsistent with constraint bc. Forward checking won’t notice this, because it never tests bc. But AC-3 does test bc, so it would notice the inconsistency.

The problem with AC-3 is that it’s necessarily recursive: each time it eliminates a domain value from a certain variable, it has to recheck all the two-arity constraints (because any of them might have been made inconsistent by the removal of this value). AC-3 only stops when it can no longer remove any value from any domain. So yes, compared to simple forward checking, it does more. But it also potentially costs a lot more, especially if the variables have large domains.

procedure
(no-inference prob name) → csp?
prob : csp?
name : var-name?

Truth in advertising: performs no inference.

14 Structure types & predicates🔗ℹ

struct
(struct csp (vars constraints)
    #:extra-constructor-name make-csp
    #:transparent)
  vars : (listof var?)
  constraints : (listof constraint?)

Represents a CSP.

struct
(struct var (name domain)
    #:extra-constructor-name make-var
    #:transparent)
  name : var-name?
  domain : (listof any/c)

Represents a variable in a CSP.

struct
(struct constraint (names proc)
    #:extra-constructor-name make-constraint
    #:transparent)
  names : (listof var-name?)
  proc : procedure?

Represents a constraint in a CSP.

procedure
(var-name? x) → boolean?
x : any/c

Check whether x is a valid CSP variable name, which today can mean any value, but I might change my mind.

15 License & source code🔗ℹ

This module is licensed under the MIT license.

Source repository at http://git.matthewbutterick.com/mbutterick/typesetting. Suggestions & corrections welcome.

1	Installation & usage
2	Introduction
3	So this is the ultimate tool for the lazy programmer?
4	First example
5	Interlude
6	Second example
7	Another interlude
8	Making & solving CSPs
9	Sideshows
10	Parameters
11	Solvers
12	Selecting the next variable
13	Inference
14	Structure types & predicates
15	License & source code