At Logic class last week we saw how to solve a Sudoku using SAT and for fun I decided to actually try this out using Python. It turned out to be pretty trivial to implement and I thought I’d share the experience.

First of all let’s see how the Sudoku problem was described at class: we have a table with 9 rows and 9 columns;

  • 1. Each field [i, j] (where i=1..9 and j=1..9) has at least one value (between 1 and 9).
  • 2. Each field [i, j] (where i=1..9 and j=1..9) doesn’t have more than one value.
  • 3. There isn’t any repeated value in any row, column or 3×3 group.
  • 4. Some of the fields have a predefined value.

Now to implement this in code, first of all I needed a Python module implementing SAT solving. A quick search in Debian’s repositories gave me python-logilab-constraint, which I’ve found to be quite nice to use, even though it could definitely take some speed improvements.

Conditions 1 and 2 aren’t a problem at all, as logilab.constraint can be used quite naturally [0]. We just define a variable for each field (eg., x11 to x99, where the first number is the row and the second number is the column) and the domain in which they operate (integer value from 1 to 9):

values = range(1, 10) # [1..9]
variables = ["x%d%d" % (i, j) for j in values for i in values]
domains = {}

for variable in variables:
	domains[variable] = fd.FiniteDomain(values)

The 4th rule is also straightforward, we just need to hardcode the values. If we have a bidimensional list sudoku containing the initial numbers and None in all empty fields, we add each of them as a constraint:

constraints = []
for i, row in enumerate(sudoku):
	for j, field in enumerate(row):
		if field is None:
			continue
		variable = "x%d%d" % (i+1, j+1)
		constraints.append(fd.make_expression((variable,),
			"%s == %d" % (variable, field)))

Now only rule 3 remains; here we basically have to set up three more groups of constraints: one for rows, one for columns and one for the 3×3 groups. My initial implementation checked each row/column/group at once; for example, for the first row «x11 != x12 != x13 != … != x19», for the first column «x11 != x21 != … != x91», etc. However, this proved to be extremely slow, and after checking the «Performance considerations» section of Logilab Constraint’s documentation I split up the row and column conditions [1] to lots of smaller conditions, as in: «x11 != x12», «x11 != x13», «x11 != x14», etc. I also moved the constraints for the initial numbers to the top (I had them at the end of the constraints list before), as they are the simplest ones. With those changes resolution time changed from several minutes to some tenths of a second.

And this is it. After all constraints have been added, we just need to run the solver:

repository = Repository(variables, domains, constraints)
solutions = Solver().solve(repository)

The complete code is available via Bazaar at lp:~rainct/+junk/sudoku-sat. Being completely new to the logilab.constraints module, or implementing any such stuff at all, it took me around half an hour to write this, which shows how SAT makes such sort of problems really straightforward.


[0] Using logilab.constraint it’s possible to assign arbitrary Python data to variables (here we just give each an integer, but variables could also take tuples or whatever else). When this problem was presented at class using pure propositional logic it was a bit more cumbersome, as we couldn’t just say “there’s a variable x11 with domain [1..9]”. For instance, rule 1 was «(p111 | p112 | p113 | … | p119) and (p121 | … | p129) …», where “p111″ would be True if field [1,1] is supposed to contain a one, “p112″ is True if it’s supposed to contain a two, etc.
[1] I didn’t bother also splitting up he 3×3 group constraints since the other two changes already gave me enough of a speedup; changing that may squeeze a few msecs more out of it.
P.S.: If you’d like a more formal explanation of this, a search on Google found this paper: A SAT-based Sudoku Solver.