Hi all
I couldn't help but notice how many members of my family were puzzling over su doku challenges in the Newspapers around Christmas time (is Christmas really that boring that people resort to newspaper puzzles for entertainment?).
Anyway, I didn't have time to live up to my fathers challenge of writing a program to solve the puzzles, but I did ponder the possible design of such a program.
For those of you who don't know what Su Doku is, take a look here:
http://act365.com/sudoku/ . The general rule is: You are given a grid of 9 rows and 9 columns. The grid is divided into nine segments (each consisting of 3*3 cells). You fill in each cell with an integer (1-9 inclusive), and with each box, row and column there must be only be one occurrence of any integer. Some integers are already placed inside the box.
This means that you can instantly look at any su doku puzzle, and specify all the possible integers that could go in each cell (that is because if the is, for example, a '5' in one row, you know that no other cell in that row can contain a '5'). I found you can normally use this method to reduce each cell to having only around 5 possible values. Of course, as soon as you fill in any cell with a definite value you can then continue to do another process of elimination on the whole grid.
The problem is, elimination alone is not enough to solve the puzzle. You need to start specifying unknown values, and checking them to see if they work.
Here's an interesting link with an algorithm which does this very effectively:
http://act365.com/sudoku/faq.htm#FAQ2 (you can get the source code for the solver here:
http://sf.net/projects/sudoku).
If you're put in a maze, a method to mathematically garuntee you will reach the centre is as follows:
You walk into the maze until you reach the first junction. You tie a piece of string onto the wall of this junction, and draw an arrow pointing to the route you chose.
Walking down that route, you draw an arrow for every other corner you take. Each time you reach a dead end, you go back to the last junction you came across, and go down another route (which you have not yet taken), until you reach the centre (the point of the string is just to help you retrace your steps).
The same concept can be applied to su doku.
When you have a clean grid, with only the predefined integers in it, you place 1 random number (in the top left box, say). You record this number. This is where your string starts. You then place another number, and record that number, and so on until you find the puzzle is impossible to solve. You then change the value of the last number placed. Once all the different values of that number have been tried, go back one step. Change this number, and mark of it's first value (perhaps in an array of "used values"). Keep entering more numbers, and stepping back etc until you have a possible solution.
To enhance the performance of this method, you can only put down values that you know *can* work (by using the elimination process described above). That'll significantly increase the speed at which a puzzle is solved.
This is again enhanced by choosing to take the path with the fewest possible "candidate" values (again, by process of elimination).
Now, this is where I start disagreeing with the link I pasted. Until now, I've just been explaining what's said in the link in my own words. Here a beg to differ.
Apparent the "smallest candidate" algorithm can be improved upon by picking a random cell instead of the first one when 2 or more cells have the same number of possible candidates. *I* think this should make no difference at all, but possibly make it run even *slower*.
Reason for making no difference:
The rules of probability state that if I toss a coin 50 times, and each time it is heads, the 51st time still has exactly 50% chance of being tails, even though the first 50 times resulted in heads (that is, assuming the coin is not weighted in any way). Same thing here (any comments on this?)
Reasons for making things slower:
Memory management. If you select the first cell each time, you don't need to record the cells already tried. Instead, you just need to record the current cell, and the ++ it when you come to retrying that step. That's obviously alot less heavy on the memory than stating all the cells which have been tried. On modern computers, it's not really going to make much difference, but still...
I can't think of any other things to improve the performance of a solver program, but I think the key lies in making efficient use of the memory.
If (in a maze) you take a piece of string, and tie it at the start and walk through the maze taking the first *left* corner in *every* turn, you no longer need to mark which root you took when you come back, since you know the one you should take next time is right of your current route (unless it has your thread in it, in which case you should retrace your steps and follow the thread to the last corner). This means less to record, so less time & space is used up writing to memory.
Can anyone else think of any other memory hacks, or hacks in general to speed this up?
It's a pity that the program I linked to is in java, since I only know C++ and PHP well. Had it been in one of those languages I could've run a time trail to see which algorithm (and/or programming) is best :-D
Cheers
--
Please contact me at:
Colin (dot) Horne (at) gmail (dot) com