## Notes on Lookup – Another Sieve for the Collatz Problem

This post is a follow-up on an earlier post in which I introduced the Collatz Problem and designed a sieve that systematically builds solutions and is very efficient in the work it does.  In this post, I’ll give a version of a sieve that is more straightforward, though perhaps not as efficient.  First, let’s restate the Collatz Problem:

Each counting number n past 1 is assigned a successor number, as follows:
$successor[n] = \begin{cases}3n+1 & \text{if n is odd} \\ n/2 & \text{if n is even} \end{cases}$
The number “1″ is considered home, and when you’re home, you stop.  If you start at a given number away from home, and cycle through its successors, you may end up home.  Is there any starting number from which you will not eventually reach home?

When playing with a starting value of 6, we found that the travel path took us to 3, to 10, to 5, to 16, to 8, to 4, to 2, and finally to 1.  We observed that in finalizing this path from 6 to 1, we have also solved the path from 3 to 1, and the path from 10 to 1 and so on.  The sieve we introduce in this post will take full advantage of this observation.  Unlike the sieve in the prior post, this sieve will work essentially forward.

First, we decide how big the sieve is going to be.  For our examle, we will make the sieve size 16.  Each cell in the sieve will have two numbers and a marker.  The marker can have the following values “not visited”, which we will represent by a blank; “tentative”, represented by a “-“; “succeed”, represented by a “+”, and “fail”, represented by a “≠”.  A typical transition will be from blank to tentative to succeed (“-” to “+”); sometimes it will be from tentative to fail (“-” to “≠”).  Below are a series of snapshots:

The first snapshot is how we start out: all cells are marked “unvisited” (blank) except for the first cell, which is completed as “succeed”.  In each step, we look for an unvisited cell and then do a sweep.  The second snapshot shows what happens when we visit cell 2.    When we visit cell 2 we mark it as “tentative”, and then do a sweep by marking its successors, in turn, as tentative as well.  As we do this, we keep a running count of how many cells we mark tentative this way.  The “tentative” sweep ends in one of two ways.  One, we encounter a cell already marked “succeed”.  If so, we do a final sweep, starting at the same cell (here marked in red).  In the final succeed sweep, we mark all the cells “succeed” and fill in the time slot.  We have the information we need for filling in the time slot: we add our count of tentative markings to  the time value we found in the “succeed” cell we ran into.  For each successor cell we decrease the time value, till we bump into the cell already marked as succeed.  The third snapshot shows the result of the final succeed sweep for 2.

We now visit the first unvisited cell, which is cell 3, and mark it “tentative”.  Similarly, we mark its successors as tentative, marking 10, 5, 16, 8 and 4 this way.  Our running count indicates we have marked 6 cells as tentative.  The next successor cell, 2, is already marked “succeed”.  So we can now do a final succeed sweep, starting at 3.  The time slot for 3 will be filled in as 8, the sum of our running count of 6 and the value we found in the “succeed” cell of 2.

The fourth snapshot shows the result of the final succeed sweep for 3.  This sweep visits the same values, and in the same order, as the tentative sweep, always going from a value to its successor.  We can stop this sweep when it arrives at a cell already marked as “succeed”.

After visiting 6, both the tentative sweep and the final sweep, the result looks like the first snapshot above.  Something new happens when we visit 7, mark it tentative, and mark its successors.  Here, we encounter the case where the successor does not lie inside of the sieve that we constructed.  The successor of 7, 22, is outside the sieve’s range of 1 through 16.  In this case, we do a final “fail” sweep, starting at 7.  A “fail” sweep works just like a “succeed” sweep, except we mark the cell as “fail” and don’t bother with the time slot.  The last snapshot above shows the result of this.

The snapshot below shows the result of continuing this process till all cells in the sieve have been visited and swept:

Note, first of all, that marking a cell as “fail” doesn’t mean we’ve found a counter example to the Collatz question.  Instead, it merely indicates our sieve wasn’t big enough to settle the issue.  Note, second, that we paid for the relative straightforwardness of this sieve’s forward actions by needing to do a final sweep to ratify and record the result of the tentative sweep.  This is true both in case the tentative sweep succeeds and when it fails.  Note, third, that compared to the predecessor sieve we showed in the earlier post, this sieve is easier to extend.  If after completing the size 16 sieve I decide I want to complete a size 1000 sieve, I could start with the completed size 16 sieve and simple erase all the fail markers: all the succeed markers can be left in place.

Isn’t it interesting that two such very different sieve processes both serve to construct travel paths home for a range of starting points?  In the process, they have revealed a number of key characteristics of sieve processes.