# Ed's Big Plans

## C & Math: Sieve of Eratosthenes with Wheel Factorization

In the first assignment of Computer Security, we were to implement the Sieve of Eratosthenes. The instructor gives a failing grade of 6/13 for a naive implementation, and marks increase with the efficiency of the sieve. There are three standard optimizations: (1) for the current prime being considered, start crossing off at the square of that prime; (2) consider only odd numbers; (3) stop crossing off once the current prime exceeds the square root of the sieve's last value.

Since the assignment has been handed in, I’ve decided to post my solution here as I haven’t seen C code on the web which implements wheel factorization.

We can think of wheel factorization as an extension to skipping all even numbers. Since we know that all even numbers are multiples of two, we can just skip them all and save half the work. By the same token, if we know a pattern of repeating multiples corresponding to the first few primes, then we can skip all of those guaranteed multiples and save some work.

The wheel I implement skips all multiples of 2, 3 and 5. In Melissa O'Neill's The Genuine Sieve of Eratosthenes, an implementation of the sieve with a priority queue optimization is shown in Haskell, while wheel factorization with the primes 2, 3, 5 and 7 is discussed. The implementation of that wheel (and other wheels) is left as an exercise for her readers 😛

But first, let’s take a look at the savings of implementing this wheel. Consider the block of numbers in modulo thirty below corresponding to the wheel for primes 2, 3 and 5 …

 0 **1** 2 3 4 5 6 **7** 8 9 10 **11** 12 **13** 14 15 16 **17** 18 **19** 20 21 22 **23** 24 25 26 27 28 **29**

Only the highlighted numbers need to be checked to be crossed out during sieving since the remaining values are guaranteed to be multiples of 2, 3 or 5. This pattern repeats every thirty numbers which is why I say that it is in modulo thirty. We hence skip 22/30 of all cells by using the wheel of thirty — a savings of 73%. If we implemented the wheel O’Neill mentioned, we would skip 77% of cells using a wheel of 210 (for primes 2, 3, 5 and 7).

(Note that the highlighted numbers in the above block also correspond to the multiplicative identity one and numbers which are coprime to 30.)

Below is the final code that I used.

```c
#include <stdlib.h>
#include <stdio.h>
#include <math.h>

const unsigned int SIEVE = 15319000;
const unsigned int PRIME = 990000;

int main(void) {
    unsigned char* sieve = calloc(SIEVE + 30, 1); // +30 gives us incr padding
    unsigned int thisprime = 7;
    unsigned int iprime = 4;

    unsigned int sieveroot = (int)sqrt(SIEVE) + 1;

    // Don't need to zero the sieve -- using calloc() not malloc().

    sieve[7] = 1;

    for(; iprime < PRIME; iprime++) {
        // ENHANCEMENT 3: only cross off until square root of |sieve|.
        if(thisprime < sieveroot) {
            // ENHANCEMENT 1: increment by 30 -- 4/15 the work.
            // ENHANCEMENT 2: start crossing off at prime * prime.
            int i = (thisprime * thisprime);
            // Deliberate fall-through below: walks i up the wheel to the
            // next multiple of 30 so the loop after can step by 30.
            switch (i % 30) { // new squared prime -- get equivalence class.
            case 1:
                if(!sieve[i] && !(i % thisprime)) {sieve[i] = 1;}
                i += 6;
            case 7:
                if(!sieve[i] && !(i % thisprime)) {sieve[i] = 1;}
                i += 4;
            case 11:
                if(!sieve[i] && !(i % thisprime)) {sieve[i] = 1;}
                i += 2;
            case 13:
                if(!sieve[i] && !(i % thisprime)) {sieve[i] = 1;}
                i += 4;
            case 17:
                if(!sieve[i] && !(i % thisprime)) {sieve[i] = 1;}
                i += 2;
            case 19:
                if(!sieve[i] && !(i % thisprime)) {sieve[i] = 1;}
                i += 4;
            case 23:
                if(!sieve[i] && !(i % thisprime)) {sieve[i] = 1;}
                i += 6;
            case 29:
                if(!sieve[i] && !(i % thisprime)) {sieve[i] = 1;}
                i += 1; // 29 + 1 (mod 30) = 0 -- just in step
            }
            for(; i < SIEVE; i += 30) {
                if(!sieve[i+1] && !((i+1) % thisprime)) sieve[i+1] = 1;
                if(!sieve[i+7] && !((i+7) % thisprime)) sieve[i+7] = 1;
                if(!sieve[i+11] && !((i+11) % thisprime)) sieve[i+11] = 1;
                if(!sieve[i+13] && !((i+13) % thisprime)) sieve[i+13] = 1;
                if(!sieve[i+17] && !((i+17) % thisprime)) sieve[i+17] = 1;
                if(!sieve[i+19] && !((i+19) % thisprime)) sieve[i+19] = 1;
                if(!sieve[i+23] && !((i+23) % thisprime)) sieve[i+23] = 1;
                if(!sieve[i+29] && !((i+29) % thisprime)) sieve[i+29] = 1;
            }
        }

        {
            int i = thisprime;
            // Same fall-through trick: write down the next prime in 'thisprime'.
            switch (i % 30) {
            case 1:
                if(!sieve[i]) {thisprime = i; sieve[i] = 1; goto done;}
                i += 6;
            case 7:
                if(!sieve[i]) {thisprime = i; sieve[i] = 1; goto done;}
                i += 4;
            case 11:
                if(!sieve[i]) {thisprime = i; sieve[i] = 1; goto done;}
                i += 2;
            case 13:
                if(!sieve[i]) {thisprime = i; sieve[i] = 1; goto done;}
                i += 4;
            case 17:
                if(!sieve[i]) {thisprime = i; sieve[i] = 1; goto done;}
                i += 2;
            case 19:
                if(!sieve[i]) {thisprime = i; sieve[i] = 1; goto done;}
                i += 4;
            case 23:
                if(!sieve[i]) {thisprime = i; sieve[i] = 1; goto done;}
                i += 6;
            case 29:
                if(!sieve[i]) {thisprime = i; sieve[i] = 1; goto done;}
                i += 1;
            }
            for(; i < SIEVE; i += 30) {
                if(!sieve[i+1]) {thisprime = i+1; sieve[i+1] = 1; goto done;}
                if(!sieve[i+7]) {thisprime = i+7; sieve[i+7] = 1; goto done;}
                if(!sieve[i+11]) {thisprime = i+11; sieve[i+11] = 1; goto done;}
                if(!sieve[i+13]) {thisprime = i+13; sieve[i+13] = 1; goto done;}
                if(!sieve[i+17]) {thisprime = i+17; sieve[i+17] = 1; goto done;}
                if(!sieve[i+19]) {thisprime = i+19; sieve[i+19] = 1; goto done;}
                if(!sieve[i+23]) {thisprime = i+23; sieve[i+23] = 1; goto done;}
                if(!sieve[i+29]) {thisprime = i+29; sieve[i+29] = 1; goto done;}
            }
        done:;
        }
    }
    printf("%u\n", thisprime); // %u: thisprime is unsigned
    free(sieve);
    return 0;
}
```

Notice that there is a switch construct — this is necessary because we aren't guaranteed that the first value to sieve for a new prime (or squared prime) is an exact multiple of thirty. Consider sieving seven — the very first prime to consider. We start by considering 7² = 49. Notice 49 ≡ 19 (mod 30). The switch statement incrementally moves the cursor from the 19th equivalence class to the 23rd, then to the 29th, before pushing it one integer more to 30 — and 30 ≡ 0 (mod 30) — so we are able to continue incrementing by thirty from then on in the loop.

The code listed is rigged to find the 990 000th prime as per the assignment and uses a sieve of predetermined size. Note that if you want to use my sieve code above to find whichever prime you like, you must also change the size of the sieve. If you take a look at *How many primes are there?* by Chris K. Caldwell, you'll notice a few equations that allow you to overestimate the nth prime of your choosing, thereby letting you calculate that prime with the overshot sieve size.

Note also that this sieve is not the most efficient. A classmate of mine implemented the Sieve of Atkin, which is orders of magnitude faster than this implementation.

Eddie Ma

February 3rd, 2011 at 11:08 pm

## fridgelib — Andre's C Library

Brief: Fridge Library (fridgelib) is a lightweight C library that's basically what every C programmer eventually comes away with. Fridgelib contains queue, stack and trie data type implementations — these implementations are cleaner than mine, hence I have devoured them into my code where necessary.

There is however one item that Andre's told me I can add to fridgelib if I can ever tidy it up, and that's my evil linked-array. It's basically a doubly linked list whose elements are arrays of fixed size; traversal is cut down by a factor of that array size, while indexing within a node is still constant time given the modulus of the index. My evil implementation is a double-headed stack/queue/array which supported Python-like negative-value indexing, slicing and iteration… actually, that's when I decided it had grown too grotesque and left it to sit in my repository…

I’ll probably return to it later, to remove slicing and to more cleanly define the semantics of iteration.

Hmm– the only things that are really missing from fridgelib are the hashtable and either a nice red-black tree or a treap…

Eddie Ma

November 23rd, 2009 at 1:14 pm

## NNcmk: A Neural Network (Win32 & OSX)

Okay — I managed to finish that 3-layer neural network implementation the other day — actually, it was a while ago, but I didn't post about it because I've been busy. It's a pretty standard network, but I'm proud to say it's small and works on both OSX and Win32. I still have to put in a few #define directives to have it work on Linux as well.

I will have to document it too when I get a chance. The reason why I made a brand new executable (instead of using the source from my previous projects) is because I needed something that would take in launch-time parameters so that it didn’t need to be recompiled each time someone decides to use the binary on a new dataset with a different number of inputs. Right now, the thing has barely any solid parameters that can’t be touched at launch-time.

The NNcmk (Neural Network – Cameron, Ma, Kremer) package is C compilable, uses the previously developed in-house library for the NGN and will be available shortly after I’m satisfied that I’ve squashed all the bugs, fixed the output and have documented the thing completely. I think Chris has difficulty with it right now mostly because I didn’t specify exactly what parameters do what– I did at least provide a (DOS) batch file with an example run-in-train-mode / run-in-test-mode sequence…

Back to work on that paper right now though…

Eddie Ma

July 7th, 2009 at 12:09 pm

Posted in Machine Learning

The NGN is being worked on again. In implementing the Bagging/Balancing/Boosting/Verifying activity of the software, I’ve decided to break everything apart into more smaller modules — that gives me a better picture of progress while allowing me to keep motivated on smaller tasks.

Realistically, there are only three functions that are ever dynamically generated by the BNF generator; these all perform some logic related to the weight layers of the neural network– I think it is feasible now to completely close off and encapsulate the dynamically generated content in its own source files. Linking it in is probably philosophically more correct and also implementationally cleaner than trying to have the BNF generator pump out all of the template items that do not change between grammars.

In retrospect, I might drop the Verifying activity soon depending on how similar it is to Balancing — if it's similar enough and can be expressed with many of the same inner functions, then I'll keep it. Verification will probably be so weak that it could instead be supplanted by an ancillary test run after training, prior to the actual test run. I'm simply concerned that in the vast majority of cases, verification would mislead the system into premature convergence or infinite non-convergence. In a good system, with plenty of data distributed between training and verification along with correct tolerances, this is not the case — but I have neither the luxury of time nor copious amounts of data. I'll leave it in for now and figure out what I want to do with it later.

In terms of completion, the way I’ve broken down the work has me coding out C modules until at least Monday, whereupon I must fix any API discrepancies between the modules and magically anneal them together. And that’s still before making the hard-coded changes to the NGN generator.

I think making the deadline is still possible, but making sure that everything is as small (time) as possible is essential.

Eddie Ma

June 10th, 2009 at 2:49 pm

Posted in Machine Learning
