Ender314

I have been playing with genetic programming for several years, attempting to use it to create pseudorandom number generators (RNGs) and hash functions. In the course of doing this, with various parameters, I frequently encountered the pseudorandom number generator given by combining a bitwise rotation and a multiply. Sample C# code is below, with rotation constant b and multiplication constant c.

// Requires "using System.Numerics;" for BitOperations.
// _state is a class-level ulong field and must be seeded non-zero;
// b and c are the rotation and multiplication constants.
public ulong Next()
{
	_state = c * BitOperations.RotateLeft(_state, b);
	return _state;
}

This is a somewhat random function as is, but it has several shortcomings. First, it's not quite random enough by itself unless you only use half of the output. Second, the period is not well known. The period of an RNG is the number of values it returns before it starts repeating. For most commercial RNGs, the period is known mathematically, or at least a lower bound on it is known. For the RNG above, the period might be as short as 1 or as long as 2^64-1, which is the number of possible non-zero values of a ulong. (Clearly the value zero is unchanged by the rotate-multiply, so this generator has to be seeded with a non-zero value.)

Starting from some initial value of _state (an initial, or seed, value) and repeatedly calling the function above, it will obviously eventually repeat. The set of values it traverses before repeating is called a cycle, which has a length. The ideal case we are searching for is a "full cycle", where the function traverses every possible non-zero value before repeating. We will see that constants b and c which give a full cycle are rare. Most have a number of cycles of varying length. Some have multiple cycles of length 1, and some pairs of constants b and c have thousands of cycles, all of length 4 or less. Obviously that doesn't make for a very good random generator if it keeps returning the same four values in a row, which is why we are trying to find those "full cycle" constants.

Recall that bitwise rotation is a common low-level operation in computers, and is quite simple. For comparison, if we were discussing 5-digit decimal numbers, then RotateLeft(12345,2)=34512 and RotateLeft(100,3)=1. Bitwise rotation is the same thing in base 2 arithmetic. Since computers represent everything in base 2 with zeroes and ones, bitwise rotation is natural, and it is a well-known way of mixing portions of a number in many cryptographic hash functions and ciphers.
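A minimal sketch in C#, assuming 8-bit values (the built-in BitOperations.RotateLeft only handles uint and ulong, so this helper is mine):

static byte RotateLeft8(byte x, int r)
{
	r &= 7; // rotation amount wraps at the word size
	return (byte)((x << r) | (x >> (8 - r)));
}

With this, RotateLeft8(1, 3) returns 8 (binary 00000001 becomes 00001000), the base-2 analogue of the decimal examples above.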

Also, recall that computers use modular arithmetic for integers. That sounds fancier than it is: it really means that the result of any operation is reduced to its remainder modulo the modulus. For instance, among the integers modulo 256 (which would be 8-bit integers), 255+5=4 (260 % 256), and 21*61=1 (1281 % 256). Modular arithmetic isn't well known by all developers, so I'll highlight some results as needed. If you want to learn more, you can read about it over on Wikipedia. Modular arithmetic is the way integers work by default on computers. If you multiply two unsigned 32-bit integers, the result is given modulo 2^32.
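A quick illustration in C#, where byte arithmetic is reduced modulo 256 once the result is cast back down:

byte x = 255, y = 5;
byte sum = (byte)(x + y);    // 4, since 260 % 256 == 4
byte p = 21, q = 61;
byte prod = (byte)(p * q);   // 1, since 1281 % 256 == 1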

So we are searching for a pair of integers (b,c) for which the function has a full cycle. It is important that the function f(x) is invertible, otherwise it might be possible for two different x's to have the same result, in which case we couldn't have a full cycle. Bitwise rotation is clearly invertible, and the multiplication by c will be invertible as long as c is odd. This is a result from modular arithmetic: in the integers modulo 2^n, every odd number has a unique multiplicative inverse. So we immediately know that we need only consider odd values for c.
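The inverse itself is cheap to compute. A minimal sketch using Newton's iteration modulo 2^64 (the helper name is mine): for odd c, k = c is already correct to the low 3 bits, since c*c ≡ 1 (mod 8), and each step doubles the number of correct bits.

static ulong Inverse(ulong c)
{
	ulong k = c;                // correct to 3 bits for odd c
	for (int i = 0; i < 5; i++) // 3 -> 6 -> 12 -> 24 -> 48 -> 96 bits
		k *= 2 - c * k;
	return k;
}

For example, Inverse(21) * 21 == 1, and reduced modulo 256 the inverse of 21 is 61, which will appear again below.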

I brute-forced integers of bit size 3-23 and came up with a list of rotate-multiply constants that have a full cycle. Fortunately, such constants turned up at every bit size tested, suggesting that they exist for every bit size, which I call my first hypothesis.

Hypothesis 1: There are rotate, multiply constants b and c for any given integer size n such that there is a full cycle that contains every non-zero integer.

The basis for this is purely empirical. The table below shows the number of rotate-multiply pairs with a full cycle at each bit length.

Bit Length   Count
     3         1
     4         1
     5         2
     6         4
     7         4
     8         1
     9         4
    10         3
    11         6
    12         3
    13         7
    14         7
    15        11
    16         9
    17        13
    18        10
    19         8
    20         9
    21        10
    22        10
    23        11

Number of full-cycle rotate-multiply pairs at a given bit length

As can be seen, it looks like there are more such pairs as the bit size increases. I didn’t see any a priori reason why such pairs even have to exist, so this is nice to see.

Now we can get to the first theorem, albeit a trivial one. The number I listed in the table above is actually half the number of rotate-multiply pairs that exist, because every such pair (b,c) has a twin pair (n-b, k), where k is the multiplicative inverse of c. You can see this by just inverting the function f(x)=c*RotateLeft(x,b) for a given bit size n. Let k be the multiplicative inverse of c. Then

x_{n+1} = c * RotateLeft(x_n, b)
k * x_{n+1} = RotateLeft(x_n, b)
RotateLeft(k * x_{n+1}, n-b) = x_n

This is just another recurrence relation, although it's a little different: it runs backwards, and the multiply happens first, then the rotate. It must have the same cycle length as the original, although it's a slightly different kind of formula. I noticed this empirically before I saw the math, but I'll call it a theorem. The key is to notice that the sequence of operations alternates rotates and multiplies. You could view the function as multiply first, then rotate, or rotate first, then multiply. You would get a different sequence, but of the same length.

I visualize this by thinking of the sequence obtained by starting from an initial value, rotating it, then multiplying it. For instance, for an 8-bit integer with rotate 3 and multiply 21, starting from 1, we get RotateLeft(1,3)=8, then 8 * 21 = 168, then we repeat the process, obtaining the sequence below.

1,8,168,69,169,77,81,138,82,146,250,215,163,29,97,11,231,…

The first recurrence relationship describes the sequence 1,168,169,81,82,250,163,97,231. The second relationship, going backwards, describes 11,29,215,146,138,77,69,8. So the sequences are related, but they must be of the same length since they are looking at the same chain of operations.

Theorem 1: For a given bit size n, if (b, c) is a rotate-multiply pair with a cycle length m, then (n-b, k), where k is the inverse of c, is another rotate-multiply pair with a cycle length m.

As an example, for 8-bit integers, (3,21) is a full-cycle pair of constants, and so is (5,61), since 3+5=8 and 61 is the multiplicative inverse of 21 modulo 256. The generated sequences are related by the equation above, but are not the same. The first sequence is 1, 168, 169, 81, 82, 250, … The second is 1, 160, 196, 56, 171, 225. This reduces the search space by half: only rotates less than or equal to n/2 need to be checked, as the twin pair can be easily deduced. Mark Overton has done some research on this topic, and his appendix C contains a rotate-multiply pair of (18: 3,731,015,275) for 32-bit integers, with period just shy of the maximum. He notes that rotates greater than 16 are less random. Using this theorem, we can construct the twin pair, which would be "more random" and better suited for his purposes – (14: 583,185,987). With a seed of 1, I have confirmed the period of both of these is 4,294,967,293, as the paper says.
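Verifying a pair by brute force is straightforward. A sketch for bit sizes up to 32 (the helper name is mine; above 32 bits the multiply would need 128-bit arithmetic):

// Measures the cycle length of x -> c * RotateLeft(x, b) from a seed of 1,
// for n-bit integers with 1 <= b < n and n <= 32.
static long CycleLength(int n, int b, ulong c)
{
	ulong mask = (1UL << n) - 1;
	ulong x = 1;
	long length = 0;
	do
	{
		ulong rotated = ((x << b) | (x >> (n - b))) & mask;
		x = (c * rotated) & mask;
		length++;
	} while (x != 1);
	return length;
}

CycleLength(8, 3, 21) and CycleLength(8, 5, 61) both return 255 = 2^8 - 1, the full cycle of non-zero 8-bit values.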

Hypothesis 2: The multiplier c in a full-cycle pair is always congruent to 1 (mod 4)

This is another empirical hypothesis, but I observed it for every full-cycle pair from bit size 3 to 21. This is strong enough evidence for me to use it to reduce my search; the code now ignores constants c that are congruent to 3 (mod 4). Note that "c is congruent to 1 (mod 4)" is just a fancy way of saying that c divided by 4 leaves a remainder of 1, or c % 4 == 1 in C-style syntax.

There will be more articles coming, as I am collecting data and looking through it for patterns that may help me find constants b and c that work for 64-bit integers. Future articles will discuss cycle statistics, fixed points, roots of unity, and order in cyclic groups. Below is what I have gathered so far: the full-cycle (bit length, rotate left, multiply) triples.

Bit Length  Rotate  Multiply   |   Bit Length  Rotate  Multiply
    3          1          5    |       17         5       60669
    4          1          9    |       17         5       87353
    5          2         13    |       17         6       59053
    5          2         17    |       17         6       65537
    6          1         33    |       17         6       76209
    6          2         57    |       17         7       65537
    6          3         29    |       17         7       95205
    6          3         53    |       17         8       50845
    7          1         61    |       17         8       82897
    7          1         65    |       18         2      225013
    7          2          9    |       18         4      178077
    7          2         65    |       18         5       18125
    8          3         21    |       18         5      112837
    9          2        105    |       18         5      131073
    9          2        257    |       18         5      192957
    9          2        289    |       18         6      143697
    9          3        241    |       18         7      182433
   10          1        637    |       18         9       68729
   10          3        513    |       18         9      237001
   10          4       1009    |       19         2      519821
   11          1        225    |       19         4       59753
   11          1        237    |       19         5       14177
   11          1       1813    |       19         5      128565
   11          4       1081    |       19         5      323697
   11          4       1165    |       19         7      211525
   11          5       1025    |       19         8      409841
   12          1       1625    |       19         9      228385
   12          3        725    |       20         1      634197
   12          5       3561    |       20         1      940945
   13          1       7897    |       20         1      958881
   13          2       4033    |       20         2      283877
   13          3        561    |       20         3      497933
   13          3       1965    |       20         6      727193
   13          4        637    |       20         7      167493
   13          4        905    |       20         7      524289
   13          4       5429    |       20         7      950113
   14          1       7457    |       21         2       78437
   14          3       1629    |       21         2      737153
   14          3       9901    |       21         4      723317
   14          3      13377    |       21         6      321549
   14          5       1965    |       21         6     1523857
   14          7       1957    |       21         7     1740677
   14          7       2093    |       21         8     1933937
   15          1      12957    |       21         9      518925
   15          1      16385    |       21        10      286961
   15          1      20121    |       21        10     1048577
   15          2       1929    |       22         1     2097153
   15          2      16385    |       22         3     1458649
   15          4      13609    |       22         3     2107557
   15          4      16385    |       22         6     1359401
   15          5       4781    |       22         7      547981
   15          5       8681    |       22         7     2383549
   15          6       4305    |       22         8     1358145
   15          6      31173    |       22         8     1690877
   16          1       5977    |       22         8     2158689
   16          1      31373    |       22         9     3168081
   16          1      54205    |       23         1     2502269
   16          1      55833    |       23         5     1856189
   16          1      56373    |       23         5     4194305
   16          2      13233    |       23         6      716037
   16          2      55921    |       23         6     7071101
   16          5      17497    |       23         8     1904041
   16          5      23269    |       23         9     2413621
   17          1      39077    |       23         9     4194305
   17          3      38713    |       23        10     5903105
   17          3      65537    |       23        10     7835097
   17          4      82981    |       23        11     6768185

Sets of bit length, rotate left, and multiply constants that are full cycle

Ender314

In a previous post, I wrote about a simple two-state generator that my genetic programming found, with an add, a rotate, and a subtract (below, in C#).

public ulong NextRandom()
{
	state2 = state2 + state1;
	state2 = BitOperations.RotateRight(state2, 1);
	state1 = state1 - 12076313562642528635;
	return state2;
}

This passed some tests of randomness, but a helpful reddit user pointed out that the lower four bits were quite weak, and failed PractRand quite quickly. I updated my objective function to include a test that worked on only the lower four bits, and quickly found a very similar solution. This seems to be a somewhat attractive solution to my current genetic algorithm, as it found something like this several times.

public ulong NextRandom()
{
	state1 ^= state1 >> 44;
	state1 += state2;
	state2 -= 15057989893456573886;
	state1 = BitOperations.RotateLeft(state1, 31);
	return state1;
}

Looks like I mostly just needed a xor-shift to do some mixing. I tweaked this slightly, making the subtract into an add and making the constant odd so that state2 would cycle through every ulong value before repeating.

public ulong NextRandom()
{
	state1 ^= state1 >> 44;
	state1 += state2;
	state2 += 15057989893456573885;
	state1 = BitOperations.RotateLeft(state1, 31);
	return state1;
}

I ran this through my testing suite. It passed the GCD test, Gorilla (7 bits), Gorilla (17 bits), and Linear Serial Correlation at 10 billion numbers tested. It passed the lower four bits test at 1 trillion numbers, but fails that test at 80,000 without the xor-shift. It passes the individual bit test at 100 billion, but fails at 200,000 without the xor-shift. This indicates how critical the xor-shift is.

I was curious about the possible period of this generator. It is at least 2^64, since state2 is only changed by adding an odd number. I had no idea when the whole state would be the same, so I tested a trimmed down version with 16-bit states. I tested four different additive constants – 65301, 55047, 58575, and 37709, all chosen at random. For each of these constants, I examined the period of the generator with a seed of 1 and every possible rotate and xor-shift constant.
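A sketch of that trimmed-down experiment, assuming a seed of 1 in both halves and rotate and xor-shift constants between 1 and 15:

static long Period16(ushort add, int rotate, int shift)
{
	ushort s1 = 1, s2 = 1; // seed of 1 in both states (an assumption here)
	long period = 0;
	do
	{
		s1 ^= (ushort)(s1 >> shift);
		s1 += s2;
		s2 += add;
		s1 = (ushort)((s1 << rotate) | (s1 >> (16 - rotate)));
		period++;
	} while (s1 != 1 || s2 != 1);
	return period;
}

Since every step is invertible, the seed state is guaranteed to lie on a cycle, so the loop terminates.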

The smallest period I found was 13,893,632. This is quite a bit larger than the minimum possible period, 65536. The maximum was 4,282,122,240, very close to the maximum of 4,294,967,296. The average was around 2 billion, just under half the theoretical maximum. Of course this doesn’t prove anything, but it does provide some evidence that the period of this generator may be significantly more than the minimum possible of 2^64.

In these tests I also examined the frequency of 16 bit numbers returned. In a good generator all the numbers should appear at roughly the same frequency. If I define the bias as (max-min)/avg, where max is the count of the most frequent result, min is the count of the most infrequent result, and avg is the expected count, then the average bias was 0.05, the maximum was 0.62, and the minimum was 0.003. The high-bias tests were the ones with the lower periods. For example, with an additive constant of 55047, a rotate of 6, and a xor-shift of 15, the maximum count was 326 and the minimum count was 176, indicating some integer was returned 326 times while another was returned only 176 times, a significant difference. The same additive constant with a rotate of 14 and xor-shift of 6 has a very low bias, with a maximum frequency of 65257 and a minimum of 65091, indicating all possible 16 bit values were returned at roughly the same frequency.

I'm calling it ARXA for now, since it has an add, rotate, xor-shift, and add. Hopefully that name isn't taken. It's interesting because it's half-chaotic, but it has some good randomness and a guaranteed minimum period. It is also fairly fast, with only a handful of operations and no multiplies. In a future post I may attempt to tune the constants and subject it to further testing. The genetic algorithm isn't great at tuning constants because it makes no effort to improve the function once it "passes".

Ender314

I frequently run my genetic programming software with different parameters, seeing what it will find. When asked to find a pseudo-random number generator under some given constraints, it is fairly good at finding a solution. The solution it finds is typically chaotic, though, with an unverifiable period. That’s not very useful practically, as the chaotic nature means that some seeds may have very short times before they start repeating. Take, for example, the generator below, produced by a short simulation:

RMU S2,51,954523823516132654;
XSR S2,13;
XOR OP,S2;

Translated into C# (where there is a class-level “state” variable), this is

public ulong NextRandom()
{
    state = BitOperations.RotateLeft(state, 51);
    state = state * 954523823516132654;
    state = state ^ (state >> 13);
    return state;
}

Recall that ^ is the XOR operator in C#. People keep asking me if that’s a “raise to a power” operator.

This generator passes some randomness tests at a million numbers generated, and can probably be tuned to pass some fairly strong tests if some of the parameters are changed. Unfortunately, it mutates the state in a fairly chaotic way, and I’m unaware of a way to determine what the shortest possible period is. There might be some seed value of state for which it repeats after 16 values, or a billion. Without guarantees about period, it’s impractical to use.

However, recently I ran the genetic simulation with a constraint of not using multiplication, and it found the solution below.

ADD S2,S1;
ROR S2,1;
SUB S1,12076313562642528635;
XOR OP,S2;

Translated into C# (this time with two class-level state variables), this is

public ulong NextRandom()
{
    state2 = state2 + state1;
    state2 = BitOperations.RotateRight(state2, 1);
    state1 = state1 - 12076313562642528635;
    return state2;
}

This generator is interesting because state1 changes from iteration to iteration by subtracting an odd number, and thus state1 will go through every possible ulong before repeating. I have no idea what the actual period of state2, and thus of the whole generator, is, but it is at least the period of state1, which is 2^64. This is interesting because it is a rare case where I can prove a lower bound on the period by inspection, and the generator actually passes some decent tests of randomness. The generator is also remarkably simple, using only a single add, subtract, and rotate, so it should be competitive speed-wise with some of the existing cutting-edge PRNGs.

I have tested this generator on a given seed out to 100 billion numbers, using the GCD test, the gorilla test with word lengths 7 and 17, and the linear serial correlation test, and it passed all of them. I think the rotate constant it found was actually 4, not 1. I tested it at 1 in an attempt to see if that weakened it, but it apparently is quite strong at those parameters. I haven't yet tuned any of the parameters, but it is remarkably simple. State1 is what Marsaglia called a Weyl sequence – not very random. State2 is an accumulator that gets rotated between accumulations. Apparently that's a fairly decent way to randomize things.

Ender314

Building a One-Way Compression Function

One-way compression functions are a common cryptographic primitive. A one-way compression function takes two equal-length inputs and generates an output of the same length. The output must appear random relative to the inputs, and it must be hard to find two inputs that generate a given output (the one-way criterion). Many hash functions are built from one-way compression functions, for instance using the Merkle-Damgård construction.

We begin the search for a good function using genetic programming. The function we are looking for takes two integers as input, and produces one integer as output. The criteria are similar to general randomness criteria for a hash function. Specifically, the sequence f(1,0), f(2,0), f(3,0), … must pass strong tests of randomness. Also, flipping a single bit of either input must change the output in a completely random way (the differential test).

After running for quite some time and exploring over 200,000 different functions, the genetic algorithm finds the solution below (in assembly and C#).

This function is remarkably simple, and has been tested out to a billion evaluations without any detectable pattern. The state variable y is changed at every iteration by exclusive-or with a constant. Since exclusive-or is its own inverse, the value of y alternates between the initial value and the exclusive-or of the initial value with the given constant. This value is exclusive-or'd with x at each iteration, and x is scrambled with the rotate-multiply operation, which is known to be a very strong randomizer.

The first line inside the loop serves several purposes. First, it ensures that Hash(0,0) is not zero (a desirable property for hashes). Second, it ensures that the second line in the loop actually changes the value of x significantly, even if the initial value of y doesn't have a lot of 1's in its binary representation. For instance, if y were just 256, there would be only one non-zero bit in the binary representation, and without the first line, the second line "x ^= y;" would just flip one bit of x at each loop iteration. The first line ensures that, whatever the initial value of y, over multiple iterations it will have many non-zero bits, causing a lot of mixing with x.
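Putting that description together, a sketch of the loop structure (the round count and constants here are illustrative stand-ins, not the values the genetic run evolved):

public ulong Hash(ulong x, ulong y)
{
	const int Rounds = 4; // illustrative; the evolved round count isn't reproduced here
	for (int i = 0; i < Rounds; i++)
	{
		y ^= 0x5DEECE66D9A3B1E5UL; // first line: flips y between two values
		x ^= y;                    // second line: absorb y into x
		x = 0x9E3779B97F4A7C15UL * BitOperations.RotateLeft(x, 31); // rotate-multiply scrambler
	}
	return x;
}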

As with many hashes, the particular value of the constants likely doesn’t matter a whole lot. The first constant could likely be any number with about half 1’s in the binary representation, the multiplier could be any odd number with roughly half 1’s in the binary representation, and the rotation constant could likely be any number not too close to 1 or 63. This function is a good randomizer, but in order to make it a good compression function, one would have to wrap it as below.

public ulong Compress(ulong x, ulong y)
{
	return x ^ y ^ Hash(x, y);
}

By mixing the inputs with the output, it becomes more difficult to invert.

Ender314

The Differential Test

Random numbers have a tight relationship with cryptography. A good encryption algorithm will take plain text and a key, and encrypt it to a ciphertext that looks completely random. If the ciphertext is distinguishable from random to an attacker, that means there is a weakness, and one could potentially exploit that weakness to retrieve the key or the plaintext. In general, if any weakness is found, then the cipher is considered broken, because one may be able to “pull the thread” on that weakness and unravel the whole system.

There are many different ways to attack an encryption algorithm, but here I will just briefly discuss chosen-plaintext attacks. This is where the attacker is given black-box access to the encryption algorithm, and their goal is to "crack" the algorithm by carefully choosing a number of plaintexts, encrypting them, and then somehow deducing the structure of the cipher, leading to a way to read an adversary's ciphertext. If the cipher is strong, no such attack will be possible.

One particularly strong way of attacking a cipher is with a differential attack, named because the attacker is trying to track differences through the cipher. For example, encrypting some text, then encrypting the same text with one letter changed, then examining the differences between the two resultant cipher texts. For a strong cipher, the resultant pair of cipher texts will look completely random and not correlated in any way. For a weak cipher, some structure may be visible.

In the context of random number generators, we can use this technique to analyze counter-based random number generators. Those are generators where the internal state is simply a counter that increases on each call, and the generator returns some function of the counter. In this case, the output function must be very good at randomizing, and I generally call these hash-like generators. The test I have developed I call the differential test, and it is quite a hard test to pass. Jumping ahead to the result, the linked RandomHash generator, when weakened to only 8 rounds (instead of 64), passes normal tests of randomness to around 2 billion numbers. It fails the differential test at around 1 million numbers generated. That's a factor of 2,000 difference in fitness, reflective of just how powerful the test is in distinguishing output from random.

The differential test is fairly straightforward to understand, although there is quite a bit of bookkeeping under the hood. Consider a sequence of random numbers a, b, c, …, and a related sequence A, B, C, …, where A = a xor 1, B = b xor 1, C = c xor 1, etc. We are testing a counter-based generator with an output function F(x). We examine the sequence

F(a) xor F(A), F(b) xor F(B), F(c) xor F(C), …

We then take this sequence and test it for randomness using the same tests we would use for any random number generator. This is, in effect, testing if flipping the lowest bit results in a hash that looks completely different. The differential test does this for every bit, so it is quite a heavy and time-intensive test. Functions that pass this test are good candidates for cryptographic-quality generators.
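A sketch of the stream for a single bit position, assuming a counter-based generator with output function f (requires using System and System.Collections.Generic; the names are mine):

static IEnumerable<ulong> DifferentialStream(Func<ulong, ulong> f, int bit)
{
	ulong mask = 1UL << bit;
	// Emits F(n) xor F(n ^ 2^bit); this stream is then fed into the
	// usual randomness tests, once per bit position.
	for (ulong counter = 0; ; counter++)
		yield return f(counter) ^ f(counter ^ mask);
}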

The differential test can give quite specific results that are interesting in and of themselves, like "Fails differential on bit 62, for the gorilla test on bit 9." This means that when bit 62 of the input is flipped, there is a non-randomness in the way that bit 9 of the output changes.

The source code for the RandomHash generator. Note the 64-iteration loop in the Nextulong() function, which we discussed lowering to 8 to show weaknesses.
https://github.com/EnderPi/FlemishGiant/blob/main/Jabba/Framework/Random/RandomHash.cs

The source code for the differential test. Note the Process method, which repeatedly flips bits and hashes the resultant numbers, then passes the exclusive or of the result into a normal randomness test.
https://github.com/EnderPi/FlemishGiant/blob/main/Jabba/Framework/Random/Test/DifferentialTest.cs

Ender314

Testing Bitwise Relationships

In the course of doing randomness testing, I developed a powerful test for randomness, inspired by a technique for trying to break encryption (Linear Cryptanalysis). I call this test the Linear Serial Correlation test. It is excellent at detecting relationships between sequential pairs of numbers from a random number generator.

The basic idea is that for each bit of a random number and each bit of a second random number, those bits should be the same half the time, and different half the time, just like two separate, unrelated coin flips would be. Conveniently, this can be done at the bit level with exclusive-or.

There are 64 bits in each random number, so the test needs to keep track of 64 x 64 = 4,096 bit pairs. This is done with a two-dimensional array of integers, 64 x 64 in size. The first number from the generator is stored, then each bit of the second is exclusive-or’d with each bit of the first, and the result added to the appropriate position in the array. Then, the same for the second and third numbers from the generator, and so on.
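A sketch of that counting core, assuming a generator with a ulong Next() method (the variable names are mine):

long[,] counts = new long[64, 64];
ulong previous = rng.Next();
for (long n = 0; n < totalPairs; n++)
{
	ulong current = rng.Next();
	// Tally how often bit i of the previous number differs from bit j
	// of the current one; each tally should approach half of N.
	for (int i = 0; i < 64; i++)
		for (int j = 0; j < 64; j++)
			counts[i, j] += (long)(((previous >> i) ^ (current >> j)) & 1UL);
	previous = current;
}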

After several hundred or more iterations, there are 4,096 integer counters, and a total number of pairs evaluated N. For a theoretically “good” random number generator, each counter in the array should be around half of N, just like a random coin flip would be. The test uses the same binomial methodology as in the previous article to determine a p-value.

A very small p-value indicates a statistically significant relationship between one of the bits in a number from the generator and one of the bits in the next number. Importantly, since 4,096 different p-values are examined, a Bonferroni correction is applied (link below).

The source code for the Serial Correlation Test. In particular, look at the CalculateResult and Process methods.
https://github.com/EnderPi/FlemishGiant/blob/main/Jabba/Framework/Random/Test/LinearSerialTest.cs

Wikipedia article discussing the Bonferroni correction:
https://en.wikipedia.org/wiki/Bonferroni_correction

Ender314

There is no limit to the number of tests of randomness that one could come up with. All tests generally proceed the same way – create some hypothetical test that would have a known outcome with a theoretical "perfect" random number generator, then run that test with the generator in question and compare the output with the theoretical expectation. One of the oldest batteries of randomness tests, the Diehard tests, included a number of tests based on simple real-world situations, such as playing games of craps. Not all tests are created equal, however, and there are a number of criteria to consider when evaluating a test battery.

  1. How fast is the test? Some tests, such as the 64-bit birthday test, are very slow due to repeated sorting and looping through arrays of several million numbers.
  2. What is the memory footprint of the test? Some tests have a small memory footprint, like the linear serial correlation test at 32 KB. Others, like the 24-bit gorilla test, require 4 gigabytes of RAM to store the state arrays.
  3. What is the minimum amount of numbers that the test needs before it reports a pass/fail? The GCD test can report a pass/fail at only 100 numbers, while the 24-bit gorilla test requires around 2 billion.
  4. Is the test capable of detecting non-randomness quickly? Many tests are theoretically capable of detecting a wide variety of biases, but if the test takes 2 years to run then it is of little practical interest.
  5. Is the test independent from the other tests in the battery? Some tests are very similar and test similar biases in the generator. These tests duplicate effort and so typically only the better one should be retained.

I frequently talk about how "random" or "strong" a random number generator is in terms of its passing tests at a large quantity of random numbers. For example, the EnderLcg RNG has passed a number of tests up to a trillion, and is expected to pass several at a quintillion. This is relevant because most tests become "stronger" as more numbers are tested. In this sense, "stronger" means "able to detect a smaller bias".

Consider, for example, a coin with a small bias, say 55% heads, 45% tails, with a binomial distribution test (link below, but the details are not needed to follow along). If you flipped it 50 times and got 27 heads and 23 tails, many people would intuitively conclude that a "fair" coin might produce that result. They'd be correct: the p-value for this is 0.57. Recall that the p-value is the chance that a result that extreme might be produced by chance, so this means there is a 57% chance that a "fair" coin would produce a result of equal or greater bias. As the coin is flipped more times, however, the p-value here will converge to zero (graph below).

As can be seen, the p-value converges to zero as the number of flips increases, indicating a very small probability that a fair coin would produce such a biased result. In practice, we set some small threshold where we consider a test result a fail if the p-value drops below that threshold. I use 1e-9 as a threshold for failure, but consider a test suspicious when the p-value drops below 0.001. The graph above is the reason why it is important to occasionally stress a random number generator by testing very long streams of numbers. Frequently a generator will pass a test at a smallish number of iterations, such as a billion, but fail when pushed further.
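The convergence is easy to reproduce with the normal approximation to the binomial; a sketch (the erfc approximation is Abramowitz & Stegun 7.1.26, and the method names are mine):

// Two-sided p-value that a fair coin shows at least the observed bias after n flips.
static double PValue(long n, double observedFraction)
{
	double z = Math.Abs(observedFraction - 0.5) * Math.Sqrt(n) / 0.5;
	return Erfc(z / Math.Sqrt(2.0));
}

// Abramowitz & Stegun 7.1.26 approximation of erfc(x) for x >= 0.
static double Erfc(double x)
{
	double t = 1.0 / (1.0 + 0.3275911 * x);
	double poly = t * (0.254829592 + t * (-0.284496736
	            + t * (1.421413741 + t * (-1.453152027 + t * 1.061405429))));
	return poly * Math.Exp(-x * x);
}

PValue(50, 27 / 50.0) is about 0.57, matching the example above, while PValue(5000, 0.55) is already about 1.5e-12, far below any reasonable failure threshold.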

Reading on the binomial distribution
https://en.wikipedia.org/wiki/Binomial_distribution#Normal_approximation

The Diehard tests
https://en.wikipedia.org/wiki/Diehard_tests

Ender314

The Gorilla Test

A powerful test for randomness was described by George Marsaglia, dubbed the “Gorilla Test.” The test is inspired by the idea of monkeys typing away randomly on typewriters. The general concept is straightforward – if you randomly generate, for example, 4-letter “words”, every possible word should occur at roughly the same frequency. In this context, “word” just means a sequence of letters, like “ajht”, “that”, or “zvwp”. If we consider just the 26 letter English alphabet, there are 26*26*26*26 = 456,976 possible “words”.

So if we generate 4,569,760 random 4-letter "words", we should get about 10 of each of the 456,976 different words. If the counts for many of the words differ too much from 10, it is likely not very random. Specifically, a Pearson's chi-squared test (link below) is performed, and a "p-value" is generated. The p-value is the probability that a result at least that extreme would occur due to random chance, hence a very small p-value, like 0.00001, indicates that the random generator is poorly distributed and fails the test. A p-value that isn't too small, such as 0.15, indicates that the numbers appear random.

The actual gorilla test is one layer more complex than described above, although the concept is the same. The random number generators I consider all generate 64-bit unsigned integers. Each bit location of the 64-bit random numbers is treated independently. Consider 8-bit “words”, for example. 8 random 64-bit numbers are generated. The first bit of each of these 8 numbers create one 8-bit “word”. The second bit of each of these 8 numbers create another 8-bit “word”, and so on. These are all stored separately, since each bit is being tested independently.
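A sketch of that per-bit word extraction for 8-bit words, assuming a generator with a ulong Next() method:

ulong[] outputs = new ulong[8];
for (int i = 0; i < 8; i++) outputs[i] = rng.Next();

// Word b is built from bit b of each of the 8 numbers, so each of the
// 64 bit positions gets its own stream of 8-bit words.
int[] words = new int[64];
for (int b = 0; b < 64; b++)
	for (int i = 0; i < 8; i++)
		words[b] = (words[b] << 1) | (int)((outputs[i] >> b) & 1UL);

Each words[b] would then increment the count array for bit position b, and the process repeats with the next 8 numbers.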

To generate a reasonable test, the expected count of each value should be about 5, so we would need 5 * 256 * 8 = 10,240 random numbers to get a test statistic. Since each bit position is tested independently, there are 64 p-values generated. As Marsaglia noted, remarkably, many generators consistently fail for certain bit positions, indicating that some bits are more random than others. The Gorilla test is fairly hard to pass, and requires quite a bit of bookkeeping.

As the word size gets larger, the required memory space and the minimum quantity of random numbers to get a result increase rapidly. For 16-bit words, the minimum quantity of random numbers to report a result is 5 * 65536 * 16 = 5,242,880. To keep track of the counts, a 64 x 65,536 two-dimensional array is required. The largest word size I routinely use is 24 bits, which requires 5 * 16,777,216 * 24 = 2,013,265,920 numbers to report a pass or fail, and requires a 64 x 16,777,216 two-dimensional array to store the counts. If the counts are stored as 32-bit integers, this amounts to about 4 gigabytes of RAM for the state arrays.

Marsaglia's paper is linked below. Notably, the test he describes is slightly different, although the theme is identical. The method we use is mentioned by Marsaglia as being infeasible due to memory constraints – "Unfortunately, there are usually too many possible k-letter words to keep cell counts for each one." This was true in 2002 when the paper was first published, but is no longer true in 2022. He describes a test for 26-bit "words". Using the method described here, this requires around 16 gigabytes of RAM for the state arrays (for 64-bit generators), which is possible with a high-end desktop machine.

Further reading on details of developing test statistics
https://en.wikipedia.org/wiki/Chi-squared_test#Pearson’s_chi-squared_test

Implementation of Gorilla Test. In particular, look at the public methods Initialize, Process, and CalculateResult.
https://github.com/EnderPi/FlemishGiant/blob/main/Jabba/Framework/Random/Test/GorillaTest.cs

Marsaglia’s 2002 Paper – Free to download.
https://www.researchgate.net/publication/5142801_Some_Difficult-to-Pass_Tests_of_Randomness

Ender314

Applying Output Scrambling Functions

The classic linear congruential generator (LCG) is well known and very problematic. It is based on the simple linear recurrence Y(n+1) = m*Y(n) + b, for some constants m and b. If the constants m and b are chosen carefully, this generator will cycle through all possible values before repeating. Unfortunately, the upper bits are much more random than the lower bits: the lowest bit alternates between 0 and 1, and each successively higher bit has twice the period of the one below it. As a first attempt at fixing the generator, a simple Y = Y ^ (Y >> c), for some constant c, can be applied to the output, which is a bijective operation. This is a well-known way of mixing some of the random high bits into the lower bits. It follows another well-known pattern for random number generators, splitting the generator into two parts: the first part mutates the state, and the second part returns some function of the state. The results are below.
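A sketch of that split, using Knuth's MMIX constants as a stand-in LCG (the constants actually tuned in this post are not specified here):

private ulong _state;

public ulong Next()
{
	// State transition: a full-period 64-bit LCG.
	_state = _state * 6364136223846793005UL + 1442695040888963407UL;
	// Output function: a bijective xor-shift that folds the strong
	// high bits into the weak low bits.
	ulong y = _state;
	return y ^ (y >> 32);
}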

Unsurprisingly, a right shift of 32 seems to be the best, yielding a fitness of around 5.4 billion. In this context, "fitness" means the quantity of random numbers required for the tests to detect a pattern. 5.4 billion is a significant improvement versus the fitness of 55 without the output scramble. The full period for a 64-bit generator is about 18 quintillion, though, so we should strive to do better. A different bijective output is the rotate-multiply considered in the Romu generator. A fixed constant was chosen for multiplying, and the rotation constant was varied. The result is below.

This output scrambling function is much better, as it creates a generator that is strong enough that it can't be tested at the optimal value. Linear extrapolation determines that the ideal rotation constant is 15, with a fitness of 55.8 trillion. It is remarkable that the lines have different slopes; this is because there are different failure mechanisms. On the left side, the failures are a mixture of GCD on power-of-two values like 1024, 2048, 4096, 8192, etc., and the gorilla test on some of the lower bits – 4, 6, and 9. Rather non-uniform failure mechanisms, but still a good linear fit. On the right side, the failures are overwhelmingly the gorilla test on bits 0 and 1, with high p-values, indicating overly uniform word distributions. This is much better, so next we consider applying both scrambling functions to the output, fixing the bitshift at 32 and varying the rotation constant. We apply the rotate-multiply first, then the bitshift-xor. The results are below.
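A sketch of that combined scramble, again with stand-in constants; the multiplier is just an arbitrary odd number, and only the shape (rotate-multiply, then xor-shift by 32) follows the text:

public ulong Next()
{
	_state = _state * 6364136223846793005UL + 1442695040888963407UL;
	ulong y = 0x9E3779B97F4A7C15UL * BitOperations.RotateLeft(_state, 9);
	return y ^ (y >> 32);
}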

As can be seen, this results in a very strong generator, with an optimum rotation constant of about 9 and a fitness value of almost 1 septillion. This is much larger than the possible period of the generator, indicating this generator will likely pass all tests until it starts repeating values. The tests used in this simulation were the GCD test, the gorilla test with word sizes 7 and 17, and the linear serial correlation test. A C# version of the generator is linked below. I can recommend this for most general purposes, as it is fast and passes good tests of randomness.
https://github.com/EnderPi/FlemishGiant/blob/main/Jabba/Framework/Random/EnderLcg.cs

Ender314

I started doing genetic programming to create random number generators back in March of 2021. I started out simple, with expression trees for the state transition function and the output function, and Marsaglia's GCD and Gorilla tests from his paper on difficult-to-pass tests of randomness. I modified the tests slightly to take advantage of the more plentiful RAM of modern machines and to work on 64-bit generators. I set the genetic algorithm to favor fewer operations, and quickly found out that there was a solution it converged to, without fail, every time: a multiply, then a bitwise rotate. It makes sense: multiplication is known to be very good at scrambling the upper bits but leaves a weakness in the lower bits, so a rotate allows them to mix well.

On some further investigation, I found out that it almost didn't matter what the multiplier was, as long as it was odd and had a decent mix of 1's and 0's. The rotate did matter, but it basically couldn't be too big or too small; 11-53 or so worked fine, with prime numbers like 29 and 31 seeming to fare the best. The only problem was the period. How long would the function take to wrap around to where it started? How could I guarantee any seed wouldn't have a short period? I brute-forced all possible multiply-rotate pairs for integers with 8-17 bits, starting at a seed of 1. For every size integer, I found constants that would produce a full period (excluding zero, of course). RotateLeft(21*x, 3) is full period for 8 bits, and RotateLeft(23269*x, 5) is full period for 16 bits. The constants always come in pairs, since odd numbers have a multiplicative inverse, and rotating right and left are inverses. For example, RotateRight(61*x, 3) (equivalently RotateLeft(61*x, 5)) is the other full-period pair for 8 bits. Also, it seemed that there were more full-period constants with larger integers. There was only one pair of constants for 8-bit integers that gave a full period, but 9 pairs for 16-bit.

Unfortunately, for every size integer, I also found constants with a period of only 2. In fact, the period was uniformly distributed from 2 to the maximum, across all possible multiply-rotate pairs (CDF above). There were even pairs that turned the state transition into a linear feedback shift register. There were plenty with a very large period starting at 1 that weren't full cycle, implying some seeds had a very short cycle. So I did a little digging into the math on permutations, and looked at the constants for smaller integers to see if I noticed a pattern. I wasn't able to find a solution. I think it is a promising state transition function, since we have very few that are known to have full period for use in random number generators. Also, it passes decent tests of randomness without output scrambling, and empirical evidence at smaller integer sizes suggests full-period constants do exist. However, I chose to move on to other work, perhaps returning later. I don't like chaotic generators without guarantees. I recently discovered Mark Overton's website, and it seems he did quite a bit of work on these, but he hasn't solved the full cycle problem yet.

My genetic programming simulations don't find these anymore, as I've developed a strong randomness test, based on linear cryptanalysis, that fails these virtually immediately. Below are the first numbers from the RotateLeft(21*x, 3) sequence for 8-bit integers, starting from 1.

1, 168, 70, 245, 200, 67, 251, 188, 99, 248, 194, 87, 25, 104, 68, 164, 163, …