Randomness Testing Guide

Block Frequency Test

Let's say we have the following binary string.

s=00000000001111111111s = 00000000001111111111

It is obviously not random since the zeros and ones are not mixed together. Therefore, we must check that there are roughly an equal number of zeros and ones in each part of the string. Let's break the string into two substrings.

s0=0000000000s_{0} = 0000000000
s1=1111111111s_{1} = 1111111111

A useful property of χ2\chi^{2} variables with mm and nn degrees of freedom is that their sum is also a χ2\chi^{2} variable with m+nm+n degrees of freedom.

Therefore, in the example above, we get the following χ2\chi^{2} value.

χ1+12=((05)25+(105)25)+((105)25+(05)25)=20\chi^{2}_{1+1} = \left(\frac{(0 - 5)^2}{5} + \frac{(10 - 5)^2}{5}\right) + \left(\frac{(10 - 5)^2}{5} + \frac{(0 - 5)^2}{5}\right) = 20

The p-value for a chi-squared of 20 with 2 degrees of freedom is 0.0000, which is less than 0.01, indicating that this sequence of numbers was probably not randomly generated.

All Zeros Then All Ones00000000001111111111

This test, known as the block frequency test, is a slightly better version of the basic frequency test. However, it fails on the following binary string that was obviously not randomly generated.

s=01010101010101010101s = 01010101010101010101
Alternating Zeros and Ones01010101010101010101

Notes

The technique of dividing the input string into "blocks" and testing each block independently works for any test of randomness that uses the chi-squared statistic.

The block frequency test, as implemented in this guide, doesn't work correctly with small blocks on large input strings. This is due to a floating-point precision issue caused by having a large number of degrees of freedom. (One of the numbers in the calculation of the p-value becomes NaN.)