[ThinAir Home] [Table of Contents] [Quick Start] [Choosing a Random Number Generator]

3 Characteristics of Random Number Sequences

What do you look for in a random number sequence? The easiest answer to this question is unpredictability. But how do you measure unpredictablility? One way to determine the unpredictability of a number sequence is to look for characteristics of a perfectly random sequence of numbers. What are the characteristics of a good random sequence? Which characteristics of a random sequence are important?

Once we know what characteristics are desirable in a random sequence, we can evaluate PRNGs in terms of the sequences they generate. Just as important, we can decide which characteristics do not affect our use of the PRNG. If a particular characteristic has no effect on your problem, there is no need to consider that characteristic when evaluating PRNGs.

3.1 Distribution

Probably the most fundamental characteristic of a sequence of numbers is its distribution. The PRNGs in the ThinAir library (with a few exceptions) produce uniformly distributed number sequences. Other distributions include normal, Poisson, geometric, binomial, and student-t. Other authors have published algorithms that convert uniformly distributed number sequences into other distributions. (See [4, pages 131-133])

3.2 Range

Another fundamental characteristic of a random number sequence is its range. What is the smallest number in the sequence? What is the largest? These two numbers define the range of the whole sequence. The range of a sequence is more important and more subtle than most people realize. If the range of a random number sequence is 0 to 64, it is useless for generating random percentages. One good example of a bad range problem appears in some simple card games. In these examples, a 16-bit random number generator is used to choose a layout of a deck of cards. The problem lies in the fact that the generator can produce at most 65,536 different numbers. A deck of cards can produce 80.66 * 1066 different layouts. If you are very careful in your use of the generator, this problem is solvable. If not, most of the possible layouts never occur.

3.3 Type

Another useful characteristic of a random sequence is its type. What kind of numbers make up this sequence. Are we talking about bits, integers, floating point numbers, strings, or something else entirely. Most of these types can convert into each other relatively easily. In general, PRNGs fall into three categories:

  1. PRNGs that generate a sequence of integers.
  2. PRNGs that generate a sequence of floating point numbers between 0.0 and 1.0.
  3. PRNGs that generate a stream of bits.

3.4 Period

All algorithm-based PRNGs repeat if they are allowed to run for long enough. This is caused by the fact that the generator has a finite amount of internal state, and by the fact that an algorithm cannot produce truly random output. Since we know the sequence must repeat, we would obviously like to have it repeat after as long a time as possible. The number of individual numbers generated by the PRNG before it begins to repeat is known as the period of the sequence. This sequence of numbers is often called a cycle. The amount of internal state retained by a PRNG defines the upper limit on the period. A PRNG with a period equal to the maximum predicted by its internal state is sometimes called a maximal-period generator.

Depending on the algorithm used, a PRNG may actually generate more than one cycle. The more individual cycles a generator has, the shorter the period each cycle has. The best generator would, of course, have a single cycle that was longer than we could ever use (effectively infinite). Some multiplicative congruential random number generators have two cycles, one being very long (approximately 232-1 numbers in the sequence), the other being a cycle of period 1 at 0.

3.5 Coverage

The term coverage describes how well a given sequence covers its output range. For example, a PRNG with an output range of 0 to 232-1 should generate every number between 0 and 232-1 if allowed to run for long enough. Not all PRNGs cover their entire output range.

If a PRNG has multiple cycles, the coverage issue is even more interesting. The generator may actually be capable of covering the whole range of output values. However, only certain values are covered in each cycle. The multiplicative congruential PRNGs mentioned above have a maximum coverage one less than expected in the main cycle. This is because an output of zero is never part of the main cycle.

Many tests have been devised to study coverage. Two of these are the equidistribution, or frequency, test and the gap test. The equidistribution test tries to verify that all of the numbers in the sequence are equally likely. The gap test studies the probabilities of gaps of different sizes between numbers in a particular range.

3.6 Runs

One interesting characteristic of a sequence of numbers concerns runs. When each successive number in a subsequence is larger than the previous number, we have a run up. When each successive number in a subsequence is smaller than the previous number, we have a run down. In a good random sequence we would expect the number of runs up to be pretty close to the number of runs down. We would also expect that longer runs would be less likely than shorter runs.

3.7 Serial Correlation

A good sequence of random numbers should not have any correlation between successive numbers. This means that you should not be able to predict the next number in the sequence by looking at previous numbers. The classic example of serial correlation in a PRNG is the linear congruential generator. When a linear congruential PRNG is used to generate pairs of numbers for use as x, y coordinates on a graph, a pattern often shows up in the graph. The pairs of numbers tend to fall on a set of parallel lines on the graph. This artifact is a simple result of the algorithm used.

One way to test for high serial correlation is the Poker test. The Poker test looks at the combinations of five successive numbers taken from the sequence. These numbers are analyzed for pairs, three-of-a-kind, full house, etc. The probabilities of these combinations should approach that of a random number stream.

3.8 Permutations

In a truly random number stream, any permutation of a set of numbers is as likely as any other permutation of the same numbers. If one permutation of a set of numbers is more likely than another, a few numbers in the set may supply enough information to help predict following numbers.

3.9 Spectral Characteristics

Even if a sequence of numbers does not exhibit any obvious patterns, it may have higher order artifacts. A truly random sequence would have no particular frequency component more pronounced than any other. Since any PRNGs must repeat after some length of time, there is at least one dominant frequency component. Studying the spectral characteristics tells us if there are any others.


For further information, contact G. Wade Johnson (gwadej@anomaly.org).

© 1997 G. Wade Johnson. All rights reserved.
http://www.anomaly.org/ThinAir/charactr.html