Random Notes About Python's Random Module

Posted on | 1410 words | ~7mins
python notes software software-engineering

TL;DR: Use the functions in the random module for modeling, simulations, games, sampling, etc. but use os.urandom, secrets, or random.SystemRandom for cryptographic applications. I know very little about cryptography and security, these are just my notes about stuff I recently learned.

It uses the Mersenne Twister algorithm, which is a pseudorandom number generator with a period of 2^19937-1. It is one of the most widely used PRNGs in the world, and suitable for many applications. But it is not cryptographically secure and not suitable for cryptographic applications.

There are two reasons for this:

  1. It’s possible to predict the next number in the sequence given a relatively small set of previous numbers (624)
  2. It is set to fallback on choosing a seed based on the system time if there is no source of system randomness available (/dev/urandom on linux, etc.)

It’s also pretty simple to implement, and there is a really detailed pseudocode implementation in the Wikipedia article. Here’s what it might look like in Python:

class MT19937:

  # buncha constants
  f = 1812433253
  w, n, m, r = 32, 624, 397, 31
  a = 0x9908B0DF
  u, d = 11, 0xFFFFFFFF
  s, b = 7, 0x9D2C5680
  t, c = 15, 0xEFC60000
  l = 18

  def __init__(self, seed: int):
    # init 
    self.MT = [0] * self.n  # state

    # masking for 32 bit ints 
    self.lower_mask = (1 << self.r) - 1
    self.upper_mask = (1 << self.r)

    # index
    self.index = self.n + 1

    # init state for the first time
    self.seed_mt(seed)
  
  def seed_mt(self, seed: int):
    # Initialize the generator from a seed
    self.index = self.n
    self.MT[0] = seed
    for i in range(1, self.n):
      self.MT[i] = self.f * (
        (self.MT[i-1] ^ (self.MT[i-1] >> (self.w-2))) + i)
      self.MT[i] &= self.lower_mask

  def extract_number(self):
    if self.index >= self.n:
      if self.index > self.n:
        raise Exception("Generator was never seeded")
      self._twist() 

    y = self.MT[self.index]
    y ^= (( y >> self.u) & self.d)
    y ^= (( y << self.s) & self.b)
    y ^= (( y << self.t) & self.c)
    y ^= ( y >> self.l)
    self.index += 1
    return y & self.lower_mask

  def _twist(self):
    for i in range(self.n):
      x = (self.MT[i] & self.upper_mask) | (
        self.MT[ (i+1) % self.n] & self.lower_mask )
      xA = x >> 1
      if x % 2 != 0:
        xA ^= self.a
      self.MT[i] = self.MT[(i + self.m) % self.n] ^ xA
    self.index = 0

Predicting the Next Number in the Sequence

The internal state of the generator is used to produce the next number in the sequence. Although the numbers generated won’t cycle for an astronomically large amount of time on current computers (the period is 2^19937-1, a number with 6000 digits!), reading a sequence of 624 numbers from the generator allows you to reconstruct the current internal state of the generator and to predict the next number in the sequence. For example, this is pretty straightforward to do.

Basically the generator produces a new number by taking the next number in its 624 uint32 internal state and ’tempering’ it with a bunch of bitwise operations:

 def temper(y) -> int:
    y ^= (( y >> MT19937.u) & MT19937.d)
    y ^= (( y << MT19937.s) & MT19937.b)
    y ^= (( y << MT19937.t) & MT19937.c)
    y ^= ( y >> MT19937.l)
    return y & ((1 << 32) - 1)

Where all the constants are defined in the class. The temper function is reversible, so it’s a little tedious but you can undo all the operations and get back to the original number:

def untemper(y: int) -> int:
  """ Untemper the output of the MT19937 RNG.
  Used the excellent explanation here:
    https://occasionallycogent.com/inverting_the_mersenne_temper/index.html
  to wrap my head around all this bit manipulation.
  """
  smask = (1 << MT19937.s) - 1
  umask = (1 << MT19937.u) - 1
  lower_mask = (1 << MT19937.w) - 1
  y ^= (y >> MT19937.l)
  y ^= ((y << MT19937.t) & MT19937.c)
  y ^= ((y << MT19937.s) & MT19937.b & (smask << MT19937.s))
  y ^= ((y << MT19937.s) & MT19937.b & (smask << (MT19937.s * 2)))
  y ^= ((y << MT19937.s) & MT19937.b & (smask << (MT19937.s * 3)))
  y ^= ((y << MT19937.s) & MT19937.b & (smask << (MT19937.s * 4)))
  y ^= (y >> MT19937.u) & (umask << (MT19937.u * 2))
  y ^= (y >> MT19937.u) & (umask << MT19937.u)
  y ^= (y >> MT19937.u) & umask
  return y & lower_mask

I used the blog post linked in the docstring of the untemper function to help me wrap my head around all the bit twiddling, check it out for a more detailed explanation and some cool diagrams showing what’s actually going on at each step.

So basically, after reading 624 values output by the generator, you can untemper them to get back to the original internal state of the generator. Then you can use that make a clone of the original generator, with the same internal state, so it will produce the same sequence of numbers as the original generator.

There is even a Python module available called RandCrack (and I’m sure a host of others) that clone the state and then offer the same interface as the random module. This alone should makes it obviously not suitable for cryptographic use.

Fallback to System Time

The second reason is that if there is no source of system randomness available, the seed is initialized using:

  • the current system time
  • the process ID
  • the current system monotonic time

as the seed. Those things have a lot less entropy than use a true random source to chose the initial state. For example, if a system were not using /dev/urandom or similar, you can make some reasonable assumptions about the system time, pid, and monotonic time that might have been used as the seed and use that brute-force the possible seed choices. You would have a reasonable chance of ending up with your generator in the same state as the target.

There’s still a lot of possibilities, and you won’t know the exact state of the generator unless you know how many times it’s been called since it was seeded, but it’s still a lot less entropy than reading the whole state in from /dev/urandom.

What can you use for a cryptographic applications?

Python has a module called secrets that provides a cryptographically secure source of randomness, and that’s the one that should be used to generate random bytes or bits for tokens, etc. There is also the lower level os.urandom, which is a direct wrapper around the system’s source of randomness (e.g. /dev/urandom on linux) - it reads information from ‘device drivers and other sources of environmental noise’ and uses that instead.

There is also the random.SystemRandom class in the random module. It provides the same interface as random (all the functions are really just calling methods a hidden instance of random.Random). It implements the same methods as are available as functions in random, but instead of using the Mersenne Twister algorithm PRNG under the hood, it calls os.urandom to generate random numbers using the system’s source of randomness. See the docs on that here: https://docs.python.org/3/library/random.html#random.SystemRandom

Conclusion

The Python random module is great for generating random numbers for modeling, simulations, games, etc. but it’s not suitable for cryptographic use. Its state can be obtained by observing a relatively small number of consecutive numbers from the generator, or in some cases by brute forcing the reduced space of possible seeds if the system is not using seed based on ’non-deterministic sources of randomness’ from the OS. So, for cryptographic applications, use the secrets module or os.urandom instead.

Why not always use secrets or os.urandom? They are slower than the random module, and they are, of course not deterministic. If you want to be able to reproduce the same sequence of random numbers - for example in a simulation, game, or for testing, you cannot do that with os.urandom (the higher level secrets and random.SystemRandom).

Whether or not the Mersenne Twister should be used as the default general purpose PRNG in so many languages and compilers is called into question in this review article. The author demonstrates a number of statistical tests which this family of algorithms fail, and suggests alternatives. To quote from the conclusion:

The current, dangerous ubiquity of the Mersenne Twister as basic PRNG in many environments is a historical artifact that we, as a community, should take care of.

Something to keep in mind next time you write you own programming language or compiler! Or really, when relying on a PRNG for anything important!