Deception when generating random numbers

February 24, 2020·
Victor van Pelt
The computer will draw a random number between A and B. 
Each number between A and B has an equal probability to be drawn.

Have you ever used these instructions to describe a number-generating process to participants in your experiments? If your answer is yes, you may have lied to your participants. The main cause is the code you used to generate the random numbers. Perhaps you don’t mind lying to participants, but many researchers who conduct accounting experiments do (see Libby and Salterio 2019). In this post, I explain why drawing random numbers for experiments may lead to deception, and what you can do about it.

Is drawing a number truly random and is each number equally likely to be drawn?

In many experiments, we want to generate random numbers and allocate them to participants. In theory, we can use a continuous uniform distribution on the range A to B (see Maas et al. 2012). Every number in the range is then equally likely to be drawn (more precisely, the probability density is constant across the range). We can draw from an open interval, which does not include the endpoints A and B, a closed interval, which does include the endpoints A and B, or a semi-open interval, which includes one endpoint but not the other. In Python, the language on which oTree is built, we use the following code to draw a number from a continuous uniform distribution on the range A to B.

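import numpy as np
A, B = 0, 100  # example endpoints; replace with your own range
random_number = np.random.uniform(A, B)  # samples from the semi-open interval [A, B)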

Although this code looks harmless at first sight, programming languages and computational methods, such as the one displayed above, merely approximate a random draw. Rather than using true random number generators (TRNGs), most rely on pseudo-random number generators (PRNGs). As the name suggests, PRNGs are not truly random: they use mathematical formulas or precalculated lists to produce sequences of numbers that only appear random.

[Image: die roll]
The basic difference between PRNGs and TRNGs is easy to understand in the context of rolling a die. Using PRNGs corresponds to someone rolling a die multiple times and writing down the results in a list. Whenever you ask for a die roll, you get the next number on the list. Effectively, the numbers drawn appear random, but they actually originate from a predetermined and planned list. Using TRNGs would correspond to someone actually physically rolling the die when you request the draw.
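The "predetermined list" in the die analogy can be made visible with a short sketch. It assumes NumPy's `default_rng` generator and uses example endpoints and a seed value (`SEED`) that are purely illustrative. Two generators started from the same seed hand out exactly the same numbers, much like two people reading from the same prewritten list of die rolls.

```python
import numpy as np

A, B = 0.0, 100.0   # example endpoints (assumed for illustration)
SEED = 12345        # hypothetical seed; any fixed integer behaves the same way

# Two generators started from the same seed...
first_run = np.random.default_rng(SEED).uniform(A, B, size=5)
second_run = np.random.default_rng(SEED).uniform(A, B, size=5)

# ...produce exactly the same "random" numbers, like reading from the same list.
same_sequence = bool(np.array_equal(first_run, second_run))
print(first_run)
print(second_run)
print("Identical sequences:", same_sequence)
```

If no seed is supplied, NumPy starts the generator from fresh entropy provided by the operating system, so the list differs between runs, but every draw is still computed by a formula from that starting point.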

Using PRNGs leads to two issues for researchers who do not want to lie to participants in their experiments. First, the number generation process is not truly random. Instead, the number drawn is the product of a mathematical formula or a precalculated list. Second, not every number on the specified range is equally likely to be drawn. Since the draws originate from a mathematical formula or a precalculated list, one number may have a higher likelihood of being drawn than another.
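To illustrate the second issue, here is a deliberately tiny toy generator. It is not the algorithm NumPy uses; the function name `toy_lcg` and its parameters are hypothetical and chosen only so that the effect is easy to see. The generator cycles through 16 internal states, and each state is mapped to a die face from 1 to 6. Because 16 is not a multiple of 6, some faces must come up more often than others over a full cycle.

```python
from collections import Counter


def toy_lcg(seed, modulus=16, multiplier=5, increment=3):
    """Yield an endless stream of states from a tiny linear congruential generator."""
    state = seed
    while True:
        state = (multiplier * state + increment) % modulus
        yield state


generator = toy_lcg(seed=0)

# Map each state to a die face 1..6 and tally one full cycle of 16 draws.
faces = [next(generator) % 6 + 1 for _ in range(16)]
counts = Counter(faces)
print(sorted(counts.items()))
# -> [(1, 3), (2, 3), (3, 3), (4, 3), (5, 2), (6, 2)]
```

Faces 1 to 4 appear three times each, while faces 5 and 6 appear only twice. Generators used in practice have vastly more internal states, so any imbalance of this kind is typically far too small to notice, but the draws are still not exactly equally likely.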

Help, I am deceiving my participants! What can I do?

At this point, you may be inclined to avoid using computers and go back to pencil-and-paper experiments. However, please keep using computers and PRNGs. A good deal of research has gone into pseudo-random number theory, and modern algorithms for generating pseudo-random numbers are so good that their output is practically indistinguishable from truly random numbers. PRNGs can also produce many numbers in a short time, and the numbers can be reproduced at a later stage if the starting point (the seed) of the sequence is known. To avoid deceiving participants, however, you may want to avoid telling them that the number generation process is random and that each number has an equal probability of being drawn. Rather than the instructions presented at the start of this post, it may be better to inform participants that the computer will draw a number between A and B.
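For readers who want to see how close to uniform a modern PRNG looks in practice, the following rough check may help. It is only a sketch: the endpoints, the seed, the sample size, and the choice of ten bins are assumptions made for illustration, and an informal check like this is no substitute for a proper statistical test.

```python
import numpy as np

A, B = 0.0, 100.0                    # example endpoints (assumed)
rng = np.random.default_rng(2020)    # hypothetical fixed seed so the check is repeatable
draws = rng.uniform(A, B, size=100_000)

# Split the range into 10 equal-width bins and compute the share of draws in each.
counts, _ = np.histogram(draws, bins=10, range=(A, B))
shares = counts / draws.size
max_deviation = float(np.abs(shares - 0.1).max())

print(shares.round(3))
print("Largest deviation from 0.10:", round(max_deviation, 4))
```

Each bin should contain close to 10 percent of the draws. The small remaining deviations are of the size one would also expect from genuinely random sampling, which is why the numbers look random even though a formula produced them.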

Author: Victor van Pelt, Assistant Professor of Accounting