Grid of Random Integers 3

Inspired by a recent query at one of the Excel forums I occasionally visit, I would like to share a formula-based solution for the task of generating an nxn grid of random integers, where each of those integers is unique within that range.

For example, for the case of n=10, we might have, in A1:J10:

Grid of Random Integers

where I have formatted the cells in this range as custom type: 00 (applying a TEXT function to the formula would complicate matters, in the sense that this would interfere with the functioning of our FREQUENCY construction).

The array formula** in A1 is:

=SMALL(IF(FREQUENCY(($A2:$J$11,B1:$K1),ROW(INDIRECT("1:99"))-1)=0,ROW(INDIRECT("1:100"))-1),RANDBETWEEN(1,100-COUNTA($A2:$J$11,B1:$K1)))

where, in general, should we wish to place this matrix elsewhere within the worksheet, for the formula in the top-leftmost cell of the output the two ranges referenced in the union should consist of:

1) The 10×10 range beginning in the cell immediately below that top-leftmost cell and extending down and to the right

2) The 1×10 single-row array of cells immediately to the right of that top-leftmost cell

both with the relative/absolute referencing as given here.

How does it work?

The actual generating of the values is not so difficult; indeed, many readers may well have solved this sort of problem for a list of returns in a single row or column. What makes matters slightly more problematic is that the range to be queried for each cell is not some such linear range immediately preceding the cell in question (either above or to the left of it, for example), but a two-dimensional array.

We also need to be wary of circular references, which means that we cannot simply reference the range A1:J10 within some construction which seeks to determine which values have already been entered in other cells. Indeed, it would be ideal here if Microsoft had introduced the ability to reference contiguous ranges, though with an additional parameter perhaps which allowed for the exclusion of one or more cells from that range, something along the lines of F($A$1:$J$10,A1), which would operate over all cells within the range A1:J10, excluding A1.

Unfortunately no such operation exists, and so we must effectively set our range as being, not not which takes a contiguous range and excludes certain cells from it, but one which comprises a union of one or more ranges.

However, if we are going to be working with range unions, then we need to be aware of which functions we have at our disposal which can operate over such ranges, which is by no means all.

In fact, were we in fact satisfied with having this list of random integers in a single column (or row), the most straightforward construction would be one which uses COUNTIF. Unfortunately, this is one example of a function which cannot accept range unions for its first argument. Hence we must look to other functions when dealing with two-dimensional ranges, as here.

Fortunately, there exists the extremely useful FREQUENCY, which, appropriately constructed, can not only reproduce the results usually obtained via COUNTIF, but which can also operate over references comprising a union of several ranges.

Let’s take one of the formulas at random and have a look at how this construction operates. Taking the formula in C8, which would be:

=SMALL(IF(FREQUENCY(($A9:$J$11,D8:$K8),ROW(INDIRECT("1:99"))-1)=0,ROW(INDIRECT("1:100"))-1),RANDBETWEEN(1,100-COUNTA($A9:$J$11,D8:$K8)))

You can see the two ranges being referenced ($A9:$J$11 and D8:$K8) in the screenshot below:

Grid of Random Integers 2

(In case readers are wondering why those references extend beyond the required range (A1:J10), they might like to try amending the ranges in the original formula (i.e. for A1) from $A2:$J$11 and B1:$K1 to $A2:$J$10 and B1:$J1, then copying this formula across to somewhere in column J or row 10.)

This construction:

FREQUENCY(($A9:$J$11,D8:$K8),ROW(INDIRECT("1:99"))-1)

is, as I mentioned, a useful alternative to COUNTIF, in the sense that it may be used to return an array of the counts of certain values within a range. What’s more, the range over which we are performing these counts is not a single, contiguous range, but a union of two such ranges. Readers who are not familiar with such ranges should note the additional parentheses which form the necessary syntax here.

One of FREQUENCY’s perhaps (in my opinion) less-desirable features, i.e. that the array returned consists of one more element than that passed to it, must also be accounted for here. Fortunately, in this case this does not present too much of an obstacle: we simply pass an array of integers from 0 to 98, not 99, knowing that a count for that latter value will be returned by default as FREQUENCY’s final, additional array element.

Hence the above resolves to:

{0;0;0;0;0;1;0;1;0;0;0;1;0;1;0;0;0;0;0;0;1;0;1;1;0;0;1;0;0;0;0;0;0;0;0;0;1;1;1;1;0;0;0;0;1;1;0;0;0;0;0;0;0;0;0;0;0;0;1;0;0;0;1;0;0;1;0;0;0;0;0;1;1;0;0;0;0;0;1;0;0;0;0;0;0;0;0;1;0;1;1;0;0;0;1;1;1;0;0;1}

(The first two occurrences of a 1, for example, here corresponding to the fact that both “05” (cell J8) and “07” (C9) are present within at least one of our two ranges being passed.)

Clearly we are interested in the zeroes from this array, and so we set the above to zero in an IF statement, the value_if_true clause being simply an array of integers from 0 to 99, so that:

IF(FREQUENCY(($A9:$J$11,D8:$K8),ROW(INDIRECT("1:99"))-1)=0,ROW(INDIRECT("1:100"))-1)

which is:

IF({0;0;0;0;0;1;0;1;0;0;0;1;0;1;0;0;0;0;0;0;1;0;1;1;0;0;1;0;0;0;0;0;0;0;0;0;1;1;1;1;0;0;0;0;1;1;0;0;0;0;0;0;0;0;0;0;0;0;1;0;0;0;1;0;0;1;0;0;0;0;0;1;1;0;0;0;0;0;1;0;0;0;0;0;0;0;0;1;0;1;1;0;0;0;1;1;1;0;0;1}=0,ROW(INDIRECT("1:100"))-1)

i.e.:

IF({0;0;0;0;0;1;0;1;0;0;0;1;0;1;0;0;0;0;0;0;1;0;1;1;0;0;1;0;0;0;0;0;0;0;0;0;1;1;1;1;0;0;0;0;1;1;0;0;0;0;0;0;0;0;0;0;0;0;1;0;0;0;1;0;0;1;0;0;0;0;0;1;1;0;0;0;0;0;1;0;0;0;0;0;0;0;0;1;0;1;1;0;0;0;1;1;1;0;0;1}=0,{0;1;2;3;4;5;6;7;8;9;10;11;12;13;14;15;16;17;18;19;20;21;22;23;24;25;26;27;28;29;30;31;32;33;34;35;36;37;38;39;40;41;42;43;44;45;46;47;48;49;50;51;52;53;54;55;56;57;58;59;60;61;62;63;64;65;66;67;68;69;70;71;72;73;74;75;76;77;78;79;80;81;82;83;84;85;86;87;88;89;90;91;92;93;94;95;96;97;98;99})

i.e.:

IF({TRUE;TRUE;TRUE;TRUE;TRUE;FALSE;TRUE;FALSE;TRUE;TRUE;TRUE;FALSE;TRUE;FALSE;TRUE;TRUE;TRUE;TRUE;TRUE;TRUE;FALSE;TRUE;FALSE;FALSE;TRUE;TRUE;FALSE;TRUE;TRUE;TRUE;TRUE;TRUE;TRUE;TRUE;TRUE;TRUE;FALSE;FALSE;FALSE;FALSE;TRUE;TRUE;TRUE;TRUE;FALSE;FALSE;TRUE;TRUE;TRUE;TRUE;TRUE;TRUE;TRUE;TRUE;TRUE;TRUE;TRUE;TRUE;FALSE;TRUE;TRUE;TRUE;FALSE;TRUE;TRUE;FALSE;TRUE;TRUE;TRUE;TRUE;TRUE;FALSE;FALSE;TRUE;TRUE;TRUE;TRUE;TRUE;FALSE;TRUE;TRUE;TRUE;TRUE;TRUE;TRUE;TRUE;TRUE;FALSE;TRUE;FALSE;FALSE;TRUE;TRUE;TRUE;FALSE;FALSE;FALSE;TRUE;TRUE;FALSE},{0;1;2;3;4;5;6;7;8;9;10;11;12;13;14;15;16;17;18;19;20;21;22;23;24;25;26;27;28;29;30;31;32;33;34;35;36;37;38;39;40;41;42;43;44;45;46;47;48;49;50;51;52;53;54;55;56;57;58;59;60;61;62;63;64;65;66;67;68;69;70;71;72;73;74;75;76;77;78;79;80;81;82;83;84;85;86;87;88;89;90;91;92;93;94;95;96;97;98;99})

and so finally:

{0;1;2;3;4;FALSE;6;FALSE;8;9;10;FALSE;12;FALSE;14;15;16;17;18;19;FALSE;21;FALSE;FALSE;24;25;FALSE;27;28;29;30;31;32;33;34;35;FALSE;FALSE;FALSE;FALSE;40;41;42;43;FALSE;FALSE;46;47;48;49;50;51;52;53;54;55;56;57;FALSE;59;60;61;FALSE;63;64;FALSE;66;67;68;69;70;FALSE;FALSE;73;74;75;76;77;FALSE;79;80;81;82;83;84;85;86;FALSE;88;FALSE;FALSE;91;92;93;FALSE;FALSE;FALSE;97;98;FALSE}

And so have succeeded in creating an array of all values which are not present in either the rows below or the cells to the right of the cell in question. All that remains is to choose one of the numerical values from this array at random.

One way in which we can achieve this is to pass the above array to SMALL with a suitably random k parameter. Since the number of numerical entries in our array at any given time is equal to 100 less the number of elements in the range union being referenced, we can use, for the top parameter in a RANDBETWEEN:

100-COUNTA($A9:$J$11,D8:$K8)

which takes advantage of the fact that COUNTA, like FREQUENCY, is a function which can operate over such unions.

Hence, in our example:

RANDBETWEEN(1,100-COUNTA($A9:$J$11,D8:$K8))

will be:

RANDBETWEEN(1,100-27)

the 27 coming from the 20 non-blank elements in $A9:$J$11 and the 7 in D8:$K8 (note that it is important that the extraneous cells surrounding our range of A1:J10 are deliberately left empty).

All in all, then, we have:

=SMALL({0;1;2;3;4;FALSE;6;FALSE;8;9;10;FALSE;12;FALSE;14;15;16;17;18;19;FALSE;21;FALSE;FALSE;24;25;FALSE;27;28;29;30;31;32;33;34;35;FALSE;FALSE;FALSE;FALSE;40;41;42;43;FALSE;FALSE;46;47;48;49;50;51;52;53;54;55;56;57;FALSE;59;60;61;FALSE;63;64;FALSE;66;67;68;69;70;FALSE;FALSE;73;74;75;76;77;FALSE;79;80;81;82;83;84;85;86;FALSE;88;FALSE;FALSE;91;92;93;FALSE;FALSE;FALSE;97;98;FALSE},RANDBETWEEN(1,73))

as required.

Another post to follow shortly!

3 comments

  1. @Jeff

    Not sure I understand. There is no data to post in this case. Just a single formula is entered in A1 and then copied across and down as indicated.

    Regards

  2. Aha. I was array entering the formula over the range, and getting a circ ref error. But I realise now this is not a multi-cell array formula. Oops!

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s