Advanced Formula Challenge #11: Results and Discussion 11

Last week I set readers the challenge which can be found here.

Good results for this one: six answers received, six correct answers received – from Oscar, Daniel, diondan1, Bill, Ikkeman and Calvin. Plus one (unverifiable, though no doubt correct!) Google Sheets solution from Isai, as usual. 🙂

So congratulations to all of the above!

The majority of those solutions adopted a strategy of comparing the characters from two sets of arrays derived using MID over an array of start_num parameters, though a couple of solvers (Bill and Calvin) decided to first derive the ASCII codes for these characters and instead use these as the basis for the comparison.

The solutions of Ikkeman and Calvin differed from the rest in that they chose not to use constructions involving INDIRECT in order to generate the array of values to pass to MID, instead using INDEX and OFFSET respectively for this purpose.

In fact, the INDEX construction employed by Ikkeman is one which I occasionally use myself, though, after some further reflection, perhaps not often enough. Despite being slightly less intuitive than the standard INDIRECT set-up, the fact that it is only partially volatile (“volatile at workbook open only” is the official definition) should be enough to convince us to prefer it over that involving INDIRECT (or OFFSET), both of which are fully volatile. In fact, from now on I have decided that I will make it my preferred choice for generating arrays of integers. So thanks for convincing me, Ikkeman!

Of all the solutions offered, Bill’s was the only one which made an attempt to perform the comparisons between the two arrays within a single construction, adeptly manipulating the resulting array to then extract the desired results.

In fact, this was perhaps the main point of this challenge, and I was half hoping that this technique might have been employed by more solvers. Indeed, not to take anything away from anyone, but this particular challenge is perhaps not the most difficult of the series thus far: what it does offer, however, is the opportunity to combine the querying of two arrays, as Bill did, into a single construction.

I will explain what I mean by this by presenting my solution to the problem, which differs from Bill’s in only a few minor details (and in which I have now substituted the ROW/INDEX construction of Ikkeman for my previous ROW/INDIRECT one).

And that solution is, in B2, non-array:


How does it work?

First of all, let’s look at this alternative construction for generating our array of parameters to pass to MID, especially since it is one with which some readers may be unfamiliar, i.e.:


This construction returns precisely the same array as the more standard:


though, as mentioned, does not possess as much volatility as the latter.

For our string in A2 of “AABBCCCC”, which is clearly of length 8, we have:


which is:


(In fact, the Evaluate Formula tool prefers to return these two range references as absolute ones, i.e. $A$1 and $A$7, though this is not important.)



Now, what most solvers did here was to first pass this array to MID for its start_num parameter, so that:


would give:


and then perform a similar calculation but this time with an array of parameters of:


i.e. of values each one greater than those in the previous version, giving this time:




and then finally performing a comparison of these two arrays.

And absolutely nothing at all wrong with that: a good, logical solution to the problem.

However, it also happens that, rather than generating two separate arrays in this way, we can in fact generate them in a single step. We simply need to ensure that we will be able to then perform our required comparison – which is quite straightforward given two separate arrays – on the resulting, single array.

And this single-step “shortcut” is achieved by taking our array, i.e.:


and performing the addition of this array with the following:


Note that it is imperative that this second array be orthogonal to the first. Since the first is a single-column array, we thus need to ensure that the second is a single-row array.

This being the case, we see that:




in which the importance of the two arrays being orthogonal is here demonstrated, since this array consists of all results of adding each of the two entries in the second array, i.e. 0 and 1, to, in turn, each of the elements in the first.

Just to demonstrate what would happen if we had instead used:


for our second array, then, since this array is, just like our first, a single-column array, we would have:


but this is:


Not at all what we want!

In case readers are wondering why we obtain the above array, suffice to say that, if two arrays are of the same vector-type, i.e. both are single-row arrays or both are single-column arrays, then elements in one are “paired” with the corresponding (i.e. from the same position) elements in the other, and the operation (here addition) performed between those pairs.

However, since here our two arrays are of differing dimensions – the first containing 7 elements, the second only 2 – Excel artificially expands the smaller of the two so that it is of an equal dimension to the first, thus ensuring that it can perform the resulting addition legitimately.

Of course, it makes up those additional elements by filling the array with #N/As (what else should it do?), and so we end up performing, effectively:


which naturally results in the array given previously.

Apologies to my regular readers if they feel I am somewhat “banging the drum” on this issue of orthogonality with respect to operations performed between two arrays, though I really cannot emphasize enough the importance of understanding this concept, which, once mastered, opens up whole new vistas in what can be achieved with regards to array operations in Excel.

Anyway, to return to our construction, which is now:


we see that this part:


will give us our single-step array, viz:


which contains precisely the same elements as the two, separate arrays derived previously, though combined into a single 7-row-by-2-column array.

We now take the ASCII codes of each of these characters, such that:




Now, as I mentioned, the only slight drawback to having combined these two steps into a single one is that we now have to be sure that we can manipulate the above array accordingly, equivalent to performing the subtraction that we hinted at earlier between the elements in one array from those in the other.

Fortunately this is not too difficult, provided we are familiar with the workings of that most wonderful of functions, MMULT, since:


resolves as:


which is precisely the same array as we would have obtained had we performed a comparison between our two separate arrays (as did most solvers) similar to:


and then converted the resulting Boolean TRUE/FALSE returns to their numerical equivalents.

Finally, then, we simply query this array as to which of its elements are greater than or equal to 0 (since this in effect means that the difference between the ASCII code of one element and that of the previous element is greater than or equal to 0, i.e. the character equivalent to the former must be the same as or else later in the alphabet than that equivalent to the latter), so that:




i.e. TRUE, as required.

Readers should of course not feel that my use of the function GESTEP was in anyway necessary here! Indeed, a construction involving a simple comparator (>=0) would equally suffice (and be by far the more common choice!). I simply felt like giving this function an outing, given the rarity of that event. 🙂

Thanks again to all who contributed. Another challenge to follow shortly. Watch this space!


  1. @XLarium

    Good question! And I’ve no idea!! 🙂

    Perhaps I’ll run some tests in an attempt to find out. Maybe we’ll learn that we should all be using e.g. QUOTIENT in our formulas as well? 🙂


  2. I think you can omit one of the INDEX functions from your solution.

  3. @Sam Mathai Chacko

    Hi and welcome to the site! And many thanks for your contribution.

    This is possible, but it misses the point of including the INDEX function here, and that it is to lend rigour to the construction; in particular, with regards to row insertions within the worksheet.

    What happens with your set-up if, for example, a new row is inserted above row 1?

    Granted, such cases may be rare, but, given that it “costs” a mere additional INDEX function, I think that the slight extra calculation time is here more than justified.


  4. Hello –

    Please pardon me, if i misunderstood the above challenge..

    While going through the examples given above, the formula quoted above works perfect..however, I just tried to tweak the content of cell A3 from “DDDDEEFF” to “ADDDEEFP”, and the above formula was showing the result as TRUE but should have been otherwise..

    Please correct me, if I am missing something here..I then tried and came up with the following:


    This is hefty though..and I am absolutely sure that there will be a shorter way out in your kitty..

  5. Hi Asheesh.

    Can you clarify? Why do you say that the result for “ADDDEEFP” should be FALSE? Aren’t the letters in that string in ascending alphabetical order?


  6. Hi XOR LX – Yes, Indeed it is in ascending alphabetical order, however, my assumption of the challenge was that there should be no gap between 2 letters…and I guess that is where I went wrong..

  7. By Gap I mean, missing letters…eg, the above string says “ADDDEEFP”, and there are 2 letters missing between A & D i.e. B & C..

    Similarly, there are 9 letters missing between F & P..

    So according to my logic: It should say TRUE only if the above string read like ABCDDDEEFFGHIJKLMNOP

  8. Ah, I see!

    I’ve just looked back at the examples I gave in the original post and I can see now why you might have made that assumption. Apologies – I just chose those letters at random quickly and probably should’ve spent some more time on it so as to avoid the potential confusion.

    On a side note, I always try to explain precisely what is required in the text as well, so if something isn’t mentioned explicitly there then it can be assumed that it isn’t a factor.

    Re your solution, seems to work great! Nice work! I think we could also abbreviate it to e.g.:



  9. My bad, will be little more careful going forth..

    And thank you for the explanation..

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.