Coercing array returns from CSE-resistant formulas 9

We usually face no problems in cases where we wish to apply a formula to, not just one, but an array of values. And of course we do this by simply committing the formula as an array formula, i.e. with CSE.

However, not all formulas yield so easily, and some stubbornly resist any attempts at coercing an array of returns from them. Here I would like to discuss some techniques which, in addition to array-entry, can help coerce the desired result.

The principal method in such cases is to use a construction involving OFFSET, though a set-up using INDEX is equally viable; indeed, due to its non-volatility, perhaps even preferable. Some cases may require even more coercion than that, and others less. But the one thing they all share in common is that, on its own, array-entry just isn’t enough!

For example, given the data in A1:A5, the array formulas shown below do not give the expected results:

Pic3

Note that these examples have been constructed purely for the purpose of demonstrating the required techniques. I am well aware that there are better methods for summing A1:A5 than some of those given here!

In the first example, we wish to use a combination of INDIRECT and ADDRESS to sum the values in A1:A5. Of course, a non-array formula such as:

=INDIRECT(ADDRESS(1,1))

which resolves to:

=INDIRECT("$A$1")

i.e. 9.2, is quite straightforward, and so we might expect to be able to apply this construction, using CSE, to more than just a single cell.

However, the array-entered:

=SUM(INDIRECT(ADDRESS({1,2,3,4,5},1)))

does not resolve as we would hope. In fact, the above evaluates to:

=SUM(INDIRECT("$A$1"))

i.e. only the first element in our array is processed by the ADDRESS function, just as if we had entered it as a non-array formula.

(Here I would like to take the opportunity to warn readers of the dangers of “evaluating” formulas via repeated pressing of F9 in the formula bar. Here, going into the formula bar and highlighting just the part:

ADDRESS({1,2,3,4,5},1)

and pressing F9 does resolve to an array of cell references, i.e.:

{"$A$1","$A$2","$A$3","$A$4","$A$5"}

though this in no way should be taken for meaning that the formula itself will be thus resolved. And this is the “danger” of using this method to evaluate the steps in formulas. Unless you are fully aware of the potential inaccuracies inherent in “evaluating” this way, I strongly recommend you instead use the Evaluate Formula tool.)

In this case the required coercion is not especially complex, nor does it require a set-up involving OFFSET or INDEX. In fact, all that is required is the inclusion of one little N function, i.e. (and, again, array-entered):

=SUM(N(INDIRECT(ADDRESS({1,2,3,4,5},1))))

whose resolution is now:

=SUM(N(INDIRECT({"$A$1","$A$2","$A$3","$A$4","$A$5"})))

which is:

=SUM(N({9.2,1.1,5.5,7.4,3.3})))

i.e.:

=SUM({9.2,1.1,5.5,7.4,3.3})

which is 26.5, as required.

The use of N in such constructions is not so much for its purpose as a function (which, broadly-speaking, is to convert non-numerics to numerics, where possible) but more for its secondary (and extremely useful) property of being able to induce other functions to operate over an array of values where they would otherwise not, as here with ADDRESS.

Of course, we should be aware that we also used N since our range of values consisted here of numerics. But what if we instead wanted to generate an array using this set-up with INDIRECT and ADDRESS, but where the values in our range were not numerics, but text, and where the resulting array was not to be summed (obviously) but passed to some other function?

Fortunately, the god of Excel provided N with a sibling, the equally shortly-named T.

And, just like N, T also possesses this secondary property of being able to coerce array returns. Though, whereas N operates on numerics, T operates on text values.

For example, if instead our values in A1:A5 were “A”, “B”, “C”, “D” and “E”, then, for example:

=LOOKUP(REPT("z",255),T(INDIRECT(ADDRESS({1,2,3,4,5},1))))

which, thanks to LOOKUP also having the property of coercing array returns, would not even require CSE, and would happily operate over an array of five values, viz:

=LOOKUP(REPT("z",255),T(INDIRECT({"$A$1","$A$2","$A$3","$A$4","$A$5"})))

which becomes:

=LOOKUP(REPT("z",255),T({"A","B","C","D","E"}))

i.e.:

=LOOKUP(REPT("z",255),{"A","B","C","D","E"})

which of course is “E”.

But array coercion using N or T does not work with all functions. Taking our next example with DEC2BIN, readers can verify that the array-entered:

=SUM(DEC2BIN(A1:A5))

resolves unhelpfully to:

=SUM(#VALUE!)

i.e. #VALUE!, which is evidently not what we were hoping for.

And the attempt at including an N function, i.e.:

=SUM(N(DEC2BIN(A1:A5)))

fares no better, since here DEC2BIN resists N’s attempts at coercing an array of returns and instead continues to resolve as if the formula had been entered without CSE, i.e.:

=SUM(N(#VALUE!))

etc., etc.

In these cases, then, we need to seek alternative means by which to coerce our desired array-processing. And one method of doing this is to use OFFSET, as follows:

=SUM(0+DEC2BIN(OFFSET(A1,ROW(A1:A5)-MIN(ROW(A1:A5)),,,)))

(Of course, we could hard-code the rows parameter as {0,1,2,3,4} if we were certain that our range was fixed at A1:A5, though it’s always best to give generalized solutions, where appropriate.)

This time, we achieve our desired result, since the above resolves to:

=SUM(0+DEC2BIN(OFFSET(A1,{1;2;3;4;5}-1,,,)))

which is:

=SUM(0+DEC2BIN(OFFSET(A1,{0;1;2;3;4},,,)))

i.e.:

=SUM(0+DEC2BIN({9.2;1.1;5.5;7.4;3.3}))

and, although it is clear that we have succeeded in coercing an array of returns, for the sake of completeness I will give the remaining steps in the resolution of this construction:

=SUM(0+{"1001";"1";"101";"111";"11"})

which is:

=SUM({1001;1;101;111;11})

i.e. 1225, as desired. (Of course this construction is purely arbitrary and arguably meaningless: summing binary strings in such a way cannot possibly have a practical application! Still, the point is not the logical coherence of the set-up, but the example it offers us for demonstrating this technique.)

This technique with OFFSET is employable with a wide range of functions which initially offer resistance to array-entry. In our third example, with DOLLARDE, readers may wish to verify that the CSE formula:

=SUM(DOLLARDE(A1:A5,2))

again results in #VALUE!, though:

=SUM(DOLLARDE(OFFSET(A1,ROW(A1:A5)-MIN(ROW(A1:A5)),,,),2))

correctly processes the array of values and results in 32.5, as desired.

Likewise for our fourth example. Whilst the standard array-entered:

=IMSUM(COMPLEX(A1:A5,1))

again results in an error, the following:

=IMSUM(COMPLEX(OFFSET(A1,ROW(A1:A5)-MIN(ROW(A1:A5)),,,),1))

resolves as we would like to:

=IMSUM(COMPLEX(OFFSET(A1,{0;1;2;3;4},,,),1))

i.e.:

=IMSUM(COMPLEX({9.2;1.1;5.5;7.4;3.3},1))

i.e.:

=IMSUM({"9.2+i";"1.1+i";"5.5+i";"7.4+i";"3.3+i"})

which is 26.5+5i, as desired. (One for the pure mathematicians, that one!)

But this technique with OFFSET is not the only method which allows us to coerce these stubborn functions to yield an array of returns. Readers who have read my post here in which I outline a technique for coercing an array of returns from INDEX may have guessed that we can take advantage of that property here also.

As such, I will leave it to the reader to verify that the following alternative constructions give precisely the same results as the set-ups employing OFFSET:

=SUM(0+DEC2BIN(INDEX(A1:A5,N(IF(1,ROW(A1:A5)-MIN(ROW(A1:A5))+1)))))
=SUM(DOLLARDE(INDEX(A1:A5,N(IF(1,ROW(A1:A5)-MIN(ROW(A1:A5))+1))),2))
=IMSUM(COMPLEX(INDEX(A1:A5,N(IF(1,ROW(A1:A5)-MIN(ROW(A1:A5))+1))),1))

In fact, it could be argued that the absence of volatility makes this set-up with INDEX preferable to the OFFSET version, though one could also argue that the two extra function calls (N and IF) are not in INDEX’s favour. However, I would imagine that the former (volatility) is more of a detriment than the latter (minor increase in function calls), and so my personal preference would probably be with the INDEX versions. I leave it to readers to decide for themselves which set-up they prefer.

Let’s now take a look at our final example, then, which involves the little-used CELL function. Here we are attempting to use this function in a similar way to the INDIRECT/ADDRESS combination from our first example, i.e. using an info_type parameter of “address” to (hopefully) obtain an array of cell references to pass to SUM (after processing via INDIRECT, of course).

Naturally, the first attempt at CSE entry:

=SUM(INDIRECT(CELL("address",A1:A5)))

although this time not resulting in an error, nevertheless, just as in the INDIRECT/ADDRESS example, resolves to only:

=SUM(INDIRECT("$A$1"))

Perhaps surprisingly, however, our usual means of overcoming this issue, i.e.:

=SUM(INDIRECT(CELL("address",OFFSET(A1,ROW(A1:A5)-MIN(ROW(A1:A5)),,,))))

this time fares no better, since, despite the CSE entry, the OFFSET function here does not operate over all elements from the ROW construction. In fact, here, the part:

ROW(A1:A5)-MIN(ROW(A1:A5))

which obviously we would hope (and with CSE would almost certainly expect) to return:

{0;1;2;3;4}

does not in fact do so, and instead resolves to merely 0, i.e. the equivalent of:

ROW(A1)-MIN(ROW(A1:A5))

i.e. just the first element in the array ROW(A1:A5) is processed.

Strange. And not at all what we would expect from an array formula. And I have to confess that I’m not sure as to the precise reason why the CELL function exerts this property of “double-array-resistance”. So if anyone knows of a technical explanation, please let me know!

Suffice to say that the required coercion is now a little more involved, viz:

=SUM(INDIRECT(CELL("address",OFFSET(A1,N(INDEX(ROW(A1:A5)-MIN(ROW(A1:A5)),,)),,,))))

(Naturally here we are unable to use an equivalent INDEX construction since we are required to generate an array of cell references to pass to CELL.)

The above does now resolve to:

=SUM(INDIRECT(CELL("address",OFFSET(A1,N(INDEX({0;1;2;3;4},,)),,,))))

and the necessary “double-coercion” combination of N and INDEX gives firstly:

=SUM(INDIRECT(CELL("address",OFFSET(A1,N({0;1;2;3;4}),,,))))

and then:

=SUM(INDIRECT(CELL("address",OFFSET(A1,{0;1;2;3;4},,,))))

which is:

=SUM(INDIRECT({"$A$1";"$A$2";"$A$3";"$A$4";"$A$5"}))

and so on as in the first example involving the ADDRESS function.

Some interesting techniques to bear in mind, then. It’s worth remembering that, whatever the degree of coercion required, whether it necessarily involves one, two or even three additional functions, there’s a good chance that those coercing functions will come from the set of INDEX, N (or T) and OFFSET in some combination and with some particular syntax.

Until the next time!

9 comments

  1. Nice! I haven’t come across the N(IF(1,ROW())) construction before – this fixes a host of array evaluation issues. After reading through this, I have posted some updated formulas for returning unique arrays from ranges on an older formula challenge thread:
    http://forum.chandoo.org/threads/formula-challenge-011-distinct-duplicate-unique-array-from-text-and-numbers.11315/page-2

    One thing to beware of though – when using OFFSET and INDIRECT in conjunction with arrays, i believe it’s the prevailing wisdom that one should make sure to dereference formulas containing these with N() or T(). If you don’t do this instabilities and inconsistent results may occur. For example if i try changing the SUM to IRR in some of the formulas, i get a memory exception error. So to be safe always use SUM(N(…)) or similar.

  2. Ah! So you must be that very same “Lori” whom I referenced here!

    https://excelxor.com/2014/09/30/extracting-numbers-from-a-string-4-all-numbers-to-a-single-cell/

    In that case I’m honoured to have the input of someone of such obvious ingenuity and creativity here – welcome!

    I’m also glad that you’ve found something of interest and of practical use. Yes, this technique does have several important applications, particularly with regard to subjecting arrays to further functions where those arrays are not actual worksheet ranges, but e.g. the result of other subfunctions within the formula. In which case, as you point out in the link you provided, we are unable to use e.g. OFFSET.

    Perhaps its most important use is when employed in this way as a “re-dimensioner” of arrays, as many of my articles have demonstrated, e.g here:

    https://excelxor.com/2014/10/21/unique-alphabetical-list-from-several-columns/

    in which we simply transform a two-dimensional array within the formula itself using this method so that it can then be manipulated via functions which otherwise would not accept two-dimensional arrays.

    One thing which still alludes me is the ability to pass an array consisting of entire rows/columns of an array to a function, as in the SUBTOTAL vein, exploiting INDEX’s property of passing entire rows/columns when one parameter is zero. Unfortunately, neither of:

    =SUM(INDEX(A1:C10,0,N(IF(1,{1,2,3}))))

    and

    =SUM(INDEX(A1:C10,N(IF(1,{0,0,0})),N(IF(1,{1,2,3}))))

    are equivalent to:

    =SUM(SUBTOTAL(9,OFFSET(A1:A10,,{0,1,2},,)))

    I’ll keep trying!

    And thanks for your advice re dereferencing. This is not something I had considered, and will look into it now. Also for the link, an old one which I had not yet come across – fantastic use of MODE.MULT by the way!

    Thanks again and hope to see you around at some point in the near future, either here or elsewhere!

    Cheers

  3. Indeed i did post an NPV formula for extracting numbers from text on a forum some years back. Thanks for the detailed explanations and for drawing attention to that formula. I never really thought it would be used much but seems several people have picked up on it.

    You raise an issue with combining an array and a zero in the second and third arguments of the Index function. I don’t think there’s an easy way around that behavior, i posted something related here (also see linked post):

    http://stackoverflow.com/questions/17117624/excel-2003-2007-named-formula-appears-to-store-only-single-value-not-range-array?lq=1

    But now i find myself rethinking various preconceptions i had about formula evaluation. It seems that applying N(IF(1,.)) or N(INDEX(.)) to a function argument induces the formula evaluator to assign a single value type where it would otherwise assign an array type which in turn changes the return type from an array to a single value. This follows from applying to other functions that would normally return arrays such as Mmult, Trend, or Transpose.

    There also appear to be several other alternatives for the Dec2Bin, Complex and Dollarde examples in the post. Testing with –A1:A5 or IF({1},A1:A5) or MMULT(A1:A5,1) also seems to work, the latter not requiring CSE.

  4. Many thanks. I have used the MMULT construction myself before, though simply forgot about it here!

    Very interesting that a single mathematical operation can force the coercion (as well as -A1:A5, 0+A1:A5, for example, does the job), though N(A1:A5) will not, e.g. with DOLLARDE.

    I’m not quite at your level of understanding of the processes behind how formula evaluation operates, and some of your analysis over at stackoverflow is interesting and certainly new to me – perhaps gaining such a level of understanding might help me to understand constructions such as IF({1}, which is not at all obvious.

    Thanks again for your contributions.

    Regards

  5. I must admit that some of those links were not too clearly expressed. My understanding is not complete by any means but comes from writing udfs in the XLM/XLL frameworks where argument and return types are any combination of values (number, text, logical, error), references or arrays. So anyway here’s one more take (partly for my benefit)…

    Clearly, when a function argument is an array but it’s expecting a value the function is just applied to each value in the array: f({1,2,3}) = {f(1),f(2),f(3)}. Given this, we can divide into three cases depending on the output type of the function f():

    a) value: POWER({1,2,3},1) = {1,2,3}
    b) range: OFFSET(A1,{1,2,3},0) = {A2,A3,A4} ?
    c) array: INDEX({1,2;3,4},0,{1;2}) = {{1;3};{2;4}} = {1;3} or {1;2}?

    a) single value case – most functions. Examples 2-4 above show that coercion is needed for several functions to convert a range reference to an array.

    b) range case – Offset(), Indirect(). The formula evaluation tool won’t help much here but applying a function like Areas() to the output shows an array of ranges is returned. N() and T() can be used to coerce to values before summing as in Example 1.

    c) array case – Trend(), Linest(),… When coercing an array to a value Excel keeps only the first entry. This however gives two choices of array to return depending on order of coercion (inside arrays or outside array). If entered into one cell and evaluated, the first array is returned ({1;3}). But applying N(IF(1,.)) to the last argument (or entering into multiple cells) changes the order of evaluation and the first entry of each array is returned instead ({1;2})

    My guess is that Cell() is set to allow any return type including array (similar to some other XLM functions like Get.Cell or Get.Worbook) so falls into the last category (c). Perhaps this is why for example 5 a combination of (b) to return a range of references and the N(IF(1,)) [or equivalent N(INDEX(.,))] trick is needed to resolve to the required array. This double coercion would then induce the required output:

    CELL(“address”,{A1,A2,A3,A4,A5})={“$A$1″,”$A$2″,”$A$3″,”$A$4″,”$A$5”}

  6. Long time no chat folks, hope you’re all still around and doing well!

    Just wondering if you know a way of coercing the GCD function to return an array of results, I’ve tried all the methods above with no luck.

    Thanks,

    Alex

  7. Hi Alex,

    Long time indeed! Hopefully I can get some new posts out at some point to start the ball rolling again…

    What sort of array are you passing to GCD? Can you give a small example with expected results? My first thought is that GCD will only ever return a single result, so your question would appear to be akin to asking how you can get OR to return an array of values!

    Cheers

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s