Non-Array TRANSPOSE 4

We sometimes look for non-array (i.e. non-CSE) versions of constructions which would normally require array-entry. Our reasons for doing so may be varied:

1) We may feel that it improves spreadsheet performance (sometimes true, sometimes not)

2) We perhaps have a dislike for having to use the required keystroke combination necessary for committing array formulas

3) We may simply be interested from a theoretical point of view

This post would certainly belong more to the third category than to the other two. The necessary coercions to force TRANSPOSE to operate as normal without array-entry are convoluted and impractical. Though that is not to say that they are not without interest, and the use of these coercions is of course not exclusive to TRANSPOSE. In fact, they can be employed within a number of functions to generate a return which would otherwise require array-entry.

As an example, take the dataset below.

Non-Array TRANSPOSE 2

Supposing we wish to identify at which point the running total from A2 downwards equals or exceeds a certain total, let’s say 5. Those running totals are shown in column B, though let’s assume for whatever reason that we do not wish to – or cannot – use such a helper column in our solution. And supposing we wish to return the number of cells which it was necessary to sum in order to reach that total.

Clearly here the answer is 3, since 1+2+3=6 is the first occasion on which our target of 5 has been surpassed.

There are two standard techniques for obtaining this result. One is to use an array set-up with SUBTOTAL and OFFSET. The other, which I prefer and use here, is to employ an array construction using MMULT, i.e.:

=MATCH(TRUE,MMULT(0+(ROW(A2:A5)>=TRANSPOSE(ROW(A2:A5))),A2:A5)>5,0)

Many functions are such that we can generally substitute the need for CSE by introducing an INDEX function, appropriately placed within the formula (see here for a discussion on this technique).

Other functions, including TRANSPOSE, do not yield so easily, and so in order to generate a non-array version of these functions we must look to other means.

The non-array equivalent to the above formula is:

=MATCH(TRUE,MMULT(0+(ROW(A2:A5)>=TRANSPOSE(N(INDEX(ROW(A2:A5),,)))),A2:A5)>5,0)

i.e. we require the addition of not just INDEX, but also the N function.

And if the array being passed to TRANSPOSE is not some other function, but an actual range reference, then the necessary coercion for the non-array equivalent is even more involved (and certainly not to be recommended, apart from perhaps to those who have a morbid fear of array formulas and so seek to avoid them at all costs!). For example, given the following, practical example:

Non-Array TRANSPOSE 4

we may wish to take the sum of the “cross”-products of B1 with A2, C1 with A3, D1 with A4 and E1 with A5, i.e.:

=(B1*A2)+(C1*A3)+(D1*A4)+(E1*A5)

(The parentheses aren’t strictly necessary, though I put them in for clarity.)

The answer is 70, as you can easily verify. Our first attempt at forming an abbreviated version of this formula might naturally be to try SUMPRODUCT:

=SUMPRODUCT(B1:E1*A2:A5)

though the result, 260, is not what we would have hoped for. Since the two arrays in question – B1:E1 and A2:A5 – are orthogonal, i.e. one is a single-column vector, the other a single-row vector, Excel quite rightly forms a total of all “cross”-products for those values, equivalent to:

Non-Array TRANSPOSE 3

where each of the entries is the product of the values from B1:E1 and A2:A5 corresponding to its position within the grid. The grand total of all these entries then gives the 260 we see as a result.

Clearly, then, in order to get the more “simple” product of B1:E1 and A2:A5, we need to first ensure that these two arrays are of the same vector-type, i.e. either both are single-row vectors or both are single-column vectors.

As some of you may know, we do this by transposing one of the arrays (either will do, though not both, since then we will be back where we started). However:

=SUMPRODUCT(B1:E1*TRANSPOSE(A2:A5))

gives a result of 52, not the required 70. The reason being (and you can see this if you go through it step by step using the Evaluate Formula tool) that SUMPRODUCT alone is not capable of coercing TRANSPOSE to operate on all elements within the range A2:A5. Hence, the above resolves (unhelpfully) to:

=SUMPRODUCT(B1:E1*2)

since only the first element in the range A2:A5 is returned by TRANSPOSE, and so we obtain 52 as a result.

By entering the formula as an array formula (in which case we don’t need SUMPRODUCT; SUM will suffice), we do obtain the necessary coercion, and this time:

=SUM(B1:E1*TRANSPOSE(A2:A5))

does resolve to 70, as desired.

But can we obtain this result without using either the long-hand multiplication method (which might quickly become impractical if we had a much larger dataset) or an array formula?

The answer is yes, tnough, as I said, the necessary coercions are far from obvious, and arguably no less “impractical”. Nevertheless, the non-array:

=SUMPRODUCT(B1:E1,TRANSPOSE(INDEX(A2:A5,N(IF(1,INDEX(ROW(A2:A5)-MIN(ROW(A2:A5))+1,,))))))

will return the required value of 70.

I leave it to the reader to investigate the techniques outlined here, both in their use with TRANSPOSE and also applied to other functions. Avoiding array-entry is not to be recommended merely for the sake of avoiding array-entry, though it can certainly (as here) reveal some interesting theoretical points about formula syntax, and it should be borne in mind that the techniques outlined are not without any practical use.

4 comments

  1. Great article! i agree that although it may not necessarily be advisable to do so in practice, looking for non-array formula options is an interesting exercise. The N(Index(.)) combination for forcing array evaluation would definitely never have occured to me. This knowledge also carries over to VBA where you can use compact statements like [index(text(A1:A5,”0.00″),)].

    From what i can tell, If(), Index() and Choose() are the only functions that can return “non-array” evaluated results in the sense that both TYPE(IF(1,{1,2,3})) and TYPE(INDEX({1,2,3},)) return a value (1) if evaluated normally and an array (64) if entered with CSE. Passing this mixed-type result to N() or T() (as well as a few other former analysis toolpak functions like Dollarde() and Imreal()) has the effect of converting back to an array and when placed in a function argument like Index() this forces the result to be array-evaluated:

    {INDEX(A1:A5,1);…;INDEX(A1:A5,5)}.

    On the other hand array-returning functions like Mmult() and Trend() when placed inside a function argument of Index() will just return:

    INDEX(A1:A5,{1;…;5})

    which then only evaluates the first element.

    One last point relating to the final formula in the post. I think this may be able to be simplified a little by using one of these three options inside the Transpose() part:

    N(IF({1},A1:A5)) or MMULT(A1:A5,1) or INDEX(A1:A5-0,)

  2. Fantastic, Lori!

    These 3 alternatives which you give are far more concise (and elegant)! Again, the MMULT I somehow simply failed to consider here. But as for IF({1},…very interesting. Can you tell me a little more about how this syntax with {1} operates? What effect does placing the value within an array produce?

    I have also neglected to take advantage of the TYPE function, which you have clearly put to good use as part of your technical deconstructions.

    Many thanks.

  3. Yes, even such familiar functions as If() and Choose() can still throw up surprises. I wasn’t aware until today of the mixed value/array property they share with Index() and that to enforce array-evaluation, N() or T() is needed. They both also have two modes of evaluation dependent on the first argument:

    A) Value case. “Lazy” evaluation method is used, so if array-evaluated, the formula

    =IF(1,A1:A5,{1,2})

    returns a range A1:A5 without needing to check the last argument.

    B) Array case. All arguments are evaluated and, if array-evaluated, an array is returned that is the minimum number of rows and columns that can accomodate all the array arguments. This is also what happens with other functions if you think about it. The formula:

    =IF({1},A1:A5,{1,2})

    returns a 5×2 array consisting of the values in A1:A5. To see that the last argument is evaluated only in the array case, replace {1,2} by a vba function called test() and inserting a break point. Without the {1,2} last argument the second formula still returns an array as can be checked using ISREF() for instance.

    Both of these still require conversion to text or numbers for automatic array-evaluation however.The only method i could find to convert a range of text or numbers to an array of values, for use inside transpose() without CSE, is:

    =CELL("contents",OFFSET(A1:A5,N(INDEX(ROW(A1:A5)-MIN(ROW(A1:A5)),)),))
  4. PS. i just realised after posting that last formula that your one from the initial post was superior for returning mixed data:

    =INDEX(A1:A5,N(IF(1,INDEX(ROW(A1:A5)-MIN(ROW(A1:A5))+1,,))))

    So if you constructed another example requiring mixed data it probably would be the best option after all! i suppose the cell(“Contents”,…) construction could be put to use in other context eg combined with INDIRECT you could return mixed numbers/text data across sheets.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s