VLOOKUP Across Several Worksheets (2) – Multiple Search Criteria 9

I recently made the post here, in which I presented a solution to the problem of returning a value based upon matching a single criterion in a given column across multiple worksheets.

In this follow-up post I will look at the analogous case in which we are not matching a single criterion, but several. As mentioned in the first instalment, I will look at two solutions to this problem, one in which we make use of an extra “helper” column in each of the relevant worksheets, and one in which we do without such aids.

To recap, given the following three worksheets:

Sheet1:

Sheet1

Sheet2:

Sheet2

Sheet3:

Sheet3

we will generate the return from the Amount column from whichever happens to be the first worksheet (in the sense of progressing from leftmost to rightmost) where the following two criteria are met: the entry in the Colour column is “Red” and the entry in the Year column is 2012.

We will return this value into a fourth, Master sheet, as below (rows 7 and 11 are those of interest here):

Master

(You can download the workbook here.)

In the first case, then, let’s look at how we can extend the solution given in the previous post to work in this situation as well.

As given above, we first insert an additional column into each of the worksheets. This necessarily (if we want to continue to use our VLOOKUP set-up without too much undue reworking, that is) requires that those columns be inserted to the left of those containing our search criteria. Here I have made the sensible choice to use column A for this purpose.

For each of the entries in this column, across all three worksheets, we perform a concatenation of the two fields of interest, i.e. the entries in the corresponding row from columns B and C.

Here I have used a pipe (|) as the separator between those entries, though obviously any sensible choice will here suffice.

With these helper columns in place, the required solution (in cell C7) is the following array formula:

=VLOOKUP(A7&"|"&B7,INDIRECT("'"&INDEX(Sheets,MATCH(TRUE,COUNTIF(INDIRECT("'"&Sheets&"'!A:A"),A7&"|"&B7)>0,0))&"'!A1:D10"),4,0)

where, just as previously, Sheets is the Defined Name:

={"Sheet1","Sheet2","Sheet3"}

In essence, apart from the concatenations, there is very little difference between this solution and that given for the case involving a single criterion. Hence I will leave the deconstruction to readers to perform themselves, if interested, using the previous post as reference.

What I would now like to consider is the case where we obtain the same result, though without the use of any helper columns. And here things are necessarily a touch more complex.

We first go to Name Manager and, making sure that the active cell in the worksheet is somewhere in row 11, define:

Arry1 as:

=MATCH(TRUE,COUNTIFS(INDIRECT("'"&Sheets&"'!B:B"),$A11,INDIRECT("'"&Sheets&"'!C:C"),$B11)>0,0)

and Arry2 as:

=ROW(INDIRECT("1:10"))-1

The required array formula in cell C11 is then:

=INDEX(INDIRECT("'"&INDEX(Sheets,Arry1)&"'!D1:D10"),MATCH(1,(T(OFFSET(INDIRECT("'"&INDEX(Sheets,Arry1)&"'!B1"),Arry2,,,))=$A11)*(N(OFFSET(INDIRECT("'"&INDEX(Sheets,Arry1)&"'!C1"),Arry2,,,))=$B11),0))

How does it work?

Let’s first look at Arry1, then, which is:

=MATCH(TRUE,COUNTIFS(INDIRECT("'"&Sheets&"'!B:B"),$A11,INDIRECT("'"&Sheets&"'!C:C"),$B11)>0,0)

This is a near-identical construction to that used in the solution given above using helper columns. The only difference is that, whereas in that solution we were using a construction involving COUNTIF, here we are using COUNTIFS.

Of course, this makes perfect sense: again, we must first determine in which of our three sheets we are going to be performing the search, though this time we need to base that on not just one, but two criteria. Hence COUNTIFS.

And again, as this construction is so similar to that given earlier, I will leave it to readers to dissect it as they wish. Suffice to say that the above will resolve to:

=MATCH(TRUE,COUNTIFS(INDIRECT({"'Sheet1'!B:B","'Sheet2'!B:B","'Sheet3'!B:B"}),"Red",INDIRECT({"'Sheet1'!C:C","'Sheet2'!C:C","'Sheet3'!C:C"}),2012)>0,0)

which, resolving the COUNTIFS over this array of three range references, is:

=MATCH(TRUE,{0,0,1}>0,0)

which is clearly 3, thus indicating that we will be performing our search in the third of our sheets as listed, i.e. Sheet3.

Hence, within our main formula, this part:

INDIRECT("'"&INDEX(Sheets,Arry1)&"'!D1:D10")

will resolve to:

INDIRECT("'"&INDEX(Sheets,3)&"'!D1:D10")

which is, expanding our Defined Name Sheets:

INDIRECT("'"&INDEX({"Sheet1","Sheet2","Sheet3"},3)&"'!D1:D10")

i.e.:

INDIRECT("'"&"Sheet3"&"'!D1:D10")

i.e.:

INDIRECT("'Sheet3'!D1:D10")

and so finally a reference to the range:

Sheet3!D1:D10

So we have generated the necessary range to pass to INDEX as its array parameter, meaning that our construction will now look like:

=INDEX(Sheet3!D1:D10,MATCH(1,(T(OFFSET(INDIRECT("'"&INDEX(Sheets,Arry1)&"'!B1"),Arry2,,,))=$A11)*(N(OFFSET(INDIRECT("'"&INDEX(Sheets,Arry1)&"'!C1"),Arry2,,,))=$B11),0))

By a similar argument, we can see that this part:

INDIRECT("'"&INDEX(Sheets,Arry1)&"'!B1")

will resolve to the reference:

Sheet3!B1

And, similarly, that this part:

INDIRECT("'"&INDEX(Sheets,Arry1)&"'!C1")

will resolve to the reference:

Sheet3!C1

After which our main construction is now:

=INDEX(Sheet3!D1:D10,MATCH(1,(T(OFFSET(Sheet3!B1,Arry2,,,))=$A11)*(N(OFFSET(Sheet3!C1,Arry2,,,))=$B11),0))

Since the two conditional constructions here, i.e.:

T(OFFSET(Sheet3!B1,Arry2,,,))=$A11

and:

N(OFFSET(Sheet3!C1,Arry2,,,))=$B11

are practically identical (I will explain the reason for the use of T in the first and N in the second shortly, as it is a small yet crucial difference), the deconstruction of one will be equally applicable to the other.

So let’s dissect the first of those, then.

Recall that we defined Arry2 as:

=ROW(INDIRECT("1:10"))-1

Since the range over which we will be performing our search extends from row 1 to a maximum of row 10 over our three worksheets, it makes sense to perform the search over that range. Of course, often we may be in a position in which the size of our dataset (in any or all of the worksheets in question) is not fixed but dynamic, and therefore subject to expansion/contraction in the number of rows containing data.

If this is the case then it would be appropriate to choose a suitably high upper bound for the number of rows over which to perform the search, an upper bound which is guaranteed to always be sufficient, though not necessarily so large as to have an unnecessarily detrimental impact upon workbook performance (recall that array formulas calculate over all cells within the range passed to them, whether technically beyond the “last-used” cell in that range or not – so using entire column references is certainly not to be recommended).

The important thing to note is that, whatever the choice of upper bound on the row number, it should equally be applied to Arry2 and to the end reference of the part !D1:D10 in our main construction (which should be one less than the value we choose for our upper bound in Arry2).

The above resolves to:

{1;2;3;4;5;6;7;8;9;10}-1

i.e.:

{0;1;2;3;4;5;6;7;8;9}

Hence we will be passing this range as the rows parameter to OFFSET, with Sheet3!B1 as the reference, which of course simply means that we will be generating references to each of:

Sheet3!B1
Sheet3!B2
Sheet3!B3

Sheet3!B10

Hence:

T(OFFSET(Sheet3!B1,Arry2,,,))=$A11

which is now:

T(OFFSET(Sheet3!B1,{0,1,2,3,4,5,6,7,8,9},,,))=$A11

becomes:

T({Sheet3!B1,Sheet3!B2,Sheet3!B3,Sheet3!B4,Sheet3!B5,Sheet3!B6,Sheet3!B7,Sheet3!B8,Sheet3!B9,Sheet3!B10})=$A11

Note that I have taken some liberties with my “evaluation” of OFFSET here. As anyone who is used to deconstructing formulas using the Evaluate Formula tool (which should be everyone!) will know, OFFSET does not necessarily “evaluate” such references as I give here. In fact, you are almost certain to see a string of “intermediate” #VALUE! entries in such cases.

The point being that, until it becomes clear which type of return is required of this function, OFFSET will not declare its hand. I say “which type” since this function is equally capable of returning a cell reference as it is of returning an actual cell value.

I haven’t got the time or space to go into a detailed technical explanation of this dual property of OFFSET. Suffice to say that this explains the inclusion of the additional function – T or N – which serves to “dereference” the array resulting from OFFSET, effectively instructing Excel to interpret the entries within that array as cell values, not references.

However, the choice of T is not purely arbitrary. Unfortunately, there is no generalized DEREF() function in Excel – we must make a choice here, and of course that choice will have consequences, since both T and N possess additional properties on top of this ability to “dereference”.

Since our entries in column B here consist of text entries, then T is the correct choice. The above will thus resolve to:

{"Colour","Red","Blue","Blue","Red","Violet","Blue","Yellow","Green","Red"}=$A11

which, inserting our value from A11, is:

{"Colour","Red","Blue","Blue","Red","Violet","Blue","Yellow","Green","Red"}="Red"

i.e.:

{FALSE,TRUE,FALSE,FALSE,TRUE,FALSE,FALSE,FALSE,FALSE,TRUE}

Note that if we had instead here used N, not T, hoping to employ that function as a general “dereferencer”, we would be disappointed, since:

N(OFFSET(Sheet3!B1,Arry2,,,))

would become:

N({"Colour","Red","Blue","Blue","Red","Violet","Blue","Yellow","Green","Red"})

which is, unfortunately for us:

{0,0,0,0,0,0,0,0,0,0}

Note also that we cannot here apply the OFFSET function without any “dereferencing” function. Attempting:

OFFSET(Sheet3!B1,Arry2,,,)

resolves to:

{#VALUE!,#VALUE!,#VALUE!,#VALUE!,#VALUE!,#VALUE!,#VALUE!,#VALUE!,#VALUE!,#VALUE!}

(Warning: “evaluation” via repeated pressing of F9 within the formula bar will “correctly” resolve this construction to the required array, though this just serves to highlight the dangers of “evaluating” formulas using this method, which can, as here, lead to erroneous evaluations. Readers are advised to be aware of the limitations of and potential pitfalls inherent in using this method to deconstruct formulas. The Evaluate Formula tool is by far the “safer” option.)

After all that, then, we should be happy that, for the identical construction operating over column C of our worksheets, we should use N, given that the entries in those columns are numeric. Hence:

N(OFFSET(Sheet3!C1,Arry2,,,))=$B11

by a logic similar to that outlined above, will resolve to:

{0,2010,2010,2012,2012,2012,2012,2011,2014,2011}=2012

which is finally:

{FALSE,FALSE,FALSE,TRUE,TRUE,TRUE,TRUE,FALSE,FALSE,FALSE}

And we are now in a position to insert these results back into our construction, so that:

=INDEX(Sheet3!D1:D10,MATCH(1,(T(OFFSET(Sheet3!B1,Arry2,,,))=$A11)*(N(OFFSET(Sheet3!C1,Arry2,,,))=$B11),0))

is now:

=INDEX(Sheet3!D1:D10,MATCH(1,({FALSE,TRUE,FALSE,FALSE,TRUE,FALSE,FALSE,FALSE,FALSE,TRUE})*({FALSE,FALSE,FALSE,TRUE,TRUE,TRUE,TRUE,FALSE,FALSE,FALSE}),0))

which resolves to:

=INDEX(Sheet3!D1:D10,MATCH(1,{0,0,0,0,1,0,0,0,0,0},0))

which is:

=INDEX(Sheet3!D1:D10,5)

and so finally 32, as desired.

9 comments

  1. Great use of the COUNTIFS function and explanation of the formula processing. Just one small clarification to a point that is made further down the post.

    It is correctly pointed out that “Unfortunately, there is no generalized DEREF() function” – at least as a worksheet function (there is such an xlm macro function that can be used in a defined name.) That said, as was demonstrated in another post, with a little extra effort the CELL function could be utilised when the type of data is not known in advance. For example:

    =INDEX(CELL("contents",OFFSET(Sheet1!B1,N(IF(1,{0;1;2})),)),)

    can return any data in the range B1:B3 eg: {“Blue”;2012;TRUE}. Or for data on separate sheets:

    =INDEX(CELL("contents",INDIRECT(T(IF(1,{"Sheet1!A1";"Sheet2!B2";"Sheet3!C3"})))),)
  2. Yes, of course. Excellent – thanks for pointing that out. You don’t seem to suffer from the same memory relapses I have re functions such as TYPE and CELL, which are clearly proving their worth here.

    This does indeed seem to solve the issue of a generalized dereferencer – in fact, it may well be worth writing a post on that subject with this in mind at some point soon.

    Many thanks again.

  3. I’m far from immune to memory lapses – they happen regularly! Also, the other post referred to the CELL function in a different context and with different input values, so no real need to worry about amnesia (yet!)

    Anyway, i’m sure i’m among many that have learned a number of valuable tricks here so thanks for taking the time to share these and also for the cordial responses.

  4. You’re welcome. And the learning is mutual, so thank you too for all that you have so far contributed – it’s much appreciated.

  5. @ashish

    Thanks very much, but is the content of that link relevant to performing VLOOKUP over multiple worksheets? It does not appear to be so.

    Regards

  6. I need help on this

    I have one sheet call 60% on this sheet when I type date and name of employee they will show the value on sheet(1) to sheet(60%).

    sheet(60%) date = B5:B11 name of employee = C3:Ak3
    sheet(1) date = A2 name of employee = B6:B10 value(service)=AP and another value(tips) = AQ

    I appreciate your help!.

  7. @chanh tran

    Sorry, but I don’t quite understand what you require.

    Perhaps if post me a link to an actual workbook then I might be able to help.

    Regards

  8. Pingback: #Excel Super Links #39 – shared by David Hager | Excel For You

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s