Advanced Formula Challenge #7: Results and Discussion 1

Last week I set readers the challenge which can be found here.

This is a trickier problem than it at first appears, and indeed there are several pitfalls which prevent us from using more “standard” techniques to arrive at a solution.

Perhaps the two main (hidden) obstacles, which were not immediately obvious from the examples I gave, are, firstly, the fact that we are prevented from using a construction involving a SEARCH-approach (e.g. by locating occurrences of each substring of the four types *????*, †????*, *????† and †????†, as John Jairo V attempted), since this of course presumes that there is only one occurrence of each of those substring types within our string, a presumption which cannot be made.

Secondly, we have to be careful with attempted solutions which involve tests for numericalness of a string which do not test each of the individual characters within that string.

Although it is true that, as I said, there will only ever be one occurrence of a string of the type XabcdY, where a, b, c and d are numbers, this should not lead us to the false conclusion that testing just the first of these characters as to whether it is a number will be sufficient to conclude that all of b, c and d are also numbers.

As I pointed out in the article here, this approach is risky, and not guaranteed to work, one example string which will cause this approach to fail (at least with a regional language setting of English) being “3Jan”.

As such, we need to be a little more subtle in our approach, and some of the techniques we will use are not at all obvious.

My solution is as follows:

=TRIM(RIGHT(SUBSTITUTE(LEFT(A2,MATCH(6,MMULT(0+(IFERROR(ABS(10.5-1/(1/(CODE(MID(MID(SUBSTITUTE(A2,"†","*"),ROW(INDIRECT("1:"&LEN(A2)-5)),6),{1,2,3,4,5,6},1))-42))),-10^3)<{-999,5,5,5,5,-999}),{1;1;1;1;1;1}),0)-1)," ",REPT(" ",LEN(A2))),2*LEN(A2)))

How does it work?

Let’s first have a look at the main MID construction in there, which in turn is itself wrapped in a further MID function. That part is:

MID(SUBSTITUTE(A2,"†","*"),ROW(INDIRECT("1:"&LEN(A2)-5)),6)

Most of the regular readers of my posts will by now be familiar with ROW/INDIRECT constructions:

ROW(INDIRECT("1:"&LEN(A2)-5))

which, since the length of the string in A2 in this example is 50, resolves to:

ROW(INDIRECT("1:"&50-5))

which is:

ROW(INDIRECT("1:"&45))

i.e:

{1;2;3;4;5;6;7;8;9;10;11;12;13;14;15;16;17;18;19;20;21;22;23;24;25;26;27;28;29;30;31;32;33;34;35;36;37;38;39;40;41;42;43;44;45}

And so we have created an array of integers from 1 up to a value 5 less than the length of the string in A2, i.e. 45, to pass to MID as our start_num parameter. This of course makes perfect sense, since we are here interested in substrings consisting of six characters only (the last such string will occupy positions 45 to 50 inclusive).

Hence our MID construction now looks like:

MID(SUBSTITUTE(A2,"†","*"),{1;2;3;4;5;6;7;8;9;10;11;12;13;14;15;16;17;18;19;20;21;22;23;24;25;26;27;28;29;30;31;32;33;34;35;36;37;38;39;40;41;42;43;44;45},6)

Since we are going to be checking for strings beginning and ending in either an asterisk or an obelisk, it seems logical to make our task simpler by first substituting all cases of one for the other, thus reducing the complexity of the resulting checks we have to make.

Resolving the SUBSTITUTE function in the above gives:

MID("Alpha *Beta* *123* Gamma Delta *2014* Epsilon *12*",{1;2;3;4;5;6;7;8;9;10;11;12;13;14;15;16;17;18;19;20;21;22;23;24;25;26;27;28;29;30;31;32;33;34;35;36;37;38;39;40;41;42;43;44;45},6)

which finally resolves to the following array:

{"Alpha ";"lpha *";"pha *B";"ha *Be";"a *Bet";" *Beta";"*Beta*";"Beta* ";"eta* *";"ta* *1";"a* *12";"* *123";" *123*";"*123* ";"123* G";"23* Ga";"3* Gam";"* Gamm";" Gamma";"Gamma ";"amma D";"mma De";"ma Del";"a Delt";" Delta";"Delta ";"elta *";"lta *2";"ta *20";"a *201";" *2014";"*2014*";"2014* ";"014* E";"14* Ep";"4* Eps";"* Epsi";" Epsil";"Epsilo";"psilon";"silon ";"ilon *";"lon *1";"on *12";"n *12*"}

Now, to each of these substrings, we need to apply a further MID function so that we can then make appropriate tests to each of the six characters within those substrings.

Hence:

MID(MID(SUBSTITUTE(A2,"†","*"),ROW(INDIRECT("1:"&LEN(A2)-5)),6),{1,2,3,4,5,6},1)

is simply:

{"A","l","p","h","a"," ";"l","p","h","a"," ","*";"p","h","a"," ","*","B";"h","a"," ","*","B","e";"a"," ","*","B","e","t";" ","*","B","e","t","a";"*","B","e","t","a","*";"B","e","t","a","*"," ";"e","t","a","*"," ","*";"t","a","*"," ","*","1";"a","*"," ","*","1","2";"*"," ","*","1","2","3";" ","*","1","2","3","*";"*","1","2","3","*"," ";"1","2","3","*"," ","G";"2","3","*"," ","G","a";"3","*"," ","G","a","m";"*"," ","G","a","m","m";" ","G","a","m","m","a";"G","a","m","m","a"," ";"a","m","m","a"," ","D";"m","m","a"," ","D","e";"m","a"," ","D","e","l";"a"," ","D","e","l","t";" ","D","e","l","t","a";"D","e","l","t","a"," ";"e","l","t","a"," ","*";"l","t","a"," ","*","2";"t","a"," ","*","2","0";"a"," ","*","2","0","1";" ","*","2","0","1","4";"*","2","0","1","4","*";"2","0","1","4","*"," ";"0","1","4","*"," ","E";"1","4","*"," ","E","p";"4","*"," ","E","p","s";"*"," ","E","p","s","i";" ","E","p","s","i","l";"E","p","s","i","l","o";"p","s","i","l","o","n";"s","i","l","o","n"," ";"i","l","o","n"," ","*";"l","o","n"," ","*","1";"o","n"," ","*","1","2";"n"," ","*","1","2","*"}

Note that the array of six elements which we passed as MID’s start_num parameter here was necessarily of a displacement orthogonal to the large array being processed.

Since that array is a single-column array consisting of 45 elements, we thus need to ensure that our array of start_num parameters is a single-row array. As such, the matrix resulting from this operation will be a 45-row-by-6-column array, the 270 entries in which corresponding to applying MID with a start_num parameter of, in turn, each of the values from 1 to 6 on each of the 45 entries in that large array.

And so we have created an array consisting of all six characters for each of the 45 substrings of length six from our string in A2. What we now need to do is, for each of these 45 groups, to test each of the six characters within them according to the criteria we have laid out.

And those criteria are that the first and last characters be an asterisk (recall that we first substituted out the obelisks) and that the 2nd, 3rd, 4th and 5th characters be numeric.

Now, we could of course apply six separate tests to that effect to each of the groups within our large array. However, such an approach would make any solution extremely cumbersome, not to mention lengthy.

What we really want to be able to do is to somehow perform each of these six test simultaneously over all 45 groups in our array.

Things are perhaps made a tad more amenable to such an approach by virtue of the fact that we have, in total, just two different tests to perform, i.e. one for an asterisk (to 2 of the characters) and one for numericalness (to the remaining – middle – 4 characters).

Now, in order to perform such simultaneous testing, we need to resort to some logical acrobatics. Although such an approach may appear a little convoluted, it is justified (in my opinion) by virtue of the fact that we are able to construct a formula set-up in perhaps a fifth of the characters that we would have were we to instead employ multiple, individual tests and then somehow collate those results.

Allow me to demonstrate how we can achieve such simultaneous criteria testing here.

First we take the ASCII code of each of our entries, so that:

CODE(MID(MID(SUBSTITUTE(A2,"†","*"),ROW(INDIRECT("1:"&LEN(A2)-5)),6),{1,2,3,4,5,6},1))

now gives us:

{65,108,112,104,97,32;108,112,104,97,32,42;112,104,97,32,42,66;104,97,32,42,66,101;97,32,42,66,101,116;32,42,66,101,116,97;42,66,101,116,97,42;66,101,116,97,42,32;101,116,97,42,32,42;116,97,42,32,42,49;97,42,32,42,49,50;42,32,42,49,50,51;32,42,49,50,51,42;42,49,50,51,42,32;49,50,51,42,32,71;50,51,42,32,71,97;51,42,32,71,97,109;42,32,71,97,109,109;32,71,97,109,109,97;71,97,109,109,97,32;97,109,109,97,32,68;109,109,97,32,68,101;109,97,32,68,101,108;97,32,68,101,108,116;32,68,101,108,116,97;68,101,108,116,97,32;101,108,116,97,32,42;108,116,97,32,42,50;116,97,32,42,50,48;97,32,42,50,48,49;32,42,50,48,49,52;42,50,48,49,52,42;50,48,49,52,42,32;48,49,52,42,32,69;49,52,42,32,69,112;52,42,32,69,112,115;42,32,69,112,115,105;32,69,112,115,105,108;69,112,115,105,108,111;112,115,105,108,111,110;115,105,108,111,110,32;105,108,111,110,32,42;108,111,110,32,42,49;111,110,32,42,49,50;110,32,42,49,50,42}

We then subtract 42 from each of these values:

{23,66,70,62,55,-10;66,70,62,55,-10,0;70,62,55,-10,0,24;62,55,-10,0,24,59;55,-10,0,24,59,74;-10,0,24,59,74,55;0,24,59,74,55,0;24,59,74,55,0,-10;59,74,55,0,-10,0;74,55,0,-10,0,7;55,0,-10,0,7,8;0,-10,0,7,8,9;-10,0,7,8,9,0;0,7,8,9,0,-10;7,8,9,0,-10,29;8,9,0,-10,29,55;9,0,-10,29,55,67;0,-10,29,55,67,67;-10,29,55,67,67,55;29,55,67,67,55,-10;55,67,67,55,-10,26;67,67,55,-10,26,59;67,55,-10,26,59,66;55,-10,26,59,66,74;-10,26,59,66,74,55;26,59,66,74,55,-10;59,66,74,55,-10,0;66,74,55,-10,0,8;74,55,-10,0,8,6;55,-10,0,8,6,7;-10,0,8,6,7,10;0,8,6,7,10,0;8,6,7,10,0,-10;6,7,10,0,-10,27;7,10,0,-10,27,70;10,0,-10,27,70,73;0,-10,27,70,73,63;-10,27,70,73,63,66;27,70,73,63,66,69;70,73,63,66,69,68;73,63,66,69,68,-10;63,66,69,68,-10,0;66,69,68,-10,0,7;69,68,-10,0,7,8;68,-10,0,7,8,0}

The reason for the choice of 42 is nothing to do with it purportedly being the answer to the ultimate question of life, the universe and everything, but rather due to the fact that 42 is the ASCII code for the asterisk (*).

Hence, by subtracting 42 from our array of values, we know that any zeroes in there must relate to asterisks within our substrings. And the reason we wish to associate zeroes with those characters is that we can then employ an abridged IFERROR technique in our solution.

Hence the “double-reciprocation”, which means that:

ABS(10.5-1/(1/(CODE(MID(MID(SUBSTITUTE(A2,"†","*"),ROW(INDIRECT("1:"&LEN(A2)-5)),6),{1,2,3,4,5,6},1))-42)))

which is:

ABS(10.5-1/(1/({23,66,70,62,55,-10;66,70,62,55,-10,0;70,62,55,-10,0,24;62,55,-10,0,24,59;55,-10,0,24,59,74;-10,0,24,59,74,55;0,24,59,74,55,0;24,59,74,55,0,-10;59,74,55,0,-10,0;74,55,0,-10,0,7;55,0,-10,0,7,8;0,-10,0,7,8,9;-10,0,7,8,9,0;0,7,8,9,0,-10;7,8,9,0,-10,29;8,9,0,-10,29,55;9,0,-10,29,55,67;0,-10,29,55,67,67;-10,29,55,67,67,55;29,55,67,67,55,-10;55,67,67,55,-10,26;67,67,55,-10,26,59;67,55,-10,26,59,66;55,-10,26,59,66,74;-10,26,59,66,74,55;26,59,66,74,55,-10;59,66,74,55,-10,0;66,74,55,-10,0,8;74,55,-10,0,8,6;55,-10,0,8,6,7;-10,0,8,6,7,10;0,8,6,7,10,0;8,6,7,10,0,-10;6,7,10,0,-10,27;7,10,0,-10,27,70;10,0,-10,27,70,73;0,-10,27,70,73,63;-10,27,70,73,63,66;27,70,73,63,66,69;70,73,63,66,69,68;73,63,66,69,68,-10;63,66,69,68,-10,0;66,69,68,-10,0,7;69,68,-10,0,7,8;68,-10,0,7,8,0})))

becomes, after resolving this double-recripocation:

ABS(10.5-{23,66,70,62,55,-10;66,70,62,55,-10,#DIV/0!;70,62,55,-10,#DIV/0!,24;62,55,-10,#DIV/0!,24,59;55,-10,#DIV/0!,24,59,74;-10,#DIV/0!,24,59,74,55;#DIV/0!,24,59,74,55,#DIV/0!;24,59,74,55,#DIV/0!,-10;59,74,55,#DIV/0!,-10,#DIV/0!;74,55,#DIV/0!,-10,#DIV/0!,7;55,#DIV/0!,-10,#DIV/0!,7,8;#DIV/0!,-10,#DIV/0!,7,8,9;-10,#DIV/0!,7,8,9,#DIV/0!;#DIV/0!,7,8,9,#DIV/0!,-10;7,8,9,#DIV/0!,-10,29;8,9,#DIV/0!,-10,29,55;9,#DIV/0!,-10,29,55,67;#DIV/0!,-10,29,55,67,67;-10,29,55,67,67,55;29,55,67,67,55,-10;55,67,67,55,-10,26;67,67,55,-10,26,59;67,55,-10,26,59,66;55,-10,26,59,66,74;-10,26,59,66,74,55;26,59,66,74,55,-10;59,66,74,55,-10,#DIV/0!;66,74,55,-10,#DIV/0!,8;74,55,-10,#DIV/0!,8,6;55,-10,#DIV/0!,8,6,7;-10,#DIV/0!,8,6,7,10;#DIV/0!,8,6,7,10,#DIV/0!;8,6,7,10,#DIV/0!,-10;6,7,10,#DIV/0!,-10,27;7,10,#DIV/0!,-10,27,70;10,#DIV/0!,-10,27,70,73;#DIV/0!,-10,27,70,73,63;-10,27,70,73,63,66;27,70,73,63,66,69;70,73,63,66,69,68;73,63,66,69,68,-10;63,66,69,68,-10,#DIV/0!;66,69,68,-10,#DIV/0!,7;69,68,-10,#DIV/0!,7,8;68,-10,#DIV/0!,7,8,#DIV/0!})

the point being that any zeroes from the original array result in #DIV/0! whilst non-zero values remain unchanged. Readers may see here for more on this technique if they wish.

Now, since we are interested in identifying not only asterisks in our substring but also numerics, we can use the fact that the ASCII codes for the digits 0-9 range from 48 to 57 inclusive.

The usual way in which we would test an array of values as to whether their ASCII code falls somewhere in this range would be to subtract the ASCII codes from 52.5, take the absolute of the resulting values and test these as to whether they are less than 5 (52.5-48=4.5 and 52.5-57=-4.5).

(Readers who have not seen this method of employing a single test using the absolute values rather than two separate tests – one involving less than and one involving greater than – should take note of the abbreviation it offers us.)

Here, however, we have already subtracted 42 from our array of ASCII codes and so, instead of the above-mentioned 52.5 we will need to use 10.5, which means that:

ABS(10.5-1/(1/(CODE(MID(MID(SUBSTITUTE(A2,"†","*"),ROW(INDIRECT("1:"&LEN(A2)-5)),6),{1,2,3,4,5,6},1))-42)))

becomes:

{12.5,55.5,59.5,51.5,44.5,20.5;55.5,59.5,51.5,44.5,20.5,#DIV/0!;59.5,51.5,44.5,20.5,#DIV/0!,13.5;51.5,44.5,20.5,#DIV/0!,13.5,48.5;44.5,20.5,#DIV/0!,13.5,48.5,63.5;20.5,#DIV/0!,13.5,48.5,63.5,44.5;#DIV/0!,13.5,48.5,63.5,44.5,#DIV/0!;13.5,48.5,63.5,44.5,#DIV/0!,20.5;48.5,63.5,44.5,#DIV/0!,20.5,#DIV/0!;63.5,44.5,#DIV/0!,20.5,#DIV/0!,3.5;44.5,#DIV/0!,20.5,#DIV/0!,3.5,2.5;#DIV/0!,20.5,#DIV/0!,3.5,2.5,1.5;20.5,#DIV/0!,3.5,2.5,1.5,#DIV/0!;#DIV/0!,3.5,2.5,1.5,#DIV/0!,20.5;3.5,2.5,1.5,#DIV/0!,20.5,18.5;2.5,1.5,#DIV/0!,20.5,18.5,44.5;1.5,#DIV/0!,20.5,18.5,44.5,56.5;#DIV/0!,20.5,18.5,44.5,56.5,56.5;20.5,18.5,44.5,56.5,56.5,44.5;18.5,44.5,56.5,56.5,44.5,20.5;44.5,56.5,56.5,44.5,20.5,15.5;56.5,56.5,44.5,20.5,15.5,48.5;56.5,44.5,20.5,15.5,48.5,55.5;44.5,20.5,15.5,48.5,55.5,63.5;20.5,15.5,48.5,55.5,63.5,44.5;15.5,48.5,55.5,63.5,44.5,20.5;48.5,55.5,63.5,44.5,20.5,#DIV/0!;55.5,63.5,44.5,20.5,#DIV/0!,2.5;63.5,44.5,20.5,#DIV/0!,2.5,4.5;44.5,20.5,#DIV/0!,2.5,4.5,3.5;20.5,#DIV/0!,2.5,4.5,3.5,0.5;#DIV/0!,2.5,4.5,3.5,0.5,#DIV/0!;2.5,4.5,3.5,0.5,#DIV/0!,20.5;4.5,3.5,0.5,#DIV/0!,20.5,16.5;3.5,0.5,#DIV/0!,20.5,16.5,59.5;0.5,#DIV/0!,20.5,16.5,59.5,62.5;#DIV/0!,20.5,16.5,59.5,62.5,52.5;20.5,16.5,59.5,62.5,52.5,55.5;16.5,59.5,62.5,52.5,55.5,58.5;59.5,62.5,52.5,55.5,58.5,57.5;62.5,52.5,55.5,58.5,57.5,20.5;52.5,55.5,58.5,57.5,20.5,#DIV/0!;55.5,58.5,57.5,20.5,#DIV/0!,3.5;58.5,57.5,20.5,#DIV/0!,3.5,2.5;57.5,20.5,#DIV/0!,3.5,2.5,#DIV/0!}

We will shortly be testing each of the six elements in each of the 45 groups in this large array as to whether they meet our criteria. Firstly, though, we need to resolve our #DIV/0!s. And we need to resolve them to a value such that, when we perform our test on them, we ensure that only those values satisfy whatever criterion we apply.

Hence my choice of an arbitrarily large negative value of -1000 for the IFERROR. Since, if we now resolve this IFERROR, we see that:

IFERROR(ABS(10.5-1/(1/(N(CODE(MID(MID(A2,ROW(INDIRECT("1:"&LEN(A2)-5)),6),{1,2,3,4,5,6},1))-42)))),-10^3)

now becomes:

{12.5,55.5,59.5,51.5,44.5,20.5;55.5,59.5,51.5,44.5,20.5,-1000;59.5,51.5,44.5,20.5,-1000,13.5;51.5,44.5,20.5,-1000,13.5,48.5;44.5,20.5,-1000,13.5,48.5,63.5;20.5,-1000,13.5,48.5,63.5,44.5;-1000,13.5,48.5,63.5,44.5,-1000;13.5,48.5,63.5,44.5,-1000,20.5;48.5,63.5,44.5,-1000,20.5,-1000;63.5,44.5,-1000,20.5,-1000,3.5;44.5,-1000,20.5,-1000,3.5,2.5;-1000,20.5,-1000,3.5,2.5,1.5;20.5,-1000,3.5,2.5,1.5,-1000;-1000,3.5,2.5,1.5,-1000,20.5;3.5,2.5,1.5,-1000,20.5,18.5;2.5,1.5,-1000,20.5,18.5,44.5;1.5,-1000,20.5,18.5,44.5,56.5;-1000,20.5,18.5,44.5,56.5,56.5;20.5,18.5,44.5,56.5,56.5,44.5;18.5,44.5,56.5,56.5,44.5,20.5;44.5,56.5,56.5,44.5,20.5,15.5;56.5,56.5,44.5,20.5,15.5,48.5;56.5,44.5,20.5,15.5,48.5,55.5;44.5,20.5,15.5,48.5,55.5,63.5;20.5,15.5,48.5,55.5,63.5,44.5;15.5,48.5,55.5,63.5,44.5,20.5;48.5,55.5,63.5,44.5,20.5,-1000;55.5,63.5,44.5,20.5,-1000,2.5;63.5,44.5,20.5,-1000,2.5,4.5;44.5,20.5,-1000,2.5,4.5,3.5;20.5,-1000,2.5,4.5,3.5,0.5;-1000,2.5,4.5,3.5,0.5,-1000;2.5,4.5,3.5,0.5,-1000,20.5;4.5,3.5,0.5,-1000,20.5,16.5;3.5,0.5,-1000,20.5,16.5,59.5;0.5,-1000,20.5,16.5,59.5,62.5;-1000,20.5,16.5,59.5,62.5,52.5;20.5,16.5,59.5,62.5,52.5,55.5;16.5,59.5,62.5,52.5,55.5,58.5;59.5,62.5,52.5,55.5,58.5,57.5;62.5,52.5,55.5,58.5,57.5,20.5;52.5,55.5,58.5,57.5,20.5,-1000;55.5,58.5,57.5,20.5,-1000,3.5;58.5,57.5,20.5,-1000,3.5,2.5;57.5,20.5,-1000,3.5,2.5,-1000}

And if we now perform a comparison, for each of these 45 groups, as to whether each of the six elements within that group is less than the corresponding element in the following array:

{-999,5,5,5,5,-999}

we know that only an asterisk in the original string will satisfy the criterion of having a value less than -999 in the array we have created. We also know, from the discussion earlier, that only a numeric will have a value less than 5.

All in all, then, if any group of six elements satisfies all of the above criteria, we can conclude with certainty that it must begin and end with an asterisk and also have for its middle four characters some number from 0-9.

Hence:

0+(IFERROR(ABS(10.5-1/(1/(N(CODE(MID(MID(A2,ROW(INDIRECT("1:"&LEN(A2)-5)),6),{1,2,3,4,5,6},1))-42)))),-10^3)<{-999,5,5,5,5,-999})

becomes:

0+({FALSE,FALSE,FALSE,FALSE,FALSE,FALSE;FALSE,FALSE,FALSE,FALSE,FALSE,TRUE;FALSE,FALSE,FALSE,FALSE,TRUE,FALSE;FALSE,FALSE,FALSE,TRUE,FALSE,FALSE;FALSE,FALSE,TRUE,FALSE,FALSE,FALSE;FALSE,TRUE,FALSE,FALSE,FALSE,FALSE;TRUE,FALSE,FALSE,FALSE,FALSE,TRUE;FALSE,FALSE,FALSE,FALSE,TRUE,FALSE;FALSE,FALSE,FALSE,TRUE,FALSE,TRUE;FALSE,FALSE,TRUE,FALSE,TRUE,FALSE;FALSE,TRUE,FALSE,TRUE,TRUE,FALSE;TRUE,FALSE,TRUE,TRUE,TRUE,FALSE;FALSE,TRUE,TRUE,TRUE,TRUE,TRUE;TRUE,TRUE,TRUE,TRUE,TRUE,FALSE;FALSE,TRUE,TRUE,TRUE,FALSE,FALSE;FALSE,TRUE,TRUE,FALSE,FALSE,FALSE;FALSE,TRUE,FALSE,FALSE,FALSE,FALSE;TRUE,FALSE,FALSE,FALSE,FALSE,FALSE;FALSE,FALSE,FALSE,FALSE,FALSE,FALSE;FALSE,FALSE,FALSE,FALSE,FALSE,FALSE;FALSE,FALSE,FALSE,FALSE,FALSE,FALSE;FALSE,FALSE,FALSE,FALSE,FALSE,FALSE;FALSE,FALSE,FALSE,FALSE,FALSE,FALSE;FALSE,FALSE,FALSE,FALSE,FALSE,FALSE;FALSE,FALSE,FALSE,FALSE,FALSE,FALSE;FALSE,FALSE,FALSE,FALSE,FALSE,FALSE;FALSE,FALSE,FALSE,FALSE,FALSE,TRUE;FALSE,FALSE,FALSE,FALSE,TRUE,FALSE;FALSE,FALSE,FALSE,TRUE,TRUE,FALSE;FALSE,FALSE,TRUE,TRUE,TRUE,FALSE;FALSE,TRUE,TRUE,TRUE,TRUE,FALSE;TRUE,TRUE,TRUE,TRUE,TRUE,TRUE;FALSE,TRUE,TRUE,TRUE,TRUE,FALSE;FALSE,TRUE,TRUE,TRUE,FALSE,FALSE;FALSE,TRUE,TRUE,FALSE,FALSE,FALSE;FALSE,TRUE,FALSE,FALSE,FALSE,FALSE;TRUE,FALSE,FALSE,FALSE,FALSE,FALSE;FALSE,FALSE,FALSE,FALSE,FALSE,FALSE;FALSE,FALSE,FALSE,FALSE,FALSE,FALSE;FALSE,FALSE,FALSE,FALSE,FALSE,FALSE;FALSE,FALSE,FALSE,FALSE,FALSE,FALSE;FALSE,FALSE,FALSE,FALSE,FALSE,TRUE;FALSE,FALSE,FALSE,FALSE,TRUE,FALSE;FALSE,FALSE,FALSE,TRUE,TRUE,FALSE;FALSE,FALSE,TRUE,TRUE,TRUE,TRUE})

Adding the zero to convert to numerics makes things a little easier to visualize:

{0,0,0,0,0,0;0,0,0,0,0,1;0,0,0,0,1,0;0,0,0,1,0,0;0,0,1,0,0,0;0,1,0,0,0,0;1,0,0,0,0,1;0,0,0,0,1,0;0,0,0,1,0,1;0,0,1,0,1,0;0,1,0,1,1,0;1,0,1,1,1,0;0,1,1,1,1,1;1,1,1,1,1,0;0,1,1,1,0,0;0,1,1,0,0,0;0,1,0,0,0,0;1,0,0,0,0,0;0,0,0,0,0,0;0,0,0,0,0,0;0,0,0,0,0,0;0,0,0,0,0,0;0,0,0,0,0,0;0,0,0,0,0,0;0,0,0,0,0,0;0,0,0,0,0,0;0,0,0,0,0,1;0,0,0,0,1,0;0,0,0,1,1,0;0,0,1,1,1,0;0,1,1,1,1,0;1,1,1,1,1,1;0,1,1,1,1,0;0,1,1,1,0,0;0,1,1,0,0,0;0,1,0,0,0,0;1,0,0,0,0,0;0,0,0,0,0,0;0,0,0,0,0,0;0,0,0,0,0,0;0,0,0,0,0,0;0,0,0,0,0,1;0,0,0,0,1,0;0,0,0,1,1,0;0,0,1,1,1,1}

To identify our string of six 1s in the above array, we use a standard MMULT construction, which means that:

MMULT(0+(IFERROR(ABS(10.5-1/(1/(N(CODE(MID(MID(A2,ROW(INDIRECT("1:"&LEN(A2)-5)),6),{1,2,3,4,5,6},1))-42)))),-10^3)<{-999,5,5,5,5,-999}),{1;1;1;1;1;1})

resolves to:

{0;1;1;1;1;1;2;1;2;2;3;4;5;5;3;2;1;1;0;0;0;0;0;0;0;0;1;1;2;3;4;6;4;3;2;1;1;0;0;0;0;1;1;2;4}

We then locate the position of our 6 using MATCH, so that:

MATCH(6,MMULT(0+(IFERROR(ABS(10.5-1/(1/(N(CODE(MID(MID(A2,ROW(INDIRECT("1:"&LEN(A2)-5)),6),{1,2,3,4,5,6},1))-42)))),-10^3)<{-999,5,5,5,5,-999}),{1;1;1;1;1;1}),0)

returns 32, as can easily be verified.

We then subtract 1 from this value, since we are now going to apply LEFT to our original string using this value as a parameter and clearly we do not wish to have a result which ends in an asterisk or obelisk.

Our original formula, which recall is:

=TRIM(RIGHT(SUBSTITUTE(LEFT(A2,MATCH(6,MMULT(0+(IFERROR(ABS(10.5-1/(1/(N(CODE(MID(MID(A2,ROW(INDIRECT("1:"&LEN(A2)-5)),6),{1,2,3,4,5,6},1))-42)))),-10^3)<{-999,5,5,5,5,-999}),{1;1;1;1;1;1}),0)-1)," ",REPT(" ",LEN(A2))),2*LEN(A2)))

has now become:

=TRIM(RIGHT(SUBSTITUTE("Alpha *Beta* *123* Gamma Delta "," ",REPT(" ",LEN(A2))),2*LEN(A2)))

and it is now a relatively straightforward task to extract our desired string using a standard technique. I will not give a detailed explanation of the remainder of the resolution of this formula, though readers may see here for a breakdown of how this technique with TRIM, SUBSTITUTE and REPT works if they wish.

Suffice to say that the above becomes:

=TRIM(RIGHT("Alpha                                 *Beta*                                 *123*                                 Gamma                                 Delta                                 ",2*LEN(A2)))

which is:

=TRIM("                            Delta                                 ")

i.e. “Delta”, as required.

Another challenge to follow shortly. Watch this space!

One comment

  1. Pingback: Search w/in text (complicated) - Page 2

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s