The challenge this week is as follows: given a string in A2, return the single-word substring which immediately precedes the only occurrence within that string of a substring of the form *XabcdY*, where *X* and *Y* are either an asterisk or an obelisk and *a*, *b*, *c* and *d* are integers from 0 to 9.

For example:

**Notes:**

- The string is delimited by single spaces
- Excluding the one substring of the form
*XabcdY, *each of the other substrings may or may not have an asterisk or an obelisk at each end (providing of course that this will not result in more than one substring of the form *XabcdY)*

Solution next week. Best of luck!

### Like this:

Like Loading...

*Related*

Hi, I started to do these advanced formula challenges yesterday in Google Sheets. It has been fun so thanks a lot.

The link to my spreadsheet can be found here:

https://docs.google.com/spreadsheets/d/19tTBOBCuDpI7MDYrOd-LZSjZdpHvsvIqrOs4vJ0ARNk/edit#gid=0

There is a separate tab at the bottom for each challenge. You can copy the file by going to File -> Make a copy after signing into your Google account.

Hi again!

I Create a Defined Name Arr, with this formula:

And I create the “main” formula in B2 (CSE):

I mention that I’m from Cali, Colombia and I love this website because there are many interesting techniques to learn. The formulas I produce in the Spanish, and translate them into English to post here. Sorry if I make mistakes in language or in my way of expressing myself. Thank you for sharing your knowledge!

Blessings, and please continue like this!

@Isai AlvaradoWelcome to the site!

And many thanks for the file. I’ll try to take a look at it at some point soon. Perhaps in future you could post some of your solutions here so that everyone can see them?

Also, I’m afraid I don’t use Googledocs, and when I try to download your workbook as a normal Microsoft Excel file then I get a workbook with just values, and no formulas. How can I convert it so that I can see your original formulas?

Thanks again!

@John Jairo VHi again, John. I just tried your solution and I don’t get all the correct results for the examples I gave.

For the string

Alpha †Beta* *123† Gamma Delta *2014† Epsilon *12*I get a result ofAlphausing your set-up, though the result should beDelta.For

2013 Kappa *Lambda† †Mu† 2000* *0001*your formula gives2013, though the result should be2000*.And for

Nu 1999 *Xi* Omicron *2011* Pi Rho 1997your formula givesNu, though the result should beOmicron.Remember, the desired extraction always immediately precedes the one case of a string where the 1st and 6th characters are either an asterisk or an obelisk, and the 2nd, 3rd, 4th and 5th characters are numbers.

Regards

For some reason (I don´t know why) the formula for the defined name can’t post here.

I put the dropbox link here for my file, and yo can see the formula working.

https://www.dropbox.com/s/mpebkhu9ig2gbue/Reto%20XOR.xlsx?dl=0

Blessings!

Ah. I understand what happened. Apologies on behalf of the comment editor here!

I have corrected it, both above and in your original post. I will have a look now at your solution.

Many thanks.

@John Jairo VNearly. Works on the examples I gave, though will not work on all strings, unfortunately.

You need to be careful with this technique for identifying numbers. For example, what if the string in A2 was:

Alpha *Jan3† †Beta* *123† Gamma Delta *2014† Epsilon *12*?

(Or whatever the equivalent is in Spanish for Jan3 – Ene3?)

Or:

*Beta† †2e10* †Zeta †2004† Eta Theta 2008 †Iota†The point being that dates and scientific notation will equally meet your criterion:

See here for a discussion on the drawbacks of this technique for identifying numbers in strings:

https://excelxor.com/2014/09/14/extracting-numbers-from-a-string-1-consecutive-numbers-at-start/

Regards

Hi!

I understood your point and is valid totally. I tried correct that part and (isnumber–…) and I couldn’t (it seems indirect not work well with arrays). Maybe you can show me how to do it with my construction… and later, show your answer (I imagine that your’s is better than mine).

Blessings!

What precisely did you try to resolve it?

Regards

@XOR LX

Unfortunately, the only way to see the full formulas (you can see part of them by clicking on the cells) is to make a copy by going to File -> Make a copy (after signing into your Google account). The solutions use a few functions that may not be in Excel, so exporting it may result in errors. Here is my solution for Challenge 7.

=regexextract(A2,”((?:†|\*)?\w+(?:†|\*)?)[^\w]*(?:†|\*)\d{4}(?:†|\*)”)

The regexextract function extracts matching substrings with regular expressions. Here I extract one or more consecutive word characters, \w+ ,with or without an asterisk or obelisk on either side, ((?:†|\*)?\w+(?:†|\*)?) . Also, it must be before 0 or more non-word characters, [^\w]*, and also 4 digits with an asterisk or obelisk on both sides, (?:†|\*)\d{4}(?:†|\*).

I try this (in the context of the formula that I post before:

= … MIN(IF(ISNUMBER(Arr);IF(COUNT(–MID(A2,ROW(INDIRECT(Arr+1&”:”&Arr+4)),1))=4; … )))

The indirect part is not working…

@John Jairo VActually, John, I’ve noticed a bigger problem with your set-up which I didn’t earlier. And that is that you’re using the SEARCH function which, as you probably know, returns the position of the

firstoccurrence of the search string within the string.This means that, if the pattern, say *????*, occurs

more than oncein the string, and the one containing the numbers is not actually the first such occurrence, but the second or third, then your set-up will not give the desired result. For example, this string:2013 Kappa *Lambda† †Mu† 2000* *Beta* *0001*I like your idea of using SEARCH, though, and of first forming an array of search strings with this concatenation. I’m just not sure you can tweak it to overcome this issue?

Regards

@Isai AlvaradoThanks, Shame about the Googledocs issue. I never really considered that platform when I set up this site, and I think it will be difficult to now expand it to incorporate the additional functions offered in the Google version.

So unless you can adapt your solutions there to also work for Microsoft Excel then I’m afraid I won’t be able to analyse/comment upon them. Which is a shame as I’m sure you’ve put a lot of hard work in to developing your answers.

Also, re regex, are you referring to a VBA approach here? If so, then sadly I’m going to have to disappoint you a second time: I set up this site with the aim of presenting ideas/techniques

using worksheet formulas alone. (I know, it doesn’t say that anywhere on this site – perhaps I should add something to the subtitle…)I’m fully aware that such a philosophy means that I am not necessarily giving the most “efficient” techniques for a given problem, and that there are many “better” ways of obtaining the solutions to the challenges I set, e.g. VBA, Pivot Tables, etc., though I guess I just want to see what is possible using formulas alone, to push the boundaries of what can be achieved with that as a caveat.

And I happen to think that it’s a lot more than most people imagine, too.

Anyway, sorry again! I hope I haven’t

completelydissuaded you from visiting again! 🙂 Your thoughts and solutions are always welcome – if only they could be purely (Excel) formula-based!Regards

@XOR LX

Hi,

Thanks for the responses. All my answers are pure worksheet formulas (no pivot tables/VBA/JavaScript/etc.). However, they use some functions not in Excel, so I am sorry about that. Also, it is ok if you do not comment haha. I will continue to have fun coming up with answers to your fun challenges, although I like sharing what Google Sheets can do. When it comes to extracting text, the Google Sheets functions REGEXEXTRACT and REGEXREPLACE are very useful.

Ah, apologies then Isai!

I really know very little about the Google version. I had always presumed that it was practically identical to Excel, but it sounds like there are some significant differences. It also sounds as if some of those differences may be “improvements” over the Excel functions, especially the two REGEX examples you give.

Perhaps I will one day learn to use this version myself. In the meantime, I’m glad that you’re still going to be visiting here and enjoying the challenges!

Cheers

@John Jairo VNo. The issue is not so much that INDIRECT cannot process arrays – it can. It’s more that you cannot use the “:” syntax to create an

arrayof range references.I think also that you will find it difficult to further manipulate the Defined Name you have created,

Arr, mainly because this is not aone-dimensionalrange, but a 2-row-by-2-column array, e.g. of the form: {a,b;c,d}.Of course, this could be first re-dimensioned so that it

isa one-dimensional array, e.g. to {a;b;c;d}, a 4-row-by-1-column array, so that you can then manipulate it more easily.However, you would still not be able to employ the “:” syntax above. You could, however, after it has been re-dimensioned, apply a further MID function to the resulting array with an array of

start_numparameters from 1 to 4 and test that all of these are numerical.However, you still have the larger issue that I outlined in my previous comment, since using SEARCH in this way limits you to considering the

first occurrence onlyof each of the four string types *????*, †????*, *????† and †????†.Pingback: Advanced Formula Challenge #7: Results and Discussion | EXCELXOR