Advanced Formula Challenge #8: Facetious? Moi? 25

The challenge this week is as follows: given a single paragraph of text in A1, which may or may not contain punctuation, a single formula in B1 to identify the number of words within that text which contain all five vowels of the English alphabet precisely once each and in an order of appearance, from left to right, of a, e, i, o, u.

For the below example the result would be 8, as highlighted in red.

Advanced Formula Challenge 8 Facetious Moi v2

Edit: I have now amended the text to make it clearer that certain words are not to be considered in the count: those in black, for example, do not meet the requirements as outlined above.

The workbook can be downloaded here.

Solution next week. Best of luck!

25 comments

  1. Hi again Excel XOR. I’m Posting here again with a possible solution for this challenge.

    Defined Names:

    Arr1:

    =TRIM(Sheet1!$A$1)

    Arr2:

    =ROW(INDIRECT("1:"&LEN(Arry1)+1))

    Arr3:

    =ROW(INDIRECT("1:"&LEN(Arry1)+1-LEN(SUBSTITUTE(Arry1," ",""))))

    And the main formula (confirm with CSE):

    =SUM(--ISNUMBER(SEARCH("a*e*i*o*u",MID(" "&Arry1&" ",SMALL(IF(CODE(MID(" "&Arry1,Arry2,1))=32,Arry2),Arry3),SMALL(IF(CODE(MID(Arry1&" ",Arry2,1))=32,Arry2),Arry3)-SMALL(IF(CODE(MID(" "&Arry1,Arry2,1))=32,Arry2),Arry3)+1))))

    Blessings!

  2. Here is my solution for Google Sheets:

    =ArrayFormula(sum(–(regexmatch(split(A1," "),"(?i)^[^aeiou]*a[^aiou]*e[^aeou]*i[^aeiu]*o[^aeio]*u[^aeiou]*$"))))

    I split the text by every space ” ” so each word is in a separate cell (whether or not punctuation is attached should not affect the result). Then I check for a regexmatch. Regexmatch checks to see if no vowel is found before a single “a”, then checks to see that no vowel other than “e” is found next and so on until the end, where it checks to make sure no vowel is found after they have been used up. If the pattern matches, regexmatch returns TRUE. The (?i) makes the regexmatch case-insensitive. Then it is all summed up.

    The spreadsheet can be found here:

    https://docs.google.com/spreadsheets/d/19tTBOBCuDpI7MDYrOd-LZSjZdpHvsvIqrOs4vJ0ARNk/edit#gid=1599143470

  3. Actually I made an error in logic and updated my formula in the spreadsheet:

    =ArrayFormula(sum(--(regexmatch(split(A1," "),"(?i)^[^aeiou]*a[^aeiou]*e[^aeiou]*i[^aeiou]*o[^aeiou]*u[^aeiou]*$"))))

    Now it makes sure no vowel is found before each single instance of each vowel.

  4. @John Jairo V

    Have you accounted for my statement: “…which contain all five vowels of the English alphabet precisely once each“?

    “Facetiousness” or “Raeticodactylus”, for example, should not be considered as part of this exercise.

    I have now amended the text so that it contains a few examples of such “false alarms” which are not to be considered.

    Regards

  5. I hope this is not a shortest challenge for this is anything but short! If you disregard checking for single occurrence of vowels I can happily get the count of words that contain all in correct order using 174 characters. I’m a bit miffed that I have not found any way within my scope of knowledge to avoid using 5 nested substitutions to facilitate testing for duplicates.

    Anyway, here it is. I believe it answers the question posed.

    =SUM(IF(LEN(IF(ISNUMBER(SEARCH("a*e*i*o*u",TRIM(MID(SUBSTITUTE(" "&$A1," ",REPT(" ",LEN($A1))),LEN($A1)*ROW(INDIRECT("1:"&LEN(A1) - LEN(SUBSTITUTE(A1," ","")) + 1)),LEN($A1))),1)),TRIM(MID(SUBSTITUTE(" "&$A1," ",REPT(" ",LEN($A1))),LEN($A1)*ROW(INDIRECT("1:"&LEN(A1) - LEN(SUBSTITUTE(A1," ","")) + 1)),LEN($A1))),""))-LEN(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(LOWER(IF(ISNUMBER(SEARCH("a*e*i*o*u",TRIM(MID(SUBSTITUTE(" "&$A1," ",REPT(" ",LEN($A1))),LEN($A1)*ROW(INDIRECT("1:"&LEN(A1) - LEN(SUBSTITUTE(A1," ","")) + 1)),LEN($A1))),1)),TRIM(MID(SUBSTITUTE(" "&$A1," ",REPT(" ",LEN($A1))),LEN($A1)*ROW(INDIRECT("1:"&LEN(A1) - LEN(SUBSTITUTE(A1," ","")) + 1)),LEN($A1))),"")),"a",""),"e",""),"i",""),"o",""),"u",""))=5,1,0))

    Confirm with Ctrl + Shift + Enter

    If it helps to have the file uploaded then please let me know.

  6. Just noticed and removed a few stray $’s that are not required…..

    =SUM(IF(LEN(IF(ISNUMBER(SEARCH("a*e*i*o*u",TRIM(MID(SUBSTITUTE(" "&A1," ",REPT(" ",LEN(A1))),LEN(A1)*ROW(INDIRECT("1:"&LEN(A1) - LEN(SUBSTITUTE(A1," ","")) + 1)),LEN(A1))),1)),TRIM(MID(SUBSTITUTE(" "&A1," ",REPT(" ",LEN(A1))),LEN(A1)*ROW(INDIRECT("1:"&LEN(A1) - LEN(SUBSTITUTE(A1," ","")) + 1)),LEN(A1))),""))-LEN(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(LOWER(IF(ISNUMBER(SEARCH("a*e*i*o*u",TRIM(MID(SUBSTITUTE(" "&A1," ",REPT(" ",LEN(A1))),LEN(A1)*ROW(INDIRECT("1:"&LEN(A1) - LEN(SUBSTITUTE(A1," ","")) + 1)),LEN(A1))),1)),TRIM(MID(SUBSTITUTE(" "&A1," ",REPT(" ",LEN(A1))),LEN(A1)*ROW(INDIRECT("1:"&LEN(A1) - LEN(SUBSTITUTE(A1," ","")) + 1)),LEN(A1))),"")),"a",""),"e",""),"i",""),"o",""),"u",""))=5,1,0))
  7. @Snakehips

    No – this one isn’t a “shortest” formula challenge. In fact, usually they aren’t; you just happened to stumble upon the very first one last week. In any case, the post title will tell you if it is or not.

    So you needn’t worry about shaving off a few dollar signs here and there!

    And congratulations! 🙂 Long it may be, but your solution is fully sound! Well done!

    N.B. It can be abbreviated somewhat, as you suspected, which I’ll demonstrate on Sunday if you or anyone else doesn’t first. And in future you may also want to consider storing some of the formula sections as Defined Names, particularly those which repeat several times within the formula, e.g. this part:

    TRIM(MID(SUBSTITUTE(" "&A1," ",REPT(" ",LEN(A1))),LEN(A1)*ROW(INDIRECT("1:"&LEN(A1) - LEN(SUBSTITUTE(A1," ","")) + 1)),LEN(A1)))

    stored as e.g. Arry3, after which your formula would be a “little” more succinct-looking, i.e.:

    =SUM(IF(LEN(IF(ISNUMBER(SEARCH("a*e*i*o*u",Arry3,1)),Arry3,""))-LEN(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(LOWER(IF(ISNUMBER(SEARCH("a*e*i*o*u",Arry3,1)),Arry3,"")),"a",""),"e",""),"i",""),"o",""),"u",""))=5,1,0))

    Though of course you should probably use some absolute referencing if you’re going down the Defined Name route.

    Well done again and thanks a lot!

  8. Ok… I think this time ¡I got it!

    Defined Names:

    Name:

    =TRIM(Sheet1!$A1)

    Arry;

    =TRIM(MID(SUBSTITUTE(Name," ",REPT(" ",LEN(Name))),(ROW(INDIRECT("1:"&1+LEN(Name)-LEN(SUBSTITUTE(Name," ",""))))-1)*LEN(Name)+1,LEN(Name)))

    And the main formula (Without CSE):

    =SUMPRODUCT(ISNUMBER(SEARCH("a*e*i*o*u",Arry))*(MMULT(--(LEN(SUBSTITUTE(LOWER(Arry),{"a","e","i","o","u"},""))=LEN(Arry)-1),{1;1;1;1;1})=5))

    Blessings!

  9. Could be better (faster process):

    =SUMPRODUCT(--(ISNUMBER(SEARCH("a*e*i*o*u",Arry)));--(MMULT(--(LEN(SUBSTITUTE(LOWER(Arry),{"a","e","i","o","u"},""))=LEN(Arry)-1),{1;1;1;1;1})=5))

    Blessings!

  10. This time perfect!! Almost exactly what I have to the last letter!

    Very good technique, especially the use of MMULT (which I imagine is what Snakehips missed).

    Congratulations!

  11. Well done JJV.

    @XOR LX

    I certainly did miss the MMULT.

    Not that it wasn’t in my toolbox of course. No, rather it’s just one of a MMULTitude of mmysterious Excel ffffunctions that remain at the bottom of the ttoolbox, neglected and unused. Mainly because I don’t know which end to hhold and wwwhich end to hit with.

    May I say that remaining old brain cells willing, I should be somewhat wiser if I take time to digest the excellent, clear and concise explanations that you present on this site.

    Keep up the good work!

  12. I had no time to look at this challenge so my proposition is little to late. I use almost the same technique like John. This formula could be a little faster than John’s.

    I use named range “words” build on the formula i showed maybe one (or two) years earlier somewhere in internet (i can’t remember where – maybe it was on ExcelIsFun’s channel on YT)

    Name: words

    =LOWER(TRIM(MID(SUBSTITUTE($A$1," ",REPT(" ",LEN($A$1))),(ROW(INDIRECT("1:"&LEN($A$1)-LEN(SUBSTITUTE($A$1," ",""))+1))-1)*LEN($A$1)+1,LEN($A$1))))

    And the main formula is

    =SUM(IF(ISNUMBER(SEARCH("*a*e*i*o*u*",words)),IF(LEN(words)-LEN(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(words,"a",""),"e",""),"i",""),"o",""),"u",""))=5,1)))

    Regards

  13. Pingback: Advanced Formula Challenge #8: Results and Discussion | EXCELXOR

  14. @Snakehips

    Thanks for your kind words. Despite the general tone of this site being quite “highbrow”, I do attempt (and hope) that my explanations and ramblings are done in such a way as you describe, so it’s very nice to know that at least someone thinks so. Cheers.

    MMULT is, in my opinion, perhaps the least understood (and also the most “feared”?) of all Excel functions. If the various forums around the internet are anything to go by, the proportion of advanced Excel users who have seemingly mastered the vast majority of functions, yet whose solutions almost never seem to involve an MMULT construction, is surprisingly high. And yet, once you take the time to look into the workings of this function (obviously a background in mathematics will be advantageous here) then it’s really not so complex as people might think, and the uses of MMULT are many and varied.

    Perhaps I should write a post dedicated to that function?

  15. @Bill Szysz

    Thanks, Bill, and apologies if my solution came out before I had time to post your attempt.

    It appears in essence that you have chosen the same set-up as Snakehips, though I am interested by your comment that this multiple SUBSTITUTE construction would be “faster” than the MMULT set-up that both John and I used. Do you have some calculation timings to support this?

    Besides, even if it does turn out to be more efficient in terms of calculation speed, surely that advantage is outweighed by its length? What if, for example, we were considering not just 5 characters, but e.g. 25? Would you still consider a set-up with that number of nested SUBSTITUTE functions to be more efficient than a single SUBSTITUTE array-processed over the same 25 characters and passed to MMULT?

    Perhaps it would be indeed still faster, but you’d certainly struggle to fit it onto a single page of this forum! Besides, even if it shaved off a few microseconds, I’d still have a preference for an MMULT construction which uses about 300 characters less!

  16. I know this is an old thread. I love to come here to learn …. and feel my jaw drop.

    Of interest I noticed more than one formula uses the construction.

    SUBSTITUTE($A$1," ",REPT(" ",LEN($A$1)))

    With a string this long I always get #VALUE! errors when I attempt that. This time was no exception. (I usually end up fiddling around trial and error trying to find a workable “padding”.)

    What am I missing?

  17. Hi Dave,

    Glad to see the old threads still generating interest to some!

    You mean that the construction you give returns #VALUE!? Or when wrapped in some external function(s)? If the latter, can you give the precise construction you are using?

    Regards

  18. Hello,

    Thank you for the quick reply. Sorry about the scant detail. The part I cited seems to be the ‘culprit’ in the formulas I tried which are:

    John Jairo V’s of OCTOBER 24, 2014 AT 14:13

    Named formulas in Name Manager:

    name:

    =TRIM(Sheet1!$A$1)

    arry:

    =TRIM(MID(SUBSTITUTE(name," ",REPT(" ",LEN(name))),(ROW(INDIRECT("1:"&1+LEN(name)-LEN(SUBSTITUTE(name," ",""))))-1)*LEN(name)+1,LEN(name)))

    Final formula.

    =SUMPRODUCT(--(ISNUMBER(SEARCH("a*e*i*o*u",arry))),--(MMULT(--(LEN(SUBSTITUTE(LOWER(arry),{"a","e","i","o","u"},""))=LEN(arry)-1),{1;1;1;1;1})=5))

    I had to change the “;” to “,” following ISNUMBER. I am guessing regional settings?

    That returns #VALUE!

    Bill Szysz’s of OCTOBER 26, 2014 AT 15:38

    Named formulas in Name Manager:

    words:

    =LOWER(TRIM(MID(SUBSTITUTE(Sheet1!$A$1," ",REPT("  ",LEN(Sheet1!$A$1))),(ROW(INDIRECT("1:"&LEN(Sheet1!$A$1)-LEN(SUBSTITUTE(Sheet1!$A$1," ",""))+1))-1)*LEN(Sheet1!$A$1)+1,LEN(Sheet1!$A$1))))

    Final Formula array entered.

    =SUM(IF(ISNUMBER(SEARCH("*a*e*i*o*u*",words)),IF(LEN(words)-LEN(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(words,"a",""),"e",""),"i",""),"o",""),"u",""))=5,1)))

    That returns 0.

    An F9 and Fx of the part cited shows an array of #VALUE! errors. I always assumed that ‘over-padding’ exceeded allowable character limits and that was the problem. Less padding of course results in ‘chopping’ words in the middle.

    I am mystified why and how these are working for everyone else.

  19. My posting of Bill Szysz’s final formula got “chopped” at posting. What I have in my workbook is copy / pasted from his posting.

  20. You’re correct, though you must be using an example string longer than that I give in the original challenge.

    In fact, that string must be greater than 649 characters in length, since then, this part:

    SUBSTITUTE($A$1," ",REPT(" ",LEN($A$1)))

    will evaluate to a string of more than 65,536 characters, which I believe must be an upper limit.

    Thanks a lot for pointing this out!

    Regards

  21. Ah, of course. Apologies. Since the REPT function is lending all the extra characters, it’s not so much the number of characters within the original string but rather the number of spaces (i.e. words) which is the consideration here, since each space is being replaced with a much larger string of spaces.

    Perhaps the string you are testing contains more spaces (words) than that which I gave?

    Regards

  22. Ah, wait. Are you in fact using the precise string as I posted? In that case, I’m not sure how to explain the errors you are getting.

    What version of Excel are you using?

    Regards

  23. Found the problem.

    Yes I was using the string that is in the download … version 2.

    When I dug further I found

    REPT(" ",LEN(A1))

    alone returned #VALUE!. (!!?!!)

    On a hunch I copied A1, pasted it into a new workbook and saved it as the original file name.

    Everything works fine now. Geesh!

    Thank you for following up with me on this one.

    I thought I was losing my mind.

    By-the-way. This is a really cool problem. Many hidden “gotchas”. Found an alternative to parse by padding … a bit convoluted but it works. Discovered some new things. So nothing lost but tooth enamel.

    I think I’ll go look at your solution now.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s