I believe you should be able to use a regular expression like this:

r"([aeiou][bcdfghjklmnpqrstvwxz])+"

for matching vowel followed by consonant and:

r"([bcdfghjklmnpqrstvwxz][aeiou])+"

for matching consonant followed by vowel. For reference, the + is greedy: it matches the longest repetition of this pattern that it can find. For example, applying the first pattern to "ababab" would return the whole string, rather than single occurrences of "ab".
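As a quick check of that greedy behaviour, here is a minimal sketch using Python's `re` module. The name `VC_RUN` is just an illustrative choice, not something from the question.

```python
import re

# Vowel followed by consonant, repeated one or more times.
VC_RUN = re.compile(r"([aeiou][bcdfghjklmnpqrstvwxz])+")

m = VC_RUN.search("ababab")
print(m.group(0))  # the whole greedy run: ababab
print(m.group(1))  # a repeated group only keeps its last repetition: ab

# Gotcha: with a capturing group, findall returns the group, not the run.
print(VC_RUN.findall("ababab"))  # ['ab']
```

So if you want the full run, read `group(0)` (or make the group non-capturing with `(?:...)`) rather than relying on `findall`.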

If you want to match one or more vowels followed by one or more consonants it might look like this:

r"([aeiou]+[bcdfghjklmnpqrstvwxz]+)+"
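To see how the strict and the "one or more" patterns differ, a small sketch like the following may help. The names `STRICT` and `LOOSE` and the sample word are my own choices for illustration.

```python
import re

CONSONANTS = "bcdfghjklmnpqrstvwxz"  # 'y' is left out, as in the patterns above

STRICT = re.compile(rf"(?:[aeiou][{CONSONANTS}])+")   # exactly one V, one C
LOOSE = re.compile(rf"(?:[aeiou]+[{CONSONANTS}]+)+")  # runs of V, runs of C

word = "beautiful"
strict_run = STRICT.search(word).group(0)
loose_run = LOOSE.search(word).group(0)

print(strict_run)  # utiful   ("ea" and "au" are not vowel+consonant pairs)
print(loose_run)   # eautiful ("eau" counts as one run of vowels)
```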

Hope this helps.

                Store the length of the match when you find it. For future reference, you should try not to change the question like this after asking it.
– Katriel
                May 21, 2011 at 8:44
                @katrielalex - while unfortunate, it is very common for the initial answers to help clarify the issue for the poster, so the question will evolve and morph a bit in response. On parsing questions, I've had this back-and-forth take dozens of steps - see python-forum.org/pythonforum/…
– PaulMcG
                May 21, 2011 at 14:41
>>> import re
>>> consec_re = re.compile(r'^(([aeiou][^aeiou])+|([^aeiou][aeiou])+)$')
>>> consec_re.match('bale')
<_sre.SRE_Match object at 0x01DBD1D0>
>>> consec_re.match('bail')
                Doesn't seem to work when there are uneven vowels/consonants. consec_re.match('hiben') fails for instance.
– Josh Smeaton
                May 21, 2011 at 6:49
                -1: This matches any non-vowels rather than consonants. For example, consec_re.match('ba7e') returns a match.
– Blair
                May 21, 2011 at 6:59
                This doesn't actually work.  How do I find words that have the MOST consecutive vowel-consonant matching sequences?
– Parseltongue
                May 21, 2011 at 7:26
                @Blair - if you're expecting your input to contain non-words, then yes, it'd be better to hard code the consonants. If you're matching against individual word strings, however, then it would work fine.
– Amber
                May 21, 2011 at 16:16
                @Parseltongue: If you wanted to find the most, you should have stated that in your question...
– Amber
                May 21, 2011 at 16:17
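One possible way to address the two objections raised in the comments (odd-length words such as 'hiben', and non-letters such as the '7' in 'ba7e') is to hard-code the consonants and allow an optional leading consonant and trailing vowel. This is only a sketch, not the answerer's pattern; `ALTERNATING` and `CONSONANTS` are names I made up, and leaving 'y' out is an assumption.

```python
import re

CONSONANTS = "bcdfghjklmnpqrstvwxz"  # assumption: 'y' is not handled here

# Optional leading consonant, any number of vowel+consonant pairs,
# optional trailing vowel. Note that this also accepts the empty string.
ALTERNATING = re.compile(
    rf"^[{CONSONANTS}]?(?:[aeiou][{CONSONANTS}])*[aeiou]?$"
)

for word in ("bale", "bail", "hiben", "ba7e"):
    print(word, bool(ALTERNATING.match(word)))
# bale True
# bail False
# hiben True
# ba7e False
```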

If you map consonantal digraphs into single consonants, the longest such word is anatomicopathological as a 10*VC string.

If you correctly map y, then you get complete strings like acetylacetonates as 8*VC and hypocotyledonary as 8*CV.

If you don’t need the string to be whole, you get a 9*CV pattern in chemicomineralogical and a 9*VC pattern in overimaginativeness.

There are many 10* words if runs of consecutive consonants or vowels are allowed to alternate, as in (C+V+)+. These include laparocolpohysterotomy and ureterocystanastomosis.
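A rough way to count the groups in a (C+V+)+ word is sketched below. It assumes every y counts as a vowel, which happens to suit laparocolpohysterotomy but is not a general rule, and `TABLE` and `count_cv_groups` are hypothetical names.

```python
import re

# Assumption for this sketch: every 'y' is a vowel.
TABLE = str.maketrans(
    "aeiouy" + "bcdfghjklmnpqrstvwxz",
    "V" * 6 + "C" * 20,
)

def count_cv_groups(word: str) -> int:
    """Return the number of C+V+ groups if the whole word is (C+V+)+, else 0."""
    shape = word.lower().translate(TABLE)
    if not re.fullmatch(r"(?:C+V+)+", shape):
        return 0
    return len(re.findall(r"C+V+", shape))

print(count_cv_groups("laparocolpohysterotomy"))  # 10
```

A word that starts with a vowel, such as ureterocystanastomosis, would need the mirrored (V+C+)+ shape instead.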

The main trick is to first map all consonants to C and all vowels to V, then do a VC or CV match. For Y, you have to do lookaheads and/or lookbehinds to determine whether it maps to C or V in that position.
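In Python, the mapping step could be sketched with `str.translate`. To keep it short, this version makes the simplifying assumption that y is always a consonant and that digraphs are not collapsed; `CV_TABLE`, `cv_shape` and `alternation` are illustrative names only.

```python
import re

# Simplifying assumption: 'y' always maps to C, and digraphs stay as two C's.
CV_TABLE = str.maketrans(
    "aeiou" + "bcdfghjklmnpqrstvwxyz",
    "V" * 5 + "C" * 21,
)

def cv_shape(word: str) -> str:
    """Replace every vowel with V and every consonant with C."""
    return word.lower().translate(CV_TABLE)

def alternation(word: str):
    """Return 'VC' or 'CV' if the whole word alternates strictly, else None."""
    shape = cv_shape(word)
    if re.fullmatch(r"(?:VC)+", shape):
        return "VC"
    if re.fullmatch(r"(?:CV)+", shape):
        return "CV"
    return None

print(cv_shape("banana"))       # CVCVCV
print(alternation("banana"))    # CV
print(alternation("imitator"))  # VC
print(alternation("bail"))      # None
```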

I could show you the patterns I used, but you probably won’t be pleased with me. :) For example:

 (?<= \p{IsVowel} )     [yY] (?= \p{IsVowel} )  # counts as a C
 (?<= \p{IsConsonant} ) [yY]                    # counts as a V
                        [yY] (?= \p{IsVowel} )  # counts as a C
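Python's `re` has no `\p{IsVowel}`, but the same three rules can be approximated with plain character classes. The sketch below applies them in the order listed; the fallback for a y that none of the rules covers (for example the one in "day") is my own assumption, and `to_cv` is a hypothetical helper name.

```python
import re

VOWELS = "aeiou"
CONSONANTS = "bcdfghjklmnpqrstvwxz"

def to_cv(word: str) -> str:
    """Map a word to a string of C's and V's, resolving 'y' by context."""
    s = word.lower()
    # The replacements are upper case, so they never match the lower-case
    # classes used by the later lookarounds.
    s = re.sub(rf"(?<=[{VOWELS}])y(?=[{VOWELS}])", "C", s)  # vowel-y-vowel
    s = re.sub(rf"(?<=[{CONSONANTS}])y", "V", s)            # consonant-y
    s = re.sub(rf"y(?=[{VOWELS}])", "C", s)                 # y-vowel
    s = s.replace("y", "V")  # assumption: any remaining 'y' counts as a vowel
    s = re.sub(rf"[{VOWELS}]", "V", s)
    s = re.sub(rf"[{CONSONANTS}]", "C", s)
    return s

print(to_cv("acetylacetonates"))  # VCVCVCVCVCVCVCVC
print(to_cv("hypocotyledonary"))  # CVCVCVCVCVCVCVCV
print(to_cv("yoyo"))              # CVCV
```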

The main trick then becomes one of looking for overlapping matches of VC or CV alternations via

 (?= ( (?:  \p{IsVowel}       \p{IsConsonant} )+ ) )
 (?= ( (?:  \p{IsConsonant}   \p{IsVowel}     )+ ) )

Then you count all those up and see which ones are the longest.
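On a string that has already been reduced to C's and V's, the overlapping-match idea might look like this in Python. The lookahead consumes nothing, so `finditer` tries every start position; `VC_RUNS`, `CV_RUNS` and `longest_run` are names I picked for the sketch.

```python
import re

# Zero-width lookahead with a capturing group: one candidate per position.
VC_RUNS = re.compile(r"(?=((?:VC)+))")
CV_RUNS = re.compile(r"(?=((?:CV)+))")

def longest_run(shape: str, pattern: re.Pattern) -> int:
    """Return the largest number of consecutive pairs found in shape."""
    return max(
        (len(m.group(1)) // 2 for m in pattern.finditer(shape)),
        default=0,
    )

# Shape of "overimaginativeness": nine VC pairs plus a trailing C.
shape = "VC" * 9 + "C"
print(longest_run(shape, VC_RUNS))  # 9
print(longest_run(shape, CV_RUNS))  # 8
```

Running this over a word list and keeping the words with the highest count gives the "most consecutive" answer.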

However, since Python doesn’t (by default/directly) support properties in regexes the way I just used them in my own program, it becomes even more important to first preprocess the string into nothing but C’s and V’s. Otherwise your patterns look really ugly.
