One feature that seems to be missing from the re module (and from every other text-searching tool I know of) is "diacritic-insensitive search". I would like to get a match for something like this:
re.match("franc", "français")
in much the same way that we can have a case-insensitive search:
re.match("(?i)fran", "Français").
A related and more general problem (more general in the sense that its solution could easily be used to solve the first problem) is to translate a string by removing every diacritical mark:
nodiac("Français") -> "Francais"
The algorithm for such a function is trivial, but there are many marks that can be put on a letter. It would require a list of every "a" with something on it ("à", "á", "ã", etc.), and the same for every other letter. Building such a list by hand would be tedious and would inevitably leave some symbols out.
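One possible way to avoid building that list by hand is to lean on the Unicode database that ships with Python. The sketch below is only an illustration and assumes Python 3 strings: it decomposes each character with unicodedata.normalize("NFD", ...) and drops the combining marks. The helper match_nodiac is a hypothetical name introduced here to show how nodiac could serve the first problem; it is not part of the re module.

```python
import re
import unicodedata


def nodiac(text):
    """Return text with its combining diacritical marks removed."""
    # NFD splits each precomposed letter into a base letter followed by
    # its combining marks, e.g. "ç" becomes "c" + U+0327.
    decomposed = unicodedata.normalize("NFD", text)
    # unicodedata.combining() returns a non-zero class for combining marks,
    # so keeping only the characters with class 0 drops the marks.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def match_nodiac(pattern, text, flags=0):
    """Hypothetical helper: re.match on the mark-free pattern and text."""
    # The match object refers to the stripped text, not the original.
    return re.match(nodiac(pattern), nodiac(text), flags)


print(nodiac("Français"))
print(match_nodiac("franc", "français"))
print(match_nodiac("fran", "Français", re.IGNORECASE))
```

Two caveats with this sketch: letters such as "ø" or "ł" have no Unicode decomposition and pass through unchanged, and the positions reported by the match object refer to the stripped string rather than to the original text.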