After reading the Irreal post on this subject and Jeremy Friesen’s, I wondered whether it was possible to remove combining diacritics from a string in Elisp in a general way. It turns out it is, and here’s the code:
(defun my/remove-diacritics (in-string)
  "Return a string like IN-STRING but without diacritics."
  (concat (mapcar (lambda (char)
                    (car (get-char-code-property char 'decomposition)))
                  in-string)))
Using get-char-code-property to get the decomposition property gives a list of characters, the first of which should be the letter the original character was formed from, e.g.
ELISP> (get-char-code-property ?\ẽ 'decomposition)
(101 771)
101, or 0x65, is “LATIN SMALL LETTER E” in Unicode and 771, or 0x303, is “COMBINING TILDE”.
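To get a feel for what the property holds for other kinds of character, here are a few more lookups you could evaluate. This is only an illustration; the three characters are my own picks, and the results in the comments follow the behaviour described in the Emacs Lisp manual for the decomposition property.

```elisp
;; A plain letter has no decomposition, so the list holds the
;; character itself and taking the car leaves it unchanged.
(get-char-code-property ?a 'decomposition)   ; (97)

;; A precomposed letter: the base letter, then the combining mark.
(get-char-code-property ?é 'decomposition)   ; (101 769)

;; A compatibility decomposition starts with a tag symbol,
;; so the first element here is not a character at all.
(get-char-code-property ?² 'decomposition)   ; (super 50)
```

The last case is worth keeping in mind: for a character like that, the first element of the list is a symbol rather than a letter.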
I don’t know if this works in all cases, but it does mean you don’t need to build a string or list of every character you want to convert along with its replacement.
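For the cases where a single decomposition lookup might not be enough, for example a letter carrying two marks, one alternative is to let Emacs do a full Unicode decomposition and then drop the marks. The sketch below is a different technique from the function above, not a fix to it: my/remove-diacritics-nfd is a hypothetical name, and it assumes that "diacritic" means any character whose general category is Mn (nonspacing mark). It relies on ucs-normalize-NFD-string and ucs-normalize-NFC-string from the ucs-normalize library that ships with Emacs.

```elisp
(require 'seq)
(require 'ucs-normalize)

(defun my/remove-diacritics-nfd (in-string)
  "Return a string like IN-STRING but without nonspacing combining marks.
Hypothetical variant for illustration, not from the original post."
  ;; Recompose with NFC at the end, so that anything NFD split apart
  ;; that was not a diacritic is put back together.
  (ucs-normalize-NFC-string
   (concat
    ;; Drop every character whose general category is Mn
    ;; (nonspacing mark), which is where the combining accents live.
    (seq-remove (lambda (char)
                  (eq (get-char-code-property char 'general-category) 'Mn))
                ;; NFD decomposes precomposed characters fully, including
                ;; letters that carry more than one mark.
                (ucs-normalize-NFD-string in-string)))))

(my/remove-diacritics-nfd "déjà vu")   ; "deja vu"
```

It still needs no hand-written table of conversions, which was the appeal of the original approach. I haven’t tried it on a wide range of scripts, so treat it as a starting point.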