How should the UTF-8 mode be activated?
If your application is soft converted and does not use the standard locale-dependent C multibyte routines (mbsrtowcs(), wcsrtombs(), etc.) to convert everything into wchar_t for processing, then it may have to find out in some way whether the text data it handles is in some 8-bit encoding (like ISO 8859-1, where 1 byte = 1 character) or in UTF-8. Once everyone uses only UTF-8, you can just make it the default, but until then both the classical 8-bit sets and UTF-8 may still have to be supported. The first wave of applications with UTF-8 support used a whole lot of different command line switches to activate their respective UTF-8 modes, for instance the famous xterm -u8. That turned out to be a very bad idea: having to remember a special command line option or other configuration mechanism for every application is very tedious, which is why command line options are not the proper way of activating a UTF-8 mode. The proper way to activate UTF-8 is the POSIX
If your application supports both some 8-bit character set (ISO 8859-*, KOI-8, etc.) and UTF-8, then it has to find out in some way whether it is supposed to use the UTF-8 mode or not. Hopefully, in a few years everyone will only be using UTF-8 and you can just make it the default, but until then both the classical 8-bit sets and UTF-8 have to be supported.

Current applications use a whole lot of different command line switches and other mechanisms to activate their respective UTF-8 modes, for instance:

• xterm command line option “-u8” and X resource “XTerm*utf8: 1”
• gnat/gcc command line option “-gnatW8”
• stty command line option “iutf8”
• mined command line option “-U”
• xemacs elisp package to convert between UTF-8 and the internally used MULE encoding
• vim ‘fileencoding’ option
• less environment variable LESSCHARSET=utf-8

Having to remember a special command line option or other configuration mechanism for every application is very tedious, so some standardization is urgently required here. If you