How should Unicode be used under Linux?
Before UTF-8 emerged, Linux users all over the world had to use various language-specific extensions of ASCII. The most popular were ISO 8859-1 and ISO 8859-2 in Europe, ISO 8859-7 in Greece, KOI-8 / ISO 8859-5 / CP1251 in Russia, EUC and Shift-JIS in Japan, BIG5 in Taiwan, etc. This made the exchange of files difficult, and application software had to worry about various small differences between these encodings. Support for these encodings was usually incomplete, untested, and unsatisfactory, because the application developers rarely used all these encodings themselves.

Because of these difficulties, the major Linux distributors and application developers now foresee and hope that Unicode will eventually replace all these older legacy encodings, primarily in the UTF-8 form. UTF-8 will be used in

* text files (source code, HTML files, email messages, etc.)
* file names
* standard input and standard output, pipes
* environment variables
* cut and paste selection buffers
* telnet,