What is the performance difference between UTF-8 and UTF-16?
Most of the time, the throughput of the hard drive and RAM is the main performance constraint, so the size of the encoded text matters most. UTF-8 is 50% smaller than UTF-16 for US-ASCII (one byte per character instead of two), but 50% larger for East and South Asian scripts (three bytes instead of two). There is no size difference for Latin extensions, Greek, Cyrillic, Hebrew, and Arabic, which take two bytes per character in both encodings.

For processing Unicode data, UTF-16 is much easier to handle. Each character takes either one or two 16-bit units, whereas UTF-8 has four possible lengths. UTF-16 also has no illegal 16-bit unit values, while in UTF-8 you might want to check for illegal bytes. Incomplete character sequences in UTF-16 are less important and more benign.

If you want to quickly convert small strings between the different UTF encodings or get a UChar32 value, you can use the macros provided in utf.h and its siblings utf8.h and utf16.h. For larger or partial strings, please use the conversion API.
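To make the size and length claims concrete, here is a minimal sketch in plain C. The helper names `utf8_bytes`, `utf16_bytes`, and `utf16_next` are hypothetical and written for illustration only; they are not the macros from utf8.h or utf16.h, and they assume the input is a valid code point rather than checking for one.

```c
#include <stdint.h>

/* Hypothetical helper: bytes needed to encode one code point in UTF-8.
   Four possible lengths. Assumes cp is a valid Unicode code point. */
static int utf8_bytes(uint32_t cp) {
    if (cp < 0x80) return 1;      /* US-ASCII */
    if (cp < 0x800) return 2;     /* Greek, Cyrillic, Hebrew, Arabic, ... */
    if (cp < 0x10000) return 3;   /* rest of the BMP, e.g. CJK */
    return 4;                     /* supplementary planes */
}

/* Hypothetical helper: bytes needed in UTF-16.
   Only two possible lengths: one or two 16-bit units. */
static int utf16_bytes(uint32_t cp) {
    return cp < 0x10000 ? 2 : 4;
}

/* Hypothetical helper: read one code point from s (len > 0 units available).
   Returns the number of 16-bit units consumed. An unpaired surrogate is
   passed through as a single unit, which is one lenient choice among
   several; stricter code could report it as an error instead. */
static int utf16_next(const uint16_t *s, int len, uint32_t *cp) {
    uint16_t lead = s[0];
    if (lead >= 0xD800 && lead <= 0xDBFF && len > 1 &&
        s[1] >= 0xDC00 && s[1] <= 0xDFFF) {
        /* Combine lead and trail surrogate into a supplementary code point. */
        *cp = 0x10000 + ((uint32_t)(lead - 0xD800) << 10) + (uint32_t)(s[1] - 0xDC00);
        return 2;
    }
    *cp = lead;
    return 1;
}
```

For example, `utf8_bytes(0x41)` is 1 against `utf16_bytes(0x41)` of 2, while `utf8_bytes(0x4E2D)` is 3 against `utf16_bytes(0x4E2D)` of 2, which is where the two 50% figures come from. The decoder has a single branch because a UTF-16 character is either one unit or a surrogate pair.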