Why aren't Arabic combining modifier letters separately encoded?

April 26, 2017

1 Answer

The reasons for encoding the new letterforms as a unit, rather than encoding combining modifier forms separately, are historical and stem from the evolution of the Unicode Standard. While vowels, Koranic marks, and other pronunciation marks have been encoded as combining marks, the consonantal base letters have consistently been encoded in Unicode as a unit. Changing this practice would open the door to multiple representations for the same letters. The Unicode Standard provides a unique normalized representation for text, even when both precomposed and decomposed forms exist. This model is used for Latin and other scripts. However, to provide stability for the wide range of products that use Unicode, the normalized forms cannot change. For this reason, decomposed characters for Arabic cannot be added without creating duplicate representations, which would cause serious implementation problems, including security issues. Thus, the decision was made to keep the representation of Arabic base letterform
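
The contrast drawn in the answer can be illustrated with a small sketch using Python's standard `unicodedata` module. The helper name `describe` and the choice of sample characters (Latin "é" and ARABIC LETTER PEH, U+067E) are assumptions made for illustration and do not come from the answer itself.

```python
import unicodedata


def describe(text: str) -> dict:
    """Return the code points of the NFC and NFD forms of `text`."""
    return {
        form.lower(): [f"U+{ord(ch):04X}" for ch in unicodedata.normalize(form, text)]
        for form in ("NFC", "NFD")
    }


# Latin: a precomposed letter and its decomposed spelling (base + combining acute).
latin_precomposed = "\u00e9"   # é as a single code point
latin_decomposed = "e\u0301"   # e followed by COMBINING ACUTE ACCENT

# Arabic: PEH looks like BEH with different dots, but it is encoded as a unit.
arabic_peh = "\u067e"

# Both Latin spellings normalize to the same sequences, so they compare equal
# once normalized.
print(describe(latin_precomposed))
print(describe(latin_decomposed))

# PEH has no canonical decomposition, so normalization leaves it untouched.
print(repr(unicodedata.decomposition(arabic_peh)))
print(describe(arabic_peh))
```

For the Latin pair, normalization guarantees a single canonical spelling. For the Arabic letter there is only one spelling to begin with, which is the property the answer says would be lost if decomposed forms were added after the normalized forms had been frozen.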