How are label delimiters handled in implementations of IDNA?

April 26, 2017
Tags: delimiters, handled, idna, implementations, Label

1 Answer

The processing in UTS #46 matches what browsers commonly do with label delimiters: characters that contain periods are converted to their NFKC form before the labels are separated. This allows the domain name to be mapped in a single pass, rather than label by label.

However, except for the four label separators provided by IDNA2003 (U+002E FULL STOP, U+3002 IDEOGRAPHIC FULL STOP, U+FF0E FULLWIDTH FULL STOP, and U+FF61 HALFWIDTH IDEOGRAPHIC FULL STOP), all input characters that would map to a period are disallowed. For example, U+2488 ( ⒈ ) DIGIT ONE FULL STOP has a decomposition that contains a period, and is thus disallowed.

The exact list of characters can be seen with the Unicode utilities using a regular expression: http://unicode.org/cldr/utility/list-unicodeset.jsp?a=\p{toNFKC=/\./}

The question also arises as to how to handle escaped periods (such as %2E).
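To make the rule about period-like characters concrete, here is a minimal Python sketch using only the standard `unicodedata` module. It is not a UTS #46 implementation: the helper names `maps_to_period` and `split_labels` are hypothetical, and the sketch only illustrates the single-pass idea (map the four IDNA2003 separators to a period, reject any other character whose NFKC form contains a period, then split).

```python
import unicodedata

# The four label separators recognized by IDNA2003.
IDNA2003_SEPARATORS = {"\u002E", "\u3002", "\uFF0E", "\uFF61"}


def maps_to_period(ch):
    """Return True if the NFKC form of ch contains a FULL STOP (U+002E)."""
    return "." in unicodedata.normalize("NFKC", ch)


def split_labels(domain):
    """Illustrative single-pass split (hypothetical helper, not UTS #46 itself).

    Separators become '.', any other character that would map to a
    period raises ValueError, and everything else is kept unchanged.
    """
    mapped = []
    for ch in domain:
        if ch in IDNA2003_SEPARATORS:
            mapped.append(".")
        elif maps_to_period(ch):
            raise ValueError(f"disallowed character U+{ord(ch):04X}")
        else:
            mapped.append(ch)
    return "".join(mapped).split(".")
```

With this sketch, `split_labels("example\u3002com")` yields the labels `example` and `com`, while an input containing U+2488 raises `ValueError`, because `unicodedata.normalize("NFKC", "\u2488")` is the two-character string `1.`.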