How do I represent extended characters in URIs (URNs/URLs)?
[This section needs to be updated] URIs (Uniform Resource Identifiers) can represent non-ASCII characters, but only in escaped form, because a URI itself is restricted to a subset of ASCII. The recommended way to encode such URIs is the following:
• Normalize the characters if needed.
• Encode the string in UTF-8.
• Replace each byte of the UTF-8 string whose value is greater than 127 (and any other byte considered unsafe under RFC 2396, which has since been superseded by RFC 3986) with its escaped form %HH, where HH is the hexadecimal value of that byte.
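The three steps above can be sketched in Python as follows. This is only an illustration under stated assumptions, not a reference implementation: the function name escape_uri_component is hypothetical, NFC is assumed as the normalization form (the text does not name one), and every byte outside the RFC 2396 "unreserved" set is treated as unsafe, which suits a single path segment or query value rather than a whole URI.

```python
import unicodedata

# RFC 2396 "unreserved" characters: ASCII letters, digits, and the marks
# - _ . ! ~ * ' ( )
# Assumption: everything else is escaped, including reserved delimiters
# such as "/" and "?", so this is meant for one URI component at a time.
UNRESERVED = frozenset(
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    b"abcdefghijklmnopqrstuvwxyz"
    b"0123456789"
    b"-_.!~*'()"
)


def escape_uri_component(text: str) -> str:
    """Hypothetical helper: normalize, encode as UTF-8, then %HH-escape."""
    # Step 1: normalize (NFC is an assumption made for this sketch).
    normalized = unicodedata.normalize("NFC", text)
    # Step 2: encode the string in UTF-8.
    utf8_bytes = normalized.encode("utf-8")
    # Step 3: keep unreserved ASCII bytes, escape all other bytes as %HH.
    return "".join(
        chr(byte) if byte in UNRESERVED else "%{:02X}".format(byte)
        for byte in utf8_bytes
    )


if __name__ == "__main__":
    # U+00E9 (e with acute accent) is the two bytes C3 A9 in UTF-8.
    print(escape_uri_component("r\u00e9sum\u00e9.html"))  # r%C3%A9sum%C3%A9.html
```

Because of the normalization step, the precomposed character U+00E9 and the sequence "e" followed by the combining accent U+0301 produce the same escaped output. In practice the standard library function urllib.parse.quote performs the UTF-8 encoding and escaping steps (with a somewhat different set of safe characters), so a hand-written loop like this one is mainly useful for seeing what happens at the byte level.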