Why doesn't IDNA2008 (or, for that matter, IDNA2003 or UTS #46) restrict allowed domains on the basis of language?
It is extremely difficult to restrict on the basis of language, because the set of letters used in a particular language is not well defined. The “core” letters typically are, but many others are accepted in loan words and have perfectly legitimate commercial and social use. It is a bit easier to maintain a clear distinction based on script: every Unicode character has a defined script (or is Common/Inherited). Even so, using script as a restriction is problematic. Some languages, such as Japanese, require multiple scripts, and in most cases mixtures of scripts are harmless: one can have http://SONY日本.com with no problems at all. Conversely, there are many cases of “homographs” (visually confusable characters) within the same script, which a script-based restriction does not address. The rough consensus among the IETF IDNA working group is that script/language mixing restrictions are not appropriate for the lowest-level protocol. So in this respect,