Can UTF-8 characters be invalid in XML?
From the Xerces documentation (http://xml.apache.org/xerces2-j/faq-common.html#faq-2): “There are many Unicode characters that are not allowed in an XML document, according to the XML spec. Typical disallowed characters are control characters, even if you escape them using the Character Reference form: &#xxxx;. See the XML spec, sections 2.2 and 4.1 for details. If the parser is generating this error, it is very likely that there is a character in the file that you cannot see. You can generally use a UNIX command like ‘od -hc’ to find it.”
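If od is not at hand, a short script can locate the offending characters. The following is a minimal sketch, assuming the document has already been decoded to a string and that the XML 1.0 Char production from section 2.2 of the spec is the rule being enforced. The function names is_xml10_char and find_invalid_xml_chars are hypothetical, chosen for illustration; they are not part of Xerces or any library.

```python
def is_xml10_char(cp: int) -> bool:
    """True if the code point matches the XML 1.0 Char production (spec section 2.2)."""
    return (
        cp in (0x9, 0xA, 0xD)          # tab, line feed, carriage return
        or 0x20 <= cp <= 0xD7FF        # excludes the other C0 control characters
        or 0xE000 <= cp <= 0xFFFD      # excludes surrogates, U+FFFE and U+FFFF
        or 0x10000 <= cp <= 0x10FFFF   # supplementary planes
    )


def find_invalid_xml_chars(text: str):
    """Return (offset, code point) for every character XML 1.0 does not allow."""
    return [
        (i, ord(ch))
        for i, ch in enumerate(text)
        if not is_xml10_char(ord(ch))
    ]


# Hypothetical sample containing a vertical tab (U+000B) and a NUL (U+0000).
sample = "<a>ok\x0bbad\x00</a>"
print(find_invalid_xml_chars(sample))  # [(5, 11), (9, 0)]
```

The offsets are character positions in the decoded string, not byte positions in the file, so they will differ from what od reports when the file contains multi-byte UTF-8 sequences.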