How do I get international Unicode characters from a form input field/servlet parameter into a string?
Author: Arthur Tang (http://www.jguru.com/guru/viewbio.jsp?EID=34155), Dec 13, 2002

myparam.getBytes("8859_1") will definitely corrupt double-byte strings if the page encoding is not one of the 8859_x (Latin) encodings.

I found that all of the big three browsers (IE, NS/M, O) have very poor support for sending encoding information with their requests, so you will not get the encoding from the request anyway.

The approach I took was to set one uniform encoding, such as utf-8, on all pages in the app, or at least on the page that submits the form (and hope that users do not change the browser's encoding between pages; fortunately most users do not know what it is and won't change it). The request will then arrive in the encoding you specified (utf-8). Most important, set the request character encoding with setCharacterEncoding() to that same encoding (utf-8) before you read any parameter with getParameter(). The submitted request is then interpreted as utf-8. Otherwise, getParameter() splits the double-byte characters into a string that cannot be interpreted again.
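
To illustrate why the charset has to be fixed before the parameter is read, here is a minimal sketch that uses only the JDK rather than a servlet container. The class name FormEncodingDemo and the helper decodeParam are hypothetical, and java.net.URLDecoder is only a stand-in for the container's own parameter parsing, so treat the behaviour as an approximation. In a real servlet the corresponding step would be calling request.setCharacterEncoding("UTF-8") before the first getParameter().

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

// Hypothetical demo class: not part of the servlet API.
final class FormEncodingDemo {

    // Decodes a percent-encoded form value using the given charset,
    // roughly what a container does once the request encoding is known.
    static String decodeParam(String rawValue, String charset) {
        try {
            return URLDecoder.decode(rawValue, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charset, e);
        }
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        String original = "\u65e5\u672c\u8a9e"; // three CJK characters

        // What a browser would submit from a page served as UTF-8.
        String submitted = URLEncoder.encode(original, "UTF-8");

        // Decoded with the encoding the page actually used: round-trips.
        String right = decodeParam(submitted, "UTF-8");

        // Decoded as ISO-8859-1: every byte becomes its own character.
        String wrong = decodeParam(submitted, "ISO-8859-1");

        System.out.println("submitted:        " + submitted);
        System.out.println("UTF-8 round-trip: " + right.equals(original));
        System.out.println("chars when read as ISO-8859-1: " + wrong.length());
    }
}
```

With the wrong charset each multi-byte character is split into several one-byte characters, which is the kind of split string described above.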