XML-Defined Character Sets (XML in a Nutshell, 2nd Edition)

5.4. XML-Defined Character Sets

Many XML processors understand other legacy encodings. For instance, processors written in Java often understand all character sets available in a typical Java virtual machine. For a list, see http://java.sun.com/products/jdk/1.3/docs/guide/intl/encoding.doc.html. Furthermore, some processors may recognize aliases for these encodings; both Latin-1 and 8859_1 are sometimes used as synonyms for ISO-8859-1. However, using such nonstandard aliases limits your document's portability, since a processor that recognizes only the standard name may reject the document. We recommend that you use standard names for standard encodings. For encodings whose standard name isn't given by the XML 1.0 specification, use one of the names registered with the Internet Assigned Numbers Authority (IANA) and listed at ftp://ftp.isi.edu/in-notes/iana/assignments/character-sets. Even so, knowing the name of a character set and saving a file in that set does not mean that your XML parser can read such a file. XML parsers are required to support only UTF-8 and UTF-16; they are not required to support the hundreds of different legacy encodings used around the world.
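
Because only UTF-8 and UTF-16 are guaranteed to be readable everywhere, one practical response is to transcode a legacy-encoded document to UTF-8 before distributing it. The following Python sketch illustrates the idea. It is not part of any XML specification or library: the function name transcode_to_utf8 and the regular expression are hypothetical, and the sketch assumes that you already know the document's source encoding and that Python ships a codec for it.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical helper pattern: matches the encoding pseudo-attribute
# inside an XML declaration, e.g. encoding="ISO-8859-1".
_DECL_ENCODING = re.compile(
    r"""(<\?xml[^>]*?encoding\s*=\s*)(["'])[A-Za-z][A-Za-z0-9._-]*\2"""
)


def transcode_to_utf8(raw: bytes, source_encoding: str) -> bytes:
    """Decode a legacy-encoded XML document and re-encode it as UTF-8.

    Assumes the caller knows the real source encoding; this sketch does
    not try to detect it.
    """
    text = raw.decode(source_encoding)
    # Rewrite the declared encoding so it matches the new byte stream.
    text = _DECL_ENCODING.sub(r"\g<1>\g<2>UTF-8\g<2>", text, count=1)
    return text.encode("utf-8")


# A small document saved in ISO-8859-1, using the standard name.
legacy = (
    '<?xml version="1.0" encoding="ISO-8859-1"?><name>Jos\u00e9</name>'
).encode("iso-8859-1")

portable = transcode_to_utf8(legacy, "iso-8859-1")
root = ET.fromstring(portable)  # any conforming parser must accept UTF-8
print(root.text)
```

The declaration is rewritten along with the bytes because a document whose declared encoding disagrees with its actual encoding is in error. If you must keep a legacy encoding instead, declare it under its standard name rather than an alias.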