Unicode and HTML: Difference between revisions

m Fixing bare references Wikipedia:Bare_URLs
Tags: AWB Reverted
m revert WP:CITEVAR violation
Line 60:
Many HTML documents are served with inaccurate encoding information, or with no encoding information at all. To determine the encoding in such cases, many browsers allow the user to manually select an encoding name from a list. They may also employ an encoding auto-detection algorithm that works in concert '''with''' the manual override or, ''in the case of the BOM and of HTML served as XML'', '''against''' it.
 
For HTML documents served as <code>text/html</code>, the manual override may apply to all documents, or only to those for which the encoding cannot be ascertained from declarations and/or byte patterns. The fact that the manual override is present and widely used hinders the adoption of accurate encoding declarations on the Web; therefore the problem is likely to persist. However, Internet Explorer, Chrome and Safari, for both the XML and the <code>text/html</code> serializations, do not permit the encoding to be overridden whenever the page includes the BOM.<ref>{{cite web| url = http://www.w3.org/Bugs/Public/show_bug.cgi?id=12897| title = Bug 12897 - In some parsers, UTF-8 BOM trumps the HTTP charset attribute (Encoding sniffing algorithm)}}</ref>
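
The precedence described above, in which a BOM outranks both the declared encoding and the manual override, can be sketched as follows. This is an illustration only, not any browser's actual algorithm: the function names <code>sniff_bom</code> and <code>choose_encoding</code>, the <code>windows-1252</code> fallback, and the choice to let the override apply to all BOM-less documents are assumptions made for the example.

```python
import codecs

# Only UTF-8 and UTF-16 BOMs are covered in this sketch; UTF-32 is omitted.
BOMS = [
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def sniff_bom(data: bytes):
    """Return the encoding named by a leading BOM, or None if there is none."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name
    return None


def choose_encoding(data, declared=None, user_override=None,
                    fallback="windows-1252"):
    """Pick an encoding for a text/html byte stream (hypothetical helper).

    Assumed precedence: BOM, then manual override, then declared
    encoding (HTTP header or meta element), then a fallback.
    """
    bom_encoding = sniff_bom(data)
    if bom_encoding is not None:
        # A BOM cannot be overridden in this model.
        return bom_encoding
    if user_override is not None:
        return user_override
    if declared is not None:
        return declared
    return fallback
```

With these assumptions, a page that starts with the UTF-8 BOM is treated as UTF-8 even when the user selects another encoding, while a page without a BOM follows the user's selection.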
 
For HTML documents served with the preferred XML media type, <code>application/xhtml+xml</code>, manual encoding override is not permitted. Overriding the encoding of such a document would mean that it stopped being XML, as it is a fatal error for an XML document to have an encoding declaration with detectable errors. Currently, Gecko browsers such as Firefox abide by this rule, whereas most other common browsers that support HTML as XML, such as WebKit browsers (Chrome/Safari),<ref>{{cite web| url = https://bugs.webkit.org/show_bug.cgi?id=66189| title = Bug 66189 - XML parser doesn't emit FATAL ERROR for all, detectable encoding errors}}</ref> do allow the encoding of XHTML documents to be manually overridden.
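
The fatal-error behaviour of a strict XML parser can be illustrated with Python's standard <code>xml.etree.ElementTree</code> module, used here as a stand-in for a browser's XML parser. The helper name <code>parse_xhtml</code> and the sample documents are hypothetical; the sketch only shows that a byte sequence contradicting the declared encoding is rejected rather than reinterpreted.

```python
import xml.etree.ElementTree as ET


def parse_xhtml(data: bytes):
    """Return the root element, or None if the parser reports a fatal error."""
    try:
        return ET.fromstring(data)
    except ET.ParseError:
        return None


# The byte 0xE9 is "é" in ISO-8859-1 but is not valid UTF-8 in this position.
body = b"<p>caf\xe9</p>"
mislabelled = b'<?xml version="1.0" encoding="UTF-8"?>' + body
labelled = b'<?xml version="1.0" encoding="ISO-8859-1"?>' + body

# The mislabelled document is rejected outright; the correctly labelled
# one parses normally.
rejected = parse_xhtml(mislabelled)
accepted = parse_xhtml(labelled)
```

A parser that honoured a manual override for the first document would be silently repairing an error that XML requires to be fatal.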
 
==Web browser support==