Revision as of 09:12, 24 March 2010 by Incnis Mrsi (→“W3C vs HTTP” referenced info was stealthy removed)
Revision as of 12:02, 12 October 2010 by SmackBot (m, →Previous Discussion: Subst: {{unsigned}} (& regularise templates))
This page states that a numeric character reference always refers to a Unicode code point. The W3C (http://www.w3.org/TR/html401/charset.html) states that a numeric character reference is a code point in the document's character set. This appears to be a contradiction, and this page appears to be wrong. If this page is in fact correct, then it may want to explain why the document's character set is Unicode. <small><span class="autosigned">Preceding [[Wikipedia:Signatures|unsigned]] comment added by [[User:71.141.135.56|71.141.135.56]] ([[User talk:71.141.135.56|talk]] • [[Special:Contributions/71.141.135.56|contribs]]) 23:08 UTC, 8 January 2007</span></small><!-- Template:Unsigned -->

: In HTML, the ''document character set'' is ''always'' the [[Universal Character Set]]: "HTML uses ... the Universal Character Set (UCS), defined in [ISO10646]. ... The character set defined in [ISO10646] is character-by-character equivalent to Unicode ([UNICODE])."[http://www.w3.org/TR/html401/charset.html]. However, many different ''encodings'' of the UCS (or of a subset of it) can be used: [[UTF-8]], [[UTF-16]], [[ISO-8859-1]], [[US-ASCII]], [[SHIFT_JIS]], and so on. Numeric character references always refer to the document character set, i.e., the UCS, regardless of which encoding the document is transmitted in. The distinction between character set and character encoding is a bit tricky, so you're right, it could be explained better in the article. [[User:Indefatigable|Indefatigable]] 21:39, 9 January 2007 (UTC)
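
To illustrate the distinction drawn in the reply above, here is a minimal sketch in Python. It assumes that the standard library's <code>html.unescape</code> is an acceptable stand-in for a browser's handling of numeric character references; the sample markup and the helper names <code>resolve</code> and <code>encodable</code> are made up for this example. The same ASCII-only markup is serialized in several encodings, and the references resolve to the same code points each time.

```python
import html

# Hypothetical sample markup. It contains only ASCII bytes, so every
# encoding below can carry it; the two numeric character references
# name the code points U+00E9 and U+20AC.
markup = "caf&#233; costs &#8364;5"

ENCODINGS = ("utf-8", "utf-16", "iso-8859-1", "us-ascii", "shift_jis")


def resolve(text: str, encoding: str) -> str:
    """Serialize the markup in `encoding`, decode it again, then
    resolve the numeric character references."""
    raw = text.encode(encoding)      # bytes as sent over the wire
    decoded = raw.decode(encoding)   # what a parser sees after decoding
    return html.unescape(decoded)    # references become characters


def encodable(char: str, encoding: str) -> bool:
    """Report whether `char` can be written directly in `encoding`."""
    try:
        char.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


for enc in ENCODINGS:
    # The result does not depend on the encoding used for transport.
    print(enc, [hex(ord(c)) for c in resolve(markup, enc) if ord(c) > 127])

# U+20AC has no byte in ISO-8859-1, yet &#8364; still refers to it there.
print(encodable("\u20ac", "iso-8859-1"), encodable("\u20ac", "utf-8"))
```

The point of the sketch is that the encoding only decides which characters can appear ''directly'' as bytes in the document; a numeric character reference bypasses that limit because it names a code point in the document character set.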

Talk:Character encodings in HTML: Difference between revisions