Incnis Mrsi (talk | contribs) |
m →Previous Discussion: Subst: {{unsigned}} (& regularise templates) |
Line 13:
This page states that a numeric entity reference ''always'' refers to a Unicode code point. The W3C (http://www.w3.org/TR/html401/charset.html) states that a numeric entity reference is a code point in the document's character set. The two statements appear to contradict each other, which suggests this page is wrong. If the page is in fact correct, it should explain why the document's character set is Unicode.
: In HTML, the ''document character set'' is ''always'' the [[Universal Character Set]]: "HTML uses ... the Universal Character Set (UCS), defined in [ISO10646]. ... The character set defined in [ISO10646] is character-by-character equivalent to Unicode ([UNICODE])." [http://www.w3.org/TR/html401/charset.html] However, the document can be serialised in many different ''encodings'': [[UTF-8]], [[UTF-16]], [[ISO-8859-1]], [[US-ASCII]], [[SHIFT_JIS]], and so on, some of which can directly represent only a subset of the UCS. Numeric character references always refer to the document character set, i.e., the UCS, regardless of which encoding is used. The distinction between character set and character encoding is a bit tricky, so you're right, it could be explained better in the article. [[User:Indefatigable|Indefatigable]] 21:39, 9 January 2007 (UTC)
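The distinction between the document character set and the character encoding can be illustrated with a minimal Python sketch. It assumes Python's standard <code>html.unescape</code> as a stand-in for an HTML parser's handling of numeric character references; the names <code>MARKUP</code>, <code>ENCODINGS</code> and <code>resolve</code> are hypothetical and chosen only for this example. The same markup is serialised in three encodings: the bytes used for the literal letter differ, but the references <code>&amp;#1046;</code> and <code>&amp;#x416;</code> resolve to the same character, U+0416, each time.

```python
import html

# The literal letter plus two numeric character references to the same
# code point (decimal 1046 == hexadecimal 416 == U+0416).
MARKUP = "Ж = &#1046; = &#x416;"

# Three encodings that can all represent the literal letter directly.
ENCODINGS = ("utf-8", "koi8_r", "iso8859_5")


def resolve(markup, encoding):
    """Serialise the markup in `encoding`, then decode it and resolve references.

    Returns the raw bytes (which depend on the encoding) and the resolved
    text (which should not).
    """
    raw = markup.encode(encoding)
    text = html.unescape(raw.decode(encoding))
    return raw, text


if __name__ == "__main__":
    for enc in ENCODINGS:
        raw, text = resolve(MARKUP, enc)
        print(enc, raw, text)
```

If a reference were interpreted in the encoding rather than in the document character set, <code>&amp;#1046;</code> would have no meaning in a single-byte encoding such as KOI8-R, whose byte values stop at 255.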