Talk:Character encodings in HTML: Difference between revisions

 
: In HTML, the ''document character set'' is ''always'' the [[Universal Character Set]]: "HTML uses ... the Universal Character Set (UCS), defined in [ISO10646]. ... The character set defined in [ISO10646] is character-by-character equivalent to Unicode ([UNICODE])."[http://www.w3.org/TR/html401/charset.html]. However, many different ''encodings'' of the UCS can be used: [[UTF-8]], [[UTF-16]], [[ISO-8859-1]], [[US-ASCII]], [[SHIFT_JIS]], and so on. Numeric character references always refer to the document character set, i.e., the UCS. The distinction between character set and character encoding is a bit tricky, so you're right, it could be explained better in the article. [[User:Indefatigable|Indefatigable]] 21:39, 9 January 2007 (UTC)
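The point about numeric character references can be illustrated with a small sketch. It is only an illustration using Python's standard library, not anything taken from the HTML specification, and the variable names are chosen purely for this example. It shows that a reference such as <code>&amp;#233;</code> names a UCS code point, while the bytes used to store the character itself depend on the chosen encoding.

```python
import html

# A numeric character reference names a UCS code point, not a byte value.
ref = "&#233;"                 # decimal reference to U+00E9
char = html.unescape(ref)      # resolve the reference to the actual character

# The same character serializes to different bytes under different encodings.
encoded = {name: char.encode(name) for name in ("utf-8", "utf-16-be", "iso-8859-1")}

# The reference itself consists only of ASCII characters, so its bytes are
# identical in every ASCII-compatible encoding.
ref_bytes = {name: ref.encode(name) for name in ("utf-8", "iso-8859-1", "us-ascii")}

# A character that the target encoding cannot represent can still be written
# as a reference; Python's "xmlcharrefreplace" handler does exactly that.
ascii_fallback = char.encode("ascii", "xmlcharrefreplace")

for name, data in encoded.items():
    print(f"{name:12} {data!r}")
print("as reference:", ascii_fallback)
```

Whatever encoding the document is transmitted in, the reference resolves to the same code point, which is why it is tied to the document character set rather than to the encoding.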
 
== “W3C vs HTTP” referenced info was stealthily removed ==
 
Let us discuss an edit [http://en.wikipedia.org/w/index.php?title=Character_encodings_in_HTML&diff=348594877] by user Ms2ger. Because he falsely applied the '''m''' label (for which I gave him [[user talk:Ms2ger#Bold m-edit in Character encodings in HTML|a formal warning]]), this controversial edit attracted no attention. But a crucially important reference [http://www.w3.org/TR/html4/charset.html#h-5.2.2 to the W3C], which proves its dissatisfaction with HTTP/1.1 charset detection, was removed without anything to replace it. Should we restore that piece of text, or should we rewrite the whole article from scratch for the third time? [[User:Incnis Mrsi|Incnis Mrsi]] ([[User talk:Incnis Mrsi|talk]]) 11:37, 22 March 2010 (UTC)