Unicode and HTML: Difference between revisions

 
(8 intermediate revisions by 7 users not shown)
Line 4:
{{essay-like|date=December 2011}}
{{refimprove|date=January 2011}}
{{Rewrite|date=July 2018}}
}}
{{SpecialChars}}
{{Html series}}
Web pages authored using '''HyperText Markup Language''' ([[HTML]]) may contain multilingual text represented with the '''Unicode universal character set'''. Central to the relationship between Unicode and HTML is the distinction between the "document character set", which defines the set of characters that may be present in an HTML document and assigns numbers to them, and the "external character encoding", or "charset", used to encode a given document as a sequence of bytes.
 
In RFC 1866, the initial HTML 2.0 standard, the document character set was defined as ISO-8859-1 (later HTML standards default to the [[Windows-1252]] encoding). The document character set was extended to [[ISO 10646]] (which is essentially equivalent to Unicode) by {{IETF RFC|2070}}. It does not vary between documents written in different languages or created on different platforms. The external character encoding, by contrast, is chosen by the author of the document (or by the software the author uses to create it) and determines how the bytes used to store or transmit the document map to characters from the document character set. Characters not present in the chosen external character encoding may be represented by character entity references.
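
The split between the document character set and the external encoding can be illustrated with a minimal Python sketch. It assumes Python's standard <code>xmlcharrefreplace</code> error handler, which substitutes a decimal numeric character reference (a mechanism closely related to the entity references mentioned above) for any character the target encoding cannot represent. The helper name <code>encode_for_charset</code> and the sample string are hypothetical, chosen only for illustration.

```python
import html


def encode_for_charset(text: str, charset: str) -> bytes:
    """Encode text in the given external encoding.

    Characters that the encoding cannot represent are written as decimal
    numeric character references (e.g. "&#937;"), which identify the
    character by its number in the document character set.
    """
    return text.encode(charset, errors="xmlcharrefreplace")


# Hypothetical sample: "é" exists in ISO-8859-1, the other characters do not.
sample = "café Ω 日本"

latin1_bytes = encode_for_charset(sample, "iso-8859-1")
utf8_bytes = encode_for_charset(sample, "utf-8")

# Decoding the bytes and resolving the references recovers the same
# characters, whichever external encoding was used.
from_latin1 = html.unescape(latin1_bytes.decode("iso-8859-1"))
from_utf8 = html.unescape(utf8_bytes.decode("utf-8"))

print(latin1_bytes)  # é is stored as one byte; the rest become references
print(from_latin1 == from_utf8 == sample)
```

Both byte sequences describe the same characters; only the external encoding differs. The sketch does not escape markup characters such as <code>&</code> or <code><</code>, which a real HTML serializer would also need to handle.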
Line 170 ⟶ 169:
 
==Frequency of usage==
According to internal data from [[Google]]'s web index, in December 2007 the [[UTF-8]] Unicode encoding became the most frequently used encoding on web pages, overtaking both [[ASCII]] (US) and [[ISO/IEC 8859-1|8859-1]]/[[Windows-1252|1252]] (Western European).<ref>{{Cite web |last=Davis |first=Mark |author-link=Mark Davis (Unicode) |title=Moving to Unicode 5.1 |url=https://googleblog.blogspot.com/2008/05/moving-to-unicode-51.html |date=5 May 2008 |access-date=2024-10-10 |website=Official Google Blog |language=en}}</ref>
 
==See also==
Line 198 ⟶ 197:
 
[[Category:HTML]]
[[Category:Unicode|HTML]]