{{short description|relationship between Unicode characters and HTML}}
{{Rewrite|date=July 2018}}
{{Multiple issues|
{{primary sources|date=December 2011}}
{{essay-like|date=December 2011}}
{{refimprove|date=January 2011}}
}}
{{SpecialChars}}
{{Html series}}
Web pages authored using '''hypertext markup language''' ([[HTML]]) may contain multilingual text represented with the '''Unicode universal character set'''. Key to the relationship between Unicode and HTML is the relationship between the "document character set", which defines the set of characters that may be present in an HTML document and assigns numbers to them, and the "external character encoding" or "charset", which is used to encode a given document as a sequence of bytes.
The relationship between [[Unicode]] and HTML tends to be a difficult topic for computer professionals, document authors, and [[World Wide Web|web]] users alike. The accurate representation of text in [[web page]]s from different [[natural language]]s and [[writing system]]s is complicated by the details of [[character encoding]], [[markup language]] syntax, [[Computer font|fonts]], and varying levels of support by [[web browser]]s.
In RFC 1866, the initial HTML 2.0 standard, the document character set was defined as ISO-8859-1. It was later extended to [[Universal Coded Character Set|ISO 10646]] (which is basically equivalent to Unicode). The document character set does not vary between documents of different languages or created on different platforms. The external character encoding is chosen by the author of the document (or the software the author u
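The distinction between the document character set and the external character encoding can be illustrated with a minimal Python sketch. The sample character, the helper name <code>describe</code>, and the three encodings are arbitrary choices made for illustration and are not taken from any HTML specification. The character keeps a single number in the document character set, while the bytes that represent it depend on the encoding chosen.

```python
# Illustrative sketch: the character and the encodings below are arbitrary examples.
def describe(char: str, encodings: list[str]) -> dict[str, bytes]:
    """Return the byte sequence produced for one character under each encoding."""
    return {name: char.encode(name) for name in encodings}


char = "\u00e9"  # LATIN SMALL LETTER E WITH ACUTE
code_point = ord(char)  # number assigned by the document character set (Unicode)
encoded = describe(char, ["iso-8859-1", "utf-8", "utf-16-be"])

print(f"code point: U+{code_point:04X}")
for name, data in encoded.items():
    # Same character, different bytes depending on the external encoding.
    print(f"{name}: {data.hex()}")
```

The code point is the same in every case; only the byte sequence changes with the encoding.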
== HTML document characters ==