No edit summary |
m "normal ASCII" is POV; s/European users/other languages/; character references are not entities (strictly speaking) |
Line 2:
[[zh:HTML的字符编码]]
[[HTML]] has been in use since [[1991]], but the first standardized version with a reasonably complete treatment of international characters was version 4.0, not published until 1997. Considerable care must be exercised when creating HTML pages with special characters outside the range of
== The document character set ==
When HTML documents are served to the viewer, there are two ways to tell the browser what specific character encoding is used. First, [[HTTP]] headers can be sent by the server along with each page. A typical header looks like this:
Line 12 ⟶ 11:
</code></blockquote>
The other method is for the HTML document to include this information at its top, inside the <code>HEAD</code> element.
<blockquote><code>
Line 20 ⟶ 19:
Either method advises the receiver that the file being sent uses the specified character encoding. Sending incorrect information is a serious error, because the browser will then decode the page with the wrong encoding. A server on which multiple users place files created on different machines cannot promise that every file it sends will conform, since those machines may use different character sets. For this reason, many servers simply do not send the information at all, to avoid making any false promises.
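The two declaration methods described above can be illustrated with a minimal Python sketch. It is not how any particular browser works; the function names, the sample header value <code>text/html; charset=ISO-8859-1</code> and the sample <code>META</code> markup are assumptions made for illustration. The sketch gives the HTTP header priority over the in-document declaration, which is the order the HTML 4 specification describes.

```python
import re
from email.message import Message

# Hypothetical pattern for a charset declared inside a META element.
META_RE = re.compile(
    rb"""<meta[^>]+charset\s*=\s*["']?([A-Za-z0-9_.:-]+)""",
    re.IGNORECASE,
)


def charset_from_header(content_type):
    """Return the charset parameter of a Content-Type value, or None."""
    if not content_type:
        return None
    msg = Message()
    msg["Content-Type"] = content_type
    # get_content_charset() returns the parameter lowercased, or None.
    return msg.get_content_charset()


def charset_from_meta(html_bytes):
    """Return a charset declared near the top of the document, or None."""
    match = META_RE.search(html_bytes[:1024])
    return match.group(1).decode("ascii").lower() if match else None


def declared_charset(content_type, html_bytes):
    """Prefer the HTTP header; fall back to the META declaration."""
    return charset_from_header(content_type) or charset_from_meta(html_bytes)


page = (b'<html><head><meta http-equiv="Content-Type" '
        b'content="text/html; charset=utf-8"></head></html>')
print(declared_charset("text/html; charset=ISO-8859-1", page))
print(declared_charset("text/html", page))
print(declared_charset("text/html", b"<html></html>"))
```

When neither source declares an encoding, <code>declared_charset</code> returns <code>None</code>, which corresponds to the case where the receiver has to guess.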
Browsers receiving a file with no character set information must make a blind assumption. The safest is probably to assume [[ISO 8859-1]], but it is also common for browsers to assume the character set native to the machine on which they are running. The consequence of choosing incorrectly is that characters outside the printable ASCII range (32 to 126) may appear incorrectly. This presents few problems for English-speaking users, but
For maximum compatibility, it is increasingly common for multilingual websites to use the [[UTF-8]] encoding of the [[ISO 10646]]/[[Unicode]] character set, which provides a superset of almost all existing character sets.
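The effect of a wrong guess can be shown with a short Python sketch. The sample string and the helper name <code>decode_as</code> are assumptions chosen for illustration: the text is encoded as UTF-8 and then decoded once with the correct encoding and once as ISO 8859-1.

```python
def decode_as(data, assumed_encoding):
    """Decode bytes the way a receiver would after assuming an encoding."""
    return data.decode(assumed_encoding, errors="replace")


sample = "caf\u00e9 \u03bb"            # "café λ"
utf8_bytes = sample.encode("utf-8")     # what the server actually sends

right = decode_as(utf8_bytes, "utf-8")
wrong = decode_as(utf8_bytes, "iso-8859-1")

print(right)   # the original text
print(wrong)   # each non-ASCII character becomes two unrelated characters

# Bytes in the printable ASCII range decode identically either way.
print(decode_as(b"plain", "utf-8") == decode_as(b"plain", "iso-8859-1"))
```

Only the characters outside the ASCII range are damaged, which is why pages written purely in English often appear unaffected by a missing or wrong declaration.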
It is important to point out that successful viewing of a page is not
== Character Entity References ==
In addition to native character encodings, characters can also be encoded as '''HTML entities''', using the encoding format derived from the use of
Many symbolic character entities have been defined. For example, the character 'λ' can be encoded as <code>&lambda;</code>. This use of the '&' character as an [[escape
Decimal and hexadecimal HTML
Note that unnecessary use of HTML character references may significantly reduce the readability of the HTML source. If the character encoding for a web page is chosen appropriately, HTML character references are usually required only for a few special characters. The characters '''&''', '''<''' and '''>''' always need to be encoded, as noted above.
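As a rough illustration of the forms discussed in this section, the following Python sketch uses the standard library <code>html</code> module. The helper names <code>escape_markup</code> and <code>to_numeric_references</code> are hypothetical; the first escapes only the markup-significant characters, and the second replaces every non-ASCII character with a decimal character reference.

```python
import html


def escape_markup(text):
    """Escape only '&', '<' and '>' so the text is safe inside HTML."""
    return html.escape(text, quote=False)


def to_numeric_references(text):
    """Replace each non-ASCII character with a decimal character reference."""
    return text.encode("ascii", errors="xmlcharrefreplace").decode("ascii")


print(escape_markup("a < b & c > d"))
print(to_numeric_references("\u03bb"))            # U+03BB is 955 in decimal

# Named, decimal and hexadecimal forms all resolve to the same character.
print(html.unescape("&lambda; &#955; &#x3BB;"))
```

Escaping everything with <code>to_numeric_references</code> keeps a page valid under any ASCII-compatible encoding, at the cost of the readability mentioned above.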
== External
* [http://www.html-collection.com HTML Beginner Page] A page for HTML beginners. [http://www.html-sammlung.de (German page)]