Character encodings in HTML

[[HTML]] has been in use since [[1991]], but the first standardized version with a reasonably complete treatment of international characters was version 4.0, which was not published until 1997. Care is needed when creating HTML pages that contain characters outside the [[ASCII]] range, in order to meet two goals: preserving the integrity of the information stored in the HTML document, and having the document display properly in the largest possible variety of browsers.
 
== The Document Character Set ==
 
When HTML documents are served to the viewer, there are two ways to tell the browser what specific character encoding is used. First, [[HTTP]] headers can be sent by the server along with each page. A typical header looks like this:
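As a hedged illustration, assume a header value such as <code>Content-Type: text/html; charset=ISO-8859-1</code> (an example value chosen here, not one quoted from a particular server). The sketch below reads the declared encoding from such a value using Python's standard library; the helper name <code>declared_charset</code> is hypothetical.

```python
from email.message import Message


def declared_charset(content_type, default=None):
    """Return the charset parameter of a Content-Type value, lowercased.

    Returns `default` when the header carries no charset parameter.
    """
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset(failobj=default)


# Assumed example header value; a real server may declare a different charset.
print(declared_charset("text/html; charset=ISO-8859-1"))

# No charset parameter: nothing is declared, so the default is returned.
print(declared_charset("text/html"))
```

When the parameter is missing, the browser receives no encoding information from the header and has to rely on other hints or on its own defaults.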
Successful viewing of a page does not necessarily indicate that it is encoded correctly. If the creator of a page and the reader both assume the same machine-specific character set, and the server does not send any identifying information, then that reader will see the page as the creator intended, but other readers whose machines use a different native character set will not.
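The effect can be reproduced in a few lines. This is a minimal sketch that assumes, purely for illustration, an author working in Windows-1252 and a second reader whose machine defaults to Mac OS Roman; neither choice comes from the article.

```python
# Text as the author typed it, stored in the author's native set
# (assumed here: Windows-1252).
original = "caf\u00e9"
stored = original.encode("cp1252")

# A reader whose machine assumes the same set sees the intended text.
same_platform = stored.decode("cp1252")

# A reader whose machine assumes another set (assumed here: Mac OS Roman)
# decodes the same bytes without any error, but the accented letter
# comes out as a different character.
other_platform = stored.decode("mac_roman")

print(same_platform == original)
print(other_platform == original)
```

No exception is raised in either case, which is why the problem goes unnoticed by the author: the bytes are valid in both character sets and only their interpretation differs.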
 
== Character Entity References ==
 
In addition to native character encodings, characters can also be encoded as '''HTML entities''', using the encoding format derived from the use of character entities in [[SGML]].
 
Decimal and hexadecimal HTML entities can also be used, based on the [[Unicode]] numeric code for the character encoded. For example, &lambda; can also be represented as a decimal-coded entity as <code>&amp;#955;</code>.
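The relationship between the two numeric forms and the named form can be checked with a short sketch. The helper name <code>numeric_references</code> is hypothetical; <code>html.unescape</code> is from Python's standard library.

```python
import html


def numeric_references(ch):
    """Return the decimal and hexadecimal references for one character."""
    code = ord(ch)  # the Unicode code point of the character
    return f"&#{code};", f"&#x{code:X};"


# U+03BB is the Greek small letter lambda used in the example above.
decimal_ref, hex_ref = numeric_references("\u03bb")
print(decimal_ref, hex_ref)

# All three spellings decode to the same character.
decoded = {html.unescape(ref) for ref in (decimal_ref, hex_ref, "&lambda;")}
print(len(decoded) == 1)
```

Both numeric forms carry the same code point, 955 in decimal and 3BB in hexadecimal, so the choice between them is a matter of convenience.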
 
Note that unnecessary use of HTML character references may significantly reduce the readability of HTML. If the character encoding for a web page is chosen appropriately then HTML character references are usually only required for a few special characters. The characters '''&amp;''', '''&lt;''' and '''&gt;''' always need to be encoded, as noted above.
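One way to follow this advice, sketched here under the assumption that the page text is available as a Python string, is to escape only the markup-significant characters and let the encoder fall back to numeric references for characters the chosen encoding cannot represent. The helper name <code>to_html_bytes</code> is hypothetical.

```python
import html


def to_html_bytes(text, encoding):
    """Encode text for use as HTML character data in the given encoding."""
    # Escape &, < and > so they are not read as markup.
    escaped = html.escape(text, quote=False)
    # Characters the encoding cannot represent become decimal references;
    # everything else is stored natively and stays readable in the source.
    return escaped.encode(encoding, "xmlcharrefreplace")


sample = "\u03bb < \u00e9"  # lambda, less-than sign, e with acute accent

# ISO-8859-1 can store the accented e but not the lambda.
print(to_html_bytes(sample, "iso-8859-1"))

# UTF-8 can store both, so only the less-than sign is escaped.
print(to_html_bytes(sample, "utf-8"))
```

With an encoding that covers the page's characters, the output contains references only for the special characters, which keeps the HTML source readable.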
 
== External Links ==
* [http://www.html-sammlung.de html-sammlung.de] German-language page with tips for HTML beginners
* [http://www.html-collection.com html-collection.com] page with tips for HTML beginners