m Task 18 (cosmetic): eval 27 templates: hyphenate params (8×);
Line 6:
==Specifying the document's character encoding==
There are several ways to specify which character encoding is used in the document. First, the [[web server]] can include the character encoding or "<code>charset</code>" in the [[Hypertext Transfer Protocol]] (HTTP) <code>Content-Type</code> header, which would typically look like this:<ref>{{citation |url=http://tools.ietf.org/html/rfc7231#section-3.1.1.5|chapter=Content-Type |title=Hypertext Transfer Protocol (HTTP/1.1): Semantics and Content|publisher=[[IETF]] |date=June 2014 |
Content-Type: text/html; charset=ISO-8859-4
This method gives the HTTP server a convenient way to alter a document's encoding according to [[content negotiation]]; certain HTTP server software can do this, for example Apache with the [[List of Apache modules|module]] <code>mod_charset_lite</code>.<ref>[http://httpd.apache.org/docs/2.0/en/mod/mod_charset_lite.html Apache Module mod_charset_lite]</ref>
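As an illustration of how a client might read the <code>charset</code> parameter from such a header, the following is a minimal sketch in Python. It relies on the standard library's <code>email.message.Message</code> class to parse the parameter; the function name <code>charset_from_content_type</code> and its <code>default</code> argument are hypothetical names chosen for this example and are not part of any specification.

```python
from email.message import Message


def charset_from_content_type(header_value, default=None):
    """Return the lowercased charset parameter of a Content-Type value.

    `charset_from_content_type` and `default` are illustrative names
    (assumptions of this sketch), not part of HTTP or HTML.
    """
    msg = Message()
    # Reuse the MIME header parser from the standard library.
    msg["Content-Type"] = header_value
    charset = msg.get_content_charset()
    # Fall back to the caller's default when no charset parameter is present.
    return charset if charset is not None else default


print(charset_from_content_type("text/html; charset=ISO-8859-4"))
print(charset_from_content_type("text/html", default="unknown"))
```

The sketch returns the name in lowercase because <code>get_content_charset()</code> normalizes it; a real client would typically go on to map the name to a decoder and apply its own fallback rules when the parameter is missing.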
Line 16:
</syntaxhighlight>
[[HTML5]] also allows the following syntax to mean exactly the same:<ref name=html5charset>{{citation |url=http://www.w3.org/TR/html5/document-metadata.html#specifying-the-documents-character-encoding |chapter=Specifying the document's character encoding |title=HTML5 |publisher=[[World Wide Web Consortium]] |date=14 December 2017 |
<!-- Please don't add a closing "/": that is unnecessary here. -->
<syntaxhighlight lang="html4strict">
Line 22:
</syntaxhighlight>
[[XHTML]] documents have a third option: to express the character encoding via an [[XML]] declaration, as follows:<ref>{{citation |url=http://www.w3.org/TR/REC-xml/#sec-prolog-dtd |chapter=Prolog and Document Type Declaration |title=XML |first1=T. |last1=Bray |
<syntaxhighlight lang="xml">
<?xml version="1.0" encoding="ISO-8859-1"?>
Line 133:
===XML character references===
Unlike traditional HTML, with its large range of character entity references, [[XML]] has only five predefined character entity references. These are used to escape characters that are markup-sensitive in certain contexts:<ref>{{citation |url=http://www.w3.org/TR/REC-xml/#sec-references |chapter=Character and Entity References |title=XML |first1=T. |last1=Bray |
*<code>&amp;</code> → & ([[ampersand]], U+0026)
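The escaping described above can be sketched in Python with the standard library's <code>xml.sax.saxutils</code> module. By default its <code>escape()</code> function handles <code>&amp;</code>, <code>&lt;</code> and <code>&gt;</code>; the two quote characters are passed in explicitly here. The helper names <code>escape_xml</code> and <code>unescape_xml</code> are hypothetical and exist only for this example.

```python
from xml.sax.saxutils import escape, unescape

# The two quote characters are not escaped by default, so map them explicitly.
_QUOTE_ENTITIES = {'"': "&quot;", "'": "&apos;"}
_QUOTE_CHARS = {"&quot;": '"', "&apos;": "'"}


def escape_xml(text):
    """Replace the five markup-sensitive characters with predefined entities."""
    # escape() replaces "&" first, so the entities it inserts are not re-escaped.
    return escape(text, _QUOTE_ENTITIES)


def unescape_xml(text):
    """Reverse escape_xml() for the five predefined entities only."""
    return unescape(text, _QUOTE_CHARS)


sample = "Tom & Jerry's <b>"
escaped = escape_xml(sample)
print(escaped)  # Tom &amp; Jerry&apos;s &lt;b&gt;
print(unescape_xml(escaped) == sample)  # True
```

This sketch covers only the five predefined references; numeric character references and entities declared in a DTD would need a full XML parser.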