Citation bot: Add: s2cid, doi, chapter-url, authors 1-2. editors 1-2. Removed or converted URL. | Suggested by Whoop whoop pull up | #UCB_webform 2402/3621
Line 9:
There are two general ways to specify which character encoding is used in the document.
First, the [[web server]] can include the character encoding or "<code>charset</code>" in the [[Hypertext Transfer Protocol]] (HTTP) <code>Content-Type</code> header, which would typically look like this:<ref>{{citation |chapter-url=http://tools.ietf.org/html/rfc7231#section-3.1.1.5|chapter=Content-Type |title=Hypertext Transfer Protocol (HTTP/1.1): Semantics and Content|publisher=[[IETF]] |date=June 2014 |doi=10.17487/RFC7231 |access-date=2014-07-30|editor-last1=Fielding |editor-last2=Reschke |editor-first1=R |editor-first2=J |last1=Fielding |first1=R. |last2=Reschke |first2=J. |s2cid=14399078 }}</ref>
Content-Type: text/html; charset=utf-8
This method gives the HTTP server a convenient way to alter a document's encoding according to [[content negotiation]]; some HTTP server software can do this, for example Apache with the [[List of Apache modules|module]] <code>mod_charset_lite</code>.<ref>{{cite web| url = http://httpd.apache.org/docs/2.0/en/mod/mod_charset_lite.html| title = Apache Module mod_charset_lite}}</ref>
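As an illustration of how a client might read the encoding declared this way, the following minimal sketch extracts the <code>charset</code> parameter from a <code>Content-Type</code> header value using Python's standard <code>email.message</code> module. The function name <code>charset_from_content_type</code> and its <code>default</code> parameter are hypothetical names chosen for this example; real browsers apply additional fallback rules that are not shown here.

```python
from email.message import Message


def charset_from_content_type(header_value, default=None):
    """Return the lowercased charset parameter of a Content-Type value.

    Hypothetical helper for illustration only. Returns `default` when
    the header carries no charset parameter.
    """
    msg = Message()
    msg["Content-Type"] = header_value
    # get_content_charset() handles quoting and case-insensitive
    # parameter names, and lowercases the result.
    charset = msg.get_content_charset()
    return charset if charset is not None else default


if __name__ == "__main__":
    print(charset_from_content_type("text/html; charset=utf-8"))
    print(charset_from_content_type("text/html", default="unknown"))
```

When the header omits the parameter, the caller has to fall back to another source of encoding information, such as a declaration inside the document itself.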
Line 21:
</syntaxhighlight>
[[HTML5]] also allows the following syntax to mean exactly the same:<ref name=html5charset>{{citation |chapter-url=http://www.w3.org/TR/html5/document-metadata.html#specifying-the-documents-character-encoding |chapter=Specifying the document's character encoding |title=HTML5 |publisher=[[World Wide Web Consortium]] |date=14 December 2017 |access-date=2018-05-28}}</ref>
<!-- Please don't add a closing "/": that is unnecessary here. -->
<syntaxhighlight lang="html4strict">
Line 27:
</syntaxhighlight>
[[XHTML]] documents have a third option: to express the character encoding via [[XML]] declaration, as follows:<ref>{{citation |chapter-url=http://www.w3.org/TR/REC-xml/#sec-prolog-dtd |chapter=Prolog and Document Type Declaration |title=XML |first1=T. |last1=Bray |author-link1=Tim Bray |first2=J. |last2=Paoli |first3=C. |last3=Sperberg-McQueen |author-link3=Michael Sperberg-McQueen |first4=E. |last4=Maler |first5=F. |last5=Yergeau |publisher=[[W3C]] |date=26 November 2008 |access-date=8 March 2010}}</ref>
<syntaxhighlight lang="xml">
<?xml version="1.0" encoding="utf-8"?>
Line 139:
===XML character references===
Unlike traditional HTML with its large range of character entity references, [[XML]] has only five predefined character entity references. These are used to escape characters that are markup-sensitive in certain contexts:<ref>{{citation |chapter-url=http://www.w3.org/TR/REC-xml/#sec-references |chapter=Character and Entity References |title=XML |first1=T. |last1=Bray |author-link1=Tim Bray |first2=J. |last2=Paoli |first3=C. |last3=Sperberg-McQueen |author-link3=Michael Sperberg-McQueen |first4=E. |last4=Maler |first5=F. |last5=Yergeau |publisher=[[W3C]] |date=26 November 2008 |access-date=8 March 2010}}</ref>
*<code>&amp;</code> → & ([[ampersand]], U+0026)
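
As a rough sketch of how the five predefined references are applied in practice, the following Python example escapes and unescapes text with the standard library's <code>xml.sax.saxutils</code> module. That module handles <code>&amp;</code>, <code>&lt;</code> and <code>&gt;</code> by default; the two quote mappings are passed in explicitly. The helper names <code>escape_xml</code> and <code>unescape_xml</code> are hypothetical and exist only for this example.

```python
from xml.sax.saxutils import escape, unescape

# escape() covers &, < and > on its own; the two quote characters
# must be supplied as extra mappings to reach all five references.
QUOTE_ENTITIES = {'"': "&quot;", "'": "&apos;"}
QUOTE_CHARS = {entity: char for char, entity in QUOTE_ENTITIES.items()}


def escape_xml(text):
    """Replace the five markup-sensitive characters with references."""
    return escape(text, QUOTE_ENTITIES)


def unescape_xml(text):
    """Reverse escape_xml for the five predefined references."""
    return unescape(text, QUOTE_CHARS)


if __name__ == "__main__":
    sample = 'a < b & "c"'
    escaped = escape_xml(sample)
    print(escaped)
    print(unescape_xml(escaped) == sample)
```

The ampersand is replaced first when escaping and last when unescaping, so references produced for the other characters are not escaped a second time.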