Character encodings in HTML

Unlike traditional HTML with its large range of character entity references, in [[XML]] there are only five predefined character entity references. These are used to escape characters that are markup-sensitive in certain contexts:<ref>{{citation |chapter-url=http://www.w3.org/TR/REC-xml/#sec-references |chapter=Character and Entity References |title=XML |first1=T. |last1=Bray |author-link1=Tim Bray |first2=J. |last2=Paoli |first3=C. |last3=Sperberg-McQueen |author-link3=Michael Sperberg-McQueen |first4=E. |last4=Maler |first5=F. |last5=Yergeau |publisher=[[W3C]] |date=26 November 2008 |access-date=8 March 2010}}</ref>
 
{| class="wikitable"
| <code>&amp;amp;</code> ||align="center"| & || [[ampersand]] || U+0026
|-
| <code>&amp;lt;</code> ||align="center"| < || less-than sign || U+003C
|-
| <code>&amp;gt;</code> ||align="center"| > || greater-than sign || U+003E
|-
| <code>&amp;quot;</code> ||align="center"| " || quotation mark || U+0022
|-
| <code>&amp;apos;</code> ||align="center"| ' || apostrophe || U+0027
|}
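
As a rough illustration of how these five references are applied when text is serialized, the following Python sketch uses the standard library's <code>xml.sax.saxutils.escape</code>. The helper name <code>escape_xml_text</code>, the <code>QUOTE_ENTITIES</code> mapping, and the sample string are assumptions made for this example only.

```python
from xml.sax.saxutils import escape, unescape

# escape() handles &, < and > by default; the two quote characters
# are supplied as extra mappings (names here are illustrative only).
QUOTE_ENTITIES = {'"': "&quot;", "'": "&apos;"}
QUOTE_CHARS = {ref: char for char, ref in QUOTE_ENTITIES.items()}


def escape_xml_text(text: str) -> str:
    """Replace the five markup-sensitive characters with XML's predefined references."""
    return escape(text, QUOTE_ENTITIES)


sample = 'Tom & "Jerry" <cat\'s rival>'
escaped = escape_xml_text(sample)
print(escaped)

# Reversing the substitution restores the original text.
restored = unescape(escaped, QUOTE_CHARS)
print(restored == sample)
```

The ampersand is replaced first, so the <code>&</code> characters introduced by the other references are not escaped a second time.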
 
All other character entity references must be declared before they can be used. For example, using <code>&amp;eacute;</code> (é, Latin lowercase e with acute accent, U+00E9 in Unicode) in an XML document produces an error unless the entity has already been declared. XML also requires that the <code>x</code> in hexadecimal numeric character references be lowercase: for example, <code>&amp;#xA1b;</code> rather than <code>&amp;#XA1b;</code>. [[XHTML]], which is an XML application, supports the HTML entity set along with XML's predefined entities.
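
These rules can be observed with a conforming parser. The sketch below uses Python's <code>xml.etree.ElementTree</code>; the helper <code>parses</code> and the sample documents are hypothetical names chosen for illustration, and the behaviour shown is that of this particular parser, not a statement about every XML implementation.

```python
import xml.etree.ElementTree as ET


def parses(document: str) -> bool:
    """Return True if the parser accepts the document as well-formed XML."""
    try:
        ET.fromstring(document)
        return True
    except ET.ParseError:
        return False


# &eacute; is not one of the five predefined entities, so it is rejected.
undeclared = "<p>caf&eacute;</p>"

# Declaring the entity in an internal DTD subset makes the reference usable.
declared = '<!DOCTYPE p [<!ENTITY eacute "&#xE9;">]><p>caf&eacute;</p>'

# Hexadecimal character references need a lowercase x.
lower_x = "<p>&#xA1b;</p>"
upper_x = "<p>&#XA1b;</p>"

for name, doc in [("undeclared", undeclared), ("declared", declared),
                  ("lower_x", lower_x), ("upper_x", upper_x)]:
    print(name, parses(doc))
```

Only the hexadecimal digits themselves may be written in either case; the <code>x</code> prefix may not.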