Content deleted Content added
Undid revision 1239926119 by 197.98.201.67 (talk) |
m →External links: i have removed the broken link |
||
(3 intermediate revisions by 3 users not shown) | |||
Line 118:
==Character references==
{{Main|
In addition to native character encodings, characters can also be encoded as ''character references'', which can be ''numeric character references'' ([[decimal]] or [[hexadecimal]]) or ''character entity references''. Character entity references are also sometimes referred to as ''named entities'', or ''HTML entities'' for HTML. HTML's usage of character references derives from [[SGML]].
Line 124:
===HTML character references===
<!--Linked from [[Template:Auxiliary template common notice]]-->
A ''[[numeric character reference]]'' in HTML refers to a character by its [[Universal Character Set]]/[[Unicode]] ''[[code point]]'', and uses the format
:<code>&#''nnnn'';</code>
Line 136:
For codes from 0 to 127, the original 7-bit [[ASCII]] standard set, most of these characters can be used without a character reference. Codes from 160 to 255 can all be created using [[List of XML and HTML character entity references|character entity names]]. Only a few higher-numbered codes can be created using entity names, but all can be created by decimal number character reference.
[[List of XML and HTML character entity references|Character entity references]] can also have the format <code>&''name'';</code> where ''name'' is a case-sensitive alphanumeric string. For example, "λ" can also be encoded as <code>&lambda;</code> in an HTML document. The character entity references <code>&lt;</code>, <code>&gt;</code>, <code>&quot;</code> and <code>&amp;</code> are predefined in HTML and SGML, because <code><</code>, <code>></code>, <code>"</code> and <code>&</code> are already used to delimit markup. This notably did not include XML's <code>&apos;</code> (') entity prior to [[HTML5]]. For a list of all named HTML character entity references along with the versions in which they were introduced, see [[List of XML and HTML character entity references]].
Unnecessary use of HTML character references may significantly reduce HTML readability. If the character encoding for a web page is chosen appropriately, then HTML character references are usually only required for markup delimiting characters as mentioned above, and for a few special characters (or none at all if a native [[Unicode]] encoding like [[UTF-8]] is used). Incorrect HTML entity escaping may also open up security vulnerabilities for injection attacks such as [[cross-site scripting]]. If HTML attributes are left unquoted, certain characters, most importantly [[whitespace character|whitespace]], such as space and tab, must be escaped using entities. Other languages related to HTML have their own methods of escaping characters.
Line 167:
== External links ==
* [http://www.w3.org/TR/REC-html40/sgml/entities.html Character entity references in HTML4]
* [http://www.sitepoint.com/article/guide-web-character-encoding/ The Definitive Guide to Web Character Encoding]
|