Numeric character reference: Difference between revisions

A '''numeric character reference''' ('''NCR''') is a common [[markup (computer programming)|markup]] construct used in [[SGML]] and SGML-derived markup languages such as [[HTML]] and [[XML]]. It consists of a short sequence of [[character (computing)|character]]s that, in turn, represents a single character. Since [[SGML|WebSgml]], [[XML]] and [[HTML 4]], the code points of the [[Universal Character Set]] (UCS) of [[Unicode]] are used. NCRs are typically used in order to represent characters that are not [[plain text#Encoding|directly encodable]] in a particular document (for example, because they are international characters that do not fit in the 8-bit [[Character encoding|character set]] being used, or because they have special syntactic meaning in the language). When the document is interpreted by a markup-aware reader, each NCR is treated as if it were the character it represents.
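
As a rough illustration of the encoding case described above, the following Python sketch replaces characters that do not fit the target encoding with decimal NCRs and then resolves them again. It uses the standard library's <code>xmlcharrefreplace</code> error handler, and <code>html.unescape</code> stands in for a markup-aware reader. The sample string and the choice of ASCII as the target encoding are assumptions made only for this example.

```python
import html

# Sample input (hypothetical): one character outside ASCII.
text = "Σ is sigma"

# "Σ" cannot be encoded directly in ASCII, so the "xmlcharrefreplace"
# error handler substitutes a decimal numeric character reference.
encoded = text.encode("ascii", errors="xmlcharrefreplace")

# A markup-aware reader treats each NCR as the character it represents.
decoded = html.unescape(encoded.decode("ascii"))
```

Here <code>encoded</code> contains only ASCII bytes, and <code>decoded</code> is equal to the original string.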
 
==Examples==
In SGML, HTML, and XML, the following are all valid numeric character references for the Greek capital letter Sigma:
{| class="wikitable"
|+ Numerical character reference of {{unichar|03A3|GREEK CAPITAL LETTER SIGMA}}<br/>({{hexadecimal|0931}} = 931<sub>10</sub>)
|-
! [[Unicode#Upluslink|Unicode character]]
! Numerical base
! Numerical reference in markup
! Effect
|-
| U+03A3 || Decimal || &amp;#931; || Σ
|-
| U+03A3 || Decimal || &amp;#0931; || Σ
|-
| U+03A3 || Hexadecimal || &amp;#x3A3; || Σ
|-
| U+03A3 || Hexadecimal || &amp;#x03A3; || Σ
|-
| U+03A3 || Hexadecimal || &amp;#x3a3; || Σ
|}
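
The equivalence of the forms in the table can be checked with a short Python sketch. It assumes that <code>html.unescape</code>, which follows HTML5 rules, is an acceptable stand-in for an SGML or XML parser for these particular references.

```python
import html

# The five spellings from the table: decimal, decimal with a leading
# zero, and hexadecimal with varying case and zero padding.
forms = ["&#931;", "&#0931;", "&#x3A3;", "&#x03A3;", "&#x3a3;"]

# Resolve each reference to the character it represents.
resolved = [html.unescape(form) for form in forms]

# True if every spelling resolves to U+03A3.
all_sigma = all(ch == "\u03a3" for ch in resolved)
```

Leading zeros and the case of the hexadecimal digits do not change the code point that is referenced.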
 
In SGML, HTML, and XML, the following are all valid numeric character references for the Latin capital letter AE:
{| class="wikitable"
|+ Numerical character reference of {{unichar|00C6|Latin capital letter AE}}
|-
! [[Unicode#Upluslink|Unicode character]]
! Numerical base
! Numerical reference in markup
! Effect
|-
| U+00C6 || Decimal || &amp;#198; || Æ
|-
| U+00C6 || Hexadecimal || &amp;#xC6; || Æ
|}
 
In SGML, HTML, and XML, the following are all valid numeric character references for the Latin small letter sharp s (ß):
{| class="wikitable"
|+ Numerical character reference of {{unichar|00DF|Latin small letter sharp s}}
|-
! [[Unicode#Upluslink|Unicode character]]
! Numerical base
! Numerical reference in markup
! Effect
|-
| U+00DF || Decimal || &amp;#223; || ß
|-
| U+00DF || Hexadecimal || &amp;#xDF; || ß
|}
 
List of numeric character references for the printable [[ASCII]] characters:
{| class="wikitable"
! [[Unicode#Upluslink|Unicode character]]
! Character<br />Reference<br />(decimal)
! Character<br />Reference<br />(hexadecimal)
! Effect
|-
| U+0020 || &amp;#32; || &amp;#x20; || (space)
|-
| U+0021 || &amp;#33; || &amp;#x21; || !
|-
| U+0022 || &amp;#34; || &amp;#x22; || "
|-
| U+0023 || &amp;#35; || &amp;#x23; || #
|-
| U+0024 || &amp;#36; || &amp;#x24; || $
|-
| U+0025 || &amp;#37; || &amp;#x25; || %
|-
| U+0026 || &amp;#38; || &amp;#x26; || &
|-
| U+0027 || &amp;#39; || &amp;#x27; || '
|-
| U+0028 || &amp;#40; || &amp;#x28; || (
|-
| U+0029 || &amp;#41; || &amp;#x29; || )
|-
| U+002A || &amp;#42; || &amp;#x2A; || *
|-
| U+002B || &amp;#43; || &amp;#x2B; || +
|-
| U+002C || &amp;#44; || &amp;#x2C; || ,
|-
| U+002D || &amp;#45; || &amp;#x2D; || -
|-
| U+002E || &amp;#46; || &amp;#x2E; || .
|-
| U+002F || &amp;#47; || &amp;#x2F; || /
|-
| U+0030 || &amp;#48; || &amp;#x30; || 0
|-
| U+0031 || &amp;#49; || &amp;#x31; || 1
|-
| U+0032 || &amp;#50; || &amp;#x32; || 2
|-
| U+0033 || &amp;#51; || &amp;#x33; || 3
|-
| U+0034 || &amp;#52; || &amp;#x34; || 4
|-
| U+0035 || &amp;#53; || &amp;#x35; || 5
|-
| U+0036 || &amp;#54; || &amp;#x36; || 6
|-
| U+0037 || &amp;#55; || &amp;#x37; || 7
|-
| U+0038 || &amp;#56; || &amp;#x38; || 8
|-
| U+0039 || &amp;#57; || &amp;#x39; || 9
|-
| U+003A || &amp;#58; || &amp;#x3A; || :
|-
| U+003B || &amp;#59; || &amp;#x3B; || ;
|-
| U+003C || &amp;#60; || &amp;#x3C; || <
|-
| U+003D || &amp;#61; || &amp;#x3D; || =
|-
| U+003E || &amp;#62; || &amp;#x3E; || >
|-
| U+003F || &amp;#63; || &amp;#x3F; || ?
|-
| U+0040 || &amp;#64; || &amp;#x40; || @
|-
| U+0041 || &amp;#65; || &amp;#x41; || A
|-
| U+0042 || &amp;#66; || &amp;#x42; || B
|-
| U+0043 || &amp;#67; || &amp;#x43; || C
|-
| U+0044 || &amp;#68; || &amp;#x44; || D
|-
| U+0045 || &amp;#69; || &amp;#x45; || E
|-
| U+0046 || &amp;#70; || &amp;#x46; || F
|-
| U+0047 || &amp;#71; || &amp;#x47; || G
|-
| U+0048 || &amp;#72; || &amp;#x48; || H
|-
| U+0049 || &amp;#73; || &amp;#x49; || I
|-
| U+004A || &amp;#74; || &amp;#x4A; || J
|-
| U+004B || &amp;#75; || &amp;#x4B; || K
|-
| U+004C || &amp;#76; || &amp;#x4C; || L
|-
| U+004D || &amp;#77; || &amp;#x4D; || M
|-
| U+004E || &amp;#78; || &amp;#x4E; || N
|-
| U+004F || &amp;#79; || &amp;#x4F; || O
|-
| U+0050 || &amp;#80; || &amp;#x50; || P
|-
| U+0051 || &amp;#81; || &amp;#x51; || Q
|-
| U+0052 || &amp;#82; || &amp;#x52; || R
|-
| U+0053 || &amp;#83; || &amp;#x53; || S
|-
| U+0054 || &amp;#84; || &amp;#x54; || T
|-
| U+0055 || &amp;#85; || &amp;#x55; || U
|-
| U+0056 || &amp;#86; || &amp;#x56; || V
|-
| U+0057 || &amp;#87; || &amp;#x57; || W
|-
| U+0058 || &amp;#88; || &amp;#x58; || X
|-
| U+0059 || &amp;#89; || &amp;#x59; || Y
|-
| U+005A || &amp;#90; || &amp;#x5A; || Z
|-
| U+005B || &amp;#91; || &amp;#x5B; || [
|-
| U+005C || &amp;#92; || &amp;#x5C; || \
|-
| U+005D || &amp;#93; || &amp;#x5D; || ]
|-
| U+005E || &amp;#94; || &amp;#x5E; || ^
|-
| U+005F || &amp;#95; || &amp;#x5F; || _
|-
| U+0060 || &amp;#96; || &amp;#x60; || `
|-
| U+0061 || &amp;#97; || &amp;#x61; || a
|-
| U+0062 || &amp;#98; || &amp;#x62; || b
|-
| U+0063 || &amp;#99; || &amp;#x63; || c
|-
| U+0064 || &amp;#100; || &amp;#x64; || d
|-
| U+0065 || &amp;#101; || &amp;#x65; || e
|-
| U+0066 || &amp;#102; || &amp;#x66; || f
|-
| U+0067 || &amp;#103; || &amp;#x67; || g
|-
| U+0068 || &amp;#104; || &amp;#x68; || h
|-
| U+0069 || &amp;#105; || &amp;#x69; || i
|-
| U+006A || &amp;#106; || &amp;#x6A; || j
|-
| U+006B || &amp;#107; || &amp;#x6B; || k
|-
| U+006C || &amp;#108; || &amp;#x6C; || l
|-
| U+006D || &amp;#109; || &amp;#x6D; || m
|-
| U+006E || &amp;#110; || &amp;#x6E; || n
|-
| U+006F || &amp;#111; || &amp;#x6F; || o
|-
| U+0070 || &amp;#112; || &amp;#x70; || p
|-
| U+0071 || &amp;#113; || &amp;#x71; || q
|-
| U+0072 || &amp;#114; || &amp;#x72; || r
|-
| U+0073 || &amp;#115; || &amp;#x73; || s
|-
| U+0074 || &amp;#116; || &amp;#x74; || t
|-
| U+0075 || &amp;#117; || &amp;#x75; || u
|-
| U+0076 || &amp;#118; || &amp;#x76; || v
|-
| U+0077 || &amp;#119; || &amp;#x77; || w
|-
| U+0078 || &amp;#120; || &amp;#x78; || x
|-
| U+0079 || &amp;#121; || &amp;#x79; || y
|-
| U+007A || &amp;#122; || &amp;#x7A; || z
|-
| U+007B || &amp;#123; || &amp;#x7B; || {
|-
| U+007C || &amp;#124; || &amp;#x7C; || {{pipe}}
|-
| U+007D || &amp;#125; || &amp;#x7D; || }
|-
| U+007E || &amp;#126; || &amp;#x7E; || ~
|}
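
The rows of the table above follow mechanically from each character's code point. The following Python sketch generates them; the helper name <code>ncr_forms</code> is hypothetical and is introduced only for this illustration.

```python
def ncr_forms(ch):
    """Return (U+ notation, decimal NCR, hexadecimal NCR) for one character."""
    cp = ord(ch)  # the character's code point as an integer
    return (f"U+{cp:04X}", f"&#{cp};", f"&#x{cp:X};")

# Printable ASCII runs from U+0020 (space) to U+007E (tilde).
rows = [ncr_forms(chr(cp)) for cp in range(0x20, 0x7F)]
```

For example, <code>ncr_forms("A")</code> yields the U+0041 row of the table.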
 
==Discussion==
For example, as mentioned above, the correct numeric character reference for the [[Euro sign]] "€" (<code>U+20AC</code>) when using [[Unicode]] is <code>&amp;#8364;</code> in decimal and <code>&amp;#x20AC;</code> in hexadecimal. However, with tools that support only obsolete implementations of HTML, the reference <code>&amp;#128;</code> (the position of the Euro sign in the [[CP-1252]] code page) or <code>&amp;#164;</code> (its position in [[ISO/IEC 8859-15]]) may work instead.
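
The following Python sketch illustrates the difference between these references. It relies on <code>html.unescape</code>, which applies the HTML5 rules for character references; under those rules the reference 128 is remapped to U+20AC for compatibility, whereas 164 is resolved as the Unicode code point U+00A4 (the generic currency sign). Other parsers, in particular XML parsers, may behave differently.

```python
import html

euro = "\u20ac"

# The Unicode-based references resolve to the Euro sign.
correct_decimal = html.unescape("&#8364;")
correct_hex = html.unescape("&#x20AC;")

# Legacy references: HTML5 rules remap 128 to U+20AC, while 164 is
# read as U+00A4 (currency sign), not as the Euro sign.
legacy_cp1252 = html.unescape("&#128;")
legacy_8859_15 = html.unescape("&#164;")

# The legacy numbers come from the byte values in the old encodings.
byte_in_cp1252 = euro.encode("cp1252")
byte_in_8859_15 = euro.encode("iso8859_15")
```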
 
As another example, if some text was originally created using the [[MacRoman]] character set, the [[quotation mark glyphs|left double quotation mark]] {{char|“}} is represented by the byte value 0xD2. This will not display properly in a system expecting a document encoded as ISO 8859-1 or CP-1252, where that value is occupied by the letter [[Ò]], or as UTF-8, where a lone 0xD2 byte is not a valid character at all. The correct numeric character reference for {{char|“}} in HTML 4 and newer is <code>&amp;#x201C;</code>, because [[Unicode#Upluslink|U+]]201C is its UCS code point. In some systems, the [[List of XML and HTML character entity references|named character reference]] <code>&amp;ldquo;</code> may also be available.
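
The MacRoman case can be reproduced with Python's built-in codecs. This is a minimal sketch; the variable names are illustrative only.

```python
import html

raw = bytes([0xD2])  # the single byte stored by a MacRoman editor

as_mac_roman = raw.decode("mac_roman")  # intended character, U+201C
as_latin1 = raw.decode("latin-1")       # misread as U+00D2
as_cp1252 = raw.decode("cp1252")        # misread as U+00D2

# In UTF-8 a lone 0xD2 byte is an incomplete sequence and is rejected.
try:
    raw.decode("utf-8")
    utf8_valid = True
except UnicodeDecodeError:
    utf8_valid = False

# The NCR is built from the UCS code point, not from the MacRoman byte.
ncr = f"&#x{ord(as_mac_roman):X};"
from_ncr = html.unescape(ncr)
from_named = html.unescape("&ldquo;")
```

Because the reference is written in terms of the UCS code point, it resolves to the same character regardless of the encoding in which the document itself is stored.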
 
==See also==
==References==
{{Reflist}}
 
 
{{Unicode navigation}}