Numeric character reference: Difference between revisions

Shlomital (talk | contribs)
m XML 1.0 restriction does not apply to character data
Ahoerstemeier (talk | contribs)
m linkfix bit
Line 14:
Markup languages are typically defined in terms of [[ISO 10646]] or [[Unicode]] characters. That is, a document consists, at its most fundamental level of abstraction, of a sequence of [[character (computing)|character]]s, which are abstract units that exist independently of any [[character encoding|encoding]].
 
Ideally, when the characters of a document written in a markup language are encoded for storage or transmission over a network as a sequence of [[bit]]s, the encoding used will be one that can represent every character in the document, if not the whole of Unicode, directly as a particular bit sequence.
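
As a minimal illustration of the distinction between an abstract character and its encoded form, the following Python sketch encodes one character in three encodings. The choice of character and the helper name <code>to_bits</code> are assumptions made for the example, not part of any standard.

```python
# One abstract character, identified by its code point regardless of encoding.
ch = "\u00e9"          # LATIN SMALL LETTER E WITH ACUTE
code_point = ord(ch)   # the Unicode/ISO 10646 code point as an integer

# The same character yields different byte sequences in different encodings.
utf8_bytes = ch.encode("utf-8")
utf16_bytes = ch.encode("utf-16-be")
latin1_bytes = ch.encode("iso-8859-1")


def to_bits(data: bytes) -> str:
    """Render a byte sequence as space-separated groups of 8 bits (illustrative helper)."""
    return " ".join(f"{b:08b}" for b in data)


print(code_point)             # 233
print(to_bits(utf8_bytes))    # 11000011 10101001
print(to_bits(utf16_bytes))   # 00000000 11101001
print(to_bits(latin1_bytes))  # 11101001
```

All three byte sequences decode back to the same character, which is what representing a character "directly" means here.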
 
Sometimes, though, for reasons of convenience or due to technical limitations, documents are encoded with an encoding that cannot represent some characters directly. For example, the widely used encodings based on [[ISO 8859]] can represent at most 256 unique characters, each as one 8-bit [[byte]].
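
The limitation can be seen in a short Python sketch, given here as an illustration under stated assumptions: the sample string is made up, and the example relies on Python's built-in <code>iso-8859-1</code> codec and its standard <code>xmlcharrefreplace</code> error handler. The euro sign (U+20AC) has no byte in ISO 8859-1, so a direct encoding attempt fails, while the error handler substitutes a decimal numeric character reference.

```python
# Sample text (hypothetical) containing EURO SIGN, which ISO 8859-1 lacks.
text = "price: 5 \u20ac"

# A direct encoding attempt fails because no single byte maps to U+20AC.
try:
    text.encode("iso-8859-1")
    direct_ok = True
except UnicodeEncodeError:
    direct_ok = False

# The standard "xmlcharrefreplace" handler writes unencodable characters
# as decimal numeric character references instead.
escaped = text.encode("iso-8859-1", errors="xmlcharrefreplace")

print(direct_ok)  # False
print(escaped)    # b'price: 5 &#8364;'
```

The reference <code>&amp;#8364;</code> uses only characters that ISO 8859-1 can represent, so the document can still be stored in that encoding.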