Revision as of 12:24, 20 September 2004 Shlomital (talk | contribs) 657 edits m XML 1.0 restriction does not apply to character data
Revision as of 15:48, 3 October 2004 Ahoerstemeier (talk | contribs) 110,683 edits m linkfix bit
Line 14: Markup languages are typically defined in terms of [[ISO 10646]] or [[Unicode]] characters. That is, a document consists, at its most fundamental level of abstraction, of a sequence of [[character (computing)|character]]s, which are abstract units that exist independently of any [[character encoding|encoding]]. Ideally, when the characters of a document utilizing a markup language are encoded for storage or transmission over a network as a sequence of [[~~bits~~bit]]s, the encoding that is used will be one that supports representing each and every character in the document, if not in the whole of Unicode, directly as a particular bit sequence. Sometimes, though, for reasons of convenience or due to technical limitations, documents are encoded with an encoding that cannot represent some characters directly. For example, the widely used encodings based on [[ISO 8859]] can only represent, at most, 256 unique characters as one 8-bit [[byte]] each.
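
The encoding limitation described above can be illustrated with a minimal Python sketch. It is not part of the article revision; the helper names `encodable` and `unencodable_chars` and the sample string are hypothetical, chosen only for illustration. It assumes Python's standard `iso-8859-1` and `utf-8` codecs and shows that a single-byte ISO 8859 encoding represents some characters directly but not others.

```python
def encodable(text: str, encoding: str) -> bool:
    """Return True if every character in text has a direct byte form in encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


def unencodable_chars(text: str, encoding: str) -> list:
    """List the characters of text that the encoding cannot represent directly."""
    return [ch for ch in text if not encodable(ch, encoding)]


# Hypothetical sample: Latin letters plus one Greek letter.
sample = "café α"

# ISO 8859-1 stores each supported character as exactly one 8-bit byte.
print(encodable("café", "iso-8859-1"))           # True
print(len("café".encode("iso-8859-1")))          # 4

# The Greek alpha lies outside the 256 characters of ISO 8859-1.
print(unencodable_chars(sample, "iso-8859-1"))   # ['α']

# A Unicode encoding such as UTF-8 can represent every character in the sample.
print(encodable(sample, "utf-8"))                # True
```

The check is done by attempting the encoding and catching `UnicodeEncodeError`, which is simply how Python reports an unrepresentable character; other languages and libraries signal the same condition differently.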

Numeric character reference: Difference between revisions