Content deleted Content added
Line 13:
Like HTML documents, an XHTML document is a sequence of Unicode characters. However, an XHTML document is an [[XML]] document, which, while not having an explicit "document character" layer of [[abstraction]], nevertheless relies upon a similar definition of permissible characters that cover most, but not all, of the Unicode/UCS character definitions. The sets used by HTML and XHTML/XML are slightly different, but these differences have little effect on the average document author.
Regardless of whether the document is HTML or XHTML, when stored on a [[file system]] or transmitted over a network, the document's characters are ''encoded'' as a sequence of [[bit]] [[octet (computing)|octet]]s (''[[byte]]s'') according to a particular character encoding. This encoding may either be a [[Unicode Transformation Format]], like [[UTF-8]], that can directly encode any Unicode character, or a legacy encoding, like [[Windows-1252]], that cannot. However, even when using encodings that do not support all Unicode characters, the encoded document may make use of [[numeric character references]]. For example <code>&
=== Character encoding===
|