Revision as of 21:51, 23 September 2011 by Mandarax (m Typo patrol, typos fixed: agianst → against, documetns → documents, the the → the using AWB (7794))
Revision as of 19:03, 28 October 2011 by 28bytes (m →HTML document characters: template for better display)
Line 13:
Like HTML documents, an XHTML document is a sequence of Unicode characters. However, an XHTML document is an [[XML]] document, which, while not having an explicit "document character" layer of [[abstraction]], nevertheless relies upon a similar definition of permissible characters that covers most, but not all, of the Unicode/UCS character definitions. The sets used by HTML and XHTML/XML are slightly different, but these differences have little effect on the average document author.

Regardless of whether the document is HTML or XHTML, when stored on a [[file system]] or transmitted over a network, the document's characters are ''encoded'' as a sequence of [[bit]] [[octet (computing)|octet]]s (''[[byte]]s'') according to a particular character encoding. This encoding may either be a [[Unicode Transformation Format]], like [[UTF-8]], that can directly encode any Unicode character, or a legacy encoding, like [[Windows-1252]], that cannot. However, even when using encodings that do not support all Unicode characters, the encoded document may make use of [[numeric character references]]. For example, <code>&#x263A;</code> ({{unicode|☺}}) is a numeric character reference for the smiling face character (U+263A) in the Unicode character set.

=== Character encoding ===
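The difference between a Unicode Transformation Format and a legacy encoding, and the role of numeric character references as a fallback, can be illustrated with a minimal Python sketch. The helper name <code>encode_with_references</code> is hypothetical and introduced only for this illustration; it relies on Python's standard <code>xmlcharrefreplace</code> error handler, which emits decimal references rather than the hexadecimal form shown above. Both forms denote the same character.

```python
import html


def encode_with_references(text: str, encoding: str) -> bytes:
    """Encode text to bytes (hypothetical helper for illustration).

    Characters the target encoding cannot represent are replaced by
    decimal numeric character references such as &#9786;.
    """
    return text.encode(encoding, errors="xmlcharrefreplace")


smiley = "\u263A"  # WHITE SMILING FACE, U+263A

# UTF-8 can encode any Unicode character directly (three octets here).
utf8_bytes = encode_with_references(smiley, "utf-8")

# Windows-1252 (Python codec name "cp1252") has no smiling face,
# so the character falls back to a numeric character reference.
legacy_bytes = encode_with_references(smiley, "cp1252")

# A character the legacy encoding does support is stored as a single octet.
mixed_bytes = encode_with_references("caf\u00e9 " + smiley, "cp1252")

# Hexadecimal and decimal references resolve to the same character.
from_hex = html.unescape("&#x263A;")
from_dec = html.unescape(legacy_bytes.decode("cp1252"))

print(utf8_bytes, legacy_bytes, mixed_bytes, from_hex == from_dec)
```

Note that this handler only replaces characters the target encoding cannot represent; it does not escape markup-significant characters such as <code>&amp;</code> or <code>&lt;</code>, which a real HTML serializer would need to handle separately.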

Unicode and HTML: Difference between revisions