Revision as of 21:04, 19 February 2016 by Gpvos (copyedit; not valid for other modern operating systems such as Linux) → Revision as of 22:07, 11 April 2016 by Natg 19 (Disambiguated: Notepad (software) → Microsoft Notepad)
Modern Windows versions such as [[Windows XP]] and [[Windows Server 2003]], and before them [[Windows NT]] (3.x, 4.0) and Windows 2000, ship with [[Windows API|system libraries]] that support two kinds of string [[character encoding|encoding]]: UTF-16 (often called "Unicode" in Windows documentation) and an 8-bit encoding called the "[[Windows code page|code page]]" (sometimes incorrectly referred to as the ''ANSI code page''). Functions that take 16-bit strings have names suffixed with -W (from [[wide character|"wide"]]), for example lstrlenW(). Code-page-oriented functions use the suffix -A (for "ANSI"), for example lstrlenA(). This split was necessary because many languages, including C, do not provide a clean way to pass both 8-bit and 16-bit strings to the same API or to put them in the same structure. Windows also provides the 'M' API, which in some locales provides multi-byte encodings but in most locales is the same as 'A'. Most such 'A' and 'M' functions are implemented as a [[Wrapper function|wrapper]] that translates the code page to UTF-16 and calls the 'W' function.

The <code>IsTextUnicode</code> function applies a [[heuristic algorithm]] to a [[byte string]] passed to it to detect whether the string represents UTF-16 text. For very short texts, this function, which is used by some applications such as [[Microsoft Notepad|Notepad]], often gives incorrect results. This gave rise to legends about the existence of [[Easter egg (computing)|"Easter eggs"]] such as [[Bush hid the facts]].

=== Windows CE ===
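As an illustration of the -A/-W split and of the short-text misdetection described for desktop Windows in the preceding section, the following is a minimal Python sketch. The names <code>strlen_w</code> and <code>strlen_a</code> are hypothetical stand-ins, not real Windows functions, and the choice of code page 1252 is an assumption. The sketch does not reproduce the <code>IsTextUnicode</code> heuristic; it only shows why a short run of ASCII bytes can also be read as well-formed UTF-16.

```python
# Hypothetical stand-ins for a -W / -A function pair; NOT real Windows APIs.

def strlen_w(text_utf16: bytes) -> int:
    """'W' variant: count 16-bit code units in a UTF-16LE byte string."""
    return len(text_utf16) // 2


def strlen_a(text_cp: bytes, code_page: str = "cp1252") -> int:
    """'A' variant: translate code-page bytes to UTF-16, then call the 'W' variant.

    The default code page (cp1252) is an assumption made for this sketch.
    """
    return strlen_w(text_cp.decode(code_page).encode("utf-16-le"))


# A short ASCII text: 18 bytes that also form 9 well-formed UTF-16LE code units.
raw = b"Bush hid the facts"

# What a program obtains if it guesses that the bytes are UTF-16LE.
misread = raw.decode("utf-16-le")

print(strlen_a(raw))                    # length when treated as code-page text
print(len(misread))                     # length when misread as UTF-16
print([hex(ord(c)) for c in misread])   # code points of the misread characters
```

Every byte pair in this particular string decodes to a code point in the CJK Unified Ideographs block (U+4E00 to U+9FFF), so nothing in the data itself marks the UTF-16 reading as invalid, and a heuristic has very little evidence to work with.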

Unicode in Microsoft Windows: Difference between revisions