Unicode in Microsoft Windows

{{refimprove|date=June 2011}}
Microsoft began implementing [[Unicode]] consistently in its products quite early.{{clarify|date=July 2012}} [[Windows NT]] was the first operating system to use Unicode in [[system call]]s. It initially used the [[UCS-2]] encoding scheme and was upgraded to [[UTF-16]] starting with [[Windows 2000]], which allows characters in the additional planes to be represented with surrogate pairs.
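
As an illustration of how UTF-16 represents a character outside the Basic Multilingual Plane, the following minimal sketch computes a surrogate pair from a code point. The function name <code>to_surrogate_pair</code> is hypothetical and chosen for this example only; the arithmetic follows the UTF-16 definition and is not taken from any Windows source.

```python
def to_surrogate_pair(code_point):
    """Split a supplementary-plane code point into UTF-16 surrogates.

    Hypothetical helper for illustration; returns (high, low) code units.
    """
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("only code points above U+FFFF need a surrogate pair")
    offset = code_point - 0x10000          # 20-bit value
    high = 0xD800 + (offset >> 10)         # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)        # bottom 10 bits
    return high, low


# U+1F600 is stored as the two 16-bit units D83D DE00.
high, low = to_surrogate_pair(0x1F600)
print(f"{high:04X} {low:04X}")
```

Under UCS-2 each character occupies exactly one 16-bit unit, so a code point such as U+1F600 cannot be represented at all; UTF-16 spends two units on it.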
 
== In various Windows families ==
=== Windows NT based systems ===
Windows NT-based operating systems, including [[Windows NT 4]], Windows 2000, [[Windows XP]] and [[Windows Server 2003]], ship with [[Windows API|system libraries]] that support two kinds of string [[character encoding|encoding]]: Unicode and the current [[Windows code page|code page]], which is still incorrectly referred to as the ''ANSI code page''. Unicode functions have names with the suffix -W (from [[wide character|"wide"]]), for example lstrlenW(). Code-page-oriented functions use the suffix -A, e.g. lstrlenA(). This allows the Windows NT family to run Unicode-capable programs alongside older programs that use 8-bit encodings. Most of these ANSI functions are implemented as a [[Wrapper function|wrapper]] around the corresponding Unicode function.
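
The wrapper arrangement can be sketched in a portable way as follows. This is only a model of the pattern, not the Windows implementation: the names <code>to_upper_w</code> and <code>to_upper_a</code> are hypothetical, Python's <code>str</code> stands in for a wide string, and code page 1252 is assumed as the current code page.

```python
def to_upper_w(text):
    """'Wide' variant (hypothetical): works directly on Unicode text."""
    return text.upper()


def to_upper_a(data, code_page="cp1252"):
    """'ANSI' variant (hypothetical): a thin wrapper over the wide one.

    Decodes the 8-bit input using the assumed code page, forwards the
    call, then encodes the result back to the same code page.
    """
    wide = data.decode(code_page)
    return to_upper_w(wide).encode(code_page)


# The byte 0xE9 is "é" in code page 1252; the wrapper returns 0xC9, "É".
print(to_upper_a(b"caf\xe9"))
```

In this model only the wide function contains real logic, so the 8-bit variant pays for two conversions on every call.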
 
The <code>IsTextUnicode</code> function applies a [[heuristic algorithm]] to a [[byte string]] passed to it to detect whether the string represents Unicode text. For very short texts, this function, which is used by some applications such as [[Notepad (software)|Notepad]], often gives incorrect results. This gave rise to legends about the existence of [[Easter egg (computing)|"Easter eggs"]] like [[Bush hid the facts]].
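
The ambiguity behind such misdetections can be shown without calling <code>IsTextUnicode</code> itself. The sketch below does not reproduce the function's heuristic; it only demonstrates that the same 18 bytes are valid both as 8-bit text and as little-endian UTF-16, so a detector has to guess. The variable names are illustrative.

```python
raw = b"Bush hid the facts"   # 18 bytes of plain ASCII

# Interpretation 1: one byte per character.
as_ascii = raw.decode("ascii")

# Interpretation 2: two bytes per character, little-endian UTF-16.
as_utf16 = raw.decode("utf-16-le")

print(as_ascii)
print(len(as_utf16), [f"U+{ord(ch):04X}" for ch in as_utf16])

# Every resulting code unit falls inside the CJK Unified Ideographs
# block (U+4E00..U+9FFF), so the UTF-16 reading looks like plausible text.
print(all(0x4E00 <= ord(ch) <= 0x9FFF for ch in as_utf16))
```

Because both decodings succeed and the input is so short, there is little statistical evidence to prefer one reading over the other.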