Unicode in Microsoft Windows: Difference between revisions

Codename Lisa (talk | contribs)
Line 16:
In 2001, Microsoft released a special supplement for its older [[Windows 9x]] systems. It includes a dynamic link library, unicows.dll (only 240 KB), containing the 16-bit flavor (the functions with the letter W on the end) of all the basic functions of the Windows API.
 
== UTF-8 ==
== Various encoding schemes ==
Although the locale can be set so the "M" encodings handle ''some'' multi-byte encodings, it is not possible to set them to support [[UTF-8]] (attempts to use the locale ID that MultiByteToWideChar accepts for UTF-8 are ignored). As many libraries, including the standard C and C++ libraries, only allow access to files using the "M" API, it is not possible to open all Unicode-named files with them. Thus software that uses only portable APIs cannot fully support Unicode on Windows.
Although Windows uses the UTF-16LE encoding scheme internally, in the [[NTFS]] file system, in [[Portable Executable|executables]] and often in [[text files]], Unicode's [[byte oriented]] encodings [[UTF-8]] and even [[UTF-7]] are supported as well. An application which has to pass UTF-8 or UTF-7 to or from a "w" [[Windows API]] should call the functions [[MultiByteToWideChar]] and WideCharToMultiByte.<ref>{{cite web |url=http://stackoverflow.com/questions/166503/utf-8-in-windows |title=UTF-8 in Windows |publisher=[[Stack Overflow]] |accessdate=July 1, 2011}}</ref> Many applications have to support UTF-8 because it is the most widely used Unicode encoding scheme in various [[network protocol]]s, including the [[Internet Protocol Suite]].
 
There are proposals to extend portable libraries such as [[Boost]] with new functions for opening and renaming files that do the necessary conversion. These functions would pass filenames through unchanged on Unix, but translate them to UTF-16 on Windows.
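The idea behind such functions can be sketched as follows. This is only an illustration under stated assumptions, not the proposed Boost interface: the name <code>fopen_utf8</code> is hypothetical, and the sketch assumes the caller always supplies the filename as UTF-8. On Windows it converts the name with MultiByteToWideChar and calls the wide-character <code>_wfopen</code>; elsewhere it passes the name through unchanged.

```cpp
#include <cstdio>
#include <string>
#ifdef _WIN32
#include <windows.h>
#include <wchar.h>
#endif

// Hypothetical helper (not part of Boost or of any standard library):
// opens a file whose name is given as UTF-8 on every platform.
std::FILE* fopen_utf8(const std::string& name, const char* mode) {
#ifdef _WIN32
    // Convert UTF-8 to UTF-16; an empty result signals failure here.
    auto widen = [](const std::string& s) -> std::wstring {
        if (s.empty()) return std::wstring();
        // First call asks for the required length in UTF-16 code units.
        int n = MultiByteToWideChar(CP_UTF8, MB_ERR_INVALID_CHARS,
                                    s.data(), static_cast<int>(s.size()),
                                    nullptr, 0);
        if (n <= 0) return std::wstring();
        std::wstring w(static_cast<size_t>(n), L'\0');
        // Second call performs the conversion into the buffer.
        MultiByteToWideChar(CP_UTF8, MB_ERR_INVALID_CHARS,
                            s.data(), static_cast<int>(s.size()),
                            &w[0], n);
        return w;
    };
    std::wstring wname = widen(name);
    std::wstring wmode = widen(mode);
    if (wname.empty() || wmode.empty()) return nullptr;
    return _wfopen(wname.c_str(), wmode.c_str());
#else
    // Unix: the filename bytes are passed through unchanged.
    return std::fopen(name.c_str(), mode);
#endif
}
```

A caller would use it exactly like <code>fopen</code>, for example <code>fopen_utf8("r\xC3\xA9sum\xC3\xA9.txt", "rb")</code>, and would check the result for a null pointer in the usual way.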
Although the locale can be set so the "M" encodings handle some multi-byte encodings, it is not possible to set them to support [[UTF-8]] (attempts to use the locale ID that MultiByteToWideChar accepts for UTF-8 are ignored). As many libraries, including the standard C and C++ libraries, only allow access to files using the "M" API, it is not possible to open all Unicode-named files with them. These libraries could be fixed by making them convert UTF-8 to UTF-16, or the "A" API could be improved to accept UTF-8, but Microsoft has so far done neither.
 
Many applications have to support UTF-8 because it is the most widely used Unicode encoding scheme in various [[network protocol]]s, including the [[Internet Protocol Suite]]. An application which has to pass UTF-8 or UTF-7 to or from a "w" [[Windows API]] should call the functions [[MultiByteToWideChar]] and WideCharToMultiByte.<ref>{{cite web |url=http://stackoverflow.com/questions/166503/utf-8-in-windows |title=UTF-8 in Windows |publisher=[[Stack Overflow]] |accessdate=July 1, 2011}}</ref> To get predictable handling of errors and surrogate halves, it is more common for software to implement its own versions of these functions.
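A self-written converter of this kind might look like the sketch below. It is one possible design, not the behaviour of any particular library: the function name <code>utf8_to_utf16</code> is hypothetical, and the error policy (every byte that cannot be decoded becomes U+FFFD, and UTF-8-encoded surrogate halves are treated as errors) is an assumption chosen for the example. Other software deliberately preserves surrogate halves instead.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: decode UTF-8 into UTF-16 code units.
// Assumed policy: each undecodable byte yields one U+FFFD. Overlong
// forms, values above U+10FFFF and encoded surrogate halves
// (U+D800..U+DFFF) all count as undecodable.
std::u16string utf8_to_utf16(const std::string& in) {
    std::u16string out;
    const std::size_t n = in.size();
    std::size_t i = 0;
    while (i < n) {
        const unsigned char b0 = static_cast<unsigned char>(in[i]);
        std::uint32_t cp = 0;   // code point being assembled
        std::size_t len = 0;    // sequence length; 0 means invalid lead byte
        std::uint32_t min = 0;  // smallest code point allowed for this length
        if (b0 < 0x80)                { cp = b0;        len = 1; min = 0; }
        else if ((b0 & 0xE0) == 0xC0) { cp = b0 & 0x1F; len = 2; min = 0x80; }
        else if ((b0 & 0xF0) == 0xE0) { cp = b0 & 0x0F; len = 3; min = 0x800; }
        else if ((b0 & 0xF8) == 0xF0) { cp = b0 & 0x07; len = 4; min = 0x10000; }

        // The sequence must fit in the input and use valid continuation bytes.
        bool ok = (len != 0) && (i + len <= n);
        for (std::size_t k = 1; ok && k < len; ++k) {
            const unsigned char b = static_cast<unsigned char>(in[i + k]);
            if ((b & 0xC0) != 0x80) ok = false;
            else cp = (cp << 6) | (b & 0x3F);
        }
        // Reject overlong forms, out-of-range values and surrogate halves.
        if (ok && (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)))
            ok = false;

        if (!ok) {
            // Consume exactly one byte so that decoding resynchronizes.
            out.push_back(static_cast<char16_t>(0xFFFD));
            ++i;
            continue;
        }
        if (cp < 0x10000) {
            out.push_back(static_cast<char16_t>(cp));
        } else {
            // Code points above U+FFFF need a surrogate pair in UTF-16.
            cp -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
        i += len;
    }
    return out;
}
```

Because the function never fails and never discards input silently, the same byte string always produces the same UTF-16 result, which is the predictability the paragraph above refers to.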
<references />