Code page 932 (Microsoft Windows)

{{hatnote|This is Microsoft's Code Page 932 and IBM's Code Page 943. For IBM's Code Page 932, see [[Code page 932]].}}
 
'''Microsoft Windows code page 932''' ('''Windows-932''' or [[Code page 932|ambiguously]] '''CP932'''), known by IBM as '''[[code page]] 943''' ('''CP943''')<ref name="ibm943">{{cite web | url=http://www-01.ibm.com/software/globalization/ccsid/ccsid943.html | title=Code Page 943 | publisher=IBM}}</ref> and known by the [[Internet Assigned Numbers Authority|IANA]] as '''Windows-31J''',<ref name="iana31j">{{cite web | url=https://www.iana.org/assignments/character-sets/character-sets.xhtml | publisher=IANA | title=Character Sets}}</ref> also called '''MS-Kanji''',<ref>{{cite web | url=https://docs.python.org/3.6/library/codecs.html#standard-encodings | title=7.2.3. Standard Encodings | publisher=Python Software Foundation | work=Python 3.6 Documentation | accessdate=19 September 2017}}</ref> is Microsoft's extended variant of [[Shift JIS]]. It contains standard 7-bit [[ASCII]] codes, and Japanese characters are indicated by the high bit of the first byte being set to 1. Some code points in this page require a second byte, so characters use either 8 or 16 bits for encoding. It is a combination of [[Code page 897]] and [[Code page 941]].<ref name="ibm943"/>
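
The mix of 8-bit and 16-bit characters described above can be observed with a short sketch. It assumes Python's built-in <code>cp932</code> codec as a stand-in for this code page; the helper name <code>describe</code> is hypothetical and used only for illustration.

```python
def describe(ch, encoding="cp932"):
    """Return (bits used, whether the first byte has its high bit set)."""
    raw = ch.encode(encoding)
    return len(raw) * 8, bool(raw[0] & 0x80)


if __name__ == "__main__":
    # An ASCII letter, a half-width katakana, a hiragana and a kanji.
    for ch in ["A", "ｱ", "あ", "漢"]:
        bits, high = describe(ch)
        print(f"U+{ord(ch):04X} {ch.encode('cp932').hex()} {bits} bits, high bit set: {high}")
```

In this sketch the ASCII letter occupies one byte with the high bit clear, while each Japanese character begins with a byte whose high bit is set, whether it occupies one byte or two.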
 
IBM offers the same extended double-byte codes in its '''[[code page]] 943''' ('''IBM-943''' or '''CP943'''), which is a combination of [[Code page 897]] and [[Code page 941]].<ref name="ibm943"/>
 
The "Windows-31J" name is IANA's and not recognized by Microsoft, which has historically used "shift_jis" instead. In Japanese editions of Windows, this code page is referred to as "ANSI", since it is the operating system's default 8-bit encoding, even though [[ANSI]] was not involved in its definition.
Some of these rows were subsequently used differently by [[JIS X 0213]]. For example, compare row 89 in JIS X 0213 (beginning 硃, 硎, 硏…)<ref>{{cite web | url=https://www.itscj.ipsj.or.jp/iso-ir/233.pdf | title=233: Japanese Graphic Character Set for Information Interchange, Plane 1 | publisher=IPSJ}}</ref> to row 89 as used by JIS X 0208 with IBM/NEC extensions (beginning 纊, 褜, 鍈…).<ref>{{cite web | url=https://encoding.spec.whatwg.org/jis0208.html | title=Index jis0208 visualization | publisher=WHATWG | work=Encoding Standard}}</ref>
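
The extended row 89 mentioned above can be inspected with the following sketch. It assumes the conventional arithmetic for turning a JIS row and cell number into a Shift JIS byte pair, which is not spelled out in this article, and it relies on Python's built-in <code>cp932</code> codec; the helper name <code>kuten_to_sjis</code> is hypothetical.

```python
def kuten_to_sjis(row, cell):
    """Convert a row/cell pair (each 1-94) to a two-byte Shift JIS sequence."""
    if not (1 <= row <= 94 and 1 <= cell <= 94):
        raise ValueError("row and cell must be in 1..94")
    # Two rows share one lead byte; lead bytes skip the range 0xA0-0xDF.
    lead = (row + 1) // 2 + (0x80 if row <= 62 else 0xC0)
    if row % 2 == 1:
        # Odd rows use trail bytes 0x40-0x9E, skipping 0x7F.
        trail = cell + 0x3F
        if trail >= 0x7F:
            trail += 1
    else:
        # Even rows use trail bytes 0x9F-0xFC.
        trail = cell + 0x9E
    return bytes([lead, trail])


if __name__ == "__main__":
    # First three cells of row 89 under the IBM/NEC extensions.
    row_89 = b"".join(kuten_to_sjis(89, cell) for cell in (1, 2, 3))
    print(row_89.hex(), row_89.decode("cp932"))
```

Decoding the resulting bytes with a JIS X 0213-based codec would be expected to give different characters, since that standard assigns the same row differently.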
 
Windows-31J includes standard 7-bit [[ASCII]] codes for single-byte sequences with the high bit set to 0. Hence, codes 0x5C and 0x7E are mapped to U+005C REVERSE SOLIDUS (<code>\</code>) and U+007E TILDE (<code>~</code>) respectively,<ref>{{cite web | url=http://www.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WINDOWS/CP932.TXT | title=CP932.TXT | publisher=Unicode Consortium}}</ref><ref>{{cite web | url=https://msdn.microsoft.com/en-us/library/cc194889.aspx | title=Lead byte NULL — Code page 932 | publisher=Microsoft}}</ref> as they are in ASCII ([[ISO 646|ISO-646]]-US). This is often a source of confusion because in many Japanese fonts, code 0x5C is displayed as a [[JPY|Yen]] symbol, which would normally be represented as U+00A5 YEN SIGN (<code>¥</code>) in Unicode. This stems from the fact that 0x5C is mapped to U+00A5 in [[Code page 895|ISO-646-JP]] and consequently [[JIS X 0201]], of which standard [[Shift JIS]] is an extension. However, code 0x5C in Windows-31J behaves as a reverse solidus (backslash) in all respects (e.g. in [[filename|file paths]] on Windows systems) other than how it is displayed by some fonts.
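
As an illustration of the mapping described above, the following sketch decodes the two bytes with Python's built-in <code>cp932</code> codec, which is assumed here to follow the Microsoft mapping table, and reports the resulting Unicode character names. The sample path is hypothetical.

```python
import unicodedata

# Bytes 0x5C and 0x7E decoded as Windows code page 932.
decoded = bytes([0x5C, 0x7E]).decode("cp932")
names = [unicodedata.name(ch) for ch in decoded]

# A Windows-style path round-trips with 0x5C acting as the separator.
path = "C:\\Windows\\System32"  # hypothetical example path
encoded = path.encode("cp932")
parts = encoded.split(b"\x5c")

if __name__ == "__main__":
    print(names)
    print(parts)
```

The decoded characters are U+005C and U+007E rather than U+00A5 or an overline; a yen-sign glyph shown for 0x5C is a property of the font and not of the character mapping.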
 
IBM-943, however, like [[Code page 932|IBM-932]], is a superset of [[Code page 897|IBM-897]],<ref name="ibm943"/> which assigns 0x5C to the Yen symbol (¥) and 0x7E to the [[Macron|Overline]] (¯),<ref>{{cite web | url=ftp://ftp.software.ibm.com/software/globalization/gcoc/attachments/CP00897.txt | title=CP00897.txt | publisher=IBM}}</ref> as in [[JIS X 0201]].
 
==See also==