Microsoft Windows code page 932 (Windows-932, or ambiguously CP932) is Microsoft's extension of Shift JIS. IBM calls it code page 943 (CP943),[1] the IANA registers it as Windows-31J,[2] and it is also called MS-Kanji.[3] In addition to the standard JIS X 0201:1997 and JIS X 0208:1997 characters, it includes the NEC special characters (Row 13), the NEC selection of IBM extensions (Rows 89 to 92), and the IBM extensions (Rows 115 to 119). It is a combination of code page 897 and code page 941.
Windows-31J is often mistaken for standard Shift JIS: while similar, the distinction is significant for computer programmers wishing to avoid mojibake. The "Windows-31J" name, however, is IANA's and not recognized by Microsoft, which has historically used "shift_jis" instead. In Japanese editions of Windows, this code page is referred to as "ANSI", since it is the operating system's default 8-bit encoding, even though ANSI was not involved in its definition.
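The practical difference can be seen by decoding the same bytes with two codecs. The following is a minimal sketch that assumes Python's built-in `cp932` and `shift_jis` codecs as stand-ins for Windows-31J and standard Shift JIS; the sample byte values are chosen for illustration, and other implementations may behave differently.

```python
# Row 13 (NEC special characters) exists only in the Microsoft extension.
nec_circled_one = b"\x87\x40"
print(nec_circled_one.decode("cp932"))  # CIRCLED DIGIT ONE, U+2460

# Python's strict Shift JIS codec has no mapping for that row.
try:
    nec_circled_one.decode("shift_jis")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False
print("decodable as shift_jis:", strict_ok)

# Some shared byte sequences also map to different Unicode characters.
wave = b"\x81\x60"
as_cp932 = wave.decode("cp932")     # U+FF5E FULLWIDTH TILDE
as_sjis = wave.decode("shift_jis")  # U+301C WAVE DASH
print(hex(ord(as_cp932)), hex(ord(as_sjis)))
```

Text written under one label and read under the other can therefore either fail to decode or silently change characters.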
Code page 943 contains the standard 7-bit ASCII codes, and Japanese characters are indicated by a byte with the high bit set to 1. Some of these bytes are lead bytes that require a second byte, so each character is encoded in either 8 or 16 bits.
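The mixed 8-bit and 16-bit structure can be illustrated by splitting a byte string into character units. This sketch assumes the usual Shift JIS lead-byte ranges (0x81 to 0x9F and 0xE0 to 0xFC), which are not stated in the text above; `is_lead_byte` and `split_cp932` are hypothetical helper names, and the code does not validate trail bytes.

```python
from typing import List


def is_lead_byte(b: int) -> bool:
    """Return True if b starts a two-byte character (assumed ranges)."""
    return 0x81 <= b <= 0x9F or 0xE0 <= b <= 0xFC


def split_cp932(data: bytes) -> List[bytes]:
    """Split encoded bytes into one- and two-byte character units."""
    units = []
    i = 0
    while i < len(data):
        width = 2 if is_lead_byte(data[i]) else 1
        units.append(data[i:i + width])
        i += width
    return units


# An ASCII letter, a half-width katakana (U+FF71), and a kanji (U+6F22).
sample = "A\uff71\u6f22".encode("cp932")
units = split_cp932(sample)
print([len(u) for u in units])
```

The ASCII letter and the half-width katakana each take one byte, while the kanji takes two.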
Notice that in the CP932.TXT mapping table linked below, code 0x5C is mapped to U+005C REVERSE SOLIDUS (\), as it is in ASCII (ISO-646-US). This is often a source of confusion because in many Japanese fonts, this code is displayed as a Yen symbol, which would normally be represented as U+00A5 YEN SIGN (¥) in Unicode. This stems from the fact that 0x5C is mapped to U+00A5 in ISO-646-JP and consequently JIS X 0201. However, on Windows systems, code 0x5C in code page 943 behaves as a reverse solidus (backslash) in all respects (e.g. in file paths) other than how it is displayed by some fonts.
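The 0x5C mapping can be checked directly. This is a small sketch that assumes Python's `cp932` codec follows the CP932.TXT table; the path is a made-up example.

```python
# A Windows-style path whose separators are the byte 0x5C.
path_bytes = b"C:\x5cUsers\x5cdocs"
path_text = path_bytes.decode("cp932")
print(path_text)

# The separator decodes to U+005C, not U+00A5, whatever a font may draw.
print([hex(ord(ch)) for ch in path_text if ch == "\\"])

# The only yen character in the repertoire is the full-width form U+FFE5,
# which is a separate two-byte code.
fullwidth_yen = "\uffe5".encode("cp932")
print(fullwidth_yen)
```

Because the decoded character is U+005C, the string round-trips and works as a path separator even when a Japanese font renders it as a Yen symbol.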
References
- IBM Code Page 943
- IANA Character Sets
- "7.2.3. Standard Encodings". Python 3.6 Documentation. Python Software Foundation. Retrieved 19 September 2017.