Code page 932 (Microsoft Windows): Difference between revisions

== Differences from standard Shift JIS ==
 
Windows-31J is often mistaken for standard Shift JIS. The two are similar, but the distinction matters to programmers who want to avoid [[mojibake]]. In addition to the standard [[JIS X 0201]]:1997 and [[JIS X 0208]]:1997 characters, it includes "NEC special characters (Row 13), NEC selection of IBM extensions (Rows 89 to 92), and IBM extensions (Rows 115 to 119)".<ref name="iana31j" /> Such "formerly proprietary extensions from IBM and NEC", while not part of the JIS standards, are included in the [[W3C]]/[[WHATWG]] encoding standard used by [[HTML5]].<ref>{{cite web | url=https://encoding.spec.whatwg.org/#index-jis0208 | title=Index jis0208 | publisher=WHATWG | work=Encoding Standard}}</ref>

Some of these rows were subsequently used differently by [[JIS X 0213]]. For example, compare row 89 in JIS X 0213 (beginning 硃, 硎, 硏…)<ref>{{cite web | url=https://www.itscj.ipsj.or.jp/iso-ir/233.pdf | title=233: Japanese Graphic Character Set for Information Interchange, Plane 1 | publisher=IPSJ}}</ref> to row 89 as used by IBM/NEC extensions (beginning 纊, 褜, 鍈).<ref>{{cite web | url=https://encoding.spec.whatwg.org/jis0208.html | title=Index jis0208 visualization | publisher=WHATWG | work=Encoding Standard}}</ref>
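
As an illustration, the extension rows can be probed from Python. The following is a minimal sketch, assuming CPython's bundled <code>cp932</code> and <code>shift_jis</code> codecs are reasonable stand-ins for Windows-31J and JIS X 0208-only Shift JIS respectively. The helper names <code>kuten_to_sjis</code> and <code>try_decode</code> are hypothetical and exist only for this example; the row-to-byte arithmetic is the usual Shift JIS transform, extended here to rows above 94.

```python
def kuten_to_sjis(row: int, cell: int) -> bytes:
    """Convert a 1-based row/cell pair to a two-byte Shift JIS sequence.

    Hypothetical helper for illustration only. Rows 1 to 94 are the
    JIS X 0208 range; rows 95 to 120 follow the same arithmetic and
    cover the lead bytes 0xF0 to 0xFC used by the IBM extensions.
    """
    if not (1 <= row <= 120 and 1 <= cell <= 94):
        raise ValueError("row must be 1..120 and cell must be 1..94")
    # Two rows share one lead byte; lead bytes skip the range 0xA0..0xDF.
    lead = (row - 1) // 2 + (0x81 if row <= 62 else 0xC1)
    if row % 2 == 1:
        # Odd rows use trail bytes 0x40..0x9E, skipping 0x7F.
        trail = cell + (0x3F if cell <= 63 else 0x40)
    else:
        # Even rows use trail bytes 0x9F..0xFC.
        trail = cell + 0x9E
    return bytes([lead, trail])


def try_decode(data: bytes, codec: str):
    """Return the decoded text, or None if the codec rejects the bytes."""
    try:
        return data.decode(codec)
    except UnicodeDecodeError:
        return None


# First cell of one row from each extension block quoted above.
for row in (13, 89, 115):
    data = kuten_to_sjis(row, 1)
    print(row, data.hex(), try_decode(data, "cp932"), try_decode(data, "shift_jis"))
```

Under these assumptions, the <code>cp932</code> codec decodes all three sequences, while the stricter <code>shift_jis</code> codec rejects them, which is the kind of mismatch that produces mojibake or decoding errors when the two labels are confused.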
 
Windows-31J includes standard 7-bit [[ASCII]] codes for single-byte sequences with the high bit set to 0. Hence, codes 0x5C and 0x7E are mapped to U+005C REVERSE SOLIDUS (<code>\</code>) and U+007E TILDE (<code>~</code>) respectively, as they are in ASCII ([[ISO 646|ISO-646]]-US).<ref>{{cite web | url=http://www.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WINDOWS/CP932.TXT | title=CP932.TXT | publisher=Unicode Consortium}}</ref> This is often a source of confusion because in many Japanese fonts, code 0x5C is displayed as a [[JPY|Yen]] symbol, which would normally be represented as U+00A5 YEN SIGN (<code>¥</code>) in Unicode. This stems from the fact that 0x5C is mapped to U+00A5 in [[Code page 895|ISO-646-JP]] and consequently [[JIS X 0201]], of which standard [[Shift JIS]] is an extension. However, code 0x5C in Windows-31J behaves as a reverse solidus (backslash) in all respects (e.g. in [[filename|file paths]] on Windows systems) other than how it is displayed by some fonts.
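
The single-byte mapping described above can be checked with a short script. This is a sketch that assumes Python's <code>cp932</code> codec follows the CP932.TXT mapping cited above; the sample path is made up for the example. It shows only which code points the bytes decode to, not how any particular font draws them.

```python
import unicodedata

# Decode the two single-byte codes discussed above as Windows-31J.
raw = bytes([0x5C, 0x7E])
text = raw.decode("cp932")

# Look up the Unicode character name of each decoded code point.
names = [unicodedata.name(ch) for ch in text]
print([f"U+{ord(ch):04X}" for ch in text], names)

# Byte 0x5C acts as a path separator regardless of how a font renders it.
# The path below is an invented example.
path = b"C:\x5cUsers".decode("cp932")
print(path.split("\\"))
```

A Japanese font may still draw the decoded U+005C with a yen-sign glyph, but the underlying character, and therefore its behaviour in string and path handling, is the backslash.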