Changing short description from "Legacy Shift JIS character encoding of the Japanese Windows system locale." to "Japanese Windows character encoding / Shift JIS variant." (Shortdesc helper) |
Line 46:
=== Single-byte character differences ===
Windows-932 uses the standard 7-bit [[ASCII]] mappings for single-byte codes whose high bit is 0. Hence, codes 0x5C and 0x7E are mapped to Unicode as U+005C REVERSE SOLIDUS (<code>\</code>, the [[backslash]]) and U+007E [[tilde|TILDE]] (<code>~</code>) respectively,<ref name="msmapping">{{cite web | url=https://www.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WINDOWS/CP932.TXT | title=CP932.TXT | publisher=Unicode Consortium}}</ref><ref name="msrefrender">{{cite web | url=https://msdn.microsoft.com/en-us/library/cc194889.aspx | title=Lead byte NULL — Code page 932 | publisher=Microsoft}}</ref><ref name="w3cjpprof"/> as they are in ASCII ([[ISO 646|ISO-646]]-US). The W3C/WHATWG Encoding Standard does likewise.<ref>{{cite web | url=https://encoding.spec.whatwg.org/#shift_jis-decoder | title=12.3.1. Shift_JIS decoder | publisher=WHATWG | work=Encoding Standard}}</ref>
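The single-byte mapping described above can be checked with a short sketch. It assumes that Python's built-in <code>cp932</code> codec follows the CP932.TXT mapping for these two bytes; the variable names are illustrative only.

```python
import unicodedata

# Decode the two single-byte codes discussed above with Python's "cp932" codec
# (assumed here to follow the CP932.TXT mapping for these bytes).
decoded_5c = bytes([0x5C]).decode("cp932")
decoded_7e = bytes([0x7E]).decode("cp932")

for byte, char in ((0x5C, decoded_5c), (0x7E, decoded_7e)):
    # Print the byte, the resulting code point, and its Unicode character name.
    print(f"0x{byte:02X} -> U+{ord(char):04X} {unicodedata.name(char)}")
```

Under that assumption, both bytes decode to the same code points they have in ASCII, not to U+00A5 or U+203E.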
Nonetheless, 0x5C in Windows-932 is considered a Yen sign in certain contexts.<ref name="kaplan">{{cite web | title=When is a backslash not a backslash? | date=2005-09-17 | author=Kaplan, Michael S. | url=http://archives.miloush.net/michkap/archive/2005/09/17/469941.html | work=Sorting it all out}}</ref> For this reason, many Japanese fonts display U+005C as a Yen sign, which would normally be represented as U+00A5, rather than as a backslash per Unicode's suggested rendering. U+00A5 is one-way best-fit mapped onto 0x5C in Windows-932. Apart from how some fonts display it, code 0x5C in Windows-932 behaves as a reverse solidus (backslash) in all respects (e.g. in [[filename|file paths]] on Windows systems),<ref name="kaplan" /> and Microsoft's documentation for Windows-932 displays 0x5C as a backslash.<ref name="msrefrender" /> This mapping<ref name="msmapping" /> corresponds to the encoding named "ibm-943_P15A-2003" in [[International Components for Unicode]] (ICU),<ref name="icuwindows31j" /> except for a minor reordering of a few [[C0 control characters]].
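The following sketch illustrates two of the points above: a decoded 0x5C acts as a Windows path separator, and a best-fit mapping of U+00A5 onto 0x5C cannot be reversed. The helper <code>encode_best_fit</code> and the sample path are hypothetical; the helper only models the one-way mapping by hand and is not an API of Windows or of Python's codec.

```python
from pathlib import PureWindowsPath

YEN_SIGN = "\u00a5"

def encode_best_fit(text: str) -> bytes:
    """Hypothetical helper: fold U+00A5 onto the backslash, then encode."""
    return text.replace(YEN_SIGN, "\\").encode("cp932")

# Byte 0x5C decodes to U+005C, which Windows path handling treats as a separator.
raw_path = b"C:\x5cUsers\x5ctaro"  # illustrative path, not from the source
path = PureWindowsPath(raw_path.decode("cp932"))
print(path.parts)

# The best-fit mapping is one-way: decoding yields a backslash, not U+00A5.
encoded = encode_best_fit(YEN_SIGN + "100")
round_tripped = encoded.decode("cp932")
print(encoded, round_tripped == YEN_SIGN + "100")
```

Whether the resulting character is drawn as a backslash or a Yen sign is then purely a matter of the font in use.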