Line 9:
Until the 2000s, most Japanese [[email]]s were in [[ISO-2022-JP]] ("JIS encoding"), most [[web page]]s were in [[Shift-JIS]], and mobile phones in Japan usually used some form of [[Extended Unix Code]].<ref>{{Cite web|url=http://ash.jp/code/code.htm|title=文字コードについて|date=2002|publisher=ASH Corporation|access-date=2019-05-14}}</ref> If a program fails to determine which encoding scheme was used, the result can be {{Nihongo3|"misconverted garbled/garbage characters"|文字化け|''[[mojibake]]''|literally "transformed characters"}}, which makes the text unreadable on computers.
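As a minimal sketch of how mojibake arises, the following Python snippet encodes a string as Shift-JIS and then decodes the bytes as if they were EUC-JP. The sample string and the variable names are illustrative assumptions; only Python's standard codecs are used.

```python
# Hypothetical sample text; any Japanese string would do.
original = "文字化け"

# The sender encodes the text as Shift-JIS.
raw = original.encode("shift_jis")

# A receiver that wrongly assumes EUC-JP gets garbage. errors="replace"
# substitutes U+FFFD for byte sequences that are invalid in EUC-JP.
garbled = raw.decode("euc_jp", errors="replace")

# Decoding with the encoding that was actually used recovers the text.
recovered = raw.decode("shift_jis")

print(repr(garbled))
print(recovered)
```

The bytes themselves are unchanged in both cases; only the assumed encoding differs.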
[[File:PC-9801F Kanji ROM board.jpg|thumb|Kanji [[Read-only memory|ROM]] card installed in a [[PC-9800 series|PC-98]]. It stored about 3000 glyphs for fast display and also had [[Random-access memory|RAM]] to store gaiji.]]
[[File:Control panel of public background music system.jpg|thumb|Some embedded devices still use [[half-width kana]].]]
The first encoding to become widely used was [[JIS X 0201]], a [[ISO 646|single-byte encoding]] that covers only the standard 7-bit [[ASCII]] characters plus [[Half-width kana|half-width katakana]] extensions. It was widely used in systems that had neither the processing power nor the storage to handle kanji (including old embedded equipment such as cash registers), because kana-kanji conversion required a complicated process and kanji output required a large amount of memory and a high-resolution display. Such systems therefore supported only katakana, not kanji. Some embedded displays still have this limitation.
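The one-byte-per-character property can be illustrated with a short Python sketch. Python has no standalone JIS X 0201 codec, so this example assumes the `shift_jis` codec as a stand-in: Shift-JIS keeps half-width katakana as single bytes in the range 0xA1–0xDF. The sample strings are illustrative.

```python
# Half-width katakana followed by ASCII (illustrative sample).
text = "ｱｲｳ ABC"
encoded = text.encode("shift_jis")

# Every character occupies exactly one byte.
print(len(text), len(encoded))
print(encoded.hex(" "))

# The katakana bytes fall in the 8-bit range 0xA1-0xDF.
kana_bytes = [b for b in encoded if 0xA1 <= b <= 0xDF]
print(len(kana_bytes))

# A kanji, by contrast, does not fit in a single byte.
kanji_len = len("漢".encode("shift_jis"))
print(kanji_len)
```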
Line 49:
This can happen, for example, in the [[C (programming language)|C]] programming language when text strings contain Shift-JIS. It does not happen in HTML, because the ASCII bytes 0x00–0x3F (which include ", %, & and some other common escape characters and string separators) never appear as the second byte in Shift-JIS, and the backslash is not an escape character there. It can, however, happen in [[JavaScript]], which can be embedded in HTML pages.
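A short Python sketch can make the problem concrete by listing the characters in a string whose Shift-JIS second byte is 0x5C, the ASCII backslash. The function name and the sample string are hypothetical and chosen only for illustration.

```python
BACKSLASH = 0x5C  # ASCII code of the backslash

def trailing_backslash_chars(text: str) -> list[str]:
    """Return the characters whose two-byte Shift-JIS form ends in 0x5C."""
    hits = []
    for ch in text:
        encoded = ch.encode("shift_jis")
        if len(encoded) == 2 and encoded[1] == BACKSLASH:
            hits.append(ch)
    return hits

sample = "表示ソフト"
risky = trailing_backslash_chars(sample)
print(risky)

# The raw bytes of one such character: lead byte 0x95, then 0x5C.
print("表".encode("shift_jis").hex(" "))
```

A parser that scans such bytes as ASCII sees the second byte as the start of an escape sequence, which is why string literals containing these characters can be misread.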
[[Extended Unix Code|EUC]], on the other hand, is handled much better by parsers that were written for 7-bit ASCII (which is why [[Extended Unix Code|EUC]] encodings are used on UNIX, where much of the file-handling code was historically written only for English encodings). However, EUC is not backwards compatible with JIS X 0201, the first main Japanese encoding. Further complications arise because the original Internet e-mail standards supported only 7-bit transfer protocols. {{IETF RFC|1468}} ("[[ISO-2022-JP]]", often simply called [[JIS encoding]]) was therefore developed for sending and receiving e-mails.
[[File:Japanese TV closed caption using gaiji.jpg|thumb|[[Gaiji]] are used in closed captions of Japanese TV broadcasts.]]
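The difference between the two encodings can be sketched in Python using the standard `euc_jp` and `iso2022_jp` codecs. The sample text and variable names are illustrative assumptions.

```python
text = "日本語 text"

euc = text.encode("euc_jp")
jis = text.encode("iso2022_jp")

# EUC-JP: every byte of a Japanese character has the high bit set, so the
# bytes below 0x80 are exactly the ASCII characters of the original text.
ascii_in_euc = bytes(b for b in euc if b < 0x80)
print(ascii_in_euc)

# ISO-2022-JP: every byte is 7-bit. Escape sequences switch character sets:
# ESC $ B selects JIS X 0208 and ESC ( B switches back to ASCII.
is_seven_bit = all(b < 0x80 for b in jis)
print(is_seven_bit)
print(jis)
```

Because EUC-JP never reuses ASCII byte values inside a multi-byte character, a parser that looks only for ASCII delimiters is not misled, while ISO-2022-JP passes through channels that carry only 7 bits.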
[[Character set]] standards such as [[JIS X 0208|JIS]] do not include every required character, so [[gaiji]] ({{lang|ja|外字}} "external characters") are sometimes used to supplement the character set. Gaiji may come in the form of external font packs in which normal characters have been replaced with new characters, or in which new characters have been added to unused character positions. However, gaiji are not practical in [[Internet]] environments, because the font set must be transferred along with the text for the gaiji to be usable. As a result, such characters are replaced with similar or simpler characters, or the text may need to be encoded using a larger character set (such as Unicode) that supports the required character.<ref>{{Cite web|url=http://heicyann.com/pc/20160218a/|title=住基ネット統一文字コードによる外字の統一について|last=兵ちゃん|date=2016-02-18|access-date=2019-05-14|archive-date=2020-08-02|archive-url=https://web.archive.org/web/20200802022153/http://heicyann.com/pc/20160218a/|url-status=dead}}</ref>
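As a rough illustration of a character that falls outside a legacy character set, the Python sketch below tests whether a character can be encoded in a given encoding. The helper name `can_encode` is hypothetical, and 𠮷 (U+20BB7, a variant of 吉) is used only as an example of a character that is absent from JIS X 0208 but present in Unicode.

```python
def can_encode(ch: str, encoding: str) -> bool:
    """Return True if ch is representable in the given encoding."""
    try:
        ch.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True

common = "吉"   # included in JIS X 0208
variant = "𠮷"  # U+20BB7, not included in JIS X 0208

print(can_encode(common, "shift_jis"))
print(can_encode(variant, "shift_jis"))
print(can_encode(variant, "utf-8"))
```

Software limited to the legacy encoding has to substitute a similar character such as 吉 or rely on a gaiji font, whereas a Unicode encoding can carry the variant directly.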