Alexlatham96 (talk | contribs) |
m →{{anchor|0|999|57344|61439|65280|65533|65534|65535}}List of code page assignments: cell templates |
(29 intermediate revisions by 9 users not shown) | |||
Line 14:
With the release of [[PC DOS]] version 3.3 (and the near-identical [[MS-DOS]] 3.3), IBM introduced the code page numbering system to regular PC users: the code page numbers (and the phrase "code page") were used in new commands that allowed the character encoding used by all parts of the OS to be set in a systematic way.<ref name="Duncan_1988_MS-DOS_Encyclopedia"/>
[[File:IBM CJK Code Page Numbers.svg|right|thumb|IBM code page numbers (CPGIDs and CCSIDs) used for CJK encodings. Microsoft's use of code page numbers for CJK encodings differs, and is noted in brackets where applicable.]]
Since IBM and Microsoft ceased to cooperate in the 1990s, the two companies have maintained their lists of assigned code page numbers independently of each other, resulting in some conflicting assignments. At least one third-party vendor ([[Oracle Corporation|Oracle]]) also has its own, different list of numeric assignments.<ref name="oracle.com"/> IBM's current assignments are listed in its [[CCSID]] repository, while Microsoft's assignments are documented on [[MSDN]].<ref name="Microsoft_Codepage-ID"/> Additionally, a list of the names and approximate IANA ([[Internet Assigned Numbers Authority]]) abbreviations for the code pages installed on any given Windows machine can be found in the Registry on that machine (this information is used by Microsoft programs such as [[Internet Explorer]]).
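As a rough illustration of why the code page number matters, the following Python sketch decodes one and the same byte under several code pages. It assumes Python's standard-library codecs `cp437`, `cp850`, `cp866` and `cp1252`, which are named after code page numbers; the helper `decode_with_code_page` is a hypothetical name introduced only for this example.

```python
import unicodedata


def decode_with_code_page(data: bytes, code_page: int) -> str:
    """Decode bytes with Python's built-in codec for a numbered code page.

    Assumption: a codec named "cp<number>" exists in the standard library;
    an unknown number raises LookupError.
    """
    return data.decode(f"cp{code_page}")


raw = b"\x82"  # a single byte above the ASCII range
for cp in (437, 850, 866, 1252):
    char = decode_with_code_page(raw, cp)
    # Print the Unicode character name so the output stays ASCII-only.
    print(cp, f"U+{ord(char):04X}", unicodedata.name(char))
```

The byte value is identical in every case; only the code page chosen for interpretation changes the resulting character.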
Line 254:
* 1278 – EBCDIC Adobe (PostScript) Standard Encoding
* 1279 – Hitachi Japanese Katakana Host<ref name="Paul_2001_CODEPAGE"/>
* 1300 – Generic Bar Code/OCR-B
* 1301 – Zip + 4 POSTNET Bar
* 1302 – Facing Identification Marks
* 1303 – EBCDIC Bar Code
Line 275:
* [[Code page 301|301]] – IBM-PC Japan (Kanji) DBCS
* [[Code page 437|437]] – Original IBM PC hardware code page
*
* [[Code page 737|737]] – [[Greek language|Greek]]
*
* [[Code page 808|808]] – Russian with euro (same without euro: [[Code page 866|866]])
* [[Code page 848|848]] – Ukrainian with euro (same without euro: [[Code page 1125|1125]])
* [[Code page 849|849]] – Belarusian with euro (same without euro: [[Code page 1131|1131]])
* [[Code page 850|850]] – Latin-1
*
*
*
*
*
*
* [[Code page 858|858]] – Latin-1 with [[euro]] symbol
*
*
* [[Code page 861|861]] – [[Icelandic language|Icelandic]]
* [[Code page 862|862]] – [[Hebrew language|Hebrew]]
Line 300:
* [[Code page 868|868]] – [[Urdu language|Urdu]]
* [[Code page 869|869]] – [[Greek alphabet|Greek]]
*
* [[Code page 874|874]] – Thai with Low Tone Marks & Ancient Chars (conflictive ID with Windows 874; version with euro: [[Code page 1161|1161]]; Windows version is IBM [[Code page 1162|1162]])<!-- Attention! Neither IBM 874 nor Windows 874 is rigorously the same as ISO 8859-11 / TIS 620-2533; ISO 8859-11 is probably IBM 873 -->
*
*
* [[Code page 878|878]] – [[KOI8-R]]
* [[Code page 891|891]] – Korean PC SBCS
*
* [[Code page 899|899]] – IBM-PC Symbol
* [[Code page 903|903]] – Simplified Chinese PC SBCS
* [[Code page 904|904]] – Traditional Chinese PC SBCS
*
* [[Code page 907|907]] – ASCII APL (3812)
* [[Code page 909|909]] – IBM-PC APL2 Extended
Line 330:
* [[Code page 949 (IBM)|949]] – Korean (Extended Wansung (ks_c_5601-1987)) ([[Code page 1088|1088]] + [[Code page 951|951]]) (conflictive ID with Windows 949 (Unified Hangul Code); Windows version is IBM 1363)
* [[Code page 951|951]] – Korean DBCS (IBM KS Code) (conflictive ID with Windows 951, a hack of Windows 950 with Unicode mappings for some PUA Unicode characters found in HKSCS, based on the file name)
*
* [[Code page 1040|1040]] – Korean Extended
* [[Code page 1041|1041]] – Japanese Extended (JIS X 0201 Extended)
* [[Code page 1042|1042]] – Simplified Chinese Extended
* [[Code page 1043|1043]] – Traditional Chinese Extended
*
* [[Code page 1086|1086]] – IBM-PC Japan #1
* [[Code page 1088|1088]] – Revised Korean (SBCS)
* [[Code page 1092|1092]] – IBM-PC Modified Symbols
* [[Code page 1098|1098]] – [[Persian language|Farsi]]
*
*
* [[Code page 1115|1115]] – IBM-PC People's Republic of China
* [[Code page 1116|1116]] – Estonian
Line 354:
* [[Code page 1167|1167]] – [[KOI8-RU]]
* [[Code page 1168|1168]] – [[KOI8-U]]
* [[Code page 1300|1300]] – ANSI [PTS-DOS 6.70, not 6.51]
* [[Code page 1370|1370]] – Traditional Chinese MIX ([[Big5|Big5 encoding]]) ([[Code page 1114|1114]] + [[Code page 947|947]] + euro) (same without euro: [[Code page 950|950]])
* [[Code page 1380|1380]] – IBM-PC Simplified Chinese GB PC-DATA (DBCS PC IBM GB 2312-80)
Line 377 ⟶ 376:
* [[Code page 895|895]] – 7-bit Japan Latin
* [[Code page 896|896]] – 7-bit Japan Katakana Extended
* [[Code page 901|901]] –
* [[Code page 902|902]] – ISO Estonian with euro (same without euro: [[Code page 922|922]])
* [[Code page 912|912]] – [[ISO 8859-2]] (extended in 1999)
* [[Code page 913|913]] – [[ISO 8859-3]]
* [[Code page 914|914]] – [[ISO 8859-4]]
* [[Code page 915|915]] – [[ISO 8859-5]] (extended
* [[Code page 916|916]] – [[ISO 8859-8]]
* [[Code page 919|919]] – [[ISO 8859-10]]
* [[Code page 920|920]] – [[ISO 8859-9]]
* [[Code page 921|921]] –
* [[Code page 922|922]] – ISO Estonian (same with euro: [[Code page 902|902]])
* [[Code page 923|923]] – [[ISO 8859-15]]
Line 448 ⟶ 447:
* [[Code page 1126|1126]] – IBM-PC Korean SBCS
* [[Code page 1162|1162]] – Windows Thai (extension of [[Code page 874|874]], but still called 874 in Windows)
*
* [[Code page 1174|1174]] – Windows Kazakh<ref name="Kazakh_1174"/>
* [[Code page 1250|1250]] – Windows [[Central Europe]]
* [[Code page 1251|1251]] – Windows [[Cyrillic script|Cyrillic]]
Line 507 ⟶ 506:
* [[Code page 1055|1055]] – HP PC-Line
* [[Code page 1056|1056]] – HP Line Draw
*
* [[Code page 1058|1058]] – HP PC-8DN ('''not''' the same as [[code page 865]])
* [[Code page 1351|1351]] – Japanese DBCS HP character set
Line 579 ⟶ 578:
{{Div col|colwidth=30em}}
* [[Code page 932 (Microsoft Windows)|932]] – Supports [[Japanese writing system|Japanese]] [[Shift-JIS]]
* [[Code page 936 (Microsoft Windows)|936]] – Supports [[Simplified Chinese characters|Simplified Chinese]] [[GB2312]] or [[GBK (character encoding)|GBK]]
* [[Unified Hangul Code|949]] – Supports [[Hangul|Korean]] Unified Hangul Code
* [[Code page 950|950]] – Supports [[Traditional Chinese characters|Traditional Chinese]] [[Big5]]
** [[Code page 950|951]] – Supports [[Traditional Chinese characters|Traditional Chinese]] [[Big5]] with [[HKSCS]]
{{div col end}}
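The double-byte code pages listed above can be tried out with a short Python sketch. It assumes Python's standard-library codecs `cp932`, `cp936`, `cp949` and `cp950`, which may differ in edge cases from the vendor tables; `encode_hex` is a hypothetical helper name.

```python
def encode_hex(text: str, code_page: int) -> str:
    """Encode text with the codec "cp<number>" and return the bytes as hex."""
    return text.encode(f"cp{code_page}").hex()


# One sample character per code page (escapes keep the source ASCII-only).
samples = {
    932: "\u3042",  # HIRAGANA LETTER A
    936: "\u554a",  # a common Simplified Chinese character
    949: "\uac00",  # HANGUL SYLLABLE GA
    950: "\u4e00",  # CJK ideograph "one"
}
for cp, char in samples.items():
    print(cp, encode_hex(char, cp))

# The same two bytes name different characters in different code pages.
ambiguous = b"\xb0\xa1"
print(ascii(ambiguous.decode("cp936")), ascii(ambiguous.decode("cp949")))
```

Because the byte pair `B0 A1` is valid in more than one of these code pages, the bytes alone do not identify the text; the code page number has to be known from context.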
Line 588 ⟶ 589:
{{Div col|colwidth=30em}}
*
* [[Code page 720|720]] – Arabic (Transparent ASMO)
* [[Code page 737|737]] – [[Greek language|Greek]]
* [[Code page 850|850]] – Latin-1
*
*
*
*
* [[Code page 858|858]] – Latin-1 with [[euro]] symbol
*
*
* [[Code page 861|861]] – [[Icelandic language|Icelandic]]
* [[Code page 862|862]] – [[Hebrew language|Hebrew]]
Line 733 ⟶ 732:
* Symbol Set 8V — HP Arabic-8<!-- Contradictory sources about "Arabic-8"; http://h30434.www3.hp.com/t5/Printer-Software-and-Drivers/Arabic-fonts-on-Network-Printers/td-p/2231625 and http://printronix.com/emea/wp-content/uploads/manuals/PTX_PRM_ACA_P8_258187a.pdf -->
* Symbol Set 9K — HP Korean-8<!-- (ASCII + Jamo Code Table?) -->
* Symbol Set 9T — PC 8T (also known as Code Page 437-T; this is '''not'''
* Symbol Set 9V — Latin / Arabic for Windows (this is '''not''' [[code page 1256]])
* Symbol Set 11U — PC 8D/N (also known as Code Page 437-N; coded by IBM as [[code page 1058]]; this is '''not''' [[code page 865]])
Line 786 ⟶ 785:
* Symbol Set 9R — Windows 98 Cyrillic (Practically the same as [[code page 1251]])
* Symbol Set 9U — Windows 3.0
* Symbol Set 10G — PC-851 Latin/Greek (Practically the same as
* Symbol Set 10J — PS Text (Practically the same as [[PostScript Standard Encoding|Adobe Standard]])
* Symbol Set 10L — PS ITC Zapf Dingbats (Practically the same as
* Symbol Set 10N — ISO 8859-5 Latin/Cyrillic (1988 version — IR 144)
* Symbol Set 10R — PC-855 Cyrillic (Practically the same as
* Symbol Set 10T — Teletex<!-- (CCITT T.61?) -->
* Symbol Set 10U — PC-8 (Practically the same as [[code page 437]]; coded by IBM as
* Symbol Set 10V — CP-864 (Practically the same as [[code page 864]])
* Symbol Set 11G — CP-869 (Practically the same as [[code page 869]])
* Symbol Set 11J — PS ISO Latin-1 (Practically the same as
* Symbol Set 11N — ISO 8859-6 Latin/Arabic
* Symbol Set 12G — PC Latin/Greek (Practically the same as [[code page 737]])
* Symbol Set 12J — MC Text (Practically the same as [[Mac OS Roman|Macintosh Roman]])
* Symbol Set 12N — ISO 8859-7 Latin/Greek
* Symbol Set 12R — PC Gost (Practically the same as
* Symbol Set 12U — PC-850 Latin 1 (Practically the same as [[code page 850]])
* Symbol Set 13J — Ventura International
Line 810 ⟶ 809:
* Symbol Set 14R — PC Ukrainian (Practically the same as [[RUSCII]])
* Symbol Set 15H — PC-862 Israel (Practically the same as [[code page 862]])
* Symbol Set 16U — PC-857 Latin 5 (Practically the same as
* Symbol Set 17U — PC-852 Latin 2 (Practically the same as
* Symbol Set 18N — [[UTF-8]]
* Symbol Set 18U — PC-853 Latin 3 (Practically the same as
* Symbol Set 19L — Windows 98 Baltic (Practically the same as [[code page 1257]])
* Symbol Set 19M — Windows Symbol
* Symbol Set 19U — Windows 3.1 Latin 1 (Practically the same as [[code page 1252]])
* Symbol Set 20U — PC-860 Portugal (Practically the same as
* Symbol Set 21U — PC-861 Iceland (Practically the same as [[code page 861]])
* Symbol Set 23U — PC-863 Canada - French (Practically the same as [[code page 863]])
* Symbol Set 24Q — PC-Polish Mazowia (Practically the same as [[Mazovia encoding]])
* Symbol Set 25U — PC-865 Denmark/Norway (Practically the same as [[code page 865]])
* Symbol Set 26U — PC-775 Latin 7 (Practically the same as
* Symbol Set 27Q — PC-8 PC Nova (Practically the same as [
* Symbol Set 27U — PC Latvian Russian (also known as 866-Latvian)
* Symbol Set 28U — PC Lithuanian/Russian (Practically the same as [[code page 774]])
Line 836 ⟶ 835:
{{Div col|colwidth=30em}}
* [[Code page 100|100]] – DOS Hebrew hardware fontpage (Not from IBM; [[Hebrew MS-DOS|HDOS]])<ref name="Paul_2002"/>
*
*
*
*
*
*
*
*
*
* [[Code page 165|165]] – DOS Arabic (864 Extended) (Not from IBM; ADOS)<ref name="Paul_2002"/>
*
* [[Code page 437|190]] – DEC DOS German (appears to be identical to Code page 437)
* [[Code page 210|210]] – DEC DOS Greek (NEC Jetmate printers)
* 220 – DEC DOS Spanish (Not from IBM)
*
* [[Code page 620|620]] – DOS [[Mazovia encoding|Polish (Mazovia)]] (Not from IBM)<!-- Fido Mazowia? Variant with characters "Ć" and "ć" in positions 80 and 87? -->
* [[Code page 667|667]] – DOS [[Mazovia encoding|Polish (Mazovia)]] (Not from IBM)
*
*
*
* 709 – MS-DOS Arabic ([[Code page
*
*
*
*
* 721 – MS-DOS Arabic Nafitha International (Not from IBM)
* [[Code page 770|770]] – DOS Estonian, Latvian, Lithuanian<ref name="CP770"/> (From Lithuanian Lika Software;<ref name="lika"/> Lithuanian RST 1095-89 National Standard)
* 768 – Arabic Al-Arabi (Not from IBM)
*
* [[Code page 771|771]] – DOS Lithuanian/Cyrillic — KBL<ref name="CP771"/> (From Lithuanian Lika Software<ref name="lika"/>)
* [[Code page 772|772]] – DOS Lithuanian/Cyrillic<ref name="CP772"/> (From Lithuanian Lika Software;<ref name="lika"/> Lithuanian LST 1284:1993 National Standard; adopted by IBM as [[code page 1119]])
*
* [[Code page 774|774]] – DOS Lithuanian<ref name="CP774"/> (From Lithuanian Lika Software;<ref name="lika"/> Lithuanian LST 1283:1993 National Standard; adopted by IBM as [[code page 1118]])
*
*
*
*
* [[Code page 790|790]] – DOS [[Mazovia encoding|Polish (Mazovia)]]
*
*
*
*
*
*
* [[Code page 895|895]] – [[Kamenický encoding|Czech (Kamenický)]] (Not from IBM; conflictive ID with IBM CP895 — 7-bit EUC Japanese Roman)
* [[Mazovia encoding|896]] – DOS [[Mazovia encoding|Polish (Mazovia)]] (Not from IBM; conflictive ID with IBM CP896 — 7-bit EUC Japanese Katakana)<!-- Variant with the character "zł" in position 9B? -->
Line 882 ⟶ 883:
* [[ISO 8859-7|928]] – Greek (on Star<ref name="star"/> printers); same as Greek National Standard [[ISO 8859-7|ELOT 928]] (Not from IBM; conflictive ID with IBM CP928 — Simplified Chinese PC DBCS)
* [[Code page 966|966]] – Saudi Arabian (Not from IBM)
* 972 – Hebrew (VT100) (Not from IBM)
* [[Code page 991|991]] – DOS [[Mazovia encoding|Polish (Mazovia)]] (Not from IBM)
* [[Code page 999|999]] – DOS Serbo-Croatian I (Not from IBM); also known as PC Nova and CroSCII; lower part is JUSI.B1.002, upper part is code page 437; supports [[Slovenian language|Slovenian]] and [[Serbo-Croatian language|Serbo-Croatian]] (Latin script)
* [[Code page 1001|1001]] – Arabic (on Star<ref name="star"/> printers) (Not from IBM; conflictive ID with IBM CP1001 — MICR)
* [[Code page 1261|1261]] – Windows Korean IBM-1261 LMBCS-17, similar to [[Code page 1363|1363]]<!--https://web.archive.org/web/20161220082724/https://fossies.org/dox/w32tex-src/ucnv__lmb_8c_source.html Isn't it, by any chance, a misprint of code page 1361 (Johab)? Then code page 1261 is Windows Latin-3.-->
* [[Code page 1270|1270]] – Windows Sámi
* [[Code page 1300|1300]] – ANSI [PTS-DOS 6.70, not 6.51] (Not from IBM; conflictive ID with IBM EBCDIC 1300 — Generic Bar Code/OCR-B)
* [[Code page 771|2001]] – Lithuanian KBL (on Star<ref name="star"/> printers); same as code page 771
* [[Code page 1116|3001]] – Estonian 1 (on Star<ref name="star"/> printers); same as code page 1116
* [[Code page 922|3002]] – Estonian 2 (on Star<ref name="star"/> printers); same as code page 922
*
* [[Code page 866-Latvian|3012]] – Latvian-2 (on Star<ref name="star"/> printers); same as code page 866-Latvian (Latvian RST 1040-90 National Standard)
* [[MIK (character set)|3021]] – Bulgarian (on Star<ref name="star"/> printers); same as MIK
Line 924 ⟶ 927:
! ID !! Names !! Description !! Origin !! Platform !! DOS !! OS/2 !! Windows !! Mac !! Else !! Encoding !! Comment
|-
| 0 || {{N/A}} || Reserved || IBM, Microsoft || {{N/A}} || 3.3+ || 1.0+ ||
|-
| 437 || CP437, IBM437 || PC US || IBM<ref name="CP437"/> || IBM PC || 3.3+ || 1.0+ || {{Yes}} ||
|-
| 57344 - 61439 || {{N/A}} || Private use derivations || IBM || {{N/A}} || {{N/A}} || {{N/A}} || {{N/A}} || {{N/A}} || {{N/A}} || {{varies|various}} || Private use code page derivations (E000h-EFFFh)
|-
| 65280 - 65533 || {{N/A}} || Private use definitions || IBM || {{N/A}} || {{N/A}} || {{N/A}} || {{N/A}} || {{N/A}} || {{N/A}} || {{varies|various}} || Private use code page definitions (FF00h-FFFDh)
|-
| 65534 || {{N/A}} || Reserved || IBM, Microsoft || {{N/A}} ||
|-
| 65535 || {{N/A}} || Reserved || IBM, Microsoft || {{N/A}} || 3.3+ || 1.0+ ||
|}
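The reserved and private-use ranges in the table above can be checked against their hexadecimal bounds with a small sketch. The function name `classify_code_page_id` and the returned labels are hypothetical, and treating identifiers as 16-bit values is an assumption based on the largest ID in the table (65535).

```python
def classify_code_page_id(cpid: int) -> str:
    """Classify a code page ID using the ranges given in the table.

    Assumption: IDs are 16-bit values (0 to 65535).
    """
    if not 0 <= cpid <= 0xFFFF:
        raise ValueError(f"code page ID out of 16-bit range: {cpid}")
    if cpid in (0, 0xFFFE, 0xFFFF):  # 0, 65534, 65535
        return "reserved"
    if 0xE000 <= cpid <= 0xEFFF:  # 57344 - 61439
        return "private use derivation"
    if 0xFF00 <= cpid <= 0xFFFD:  # 65280 - 65533
        return "private use definition"
    # Everything else may or may not be assigned; this sketch cannot tell.
    return "other"


for cpid in (0, 437, 57344, 61439, 65280, 65533, 65534, 65535):
    print(f"{cpid:5d} (0x{cpid:04X}): {classify_code_page_id(cpid)}")
```

The decimal bounds in the table match the hexadecimal ones in its comment column: 57344 is E000h, 61439 is EFFFh, 65280 is FF00h and 65533 is FFFDh.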
Line 1,011 ⟶ 1,014:
== External links ==
{{Wikibooks|Character Encodings/Code Tables}}
* [http://www.ibm.com/software/globalization/cdra/glossary.jsp#SPTGLCDPG IBM CDRA glossary]
* {{webarchive|url=https://web.archive.org/web/20160205110331/http://www-01.ibm.com/software/globalization/g11n-res.html|date=2016-02-05|title=IBM code pages}}