Talk:Windows code page: Difference between revisions

Content deleted Content added
ANSI or not?: Substituting template per documentation, replaced: {{unsigned → {{subst:unsigned using AWB
m Reverted 1 edit by 2409:40C2:6018:4DEB:287B:76FF:FE61:65C4 (talk) to last revision by Cewbot
 
(18 intermediate revisions by 13 users not shown)
Line 1:
{{WikiProject banner shell|class=B|
{{WikiProject Computing }}
{{WikiProject Microsoft Windows |importance=mid}}
}}
 
== Way too big? ==
This page should link to the individual code pages instead of listing all of the tables at once (for one thing it is missing most of the mappings anyway, like for 932). --''unsigned''
 
: {{Done}}. Someone else apparently have already cleaned up the article.--[[User:Makkachin|Makkachin]] ([[User talk:Makkachin|talk]]) 00:50, 21 July 2013 (UTC)
== Generator ==
the generator code is given below, readtxt.pas can be obtained from http://bewareserv.sourceforge.net/ most of the data used came from http://www.unicode.org/PUBLIC/mappings the .mspx.html files used where unicode.org didn't have a mapping availible for the code page in question are from http://www.microsoft.com/globaldev/reference/oem.mspx
 
== ANSI or not? ==
<pre>
program byposition;
 
Once and for all, is it correct to say "ANSI" to the Windows code pages? Currently, some pages on wikipedia say it's wrong (as ANSI never defined these code pages, but Microsoft just says "ANSI" to it anyway), while this article makes the impression that it is ok. --[[User:Abdull|Abdull]] 23:53, 17 March 2006 (UTC)
uses
:Well microsofts technical documents use that term all over the place and i don't belive anyone uses the term ansi code page for anything else. I can't imagine ANSI are particularlly happy about having thier name put to something that isn't thiers though. I guess it all depends on how you define right and wrong ;) [[User:Plugwash|Plugwash]] 10:44, 18 March 2006 (UTC)
sysutils,readtxt; //we use our own text reader as the delphi one can't handle
:::ANSI explicitly disavows any understanding of the standards they publish, and explicitly repudiates any suggestion that they have an opinion about the meaning of their standards. So if you want to say, for example, that the first 127 code points all represent the same character .... ANSI doesn't have an opinion on that. Wikipedia on the other hand.... <!-- Template:Unsigned IP --><small class="autosigned">—&nbsp;Preceding [[Wikipedia:Signatures|unsigned]] comment added by [[Special:Contributions/124.188.134.186|124.188.134.186]] ([[User talk:124.188.134.186#top|talk]]) 06:05, 25 October 2020 (UTC)</small> <!--Autosigned by SineBot-->
//unix format text
::Microsoft is now leaning away from "ansi" and uses "active" instead to describe the current code page. <small><span class="autosigned">— Preceding [[Wikipedia:Signatures|unsigned]] comment added by [[User:131.107.0.73|131.107.0.73]] ([[User talk:131.107.0.73|talk]] • [[Special:Contributions/131.107.0.73|contribs]]) </span></small><!-- Template:Unsigned -->
const
maxcharset =12;//9;
startat = $80;
{$define showallchars}
{$define breakbeforecodepoint}
var
buildarray : array[0..255,0..maxcharset] of longint;
names : array[0..31] of string;
procedure processcharset(name:string;number:byte;filename:string);
var
t: treadtext;
line:string;
i,j:integer;
begin
names[number] := name;
 
== Redundant comment ==
readtext_init(t,filename);
repeat
line := readtext_line(t);
if (length(line)>=11) and (line[2]='x') and (line[7]='x') then begin;
//writeln('processing line '+line);
//unicode.org format
buildarray[strtoint('$'+copy(line,3,2)),number] := strtoint('$'+copy(line,8,4));
end else if (length(line)>=11) and (copy(line,3,5)=' = U+') then begin;
//ms format
buildarray[strtoint('$'+copy(line,1,2)),number] := strtoint('$'+copy(line,8,4));
end;
until readtext_eof(t);
end;
var
t: textfile;
i,j,k : integer;
firstline : boolean;
goodline : boolean;
rowcounter : integer;
comparevalue : integer;
begin
for i := 0 to 255 do for j := 0 to maxcharset do buildarray[i,j] := -1;
 
"Recent Microsoft products and APIs use Unicode internally, but many applications and APIs '''(including Java)''' continue to…"
{processcharset('windows-874|874',0,'CP874.txt');
: does the Java comment seem relevant? I mean, there are a million and one applications that use the older methods, shouldn't we name them here too ?
processcharset('windows-1250|1250',1,'CP1250.txt');
processcharset('windows-1251|1251',2,'CP1251.txt');
processcharset('windows-1252|1252',3,'CP1252.txt');
processcharset('windows-1253|1253',4,'CP1253.txt');
processcharset('windows-1254|1254',5,'CP1254.txt');
processcharset('windows-1255|1255',6,'CP1255.txt');
processcharset('windows-1256|1256',7,'CP1256.txt');
processcharset('windows-1257|1257',8,'CP1257.txt');
processcharset('windows-1258|1258',9,'CP1258.txt');}
 
== Removal of chart ==
processcharset('code page 437|437',0,'CP437.txt');
processcharset('code page 720|720',1,'720.mspx.html');
processcharset('code page 737|737',2,'CP737.txt');
processcharset('code page 775|775',3,'CP775.txt');
processcharset('code page 850|850',4,'CP850.txt');
processcharset('code page 852|852',5,'CP852.txt');
processcharset('code page 855|855',6,'CP855.txt');
processcharset('code page 857|857',7,'CP857.txt');
processcharset('code page 858|858',8,'858.mspx.html');
processcharset('code page 862|862',9,'CP862.txt');
processcharset('code page 866|866',10,'CP866.txt');
processcharset('windows-874|874',11,'CP874.txt');
processcharset('windows-1258|1258',12,'CP1250.txt');
 
I have removed the charts of the code pages from this article, and put all the information that was left in a miscellaneous section. I am using the latest version of the screen reader [[JAWS (screen reader)|JAWS]] on a fairly fast computer and JAWS froze for thirty seconds when I entered this page. I almost couldn't edit the section with the chart to remove it. I can understand that happening at somewhere like [[wikipedia:articles for deletion/yesterday]] or [[wikipedia:requests for adminship]] when it is busy, but JAWS should never freeze for more than a few seconds when I enter an article. Not even the article [[United States]] is that taxing on JAWS resources. The charts for code pages are available elsewhere on the Internet in an uneditable form. '''[[User:Graham87|Graham]]'''[[User talk:Graham87|<span style="color:green;">87</span>]] 08:29, 21 June 2007 (UTC)
assignfile(t,'output.txt');
rewrite(t);
writeln(t,'<table {{prettytable}}>');
 
== Available only to managed applications? ==
 
1201 — Unicode (BMP of ISO 10646, UTF-16BE) has the following description: "Available only to managed applications"
firstline := true;
Does it make any sense to include this explanation?
rowcounter := 0;
The 1200 - utf-16 encoding is also available only to managed applications according to the linked MS page, but here its missing.
for i := startat to 255 do begin
[[User:Hubalu|Hubalu]] ([[User talk:Hubalu|talk]]) 10:24, 30 January 2014 (UTC)
goodline := false;
comparevalue := buildarray[i,0];
{$ifdef showallchars}
goodline := true;
{$else}
for j := 1 to maxcharset do begin
if comparevalue <> buildarray[i,j] then goodline := true;
end;
{$endif}
if goodline then begin
if (rowcounter and ($1F shr(0{$ifndef twocol}+1{$endif} {$ifdef breakbeforecodepoint}+1{$endif} ))) = 0 then begin
write(t,'<tr>');
{$ifdef twocol}for j := 1 to 2 do{$endif} begin
 
== Parked stuff found on category page ''Category:Windows code pages'' ==
write(t,'<td>position<br>([[hexadecimal|hex]])');
for k := 0 to maxcharset do begin;
write(t,'<td>[['+names[k]+']]');
end;
end;
end;
{$ifdef twocol}if (rowcounter and 1) =0 then{$endif} write(t,'<tr>');
write(t,'<td>'+inttohex(i,2));
inc(rowcounter);
//if firstline then begin
// firstline := false;
// write(t,'<td>{{uplusfirst}}'+inttohex(i,4));
//end else begin
 
Still missing: [[windows-874]] (redirects to ISO/IEC 8859-11),
//end;
[[code page 50220]], [[code page 50221]], [[code page 50222]], which are variations on [[ISO-2022-JP]].
 
External links
*[http://msdn.microsoft.com/en-us/library/dd317756(VS.85).aspx Full list of Windows code pages]
*[http://www.microsoft.com/globaldev/reference/cphome.mspx Windows Code Page reference chart]
*[http://www.iana.org/assignments/charset-reg IANA Charset Name Registrations]
*[http://www.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WINDOWS Unicode mapping table for Windows code pages]
*[http://www.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WindowsBestFit Unicode mappings of windows code pages with "best fit"]
 
windows-874:
*[http://www.microsoft.com/globaldev/reference/sbcs/874.mspx Windows 874 reference chart]
*[http://www.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WINDOWS/CP874.TXT Unicode mapping table for Windows 874]
*[http://www.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WindowsBestFit/bestfit874.txt Unicode mappings of windows 874 with "best fit"]
 
End of stuff moved from category page.
 
== More than two groups of code pages ==
for j := 0 to maxcharset do begin
write(t,'<td>');
case buildarray[i,j] of
-1 : ;
$00 : write(t,'[[NUL]]');
$01 : write(t,'[[SOH]]');
$02 : write(t,'[[STX]]');
$03 : write(t,'[[ETX]]');
$04 : write(t,'[[EOT]]');
$05 : write(t,'[[ENQ]]');
$06 : write(t,'[[ACK]]');
$07 : write(t,'[[BEL]]');
$08 : write(t,'[[BS]]');
$09 : write(t,'[[TAB]]');
$0A : write(t,'[[LF]]');
$0B : write(t,'[[VT]]');
$0C : write(t,'[[FF]]');
$0D : write(t,'[[CR]]');
$0E : write(t,'[[SO]]');
$0F : write(t,'[[SI]]');
 
{{Cquote|text=There are two groups of code pages in Windows systems: OEM and ANSI code pages.}}
$10 : write(t,'[[DLE]]');
There are actually 4 groups of code pages: ANSI, OEM, Mac and EBCDIC under Windows.
$11 : write(t,'[[DC1]]');
All of them can vary dependent on the locale of the Windows version.
$12 : write(t,'[[DC2]]');
See LOCALE_IDEFAULTMACCODEPAGE and LOCALE_IDEFAULTEBCDICCODEPAGE in this link: https://msdn.microsoft.com/de-de/library/windows/desktop/dd373761(v=vs.85).aspx
$13 : write(t,'[[DC3]]');
$14 : write(t,'[[DC4]]');
$15 : write(t,'[[NAK]]');
$16 : write(t,'[[SYN]]');
$17 : write(t,'[[ETB]]');
$18 : write(t,'[[CAN]]');
$19 : write(t,'[[EM]]');
$1A : write(t,'[[SUB]]');
$1B : write(t,'[[ESC]]');
$1C : write(t,'[[FS]]');
$1D : write(t,'[[GS]]');
$1E : write(t,'[[RS]]');
$1F : write(t,'[[US]]');
 
== KOI8 ==
$80 : write(t,'[[PAD]]');
$81 : write(t,'[[HOP]]');
$82 : write(t,'[[BPH]]');
$83 : write(t,'[[NBH]]');
$84 : write(t,'[[IND]]');
$85 : write(t,'[[NEL]]');
$86 : write(t,'[[SSA]]');
$87 : write(t,'[[ESA]]');
$88 : write(t,'[[HTS]]');
$89 : write(t,'[[HTJ]]');
$8A : write(t,'[[VTS]]');
$8B : write(t,'[[PLD]]');
$8C : write(t,'[[PLU]]');
$8D : write(t,'[[RI]]');
$8E : write(t,'[[SS2]]');
$8F : write(t,'[[SS3]]');
 
$90 : write(t,'[[DCS]]');
$91 : write(t,'[[PU1]]');
$92 : write(t,'[[PU2]]');
$93 : write(t,'[[STS]]');
$94 : write(t,'[[CCH]]');
$95 : write(t,'[[MW]]');
$96 : write(t,'[[SPA]]');
$97 : write(t,'[[EPA]]');
$98 : write(t,'[[SOS]]');
$99 : write(t,'[[SGCI]]');
$9A : write(t,'[[SCI]]');
$9B : write(t,'[[CSI]]');
$9C : write(t,'[[ST]]');
$9D : write(t,'[[OSC]]');
$9E : write(t,'[[PM]]');
$9F : write(t,'[[APC]]');
 
$A0 : write(t,'[[NBSP]]');
$AD : write(t,'[[SHY]]');
 
 
else write(t,'[[&#x'+inttohex(buildarray[i,j],4)+';]]');
end;
{$ifdef breakbeforecodepoint}
if buildarray[i,j] >=0 then write(t,'<br><small>U+'+inttohex(buildarray[i,j],4)+'</small>');
{$else}
if buildarray[i,j] >=0 then write(t,'<sub>U+'+inttohex(buildarray[i,j],4)+'</sub>');
{$endif}
end;
writeln(t,'</td>');
 
 
end;
end;
writeln(t,'</table>');
closefile(t);
//for counter := 0 to 65535 do begin;
 
end.
</pre>
 
== ANSI or not? ==
 
Once and for all, is it correct to say "ANSI" to the Windows code pages? Currently, some pages on wikipedia say it's wrong (as ANSI never defined these code pages, but Microsoft just says "ANSI" to it anyway), while this article makes the impression that it is ok. --[[User:Abdull|Abdull]] 23:53, 17 March 2006 (UTC)
:Well microsofts technical documents use that term all over the place and i don't belive anyone uses the term ansi code page for anything else. I can't imagine ANSI are particularlly happy about having thier name put to something that isn't thiers though. I guess it all depends on how you define right and wrong ;) [[User:Plugwash|Plugwash]] 10:44, 18 March 2006 (UTC)
::Microsoft is now leaning away from "ansi" and uses "active" instead to describe the current code page. <small><span class="autosigned">— Preceding [[Wikipedia:Signatures|unsigned]] comment added by [[User:131.107.0.73|131.107.0.73]] ([[User talk:131.107.0.73|talk]] • [[Special:Contributions/131.107.0.73|contribs]]) </span></small><!-- Template:Unsigned -->
 
== Redundant comment ==
 
"Recent Microsoft products and APIs use Unicode internally, but many applications and APIs '''(including Java)''' continue to…"
: does the Java comment seem relevant? I mean, there are a million and one applications that use the older methods, shouldn't we name them here too ?
 
== Removal of chart ==
 
Where is the source for “20866” designation by Microsoft? [[User:Incnis Mrsi|Incnis Mrsi]] ([[User talk:Incnis Mrsi|talk]]) 10:28, 3 August 2019 (UTC)
I have removed the charts of the code pages from this article, and put all the information that was left in a miscellaneous section. I am using the latest version of the screen reader [[JAWS (screen reader)|JAWS]] on a fairly fast computer and JAWS froze for thirty seconds when I entered this page. I almost couldn't edit the section with the chart to remove it. I can understand that happening at somewhere like [[wikipedia:articles for deletion/yesterday]] or [[wikipedia:requests for adminship]] when it is busy, but JAWS should never freeze for more than a few seconds when I enter an article. Not even the article [[United States]] is that taxing on JAWS resources. The charts for code pages are available elsewhere on the Internet in an uneditable form. '''[[User:Graham87|Graham]]'''<font color="green">[[User talk:Graham87|87]]</font> 08:29, 21 June 2007 (UTC)