== Criticism ==
Many older character encodings (unlike Unicode) suffer from several problems. Some vendors insufficiently document the meaning of all code point values in their code pages, which makes it harder to handle textual data consistently across different computer systems. Some vendors add proprietary extensions to established code pages to add or change certain code point values: for example, byte 0x5C in [[Shift JIS]] can represent either a [[back slash]] or a
Applications may also mislabel text in [[Windows-1252]] as [[ISO-8859-1]]. The only difference between these code pages is that the code point values in the range 0x80{{ndash}}0x9F, used by ISO-8859-1 for control characters, are instead used as additional printable characters in Windows-1252{{snd}} notably for [[quotation marks]], the [[euro sign]] and the [[trademark symbol]] among others. Browsers on non-Windows platforms tended to show empty boxes or question marks for these characters, making the text hard to read. Most browsers addressed this by ignoring the declared character set and interpreting the text as Windows-1252 so that it displayed acceptably. In HTML5, treating ISO-8859-1 as Windows-1252 is even codified as a [[W3C]] standard.<ref>{{cite web |url=https://encoding.spec.whatwg.org/#names-and-labels |title=Encoding |at=sec. 4.2 Names and labels |publisher=[[WHATWG]] |date=27 January 2015 |access-date=4 February 2015 |archive-url=https://web.archive.org/web/20150204174315/https://encoding.spec.whatwg.org/#names-and-labels |archive-date=4 February 2015 |url-status=live}}</ref> Although browsers were typically programmed to deal with this behaviour, this was not always true of other software. Consequently, when receiving a file transfer from a Windows system, non-Windows platforms would either ignore these characters or treat them as standard control characters and attempt to take the specified control action accordingly.
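
The mislabelling described above can be illustrated with a short Python sketch. It assumes that Python's built-in <code>cp1252</code> and <code>latin-1</code> codecs stand in for Windows-1252 and ISO-8859-1 respectively; the sample bytes and the helper <code>is_c1_control</code> are hypothetical and chosen only for illustration.

```python
# Sample bytes as a Windows application might write them (hypothetical data):
# 0x93/0x94 are curly quotes, 0x80 the euro sign, 0x99 the trademark symbol
# in Windows-1252.
raw = b"\x93quoted\x94 \x80 \x99"

# Decoding with the encoding the text was actually written in.
as_cp1252 = raw.decode("cp1252")

# Decoding with the (wrong) label ISO-8859-1: the same bytes become
# C1 control characters U+0080..U+009F instead of printable characters.
as_latin1 = raw.decode("latin-1")


def is_c1_control(ch):
    """Return True if ch lies in the C1 control range U+0080..U+009F."""
    return 0x80 <= ord(ch) <= 0x9F


c1_count = sum(is_c1_control(ch) for ch in as_latin1)

print(repr(as_cp1252))
print(repr(as_latin1))
print(c1_count)
```

Under these assumptions, the four bytes in the 0x80{{ndash}}0x9F range decode to printable characters under one label and to control characters under the other, which is why software that trusts the ISO-8859-1 label may drop them or act on them as controls.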