Revision as of 14:45, 22 September 2020: JMF (Extended confirmed users, 61,480 edits) →Decimal input (Windows) Part 2: oil on troubled waters or oil on burning waters?
Revision as of 16:23, 22 September 2020: Spitzak (Extended confirmed users, 10,519 edits) →Decimal Input (part 3): new section
** If the Windows user sends that file to a Japanese friend or a Mac user, the display/print may differ. [I am conscious here that the context for ''this'' discussion is Unicode input, so substitution of (for example) curly quotes for typewriter quotes probably won't happen, but autocorrect has a habit of barging in where it is not wanted so I'm not taking any bets!] Does that help in any way or just add to the confusion? --[[User:John Maynard Friedman|John Maynard Friedman]] ([[User talk:John Maynard Friedman|talk]]) 14:44, 22 September 2020 (UTC)

== Decimal Input (part 3) ==

Yes, I absolutely agree there is just misunderstanding here, not an argument. I believe Peter Brown has some fundamental error and I really am trying to be helpful in correcting it, though it is very hard to tell exactly what his error is. The basic question is why he started talking about 448, either implying that mod-256 can turn 960 into 448, or that for some reason 448 has fewer possible results of mod-256 than 960, when in fact both of them turn into the exact same number, 192.

I think your math expression is possibly messed up, as you use letters in the last one that don't appear in any of the others, so it is unrelated. But yes, f(g(x)) defines a function that does the translation g of x and then the translation f of that result, and could be written as a new function h(x).

You are wrong about what happens when a file is sent to Japan. All the software under consideration stores the resulting Unicode code points in the file, not the numbers the user typed, so the file will display the same there.

I'll try to outline my understanding of what happens, and emphasize where I think the confusion might lie.

The user types {{keypress|Alt|9}}{{keypress|6|0|chain=}}. This produces the number 960, which the software will now turn into a character.
The user may also type {{keypress|Alt|4}}{{keypress|4|8|chain=}} and produce the number 448, which the software will now turn into a character.

For ''some'' software, the numbers are used directly as the Unicode code point. For 960 this produces U+03C0, which is {{char|π}}. For 448 this produces U+01C0, which is {{char|ǀ}}.

For ''other'' software (i.e. a different program from the one that used it as a Unicode code point), the numbers have the [[modulo operator]] 256 applied. This turns ''both'' 960 and 448 into 192 (and I'm sorry, but I have always heard this described as "turns into" and have worked in computers for 40 years on both coasts and in England). They both turn into exactly the same number, so any further steps are exactly as easy or hard to describe for each of them; there is no advantage in talking about 448 over 960.

There is a further confusion in that 192 is not used as a Unicode code point; instead it is used to index either the "ANSI" code page or the "OEM" code page.

If the ANSI code page is used, and it is set to [[CP1252]] (which it usually is), then 192 turns into U+00C0 or {{char|À}}. Thus 192, 448, and 960 all turn into the same character in these programs. Most of CP1252 matches Unicode, including location 192; for these locations you can pretty much say that 192 is turned directly into Unicode.

If the "OEM" code page is used (which appears to be the case "in Notepad or in the Wiki edit box") it looks at location 192 in [[CP437]] (or some similar page), and gets U+2514, which is {{char|└}}. Thus 192, 448, and 960 all turn into the same character in these programs.

I would be very interested in what happens if {{keypress|Alt|0}}{{keypress|9|6|0|chain=}}, i.e. with a zero prefix, is typed "in Notepad or in the Wiki edit box". This may cause 192 to be chosen from the ANSI code page and get {{char|À}}. Or it might cause Unicode to be used.
[[User:Spitzak|Spitzak]] ([[User talk:Spitzak|talk]]) 16:23, 22 September 2020 (UTC)
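
The three interpretations described above can be checked with a rough Python sketch. It is only an illustration of the arithmetic and the code page lookups, not a description of how any particular Windows program is implemented. The function name <code>alt_code_to_char</code> and the mode names <code>"unicode"</code>, <code>"ansi"</code> and <code>"oem"</code> are made up for this example; the sketch assumes CP1252 as the ANSI code page and CP437 as the OEM code page, and it does not model the zero-prefix case because its behaviour is the open question.

```python
def alt_code_to_char(number: int, mode: str) -> str:
    """Map a typed decimal number to a character under one interpretation.

    mode is a hypothetical label for this sketch:
      "unicode" - the number is used directly as a Unicode code point
      "ansi"    - number mod 256 indexes CP1252 (assumed ANSI code page)
      "oem"     - number mod 256 indexes CP437 (assumed OEM code page)
    """
    if mode == "unicode":
        return chr(number)

    # Both code page paths first reduce the number modulo 256,
    # so 192, 448 and 960 all become the single byte 0xC0.
    single_byte = bytes([number % 256])

    if mode == "ansi":
        # Note: a few CP1252 positions (e.g. 0x81) are undefined in
        # Python's codec and would raise UnicodeDecodeError here.
        return single_byte.decode("cp1252")
    if mode == "oem":
        return single_byte.decode("cp437")
    raise ValueError(f"unknown mode: {mode!r}")


if __name__ == "__main__":
    for typed in (192, 448, 960):
        print(
            typed,
            typed % 256,
            alt_code_to_char(typed, "ansi"),
            alt_code_to_char(typed, "oem"),
        )
    print(alt_code_to_char(960, "unicode"), alt_code_to_char(448, "unicode"))
```

Under these assumptions 448 and 960 only differ in the direct Unicode interpretation; once the modulo is applied they are indistinguishable from 192.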

Talk:Unicode input: Difference between revisions