m Archiving 1 discussion(s) from Module talk:Zh) (bot |
: You are right. It is not mentioned but it should be I think under zh-min-nan in [http://www.iana.org/assignments/language-subtag-registry/language-subtag-registry this doc], which means 'nan' should be used. I have updated the sandbox and it seems to work in the [[Template:Zh/testcases|testcases]]. Can the main module be updated with this change?--<small>[[User:JohnBlackburne|JohnBlackburne]]</small><sup>[[User_talk:JohnBlackburne|words]]</sup><sub style="margin-left:-2.0ex;">[[Special:Contributions/JohnBlackburne|deeds]]</sub> 06:03, 25 February 2016 (UTC)
:[[File:Yes check.svg|20px|link=]] '''Done'''<!-- Template:ETp --> [[User:Bazj|Bazj]] ([[User talk:Bazj|talk]]) 07:38, 25 February 2016 (UTC)
== Different traditional and simplified glyphs despite unified Unicode characters ==
I tried to use this template in the article [[Tsai Ing-wen]] to give the different traditional and simplified forms of this person’s Chinese name, as was done in [[:de:Tsai Ing-wen|the corresponding article on de.WP]]. However the result of writing <tt><nowiki>{{zh|t=蔡英文|s=蔡英文|p=Cài Yīngwén}}</nowiki></tt> is “{{zh|t=蔡英文|s=蔡英文|p=Cài Yīngwén}}” and the HTML includes <tt><nowiki><span xml:lang="zh" lang="zh">蔡英文</span></nowiki></tt> where the characters are given only once and lack any markup for script (<tt><nowiki>zh-Hant</nowiki></tt> vs. <tt><nowiki>zh-Hans</nowiki></tt> instead of just <tt><nowiki>zh</nowiki></tt>). Would it be possible to alter this template so that it is possible to give different traditional and simplified glyphs even when the Unicode characters for the two are unified? (In case you are not familiar with this aspect of [[Unicode]] read the article [[Han unification]]). [[User:LiliCharlie|LiliCharlie]] ([[User talk:LiliCharlie|talk]]) 12:54, 16 January 2016 (UTC)
: The module recognises when the traditional and simplified characters are identical, and if they are it combines them as has happened here. This is normal practice in WP articles; it is only useful to give both when they are different, and the template helps with this by eliminating such redundancy. It only does so if they are identical (the bit of the script that does it is args["s"] == args["t"] on line 131). The German version of the template has much simpler (non-module) code which does not do this.
: What you may be seeing is some difference due to the different fonts your system is using for simplified and traditional. That is I think uncommon though. I do not see it here or on de.wp, and I suspect the same will be true for the vast majority of en.wp users. It will only be users with particular settings for e.g. traditional and simplified characters that will notice any difference, and then the difference will only be in the rendering, not the underlying characters. You can change your settings, or use a style sheet to control the rendering of particular page elements here. See [[User:JohnBlackburne/common.css]] for some examples.--<small>[[User:JohnBlackburne|JohnBlackburne]]</small><sup>[[User_talk:JohnBlackburne|words]]</sup><sub style="margin-left:-2.0ex;">[[Special:Contributions/JohnBlackburne|deeds]]</sub> 14:01, 16 January 2016 (UTC)
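For illustration, the equality check described in this reply can be sketched in Python. This is not the module's actual Lua code: <tt>should_merge</tt>, <tt>code_points</tt> and the plain <tt>args</tt> dictionary are hypothetical stand-ins, and the sketch assumes only that the module compares the two parameter strings for equality.

```python
def code_points(text):
    """Return the Unicode code points of a string as 'U+XXXX' labels."""
    return [f"U+{ord(ch):04X}" for ch in text]


def should_merge(args):
    """Hypothetical stand-in for the module's check args["s"] == args["t"]."""
    return args.get("s") is not None and args.get("s") == args.get("t")


args = {"t": "蔡英文", "s": "蔡英文", "p": "Cài Yīngwén"}

# Both strings are built from the same code points, so they compare equal
# no matter which glyphs a font later draws for zh-Hant or zh-Hans text.
print(code_points(args["t"]))
print(code_points(args["s"]))
print(should_merge(args))
```

The comparison works on code points only, so a string comparison of this kind cannot see glyph differences that arise from language tagging or font selection.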
::Well, the difference I see is certainly not just due to different fonts; I actually use the [[Source Han Sans]] font family for ''all'' CJK locales, so I see exactly the same glyphs when there should be no difference, and different glyphs when there should be one. Nowadays all major browsers seem to define different locales for at least simplified and traditional Chinese characters (and usually also distinguish between traditional TW and HK, as occasionally even these very similar locales use different glyphs; see [http://appsrv.cse.cuhk.edu.hk/~irg/irg/irg44/IRGN2074C.pdf here]). A lot of East Asian readers are extremely fussy about glyphic differences, and at times (though rarely) they even fail to recognize a Han character rendered in a shape that is uncommon to them. In my case, the traditional and simplified glyphs for the character {{lang|zh-Hant|蔡}}/{{lang|zh-Hans|蔡}} differ in each and all of its four components {{lang|zh|艹⺼又示}}, esp. {{lang|zh|艹⺼示}}. Why not give the readers the information that they ''are'' different? — Besides, you can’t really rely on Unicode’s (or rather, the [[Ideographic Rapporteur Group|IRG]]’s) unification scheme, which often seems quite random. For example {{lang|zh-Hans|禅}} and {{lang|ja|禅}} are unified (=one Unicode character) while {{lang|zh-Hans|单}} and {{lang|ja|単}}, which show exactly the same glyphic difference, are not. Unicode’s Han unification is known to be a disputed matter, and even according to the Unicode Standard locale markup is indispensable in cases like these. — P.S.: I already use stylesheets to see whatever ''I'' want to see. What I would like to achieve though is that ''any user'' gets what they deserve: 100% reliable information. [[User:LiliCharlie|LiliCharlie]] ([[User talk:LiliCharlie|talk]]) 15:29, 16 January 2016 (UTC)
::: The problem is that you are not giving readers that information, that they are different. If it displays both simplified and traditional and they look the same, as they are the same characters, then probably most users will not notice the duplication (most do not read Chinese) but those who do will be confused over why the same characters appear twice although they are the same, unlike on other pages. I’ve looked at it with three browsers on two different OSes and the simplified and traditional characters look the same. [[wikt:蔡]] says simplified and traditional are the same. That your browser displays them differently must be down to your browser and OS settings. I suspect very few readers of the English WP have similar settings, though it would be very hard to find out. It is not something that can really be addressed in the template/module as it would break how it appears on many other pages, though how many I do not know.--<small>[[User:JohnBlackburne|JohnBlackburne]]</small><sup>[[User_talk:JohnBlackburne|words]]</sup><sub style="margin-left:-2.0ex;">[[Special:Contributions/JohnBlackburne|deeds]]</sub> 16:46, 16 January 2016 (UTC)
::::{{lang|zh-Hant|蔡}}/{{lang|zh-Hans|蔡}} has ''clearly'' different glyphs for CN and TW/HK in Unicode’s [http://www.unicode.org/charts/PDF/U4E00.pdf#354 CJK Unified Ideographs chart] (look for U+8521). More examples: {{lang|zh-Hant|望}}/{{lang|zh-Hans|望}} (U+671B, p. 162) and {{lang|zh-Hant|龜}}/{{lang|zh-Hans|龜}} (U+9F9C, p. 545), which confuses even Chinese scholars. [[User:LiliCharlie|LiliCharlie]] ([[User talk:LiliCharlie|talk]]) 18:47, 16 January 2016 (UTC)
:::::FWIW the glyphs displayed on my PC for the trad and simp are almost identical other than in font weight. As a Chinese speaker/reader, the nuance involved is not a big issue and I'm sure that equally applies to my 1.2 Billion compadres, since we care about where the strokes are, not their weight. [[User:Philg88|<span style="color:#3a23e2; font-weight:bold; text-shadow:grey 0.1em 0.1em 0.1em;"> Philg88 </span>]]<sup>♦[[User_talk:Philg88|talk]]</sup> 07:41, 17 January 2016 (UTC)
::::::Font weight? There should be no difference in weight between the fonts used for traditional and simplified Chinese. No, what I’m talking about here are structural differences: selection, number and relative position of the strokes. As in {{lang|zh-Hant|埩}} vs. {{lang|zh-Hans|埩}}. Do you really expect en.WP users to be able to tell that these are “identical,” just variants of the same Unicode character while {{lang|zh-Hant|爭}} and {{lang|zh-Hans|争}} are not? What you write makes me believe you are unfamiliar with the basic issues and the idiosyncrasies of IRG/Unicode [[Han unification]]. — Besides I think that Wikipedia is there for those who want to present or acquire (scientifically) accurate knowledge, not for those who think they don’t need accuracy and can ignore existing differences because as “practical” language users they know better than the experts. [[User:LiliCharlie|LiliCharlie]] ([[User talk:LiliCharlie|talk]]) 08:57, 17 January 2016 (UTC)
::::::::This is an issue that we've never really addressed because the difference is generally insignificant except in serious ''hanzi''-related studies, but we probably should. On my (Windows) desktop I see no difference, but now on my MacBook Pro it is correctly displaying the variant forms. The font weight is not an issue here, Phil's machine is just rendering them as different fonts, but they are indeed variant forms. <small><b><span style="border:1px solid;background:#030303"><span style="color:white"> White Whirlwind </span>[[User talk:White_whirlwind|<span style="color:#030303;background-color:white;"> 咨 </span>]]</span></b></small> 09:48, 17 January 2016 (UTC)
{{od}}That is interesting. I too am on a MacBook but a fairly old model with an older version of OS X, and am seeing no difference in rendering. It would not surprise me though if Apple supported this with their frequent OS updates which generally keep current with Unicode changes. The latest iOS might be similar.
I still think there is no need to change the template for this, but it is something that is straightforward to do without actually modifying the template. Just change the content of one of the strings without changing the rendering, with e.g. a [[zero-width joiner]]:
* {{zh|t=蔡英文|s=‍蔡英文|p=Cài Yīngwén}}
But I would not recommend this as it will be confusing for other editors not familiar with this obscure piece of markup. It is better to avoid the template altogether, and supply the links and templates for markup yourself:
* [[Simplified Chinese characters|simplified Chinese]]: {{lang|zh-Hans|蔡英文}}; [[Traditional Chinese characters|traditional Chinese]]: {{lang|zh-Hant|蔡英文}}; [[pinyin]]: ''{{lang|zh-Latn-pinyin|Cài Yīngwén}}''
But because the difference between simplified and traditional characters will still not be apparent to most readers it is probably worth including a footnote in addition to the links.--<small>[[User:JohnBlackburne|JohnBlackburne]]</small><sup>[[User_talk:JohnBlackburne|words]]</sup><sub style="margin-left:-2.0ex;">[[Special:Contributions/JohnBlackburne|deeds]]</sub> 15:35, 17 January 2016 (UTC)
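As a rough Python illustration of why the zero-width joiner workaround defeats the merge: the joiner is an extra code point (U+200D), so the two strings stop comparing equal even though they normally render identically. The variable names are hypothetical, and the sketch assumes the module uses a plain string comparison.

```python
ZWJ = "\u200d"  # ZERO WIDTH JOINER, normally invisible when rendered

traditional = "蔡英文"
simplified = ZWJ + "蔡英文"  # same visible characters, one hidden prefix

# The hidden code point makes the two strings unequal...
print(traditional == simplified)

# ...and the second string one code point longer than the first.
print(len(traditional), len(simplified))

# Stripping the joiner restores equality, which is why the trick is
# easy for later editors to overlook or undo by accident.
print(simplified.replace(ZWJ, "") == traditional)
```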
::This is quite spontaneous and not well thought out:
::Another solution would be to create a new template (probably based on this module) for these infrequent cases. The documentation might warn users not to use it unless the glyphs representing the two identical character strings bear a minimum amount of dissimilarity. (I don’t know if it is possible to check for dissimilarity using a whitelist or a blacklist.) And, as you say, the template also warns readers that what they see may not be what they are supposed to see, and possibly also provides a link to a help page. [[User:LiliCharlie|LiliCharlie]] ([[User talk:LiliCharlie|talk]]) 16:54, 17 January 2016 (UTC)
Definitely not a separate/new template. This template+module combines the functionality of a number of previous templates, as having the code all in one place makes it easier to maintain and ensures a consistent style and format for all uses. Since the template switched to using a Lua module there is no longer a technical need to have separate templates (previously the limitations of parser functions made it necessary). It is easy to add here if there is consensus to do so.
So I have added a “nomerge” option to the sandbox version in the same way as other options, and added some examples to the testcases: [[Template:Zh/testcases]]. This one from there demonstrates how it works:
* <nowiki> {{Zh/sandbox |t=蔡英文|s=蔡英文|p=Cài Yīngwén|nomerge="y"}}</nowiki> gives:
**{{Zh/sandbox |t=蔡英文|s=蔡英文|p=Cài Yīngwén|nomerge="y"}}
Clearer than using obscure markup but with the same effect. In particular it does not change any existing uses. Have a look at the module sandbox [[Module:Zh/sandbox]] for the particular changes. If this seems OK to other editors we can go ahead and add it to the main template.--<small>[[User:JohnBlackburne|JohnBlackburne]]</small><sup>[[User_talk:JohnBlackburne|words]]</sup><sub style="margin-left:-2.0ex;">[[Special:Contributions/JohnBlackburne|deeds]]</sub> 23:52, 17 January 2016 (UTC)
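A minimal Python sketch of how such an opt-out flag could behave, for readers who do not want to read the Lua. It is an assumption-laden illustration, not the code in [[Module:Zh/sandbox]]: <tt>render_zh</tt> is a hypothetical name, the flag is modelled as a boolean rather than the string passed in the template call, and the labels and the simplified-first order are assumed defaults.

```python
def render_zh(t, s, nomerge=False):
    """Hypothetical renderer returning (label, language tag, text) tuples.

    Identical strings collapse into one "Chinese" entry unless the
    caller opts out with nomerge; differing strings are always listed
    separately, so existing uses are unaffected by the new flag.
    """
    if t == s and not nomerge:
        return [("Chinese", "zh", t)]
    return [
        ("simplified Chinese", "zh-Hans", s),
        ("traditional Chinese", "zh-Hant", t),
    ]


# Default behaviour: one merged entry tagged plain "zh".
print(render_zh("蔡英文", "蔡英文"))

# With the opt-out, both scripts appear, each with its own language tag.
print(render_zh("蔡英文", "蔡英文", nomerge=True))
```

Because the flag defaults to off, calls that never mention it produce the same result as before.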
:What is the purpose of this template? I think it is to convey the Chinese name of the person, book, etc. that is the subject of the article, for readers who understand characters. It's not to give lessons in typography to people who don't understand hanzi – we have specialist articles for that. In a proper setup, zh-Hant should yield traditional forms, zh-Hans simplified ones and zh the reader's preference between these. (Of course, both fonts will have to do something artificial if they cover all the non-unified variants.) So if the reader sees unified characters in their preferred form, they will know which characters are meant, and the template's job is done. In such cases (and Han unification is rather conservative) the other variant is unnecessary, and this template already produces distracting clutter. However, this doesn't apply to {{tlx|infobox Chinese}}, which has more room. [[User talk:Kanguole|Kanguole]] 01:18, 18 January 2016 (UTC)
:: I’m on a more up-to-date Mac now which does draw the character differently for simplified and traditional, but it is impossible to see the difference unless I make the text size about as large as the browser will let me, and look at the part of the character that is different. It really is a minor thing, not important for describing the subject of the article, about on par with the various Romanisations that appear in {{tlx|infobox Chinese}}. Accordingly I've added it to that template where it takes little space.--<small>[[User:JohnBlackburne|JohnBlackburne]]</small><sup>[[User_talk:JohnBlackburne|words]]</sup><sub style="margin-left:-2.0ex;">[[Special:Contributions/JohnBlackburne|deeds]]</sub> 03:14, 18 January 2016 (UTC)
Why is the order of {{tlx|Zh/sandbox}} input (t < s ) and output (Hans < Hant) reversed? [[User:LiliCharlie|LiliCharlie]] ([[User talk:LiliCharlie|talk]]) 14:43, 18 January 2016 (UTC)
: It ignores the order of the parameters that are passed. You can use {{para|first}} to override the default ordering. See the template documentation.--<small>[[User:JohnBlackburne|JohnBlackburne]]</small><sup>[[User_talk:JohnBlackburne|words]]</sup><sub style="margin-left:-2.0ex;">[[Special:Contributions/JohnBlackburne|deeds]]</sub> 23:22, 18 January 2016 (UTC)
::Why have this double reversed logic "nomerge=y"? Can we please keep it as "merge=no" (not collapsed) and "merge=yes" (collapsed and default)? That way the logic is the same way round as labels=no and links=no (we don't do nolinks=y and nolabels=y).
::There seems to be some bug in the script's implementation for args["s"] == args["t"]. <s>If s=U+8521 and t=U+671B then s=/=t and should return false. So why does the script return a true? Maybe to your eye, s=U+8521 and t=U+671B look incredibly similar and maybe the font author for my font didn't bother drawing the minor difference so on my computer (Win 10 English with no extra packages for fonts) they really look identical, but the computer doesn't know all that. The computer only sees a number code for a character. So how can apple==orange return true?</s> [[User:Rincewind42|Rincewind42]] ([[User talk:Rincewind42|talk]]) 01:56, 25 February 2016 (UTC)
:::I am assuming we won't go ahead with the merge option as no-one else seems to want it and the problem in the article has been addressed another way. As for the other problem can you provide an example; my test with the characters with those unicode values works fine:
::::{{zh|s=蔡|t=望}}
:::--<small>[[User:JohnBlackburne|JohnBlackburne]]</small><sup>[[User_talk:JohnBlackburne|words]]</sup><sub style="margin-left:-2.0ex;">[[Special:Contributions/JohnBlackburne|deeds]]</sub> 05:43, 25 February 2016 (UTC)
::::I misread the unicode numbers that {{u|Philg88}} had posted so I've struck out my previous comment. I'm now quite sure that this is all just about fonts. If you look at [http://www.unicode.org/charts/PDF/U4E00.pdf#354 CJK Unified Ideographs chart] (Large PDF) for U+8521 you'll see three Chinese characters marked with 蔡 G0-324C, 蔡 HB1-BDB2, 蔡 T1-6E5B. When I look at the Unicode PDF, I can see that there is a small difference in the direction of the stroke on the HB1-BDB2 and T1-6E5B versus the G0-324C. It is a very small difference but it is there. When I copy/paste those characters into a Word document, the differences remain until I change the font. If I set the font for all of the characters to Microsoft JhengHei then all three render in Word as the HB1 and T1 render on the PDF. If I change the font to Microsoft YaHei, NSimSun or SimSun, then all three render as per G0 on the PDF. I get the same results when testing with U+671B and U+9F9C. In particular I find SimSun's rendering of U+9F9C strikingly different from the rendering by Microsoft JhengHei. It's not just a slight change, there are several extra strokes added and removed. Now I don't have a huge number of Chinese fonts installed but the only way I can get these characters to render as the Unicode PDF file renders them, is to vary my font selection from character to character. [[User:Rincewind42|Rincewind42]] ([[User talk:Rincewind42|talk]]) 16:24, 25 February 2016 (UTC)
:::::Also compare the Unicode chart glyphs for U+57E9 {{lang|zh-Hans|埩}}/{{lang|zh-Hant|埩}} to U+4E89 {{lang|zh-Hans|争}} and U+722D {{lang|zh-Hant|爭}}. <small>[[Wikipedia:WikiLove|Love]]</small> —[[:commons:User:LiliCharlie|LiliCharlie]] <small>([[User talk:LiliCharlie|talk]])</small> 17:02, 25 February 2016 (UTC)
{{Infobox Chinese
|t = 蔡英文
|s = 蔡英文
|p = Cài Yīngwén
|w = Tsai<sup>4</sup> Ying<sup>1</sup>-wen<sup>2</sup>
|mi = {{IPAc-cmn|c|ai|4|-|ying|1|wen|2}}
|poj = Chhoà Eng-bûn
|tl = Tshuà Ing-bûn
|h = Tshai Yîn-vun
|showflag = tl, h}}
This is the infobox copied from [[Tsai Ing-wen|the article]]. How does that look? It shows different characters for me, though the change is a very small one which I can only see if I increase the font size by several steps. The font(s) it uses are PingFang SC and PingFang TC, a new font in Mac OS X 10.11. We don’t have a policy on character variants that I am aware of but my own view is that outside of articles on Chinese characters we should not bother with them. The vast majority of readers will not notice them. Even people who can read Chinese can surely read the character if it is the 'wrong' variant. Often variations in rendering due to different fonts can be more significant. Unless there is a particular reason for mentioning it, such as a logo which uses a particular non-standard variant, it does not need to be mentioned. {{tl|Infobox Chinese}} is exceptional though as it often contains obscure, little used transcriptions which are of little interest, but hidden away in a collapsible box so they don't clutter or distract from the article content. Seems the best place for obscure character variants like this.--<small>[[User:JohnBlackburne|JohnBlackburne]]</small><sup>[[User_talk:JohnBlackburne|words]]</sup><sub style="margin-left:-2.0ex;">[[Special:Contributions/JohnBlackburne|deeds]]</sub> 17:18, 25 February 2016 (UTC)
:Even though the differences may seem ridiculously small to people who grew up in the West, it is probably true what ''hsknotes'' wrote in [http://languagelog.ldc.upenn.edu/nll/?p=1364#comment-30433 this comment] on Language Log: ''"... And in Chinese, the font change and simplifications make an arguably far bigger difference than u's becoming w's or th's from þ or even colour being turned into color. Sometimes the medium is the message, or at least is part of it."'' And this usually applies even to English speakers from the Sinosphere, where a simple font change is often considered a means of conveying identity and political attitude. <small>[[Wikipedia:WikiLove|Love]]</small> —[[:commons:User:LiliCharlie|LiliCharlie]] <small>([[User talk:LiliCharlie|talk]])</small> 17:50, 25 February 2016 (UTC)
::That comment though is about the significant differences introduced by simplification. But that’s not what’s happening here. It’s not been simplified as it is no simpler. There is according to e.g. [[wikt:蔡]] just one character for simplified and traditional. There is only one unicode address, 8521. They are treated as the same; the differences are so small as to be invisible at normal font sizes, smaller than e.g. differences due to the font(s) or other factors.--<small>[[User:JohnBlackburne|JohnBlackburne]]</small><sup>[[User_talk:JohnBlackburne|words]]</sup><sub style="margin-left:-2.0ex;">[[Special:Contributions/JohnBlackburne|deeds]]</sub> 18:53, 25 February 2016 (UTC)
:::The comment explicitly mentions font change though. I know it's hard for Westerners to understand that glyphic differences even '''of the same character''' are seen as political statements ("communist/Mao forms"). I also know that most browsers display CJK characters at too small sizes. (My eyesight has become bad, and on my system CJK characters are bigger if language markup is used, that's why I often make edits adding it, in order to be able to read it myself.) <small>[[Wikipedia:WikiLove|Love]]</small> —[[:commons:User:LiliCharlie|LiliCharlie]] <small>([[User talk:LiliCharlie|talk]])</small> 19:13, 25 February 2016 (UTC)