Module talk:Unicode chart: Difference between revisions

General Punctuation, row U+206x
organize to-do list
Line 1:
{{talkheader}}
{{to do|inner=
*Centralize current version info ([[Module:Unicode data/version]]?) and delete parameter.
*Figure out how to programmatically identify "default ignorable" code points and printable vs. non-printable format chars (rather than hardcoding a bunch of ranges—which is feasible but might not age well).
*Add a way to insert column header "reminder" rows at arbitrary intervals for huge blocks. Or maybe just do it automatically every 16 rows.
*CSS
**Figure out [[#Formatting abbreviations|ideal scaling factors]] (and spacing) for characters and boxed placeholder abbreviations.
**Cell height/width properties: Do we need them? Something about [[Template:Unicode chart Javanese|Javanese]] in particular?
**Choose appropriate <code>font-family</code> order of succession for empty class definitions at [[Template:Unicode chart/script styles.css]].
**Figure out what to do about the [https://i.imgur.com/QzF7oVa.png highly elongated] [[%EF%B7%BD|U+FDFD ARABIC LIGATURE BISMILLAH AR-RAHMAN AR-RAHEEM]].
*Implement some way to [[WP:CLICKHERE|click to show]] more character info [[#Notes about notes|in the footer area]] without rustling everyone's jimmies.
}}
 
==Notes about notes==
Line 12 ⟶ 23:
*:Personally I prefer NBSP as the base for combining characters, because the dotted circle (which we currently use) often interferes with the character. [[User:BabelStone|BabelStone]] ([[User talk:BabelStone|talk]]) 11:33, 10 September 2019 (UTC)
―[[special:contributions/cobaltcigs|cobaltcigs]] 17:55, 9 September 2019 (UTC)
 
==Update/to-do==
''See [[Template:Unicode chart/testcases]].''
*I've reduced the number of required parameters to only the name of the block and the version string. In reality, the former can probably be deduced (from the name of the calling template), and the latter should be exposed by [[Module:Unicode data]] in some fashion (to avoid hard-coding 12.0 on any other page) and should be updated as frequently as the data subpages are updated.
*I've got it looking up the [[ISO 15924]] code and using that to select a <code><nowiki><span></nowiki></code> from [[Template:Script]] containing a CSS class for an appropriate <code>font-family</code>. Better would be a way to apply the <code>class</code> and <code>dir</code> attributes directly to the <code><nowiki><td></nowiki></code> element.
*Start/end codepoints still exist as an option. The looked-up values can be overridden to subdivide a large block without confusing the module.
*<s>I need to figure out why it gives an error <span class="error">at line 38: bad argument #2 to 'format' (string expected, got nil)</span> but only for some block names.</s>
**It was because the [[Module:Unicode data/scripts]].ranges table skips certain chars, including the Ⴧ and Ⴭ in Georgian. Added a workaround. ―[[special:contributions/cobaltcigs|cobaltcigs]] 22:10, 9 September 2019 (UTC)
―[[special:contributions/cobaltcigs|cobaltcigs]] 20:49, 9 September 2019 (UTC)
 
== Existing charts ==
Line 60 ⟶ 62:
**<code>start</code>/<code>end</code> parameters have been scrapped in favor of a single <code>range</code> parameter which can contain multiple ranges (connected by hyphen or en dash, and separated from each other by comma, whitespace, the word "and", or in fact anything that's not a hex digit).
*14 and 15. If the Unicode block display names can't be made to exactly match the [[Module:Unicode data/blocks|"official" names]] in all cases, we'll need a (hopefully short) list of aliases. Adding a <code>blocknamelink</code> parameter which continues to default to <code>Blockname (Unicode chart)</code> if empty would be easy and sufficient. Let's try to avoid having three sets of names wherever possible.
**{{done}} <code>link_name</code> and <code>display_name</code> parameters added for differing cases. ―[[special:contributions/cobaltcigs|cobaltcigs]] 13:13, 14 September 2019 (UTC)
*{{done}} 16. I don't see why not. See 13.
―[[special:contributions/cobaltcigs|cobaltcigs]] 18:20, 10 September 2019 (UTC)
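
For anyone following along, here is a rough Python sketch (not the module's actual Lua) of how a single <code>range</code> value holding several ranges could be parsed. The function name <code>parse_ranges</code> and the regular expression are hypothetical. One assumption worth flagging: the sketch requires four to six hex digits per code point, so that the letters "a" and "d" in the separator word "and" are not mistaken for hex digits; the module may well handle that differently.

```python
import re

# Hypothetical pattern: a code point of 4 to 6 hex digits (optionally
# prefixed with "U+"), optionally followed by a hyphen or en dash and
# a second code point. Anything else between matches is a separator.
RANGE_PATTERN = re.compile(
    r"(?:U\+)?([0-9A-Fa-f]{4,6})"
    r"(?:\s*[-\u2013]\s*(?:U\+)?([0-9A-Fa-f]{4,6}))?"
)

def parse_ranges(text):
    """Return a list of (start, end) code point pairs found in text."""
    ranges = []
    for match in RANGE_PATTERN.finditer(text):
        start = int(match.group(1), 16)
        # A lone code point is treated as a one-element range.
        end = int(match.group(2), 16) if match.group(2) else start
        if end < start:
            raise ValueError(f"range ends before it starts: {match.group(0)}")
        ranges.append((start, end))
    return ranges
```

Something like <code>parse_ranges("0000-007F, 0080–00FF and 2060")</code> would then yield three pairs, the last one covering only U+2060.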
 
One additional advantage of modules vs. conventional templates is that modules can receive and obey parameters based on pattern matching without exhaustively defining every possible parameter name. E.g. the Basic Latin template could call the module with the following extra parameters.
<pre>
| name = Basic Latin
<!-- these rows get dotted boxes or whatever -->
| box_U000x = yes
| box_U001x = yes
<!-- visual aliases for non-printables -->
| display_U0006 = ACK
| display_U0009 = TAB
<!-- printables that are bad titles -->
| link_U0020 = Space (punctuation)
| link_U0023 = Number sign
| link_U005F = Underscore
</pre>
The inner loop of the module could check whether such parameter exists and know whether to behave differently. This should be used sparingly but presents a decent solution for exceptional cases, which will be absent from most blocks. ―[[special:contributions/cobaltcigs|cobaltcigs]] 20:21, 10 September 2019 (UTC)
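
To make the idea concrete, a minimal Python sketch of the pattern matching follows (the real module would be Lua, reading the frame arguments). The names <code>collect_overrides</code>, <code>is_boxed</code> and <code>cell_text</code> are made up for illustration, and the accepted parameter shapes are an assumption based on the example above: <code>box_U…x</code> for whole rows, <code>display_U…</code> and <code>link_U…</code> for single code points.

```python
import re

# Hypothetical parameter shapes, modeled on the example parameters above.
ROW_PARAM = re.compile(r"^box_U([0-9A-F]{3,5})x$")
CELL_PARAM = re.compile(r"^(display|link)_U([0-9A-F]{4,6})$")

def collect_overrides(args):
    """Scan all template arguments once and keep only the matching ones."""
    overrides = {"box_rows": set(), "display": {}, "link": {}}
    for key, value in args.items():
        row = ROW_PARAM.match(key)
        if row:
            if value.strip().lower() == "yes":
                # "000" in box_U000x is the row index (code point >> 4).
                overrides["box_rows"].add(int(row.group(1), 16))
            continue
        cell = CELL_PARAM.match(key)
        if cell:
            kind, code_point = cell.group(1), int(cell.group(2), 16)
            overrides[kind][code_point] = value.strip()
    return overrides

def is_boxed(code_point, overrides):
    """True if the code point's row was flagged with box_U…x = yes."""
    return (code_point >> 4) in overrides["box_rows"]

def cell_text(code_point, overrides):
    """Use the display alias if one was given, else the character itself."""
    return overrides["display"].get(code_point, chr(code_point))
```

Scanning the arguments once up front keeps the inner loop down to a couple of dictionary lookups per cell, and unrelated parameters such as <code>name</code> simply fall through.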
 
I have some follow-up:
Line 113 ⟶ 100:
I'm prepared to go with #4 for now, then upgrade to #5–6 only after all the other issues are addressed. ―[[special:contributions/cobaltcigs|cobaltcigs]] 09:17, 12 September 2019 (UTC)
:I've never been very keen on specifying fonts on the Wikipedia side, because 1) most fonts for most Unicode scripts are not available on most users' devices without downloading them; 2) in the past editors have tended to specify fonts that they have on their own system so that it looks nice for them, without considering other users; and 3) the Wikipedia-specified fonts may override users' font preferences set in their browser (or in Wikipedia settings). Personally I would rather not specify any fonts, and leave it to the user's browser to apply an appropriate font, but I know that this is a minority view, so I'm OK with your suggested solution. [[User:BabelStone|BabelStone]] ([[User talk:BabelStone|talk]]) 13:06, 12 September 2019 (UTC)
 
==Formatting abbreviations==
Besides worrying about which abbreviations are used in the charts, there's an issue of formatting. Today, long ones are often split into two or more lines to control the width of the chart. An extreme example is NULL NOTE HEAD in [[Template:Unicode chart Musical Symbols]], but this practice happens in other places like [[Template:Unicode chart Mongolian]] and [[Template:Unicode chart Variation Selectors Supplement]]. I haven't checked to see if the abbreviations are always in a dashed box, but maybe we could have a parameter like <tt><nowiki>...|abbr|1D15|{{resize|75%|NULL<br />NOTE<br />&amp;nbsp;HEAD&amp;nbsp;}}</nowiki></tt> to preserve the ability to format these in the current fashion. In any case, formatting is something to consider. [[User:Drmccreedy|DRMcCreedy]] ([[User talk:Drmccreedy|talk]]) 21:47, 12 September 2019 (UTC)
Line 127 ⟶ 115:
 
== Pink cells ==
 
A footnote says “Pink cells indicate non-printable format characters.” That is untrue: they currently indicate [https://unicode.org/cldr/utility/list-unicodeset.jsp?a=%5B%3ACf%3A%5D all format characters], some of which are printable. It would be more useful, I think, to highlight [https://unicode.org/cldr/utility/list-unicodeset.jsp?a=%5B%3ADI%3A%5D-%5B%3ACn%3A%5D default ignorable characters]. [[User talk:Gorobay|Gorobay]] ([[User talk:Gorobay|talk]]) 03:27, 14 September 2019 (UTC)
:Okay. I shall try to figure out how to distinguish between these using the available data modules. ―[[special:contributions/cobaltcigs|cobaltcigs]] 11:19, 14 September 2019 (UTC)
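
As a rough illustration of the distinction, here is a Python sketch (not the module's Lua, and not using the on-wiki data modules). General category Cf can be read from Python's standard <code>unicodedata</code> module, but Default_Ignorable_Code_Point cannot, so the table below is a hand-picked, deliberately incomplete sample of ranges used only for demonstration. The names <code>DEFAULT_IGNORABLE_SAMPLE</code> and <code>classify</code> are hypothetical.

```python
import unicodedata

# Deliberately incomplete sample of default ignorable ranges, for
# illustration only. A real implementation would need the full
# Default_Ignorable_Code_Point property data.
DEFAULT_IGNORABLE_SAMPLE = [
    (0x00AD, 0x00AD),  # SOFT HYPHEN
    (0x034F, 0x034F),  # COMBINING GRAPHEME JOINER
    (0x200B, 0x200F),  # zero-width characters and directional marks
    (0x202A, 0x202E),  # bidirectional embedding and override controls
    (0x2060, 0x206F),  # word joiner, invisible operators and similar
    (0xFE00, 0xFE0F),  # variation selectors 1 to 16
    (0xFEFF, 0xFEFF),  # ZERO WIDTH NO-BREAK SPACE
]

def in_sample(code_point):
    """True if the code point falls in one of the sample ranges."""
    return any(lo <= code_point <= hi for lo, hi in DEFAULT_IGNORABLE_SAMPLE)

def classify(code_point):
    """Sort a code point into one of three buckets for cell shading."""
    if in_sample(code_point):
        return "default-ignorable"
    if unicodedata.category(chr(code_point)) == "Cf":
        # Format character that is not default ignorable, e.g. U+0600.
        return "printable-format"
    return "other"
```

With this sample, U+00AD SOFT HYPHEN (Cf) and U+034F COMBINING GRAPHEME JOINER (Mn, not Cf) both land in the default-ignorable bucket, while U+0600 ARABIC NUMBER SIGN is Cf but visible, which is the case the current pink shading gets wrong.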