Module talk:Unicode chart: Difference between revisions

(28 intermediate revisions by 9 users not shown)
Line 1:
{{talkheader}}
{{to do|inner=
*Add <code>is_default_ignorable</code> to [[Module:Unicode data]] (currently in [[Module:Unicode data/sandbox]]) and use it to identify "default ignorable" code points.
*Figure out how to programmatically identify "default ignorable" code points and printable vs. non-printable format chars (rather than hardcoding a bunch of ranges—which is feasible but might not age well).
*Add a way to insert column header "reminder" rows at arbitrary intervals for huge blocks. Or maybe just do it automatically every 16 rows.
*Css
Line 37 ⟶ 38:
# How to handle block-specific formatting? For example [[Template:Unicode chart Javanese]] has a specific height and some of the characters in [[Template:Unicode chart Control Pictures]] use a different font size.
# How to handle character links? Like {{ping|BabelStone}}, I'm not a fan of linking specific characters (but others are). It looks like your code, optionally, will link every character if an article exists, but this could increase the number of linked characters. And many characters aren't linked to the character itself, like U+2245 in [[Template:Unicode chart Mathematical Operators]]. Some link to wikt, like U+2105 in [[Template:Unicode chart Letterlike Symbols]] and all the characters in [[Template:Unicode chart CJK Unified Ideographs Extension A]].
# {{done}} Some blocks have special parameters that need to be taken into account: [[Template:Unicode chart Alphabetic Presentation Forms]], [[Template:Unicode chart Enclosed Alphanumeric Supplement]], [[Template:Unicode chart Enclosed CJK Letters and Months]], [[Template:Unicode chart Halfwidth and Fullwidth Forms]], [[Template:Unicode chart Miscellaneous Symbols]], and [[Template:Unicode chart Supplemental Symbols and Pictographs]]. As with most of these questions, this only applies if you're replacing existing chart templates.
# How to determine the chart name? Most charts use the block name for the title but some don't. For example, "C0 Controls and Basic Latin" is the chart name for the "Basic Latin" block.
# How to determine what to link the chart name to. For example, the [[Template:Unicode chart Kangxi Radicals]] chart links to "Kangxi radical#Unicode". Most either link to the block name itself or the block name with "(Unicode block)" appended.
Line 61 ⟶ 62:
**<code>start</code>/<code>end</code> parameters have been scrapped in favor of a single <code>range</code> parameter which can contain multiple ranges (connected by hyphen or en dash, and separated from each other by comma, whitespace, the word "and", or in fact anything that's not a hex digit).
*14 and 15. If the unicode block display names can't be made to exactly match the [[Module:Unicode data/blocks|"official" names]] in all cases, we'll need a (hopefully short) list of aliases. Adding a blocknamelink parameter which continues to default to <code>Blockname (Unicode chart)</code> if empty would be easy and sufficient. Let's try to avoid having three sets of names wherever possible.
**{{done}} <code>link_name</code> and <s><code>display_name</code></s> parameters added for differing cases. ―[[special:contributions/cobaltcigs|cobaltcigs]] 13:13, 14 September 2019 (UTC)
*{{done}} 16. I don't see why not. See 13.
―[[special:contributions/cobaltcigs|cobaltcigs]] 18:20, 10 September 2019 (UTC)
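The <code>range</code> syntax described above can be sketched roughly as follows. This is an illustration in Python, not the module's Lua code. The function name <code>parse_ranges</code> is hypothetical, and so is the rule that a code point is 4 to 6 hex digits. That rule is an assumption added here so that the hex letters inside the word "and" are not mistaken for code points.

```python
import re

# Assumption: a code point is written as 4-6 hex digits. Shorter runs
# (such as the "a" and "d" in the word "and") are treated as separators.
_HEX = r"[0-9A-Fa-f]"
_RANGE = re.compile(
    rf"(?<!{_HEX})({_HEX}{{4,6}})(?!{_HEX})"            # start of range
    rf"(?:\s*[-\u2013]\s*({_HEX}{{4,6}})(?!{_HEX}))?"   # optional hyphen/en dash + end
)


def parse_ranges(text):
    """Return a list of (start, end) code point pairs found in text."""
    ranges = []
    for match in _RANGE.finditer(text):
        start = int(match.group(1), 16)
        # A lone code point is treated as a one-element range.
        end = int(match.group(2), 16) if match.group(2) else start
        if end < start:
            start, end = end, start
        ranges.append((start, end))
    return ranges


if __name__ == "__main__":
    for start, end in parse_ranges("2060\u20132064, 206A-206F and E0001"):
        print(f"U+{start:04X}..U+{end:04X}")
```

Anything between two matches counts as a separator, so commas, whitespace and prose all work without special handling.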
Line 71 ⟶ 72:
*8 The "Dashed Box Convention" is explained at https://www.unicode.org/versions/Unicode12.0.0/ch24.pdf#G8175 It's an oversight not having a note explaining this convention. It was added to match Unicode's charts. I think it's useful. Depending on the font, without the dashed box U+0602 is easily confusable with U+060E, U+1F1E6 looks the same as capital A, etc. As far as I know there's no way to determine which characters get a dashed box programmatically. As of version 12.1 it's used on U+0000-0020, 007F-00A0, 00AD, 034F, 0600-0605, 061C, 06DD, 070F, 08E2, 0CF1-0CF2, 0D4E, 0F0C, 1039, 115F-1160, 17B4-17B5, 17D2, 180B-180E, 1A60, 1BAB, 1CF5-1CF6, 2000-200F, 2011, 2028-202F, 205F-2064, 2066-206F, 2D7F, 2E3A-2E3B, 3000, 303E, 3164, AAF6, FE00-FE0F, FEFF, FFA0, FFF9-FFFB, 10A3F, 11003-11004, 1107F, 110BD, 110CD, 111C2-111C3, 11A3A, 11A47, 11A84-11A89, 11A99, 11D45-11D46, 11D97, 13430-13438, 16F8F-16F92, 1BC9D, 1BCA0-1BCA3, 1D159, 1D173-1D17A, 1DA9B-1DA9F, 1DAA1-1DAAF, 1F1E6-1F1FF, E0001, E0020-E007F, and E0100-E01EF.
*10 Unicode charts use XXX (in a dotted box) for U+0080, 0081, and 0099 and I don't think Wikipedia's charts should contradict the cited source. (For some arcane history of these three characters, I recommend http://unicode.org/pipermail/unicode/2015-October/002876.html) I think the only way of determining the abbreviations to use in the charts is a hardcoded table. They don't always match an alias. For example U+E007F is displayed as "END". A lot of the code points that use the dashed box convention display abbreviations. I haven't compiled a definitive list.
*{{done}} 13 In [[Template:Unicode chart Enclosed CJK Letters and Months]] the hangul subset isn't contiguous. Nor is the emoticon subset of [[Template:Unicode chart Miscellaneous Symbols]]. I didn't add these features so I don't know what reaction you'll get from removing them.
[[User:Drmccreedy|DRMcCreedy]] ([[User talk:Drmccreedy|talk]]) 23:09, 10 September 2019 (UTC)
 
Line 84 ⟶ 85:
:* <b>The three characters that Unicode displays as "XXX" do indeed have abbreviations in NameAliases.txt but they all have a type of "figment" as in "figment of one's imagination". I feel strongly that we shouldn't assign abbreviations to the charts that contradict the ones used in the actual, cited Unicode charts.</b> [[User:Drmccreedy|DRMcCreedy]] ([[User talk:Drmccreedy|talk]]) 21:47, 12 September 2019 (UTC)
* I gave the control characters a light blue background and an explanatory footnote similar to those for RESERVED and NONCHARACTER. Also dashed boxes around the abbreviations, which are loaded from [[Module:Unicode data/aliases|here]]. Some have multiple abbreviations. The current behavior is to choose the last one, because at brief glance that seemed most correct in most cases. I'd rather we move the "official" or preferred abbreviation to the top and consistently select the first one instead. I've yet to research what, if anything, might be broken by changing abbreviation order.
:* <b><ttsamp>Module:Unicode data/aliases</ttsamp> is generated from Unicode's NameAliases.txt file. It looks like it is in the same order, so any tweaking we do to order would be problematic when the file is updated. If we changed the script that creates aliases we would just be moving the logic from the chart script to the generation script. Other users of <ttsamp>alias</ttsamp> may not have the same requirement so I think the right place to make the determination for what to use in the charts belongs in the chart script. I have another abbreviation issue but I'll do that in a new section for clarity.</b> [[User:Drmccreedy|DRMcCreedy]] ([[User talk:Drmccreedy|talk]]) 21:47, 12 September 2019 (UTC)
―[[special:contributions/cobaltcigs|cobaltcigs]] 09:17, 12 September 2019 (UTC)
 
Line 103 ⟶ 104:
 
==Formatting abbreviations==
Besides worrying about which abbreviations are used in the charts, there's an issue of formatting. Today, long ones are often split into two or more lines to control the width of the chart. An extreme example is NULL NOTE HEAD in [[Template:Unicode chart Musical Symbols]] but this practice happens in other places like [[Template:Unicode_chart_Mongolian]] and [[Template:Unicode chart Variation Selectors Supplement]]. I haven't checked to see if the abbreviations are always in a dashed box but maybe we could have a parm like <ttsamp><nowiki>...|abbr|1D15|{{resize|75%|NULL<br />NOTE<br />&amp;nbsp;HEAD&amp;nbsp;}}</nowiki></ttsamp> to preserve the ability to format these in the current fashion. In any case, formatting is something to consider. [[User:Drmccreedy|DRMcCreedy]] ([[User talk:Drmccreedy|talk]]) 21:47, 12 September 2019 (UTC)
:Eww. See [[User:BabelStone/sandbox#Musical Symbols]] for an attempt to replicate that (without any <code><nowiki><br />&amp;nbsp;</nowiki></code> crap, which is great!). Note that 1D173–1D17A are identified as "format" characters in [[Module:Unicode data|this file]], but "NULL NOTE HEAD" is not. Hence the difference in css/color. The pink can of course be changed later. ―[[special:contributions/cobaltcigs|cobaltcigs]] 20:45, 13 September 2019 (UTC)
::Wow, I've never realised that U+1D159 is not a format character. Are there any other characters displayed as a dashed box around text that are not format or control characters? <s>I don't think so</s> (variation selectors are gc=Mn). The worrying thing is there seems to be no way of extracting the information from the UCD, so it relies on visually checking the Unicode code charts, but what if it changes suddenly to a graphic character in a new version of Unicode? My gut feeling is that gc=So is wrong if the character has no visible glyph and is not whitespace. [[User:BabelStone|BabelStone]] ([[User talk:BabelStone|talk]]) 22:52, 13 September 2019 (UTC)
::I couldn't immediately work out where you are specifying a smaller font size for "NULL NOTE HEAD" compared with "Begin Beam" etc. I think that all the dashed boxes need a smaller font size because (on my system at least) the dashed letters are much larger size than Basic Latin letters, and make the cells overwide. Can we simply add "font-size:75%" for td.box in [[Template:Unicode chart/styles.css]], or is there more to it? [[User:BabelStone|BabelStone]] ([[User talk:BabelStone|talk]]) 23:30, 13 September 2019 (UTC)
:::This text uses {{code|lang=css|span.small-1 { font-size:80%; } span.small-2 { font-size:59%; } }} wherein the suffix digit is determined by the number of spaces converted to linebreaks in whatever text is shown (which may be read from the aliases file or from a <s><code>display_NNNN</code></s> override parameter). Then the property {{code|lang=css|white-space:pre;}} forces <code>\n</code> to show up as literal linebreaks so we don't have to resort to {{code|lang=html|<br />}}. Thus one-word abbreviations such as <code>ACK</code> use the same size as regular chars. All of this can be easily changed. For now, I've tightened the dashed box and cell margins/padding a little bit. ―[[special:contributions/cobaltcigs|cobaltcigs]] 10:08, 14 September 2019 (UTC)
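A minimal sketch of the size-class rule described above, written in Python for illustration rather than taken from the module's Lua. The function name <code>box_markup</code> is hypothetical. Capping the suffix at 2 is also an assumption, made because only <code>small-1</code> and <code>small-2</code> are mentioned.

```python
def box_markup(abbreviation):
    """Return (text, css_class) for a dashed-box abbreviation.

    Spaces become newlines, which white-space:pre then renders as line
    breaks. The class suffix is the number of breaks inserted.
    """
    words = abbreviation.split()
    breaks = len(words) - 1
    text = "\n".join(words)
    if breaks == 0:
        # One-word abbreviations such as ACK keep the regular size.
        return text, ""
    # Assumption: only small-1 and small-2 exist, so clamp at 2.
    return text, f"small-{min(breaks, 2)}"


if __name__ == "__main__":
    for abbr in ("ACK", "I SS", "NULL NOTE HEAD"):
        print(repr(box_markup(abbr)))
```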
 
==Version==
There have been many past discussions about how to determine which Unicode version to show in the footnote of the chart. Because they were manually updated, it wasn't practical to have a master switch for the version. If the charts are created using <ttsamp>Module:Unicode data</ttsamp> it might be possible to do away with the mindless updating I do once a year for all the charts. A new <ttsamp>Module:Unicode data/version</ttsamp> item could be added that is manually updated after all of the other <ttsamp>Module:Unicode data</ttsamp> files are updated. Basically, it's just a string field to say "We've updated all the other data to version x". If the version footnote was pulled from that string, it would alleviate a lot of manual effort. It would mean adding <ttsamp>Module:Unicode data/version</ttsamp> to the list of "regenerate the charts if tables x, y, and z change". [[User:Drmccreedy|DRMcCreedy]] ([[User talk:Drmccreedy|talk]]) 21:47, 12 September 2019 (UTC)
:FYI: After a few updates, all of the [[Module:Unicode data]] subpages are now up-to-date (Unicode version 12.1). [[User:Drmccreedy|DRMcCreedy]] ([[User talk:Drmccreedy|talk]]) 04:45, 14 September 2019 (UTC)
::I do like the idea of centralizing the version string. Even as a single-purpose one-liner module {{code|lang=lua|return "12.1"}} would be fine. ―[[special:contributions/cobaltcigs|cobaltcigs]] 10:17, 14 September 2019 (UTC)
Line 138 ⟶ 139:
| display_name = General Punctuation (2060–206F only)
| range = 2060–206F
<!-- parameters gone, now loaded from master list -->
| display_2061 = <i>ƒ</i><small>()</small>
| display_2062 = ✕
| display_2063 = ,
| display_2064 = +
<!-- skip 2065–2069 -->
<!-- spaces magically become \n here; box span has white-space:pre. -->
| display_206A = I SS
| display_206B = A SS
| display_206C = I AFS
| display_206D = A AFS
| display_206E = NA DS
| display_206F = NO DS
}}
: I can't find a file in the Unicode Character Database that lists the display forms for the dotted box characters. They aren't in [https://www.unicode.org/Public/UCD/latest/ucd/NamesList.txt NamesList.txt], which is parsed into the PDF that you linked to. So they would have to be gathered manually from the PDFs, unless they can be found somewhere else. — [[User:Erutuon|Eru]]·[[User talk:Erutuon|tuon]] 04:13, 18 September 2019 (UTC)
::As far as I know, there isn't anything in the UCD. I've always determined dotted box notation manually. BTW: I think the <s>display_20xx</s> parms above are appropriate. [[User:Drmccreedy|DRMcCreedy]] ([[User talk:Drmccreedy|talk]]) 04:40, 18 September 2019 (UTC)
:: To clarify, "manually" would mean by visual approximation. Copy/paste gives us private-use codepoints assigned to arbitrary glyphs which represent the whole abbreviation (in some font that probably doesn't exist outside the PDF). So much eww. ―[[special:contributions/cobaltcigs|cobaltcigs]] 13:39, 18 September 2019 (UTC)
::: If you're interested, the fonts with the dashed glyphs (SpecialsUC4/5/6.ttf) are bundled with the free [https://unicode.org/unibook/ Unibook] application that is used to generate the Unicode and ISO/IEC 10646 code charts. [[User:BabelStone|BabelStone]] ([[User talk:BabelStone|talk]]) 16:06, 18 September 2019 (UTC)
Line 183 ⟶ 173:
:: ―[[special:contributions/cobaltcigs|cobaltcigs]] 20:50, 16 September 2019 (UTC)
::: You can rely on <code>lookup_category</code> never returning <code>nil</code> (at least when supplied a valid code point); <code>memo_lookup</code> guarantees that. The return value is either a "real" category when the code point is found in <code>singles</code> or <code>ranges</code> or Cn (Unassigned). — [[User:Erutuon|Eru]]·[[User talk:Erutuon|tuon]] 22:40, 16 September 2019 (UTC)
::: Oops. Actually, what I said is true of [[Module:Unicode data/sandbox]], but at the moment [[Module:Unicode data]] is buggy. — [[User:Erutuon|Eru]]·[[User talk:Erutuon|tuon]] 23:35, 30 September 2019 (UTC)
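The guarantee described above (a category is always returned, with Cn as the fallback) can be illustrated with a small Python sketch. It is not the code of [[Module:Unicode data]]. The tables below hold only a few sample entries, and <code>lru_cache</code> stands in for whatever <code>memo_lookup</code> does in the real module.

```python
from bisect import bisect_right
from functools import lru_cache

# Sample data only; the real singles/ranges tables are far larger.
SINGLES = {0x00AD: "Cf", 0x1D159: "So"}
RANGES = [  # (first, last, category), sorted by first code point
    (0x0000, 0x001F, "Cc"),
    (0x0041, 0x005A, "Lu"),
    (0x1D173, 0x1D17A, "Cf"),
]
_STARTS = [first for first, _, _ in RANGES]


@lru_cache(maxsize=None)  # stand-in for the module's memoization
def lookup_category(code_point):
    """Return a General Category string; never None."""
    if code_point in SINGLES:
        return SINGLES[code_point]
    # Find the last range that starts at or before code_point.
    index = bisect_right(_STARTS, code_point) - 1
    if index >= 0:
        _, last, category = RANGES[index]
        if code_point <= last:
            return category
    return "Cn"  # Unassigned: the fallback that makes nil impossible


if __name__ == "__main__":
    for cp in (0x41, 0x1D159, 0x1D175, 0x0378):
        print(f"U+{cp:04X}", lookup_category(cp))
```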
 
===Selectability: CSS vs. plain text===
Line 191 ⟶ 182:
===Actual aliases vs. corrections===
Can we have a demo of the info panel for a block with one or more characters that have a formal alias? I suggest Vertical Forms with its horrendously long name and alias for FE18. [[User:BabelStone|BabelStone]] ([[User talk:BabelStone|talk]]) 10:22, 17 September 2019 (UTC)
:Eww, a spelling error ("BRAKCET"). So the correctly spelled name is '''currently not loaded at all''' because it's recorded in the [[Module:Unicode data/aliases|aliases file]] as a <code>correction</code> rather than an <code>alias</code>. Aliases are currently loaded by the module (see control characters in the Latin chart above), whereas corrections will be a new concept which I'm not yet sure how best to handle. Do we want to show the misspelled title (maybe with a {{tl|sic}} tag, even) and note the correction as such on the next line? Or should we just replace it outright without comment? I suppose I'll begin reviewing the other <code>correction</code>s vs. what names they are correcting, to see how trivial or major their differences tend to be. For now, here's what the Vertical Forms block currently looks like: {{unicode chart|Vertical Forms|info=yes}} ―[[special:contributions/cobaltcigs|cobaltcigs]] 12:01, 17 September 2019 (UTC)
{{unicode chart|Vertical Forms|info=yes}}
 
{{collapsible section|title=Complete list of corrections (28) to consider|content=<nowiki />
Line 208 ⟶ 200:
::: Related: Can I also get your opinion on whether to put atypical abbreviations in the boxes for [[#General Punctuation, row U+206x]] above? ―[[special:contributions/cobaltcigs|cobaltcigs]] 20:15, 17 September 2019 (UTC)
:::: Yes. I'd display all of the aliases in the order they appear in NameAliases.txt (which is preserved in [[Module:Unicode data/aliases]]). But I also think the ''type'' of alias is useful to know. My preference would look like this:
:::: <table class="wikitable" style="width:100%;"><tr><th style="padding: 0px; width: 8%; font-family: 'serif'; font-weight: normal; font-size: 250%;"><span style="border: 2px dashed black; padding: 3px;">LF</span></th><td><div class="title" style="font-weight: bold; display: inline-block;">U+000A &lt;control&gt;</div><div class="category" style="display: inline; white-space: pre; "> (control)</div><div class="aliases plainlist" style="line-height: 120%;"><div style="white-space: pre; display: inline-block; vertical-align: top;">Control: LINE FEED<br />Control: NEW LINE<br />Control: END OF LINE<br />Abbreviation: LF<br />Abbreviation: NL<br />Abbreviation: EOL</div><div style="font-family: monospace;">[other stuff below...]</div></div></td></tr></table>
:::::{{done}}, see [[#info-000A]] above. Using <code><nowiki><ul></nowiki></code> because [https://stackoverflow.com/a/1726103 <code><nowiki><br /></nowiki></code> is for poetry and mailing addresses]. And I've just noticed the word "alias" won't actually appear to the reader. ―[[special:contributions/cobaltcigs|cobaltcigs]] 17:09, 18 September 2019 (UTC)
:::: As far as which abbreviation to use in the Wikipedia chart, I think it should match the official, cited Unicode chart. I'm guessing that a lot of them match the first/only abbreviation type of named alias but obviously not always. As you mentioned, U+206x is a good example of chart abbreviations that don't match named aliases. I'm thinking a table of chart abbreviations would be required. You could probably default the chart abbreviation if no exception is found but would it be worth the processing to not find a match first or is it faster to just add them all to a table?<br />My concern with using different chart abbreviations than Unicode is that there is no right answer. If someone were to change the Wikipedia chart abbreviation for U+000A from <ttsamp>LF</ttsamp> to <ttsamp>NL</ttsamp> would that be wrong/revertable? What about <ttsamp>LINE</ttsamp>? Or <ttsamp>LFEED</ttsamp>? If we don't have a definitive way to determine the chart abbreviation we open ourselves up to edit wars. Being able to cite the actual Unicode chart gives us one, definitive chart abbreviation.<br />Great work so far, BTW. [[User:Drmccreedy|DRMcCreedy]] ([[User talk:Drmccreedy|talk]]) 22:10, 17 September 2019 (UTC)
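One possible reading of the "table of chart abbreviations with a default" idea, sketched in Python for illustration only. The names <code>CHART_DISPLAY</code>, <code>ALIASES</code> and <code>chart_abbreviation</code> are hypothetical. The sample entries are the ones quoted on this page, and the alias order follows NameAliases.txt as described above.

```python
# Hand-compiled chart abbreviations, read off the Unicode code charts.
CHART_DISPLAY = {
    0x206A: "I SS",
    0xE007F: "END",
}

# (type, value) pairs in NameAliases.txt order; sample entry only.
ALIASES = {
    0x000A: [
        ("control", "LINE FEED"),
        ("control", "NEW LINE"),
        ("control", "END OF LINE"),
        ("abbreviation", "LF"),
        ("abbreviation", "NL"),
        ("abbreviation", "EOL"),
    ],
}


def chart_abbreviation(code_point):
    """Pick the text shown in the dashed box, or None if there is none."""
    # 1. An explicit entry copied from the cited chart always wins.
    if code_point in CHART_DISPLAY:
        return CHART_DISPLAY[code_point]
    # 2. Otherwise default to the first alias of type "abbreviation".
    for alias_type, value in ALIASES.get(code_point, ()):
        if alias_type == "abbreviation":
            return value
    return None


if __name__ == "__main__":
    for cp in (0x000A, 0x206A, 0x0041):
        print(f"U+{cp:04X}", chart_abbreviation(cp))
```

Both lookups are dictionary accesses, so the cost of "not finding a match first" is small either way. The choice between a full table and a table of exceptions is mostly about which is easier to maintain.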
 
::::: Okay, clearly I misinterpreted "I think all alias types should be shown using the Type: ALIAS" to mean "replace more specific alias-type labels with the word ALIAS". Makes a lot more sense with a picture drawn, glad I asked.
::::: So my actual concern about U+206x is that stand-in symbols might be mistaken for the actual glyph '''even by readers otherwise familiar with "normal" control/format character abbreviations''' which consist of multiple capital letters. So some explanatory footnotes might really be needed there.
:::::: '''Agreed'''. My first draft of a note would be "A dashed box indicates characters which normally have no visible display or only modify the display of other characters. {{cite web|title = Dashed Box Convention | url = https://www.unicode.org/versions/Unicode12.0.0/ch24.pdf#G8175 | publisher=Unicode Consortium }}"<br />The citation might be overkill. Although the nuances are pretty complicated so maybe the citation is justified. [[User:Drmccreedy|DRMcCreedy]] ([[User talk:Drmccreedy|talk]]) 02:04, 18 September 2019 (UTC)
::::: <s>Currently the display text can be overridden from the calling environment (ultimately, a block-specific template) for all assigned codepoints with few restrictions,<ref>Exception: whitespace characters, where the main grid disregards all abbreviations real or fake, instead forcing white-on-green rectangular display of the literal character to show relative size (and allow user to select/copy just like any other printable character). This differs from the source material but seems beneficial enough to justify. So for these codepoints, only in the lower info panel can the display text such as <span style="padding: 2px; border:1px dashed black;">NBSP</span> actually be overridden.</ref> which has been done in the U+206x example (and less constructively in the [[User:BabelStone/sandbox#Basic Latin (with various per-cell customizations)|"Vulgar" Latin]] sandbox section).</s> If we do load a master list of favored abbreviations from a sub-module (containing everything from <code>LF</code> to <code>NULL NOTE HEAD</code>), the <s><code>display_NNNN = FOO</code></s> parameters could be totally deleted.
::::::{{done}} and {{removed}}
::::: ―[[special:contributions/cobaltcigs|cobaltcigs]] 23:14, 17 September 2019 (UTC)
:::::: '''Oops''', I completely forgot about the <s><code>display_NNNN = FOO</code></s> parm. I like the idea of a master list because it centralizes the data but either approach will work. [[User:Drmccreedy|DRMcCreedy]] ([[User talk:Drmccreedy|talk]]) 02:04, 18 September 2019 (UTC)
::::::: +1 for a master list. [[User:BabelStone|BabelStone]] ([[User talk:BabelStone|talk]]) 16:13, 18 September 2019 (UTC)
:::::::: {{done}} ―[[special:contributions/cobaltcigs|cobaltcigs]] 06:42, 19 September 2019 (UTC)
{{reflist-talk}}
 
===Master list complete===
See [[Module:Unicode chart/display]] and make any corrections/amendments as needed. Maybe I missed a few reading all those PDFs. Except for the CJK blocks where even "skimming" would be too generous a term. <s><code>display_NNNN</code></s> params will be whacked soon. ―[[special:contributions/cobaltcigs|cobaltcigs]] 04:38, 19 September 2019 (UTC)
:{{removed}} ―[[special:contributions/cobaltcigs|cobaltcigs]] 06:42, 19 September 2019 (UTC)
::I've reviewed the list and made some changes. [[User:Drmccreedy|DRMcCreedy]] ([[User talk:Drmccreedy|talk]]) 17:28, 28 September 2019 (UTC)
 
==Going horizontal==
I've made the utf/html info slide to the right rather than downward when an alias list is present. Seems like a more efficient use of space. Seems to look okay next to the infamous BRAKCET correction, which I've confirmed is the longest string in the alias file. ―[[special:contributions/cobaltcigs|cobaltcigs]] 20:05, 18 September 2019 (UTC)
:I don't like the other information forced to the right when there's an alias. It's unexpected and I don't think the savings in vertical space makes up for it. Sorry, it just looks misaligned to me.
:Unrelated to the down vs. side option, I have two comments on the displayed information when you click on a code point:
:First, can we move the hex HTML escape sequence before the decimal one (&#x... / &#...)? I've never understood why someone would go through the trouble of calculating the decimal value of a code point in order to create an HTML escape sequence but maybe that's just me. In any case, having the hex value first would align better with the UTF-16 information directly above it. Hopefully the hex usage is more common anyway so it would make sense putting it first.
:Second, instead of the wording "Introduced in Unicode version x", I'd like to use more precise wording that the source uses.[https://www.unicode.org/Public/UNIDATA/DerivedAge.txt] This wording change seems trivial but it gets around the messy issue of various pre-1.1 characters. If Age is 1.1 (the earliest shown in the file), it would say "Assigned as of Unicode 1.1". Otherwise it would say "Newly assigned in Unicode x". Thanks. [[User:Drmccreedy|DRMcCreedy]] ([[User talk:Drmccreedy|talk]]) 17:28, 28 September 2019 (UTC)
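The two requests above could look roughly like this. It is a Python sketch with hypothetical function names, not the module's Lua. The UTF-16 helper is included only to show why the hex escape lines up with the line above it.

```python
def html_escapes(code_point):
    """Return the (hex, decimal) numeric character references, hex first."""
    return f"&#x{code_point:X};", f"&#{code_point};"


def utf16_units(code_point):
    """Return the UTF-16 code units for a code point."""
    if code_point < 0x10000:
        return [code_point]
    # Supplementary planes are encoded as a surrogate pair.
    offset = code_point - 0x10000
    return [0xD800 + (offset >> 10), 0xDC00 + (offset & 0x3FF)]


def age_note(age):
    """Word the footnote the way DerivedAge.txt does."""
    if age == "1.1":
        # 1.1 is the earliest value in the file, so pre-1.1 history is
        # deliberately left unstated.
        return "Assigned as of Unicode 1.1"
    return f"Newly assigned in Unicode {age}"


if __name__ == "__main__":
    cp = 0x1D159
    print(" ".join(f"{unit:04X}" for unit in utf16_units(cp)))
    print(" / ".join(html_escapes(cp)))
    print(age_note("1.1"))
    print(age_note("12.0"))
```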
 
==Named subsets added==
To more thoroughly address DRMcCreedy's item #13, I've added a way to refer to [[Module:Unicode chart/subsets|pre-defined named subsets]] in lieu of inputting a <code>range</code>. I suppose it may also be feasible to do unions/differences/intersections at some point, if there's a demand for it.
{{unicode chart
| block_name = Enclosed CJK Letters and Months
| link_name = Enclosed CJK Letters and Months
| display_name = Enclosed CJK Letters and Months (Hangul)
| subset = CJK_Letters_Months_Hangul
| info = yes
}}
Also new is the black line indicating skipped rows. Seems like a helpful feature.
 
The block name is also optional now. If omitted, there's no PDF link. But we can still set a display title and a link target for the subject. This would allow greater flexibility in generating a chart that transcends block divisions, such as "all control characters" (the subset name for which could be "special" in that it's generated by a function reading an [[Module:Unicode data/control|existing data file]], rather than hardcoded). But here's a sillier example for now.
 
{{unicode chart
| display_name = Basic Latin (vowels)
| link_name = English phonology#Vowels
| subset = Basic Latin vowels
| info = yes
}}
―[[special:contributions/cobaltcigs|cobaltcigs]] 13:45, 20 September 2019 (UTC)
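If unions, differences and intersections of named subsets are ever wanted, they map directly onto set operations. The Python sketch below is illustrative only. The subset contents are made up for the example and do not reflect [[Module:Unicode chart/subsets]]. The helper <code>skipped_row_gaps</code> is a hypothetical name for the check that decides where a "skipped rows" line would go.

```python
# Illustrative subset definitions (assumed, not the module's real data).
SUBSETS = {
    "Basic Latin vowels": frozenset(ord(c) for c in "AEIOUaeiou"),
    "Basic Latin uppercase": frozenset(range(0x41, 0x5B)),
}


def chart_rows(code_points):
    """Return the sorted 16-column row numbers (code point // 16)."""
    return sorted({cp >> 4 for cp in code_points})


def skipped_row_gaps(code_points):
    """Return (row_before, row_after) pairs wherever rows are skipped."""
    rows = chart_rows(code_points)
    return [(a, b) for a, b in zip(rows, rows[1:]) if b - a > 1]


if __name__ == "__main__":
    vowels = SUBSETS["Basic Latin vowels"]
    upper = SUBSETS["Basic Latin uppercase"]
    print(sorted(chr(cp) for cp in vowels & upper))    # intersection
    print(len(upper - vowels))                         # difference
    print(len(vowels | upper))                         # union
    print(skipped_row_gaps({0x41, 0x45, 0x75}))        # rows 4 and 7
```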
 
:I'd lean towards a jagged line like a ripped piece of paper but the thick black line is certainly noticeable enough for the user to realize something's going on. I would, however, like the notes to say "heavy" or "thick" black line because every row has a "black horizontal line". [[User:Drmccreedy|DRMcCreedy]] ([[User talk:Drmccreedy|talk]]) 17:28, 28 September 2019 (UTC)
 
== Orientation of glyphs for vertical scripts ==
 
For scripts such as [[Template:Unicode chart Mongolian|Mongolian]] and [[Template:Unicode chart Phags-pa|Phags-pa]] which are written in vertical columns, the glyphs in the font have horizontal orientation so that complete runs of horizontal text can be rotated into vertical orientation by a higher level protocol (commonly [[CSS]]). Currently, in our code charts we rotate the glyphs into vertical orientation. This used to match the Unicode code charts, which used to show vertically-oriented glyphs for Mongolian and Phags-pa, but a few years back the editor of the Unicode code charts deliberately changed the Mongolian and Phags-pa code charts to show horizontally-oriented glyphs to reflect how the glyphs are represented at the font level. My question is, should we continue to rotate glyphs in the dynamic Mongolian and Phags-pa charts or should we leave them in horizontal orientation to match the current Unicode code charts? My preference is to rotate into vertical orientation as this matches user expectation (it is how Mongolian and Phags-pa glyphs are presented in books on these scripts). [[User:BabelStone|BabelStone]] ([[User talk:BabelStone|talk]]) 08:12, 28 September 2019 (UTC)
:I don't have a strong preference, although I do think Unicode showing them horizontally seems strange. Vertical seems better. [[User:Drmccreedy|DRMcCreedy]] ([[User talk:Drmccreedy|talk]]) 17:28, 28 September 2019 (UTC)
 
== Unicode 13.0 ==
 
Unicode 13.0 will be released in March. Can we complete outstanding work on the Unicode chart module by then? Or shall we continue to use the old Unicode chart templates for the Unicode 13 update? [[User:BabelStone|BabelStone]] ([[User talk:BabelStone|talk]]) 10:16, 10 January 2020 (UTC)
 
== The cell displayed for U+E003B (TAG SEMICOLON) contains a colon in a dashed box instead of a semicolon ==
 
The chart shown on page [[Tags_(Unicode_block)]] shows the various tag characters as their normal version in a dashed box, but the character shown in the box for U+E003B (TAG SEMICOLON) is a colon instead of a semicolon. I'm not quite sure where/how to update the template. [[Special:Contributions/81.107.76.114|81.107.76.114]] ([[User talk:81.107.76.114|talk]]) 00:29, 12 August 2020 (UTC)
 
== Missing end tag for table ==
 
{{Ping|Cobaltcigs|Erutuon}} {{Tl|Unicode chart}} has little usage guidance, and I came to [[Module talk:Unicode chart]] (this very page), which has 6 missing end tags for {{tag|table}}, all associated with {{Tl|Unicode chart}}. So I went to [[Special:WhatLinksHere/Template:Unicode chart|Pages that link to "Template:Unicode chart"]]. There are 6 pages that transclude {{Tl|Unicode chart}}, and they all have missing end tags for {{tag|table}}.
 
So, my request is to either abandon this project, or write some usage notes that include how to use it without leaving a missing end tag lint error for {{tag|table}}. —[[User:Anomalocaris|Anomalocaris]] ([[User talk:Anomalocaris|talk]]) 07:32, 8 October 2023 (UTC)
: [[User:Vanisaac|Vanisaac]] mistakenly [[Special:Diff/1168432448|got rid of]] the end of the table (<code>|}</code>) while inserting this module into [[Template:Unicode chart]]. [[User:SWinxy|SWinxy]] [[Special:Diff/1169564617|added it back]], but inside the noinclude tag. I just moved it so that it was transcluded. I'm not sure the module should be in the template at this point because it's still marked as "pre-alpha" and hasn't been worked on since 2019, but I'm not going to try to evaluate that. — [[User:Erutuon|Eru]]·[[User talk:Erutuon|tuon]] 20:48, 8 October 2023 (UTC)
::Ah thank you. I must've thought that Module:Unicode chart somehow emitted a |} upon transclusion of this template, but not when the module was invoked, hence why I put the |} in the noinclude. [[User:SWinxy|SWinxy]] ([[User talk:SWinxy|talk]]) 21:38, 8 October 2023 (UTC)
::[[User:Erutuon|Erutuon]]: Thank you for taking care of this! —[[User:Anomalocaris|Anomalocaris]] ([[User talk:Anomalocaris|talk]]) 22:59, 8 October 2023 (UTC)
 
== Trying again from scratch ==
 
When I stumbled across this (April 2024), [[Template:Unicode chart]] wasn't working and no one seemed to be actively working on it. I sent a message to [[User:Cobaltcigs]] (the last person who edited [[Module:Unicode chart]]) and when I didn't hear back, I went ahead and started trying to build my own version in the sandbox. The pages I'm using are:
* '''Lua:''' [[Module:Unicode chart/sandbox]]
* '''CSS:''' [[Template:Unicode chart/sandbox/styles.css]]
* '''Template:''' [[Template:Unicode chart/sandbox]]
 
After a couple of days, I've created something that works in the majority of test cases, although there are some edge cases for unusual characters that still need to be ironed out. You can see my version at:
* '''Testcases:''' [[Template:Unicode chart/sandbox/testcases]]
 
- [[User:Eievie|Eievie]] ([[User talk:Eievie|talk]]) 18:22, 22 April 2024 (UTC)