Module talk:Lang-zh: Difference between revisions

__TOC__
 
== The unnamed parameter ==
== Template-protected edit request on 5 April 2024 ==
 
Hiya! I've just only now noticed that if you pass {{tl|lang-zh}} and pals an unnamed parameter in addition to any other parameter (like {{para|p}}, fr.ex.), the unnamed parameter does not display, rather than being aliased to {{para|c}}.{{pb}}This '''is''' in the documentation, but what's the reason for this behaviour? I'm struggling to imagine any case where an editor calls {{tl|lang-zh}} with an unnamed parameter alongside any other parameter, and puts anything in the {{para|1}} spot other than the Chinese text to display (the basic purpose of the template).{{pb}}Happy to hear I'm just dumb and have overlooked an obvious use case, but if no one knows of any can we just map {{para|1}} to {{para|c}} excepting where one or more of {{mset|{{para|c}}, {{para|t}}, {{para|s}}}} is passed to the template? [[User:Folly Mox|Folly Mox]] ([[User talk:Folly Mox|talk]]) 18:08, 25 January 2025 (UTC)
{{edit template-protected|Module:Lang-zh|answered=yes}}
I would like to enable the option "first=poj" analogously to "first=j". The "first=j" option allows Cantonese romanisations to be given before Mandarin romanisations, in articles where Cantonese is more relevant. The proposed "first=poj" option would allow Hokkien romanisation (POJ) to be given first, in articles where Hokkien is more relevant, e.g. for [[Bukit Ho Swee]], [[Hong-Gah Museum]], [[Tamsui District]].
 
:@[[User:Folly Mox|Folly Mox]] I fully agree with your request. I have no idea why <code><nowiki>{{lang-zh|实验|p=shíyàn}}</nowiki></code> displays as {{lang-zh|实验|p=shíyàn}}, instead of {{lang-zh|c=实验|p=shíyàn}}. {{lang-zh|实验}} and {{lang-zh|实验|labels=no}} work, though. [[User:Toadspike|<span style="color:#21a81e;font-variant: small-caps;font-weight:bold;">'''Toadspike'''</span>]] [[User talk:Toadspike|<span style="color:#21a81e;font-variant: small-caps;font-weight:bold;">[Talk]</span>]] 21:19, 30 January 2025 (UTC)
I believe this could be achieved by adding the following:
 
::{{re|Folly Mox|Toadspike}} I agree that this is confusing behaviour. It looks like the module currently only processes an unnamed argument at the end, on lines 287-309, in the case that it has not constructed any output. This section of code also duplicates some of the code earlier in the module, which is bad practice. As well as fixing the problem above, it would also be simpler and more maintainable to remove this section of code and instead map {{para|1}} to {{para|c}} at the beginning (e.g. just after line 103, where two other aliases are defined). [[User:Freelance Intellectual|Freelance Intellectual]] ([[User talk:Freelance Intellectual|talk]]) 15:37, 22 April 2025 (UTC)
From line 114, after:
 
== no-merging s and t ==
local j1 = false -- whether Cantonese Romanisations go first
 
Today I was editing [[Jing (philosophy)]] and I realized that the article began with "(Chinese: 敬; Chinese: 敬)". This is because the editor had written <code><nowiki>{{zh|c=敬|t=敬}}</nowiki></code> and gotten {{zh|c=敬|t=敬}}. I tried to correct this to <code><nowiki>{{zh|s=敬|t=敬}}</nowiki></code> and got a single "Chinese: 敬" instead, which does not adequately display the difference in the [[grass radical]]. I write this here as support for the merge=no option, which I read about in the talk page archives of this template while trying to figure this situation out. Then again: I don't know Chinese, so maybe this ''is'' an unimportant difference.
insert:
 
(After dabbling with variation selectors, I ended up just using a zero width space to differentiate the fields.) [[User:Dingolover6969|Dingolover6969]] ([[User talk:Dingolover6969|talk]]) 14:22, 20 April 2025 (UTC)
local poj1 = false -- whether Hokkien Romanisations go first
:{{re|Dingolover6969}} I see the motivation for occasionally needing to distinguish character variants that share a Unicode code point, but the need for an extremely lengthy invisible comment indicates that using a non-breaking space is not a good solution. It also allows a linebreak between the character and the semi-colon, which is incorrect.
:In the particular case of [[Jing (philosophy)]], highlighting the distinction is unnecessary. In the talk page archive that you mentioned (where the conclusion was not to include a no-merge option), I fully agree with the comment by {{ping|Kanguole}} "What is the purpose of this template? I think it is to convey the Chinese name of the person, book, etc that is the subject of article, for readers who understand characters. It's not to give lessons in typography to people who don't understand hanzi – we have specialist articles for that."
:On [[Wade–Giles#Tones]] where you have also added a non-breaking space, the same effect can be more transparently achieved by using the template twice. In that case, the difference is again unimportant, but I support displaying both variants because it aligns the characters with the other rows in the table. [[User:Freelance Intellectual|Freelance Intellectual]] ([[User talk:Freelance Intellectual|talk]]) 09:18, 22 April 2025 (UTC)
::I see what you mean. I have no strong opinions about it, really. Whether "for readers who understand characters" this distinction is important is beyond me. Another editor had apparently tried to achieve this, creating an odd result, but maybe they were mistaken to want to.
::Good call on the Wade-Giles tone table.
::Anyway, for those rare cases when the distinction might be required and the template is invoked only once:
::I thought allowing breaking was the right behavior, out of ignorance, but reviewing [https://www.unicode.org/reports/tr14/tr14-51.html the Unicode recommendation on the matter] makes me think I was wrong, and in fact line breaks shouldn't be allowed before semicolons or close-parentheses. A [[word joiner]] &amp;NoBreak; could be used instead to get the correct behavior.
::There is also an intended way to do this in Unicode itself, with variation selectors. I'm pretty sure this is the way, but not sure enough that I would want to put it on a page: {{zh|s=敬&#xFE00;|t=敬|labels=no}} (text: 敬&#xFE00;, 敬) {{zh|s=麻|t=麻&#xFE00;|labels=no}} (text: 麻, 麻&#xFE00;). The details of which variation selector to use for what are different for every character. It's a simple enough system, and the ideal solution if you're confident, but I'm not confident. The "text" renditions above should also display as the appropriate trad and simp variants, but they don't on my system; the template displays them distinctly simply because they are distinct Unicode sequences (post-normalization). [[User:Dingolover6969|Dingolover6969]] ([[User talk:Dingolover6969|talk]]) 15:27, 28 April 2025 (UTC)
:::Chinese speaker (reader?) here, I usually gloss over the two different versions of the grass radical without noticing it. I've never heard of anyone caring about the difference. Unless we're discussing that radical specifically, I don't see the need to list both characters; if we ''are'' discussing that radical specifically, it might be better to use images, where the difference can be more clearly shown. I'm not firmly opposed to adding a merge=no option, but I haven't yet seen a case where it would be useful. [[User:Toadspike|<span style="color:#21a81e;font-variant: small-caps;font-weight:bold;">'''Toadspike'''</span>]] [[User talk:Toadspike|<span style="color:#21a81e;font-variant: small-caps;font-weight:bold;">[Talk]</span>]] 21:00, 28 April 2025 (UTC)
 
== Why are semicolons used instead of commas? ==
From line 121, after:
 
I had a momentary confusion on the article ''[[Captain of Destiny]]'' where it sort of looks like "literally" doesn't apply to the Chinese text. I've since edited it to clarify, but look at the [[Special:Permalink/1224678293|previous version]] to see what I mean. I skimmed the template documentation but didn't see a reason given. — [[User:W.andrea|W.andrea]] ([[User talk:W.andrea|talk]]) 14:48, 1 May 2025 (UTC)
if (testChar == "j") then
:The semicolon was outside of the template in the previous version. How would this template or its documentation affect that situation? In your updated version, you have used the {{para|l}} parameter, but I would have put "Cheung Po the Kid" as the value, per the linked article. – [[User:Jonesey95|Jonesey95]] ([[User talk:Jonesey95|talk]]) 17:47, 1 May 2025 (UTC)
j1 = true
::Sorry, I don't think you understand what I'm asking: For most languages, things are separated by commas, while different languages are separated by semicolons, e.g. {{tqb|{{langx|fr|maison|lit=house}}; {{langx|es|casa|lit=house}}}} Why does this template use semicolons instead of commas? e.g. {{tqb|{{zh|c=房屋|l=house}}}} For a counter-example, {{tlx|langx}} uses commas: {{tqb|{{langx|zh|房屋|lit=house}}}} — [[User:W.andrea|W.andrea]] ([[User talk:W.andrea|talk]]) 13:44, 2 May 2025 (UTC)
end
::{{tqbm|The semicolon was outside of the template in the previous version. How would this template or its documentation affect that situation?}} Oh sorry, I might have misunderstood what you meant by this. Yes the semicolon was outside the template, but it was following the convention of the template, and after editing, the semicolon is inside the template. — [[User:W.andrea|W.andrea]] ([[User talk:W.andrea|talk]]) 13:51, 2 May 2025 (UTC)
::{{tqbm|In your updated version, you have used the {{para|l}} parameter, but I would have put "Cheung Po the Kid" as the value, per the linked article.}} I just used what was already there and it's the title of the linked article. I don't speak Chinese anyway so I wouldn't feel comfortable changing it. But this is beside the point of my question. — [[User:W.andrea|W.andrea]] ([[User talk:W.andrea|talk]]) 13:53, 2 May 2025 (UTC)
:::I see what you mean. This module appears to separate each parameter with semicolons. The list of parameters in lines 18–52 is undifferentiated. I think someone would have to adjust the module code to precede "lit." with a comma. – [[User:Jonesey95|Jonesey95]] ([[User talk:Jonesey95|talk]]) 17:24, 2 May 2025 (UTC)
 
== L switch throwing Linter errors when value ends with a closing italics tag ==
insert:
 
Just noticed this on the [[Mangtong]] page while clearing old "missing end tag" errors:
if (testChar == "poj") then
poj1 = true
end
 
{|-
(The variable is named "testChar" but it is defined by the regular expression "%a+", which will match not only a single character but also longer strings.)
!Code...
|
!Renders as...
|-
|<code><nowiki>A modernized version of the ''mangtong'', called ''gǎigé mángtǒng'' ( {{zh|c=改革芒筒|l=reformed ''mangtong''}}), was developed in the 20th century.</nowiki></code>
|
|A modernized version of the ''mangtong'', called ''gǎigé mángtǒng'' ( {{zh|c=改革芒筒|l= reformed ''mangtong''}}), was developed in the 20th century.
|}
 
(On a separate note, there seems to be a superfluous space before "end" on lines 120 and 123.)
 
It seems like passing a value ending with an italicized value via the <code>l=</code> parameter throws the Linter error. Adding a &nbsp or equivalent parameter after the closing italics tag won't resolve the error...
From line 137, after:
 
{|-
if (j1) then
|<code><nowiki>A modernized version of the ''mangtong'', called ''gǎigé mángtǒng'' ( {{zh|c=改革芒筒|l=reformed ''mangtong''&nbsp;}}), was developed in the 20th century.</nowiki></code>
orderlist[4] = "j"
|
orderlist[5] = "cy"
|A modernized version of the ''mangtong'', called ''gǎigé mángtǒng'' ( {{zh|c=改革芒筒|l= reformed ''mangtong''&nbsp;}}), was developed in the 20th century.
orderlist[6] = "sl"
|-
orderlist[7] = "p"
|<code><nowiki>A modernized version of the ''mangtong'', called ''gǎigé mángtǒng'' ( {{zh|c=改革芒筒|l=reformed ''mangtong''{{nbsp}}}}), was developed in the 20th century.</nowiki></code>
orderlist[8] = "tp"
|
orderlist[9] = "w"
|A modernized version of the ''mangtong'', called ''gǎigé mángtǒng'' ( {{zh|c=改革芒筒|l= reformed ''mangtong''{{nbsp}}}}), was developed in the 20th century.
end
|-
|<code><nowiki>A modernized version of the ''mangtong'', called ''gǎigé mángtǒng'' ( {{zh|c=改革芒筒|l=reformed ''mangtong'' }}), was developed in the 20th century.</nowiki></code>
|
|A modernized version of the ''mangtong'', called ''gǎigé mángtǒng'' ( {{zh|c=改革芒筒|l= reformed ''mangtong'' }}), was developed in the 20th century.
|}
 
insert:
 
The only solution appears to be adding a character that is neither an apostrophe nor a quotation mark after the close-italics markup but before the }}:
if (poj1) then
orderlist[4] = "poj"
orderlist[5] = "p"
orderlist[6] = "tp"
orderlist[7] = "w"
orderlist[8] = "j"
orderlist[9] = "cy"
orderlist[10] = "sl"
end
 
{|-
This puts POJ before the Mandarin and Cantonese romanisations. [[User:Freelance Intellectual|Freelance Intellectual]] ([[User talk:Freelance Intellectual|talk]]) 08:49, 5 April 2024 (UTC)
|<code><nowiki>A modernized version of the ''mangtong'', called ''gǎigé mángtǒng'' ( {{zh|c=改革芒筒|l=′reformed ''mangtong''′}}), was developed in the 20th century.</nowiki></code>
: {{Done}} [[User:Pppery|* Pppery *]] [[User talk:Pppery|<sub style="color:#800000">it has begun...</sub>]] 02:53, 15 April 2024 (UTC)
|
|A modernized version of the ''mangtong'', called ''gǎigé mángtǒng'' ( {{zh|c=改革芒筒|l=′reformed ''mangtong''′}}), was developed in the 20th century.
|}
 
...which seems a bit of a hack.
== Double-quotes around glosses ==
 
Is there a reason we use double-quotes rather than single-quotes to show the output of {{para|tr}}? [[MOS:SIMPLEGLOSS]] suggests we should prefer singles. — <span class="vcard"><span class="fn">[[User:OwenBlacker|OwenBlacker]]</span> <small>(he/him; [[User talk:OwenBlacker|Talk]])</small></span> 17:37, 18 June 2024 (UTC)
 
Interestingly, attempting to italicize the entire value results in both italics tags being ignored (same as when leaving italics tags out altogether)...
:Because {{para|l}} is used for literal translations & glosses, and {{para|tr}} is (much more rarely) used for non-literal translations. [[User:Remsense|<span style="border-radius:2px 0 0 2px;padding:3px;background:#1E816F;color:#fff">'''Remsense'''</span>]][[User talk:Remsense|<span lang="zh" style="border:1px solid #1E816F;border-radius:0 2px 2px 0;padding:1px 3px;color:#000">诉</span>]] 17:39, 18 June 2024 (UTC)
::Aha, that makes sense. So I have probably been misusing {{para|tr}} when I should have been using {{para|l}}. Thank you! — <span class="vcard"><span class="fn">[[User:OwenBlacker|OwenBlacker]]</span> <small>(he/him; [[User talk:OwenBlacker|Talk]])</small></span> 18:03, 18 June 2024 (UTC)
 
{|-
== Commas within literal glosses ==
|<code><nowiki>A modernized version of the ''mangtong'', called ''gǎigé mángtǒng'' ( {{zh|c=改革芒筒|l=''reformed mangtong''}}), was developed in the 20th century.</nowiki></code>
|
|A modernized version of the ''mangtong'', called ''gǎigé mángtǒng'' ( {{zh|c=改革芒筒|l=''reformed mangtong''}}), was developed in the 20th century.
|-
|<code><nowiki>A modernized version of the ''mangtong'', called ''gǎigé mángtǒng'' ( {{zh|c=改革芒筒|l=reformed mangtong}}), was developed in the 20th century.</nowiki></code>
|
|A modernized version of the ''mangtong'', called ''gǎigé mángtǒng'' ( {{zh|c=改革芒筒|l=reformed mangtong}}), was developed in the 20th century.
|}
 
Any chance this can be fixed?
What should we do if there needs to be a comma within a literal translation? I noticed this on [[Yi Jian Mei (song)]], where the quotes should be placed around the whole comma-separated phrase, not individually around each side of the comma. [[User:Pacificboy|pacificboy]] ([[User talk:Pacificboy|talk]]) 03:56, 11 July 2024 (UTC)
 
[[User:SirOlgen|SirOlgen]] ([[User talk:SirOlgen|talk]]) 16:28, 21 August 2025 (UTC)
:My assumption when adding this feature was that if one needed to add a comma, it should probably be treated as a proper translation, not a gloss. It turns out I never use this formatting, so I could very plausibly disable it. [[User:Remsense|<span style="border-radius:2px 0 0 2px;padding:3px;background:#1E816F;color:#fff">'''Remsense'''</span>]][[User talk:Remsense|<span lang="zh" style="border:1px solid #1E816F;border-radius:0 2px 2px 0;padding:1px 3px;color:#000">诉</span>]] 05:49, 11 July 2024 (UTC)
::Ah, that makes sense! I’ll convert it to a translation. Thanks. [[User:Pacificboy|pacificboy]] ([[User talk:Pacificboy|talk]]) 02:45, 12 July 2024 (UTC)
 
:So to put it in non-Lint terms, it doesn't appear as though this template's "literal" parameter is properly handling italics tags which occur either first or last in the passed string. [[User:SirOlgen|SirOlgen]] ([[User talk:SirOlgen|talk]]) 18:09, 21 August 2025 (UTC)
== Template-protected edit request on 17 August 2024 ==
::Yes, see this previous discussion: [[Module_talk:Lang-zh/Archive_5#Trailing_bold_in_l=_not_being_removed]]. A workaround is to use HTML italic tags, e.g. <code><nowiki>{{zh|l=reformed <i>mangtong</i>}}</nowiki></code> for {{zh|l=reformed <i>mangtong</i>}}. The stripping of bold and italic markup is commented in the code but not mentioned in the template documentation. At the time of the previous discussion, I wasn't convinced that stripping quotation marks was necessary, but I also didn't know of any use cases where it would cause a problem and so I didn't push the point further. However, this looks like a valid use case, and I think we should revisit the question of why quotes should be stripped. In contrast, this doesn't happen for {{tl|lit}}, which otherwise has extremely similar functionality to the {{para|l}} parameter. [[User:Freelance Intellectual|Freelance Intellectual]] ([[User talk:Freelance Intellectual|talk]]) 21:57, 22 August 2025 (UTC)
 
:::Thanks a million for the background info and workaround (I'm embarrassed for not having thought to try that, LOL). This does seem like a pretty obscure use case, but it also seems a near certainty there will be more examples out there among the 1.9M outstanding "[[Special:LintErrors/missing-end-tag|missing end tag]]" errors.
{{Edit template-protected|Module:Lang-zh|answered=yes}}
:::Thanks again!
 
:::[[User:SirOlgen|SirOlgen]] ([[User talk:SirOlgen|talk]]) 02:11, 24 August 2025 (UTC)
I propose the following changes to add [[Tâi-uân Lô-má-jī Phing-im Hong-àn|Tâi-lô]] romanization support. Of course, POJ covers 95% of Hokkien/Minnan use cases (hence why I have added the "tailo" IANA subtag) but it could still be useful for Taiwanese-specific pages. Additions and modifications below:
 
<syntaxhighlight lang="diff">
--- Module:Lang-zh
+++ Module:Lang-zh
 
@@ after line 29 @@ local labels = {
["sl"] = "Sidney Lau",
["poj"] = "Pe̍h-ōe-jī",
+ ["tl"] = "Tâi-lô",
["zhu"] = "Zhuyin Fuhao",
["l"] = "lit.",
@@ after line 46 @@ local wlinks = {
["poj"] = "Pe̍h-ōe-jī",
+ ["tl"] = "Tâi-uân Lô-má-jī Phing-im Hong-àn",
@@ after line 63 @@ local ISOlang = {
["poj"] = "nan-Latn",
+ ["tl"] = "nan-Latn-tailo",
 
@@ after line 74 @@ local italic = {
["poj"] = true,
+ ["tl"] = true,
 
@@ at line 136 @@
- local orderlist = {"c", "s", "t", "p", "tp", "w", "j", "cy", "sl", "poj", "zhu", "l", "tr"}
+ local orderlist = {"c", "s", "t", "p", "tp", "w", "j", "cy", "sl", "poj", "tl", "zhu", "l", "tr"}
 
@@ after line 150 @@ if (poj1) then
orderlist[4] = "poj"
- orderlist[5] = "p"
- orderlist[6] = "tp"
- orderlist[7] = "w"
- orderlist[8] = "j"
- orderlist[9] = "cy"
- orderlist[10] = "sl"
+ orderlist[5] = "tl"
+ orderlist[6] = "p"
+ orderlist[7] = "tp"
+ orderlist[8] = "w"
+ orderlist[9] = "j"
+ orderlist[10] = "cy"
+ orderlist[11] = "sl"
end
</syntaxhighlight>
[[User:MSG17|MSG17]] ([[User talk:MSG17|talk]]) 15:53, 17 August 2024 (UTC)
 
:@[[User:MSG17|MSG17]]: This sounds reasonable, and would be helpful on pages such as [[Penang Hokkien]] where both POJ and TL are used in the article text. @[[User:Pppery|Pppery]] or @[[User:Jonesey95|Jonesey95]], would you be able to help here? [[User:Freelance Intellectual|Freelance Intellectual]] ([[User talk:Freelance Intellectual|talk]]) 13:03, 19 September 2024 (UTC)
::I'll take a look at this ASAP, thank you for your improvements! <span style="border-radius:2px;padding:3px;background:#1E816F">[[User:Remsense|<span style="color:#fff">'''Remsense'''</span>]]<span style="color:#fff">&nbsp;‥&nbsp;</span>[[User talk:Remsense|<span lang="zh" style="color:#fff">'''论'''</span>]]</span> 13:06, 19 September 2024 (UTC)
:{{done}}<!-- Template:ETp --> <span style="border-radius:2px;padding:3px;background:#1E816F">[[User:Remsense|<span style="color:#fff">'''Remsense'''</span>]]<span style="color:#fff">&nbsp;‥&nbsp;</span>[[User talk:Remsense|<span lang="zh" style="color:#fff">'''论'''</span>]]</span> 13:48, 19 September 2024 (UTC)
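
For anyone checking the reordering in the request above, here is a minimal standalone sketch of the resulting parameter order. It assumes the 14-entry <code>orderlist</code> from the diff; the function names <code>build_orderlist</code> and <code>has_duplicates</code> and the <code>first</code> argument are hypothetical (the module itself sets the flags <code>j1</code> and <code>poj1</code> and assigns the indices inline).

```lua
-- Hypothetical helper: returns the display order for a given "first" value.
-- The module itself uses the flags j1 / poj1 and assigns indices one by one.
local function build_orderlist(first)
    local orderlist = {"c", "s", "t", "p", "tp", "w", "j", "cy", "sl",
                       "poj", "tl", "zhu", "l", "tr"}
    if first == "j" then
        -- Cantonese romanisations before the Mandarin ones (slots 4 to 9)
        local front = {"j", "cy", "sl", "p", "tp", "w"}
        for i, code in ipairs(front) do orderlist[3 + i] = code end
    elseif first == "poj" then
        -- Hokkien romanisations before Mandarin and Cantonese (slots 4 to 11)
        local front = {"poj", "tl", "p", "tp", "w", "j", "cy", "sl"}
        for i, code in ipairs(front) do orderlist[3 + i] = code end
    end
    return orderlist
end

-- Sanity check: every code should appear exactly once after reordering.
local function has_duplicates(list)
    local seen = {}
    for _, code in ipairs(list) do
        if seen[code] then return true end
        seen[code] = true
    end
    return false
end

print(table.concat(build_orderlist("poj"), " "))
print(has_duplicates(build_orderlist("poj")))
```

The duplicate check is the useful part: because the branch overwrites slots by index, a new code added to the default list without extending the overwritten range would appear twice or drop out.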
 
== Further romanization discussion ==
Coming off of my request to add Tâi-lô, what other romanization systems should be added to the template? I feel like [[Pha̍k-fa-sṳ]] and [[Wugniu]] could be helpful. I don't see any IANA Latn subtags for other Sinitic languages, however. [[User:MSG17|MSG17]] ([[User talk:MSG17|talk]]) 15:53, 17 August 2024 (UTC)
 
== Trailing bold in l= not being removed ==
 
In <syntaxhighlight>{{zh|t=竹子林站|j=Zuk1 Zi2 Lam4 Zaam6|l = '''Bamboo Forest station'''}}</syntaxhighlight>, the opening bold markup is properly removed, but the trailing bold markup is not removed. It looks like the regular expression at <syntaxhighlight>term = string.gsub(term, "^([ \"']*)(.*)([ \"']*)$", "%2")</syntaxhighlight> needs some adjustment to the middle wildcard search. – [[User:Jonesey95|Jonesey95]] ([[User talk:Jonesey95|talk]]) 13:23, 16 September 2024 (UTC)
 
:{{ping|Jonesey95}} This is because the * operator is greedy, so .* matches everything else in the string. Changing .* to .*? would make it lazy, so that the final term catches all trailing characters. In other words, change the line of code to: <syntaxhighlight>term = string.gsub(term, "^([ \"']*)(.*?)([ \"']*)$", "%2")</syntaxhighlight> [[User:Freelance Intellectual|Freelance Intellectual]] ([[User talk:Freelance Intellectual|talk]]) 13:51, 16 September 2024 (UTC)
::Thanks! That fixed the problem at [[Zhuzilin station]] and probably other pages. – [[User:Jonesey95|Jonesey95]] ([[User talk:Jonesey95|talk]]) 17:26, 16 September 2024 (UTC)
::Thank you for fixing my shoddy regex, by the way. <span style="border-radius:2px;padding:3px;background:#1E816F">[[User:Remsense|<span style="color:#fff">'''Remsense'''</span>]]<span style="color:#fff">&nbsp;‥&nbsp;</span>[[User talk:Remsense|<span lang="zh" style="color:#fff">'''论'''</span>]]</span> 13:05, 19 September 2024 (UTC)
:::{{re|Jonesey95|Remsense}} On further reflection, this doesn't work as intended. I had thought the string was a regex, but it is in fact a Lua pattern, which is slightly different. The Lua equivalent of *? is - which would give: <syntaxhighlight>term = string.gsub(term, "^([ \"']*)(.-)([ \"']*)$", "%2")</syntaxhighlight> Writing .*? in Lua (as I suggested above) actually means greedily matching all characters (.*) followed by a single question mark (? can also be an operator, but Lua pattern operators can't be nested so in this context it is interpreted as a literal). So actually the new pattern usually doesn't make a substitution, unless there is a question mark. This means it usually fails, e.g. where there are multiple glosses separated by commas and spaces, the spaces are not stripped. However, looking at what the pattern match applies to, I'm not completely sure I understand why the quotes should be stripped in the first place (is there a set of testcases to check against?). At [[Zhuzilin station]], the current code makes no substitution, and so it keeps the bold formatting, presumably as intended. The old code meant that the bold formatting was stripped at the beginning and not the end, so the rest of the article became bold (which was a bad and confusing error). Correcting .*? to .- as above would strip both, making it impossible to add bold formatting. Is the intention to catch cases where an editor unnecessarily adds quotes to the gloss? Is this a common problem? If so, is removing the ability to add bold and italic formatting a fair price to pay?
::: If we want to strip one quote mark but no more (so that we catch editors manually adding quotes, but allow formatting), pattern matching is a bit more complicated. I think it would be easiest to separate the stripping of whitespace and quotes. When stripping one single quote, we need to check that there isn't more than one, but we also need to allow the string to contain an apostrophe (so we can't just use [^']- in the middle) and a gloss could potentially be a single character (so we can't just use [^'].-[^'] in the middle). So it seems easiest to strip the leading and trailing quotes separately. This gives three lines (I've also removed two sets of brackets that were capturing substrings that weren't used): <syntaxhighlight>term = string.gsub(term, "^ *(.-) *$", "%1")
term = string.gsub(term, "^[\"']?([^\"'].-)$", "%1")
term = string.gsub(term, "^(.-[^\"'])[\"']?$", "%1")</syntaxhighlight> [[User:Freelance Intellectual|Freelance Intellectual]] ([[User talk:Freelance Intellectual|talk]]) 15:43, 24 September 2024 (UTC)
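
The patterns quoted in this thread can be compared with a short standalone Lua sketch, run outside the module. The patterns are copied from the comments above; the helper name <code>strip_one_quote</code> and the sample strings are only illustrative and are not taken from the module.

```lua
-- Sample gloss with bold markup, as in the Zhuzilin station example.
local term = "'''Bamboo Forest station'''"

-- 1. Original pattern: the greedy (.*) swallows the trailing quote marks,
--    so only the leading ones are stripped.
local greedy = string.gsub(term, "^([ \"']*)(.*)([ \"']*)$", "%2")

-- 2. ".*?" is not lazy in a Lua pattern: the "?" is a literal question mark,
--    so without a "?" in the input there is no match and nothing changes.
local literal_q = string.gsub(term, "^([ \"']*)(.*?)([ \"']*)$", "%2")

-- 3. ".-" is the lazy quantifier in Lua: both ends are stripped.
local lazy = string.gsub(term, "^([ \"']*)(.-)([ \"']*)$", "%2")

-- The three-line proposal: trim spaces, then strip at most one quote mark
-- from each end, leaving runs of quotes (bold/italic markup) alone.
local function strip_one_quote(s)
    s = string.gsub(s, "^ *(.-) *$", "%1")
    s = string.gsub(s, "^[\"']?([^\"'].-)$", "%1")
    s = string.gsub(s, "^(.-[^\"'])[\"']?$", "%1")
    return s
end

print(greedy)
print(literal_q)
print(lazy)
print(strip_one_quote(" 'house' "))
print(strip_one_quote(term))
```

On this reading, the three-line version removes a single manually added quote mark around a gloss but leaves bold or italic markup intact, because a run of two or more quote marks never matches the optional single-quote slot.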