{{copied|from=Module:String|from_oldid=552254999|to=:incubator:Module:Wp/nod/String|to_diff=4299113}}
{{User:HBC Archive Indexerbot/OptIn|target=/Archive index|mask=/Archive <#>|leading_zeros=0|indexhere=yes}}
{{archives|banner=yes|age=90|bot=lowercase sigmabot III}}{{User:MiszaBot/config
| algo=old(90d)
| archive=Module talk:String/Archive %(counter)d
| minthreadsleft=5
}}
== How to do string.replace of level3 headers in transcluded wikitext ==
:::: All that does is mapping each digit to its superscript unicode character. If we wanted to do the same using {{ml|string|replace}} we would need ten module invocations. --[[User:Grufo|Grufo]] ([[User talk:Grufo|talk]]) 01:55, 22 September 2023 (UTC)
::::: We already have [[Module:MultiReplace]] for that. Yes, the syntax you used is slightly terser than that module's syntax, but there's no need to shake up the world. [[User:Pppery|* Pppery *]] [[User talk:Pppery|<sub style="color:#800000">it has begun...</sub>]] 02:09, 22 September 2023 (UTC)
:::::: I have not done a speed test, but I suspect that syntax will not be the only area where this function greatly beats [[Module:MultiReplace]]. In fact, I suspect that it will be far less expensive to run. This function is less powerful (the mapping happens verbatim, character by character) and, exactly for this reason, it can map page-long strings without problems, while I suspect that [[Module:MultiReplace]] will break the servers in that case. If all this seems unrealistic, think about the number of Unicode characters: just to map four types of accents in {{[[:la:Formula:Sine notis diacriticis|la:Sine notis diacriticis]]}} I had to write the following string:
:::::: <syntaxhighlight lang="wikitext">ĀĒĪŌŪȲĂĔĬŎŬÀÈÌÒÙỲÁÉÍÓÚÝÂÊÎÔÛŶāēīōūȳăĕĭŏŭàèìòùỳáéíóúýâêîôûŷ̄̆</syntaxhighlight>
:::::: I assure you, there can be far more complex transliterations than this. And last but not least: in terms of both computational complexity and ease of use, what you call “shaking up the world” is rather a return to the old days. --[[User:Grufo|Grufo]] ([[User talk:Grufo|talk]]) 02:26, 22 September 2023 (UTC)
::::::: Yes, and do we really care? Don't make speculative claims about breaking the servers; see [[WP:PERF]]. I'm still not convinced that we need any of this level of complexity instead of just letting people type what they want to type rather than relying on templates to fix it for them, and it's clear we won't convince each other at this point. [[User:Pppery|* Pppery *]] [[User talk:Pppery|<sub style="color:#800000">it has begun...</sub>]] 02:33, 22 September 2023 (UTC)
:::::::: Fine. Continuing in this spirit, Latin Wikipedia will soon beat English Wikipedia interface-wise. --[[User:Grufo|Grufo]] ([[User talk:Grufo|talk]]) 02:41, 22 September 2023 (UTC)
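
For readers following the thread above, here is a minimal sketch of the kind of verbatim, character-by-character mapping being discussed. It is not the proposed function's actual code: the names <code>build_map</code> and <code>map_chars</code> are hypothetical, and the sketch uses only the standard Lua <code>string</code> library (no <code>mw.ustring</code>), assuming well-formed UTF-8 input without NUL bytes. It builds a lookup table once and then performs the whole mapping in a single <code>gsub</code> pass.

```lua
-- Matches one UTF-8 encoded character: a lead byte followed by any
-- continuation bytes (assumes well-formed UTF-8 and no NUL bytes).
local utf8_char = "[\1-\127\194-\244][\128-\191]*"

-- Hypothetical helper: pair the i-th character of `from` with the
-- i-th character of `to` in a lookup table.
local function build_map(from, to)
    local keys, map = {}, {}
    for ch in from:gmatch(utf8_char) do
        keys[#keys + 1] = ch
    end
    local i = 0
    for ch in to:gmatch(utf8_char) do
        i = i + 1
        if keys[i] then
            map[keys[i]] = ch
        end
    end
    return map
end

-- Hypothetical helper: replace every character found in `from` with
-- its counterpart in `to`; characters not in the table are kept as-is.
local function map_chars(source, from, to)
    local map = build_map(from, to)
    -- Parentheses discard gsub's second return value (the match count).
    return (source:gsub(utf8_char, map))
end

print(map_chars("x2 + y10", "0123456789", "⁰¹²³⁴⁵⁶⁷⁸⁹"))
```

Because the replacement argument is a table, each character costs one table lookup, so the work grows with the length of the source string rather than with the number of mapped characters. Whether this is actually faster than [[Module:MultiReplace]] on real pages would still need measuring.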
== Protected edit request on 25 October 2023 ==
{{edit fully-protected|Module:String|answered=yes}}
Please add '''r''' to the word '''fist''' (resulting in ''fi'''r'''st'') on line number 61. [[User:Gkiyoshinishimoto|Nishimoto, Gilberto Kiyoshi]] ([[User talk:Gkiyoshinishimoto|talk]]) 18:11, 25 October 2023 (UTC)
== Protected edit request on 3 September 2024 ==
{{edit template-protected|answered=yes}}
All of the Lua pseudo-regex special characters are in the ASCII range. See [[:en:UTF-8#Encoding]]. Therefore, we don't need the (costly) <code>mw.ustring.*</code> functions at all in some of the parts I have reviewed.
My request is to replace:
<syntaxhighlight lang="lua">
function str._escapePattern( pattern_str )
	return mw.ustring.gsub( pattern_str, "([%(%)%.%%%+%-%*%?%[%^%$%]])", "%%%1" )
end
</syntaxhighlight>
with:
<syntaxhighlight lang="lua">
function str._escapePattern( pattern_str )
	return ( string.gsub( pattern_str, "[%(%)%.%%%+%-%*%?%[%^%$%]]", "%%%0" ) )
end
</syntaxhighlight>
(I am also removing the capture group, which is unneeded because "%0" already refers to the whole match)
('''edit:''' I am also taking the opportunity, for extra robustness, to add parentheses in order to discard the 2nd value (number of replacements) returned by these gsub() functions, then subsequently by _escapePattern(). The more I encounter this "multiple values returned" Lua feature, the more I think it was a terrible design idea)
Second change: on line 409, we can similarly replace:
<syntaxhighlight lang="lua">
replace = mw.ustring.gsub( replace, "%%", "%%%%" ) --Only need to escape replacement sequences.
</syntaxhighlight>
with:
<syntaxhighlight lang="lua">
replace = string.gsub( replace, "%%", "%%%%" ) --Only need to escape replacement sequences.
</syntaxhighlight>
These changes would significantly decrease the overhead of having the "plain mode" enabled in this module's functions.
[[User:Od1n|Od1n]] ([[User talk:Od1n|talk]]) 03:26, 3 September 2024 (UTC)
:[{{fullurl:Module:String|diff=prev&oldid=1243840019}} 1243840019], thanks. [[User:Od1n|Od1n]] ([[User talk:Od1n|talk]]) 22:38, 3 September 2024 (UTC)
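
The two points made in this request can be checked outside Scribunto with a minimal sketch that uses only the standard Lua <code>string</code> library. The local name <code>escape_pattern</code> is a hypothetical stand-in for <code>str._escapePattern</code>; the pattern and replacement strings are the ones quoted in the request. Since every pattern special character is ASCII, and ASCII bytes never occur inside a multi-byte UTF-8 sequence, the byte-oriented <code>gsub</code> leaves non-ASCII text untouched.

```lua
-- Stand-in for str._escapePattern as proposed in the request above.
local function escape_pattern(pattern_str)
    -- The outer parentheses truncate gsub's two return values
    -- (new string, number of replacements) to the string alone.
    return (pattern_str:gsub("[%(%)%.%%%+%-%*%?%[%^%$%]]", "%%%0"))
end

-- Same body without the parentheses, to show the extra return value.
local function escape_pattern_leaky(pattern_str)
    return pattern_str:gsub("[%(%)%.%%%+%-%*%?%[%^%$%]]", "%%%0")
end

print(escape_pattern("a.b*c"))                      -- special characters get a % prefix
print(escape_pattern("Ā.ē"))                        -- multi-byte characters pass through unchanged
print(select("#", escape_pattern("a.b")))           -- number of values returned
print(select("#", escape_pattern_leaky("a.b")))     -- number of values returned
print(("1+1=2"):find(escape_pattern("1+1")))        -- escaped text matches literally
```

The leaky variant matters whenever the call is the last argument of another call, since the replacement count is then passed along as an extra argument.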
== Protected edit request on 18 October 2024 ==
{{edit fully-protected|Module:String|answered=yes}}
The value returned by a module function must always be a string; however, some functions here return numbers (these are <code>[[Module:String#len|len]]</code>, <code>[[Module:String#str_find|str_find]]</code>, <code>[[Module:String#find|find]]</code> and <code>[[Module:String#count|count]]</code>). Could you please apply [[Special:Diff/1251805774/1251806211|this diff]]? You can just copy and paste the code at [[Special:PermanentLink/1251806211|this permanent link]].
Although unnoticeable when used in normal wikitext, this can create problems when [[Module:String]] is invoked by other modules.
For instance, focusing on the <code>[[Module:String#len|len]]</code> function, a template named <code>mytemplate</code> containing the following code
<syntaxhighlight lang="wikitext">{{#invoke:params|mapping_by_invoking|string|len|mapping_by_replacing|^.*$|%0 mod 3|1|for_each|[$#:$@]}}</syntaxhighlight>
should print, for each argument passed, <syntaxhighlight lang="wikitext" inline>[PARAMETER-NAME:LENGTH-OF-PARAMETER mod 3]</syntaxhighlight>
The code above invokes <code>{{mfl|string|len|...}}</code> for each parameter passed. Then it attempts to replace each saved length with <code>%0 mod 3</code>, i.e. to append <code> mod 3</code> to it. And so, for instance, <syntaxhighlight lang="wikitext" inline>{{mytemplate|hello|world|foo|bar}}</syntaxhighlight> should print
: [1:5 mod 3][2:5 mod 3][3:3 mod 3][4:3 mod 3]
However, since <code>{{mfl|string|len|...}}</code> returns a number, any attempt to do string manipulation with the number returned will generate an error. --[[User:Grufo|Grufo]] ([[User talk:Grufo|talk]]) 05:17, 18 October 2024 (UTC)
: {{not done}}:<!-- Template:ESp --> {{tq|q=y|The value returned by a module function must always be a string}} is not true. [[mw:Extension:Scribunto/Lua reference manual#Returning text]] states {{tq|The module function should usually return a single string; whatever values are returned will be passed through tostring() and then concatenated with no separator.}} Further, when calling a module function from other Lua code even that doesn't apply; in that case it's like any other Lua function. I also note this change may well break other code that calls these functions (if it for some reason calls functions from this module instead of calling Scribunto's string manipulation functions directly) and expects a number from <code>len</code> or the like. [[User:Anomie|Anomie]][[User talk:Anomie|⚔]] 11:12, 18 October 2024 (UTC)
:: Alright, it seems then that I will have to fix that in {{mfl|params|mapping_by_invoking}} and stringify whatever modules may return. --[[User:Grufo|Grufo]] ([[User talk:Grufo|talk]]) 13:35, 18 October 2024 (UTC)
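
As a rough illustration of the fix mentioned in the last comment, the following sketch stringifies whatever a called function returns before doing string manipulation on it, mirroring the "passed through tostring() and then concatenated with no separator" rule quoted above. It is not the code of {{mfl|params|mapping_by_invoking}}: <code>stringify_returns</code> and <code>len</code> are hypothetical names, and the sketch runs in plain Lua without Scribunto.

```lua
-- Hypothetical helper: convert every returned value with tostring()
-- and concatenate the results with no separator.
local function stringify_returns(...)
    local parts = {}
    for i = 1, select("#", ...) do
        -- Parentheses keep only the i-th value from select().
        parts[i] = tostring((select(i, ...)))
    end
    return table.concat(parts)
end

-- Hypothetical stand-in for a module function that returns a number.
local function len(s)
    return #s
end

-- Calling a string method directly on the number fails, because
-- numbers have no string methods.
local ok = pcall(function()
    return len("hello"):gsub("^.*$", "%0 mod 3")
end)
print(ok)

-- After stringifying, the same replacement works.
local text = stringify_returns(len("hello"))
print((text:gsub("^.*$", "%0 mod 3")))
```

Doing the conversion on the caller's side leaves the numeric return values of <code>len</code>, <code>find</code> and the like intact for any Lua code that relies on them.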
== Bug in <code>replace</code>: empty strings are not recognized ==
Hi. I noticed that the <code>[[Module:String#replace|replace]]</code> function is unable to recognize empty strings (see third example):
# <syntaxhighlight lang="wikitext" inline>{{#invoke:string|replace|Foo|^.*$|Hello|1|false}}</syntaxhighlight>
#: ↳ {{#invoke:string|replace|Foo|^.*$|Hello|1|false}}
# <syntaxhighlight lang="wikitext" inline>{{#invoke:string|replace|Bar|^.*$|Hello|1|false}}</syntaxhighlight>
#: ↳ {{#invoke:string|replace|Bar|^.*$|Hello|1|false}}
# <syntaxhighlight lang="wikitext" inline>{{#invoke:string|replace||^.*$|Hello|1|false}}</syntaxhighlight>
#: ↳ {{#invoke:string|replace||^.*$|Hello|1|false}}
--[[User:Grufo|Grufo]] ([[User talk:Grufo|talk]]) 10:47, 12 July 2025 (UTC)
:Because of [[Module:String#L-402--L-404|lines 402–404]]. The reasoning for that code is not, so far as I can tell, documented. There is similar code, also not documented, in <code>find()</code> but that code makes some sort of sense: finding anything in an empty string should return <code>0</code>. Makes me wonder if <code>replace()</code> was created after <code>find()</code> and used <code>find()</code> as an armature upon which to construct <code>replace()</code>. Seems to me that [[Module:String#L-402|line 402]] could be rewritten as: <syntaxhighlight lang="lua" inline="1">if '' == pattern then</syntaxhighlight>. But are there any templates out there that rely on this anomaly?
:—[[User:Trappist the monk|Trappist the monk]] ([[User talk:Trappist the monk|talk]]) 13:24, 12 July 2025 (UTC)
::Function <code>replace()</code> [[Special:Diff/540121093|was added on 24 February 2013]], two days after [[Special:Diff/539690696|function <code>find()</code> was added]]. The early return in <code><nowiki>if source_str == '' or pattern == '' [...]</nowiki></code> was added in between those edits: [[Special:Diff/540073010]]. —[[User:Andrybak|andrybak]] ([[User talk:Andrybak|talk]]) 14:10, 12 July 2025 (UTC)
:::With some work (<code><nowiki>{{#invoke:string|replace|2=^.*$|3=Hello|4=1|5=false}}</nowiki></code>), it is possible for there to be no parameter 1. I don't know what <code>_getParameters</code> would do with that but the code in <code>str.replace</code> should handle a situation where parameter 1 is nil. For convenience, the code treats nil and empty as the same and that might be part of the reasoning for returning an empty string. I agree that <code>^.*$</code> should match an empty string although, as mentioned above, it is possible that someone has taken advantage of this undocumented behavior. {{ping|WOSlinker}} Any thoughts? [[User:Johnuniq|Johnuniq]] ([[User talk:Johnuniq|talk]]) 04:37, 13 July 2025 (UTC)
::::Yes, I think I must have just copied find and updated the code to do replace. There only seems to be [https://en.wikipedia.org/w/index.php?title=Special:Search&limit=50&offset=0&ns0=1&ns1=1&ns2=1&ns3=1&ns4=1&ns5=1&ns6=1&ns7=1&ns8=1&ns9=1&ns10=1&ns11=1&ns12=1&ns13=1&ns14=1&ns15=1&ns100=1&ns101=1&ns118=1&ns119=1&ns828=1&ns829=1&search=insource%3A%2F%5C%5E%5C.%5C%2A%5C%24%2F 24 occurrences] of <code>^.*$</code> so it won't take long to check if the undocumented behaviour is used. -- [[User:WOSlinker|WOSlinker]] ([[User talk:WOSlinker|talk]]) 07:39, 13 July 2025 (UTC)
:::::@[[User:WOSlinker|WOSlinker]]: Unfortunately there are an arbitrary number of patterns that can match an empty string, e.g., {{code|^X*$}}, {{code|X*}}, {{code|X?}} and of course an empty string will match another empty string, etc. There are certainly better ways to replace empty strings with nonempty ones but the logic is valid. The suggestion [[User:Trappist the monk|Trappist the monk]] made is not the right solution either because it ignores the {{code|replace}} text. Instead change the {{code|lang=lua|or}} to an {{code|lang=lua|and}} and change the return from {{code|source_str}} to {{code|replace}}. In fact, another optimization would be: inside {{code|lang=lua|if plain then}} add {{code|lang=lua|1=if pattern == source_str then return replace end}}. —[[User:Uzume|Uzume]] ([[User talk:Uzume|talk]]) 19:16, 16 July 2025 (UTC)
:::::: {{Re|Trappist the monk|andrybak|Johnuniq|WOSlinker|Uzume}} Any updates on this? --[[User:Grufo|Grufo]] ([[User talk:Grufo|talk]]) 12:49, 27 July 2025 (UTC)
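
To make the alternatives in this thread easier to compare, here is a simplified sketch in plain Lua. It is not the module's actual code: the three function names are hypothetical, the replacement count is fixed at 1 as in the examples above, and plain mode, parameter handling and <code>mw.ustring</code> are all left out. Only the early-return guard differs between the variants.

```lua
-- Guard as quoted in the thread: return early when either the
-- source or the pattern is empty.
local function replace_current(source_str, pattern, replace)
    if source_str == "" or pattern == "" then
        return source_str
    end
    return (source_str:gsub(pattern, replace, 1))
end

-- Variant along the lines of Trappist the monk's suggestion:
-- return early only when the pattern is empty.
local function replace_pattern_guard(source_str, pattern, replace)
    if pattern == "" then
        return source_str
    end
    return (source_str:gsub(pattern, replace, 1))
end

-- Variant along the lines of Uzume's suggestion: `and` instead of
-- `or`, returning the replacement text instead of the source.
local function replace_both_empty(source_str, pattern, replace)
    if source_str == "" and pattern == "" then
        return replace
    end
    return (source_str:gsub(pattern, replace, 1))
end

-- The third example from the bug report: empty source, pattern ^.*$
print(replace_current("", "^.*$", "Hello") == "")
print(replace_pattern_guard("", "^.*$", "Hello"))
print(replace_both_empty("", "^.*$", "Hello"))

-- The variants differ when both source and pattern are empty.
print(replace_pattern_guard("", "", "Hello") == "")
print(replace_both_empty("", "", "Hello"))
```

Note that the third variant lets an empty pattern reach <code>gsub</code> when the source is not empty, where it matches at the start of the string; whether that is acceptable is part of the open question about templates relying on the current behaviour.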