Wikipedia:Bots/Requests for approval/ScannerBot: Difference between revisions
*:::<code>{{red|<nowiki>https://www.webcitation.org/6d0sXMyOT?url=https://twitter.com</nowiki>}}</code> [[User:0xDeadbeef|<span style="font-family:Fira Mono,Courier New,monospace">0x<span style="text-transform:uppercase">Deadbeef</span></span>]] <span style="font-family: serif">([[User talk:0xDeadbeef|T]] [[Special:Contributions/0xDeadbeef|C]])</span> 04:03, 15 May 2022 (UTC)
*::::Right, I can't say what the regex would be. One method is to match every string "/https?://twitter" and convert it to "__hidestring__" (same with "?url="), then convert those hidden strings back before saving the article. The "__hidestring__" might be "__hidestring-fs-http__" or "__hidestring-fs-https__" so you know how to revert it. Better still, save the literal string in a table and use the table identifier as the hidden string, so it can be restored. That way it can match on "/https?://(([a-zA-Z0-9]|[a-zA-Z0-9][a-zA-Z0-9\\-]*[a-zA-Z0-9])[.])*twitter", which will capture all hostnames, such as "/http://beta.twitter" -- [[User:GreenC|<span style="color: #006A4E;">'''Green'''</span>]][[User talk:GreenC|<span style="color: #093;">'''C'''</span>]] 17:33, 15 May 2022 (UTC)
*:::::Okay, I used a negative lookbehind; you can look at the tests here: https://regexr.com/6lmgl [[User:0xDeadbeef|<span style="font-family:Fira Mono,Courier New,monospace">0x<span style="text-transform:uppercase">Deadbeef</span></span>]] <span style="font-family: serif">([[User talk:0xDeadbeef|T]] [[Special:Contributions/0xDeadbeef|C]])</span> 23:18, 15 May 2022 (UTC)
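
For readers following the thread, the two approaches discussed above (hiding embedded URLs behind table-backed placeholders, and a negative lookbehind) might be sketched roughly as follows. This is only an illustration under stated assumptions, not ScannerBot's actual code: the http-to-https rewrite stands in for whatever edit the bot really makes, the sample URLs are invented, and every name here (`upgrade_with_masking`, `upgrade_with_lookbehind`, the `__hidestring-N__` numbering) is hypothetical. It also assumes the article text does not already contain a literal `__hidestring-N__`.

```python
import re

# Hostname part taken from the pattern suggested in the discussion:
# zero or more dot-terminated labels, then "twitter".
HOST = r"(?:(?:[a-zA-Z0-9]|[a-zA-Z0-9][a-zA-Z0-9\-]*[a-zA-Z0-9])\.)*twitter"

# A twitter URL embedded in another URL: preceded by "/" or "?url=".
EMBEDDED = re.compile(r"(?:/|\?url=)https?://" + HOST)

# Hypothetical stand-in for the bot's edit: upgrade http:// to https://.
NAIVE = re.compile(r"http://((?:[a-zA-Z0-9-]+\.)*twitter\.com)")

# Same edit, but refusing to match right after "/" or "?url=".
# Python's re needs fixed-width lookbehinds, so two are chained.
GUARDED = re.compile(
    r"(?<!/)(?<!\?url=)http://((?:[a-zA-Z0-9-]+\.)*twitter\.com)"
)


def mask_embedded(text):
    """Replace each embedded URL prefix with a placeholder keyed by table index."""
    table = []

    def hide(match):
        table.append(match.group(0))  # keep the literal string for restoring
        return f"__hidestring-{len(table) - 1}__"

    return EMBEDDED.sub(hide, text), table


def restore(text, table):
    """Put the saved literal strings back in place of their placeholders."""
    return re.sub(
        r"__hidestring-(\d+)__", lambda m: table[int(m.group(1))], text
    )


def upgrade_with_masking(text):
    masked, table = mask_embedded(text)
    edited = NAIVE.sub(r"https://\1", masked)  # embedded URLs are hidden here
    return restore(edited, table)


def upgrade_with_lookbehind(text):
    return GUARDED.sub(r"https://\1", text)


# Invented sample input: two embedded URLs and one direct link.
sample = (
    "https://web.archive.org/web/2015/http://beta.twitter.com/a "
    "http://twitter.com/b "
    "https://www.webcitation.org/x?url=http://twitter.com/c"
)
print(upgrade_with_masking(sample))
print(upgrade_with_lookbehind(sample))
```

On this sample both functions change only the direct link `http://twitter.com/b` and leave the two archive URLs untouched. The lookbehind version is shorter, while the table-backed masking does not depend on lookbehind support in the regex engine.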