Revision as of 04:51, 16 March 2023 by Graham87 (→See also: rm line break to make proper HTML list for screen readers)
Latest revision as of 00:09, 13 November 2024 by Jonesey95 (Fix high-priority Linter errors. I hope you don't mind this minor cleanup edit in your user space.)
Line 32:
Sometimes the semicolon is erroneously omitted. The bot attempts to detect this and suggests a repair, subject to manual approval by the bot operator. However, some entities are not checked for missing semicolons because they would cause too many false positives due to URLs of the form <nowiki>http://xxxxx.yyy?aaaa=....</nowiki>'''&bbbb'''<nowiki>=...</nowiki>. For example, <kbd>&sect</kbd> is not checked because it is rare and many such URLs may include variables such as "<kbd>&section=</kbd>".

==== Numeric character references ====

Line 105:
More generally, <nowiki>[[A|B]]</nowiki> is simplified to <nowiki>[[B]]</nowiki> if A and B differ only trivially (that is, ignoring the case of the first letter and any leading or trailing blanks). If A and B cannot be simplified, any leading and trailing blanks in the "A" part of <nowiki>[[A|B]]</nowiki> are removed; however, they are not removed in the "B" part of <nowiki>[[A|B]]</nowiki> or <nowiki>[[B]]</nowiki> (because we could have, for instance, <kbd>text text<nowiki>[[ link]]</nowiki> text</kbd>).

A flag can be set to do further link simplification when certain conditions are fulfilled (such as the page at "A" being a redirect to B). This functionality is described at the [[User:Curpsbot-unicodify/redirects|/redirects]] sub-page. '''This is still in development''' and is currently turned off.

Line 173:
Numeric character references (&#<num>;) or character entity references (&<name>;) are not converted when they represent ASCII characters (e.g., &#39; &amp; &gt; &lt; &quot;).
This is because such usage may be intended to avoid being considered wiki markup: for instance [http://en.wikipedia.org/w/index.php?title=Battle_of_Calabria&oldid=21409945]:
:<kbd><nowiki>''Warspite''</nowiki>&#39;s 381 mm rounds</kbd>
where <kbd>&#39;</kbd> is used instead of <kbd><nowiki><nowiki>'</nowiki></nowiki></kbd>, to display:
:''Warspite'''s 381 mm rounds

Line 180:
However, printable ASCII (not [[control character]]s or [[DEL]]) is almost always converted when it occurs in the form of %NN in link page names, for instance:
:<kbd><nowiki>[[</nowiki>New_York%2C_New_York_%28song%29<nowiki>]]</nowiki></kbd> → <kbd><nowiki>[[</nowiki>New York, New York (song)<nowiki>]]</nowiki></kbd>
The exceptions are %5B ( [ ), %5D ( ] ) and %7C ( | ), which are converted to numeric character references instead because otherwise they would interfere with the <nowiki>[[ | ]]</nowiki> syntax. This is mostly hypothetical, since it's unlikely that these will ever occur in article titles.

Line 200:
==Missing semicolons==
The bot will try to detect missing final semicolons in character entity references (such as "<kbd>D&eacutej&aacute vu</kbd>" instead of "<kbd>D&eacute;j&aacute; vu</kbd>") and prompt the user on whether to repair this. This requires manual intervention because of the possibility of false positives. In particular, & occurs often in URLs, in the form:
: <kbd><nowiki>http://xxxxx.yyy?aaa=...&bbb=...&ccc=...</nowiki></kbd>.
* Some entities are not checked because they are substrings of another entity (for instance, "&sigma" is not checked because there would be a false positive with every occurrence of "&sigmaf")

Line 207:
The bot will also try to detect missing final semicolons in numeric character references (such as "<kbd>&#263</kbd>" instead of "<kbd>&#263;</kbd>", or the equivalent hexadecimal with "<kbd>&#x</kbd>") and prompt the user on whether to repair this.
In this case, the possibility of false positives is greatly reduced compared to the previous case; however, we still require manual intervention to approve the change.

==Right-to-left and bidirectional text==

Line 226:
When these numeric character references are converted to Unicode, the appearance (in the browser's editor or in the diffs [http://en.wikipedia.org/w/index.php?title=Ani_Maamin&diff=22549317&oldid=17321508]) displays as:
[[<span dir="ltr">he:אני מאמין (פיוט)</span>]]
when really it should display as:
[[he:<span dir="rtl">אני מאמין (פיוט)</span>]]

Note that this is only a display issue: the actual underlying Unicode characters are all in proper sequence and the [[:he:אני מאמין (פיוט)|Hebrew interwiki link itself]] works fine and takes you to the correct page. The issue is that the browser display can't decide whether the final closing parenthesis should attach to the preceding Hebrew letter ("ט") and display as "(" as a right-to-left closing parenthesis, or whether it should attach to the following ASCII character "]" and display as ")" as a left-to-right closing parenthesis.

When embedded within article text, like this: אני מאמין (פיוט), there may also be display issues, but in this case it is sufficient to enclose the text within <span dir="rtl"> … </span> to make it display properly: <span dir="rtl">אני מאמין (פיוט)</span>.

In the case of Arabic or Hebrew interwiki links, I'll usually go ahead and manually approve the change: the convenience to Arabic- or Hebrew-speaking editors of being able to actually read the interwiki link (instead of dealing with &# soup) outweighs the single misplaced parenthesis. Other cases are handled on a case-by-case basis. In some especially complicated cases of embedding (for example [[Template:User ar-1]]) it will be preferable to leave the numeric character references rather than convert to Unicode.
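
As a rough illustration of the right-to-left handling described above, here is a minimal Python sketch. It is not the bot's actual code. It shows one way a tool might spot text containing right-to-left characters, either to flag a conversion for manual review or to wrap text embedded in an article in a <span dir="rtl"> element. The names <code>contains_rtl</code> and <code>wrap_rtl</code> are hypothetical; the only library call used is the standard <code>unicodedata.bidirectional</code>.

```python
import unicodedata

# Bidi classes of strong right-to-left characters:
# "R" covers Hebrew letters, "AL" covers Arabic letters.
RTL_CLASSES = {"R", "AL"}


def contains_rtl(text: str) -> bool:
    """Return True if any character has a strong right-to-left bidi class."""
    return any(unicodedata.bidirectional(ch) in RTL_CLASSES for ch in text)


def wrap_rtl(text: str) -> str:
    """Wrap text in <span dir="rtl"> when it contains right-to-left characters.

    Hypothetical helper: text without such characters is returned unchanged.
    """
    if contains_rtl(text):
        return f'<span dir="rtl">{text}</span>'
    return text


if __name__ == "__main__":
    # A Hebrew interwiki target would be flagged; a plain ASCII title would not.
    print(contains_rtl("he:אני מאמין (פיוט)"))
    print(wrap_rtl("New York, New York (song)"))
```

A check like <code>contains_rtl</code> only decides whether a human should look at the result; it does not resolve where the trailing parenthesis attaches, which is why manual approval remains part of the process.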

Curpsbot-unicodify