Wikipedia talk:WikiProject JavaScript: Difference between revisions
: {{ping|Nathanm mn}} We still haven't figured it out. The problem I'm trying to solve is how to identify when a list item has a child. A child list item has one more asterisk at the beginning than its parent. So I set up a capturing group for the asterisks at the beginning of the parent (so $1 would be the back reference), and then tried to match that number of asterisks plus one more in the child (using $1\*). But it isn't working, and I am stuck. Simply getting rid of all children isn't what I'm after: an entry is removed only if it fails every one of several tests, and otherwise I wish to keep it. We already know the entries are redlinked, because the first half of the program puts all redlinks into an array, which we process in the second half. For each redlink in the array, a nested if structure first checks whether the redlink is the topic of a list entry. If it is, we check whether it lacks a colon annotation. If it has no colon separator, we check whether it lacks an en dash annotation. If it has no en dash separator, we check whether it has no children. If it has no child, we delete it from the wiki source, modifying the actual article itself.
: Once all redlinked entries that fail our tests are removed, the rest of the program mops up, deleting red category links and delinking all redlinks that still remain. Because of the extensive filtering we just subjected them to, we know that these are all embedded redlinks, the content of which we want to keep. I'll make a sample below that presents examples of the data instances to be processed. [[User talk:The Transhumanist|<i>The Transhumanist</i>]] 22:12, 6 May 2017 (UTC)
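The child test described above can be sketched as a single pattern. This is a minimal illustration, not the script's actual code: the function name <code>hasChild</code> and its parameters are hypothetical, and it assumes each entry has the form "asterisks, optional spaces, <code>[[Title]]</code>". Inside a JavaScript pattern the back reference is written <code>\1</code>.

```javascript
// Hypothetical helper: returns true when the list item whose topic is `title`
// is immediately followed by a line that has at least one more asterisk.
function hasChild(wikitext, title) {
  // Escape regex metacharacters so the title is matched literally.
  const escaped = title.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  // (\*+) captures the parent's asterisks; \1\* at the start of the next
  // line requires the same run of asterisks plus one more.
  const pattern = new RegExp(
    "^(\\*+) *\\[\\[" + escaped + "\\]\\].*\\n\\1\\*",
    "m" // multiline: ^ matches at the start of every line
  );
  return pattern.test(wikitext);
}
```

Because the pattern only looks at the line directly after the parent, it treats any deeper item on that next line as a child, which is the case the discussion above is concerned with.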
=== What the script is supposed to do ===
Here is a sample item list:
* [[Geography]]
* [[Geology]] – this text is an annotation. And here is an embedded [[redlink 1]]. After all the end node (dead end) redlinked entries are removed, this redlink will be delinked.
* [[Redlink 2]]
* [[Redlink 3]]
** [[Redlink 4]]
** [[Psychology]]
* [[Redlink 5]]: this text is also an annotation. So I want to keep this entry.
* [[Redlink 6]]
** [[Redlink 7]]
*** [[Redlink 8]]
**** [[Redlink 9]] – this annotation will prevent this entry and all its parents from being deleted. They will however be delinked after list item removal and redlinked category removal are completed.
* [[Redlink 10]]
** [[Redlink 11]]
** [[Redlink 12]]
*** [[Redlink 13]]
**** [[Redlink 14]]
** [[Redlink 15]]
What we want to do is remove the list entries whose topic is a redlink, but which have no annotation and no children. Then we delete redlinked categories, and delink whatever redlinks are left over. Those will by definition be embedded, such as redlink 1 and Redlink 3. Redlink 3 is embedded by virtue of having children.
Redlink 2 is a dead end. It is an end node in the tree structure that contains only a redlink. It gets deleted.
The script goes through the list multiple times, until it no longer finds dead end redlinks. This is because when it removes a redlinked end node, that may cause its redlinked parent to become a dead end node (such as when it has no other children). Multiple iterations catch these. So the entire branch starting with Redlink 10 will be deleted.
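The repeated passes can be sketched line by line instead of with one large regex. This is an illustration under stated assumptions, not the script itself: <code>parseItem</code> and <code>pruneDeadEnds</code> are hypothetical names, the redlink titles are assumed to be supplied as an array, and any text after the closing <code>]]</code> is treated as an annotation (a simplification of the colon and en dash tests).

```javascript
// Parse one list line into { depth, title, annotated }, or return null if
// the line is not a plain "* [[Title]] ..." list item.
function parseItem(line) {
  const m = /^(\*+)\s*\[\[([^\]|]+)\]\]\s*(.*)$/.exec(line);
  if (!m) return null;
  return { depth: m[1].length, title: m[2], annotated: m[3].length > 0 };
}

// Hypothetical helper: repeatedly remove redlinked entries that have no
// annotation and no child, until a full pass removes nothing.
function pruneDeadEnds(wikitext, redlinks) {
  const red = new Set(redlinks);
  let lines = wikitext.split("\n");
  let removed = true;
  while (removed) {
    removed = false;
    const kept = [];
    for (let i = 0; i < lines.length; i++) {
      const item = parseItem(lines[i]);
      const next = i + 1 < lines.length ? parseItem(lines[i + 1]) : null;
      // A child is a following item with more asterisks than this one.
      const hasChild = item !== null && next !== null && next.depth > item.depth;
      const deadEnd =
        item !== null && red.has(item.title) && !item.annotated && !hasChild;
      if (deadEnd) {
        removed = true; // something changed, so another pass is needed
      } else {
        kept.push(lines[i]);
      }
    }
    lines = kept;
  }
  return lines.join("\n");
}
```

Within one pass a parent is compared against the list as it stood at the start of that pass, so a parent whose last child was just removed is only caught on the following pass. That is why the loop runs until a pass removes nothing.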
Here is the problem I've run into: the script currently deletes the Redlink 3 list item in error, because $1\n doesn't seem to be identifying the Redlink 4 list item as having more asterisks in the wikisource than the Redlink 3 list item. I do not know why.
All this processing is to be done in the editor, so that the redlinked entries are actually removed from the article. [[User talk:The Transhumanist|<i>The Transhumanist</i>]] 22:54, 6 May 2017 (UTC)
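One possible cause, assuming the pattern is a JavaScript regular expression: <code>$1</code> is a back reference only in a replacement string. Inside the pattern itself, <code>$</code> is the end-of-input (or end-of-line) anchor and <code>1</code> is a literal digit, so the back reference has to be written <code>\1</code>. The two patterns below are illustrative and are not taken from the script.

```javascript
const text = "* [[Redlink 3]]\n** [[Redlink 4]]";

// "$1" here means "end of line, then the digit 1", which can never match.
const withDollar = /^(\*+) .*\n$1\*/m;

// "\1" refers back to the asterisks captured by the first group.
const withBackslash = /^(\*+) .*\n\1\*/m;

console.log(withDollar.test(text));    // false
console.log(withBackslash.test(text)); // true
```

If the pattern is built from a string with <code>new RegExp(...)</code>, the backslash itself needs escaping, so the back reference is written <code>"\\1"</code> in the string.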
== Use of Wikipedian programmer categories ==