User talk:The Transhumanist/StripSearchSorted.js: Difference between revisions

does not yet sort
 
(37 intermediate revisions by the same user not shown)
Line 1:
{{User:The Transhumanist/Workshop boilerplate/Lead hatnote}}
: ''I'm using this page as a workspace. The talk page portion of it starts at [[#Discussions]], below.''
 
 
= Script's workshop =
Line 10:
== Description / instruction manual ==
 
: '''''This script is operational''''', ''but there is a quirk in wikEd: When results are copied/pasted into wikEd, the results are erroneously double spaced. Clicking on undo in wikEd reverts it to single spaced as initially intended. (I don't know why. If you do, please tell me.)''
: ''This script is partially operational; it doesn't sort the results yet.''
 
[[User:The Transhumanist/StripSearchSorted.js|StripSearchSorted.js]]: provides a menu item to strip search results down to bare page names, sort them alphabetically, and add bullet list wikicode formatting for easy copying and pasting into articles. The menu item is a toggle switch that turns this function on and off, and remembers its status for all searches. By default, just by being installed, the script removes from search results the redirected entries and members of matching categories (as they don't match the search string), even if you don't use the menu item. For Vector skin only.
 
In other words, when the menu item is turned on, this script reduces the search results to a list of links. It strips out the data between the page names, including that annoying "from redirect" note. It sorts the entries and adds <code>* [[]]</code> to each one, so they look like this:
 
<nowiki>* [[</nowiki>[[Brad Pitt]]<nowiki>]]</nowiki><br />
<nowiki>* [[</nowiki>[[Clint Eastwood]]<nowiki>]]</nowiki><br />
<nowiki>* [[</nowiki>[[Dwayne Johnson]]<nowiki>]]</nowiki><br />
<nowiki>* [[</nowiki>[[John Wayne]]<nowiki>]]</nowiki><br />
<nowiki>* [[</nowiki>[[Tom Cruise]]<nowiki>]]</nowiki><br />
 
The formatting makes it easier to copy and paste the links from search results into articles.
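
The sort-and-format step can be sketched in plain JavaScript. This is only an illustration, not the script's actual code: <code>toBulletList</code> is a hypothetical helper, and it assumes the page names have already been collected into an array of strings.

<syntaxhighlight lang="javascript">
// Hypothetical helper (not part of StripSearchSorted.js):
// turns an array of page names into sorted bullet list wikicode.
function toBulletList( titles ) {
    return titles
        .slice()   // work on a copy, so the caller's array is left untouched
        .sort()    // default sort: UTF-16 code unit order, so it is case-sensitive
        .map( function ( title ) {
            return '* [[' + title + ']]';
        } )
        .join( '\n' );
}

console.log( toBulletList( [ 'John Wayne', 'Clint Eastwood', 'Brad Pitt' ] ) );
// * [[Brad Pitt]]
// * [[Clint Eastwood]]
// * [[John Wayne]]
</syntaxhighlight>

Note that the default <code>.sort()</code> puts all uppercase letters before lowercase ones, which is usually fine for page names because they start with a capital letter.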
 
Once installed, the menu item "SR sort" will appear in the side bar tools menu, specifying what action it is ready to perform (either "turn on" or "turn off").
Once installed, the script automatically processes your Wikipedia search results.
 
{{User:The Transhumanist/Workshop boilerplate/Install}}
To install, add this line to your vector.js page:
 
<syntaxhighlight lang="javascript">
importScript("User:The Transhumanist/StripSearchSorted.js");
</syntaxhighlight>
 
If you want the detail back in your search results, remove that line, or comment it out by placing two forward slashes (//) at the beginning of it.
 
=== Known issues ===
 
Quirk in wikEd: When results are copied/pasted into wikEd, the results are erroneously double spaced. Clicking on undo in wikEd reverts it to single spaced as initially intended. (I don't know why. If you do, please tell me.)
 
{{User:The Transhumanist/Workshop boilerplate/Explanatory notes}}<!--includes h2 heading-->
 
This section explains the source code, in detail.
 
You can only use so many comments in the source code before you start to choke or bury the programming itself. So, I've put short summaries in the source code, and have provided in-depth explanations here. My intention is twofold:
# to thoroughly document the script so that even relatively new JavaScript beginners can understand what it does.
# to refresh my memory of exactly how the script works, in case I don't look at the source code (or any JavaScript) for weeks or months.
 
The explanatory notes include examples, and links to relevant documentation pages, tutorials, etc.
 
In addition to some standard [[JavaScript]] code, this script also relies heavily on the [[jQuery]] library.
 
If you have any questions, feel free to ask me ''[[#Discussions|at the bottom of this page under '''Discussions''']]''. Trying to answer them will help me learn JavaScript better.
 
=== General approach ===
 
The script uses the jQuery method .hide() to strip out elements by class name. Here's an example of stripping out elements with the class name "searchalttitle":
 
<syntaxhighlight lang="javascript">
$( ".searchalttitle" ).hide();
</syntaxhighlight>
 
Learn about methods at https://www.w3schools.com/js/js_object_methods.asp
Line 58 ⟶ 46:
Learn about .hide at http://api.jquery.com/hide/
 
{{User:The Transhumanist/Workshop boilerplate/Aliases}}
 
{{User:The Transhumanist/Workshop boilerplate/Bodyguard function}}
 
{{User:The Transhumanist/Workshop boilerplate/The ready event listener-handler}}
 
=== aliases ===
 
An alias is one string defined to mean another. Another term for "alias" is "shortcut". In the script, the following aliases are used:
 
<code>$</code> is the alias for jQuery
 
<code>mw</code> is the alias for mediaWiki
 
These two aliases are set up like this:
 
<syntaxhighlight lang="javascript">
( function ( mw, $ ) {}( mediaWiki, jQuery ) );
</syntaxhighlight>
 
That is a "bodyguard function", and is explained in the section below...
 
=== Bodyguard function ===
 
The bodyguard function assigns an alias for a name within the function, and reserves that alias for that purpose only. For example, you could use one to make "t" mean only "transhumanist" inside the function.
 
Since the script uses jQuery, we want to defend jQuery's alias, the "$". The bodyguard function makes it so that "$" means only "jQuery" inside the function, even if it means something else outside the function. That is, it prevents other JavaScript libraries from overwriting the $() shortcut for jQuery. It does this via [[scoping]].
 
The bodyguard function is used like a wrapper, with the alias-containing source code inside it. Here's what a jQuery bodyguard function looks like:
 
<syntaxhighlight lang="javascript">
( function( $ ) {
    // you put the body of the script here
} )( jQuery );
</syntaxhighlight>
 
''See also: [http://stackoverflow.com/questions/8666467/how-to-declare-this-function-in-jquery-document-ready-is-not-working bodyguard function solution].''
 
To extend that to lock in "mw" to mean "mediaWiki", use the following (this is what the script uses):
 
<syntaxhighlight lang="javascript">
( function( mw, $ ) {
    // you put the body of the script here
} )( mediaWiki, jQuery );
</syntaxhighlight>
 
''For the best explanation of the bodyguard function I've found so far, see: [http://codeimpossible.com/2010/01/13/solving-document-ready-is-not-a-function-and-other-problems/ Solving "$(document).ready is not a function" and other problems]''
 
=== The ready() event listener/handler ===
 
The ready() event listener/handler makes the rest of the script wait until the page (and its [[Document Object Model|DOM]]) is loaded and ready to be worked on. If the script tries to do its thing before the page is loaded, there won't be anywhere for it to place the menu item (mw.util.addPortletLink), and the script will fail.
 
In jQuery, it looks like this: [http://learn.jquery.com/using-jquery-core/document-ready/ <code>$( document ).ready( function() {} );</code>]
 
The part of the script that is being made to wait goes inside the curly brackets. But you would generally start that on the next line, and put the ending curly bracket, closing parenthesis, and semicolon on a line of their own, like this (<code>$( function() {} );</code> is shorthand for the same thing):
 
<syntaxhighlight lang="javascript">
$( function() {
    // Body of function (or even the rest of the script) goes here, such as a click handler.
} );
</syntaxhighlight>
 
''This is all explained further at [https://api.jquery.com/ready/ the jQuery page for <code>.ready()</code>]''
 
For the plain vanilla version see: http://docs.jquery.com/Tutorials:Introducing_$(document).ready()
 
=== Activation filters ===
 
I didn't know what else to call these. I wanted the program to work only when intended, and only on intended pages (search result pages). So, I applied the [https://developer.mozilla.org/en-US/docs/Learn/JavaScript/Building_blocks/conditionals conditional, '''''if'''''], as follows...
 
==== Vector skin activation filter ====
 
I use the Vector skin, and haven't tested the script on any other skin, so the script basically says "''if'' the Vector skin is in use, do what's between the curly brackets" (which includes the rest of the main program). Note that functions, aka subroutines, follow after the main program.
 
<syntaxhighlight lang="javascript">
// Only activate on Vector skin
if ( mw.config.get( 'skin' ) === 'vector' ) {
    // The rest of the script goes here
}
</syntaxhighlight>
 
===== <div style="font-size:90%">mw.config.get ( 'skin' )</div> =====
 
This looks up the value for skin (the internal name of the currently used skin) saved in MediaWiki's configuration file.
 
* [https://www.w3schools.com/jquery/ajax_get.asp jQuery get() Method]
* [https://www.mediawiki.org/wiki/Manual:Interface/JavaScript#mw.config mw.config]
 
===== <div style="font-size:90%">logical operators</div> =====
 
"<code>===</code>" means "equal value and equal type"
 
* [https://www.w3schools.com/js/js_comparisons.asp JavaScript Comparison and Logical Operators]
 
==== Page title activation filter ====
 
<syntaxhighlight lang="javascript">
// Run this script only if " - Search results - Wikipedia" is in the page title
if (document.title.indexOf(" - Search results - Wikipedia") != -1) {
    // The rest of the script goes here
}
</syntaxhighlight>
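
The two activation filters can also be read as a single yes/no question. Here is a minimal sketch of that idea, assuming a hypothetical helper named <code>shouldActivate</code> (it is not in the script). The skin name and page title are passed in as arguments, so the logic can be tried outside of a wiki page:

<syntaxhighlight lang="javascript">
// Hypothetical helper (not part of StripSearchSorted.js).
// Returns true only when both activation filters pass.
function shouldActivate( skin, pageTitle ) {
    var isVector = skin === 'vector';
    var isSearchPage = pageTitle.indexOf( ' - Search results - Wikipedia' ) !== -1;
    return isVector && isSearchPage;
}

// On a wiki page, the call might look like this:
// shouldActivate( mw.config.get( 'skin' ), document.title );

console.log( shouldActivate( 'vector', 'intitle:"of Boston" - Search results - Wikipedia' ) );   // true
console.log( shouldActivate( 'monobook', 'intitle:"of Boston" - Search results - Wikipedia' ) ); // false
console.log( shouldActivate( 'vector', 'Boston - Wikipedia' ) );                                 // false
</syntaxhighlight>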
 
Line 196 ⟶ 153:
* 2017-10-27
** Forked script from a copy of [[User:The Transhumanist/StripSearchInWikicode.js]], and forked this workshop from a copy of [[User talk:The Transhumanist/StripSearchInWikicode.js]].
* 2017-11-05
** Evad37 provided sequence for sorting the search results
* 2017-11-09
** Add toggle switch (dual menu item)
** Apply class of "Stripped" to the modified results, so that they can be removed to make way for original results
** Make switch swap out results between original and modified, and vice versa
* 2018-01-20
** Added TrueMatch function (intitle bug workaround)
*** Evad37 provided the 2 key lines
 
== Task list ==
 
=== Bug reports ===
 
=== Desired/completed features ===
 
: ''Completed features are marked with {{done}}''
 
Improvements that would be nice:
 
* '''True Match''' (built-in intitle fix) &ndash; intitle doesn't work right in that it ignores common words, and so results turn up without the specified search term. This feature would discard all the results that don't match the search term (which the search feature should have done in the first place). (And since it'll all be in an array, anyways, this should be easy to implement).
 
== Development notes ==
 
=== Implementing True Match ===
 
Run the function if Title includes "intitle:"
 
Parse the Title with regex to get the intitle string. The string may be a single word or a phrase within double quotation marks. Use the regex alternation operator (the pipe character) to match either form.
 
Then keep only the search results that include that string. One way to do this is use a regex to inverse match via negative look-arounds. <code><nowiki>^((?!hi there).*)$</nowiki></code> will match any line not containing "hi there". Those are the lines we want to remove.
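
A minimal sketch of these steps in plain JavaScript follows. The function names <code>extractIntitle</code> and <code>trueMatch</code> are made up for illustration, and the sketch assumes the result titles are already in an array of strings. It keeps the matching titles directly with <code>.filter()</code>, which avoids the need for a negative look-around:

<syntaxhighlight lang="javascript">
// Hypothetical helper: pull the intitle search string out of the page title.
// Returns null if the page title has no "intitle:" term.
function extractIntitle( pageTitle ) {
    // Group 1: a phrase in double quotes; group 2: a single unquoted word.
    // The pipe means "or".
    var match = /intitle:(?:"([^"]+)"|(\S+))/i.exec( pageTitle );
    if ( match === null ) {
        return null;
    }
    return match[ 1 ] !== undefined ? match[ 1 ] : match[ 2 ];
}

// Hypothetical helper: keep only the titles that contain the intitle string.
// If there is no intitle term, the titles are returned unchanged.
function trueMatch( titles, pageTitle ) {
    var term = extractIntitle( pageTitle );
    if ( term === null ) {
        return titles.slice();
    }
    return titles.filter( function ( title ) {
        return title.indexOf( term ) !== -1;   // case-sensitive match
    } );
}

console.log( extractIntitle( 'intitle:"of Boston" - Search results - Wikipedia' ) ); // of Boston
console.log( extractIntitle( 'intitle:Boston - Search results - Wikipedia' ) );      // Boston
console.log( trueMatch(
    [ 'History of Boston', 'Boston', 'Boston Tea Party' ],
    'intitle:"of Boston" - Search results - Wikipedia'
) ); // [ 'History of Boston' ]
</syntaxhighlight>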
 
See annotationToggle for how to wrap entries in classed span tags, and then hide those spans. But do it with jQuery instead.
 
=== Adding the wikicode ===
Line 544 ⟶ 527:
:: Thank you for the guidance. How would you "rebuild the links from the array"? ''[[User talk:The Transhumanist|The Transhumanist]]'' 02:19, 27 October 2017 (UTC)
:::You can make links from page titles using code like I've got in [[User:Evad37/extra.js]]'s makeLink function. But in your case you need to also surround the link with <code>*[[</code> and <code>]]</code>, and have the whole thing within a block tag like &lt;div> or &lt;p>. Do that for each item in the array, and then you can add them all to (or next to) an element on the page using a jQuery method like .before(), .after(), .prepend(), or .append(), each of which can take an array as the input. - <u>'''[[User:Evad37|Evad]]''37'''''</u>&nbsp;<span style="font-size:95%;">&#91;[[d:w:User talk:Evad37|talk]]]</span> 02:40, 27 October 2017 (UTC)
 
== Adding a filter to StripSearchSorted.js ==
 
:''Originally posted to Evad37's talk page:''
 
There's a really annoying design flaw in WP's search's intitle feature. Common words like "of" are ignored, even though they are included within a quoted phrase. So, intitle:"of Boston" is interpreted as just intitle:Boston. And the search results are filled with non-matching results. To make matters worse, the search results include matches of the phrase in the contents of pages, watering the results down even more to include pages that don't even have "Boston" in the title. What I need is for results to strictly match the term provided after "intitle:".
 
For StripSearchSorted.js, you wrote a long sequence of chained methods (which I modified ever so slightly):
 
<syntaxhighlight lang="javascript">
// Replace the search results by hiding the original results and use .after to insert a modified version of those results
$('ul.mw-search-results').hide().after(
$('<div id="Stripped"></div>').append(
$('ul.mw-search-results')
.children()
.map( function() {
return $(this).find('a').text();
})
.get()
.sort()
.map( function(title) {
return $('<div></div>').append(
'* [[',
$('<a>').attr({
'href':'https://en.wikipedia.org/wiki/'+mw.util.wikiUrlencode(title),
'target':'_blank'
}).text(title),
']]'
);
})
)
);
</syntaxhighlight>
 
Is it possible to continue adding to this chain in order to filter the array down to elements that only include the intitle search string?
 
Assume we've put the search string into a variable, say <code>var intitleString;</code>
 
After the closing parenthesis (included below), the .filter chain continuation might look something like this:
 
<syntaxhighlight lang="javascript">
).filter(function () { return this.
})
</syntaxhighlight>
 
The problem is, I don't know what to put after "this." to match intitleString. I know regex generally speaking, but I don't know how to include it in a chain, or how to match a variable with it.
 
By the way, would this nuke the array if intitle wasn't specified in the search? Can an if control structure be put in a chain? (Like: If "intitle" is in the title, do this). ''[[User talk:The Transhumanist|The Transhumanist]]'' 12:05, 7 December 2017 (UTC)
:Filtering is possible, but it's easier to do the filtering before the .map(), because at that stage you have a plain array of strings (each of which is a title), rather than an array of jQuery objects (which you have to drill down into to get the title string). When filtering on a plain javascript array, don't use <code>this</code> (that only works with jQuery objects) – the basic syntax is
<syntaxhighlight lang="javascript">
newArray = oldArray.filter(function(arrayElement) {
// do stuff, and return true (or a truthy value) to keep the array element,
// or return false (or a falsey value) to remove the array element
});
</syntaxhighlight>
:To check if a string contains a test string, you can do <code>''string''.indexOf(''testString'')</code>, which returns -1 if not found, or the index at which it is found. To convert that to a true/false value, you just do <code>''string''.indexOf(''testString'') !== -1</code>. That's for case-sensitive results, and doesn't care about word boundaries. To do more advanced matching, you have to make a regex object, and then test for a match using <code>''regex''.test(''string'')</code>, which returns true or false accordingly.
:So putting it all together, elsewhere in your code you make your regex pattern <code>var intitlePatt</code>, then
<syntaxhighlight lang="javascript">
// ... same as code block above ...
.get()
.filter(function(title) {
return intitlePatt.test(title);
})
.sort()
// ... same as code block above ...
</syntaxhighlight>
:To stop things blowing up, you just have to make sure everything passes the filter when there's no intitle: in the search, i.e. set <code>intitleString = '<nowiki></nowiki>'</code> or <code>intitlePatt = /./</code> for that case. You can't really have control structures in a chain – you would have to store the intermediate value of the chain in a variable, then put in the control structure, and resume the chain from the intermediate variable. Like
<syntaxhighlight lang="javascript">
var intermediateFoo = $(foo).bar().baz();
if (condition1) {
    intermediateFoo.qux().foobar().barbaz();
} else {
    intermediateFoo.barbaz();
}
}
</syntaxhighlight>
:- <u>'''[[User:Evad37|Evad]]''37'''''</u>&nbsp;<span style="font-size:95%;">&#91;[[d:w:User talk:Evad37|talk]]]</span> 02:34, 8 December 2017 (UTC)
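:The advice above (filter the plain array of titles before sorting, and use a pattern that matches everything when there is no intitle term) might be sketched as follows. This is an illustration only: <code>escapeRegExp</code>, <code>buildIntitlePattern</code>, and <code>filterAndSort</code> are hypothetical names, and the word-boundary and case-insensitive choices are assumptions rather than what the script does.
<syntaxhighlight lang="javascript">
// Hypothetical helper: escape characters that have a special meaning in a regex,
// so the search string is matched literally.
function escapeRegExp( text ) {
    return text.replace( /[.*+?^${}()|[\]\\]/g, '\\$&' );
}

// Hypothetical helper: build the pattern used by the filter.
// With no intitle string, return an empty pattern, which matches every string.
function buildIntitlePattern( intitleString ) {
    if ( !intitleString ) {
        return /(?:)/;
    }
    // \b = word boundary; 'i' = ignore case (both are assumptions for this sketch)
    return new RegExp( '\\b' + escapeRegExp( intitleString ) + '\\b', 'i' );
}

// Filter first, then sort, as in the chain above.
function filterAndSort( titles, intitleString ) {
    var intitlePatt = buildIntitlePattern( intitleString );
    return titles
        .filter( function ( title ) {
            return intitlePatt.test( title );
        } )
        .sort();
}

var titles = [ 'History of Boston', 'Port of Bostonia', 'Boston', 'Culture of boston' ];
console.log( filterAndSort( titles, 'of Boston' ) ); // [ 'Culture of boston', 'History of Boston' ]
console.log( filterAndSort( titles, '' ).length );   // 4
</syntaxhighlight>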
 
:: So, let me see if I got this straight...
 
:: You store the search's intitle value in a variable, and if there isn't one, the variable's value would just be null.
 
:: Then, in the chain, filter out non-matching entries. If the variable has a null value, meaning that intitle wasn't included in the search, all entries would match.
 
:: Is that correct? ''[[User talk:The Transhumanist|The Transhumanist]]'' 21:50, 10 December 2017 (UTC)
:::Not quite... <code>null</code> doesn't match anything, so no entries would match. To get all entries to match, you either have to set the variable to something that does actually match any entry (<code>intitleString = '<nowiki></nowiki>'</code> or <code>intitlePatt = /./</code> depending on whether you use indexOf or regex matching inside the filter); or else have an explicit check inside the filter which will just return true if the variable is null. - <u>'''[[User:Evad37|Evad]]''37'''''</u>&nbsp;<span style="font-size:95%;">&#91;[[d:w:User talk:Evad37|talk]]]</span> 03:02, 11 December 2017 (UTC)
:::: So, there is no way to match null in regex? So you can't match null or whatever the string is, using the pipe character? ''[[User talk:The Transhumanist|The Transhumanist]]'' 04:26, 11 December 2017 (UTC)
:::::If you want to check if a variable is null or undefined, just do <code>someVar == null</code> (gives true if someVar is null/undefined, false otherwise). You can combine this with other logical tests using <code>||</code> , <code>&&</code> , <code>!</code> as usual. - <u>'''[[User:Evad37|Evad]]''37'''''</u>&nbsp;<span style="font-size:95%;">&#91;[[d:w:User talk:Evad37|talk]]]</span> 04:37, 11 December 2017 (UTC)
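:::::A minimal sketch of the "explicit check inside the filter" option, assuming a hypothetical function <code>filterByIntitle</code> and an array of title strings:
<syntaxhighlight lang="javascript">
// Hypothetical helper: keep every title when intitleString is null or undefined,
// otherwise keep only the titles that contain it.
function filterByIntitle( titles, intitleString ) {
    return titles.filter( function ( title ) {
        // "== null" is true for both null and undefined
        return intitleString == null || title.indexOf( intitleString ) !== -1;
    } );
}

var titles = [ 'Flag of Australia', 'Australia', 'Coat of arms of Australia' ];
console.log( filterByIntitle( titles, null ) );           // all three titles
console.log( filterByIntitle( titles, 'of Australia' ) ); // [ 'Flag of Australia', 'Coat of arms of Australia' ]
</syntaxhighlight>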
 
== Adding TrueMatch to StripSearchSorted ==
 
:''Originally posted to Evad37's talk page''
 
I'm in the process of trying to fix the intitle bug in Wikipedia's search, by providing the solution as a function within [[User:The Transhumanist/StripSearchSorted.js|StripSearchSorted.js]].
 
The intitle bug is that when you enter a search phrase in WP's search box with a common word (like this: intitle:"in Germany"), the titles in the search results don't match.
 
I'm almost done, but I can't figure out how to get :contains to accept a variable:
 
<syntaxhighlight lang="javascript">
function TrueMatch() {
// The purpose of this function is to filter out non-matches
 
// Activation filter:
// Run this function only if 'intitle:"' is in the page title
// Notice the lone " after intitle:
if (document.title.indexOf('intitle:"') != -1) {
 
// Body of function
// Create variable with page title
var docTitle = document.title;
 
// Display on screen for checking
//alert ( docTitle );
 
// Extract the intitle search string from the html page title
// We want the part between the quotation marks
var regexIntitle = new RegExp('intitle:"(.+?)(")(.*)','i');
var intitle;
intitle = docTitle.replace(regexIntitle,"$1");
//alert ( intitle );
// Filter out search results that do not match
$( "li" ).not( 'li:contains(" + intitle + ")' ).remove();
}
}
</syntaxhighlight>
 
It works fine up until that last line. I want to remove all li elements that do not contain the text in the intitle variable. ''[[User talk:The Transhumanist|The Transhumanist]]'' 07:51, 19 January 2018 (UTC)
:Instead of passing a single string for the selector, you need to build a string up around the variable:
:<syntaxhighlight lang="javascript">$( "li" ).not( 'li:contains("' + intitle + '")' ).remove();</syntaxhighlight>
:The <code>'li:contains("' + intitle + '")'</code> gets processed first, and the result is passed through to <code>.not()</code>. Or if you wanted to be a bit more explicit, you could do something like
:<syntaxhighlight lang="javascript">
var intitle_selector = 'li:contains("' + intitle + '")';
$( "li" ).not( intitle_selector ).remove();
</syntaxhighlight>
: - <u>'''[[User:Evad37|Evad]]''37'''''</u>&nbsp;<span style="font-size:95%;">&#91;[[d:w:User talk:Evad37|talk]]]</span> 13:59, 19 January 2018 (UTC)
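:The difference between the two versions can be seen without jQuery by printing the selector strings. In this sketch the value of <code>intitle</code> is just sample data:
<syntaxhighlight lang="javascript">
var intitle = 'of Australia';   // sample value for illustration

// Original version: everything between the single quotes is one literal string,
// so the variable is never used.
var literalSelector = 'li:contains(" + intitle + ")';

// Corrected version: three pieces joined with +, with the variable in the middle.
var builtSelector = 'li:contains("' + intitle + '")';

console.log( literalSelector ); // li:contains(" + intitle + ")
console.log( builtSelector );   // li:contains("of Australia")
</syntaxhighlight>
:Because the search string is pasted into the selector as-is, a string that itself contains a double quotation mark would produce a malformed selector. The regex in TrueMatch stops at the first closing quotation mark, so that should not come up here.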
 
:: I tried both methods you posted above. I tested it on intitle:"of Australia". The script runs, and strips out the details as it is supposed to. And it is sorting the results. But it isn't removing the non-matches. It's like it's matching everything, and therefore removing nothing. (When it matches nothing, like in my version above, it removes everything, leaving the results blank). I reactivated the alerts, and those show up fine. It's still the last line that isn't working. When you replace it with <code>$( "li" ).remove();</code>, it removes all results. ''[[User talk:The Transhumanist|The Transhumanist]]'' 02:16, 20 January 2018 (UTC)
 
:: Testing further on the "of Australia" search...
 
:: <code>$( "li").not('li:contains( "of" )').remove();</code> resulted in blank results (ie, none).
 
:: <code>$( "li").not('li:contains( of )').remove();</code> resulted in no matches (ie results unaffected).
 
:: So, it looks like the first one is matching nothing, causing all li elements to be removed, while the second one is matching everything, causing no li elements to be removed. ''[[User talk:The Transhumanist|The Transhumanist]]'' 03:11, 20 January 2018 (UTC)
 
 
:: I stared at the current page source, and discovered spans with the class "searchmatch", the contents of which appear to have been causing false matches. So, I blasted those with:
 
::<syntaxhighlight lang="javascript">
// First, strip out the searchmatch class elements (they match).
$( 'li').find( '.searchmatch').remove();
</syntaxhighlight>
 
:: Then, with the above line in place, I tested your solution again, but it didn't work:
 
::<syntaxhighlight lang="javascript">
$( "li" ).not( 'li:contains("' + intitle + '")' ).remove();
</syntaxhighlight>
 
:: The results turned up blank, which means it removed everything.
:: Ironically, doing the opposite works:
 
::<syntaxhighlight lang="javascript">
$( 'li:contains("' + intitle + '")').remove();
</syntaxhighlight>
 
:: Unfortunately, this removes precisely the entries the user wants to keep. ''[[User talk:The Transhumanist|The Transhumanist]]'' 08:26, 20 January 2018 (UTC)
:::I think we need to be more specific, and target the main link of each result - since the value of <code>intitle</code> will be somewhere within the <code>li</code>, just not necessarily in the title. Plus we can limit the searching of <code>li</code>s to just the search results, rather than the whole page:
:::<syntaxhighlight lang="javascript">
// Mark true results with a class
$('.mw-search-results').find('li').has( 'div > a:contains("' + intitle + '")' ).addClass('truematch');
// Remove other results
$('.mw-search-results').find('li').not('.truematch').remove();
</syntaxhighlight>
:::Which basically means: In the <code>mw-search-results</code>, find the <code>li</code>s which have a <code>div</code> that itself has (as a direct child element) an <code>a</code> that contains the text <code>intitle</code>, and add the class <code>truematch</code> to those <code>li</code>s. Then, in the <code>mw-search-results</code>, find the <code>li</code>s which do not have the class <code>truematch</code>, and remove them. - <u>'''[[User:Evad37|Evad]]''37'''''</u>&nbsp;<span style="font-size:95%;">&#91;[[d:w:User talk:Evad37|talk]]]</span> 09:25, 20 January 2018 (UTC)
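:::The reason for targeting the title link rather than the whole <code>li</code> can be illustrated with plain data. This is a sketch only: each search result is modelled as a made-up object with a <code>title</code> and a <code>snippet</code>, and the sample text is invented.
<syntaxhighlight lang="javascript">
// Made-up sample data standing in for the search result list items
var results = [
    { title: 'Flag of Australia', snippet: 'The flag of Australia is a defaced Blue Ensign.' },
    { title: 'Australia', snippet: 'The history of Australia spans many thousands of years.' }
];

// Like li:contains(...): looks at all of the text in the item
function matchesWholeItem( result, term ) {
    return ( result.title + ' ' + result.snippet ).indexOf( term ) !== -1;
}

// Like li.has( 'div > a:contains(...)' ): looks at the title link only
function matchesTitleOnly( result, term ) {
    return result.title.indexOf( term ) !== -1;
}

var term = 'of Australia';
console.log( results.filter( function ( r ) { return matchesWholeItem( r, term ); } ).length ); // 2
console.log( results.filter( function ( r ) { return matchesTitleOnly( r, term ); } ).length ); // 1
</syntaxhighlight>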
 
:::: That did the trick. It works beautifully. Thank you.
 
:::: This leads the way to the development of two related programs:
::::# StripSearchFilter.js &ndash; will allow the user to enter additional search terms to filter down the results, including a term to keep or a term to discard. Can use it multiple times to further refine the result.
::::# SearchSuite.js &ndash; will put selected features on their own switches so they can be turned on and off. Like the details stripping, and the inserted wikicode. It will also include the search filter feature mentioned above.
 
:::: I'll keep you posted. ''[[User talk:The Transhumanist|The Transhumanist]]'' 11:28, 20 January 2018 (UTC)