Talk:Merge algorithm: Difference between revisions

Browse history interactively

← Previous edit

Content deleted Content added

VisualWikitext

Add topic

Revision as of 21:48, 28 December 2005 edit Mikeblas (talk \| contribs) Administrators 82,118 edits m →about the examples ← Previous edit		Latest revision as of 03:28, 6 February 2024 edit undo Cewbot (talk \| contribs) Bots 8,373,693 edits m Maintain {{WPBS}} and vital articles: 1 WikiProject template. Create {{WPBS}}. Keep majority rating "Start" in {{WPBS}}. Remove 1 same rating as {{WPBS}} in {{WikiProject Computer science}}. Tag: Talk banner shell conversion
(42 intermediate revisions by 19 users not shown)
Line 1: {{WikiProject banner shell\|class=Start\| {{WikiProject Computer science\|importance=Low}} }} == Nomenclature issue == Line 33 ⟶ 36: Batch Processing Batch processing occurs when files are processed off-line, i.e. they are saved until some convenient time (e.g. at night) when the computer system is not otherwise being used. Updating files: the updates, i.e. the changed records, are "batched" together, i.e. saved, and sorted into key order. Then the master file is read, one record at a time, until a record with the same key as the first transaction is found. This master file record is then updated and then the process is repeated. The updating is done as follows: * For magnetic tape. Put the updates in a separate file and merge the master and the update to give a new master. * For a disk, the easiest method is to associate a pointer with each record leading to the next one (linked list). Individual records can be scrubbed and replaced by reseting pointers. In general, if only one master file is available, it is not altered, but a new master is produced by merging. Clearly in this context the new master file should not contain both the record from the old master file and the record from the update file, which would be the necessary consequence of interpreting "merge" as "produce multiset union".) Line 51 ⟶ 63: The kragen-hacks reference carries no weight in this case, because kragen-hacks is written by the same person who promulgated this questionable nomenclature in this Wikipedia article in the first place: me. If we decide to accept the term "merge" for this larger class of algorithms beyond just multiset union, it would have to be on the basis of documented terminology usage by people besides me. In my eyes, the fact that Knuth uses the term with a clearly more restricted meaning in mind creates a substantial burden of proof. -- [[User:Kragen]] ~~-- KragenSitaker~~ == Minor change to python example == Line 75 ⟶ 86: Should I add comments, and rewrite the samples to more closely match the pseudocode? [[User:Mikeblas\|Mikeblas]] 21:48, 28 December 2005 (UTC) ==Sample code== I intend to remove the sample implementations section if and when [[:Wikipedia:Articles for deletion/Insertion sort implementations]] is successful. It may be worthwhile to transwiki them to WikiBooks, WikiSource, or the Literate Programming wiki if anyone feels strongly about them. —<tt>[[User:donhalcon\|donhalcon]]</tt><sup>[[User talk:donhalcon\|╤]]</sup> 06:09, 5 March 2006 (UTC) : That AFD was successful, but there are still a couple of sample implementations. What's going on with requiring references for sample code in articles? -- [[User:Mikeblas\|Mikeblas]] 15:43, 5 September 2007 (UTC) Despite the 2006 debate, sample implementations in the form of pseudo code to explain algorithms are part of both insertion sort and selection sort articles. Many other wiki articles use pseudo code as a method to explain algorithms in general, sorting, data compression, error correction code, data encryption, extended Euclidean algorithm, ... . I don't see why merge algorithm should be singled out for not having at least some explanation of the algorithm in the form of pseudo code. [[User:Rcgldr\|Rcgldr]] ([[User talk:Rcgldr\|talk]]) 19:00, 29 November 2015 (UTC) ==BEST KNOWN MERGE ALGORITHM== I would like to know which is the fastest of knowns merge algorithms, I've read a lot... but I didn't find an answer. Trying to find an optimal sorting algorithm, I came to the conclusion that the optimal merge is a generalization of the binary search algorithm. Someone thinks that may be true, or even already known? I've writen in ADA this algorithm and if someone finds the best known merge algorithm, I will benchmark my proof of concept against. Meanwhile, I would try to compare it to clasic merge.--[[User:Azrael60\|Azrael60]] 18:55, 26 September 2007 (UTC) As a side note, I think a mention to Hwang-Lin binary merging algorithm should be done. Also the information theory limit for the worst case comparaison.--[[User:Azrael60\|Azrael60]] ([[User talk:Azrael60\|talk]]) 16:05, 13 January 2008 (UTC)salem ghnaimat ==Applications - statement about k-way sorts in memory == This section includes the statement: ''A k-way in-memory merge sort can also be constructed from a k-way merge algorithm, but his provides no benefits over the ordinary binary merge sort.'' On a processor with 16 registers, such as a PC in 64 bit mode, a 4-way merge sort using 10 or so pointers that a compiler optimizes by using registers for all those pointers will result in a 4-way merge sort being somewhat faster than a binary merge sort, and about as fast as quick sort. PC based benchmarks showed the 4-way merge sort was about 15% faster than binary merge sort. [[User:Rcgldr\|Rcgldr]] ([[User talk:Rcgldr\|talk]]) 19:19, 29 November 2015 (UTC) :I made the statement more precise by specifying that the number of comparisons is being counted. [[User:Qwertyus\|Q<small>VVERTYVS</small>]] <small>([[User talk:Qwertyus\|hm?]])</small> 19:55, 29 November 2015 (UTC) ::I would recommend just removing the statement. A 4 way merge averages 3 compares for every move, versus 2 way merge averaging 1 compare for every move, but the 4 way does 1/2 the moves. So the number of compares of 4 way is 1.5 times the number of compares for 2 way, while the number of moves for 4 way is 1/2 the number of moves for 2 way. This is probably too much detail for the main article, which is about merge algorithms as opposed to sort algorithms. [[User:Rcgldr\|Rcgldr]] ([[User talk:Rcgldr\|talk]]) 22:26, 3 December 2015 (UTC) :::Removed it. We might want to add a section about this to [[merge sort]] later. [[User:Qwertyus\|Q<small>VVERTYVS</small>]] <small>([[User talk:Qwertyus\|hm?]])</small> 09:49, 4 December 2015 (UTC) To clarify - a k-way in memory merge sort (using min-heap, recursion, or something similar to reduce k-1 compares to log2(k) compares), has the same time complexity as a 2-way merge. For n = power of k, then complexity = n logk(n) log2(k) = n log2(n), same as 2 way regardless of k. However the statement is ''no benefit'' as opposed to ''same time complexity''. Even with simple k-1 compare logic (3 compares per move), a 4 way merge sort of integers ends up being slightly faster than a 2 way merge sort, at least in the case of a PC in 64 bit mode (16 registers). The situation would probably be reversed if sorting an array of pointers to objects, since the compare overhead would be greater than the move overhead. The other issue is that a statement about the sort complexity belongs in the sort articles, instead of a merge article. [[User:Rcgldr\|Rcgldr]] ([[User talk:Rcgldr\|talk]]) 13:36, 7 December 2015 (UTC) ::What the original source cares about (and what I sloppily summarized as "no benefit") is the exact expected number of comparisons, not the big-O complexity. [[User:Qwertyus\|Q<small>VVERTYVS</small>]] <small>([[User talk:Qwertyus\|hm?]])</small> 15:22, 7 December 2015 (UTC) :::The number of moves is also a factor, but the total number of operations remains the same, k=2, n log2(n) compares + n log2(n) moves = 2 n log2(n) total operations. k=4, 1 1/2 n log2(n) compares + 1/2 n log2(n) moves = 2 n log2(n) total operations. On my system, Intel 2600K 3.4ghz, in 64 bit mode (16 registers), a 4 way merge sort of psuedo random 64 bit unsigned integers is about 12% to 15% depending on the number of integers (12% for 4 million, 15% for 64 million). The 4 way merge sort on integers is about as fast as quick sort. The relative performance is affected by the overhead of compares versus the overhead of moves. [[User:Rcgldr\|Rcgldr]] ([[User talk:Rcgldr\|talk]]) 18:28, 7 December 2015 (UTC) * Computational complexity is not about efficient use of registers on a particular machine. Furthermore, comparisons are usually not simple integer operations even though books on algorithms use that for clarity. [[User:Glrx\|Glrx]] ([[User talk:Glrx\|talk]]) 04:24, 9 December 2015 (UTC) ::As mentioned above, the issue with the removed statement was how it was worded ''no benefit'' as opposed to ''same time complexity'', and it probably belongs in sort algorithms and/or merge sort, not merge algorithms. [[User:Rcgldr\|Rcgldr]] ([[User talk:Rcgldr\|talk]]) 02:31, 12 December 2015 (UTC) == Fake algorithms == A source code for <code>merge_int_dac</code> routine (consequently, for <code>merge_int_dac_mp</code>, too), added in [https://en.wikipedia.org/w/index.php?title=Merge_algorithm&oldid=693744853 this revision] by [[Special:Contributions/Mia0303\|Mia0303]] under the [[Merge algorithm#Parallel merge]] section is incorrect. It '''would not even compile''' due to at least two major errors, let alone running and producing reasonable results! Even when fixed, it will also be dramatically ineffective. Additionally it lacks description of the parameters, so it's hard to tell how it should be invoked. One can try to guess the parameters meaning, but that would mean reinventing the algorithm rather than learning it, which strongly contradicts an encyclopedic style. --[[User:CiaPan\|CiaPan]] ([[User talk:CiaPan\|talk]]) 08:52, 9 February 2016 (UTC) :I've replaced the whole section by a summary of CLRS's parallel merge. [[User:Qwertyus\|Q<small>VVERTYVS</small>]] <small>([[User talk:Qwertyus\|hm?]])</small> 10:59, 9 February 2016 (UTC) == Merge K-way merge algorithm article into this one == The article [[K-way merge algorithm]] largely duplicates content in the K-way merging section in this article, so I think it should be merged (this is not just a pun, I promise). I also suspect that the topic is not notable enough to merit its own article. — '''[[User:Pzoxicuvybtnrm\|<span style="color:#750404;">P</span>]][[User talk:Pzoxicuvybtnrm\|<span style="color:#254117;">C</span>]][[Special:Contributions/Pzoxicuvybtnrm\|<span style="color:#062159;">B</span>]]''' 21:59, 9 December 2017 (UTC) :I agree, the two articles have far too much overlap to be separate. Possibly if this article were expanded in a different direction the topics could be separately notable, but with the current content they are not. —[[User:David Eppstein\|David Eppstein]] ([[User talk:David Eppstein\|talk]]) 22:13, 9 December 2017 (UTC) *'''Oppose''' The article is written in appropriate [[WP:SUMMARYSTYLE]], the K-way merging section summarize this article while this article gives significantly more details [[User:Trialpears\|Trialpears]] ([[User talk:Trialpears\|talk]]) 11:25, 15 May 2019 (UTC) ::Closing, given the uncontested policy-based argument. [[User:Klbrain\|Klbrain]] ([[User talk:Klbrain\|talk]]) 13:23, 23 June 2019 (UTC) == Language support - C++ - std::inplace_merge == The term in place may be mis-leading here. std::inplace_merge may use auxiliary storage to implement the merge. It can throw an exception if it can't allocate enough memory.<ref>https://en.cppreference.com/w/cpp/algorithm/inplace_merge</ref>. Microsoft / Visual Studio attempts to allocate memory, but can perform the merge even if memory allocation fails.<ref>https://docs.microsoft.com/en-us/previous-versions/awc3s8k6(v=vs.140)</ref>. {{reflist_talk}} == Possible typo == should not it be a s+1 there? (the lower line beneath): in parallel do merge(A[i...r], B[k...s], C[p...t]) merge(A[r+1...j], B[s...ℓ], C[t+1...q]) <!-- Template:Unsigned IP --><small class="autosigned">— Preceding [[Wikipedia:Signatures\|unsigned]] comment added by [[Special:Contributions/37.47.172.117\|37.47.172.117]] ([[User talk:37.47.172.117#top\|talk]]) </small> == Difference between the merge algorithm and merge sort? == What is the difference between the merge algorithm and [[merge sort]]? The article says that the merge algorithm plays a critical role in the merge sort algorithm, but according to how the article describes the merge algorithm, it seems to me that it ''is the same thing'' as merge sort. So, what is the difference between the two algorithms, if there is a difference? I think this should be clarified early on in the article. —[[User:Kri\|Kri]] ([[User talk:Kri\|talk]]) 17:52, 23 August 2023 (UTC) : {{re\|Kri}} The merge algorithm takes two sorted chains (lists, sequences, arrays, ...) of data and creates a new chain (sometimes by copying the items of source data, sometimes by interleaving them, depending on the data structure in use), containing all items of both source chains in the proper order. Merge sort takes unsorted chain of data, decomposes it into multiple chains and successively merges them to construct a finał sorted chain of all source data. --[[User:CiaPan\|CiaPan]] ([[User talk:CiaPan\|talk]]) 20:14, 23 August 2023 (UTC) ::Yes. The merge algorithm is a subroutine in merge sort, not the whole merge sort. It is also used as a subroutine in some other algorithms unrelated to sorting. —[[User:David Eppstein\|David Eppstein]] ([[User talk:David Eppstein\|talk]]) 01:14, 24 August 2023 (UTC) :::{{re\|David Eppstein}} Thank you for clarification. {{smiley}} [[User:CiaPan\|CiaPan]] ([[User talk:CiaPan\|talk]]) 10:33, 29 August 2023 (UTC) ::::Thank you for the answers; that makes sense! I have updated the text under the image in the article to reflect that. —[[User:Kri\|Kri]] ([[User talk:Kri\|talk]]) 19:42, 29 August 2023 (UTC) : {{re\|Kri}} What David said about a ''𝄪subroutine𝄪'' above applies to words ''𝄪successively merges them𝄪'' in my answer. --[[User:CiaPan\|CiaPan]] ([[User talk:CiaPan\|talk]]) 10:33, 29 August 2023 (UTC)