Talk:Maximum subarray problem: Difference between revisions

Content deleted Content added
No edit summary
syntaxhighlight & fix lint
 
(20 intermediate revisions by 11 users not shown)
Line 1:
{{WikiProject banner shell|class=|
{{WikiProject Computer science}}
}}
 
== Possibly Erroneous Initialization of s ==
Line 5 ⟶ 7:
In the O(n) algorithm, s (the largest calculated sum so far) is initialized as follows:
 
<sourcesyntaxhighlight lang="cpp">
int s=1<<31;
</syntaxhighlight>
</source>
 
First, this assumes that each integer is exactly 4 bytes wide, which is not the case for all architectures.
Line 16 ⟶ 18:
 
: I've struck out my second proposition, which would fail for sequences having exactly one element. For my first proposition, I think s is best initialized to -Infinity as (for C++)
: <sourcesyntaxhighlight lang="cpp">
int s = numeric_limits<int>::min()
</syntaxhighlight>
</source>
: or to MIN_INT in C
 
Line 32 ⟶ 34:
The correct algorithm is:
 
<sourcesyntaxhighlight lang="pascal">
Kadane_s_Algorithm(arr[1..n])
begin
Line 52 ⟶ 54:
return (max, a, b)
end
</syntaxhighlight>
</source>
 
 
Line 65 ⟶ 67:
<s>Applying the algorithm as shown in the page, I get a maximum subarray of 11, whereas the maximum subarray should be 13 (3 + 10)</s> <span style="font-size: smaller;" class="autosigned">— Preceding [[Wikipedia:Signatures|unsigned]] comment added by [[Special:Contributions/187.194.146.244|187.194.146.244]] ([[User talk:187.194.146.244|talk]]) 17:31, 16 November 2013 (UTC)</span><!-- Template:Unsigned IP --> <!--Autosigned by SineBot-->
(Scratch previous comment, the algorithm is correct)
 
The solution given in python is incorrect. Please remove the code.
 
it fails following case: [-2,1,-3,4,-1,2,1,-5,4]. Should produce 6, instead of 10.
 
 
== Clarification of maximum sum when maximum sum is < 0. ==
Line 83 ⟶ 90:
 
The divide and conquer version of the maximum subarray is notoriously O(n lg n) (see CLRM for instance as a reference.)
I have trouble understanding the solution, for instance, what sum[f - 1] and sum[f]? the only array defined clearly in the problem is a, I don't know what sum is supposed to contain. An additional explanation would be great. <!-- Template:Unsigned --><small class="autosigned">—&nbsp;Preceding [[Wikipedia:Signatures|unsigned]] comment added by [[User:Poitevinpm|Poitevinpm]] ([[User talk:Poitevinpm#top|talk]] • [[Special:Contributions/Poitevinpm|contribs]]) 21:25, 7 January 2018 (UTC)</small> <!--Autosigned by SineBot-->
:For my own part, I don't understand why other editors insist on devoting so much of the space in the article (and in textbooks like CLRS) to bad algorithms like the brute force and divide and conquer ones, when Kadane's algorithm is so simple (simpler than the divide and conquer) and in all respects better. —[[User:David Eppstein|David Eppstein]] ([[User talk:David Eppstein|talk]]) 21:47, 7 January 2018 (UTC)
:: The pseudo code for Divide and Conquer has pretty bad notation and mainly wrong complexity, it is surely O(n * log(n)). The equation at the end is wrong, time complexity can be described as T(n) = 2T(n/2) + O(n), T(1) = O(1) (see the CLRS book - Introduction to Algorithms, the algorithm is described from p. 68 in 3rd ed.). The O(n) term is coming from searching for overlapping sums in each iteration. Apart from that, I agree that Kadane's algorithm is really simple and the most efficient, so I would rather have it mentioned first. The value of other two solutions is only for showing different approaches and comparing complexities. [[User:Protivinsky|Protivinsky]] ([[User talk:Protivinsky|talk]]) 19:18, 25 February 2018 (UTC)
:::The pseudocode given in this article is clearly 2T(n/2)+O(1) (where do you think it takes more than constant time to combine solutions)? If CLRS gives an algorithm with different complexity then it is surely a different algorithm. —[[User:David Eppstein|David Eppstein]] ([[User talk:David Eppstein|talk]]) 19:46, 25 February 2018 (UTC)
::::You are right indeed, I got confused by the pseudocode and thought it is the same implementation as in CLRS (where the computation of center problem has linear cost). I didn't realize that the code uses array of cumulative sums without bothering to define it or explain it. I put together python code for the divide and conquer algorithm instead of that confusing pseudocode and I will replace it. [[User:Protivinsky|Protivinsky]] ([[User talk:Protivinsky|talk]]) 23:23, 27 February 2018 (UTC)
 
== Discussion: Pharlan35244 ==
 
{{re|Pharlan35244}}: I moved this discussion to the talk page, the proper page for things like this. [[User:Hdjensofjfnen|Hdjensofjfnen]] (If you want to [[WP:TROUT|trout me]], [[User talk:Hdjensofjfnen|go ahead]]!) 03:47, 20 September 2018 (UTC)
 
Pharlan35244 said:
 
"(actually theta(n) is a better answer because both the lower bound and the upper bound has the same runtime. By saying the runtime is , you do not clarify that Kadane's algorithm has the same lower and upper bound, thus it is not completely correct for the run time to be )"
 
== the solution for keeping track of the indices for the subarray is wrong ==
here is a working script
 
<syntaxhighlight lang="text">
max_ending_here = A[0]
 
startOld = start = end = max_so_far = 0
 
for i, x in (zip (range (1, len (A)), A[1:])):
 
if max_ending_here + x > x:
max_ending_here = max_ending_here + x
 
else:
max_ending_here = x;
start = i
 
if max_so_far < max_ending_here:
max_so_far = max_ending_here;
end = i
 
print (max_so_far , start, end)
</syntaxhighlight>
 
== Issues with the "History" section, regarding brute-force ==
 
It seems incorrect that "the brute force running time" of the one-dimensional problem is "''O''(''n''<sup>3</sup>)". This is not right.
 
Rather, the brute-force runtime is ''O''(''n''<sup>2</sup>). What, then, I wonder, did Ulf Grenander's improvement do? This is confusing.
 
I suspect that there are similar issues with the description of the two-dimensional problem's history.
 
I'm not clear on the history, or else I would edit this myself...!
:The brute force algorithm tries <math>O(n^2)</math> different subarrays and takes time <math>O(n)</math> to compute the sum for each of them, giving <math>O(n^3)</math> total. Maybe you are already thinking of optimizing it to be faster? But then it's not the most obvious and basic version of the algorithm any more. And optimizing a bad algorithm is wasted effort. —[[User:David Eppstein|David Eppstein]] ([[User talk:David Eppstein|talk]]) 04:06, 7 August 2019 (UTC)
 
== History is weird ==
 
<blockquote>
Soon after, Shamos described the one-dimensional problem and its history at a Carnegie Mellon University seminar attended by Jay Kadane, who designed within a minute an O(n)-time algorithm, which is clearly as fast as possible.
</blockquote>
 
The previous comment wonders out loud about the history section, and I have to wonder, too.
 
<blockquote>
Knuth began the project, originally conceived as a single book with twelve chapters, in 1962.
:
The first three volumes of what was then expected to be a seven-volume set were published in '''1968''', '''1969''', and '''1973'''.
:
Work began in earnest on Volume 4 in 1973, but was suspended in 1977 for work on typesetting.
</blockquote>
 
Supposedly, this anecdote takes place in 1977. By 1977, if you couldn't derive the O(1) algorithm for the one-dimensional case within 30 minutes, you were seriously in the wrong professional already. The only viable excuse would be a three martini lunch.
 
What ''really'' was going on here? Because I doubt any of these men were quite that dense upstairs.
 
For example, I suspect the guy agonizing over the divide and conquer thought he was solving a somewhat generalized version of the problem statement here, for which some of that heavy machinery would have actually been appropriate.
 
As written, this does not pass my historical smell test. &mdash; [[user:MaxEnt|MaxEnt]] 02:09, 12 May 2020 (UTC)
 
== Empty subarrays admitted vs. forbidden ==
 
In order to reduce the number of good-faith edits like [https://en.wikipedia.org/w/index.php?title=Maximum_subarray_problem&diff=1174756409&oldid=1165256313 the most recent one], what about presenting the "''No empty subarrays admitted''" variant first, and in full detail, and then just the changes needed to turn it into the "''Empty subarrays admitted''" variant? If we are lucky, most over-hasty ad-hoc page editors are familar with the "''No empty subarrays admitted''" variant, and won't change the text, since it meets their expectations. However, if we are unlucky, the favorite variants are distributed 50:50, and we'd experience lots of thoughtless edits in the opposite direction. I think, implementing my suggestion is worth a try. - [[User:Jochen Burghardt|Jochen Burghardt]] ([[User talk:Jochen Burghardt|talk]]) 14:55, 10 September 2023 (UTC)
 
:I'm skeptical that this will help but it seems harmless enough and worth a try. —[[User:David Eppstein|David Eppstein]] ([[User talk:David Eppstein|talk]]) 18:45, 10 September 2023 (UTC)
::{{done}}. Some complications were caused by the fact that both Bentley and Gries refer to the "empty" variant, while we start with the "nonempty" variant. The worst of them is the new [note6] which now has to explain a modification the Bentley should have made in the ''previous'' (rather than the ''current'') variant. - I gave the essential loop invariants for both algorithm variants; they could (after some days of trial and error) be verified by [[Frama-C]], so they shouldn't be flawed. - [[User:Jochen Burghardt|Jochen Burghardt]] ([[User talk:Jochen Burghardt|talk]]) 15:28, 13 September 2023 (UTC)