Content deleted Content added
→Empty subarrays admitted: introduce line numbers everywhere in code |
→Empty subarrays admitted: fix misunderstanding: described changes obtain the algorithm variant |
||
(32 intermediate revisions by 16 users not shown) | |||
Line 20:
The maximum subarray problem was proposed by [[Ulf Grenander]] in 1977 as a simplified model for [[maximum likelihood estimation]] of patterns in digitized images.{{sfn|Bentley|1984|p=868-869}}
Grenander was looking to find a rectangular subarray with maximum sum, in a two-dimensional array of real numbers. A brute-force algorithm for the two-dimensional problem runs in ''O''(''n''<sup>6</sup>) time; because this was prohibitively slow, Grenander proposed the one-dimensional problem to gain insight into its structure. Grenander derived an algorithm that solves the one-dimensional problem in ''O''(''n''<sup>2</sup>) time
|By using a precomputed table of cumulative sums <math>S[k] = \sum_{x=1}^k A[x]</math> to compute the subarray sum <math>\sum_{x=i}^j A[x] = S[j] - S[i-1]</math> in constant time<!--sentence fragment-->
}}, improving the brute force running time of ''O''(''n''<sup>3</sup>). When [[Michael Shamos]] heard about the problem, he overnight devised an ''O''(''n'' log ''n'') [[divide-and-conquer algorithm]] for it.
Soon after, Shamos described the one-dimensional problem and its history at a [[Carnegie Mellon University]] seminar attended by [[Jay Kadane]], who designed within a minute an ''O''(''n'')-time algorithm,{{sfn|Bentley|1984|p=868-869}}{{sfn|Bentley|1989|p=76-77}}{{sfn|Gries|1982|p=211}} which is as fast as possible.<ref group=note>since every algorithm must at least scan the array once which already takes ''O''(''n'') time</ref> In 1982, [[David Gries]] obtained the same ''O''(''n'')-time algorithm by applying [[Edsger W. Dijkstra |Dijkstra]]'s "standard strategy";{{sfn|Gries|1982|p=209-211}} in 1989, [[Richard Bird (computer scientist)|Richard Bird]] derived it by purely algebraic manipulation of the brute-force algorithm using the [[Bird–Meertens formalism]].{{sfn|Bird|1989|loc=Sect.8, p.126}}
Line 29 ⟶ 28:
== Applications ==
Maximum subarray problems arise in many fields, such as genomic [[sequence analysis]] and [[computer vision]].
Genomic sequence analysis employs maximum subarray algorithms to identify important biological segments of protein sequences
In [[computer vision]], bitmap images generally consist only of positive values, for which the maximum subarray problem is trivial: the result is always the whole array. However, after subtracting a threshold value (such as the average pixel value) from each pixel, so that above-average pixels will be positive and below-average pixels will be negative, the maximum subarray problem can be applied to the modified image to detect bright areas within it.<ref>{{harvtxt|Bae|Takaoka|2006}}; {{harvtxt|Weddell|Read|Thaher|Takaoka|2013}}</ref>
==Kadane's algorithm==
===
[[Joseph Born Kadane|Kadane's]] algorithm scans the given array <math>A[1\ldots n]</math> from left to right.
In the <math>j</math>th step, it computes the subarray with the largest sum ending at <math>j</math>; this sum is maintained in variable <code>current_sum</code>.{{NoteTag
|named <code>MaxEndingHere</code> in {{harvtxt|Bentley|1989}}, and <code>c</code> in {{harvtxt|Gries|1982}}
Line 52 ⟶ 46:
and easily obtained as the maximum of all values of <code>current_sum</code> seen so far, cf. line 7 of the algorithm.
As a [[loop invariant]], in the <math>j</math>th step, the old value of <code>current_sum</code> holds the maximum over all <math>i \in \{ 1,\ldots, j-1 \}</math> of the sum <math>A[i]+\cdots+A[j-1]</math>.
Therefore, <code>current_sum</code><math>+A[j]</math>{{NoteTag|
In the Python code below, <math>A[j]</math> is expressed as <code>x</code>, with the index <math>j</math> left implicit.
}}
is the maximum over all <math>i \in \{ 1,\ldots, j-1 \}</math> of the sum <math>A[i]+\cdots+A[j]</math>. To extend the latter maximum to cover also the case <math>i=j
Thus, the problem can be solved with the following code,{{sfn|Bentley|1989|p=
<syntaxhighlight lang="python" line>
def max_subarray(numbers):
"""Find the largest sum of any contiguous subarray."""
best_sum =
current_sum = 0
for x in numbers:
current_sum = max(
best_sum = max(best_sum, current_sum)
return best_sum
</syntaxhighlight>
If the input contains no positive element, the returned value is that of the largest element (i.e., the value closest to 0), or negative infinity if the input was empty. For correctness, an exception should be raised when the input array is empty, since an empty array has no maximum nonempty subarray. If the array is nonempty, its first element could be used in place of negative infinity, if needed to avoid mixing numeric and non-numeric values.
The algorithm can be adapted to the case which
This algorithm calculates the maximum subarray ending at each position from the maximum subarray ending at the previous position, so it can be viewed as a
===
{| align="right" class="wikitable collapsible collapsed"
! Example run
|-
| [[File:Kadane run −2,1,−3,4,−1,2,1,−5,4.gif|thumb|500px|Execution of Kadane's algorithm on the [[#top|above]] example array. ''{{color|#0000c0|Blue}}:'' subarray with largest sum ending at ''i''; ''{{color|#00c000|green}}:'' subarray with largest sum encountered so far; a lower case letter indicates an empty array; variable ''i'' is left implicit in Python code.]]
|}
Kadane's algorithm, as originally published, is for solving the problem variant which allows empty subarrays.{{sfn|Bentley|1989|p=74}}{{sfn|Gries|1982|p=211}}
In such variant, the answer is 0 when the input contains no positive elements (including when the input is empty).
The variant is obtained with two changes in code: in line 3, <code>best_sum</code> should be initialized to 0 to account for the empty subarray <math>A[0 \ldots -1]</math>
<syntaxhighlight lang="python" line start="3">
best_sum =
</syntaxhighlight>
and
|While
}}
<syntaxhighlight lang="python" line start="6">
current_sum = max(
</syntaxhighlight>
As a [[loop invariant]], in the <math>j</math>th step, the old value of <code>current_sum</code> holds the maximum over all <math>i \in \{ 1,\ldots, j \}</math> of the sum <math>A[i]+\cdots+A[j-1]</math>.{{NoteTag
|This sum is <math>0</math> when <math>i=j</math>, corresponding to the empty subarray <math>A[j\ldots j-1]</math>.
}}
Therefore, <code>current_sum</code><math>+A[j]</math>
is the maximum over all <math>i \in \{ 1,\ldots, j \}</math> of the sum <math>A[i]+\cdots+A[j]</math>. To extend the latter maximum to cover also the case <math>i=j+1</math>, it is sufficient to consider also the empty subarray <math>A[j+1 \; \ldots \; j]</math>. This is done in line 6 by assigning <math>\max(0,</math><code>current_sum</code><math>+A[j])</math> as the new value of <code>current_sum</code>, which after that holds the maximum over all <math>i \in \{ 1, \ldots, j+1 \}</math> of the sum <math>A[i]+\cdots+A[j]</math>. Machine-verified [[C (programming language)|C]] / [[Frama-C]] code of both variants can be found [[:commons:File:Kadane run −2,1,−3,4,−1,2,1,−5,4.gif#Source code|here]].
===Computing the best subarray's position===
Line 111 ⟶ 115:
{{NoteFoot|30em}}
==
{{Reflist}}
==References==
*{{citation
| last1 = Alves | first1 = Carlos E. R.
| last2 = Cáceres | first2 = Edson
| last3 = Song | first3 = Siang W.
| editor1-last = Kranzlmüller | editor1-first = Dieter
| editor2-last = Kacsuk | editor2-first = Péter
| editor3-last = Dongarra | editor3-first = Jack J.
| contribution = BSP/CGM Algorithms for Maximum Subsequence and Maximum Subarray
| doi = 10.1007/978-3-540-30218-6_24
| pages = 139–146
| publisher = Springer
| series = Lecture Notes in Computer Science
| title = Recent Advances in Parallel Virtual Machine and Message Passing Interface, 11th European PVM/MPI Users' Group Meeting, Budapest, Hungary, September 19-22, 2004, Proceedings
| volume = 3241
| year = 2004| isbn = 978-3-540-23163-9
}}
*{{citation
| last1 = Backurs
Line 129 ⟶ 150:
| pages = 81:1–81:13
| doi = 10.4230/LIPIcs.ICALP.2016.81
| doi-access = free
| s2cid = 12720136
}}
Line 144 ⟶ 166:
| s2cid=2681670
}}.
*{{citation
| last1 = Bae | first1 = Sung Eun
| last2 = Takaoka | first2 = Tadao
| doi = 10.1093/COMJNL/BXL007
| issue = 3
| journal = The Computer Journal
| pages = 358–374
| title = Improved Algorithms for the \emph{K}-Maximum Subarray Problem
| volume = 49
| year = 2006}}
*{{citation
| last1=Bengtsson
Line 182 ⟶ 214:
}}
*{{citation
| first=Richard S.
| last= Bird
Line 193 ⟶ 224:
| year=1989
| doi=10.1093/comjnl/32.2.122
}}
*{{citation
| first1 = Gerth Stølting | last1 = Brodal
| first2 = Allan Grønlund | last2 = Jørgensen
| title = Mathematical Foundations of Computer Science 2007
| contribution = A linear time algorithm for the ''k'' maximal sums problem
| publisher = Springer-Verlag
| series = Lecture Notes in Computer Science
| volume = 4708 | year = 2007 | pages = 442–453 | doi = 10.1007/978-3-540-74456-6_40
| isbn = 978-3-540-74455-9
}}.
*{{citation
| url=https://core.ac.uk/download/pdf/82596333.pdf
Line 216 ⟶ 248:
| hdl=1813/6370
}}
*{{citation
| last1 = Ruzzo | first1 = Walter L.
| last2 = Tompa | first2 = Martin
| editor1-last = Lengauer | editor1-first = Thomas
| editor2-last = Schneider | editor2-first = Reinhard
| editor3-last = Bork | editor3-first = Peer
| editor4-last = Brutlag | editor4-first = Douglas L.
| editor5-last = Glasgow | editor5-first = Janice I.
| editor6-last = Mewes | editor6-first = Hans-Werner
| editor7-last = Zimmer | editor7-first = Ralf
| contribution = A Linear Time Algorithm for Finding All Maximal Scoring Subsequences
| contribution-url = https://www.aaai.org/Library/ISMB/1999/ismb99-027.php
| pages = 234–241
| publisher = AAAI
| title = Proceedings of the Seventh International Conference on Intelligent Systems for Molecular Biology, August 6–10, 1999, Heidelberg, Germany
| year = 1999}}
*{{citation
| first = Tadao | last = Takaoka
Line 237 ⟶ 285:
| pages = 446–452
| doi =
| isbn = 978-0-89871-410-4
| access-date = November 17, 2018
}}
*{{citation
| last1 = Weddell | first1 = Stephen John
| last2 = Read | first2 = Tristan
| last3 = Thaher | first3 = Mohammed
| last4 = Takaoka | first4 = Tadao
| doi = 10.1117/1.JEI.22.4.043011
| issue = 4
| journal = Journal of Electronic Imaging
| page = 043011
| title = Maximum subarray algorithms for use in astronomical imaging
| volume = 22
| year = 2013| bibcode = 2013JEI....22d3011W
}}
== External links ==
Line 251 ⟶ 313:
[[Category:Optimization algorithms and methods]]
[[Category:Dynamic programming]]
[[Category:Polynomial-time problems]]
[[Category:Articles with example Python (programming language) code]]
|