Maximum subarray problem

This is an old revision of this page, as edited by 95.80.65.253 (talk) at 15:35, 27 October 2011. The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

In computer science, the maximum subarray problem is the task of finding the contiguous subarray within a one-dimensional array of numbers (containing at least one positive number) which has the largest sum. For example, for the sequence of values −2, 1, −3, 4, −1, 2, 1, −5, 4; the contiguous subarray with the largest sum is 4, −1, 2, 1, with sum 6.

The problem was first posed by Ulf Grenander of Brown University in 1977, as a simplified model for maximum likelihood estimation of patterns in digitized images. A linear time algorithm was found soon afterwards by Jay Kadane of Carnegie-Mellon University (Bentley 1984).

Kadane's algorithm

Kadane's algorithm consists of a scan through the array values, computing at each position the maximum subarray ending at that position. This subarray is either empty (in which case its sum is zero) or consists of one more element than the maximum subarray ending at the previous position. Thus, the problem can be solved with the following code, expressed here in Python:

def max_subarray(A):
    max_so_far = max_ending_here = 0
    for x in A:
        max_ending_here = max(0, max_ending_here + x)
        max_so_far = max(max_so_far, max_ending_here)
    return max_so_far

The algorithm can also be easily modified to keep track of the starting and ending indices of the maximum subarray.

Because of the way this algorithm uses optimal substructures (the maximum subarray ending at each position is calculated in a simple way from a related but smaller and overlapping subproblem, the maximum subarray ending at the previous position) this algorithm can be viewed as a simple example of dynamic programming.


Panov and Synkov's algorithm

More efficient algorithm was invented by two russian programmers. It takes O(nlg(n)) operations.

private static int[] maxSubarray(int[] array,
                                     int startIdx,
                                     int endIdx)
    {
        final int len = endIdx - startIdx;

        if (len < 2)
        {
            return new int[] {startIdx, endIdx, array[startIdx]};
        }

        final int midIdx = startIdx + len / 2;

        final int[] left  = maxSubarray(array, startIdx, midIdx),
                    right = maxSubarray(array, midIdx, endIdx);
        int crossingMaxLeftStartIdx = midIdx - 1;
        int crossingMaxLeft = array[crossingMaxLeftStartIdx];
        int tmpSum = crossingMaxLeft;

        for (int idx = crossingMaxLeftStartIdx - 1; idx >= startIdx; --idx)
        {
            tmpSum += array[idx];
            if (tmpSum > crossingMaxLeft)
            {
                crossingMaxLeft = tmpSum;
                crossingMaxLeftStartIdx = idx;
            }
        }

        int crossingMaxRightEndIdx = midIdx + 1;
        int crossingMaxRight = array[midIdx];
        tmpSum = crossingMaxRight;

        for (int idx = crossingMaxRightEndIdx; idx < endIdx; ++idx)
        {
            tmpSum += array[idx];
            if (tmpSum > crossingMaxRight)
            {
                crossingMaxRight = tmpSum;
                crossingMaxRightEndIdx = idx + 1;
            }
        }

        final int crossingMax = crossingMaxLeft + crossingMaxRight;
        int bestStartIdx = left[0],
            bestEndIdx = left[1],
            bestSum = left[2];

        if (right[2] > bestSum)
        {
            bestStartIdx = right[0];
            bestEndIdx = right[1];
            bestSum = right[2];
        }

        if (crossingMax > bestSum)
        {
            bestStartIdx = crossingMaxLeftStartIdx;
            bestEndIdx = crossingMaxRightEndIdx;
            bestSum = crossingMax;
        }

        return new int[] {bestStartIdx, bestEndIdx, bestSum};
    }

This function returns an array of three integers. First element is the start index of a maximum subarray, second one is the index of the subarray's last element +1, and third element contains the sum of the subarray. Note that there can be several max subarrays, the presented algorithm finds one of them.

Generalizations

Similar problems may be posed for higher dimensional arrays, but their solution is more complicated; see, e.g., Takaoka (2002). Brodal & Jørgensen (2007) showed how to find the k largest subarray sums in a one-dimensional array, in the optimal time bound O(n + k).

References

  • Bentley, Jon (1984), "Programming pearls: algorithm design techniques", Communications of the ACM, 27 (9): 865–873, doi:10.1145/358234.381162.
  • Brodal, Gerth Stølting; Jørgensen, Allan Grønlund (2007), "A linear time algorithm for the k maximal sums problem", Mathematical Foundations of Computer Science 2007, Lecture Notes in Computer Science, vol. 4708, Springer-Verlag, pp. 442–453, doi:10.1007/978-3-540-74456-6_40.
  • Takaoka, T. (2002), "Efficient algorithms for the maximum subarray problem by distance matrix multiplication", Electronic Notes in Theoretical Computer Science, 61.