Divide-and-conquer algorithm
 
In all these examples, the D&C approach led to an improvement in the [[asymptotic complexity|asymptotic cost]] of the solution.
For example, if (a) the [[Recursion (computer science)|base cases]] have constant-bounded size, the work of splitting the problem and combining the partial solutions is proportional to the problem's size ''n'', and (b) there is a bounded number ''p'' of subproblems of size ~ ''n''/''p'' at each stage, then the cost of the divide-and-conquer algorithm will be O(''n'' log<sub>''p''</sub>''n'').
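A sketch of why this holds (the constant ''c'' below is illustrative): under assumptions (a) and (b), the running time satisfies the recurrence

:<math>T(n) = p\,T(n/p) + cn.</math>

Each of the <math>\log_p n</math> levels of the recursion performs <math>cn</math> total work, so the recurrence unrolls to <math>T(n) = cn\log_p n + O(n) = O(n \log n)</math>.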
 
=== Parallelism ===
Divide-and-conquer algorithms are naturally adapted for execution in multi-processor machines, especially [[shared-memory]] systems where the communication of data between processors does not need to be planned in advance, because distinct sub-problems can be executed on different processors.
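As a minimal sketch of this idea (in Python; the function name and the two-way split are illustrative, not from any particular library), the two independent halves of a summation can be dispatched to separate processors and their results combined:

<syntaxhighlight lang="python">
from concurrent.futures import ProcessPoolExecutor

def seq_sum(chunk):
    """Base case: solve a sub-problem sequentially."""
    return sum(chunk)

if __name__ == "__main__":
    data = list(range(1_000_000))
    mid = len(data) // 2
    # Divide: the two halves are independent sub-problems, so they
    # can run on different processors with no coordination.
    with ProcessPoolExecutor(max_workers=2) as pool:
        left, right = pool.map(seq_sum, (data[:mid], data[mid:]))
    # Combine: merge the partial solutions.
    total = left + right
    assert total == sum(data)
</syntaxhighlight>

Deeper recursion would expose more parallelism in the same way, since every level of the recursion tree consists of mutually independent sub-problems.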
 
=== Memory access ===
Divide-and-conquer algorithms naturally tend to make efficient use of [[memory cache]]s. The reason is that once a sub-problem is small enough, it and all its sub-problems can, in principle, be solved within the cache, without accessing the slower main memory. An algorithm designed to exploit the cache in this way is called ''[[cache-oblivious algorithm|cache-oblivious]]'', because it does not contain the cache size as an explicit parameter.<ref name="cahob">{{cite journal | author = M. Frigo |author2=C. E. Leiserson |author3=H. Prokop | title = Cache-oblivious algorithms | journal = Proc. 40th Symp. on the Foundations of Computer Science | year = 1999}}</ref>
Moreover, D&C algorithms can be designed so that important algorithms (e.g., sorting, FFTs, and matrix multiplication) become ''optimal'' cache-oblivious algorithms: they use the cache in a provably optimal way, in an asymptotic sense, regardless of the cache size. In contrast, the traditional approach to exploiting the cache is ''blocking'', as in [[loop nest optimization]], where the problem is explicitly divided into chunks of the appropriate size; this can also use the cache optimally, but only when the algorithm is tuned for the specific cache size(s) of a particular machine.
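A minimal sketch of the cache-oblivious idea (in Python; the recursive matrix transpose is a classic example from the cache-oblivious literature, and the base-case threshold of 16 elements is illustrative): the algorithm keeps halving the longer dimension, so at some depth of the recursion every sub-block fits in cache, whatever the cache size happens to be.

<syntaxhighlight lang="python">
def transpose(a, b, r0, r1, c0, c1):
    """Write the transpose of a[r0:r1][c0:c1] into b[c0:c1][r0:r1]."""
    if (r1 - r0) * (c1 - c0) <= 16:
        # Base case: a tiny block, transposed directly. At some
        # recursion depth these blocks fit in cache automatically.
        for i in range(r0, r1):
            for j in range(c0, c1):
                b[j][i] = a[i][j]
    elif r1 - r0 >= c1 - c0:
        mid = (r0 + r1) // 2          # split the longer dimension
        transpose(a, b, r0, mid, c0, c1)
        transpose(a, b, mid, r1, c0, c1)
    else:
        mid = (c0 + c1) // 2
        transpose(a, b, r0, r1, c0, mid)
        transpose(a, b, r0, r1, mid, c1)

n, m = 8, 5
a = [[i * m + j for j in range(m)] for i in range(n)]
b = [[0] * n for _ in range(m)]
transpose(a, b, 0, n, 0, m)
assert all(b[j][i] == a[i][j] for i in range(n) for j in range(m))
</syntaxhighlight>

Note that no cache size appears anywhere in the code, which is precisely what makes the algorithm cache-oblivious.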
 
The same advantage exists with regard to other hierarchical storage systems, such as [[non-uniform memory access|NUMA]] or [[virtual memory]], as well as for multiple levels of cache: once a sub-problem is small enough, it can be solved within a given level of the hierarchy, without accessing the higher (slower) levels.
 
=== Roundoff control ===
In computations with rounded arithmetic, e.g., with [[floating-point arithmetic|floating-point]] numbers, a divide-and-conquer algorithm may yield more accurate results than a superficially equivalent iterative method. For example, one can add ''N'' numbers either by a simple loop that adds each datum to a single variable, or by a D&C algorithm called [[pairwise summation]] that breaks the data set into two halves, recursively computes the sum of each half, and then adds the two sums. While the second method performs the same number of additions as the first, and pays the overhead of the recursive calls, it is usually more accurate.<ref>Nicholas J. Higham, "The accuracy of floating point summation", ''SIAM J. Scientific Computing'' '''14''' (4), 783–799 (1993).</ref>
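A minimal Python sketch of the pairwise scheme just described (the base-case size of 2 is illustrative; practical implementations use a larger block to amortize the recursion overhead):

<syntaxhighlight lang="python">
def pairwise_sum(x):
    """Sum x by recursive halving; worst-case rounding error grows as
    O(log n) rather than the O(n) of a naive left-to-right loop."""
    if len(x) <= 2:
        return sum(x)           # base case: sum directly
    mid = len(x) // 2
    # Divide into halves, sum each recursively, then combine.
    return pairwise_sum(x[:mid]) + pairwise_sum(x[mid:])

vals = [0.1] * 1_000_000
print(pairwise_sum(vals), sum(vals))
</syntaxhighlight>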
 
== Implementation issues ==