Dynamic array

This is an old revision of this page, as edited by Dcoetzee (talk | contribs) at 03:26, 19 November 2005 (References: Fix year). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

A dynamic array, growable array, dynamic table, or array list is a data structure: an array that is automatically expanded to accommodate new objects when it is filled beyond its current size. It may also automatically deallocate some of its unused space to save memory. Dynamic arrays have become a popular data structure in modern mainstream programming languages and are supplied with most standard libraries.

One of the main disadvantages of a simple array is that it has a single fixed size, and although its size can be altered in some environments (for example, with C's realloc function), this is an expensive operation that may involve copying the entire contents of the array.

Dynamic arrays automatically perform this resizing as late as possible, when the programmer attempts to add an element to the end of the array and there is no more space. However, if we added room for just one element each time the array ran out of space, every insertion would copy the entire array, and the cost of the resizing operations would rapidly become prohibitive.
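
To see how quickly that cost grows, the following sketch counts the element copies made when the capacity is increased by exactly one slot on every overflow. It assumes that each resize copies every existing element (the worst case described above); the function name is purely illustrative.

```python
def copies_when_growing_by_one(n):
    """Count element copies made by n appends when a full array grows by one slot."""
    capacity = 0
    size = 0
    copies = 0
    for _ in range(n):
        if size == capacity:
            # Assumption: a resize moves every existing element to new storage.
            copies += size
            capacity += 1
        size += 1
    return copies
```

Under these assumptions the count is 0 + 1 + ... + (n − 1) = n(n − 1)/2, so the total work grows quadratically with the number of insertions.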

To deal with this, we instead resize the array by a large amount, such as doubling its size. Then, the next time we need to enlarge the array, we just expand it into some of this reserved space. The amount of space we have allocated for the array is called its capacity, and it may be larger than its current logical size. Here's how the operation adding an element to the end might work:

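function insertEnd(dynarray a, element e)
    // assumes a.capacity ≥ 1, so that doubling always makes room
    if a.size = a.capacity {
        // the array is full: allocate twice the storage and copy the elements over
        resize a to twice its current capacity
        a.capacity := a.capacity × 2
    }
    // there is now guaranteed to be a free slot at index a.size
    a[a.size] := e
    a.size := a.size + 1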
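
The pseudocode above might be rendered in Python roughly as follows. This is a minimal sketch, not a reference implementation: the class name DynamicArray, the private helper _resize, and the use of a fixed-length Python list to stand in for a raw block of memory are all assumptions made for illustration.

```python
class DynamicArray:
    """Illustrative growable array; a fixed-length list stands in for raw storage."""

    def __init__(self, initial_capacity=1):
        if initial_capacity < 1:
            raise ValueError("initial_capacity must be at least 1")
        self.capacity = initial_capacity
        self.size = 0
        self._storage = [None] * initial_capacity

    def _resize(self, new_capacity):
        # Allocate a new block and copy the existing elements into it.
        new_storage = [None] * new_capacity
        for i in range(self.size):
            new_storage[i] = self._storage[i]
        self._storage = new_storage
        self.capacity = new_capacity

    def insert_end(self, e):
        # Grow only when the array is completely full.
        if self.size == self.capacity:
            self._resize(self.capacity * 2)
        self._storage[self.size] = e
        self.size += 1

    def __getitem__(self, i):
        if not 0 <= i < self.size:
            raise IndexError("index out of range")
        return self._storage[i]
```

Starting from a capacity of 1, five calls to insert_end trigger resizes to capacities 2, 4 and 8, leaving three slots reserved for later insertions.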

Using amortized analysis, it can be shown that as long as we expand the array by a fixed percentage of its current capacity each time (that is, multiply the capacity by a constant factor greater than 1), the cost of inserting n elements will be O(n); we say the insertions have an amortized cost of O(1) each.
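
The bound can be sketched with a geometric sum. The derivation below makes simplifying assumptions that are not part of the text above: the array starts with capacity 1, the growth factor is a constant a > 1, capacities are exact powers of a (rounding is ignored), and T(n) counts one write per insertion plus one copy per element moved during a resize, with k denoting the number of resizes.

```latex
% Resizes happen at capacities 1, a, a^2, \dots, a^{k-1}, where a^{k-1} < n \le a^k,
% and a resize at capacity a^i copies a^i elements.
\begin{align*}
T(n) &= n + \sum_{i=0}^{k-1} a^{i}
      = n + \frac{a^{k} - 1}{a - 1} \\
     &< n + \frac{a\,n}{a - 1}
      = \frac{2a - 1}{a - 1}\, n
      = O(n).
\end{align*}
```

For doubling (a = 2) this gives T(n) < 3n, so each insertion costs fewer than three element writes on average under these assumptions.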

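Many dynamic arrays also deallocate some of the underlying storage if the array's size drops below a certain threshold, such as 30% of the capacity. The threshold should be low enough that the array is not full again immediately after shrinking; otherwise, alternating insertions and removals could trigger a resize on every operation.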
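
One possible shrinking policy is sketched below. The details are assumptions chosen for illustration rather than the behaviour of any particular library: the class and method names are hypothetical, the capacity is doubled on growth and halved on shrinking, and the default threshold of 30% is taken from the example figure above.

```python
class ShrinkingArray:
    """Illustrative dynamic array that doubles when full and halves when sparse."""

    def __init__(self, shrink_threshold=0.3):
        self.capacity = 1
        self.size = 0
        self.shrink_threshold = shrink_threshold  # assumed policy parameter
        self._storage = [None]

    def _resize(self, new_capacity):
        # Copy the live elements into a block of the new capacity.
        new_storage = [None] * new_capacity
        new_storage[:self.size] = self._storage[:self.size]
        self._storage = new_storage
        self.capacity = new_capacity

    def insert_end(self, e):
        if self.size == self.capacity:
            self._resize(self.capacity * 2)
        self._storage[self.size] = e
        self.size += 1

    def remove_end(self):
        if self.size == 0:
            raise IndexError("remove from empty array")
        self.size -= 1
        e = self._storage[self.size]
        self._storage[self.size] = None  # drop the stale reference
        # Shrink only when the array is well under half full, so that the
        # halved array still has free slots for subsequent insertions.
        if self.capacity > 1 and self.size < self.shrink_threshold * self.capacity:
            self._resize(self.capacity // 2)
        return e
```

With these settings, an array of capacity 16 is halved to capacity 8 once its size falls to 4, at which point it is half full rather than full.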

In choosing the percentage by which to enlarge the table, there is a time-space tradeoff. The average time per insertion operation is roughly proportional to a/(a−1), where a is the factor, such as 2, by which the table's capacity is multiplied at each expansion. On the other hand, the space that is allocated but not in use, capacity minus size, is bounded above by (a−1)n, where n is the number of elements stored. The choice a=2 is commonly used, but depending on the application many values between about a=1.2 and a=4 may be suitable.
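
The tradeoff can be explored with a small simulation. The sketch below is illustrative only: the function name simulate is hypothetical, the array is assumed to start at capacity 1, the new capacity is rounded up to a whole number of slots, and cost is measured in element writes (one per insertion plus one per element copied during a resize).

```python
import math


def simulate(n, a):
    """Append n elements with growth factor a.

    Returns (element writes per insertion, unused slots at the end).
    """
    capacity = 1
    size = 0
    writes = 0
    for _ in range(n):
        if size == capacity:
            writes += size  # copy every existing element to the new block
            # Round up, and always grow by at least one slot.
            capacity = max(capacity + 1, math.ceil(capacity * a))
        writes += 1  # store the new element
        size += 1
    return writes / n, capacity - size
```

Calling simulate with the same n and different values of a, for instance 1.2, 1.5, 2 and 4, shows the pattern described above: smaller factors leave fewer unused slots but copy elements more often. The measured cost per insertion also depends on where n falls between two resizes; it is lowest when the array is exactly full and highest just after an expansion.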

Dynamic arrays benefit from many of the advantages of arrays, including good locality of reference and data cache utilization, compactness (low memory use), and random access. They usually have only a small fixed additional overhead for storing information about the size and capacity. This makes dynamic arrays an attractive tool for building cache-friendly data structures.

In computer science education, dynamic arrays often serve as an elementary example of amortized analysis because of their simplicity and familiarity.

Language support

C++'s std::vector is an implementation of dynamic arrays, as are the ArrayList classes supplied with the Java API and the .NET Framework. The generic List<> class supplied with version 2.0 of the .NET Framework is also implemented with dynamic arrays.
