Array storage has to accommodate arrays of different dimensions and typically large sizes. A core task is to maintain spatial proximity on disk so as to reduce the number of disk accesses during subsetting. Emulating multi-dimensional arrays as nested lists (or 1-D arrays) does not by itself accomplish this and therefore, in general, does not lead to scalable architectures.
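The effect of a 1-D emulation can be illustrated with a minimal Python sketch. It assumes a 2-D array linearized row by row and stored in fixed-size disk pages; the function name <code>pages_row_major</code> and the parameter <code>page_cells</code> (cells per page) are hypothetical and chosen only for this illustration.

```python
def pages_row_major(shape, box, page_cells):
    """Count the distinct pages touched by a 2-D subset of a row-major array.

    shape      -- (rows, cols) of the full array
    box        -- ((r0, r1), (c0, c1)), half-open index ranges of the subset
    page_cells -- number of cells per disk page (an assumed storage parameter)
    """
    rows, cols = shape
    (r0, r1), (c0, c1) = box
    pages = set()
    for r in range(r0, r1):
        # Linear offsets of the first and last requested cell in this row.
        first = (r * cols + c0) // page_cells
        last = (r * cols + c1 - 1) // page_cells
        pages.update(range(first, last + 1))
    return len(pages)


# A 100 x 100 subset of a 1000 x 1000 array, with 1000 cells per page.
touched = pages_row_major((1000, 1000), ((0, 100), (0, 100)), 1000)
```

In this example the subset holds 10,000 cells, which would fit into 10 pages, yet every requested row lies on a different page, so 100 pages are read.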
Commonly, arrays are partitioned into sub-arrays, which form the unit of access. Regular partitioning, where all partitions have the same size (except possibly at the boundaries), is referred to as ''chunking''.<ref>[[Sunita Sarawagi|Sarawagi, S.]], [[Michael Stonebraker|Stonebraker, M.]]: Efficient Organization of Large Multidimensional Arrays. Proc. ICDE'94, Houston, USA, 1994, pp. 328-336</ref> A generalization that removes the restriction to equally sized partitions by supporting any kind of partitioning is ''tiling''.<ref>Furtado, P., Baumann, P.: [http://www.informatik.uni-trier.de/~ley/db/conf/icde/icde99.html#FurtadoB99 Storage of Multidimensional Arrays based on Arbitrary Tiling]. Proc. ICDE'99, March 23–26, 1999, Sydney, Australia, pp. 328–336</ref> Array partitioning can significantly improve access to array subsets: by adjusting the tiling to the access pattern, the server can ideally fetch all required data with only one disk access.
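How regular chunking interacts with a subsetting request can be sketched in Python as follows. This is a simplified illustration under the assumption of a chunk grid aligned at the origin, not the method of the cited papers; the function name <code>chunks_for_box</code> is hypothetical.

```python
from itertools import product


def chunks_for_box(box, chunk_shape):
    """Return the grid coordinates of all regular chunks that intersect a box.

    box         -- one half-open (lo, hi) index range per dimension
    chunk_shape -- chunk edge length per dimension (same for all chunks)
    """
    per_dim = []
    for (lo, hi), size in zip(box, chunk_shape):
        # Index of the first and last chunk overlapped along this dimension.
        per_dim.append(range(lo // size, (hi - 1) // size + 1))
    return list(product(*per_dim))


aligned = chunks_for_box(((0, 100), (0, 100)), (100, 100))
shifted = chunks_for_box(((50, 150), (50, 150)), (100, 100))
```

A request that matches the chunk layout touches a single chunk (<code>aligned</code> contains only <code>(0, 0)</code>), whereas the same-sized request shifted by half a chunk touches four (<code>shifted</code>). This is the reason for adjusting the partitioning to the expected access pattern.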
Compressing tiles can sometimes substantially reduce the amount of storage needed. Compression is also useful for transmitting results, because for the large amounts of data under consideration network bandwidth often constitutes a limiting factor.
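Per-tile compression can be sketched with Python's standard <code>zlib</code> module. The example assumes that a tile is a flat list of 16-bit integer cells; the helper names <code>compress_tile</code> and <code>decompress_tile</code> are hypothetical, and actual array databases may use different codecs and cell encodings.

```python
import zlib
from array import array


def compress_tile(cells):
    """Serialize a tile of 16-bit integer cells and compress it losslessly."""
    raw = array("h", cells).tobytes()
    return zlib.compress(raw)


def decompress_tile(blob):
    """Restore the list of cells from a compressed tile."""
    cells = array("h")
    cells.frombytes(zlib.decompress(blob))
    return list(cells)


# A homogeneous tile (for example, a no-data region) compresses very well.
tile = [0] * 10_000
blob = compress_tile(tile)
raw_size = len(tile) * array("h").itemsize
```

Because each tile is compressed independently, a subsetting request only needs to decompress the tiles it touches. The achievable ratio depends strongly on the data: homogeneous tiles such as the one above shrink to a small fraction of <code>raw_size</code>, while noisy data may hardly compress at all.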