Array DBMS: Difference between revisions

m link distributed processing
m Typo fixing, replaced: ad-hoc → ad hoc, ’s → 's
Line 1:
{{short description|System that provides database services specifically for arrays}}
'''Array database management systems''' ('''array DBMSs''') provide [[Database management system|database]] services specifically for [[array data structure|array]]s (also called [[Raster graphics|raster data]]), that is, homogeneous collections of data items (often called [[pixel]]s, [[voxel]]s, etc.) sitting on a regular grid of one, two, or more dimensions. Arrays are often used to represent sensor, simulation, image, or statistics data. Such arrays tend to be [[Big data|Big Data]], with single objects frequently ranging into terabyte and soon petabyte sizes; for example, today's earth and space observation archives typically grow by terabytes a day. Array databases aim to offer flexible, scalable storage and retrieval for this category of information.
 
[[File:Euclidean neighborhood in n-D arrays.png|thumb|150px|alt=Euclidean neighborhood of elements in arrays|Euclidean neighborhood of elements in arrays]]
Line 22:
In terms of Array DBMS implementations, the [[rasdaman]] system has the longest implementation track record of n-D arrays with full query support. [[Oracle Spatial|Oracle GeoRaster]] offers chunked storage of 2-D raster maps, albeit without SQL integration. [[TerraLib]] is open-source GIS software that extends object-relational DBMS technology to handle spatio-temporal data types; while its main focus is on vector data, there is also some support for rasters. Starting with version 2.0, [[Postgis|PostGIS]] embeds raster support for 2-D rasters; a special function offers declarative raster query functionality. [[SciQL]] is an array query language being added to the [[MonetDB]] DBMS. [[Michael Stonebraker#SciDB|SciDB]] is a more recent initiative to establish array database support. As in SciQL, arrays are seen as an equivalent to tables, rather than as a new attribute type as in rasdaman and PostGIS.
 
For the special case of [[sparse matrix|sparse data]], [[OLAP]] data cubes are well established; they store cell values together with their location{{snd}} an adequate compression technique in the face of the few locations carrying valid information at all{{snd}} and operate with SQL on them. As this technique does not scale with increasing density, standard databases are not used today for dense data, like satellite images, where most cells carry meaningful information; rather, proprietary ad hoc implementations prevail in scientific data management and similar situations. Hence, this is where Array DBMSs can make a particular contribution.
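
The storage trade-off described above can be illustrated with a minimal sketch. It is not taken from any particular OLAP or array DBMS: the function names, the 4-byte index and value sizes, and the 1000 × 1000 grid are all assumptions chosen for illustration. The sketch compares storing each valid cell together with its coordinates against storing every cell of the grid without coordinates.

```python
def to_coordinate_list(grid, null=0):
    """Turn a 2-D list of lists into (row, col, value) tuples, skipping null cells.

    This mimics "store the cell value together with its location".
    """
    return [
        (r, c, v)
        for r, row in enumerate(grid)
        for c, v in enumerate(row)
        if v != null
    ]


def coordinate_storage_bytes(n_valid, ndim, index_bytes=4, value_bytes=4):
    """Bytes needed when every valid cell carries its own coordinates.

    index_bytes and value_bytes are illustrative assumptions.
    """
    return n_valid * (ndim * index_bytes + value_bytes)


def dense_storage_bytes(shape, value_bytes=4):
    """Bytes needed when every cell of the grid is stored, without coordinates."""
    total_cells = 1
    for extent in shape:
        total_cells *= extent
    return total_cells * value_bytes


def break_even_density(ndim, index_bytes=4, value_bytes=4):
    """Fraction of valid cells above which the dense layout is smaller."""
    return value_bytes / (ndim * index_bytes + value_bytes)


shape = (1000, 1000)
total = shape[0] * shape[1]

sparse_case = coordinate_storage_bytes(total // 100, ndim=2)  # 1% of cells valid
full_case = coordinate_storage_bytes(total, ndim=2)           # every cell valid
dense = dense_storage_bytes(shape)

print(sparse_case, full_case, dense)
```

Under these assumptions, the coordinate layout is far smaller than the dense one when only 1% of cells are valid, but three times larger when every cell is valid; the break-even point for a 2-D grid is a density of one third. Real systems use more elaborate encodings, so the numbers are only indicative.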
 
Generally, Array DBMSs are an emerging technology. While operationally deployed systems exist, such as [[Oracle Spatial|Oracle GeoRaster]], [[Postgis|PostGIS 2.0]] and [[rasdaman]], there are still many open research questions, including query language design and formalization, query optimization, parallelization and [[distributed processing]], and scalability issues in general. In addition, scientific communities still appear reluctant to take up array database technology and tend to favor specialized, proprietary technology.