Block Range Index: Difference between revisions

Browse history interactively

← Previous edit

Content deleted Content added

VisualWikitext

Revision as of 20:16, 24 May 2016 edit Inactive JF (talk \| contribs) 9 edits No edit summary ← Previous edit		Latest revision as of 18:45, 23 August 2024 edit undo Frap (talk \| contribs) Extended confirmed users, File movers, Pending changes reviewers, Rollbackers 35,599 edits m →Design
(19 intermediate revisions by 14 users not shown)
Line 1: {{Short description\|Database indexing technique}} A '''Block Range Index''' or '''BRIN''' is a [[database index]]ing technique. They are intended to improve performance with extremely large{{efn-lr\|'Large' here refers to the number of rows in a [[database table\|table]], rather than the field sizes or overall size.}} tables. Line 9 ⟶ 10: }}</ref> A BRIN is applicable to an index on a table that is large and where the index key value is easily sorted and evaluated with a [[MinMax function]].{{efn-lr\|A function which efficiently evaluates a large number of data items and returns their minimum and maximum values. The concepts of "minimum" and "maximum" are broad and may be applied to any data type, or their combinations, which is [[sorting algorithm\|sortable]].}} BRIN were originally proposed by Alvaro Herrera of [[~~2nd Quadrant~~2ndQuadrant]] in 2013 as 'Minmax indexes'.<ref name="Herrera, 2013">{{cite web \|title=Minmax indexes \|date=2013-06-14 Line 27 ⟶ 28: \|title=Chapter 62. BRIN Indexes \|url=http://www.postgresql.org/docs/9.5/static/brin-intro.html }}</ref> Other vendors have described some similar features,<ref name="Herrera, 2013"/> including [[Oracle database\|Oracle]],<ref name="Oracle, Exadata Storage Indexes"/><ref name=Solarwinds/> [[Netezza]] 'zone maps',<ref>{{cite web\|url=http://nztips.com/2010/11/netezza-integer-join-keys/ \|publisher=Netezza\|title=With Netezza Always Use Integer Join Keys For Good Compression, Zone Maps, And Joins\|year=2010}}</ref> [[Infobright]] 'data packs',<ref>{{cite web\|title=Data packs \|publisher=[[Infobright]]\|url=http://www.infobright.org/Blog/Entry/organizing_data_and_more_about_rough_data_contest/~~}}{{dead link~~\|archive-url=https://web.archive.org/web/20090627103453/http://www.infobright.org/Blog/Entry/organizing_data_and_more_about_rough_data_contest\|archive-date=~~January 2016~~2009-06-27}}</ref>, [[MonetDB]]<ref>{{cite ~~web~~CiteSeerX \|title=Cooperative Scans: Dynamic Bandwidth Sharing in a DBMS\|year=2007\|~~publisher~~pages=~~MonetDB~~723–734 \|~~url~~citeseerx=10.1.1.108.2662}}</ref> and ~~http~~[[Apache Hive]] with ORC/Parquet.<ref>{{cite web\|title=Hive Optimizations with Indexes, Bloom-Filters and Statistics\|year=2015\|publisher=Jörn Franke\|url=https://~~citeseerx~~snippetessay.~~ist~~wordpress.~~psu.edu~~com/~~viewdoc~~2015/~~summary?doi~~07/25/hive-optimizations-with-indexes-bloom-filters-and-statistics/\|access-date=102016-05-24\|archive-date=2016-03-04\|archive-url=https://web.1archive.1org/web/20160304183122/https://snippetessay.~~108~~wordpress.~~2662~~com/2015/07/25/hive-optimizations-with-indexes-bloom-filters-and-statistics/\|url-status=dead}}</ref> }}</ref> and [[Apache Hive]] with ORC/Parquet<ref>{{cite web\|title=Hive Optimizations with Indexes, Bloom-Filters and Statistics\|year=2015\|publisher=Jörn Franke\|url=https://snippetessay.wordpress.com/2015/07/25/hive-optimizations-with-indexes-bloom-filters-and-statistics/ ~~}}</ref>.~~ == Design == [[File:B-tree.svg\|thumb\|B-tree index structure]] [[File:BRIN index.svg\|thumb\|BRIN index structure]]BRIN operate by "summarising" large blocks of data into a compact form, which can be efficiently tested to exclude many of them from a database query, early on. These tests exclude a large block of data for each comparison. By reducing the data volume so early on, both by representing large blocks as small tuples, and by eliminating many blocks, BRIN substantially reduce the amount of detailed data that must be examined by the database node on a row-by-row basis.<ref>{{Cite web \|last1=Herrera \|first1=Alvaro \|date=7 November 2014 \|title=commitdiff - BRIN: Block Range Indexes \|url=https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=7516f5259411c02ae89e49084452dc342aadb2ae \|website=git.postgresql.org \|access-date=2017-10-03}}</ref> ~~\|date=22 November 2014~~ ~~\|title=WAITING FOR 9.5 – BRIN: BLOCK RANGE INDEXES.~~ ~~\|url=http://www.depesz.com/2014/11/22/waiting-for-9-5-brin-block-range-indexes/~~ ~~}}</ref>~~ Data storage in large databases is layered and chunked, with the table storage arranged into 'blocks'. Each block contains perhaps 1MB in each chunk{{efn-lr\| PostgreSQL has a default ~~block~~BRIN chunk size of ~~128×~~128× 8k pages, or 1MB.<ref name="Wahts new"/> Oracle terms these 'storage regions' and gives them a default size of 1MB.<ref>{{cite web\|title=When is ~~Exadata’s~~Exadata's storage indexes used?\|website=OakTable.net\|url=http://www.oaktable.net/blog/when-exadata%E2%80%99s-storage-indexes-used?page=2}}</ref>}}<ref name="Richard Foote, 2012, I"/> and they are retrieved by requesting specific blocks from a disk-based storage layer. BRIN are a lightweight in-memory summary layer above this: each tuple in the index summarises one block as to the range of the data contained therein: its minimum and maximum values, and if the block contains any non-null data for the column(s) of interest.<ref name="Wong, 2015">{{Cite web \|title=PostgreSQL Performance Presentation \|author=Mark Wong Line 51 ⟶ 46: Some simple benchmarks suggest a five-fold improvement in search performance with an index scan, compared to the unindexed table.<ref name="Wahts new"/> Compared to B-trees, they avoid their maintenance overhead.<ref name="Herrera, 2013"/> As BRIN are so lightweight, they may be held entirely in memory, thus avoiding disk overhead during the scan. The same may not be true of B-tree: B-tree requires a tree node for every approximately ''N rows'' in the table, where N is the capacity of a single node, thus the index size is large. As BRIN only requires a tuple for each block (of many rows), the index becomes sufficiently small to make the difference between disk and memory. For a 'narrow' table{{efn-lr\|The table columns are barely wider than the indexed columns.}} the B-tree index volume approaches that of the table itself; the BRIN may be only ~~5-15~~5–15% of it.<ref name="Python Sweetness">{{cite web \|title= Block Range (BRIN) Indexes in PostgreSQL 9.5 \|date=May 22, 2015 Line 72 ⟶ 67: \|url=http://axleproject.eu/2014/11/30/progress-on-online-upgrade/}}</ref> With BRIN, the slowdown from maintaining the index is much reduced compared to B-tree.<ref>{{cite ~~web~~news \|title=Index Overhead on a Growing Table \|date=10 October 2014 \|author=Mark Wong \|newspaper=2Ndquadrant \| Postgresql \|url=http://blog.2ndquadrant.com/index-overhead-growing-table/ }}</ref> Wong reports that B-tree slowed down additions to an unindexed 10GB table by 85%, but a comparable BRIN only had an overhead of 11%.<ref name="Wong, 2014" /> Line 87 ⟶ 83: === Dependence on table ordering === Multiple BRIN may be defined for different columns on a single table. However, there are restrictions. BRIN are only efficient if the ordering of the key values follows the organisation of blocks in the storage layer.<ref name="Richard Foote, 2012, I"/><ref name="Python Sweetness"/> In the simplest case, this could require the physical ordering of the table, which is often the creation order of the rows within it, to match the key's order. Where this key is a creation date, that may be a trivial requirement.<ref name="Wong, 2015"/>{{rp\|9}} Line 96 ⟶ 92: === Exadata Storage Indexes === BRIN have some similarities to [[Oracle Exadata]] ”"[[Storage Index]]~~es”~~es".<ref name="Herrera, 2013"/><ref name="Oracle, Exadata Storage Indexes">{{cite journal \|title=Smart Scans Meet Storage Indexes \|author=Arup Nanda Line 109 ⟶ 105: }}</ref> Exadata has the strong concept of a 'storage layer' in its architecture stack. Table data is held in blocks or 'storage cells' on the storage servers. These storage cells are [[block-level storage\|opaque to the storage server]] and are returned to the database engine on request, by their identifier. Previously, the database nodes must request all the storage cells in order to scan them.<ref name=Solarwinds>{{cite web \|title=Understanding Storage Indexes in Oracle Exadata \|date=2 June 2015 \|url=http://logicalread.solarwinds.com/storage-indexes-oracle-exadata-mc05/#.Vpjbp27fXU8 }}</ref> Line 118 ⟶ 115: \|url=https://richardfoote.wordpress.com/2012/10/04/exadata-storage-indexes-part-i-beginning-to-see-the-light/ }}</ref>{{efn-lr\|Foote<ref name="Richard Foote, 2012, I"/> describes the Index as holding "a flag to denote whether any Nulls exist". This is probably a typo: Oracle describes them as, "negative indexes" where "they identify the areas that definitely will not contain the values"<ref name="Oracle, Exadata Storage Indexes"/> In such a case, the flag would be more clearly described as denoting either "''Only'' Nulls exist" or if "Any ''non-''Nulls exist".}}<ref>{{cite web \|title=~~Exadata’s~~Exadata's Best Kept Secret: Storage Indexes \|author=Marc Fielding \|date=July 20, 2010 Line 132 ⟶ 129: == Development == Development for PostgreSQL was carried out as part of the [[AXLE project]], (Advanced Analytics for Extremely Large European Databases)<ref>{{Cite web \|title=AXLE project (Advanced Analytics for Extremely Large European Databases) \|year=2014 \|url=http://axleproject.eu/}}</ref> This work was partially funded by the European Union's [[Seventh Framework Programme]] (FP7/2007-2013).<ref>{{Cite web \|title=European Union's [[Seventh Framework Programme]] (FP7/2007-2013) under grant agreement n° 318633. \|url= }}</ref> Line 144 ⟶ 141: \|website=PostgreSQL \| url = http://www.postgresql.org/message-id/E1XmpRL-0001Zh-Sd@gemulon.postgresql.org \| ~~accessdate~~access-date = 2016-01-14 }}</ref> == See also == Line 153 ⟶ 150: == References == {{~~reflist~~Reflist\|30em}} [[Category:Database index techniques]]