Microsoft SQL Server: Difference between revisions

Content deleted Content added
m Fixed k capitalization in frequency and storage capacity units (via WP:JWB)
m Restored kB to KB per Kilobyte#Use of term (via WP:JWB)
Line 103:
Microsoft SQL Server also allows user-defined composite types (UDTs) to be defined and used. It also makes server statistics available as virtual tables and views (called Dynamic Management Views or DMVs). In addition to tables, a database can also contain other objects including [[view (database)|views]], [[stored procedure]]s, [[index (database)|indexes]] and [[constraint (database)|constraints]], along with a transaction log. A SQL Server database can contain a maximum of 2<sup>31</sup> objects, and can span multiple OS-level files with a maximum file size of 2<sup>60</sup> bytes (1 exabyte).<ref name="storageengine" /> The data in the database are stored in primary data files with an extension <code>.mdf</code>. Secondary data files, identified with a <code>.ndf</code> extension, are used to allow the data of a single database to be spread across more than one file, and optionally across more than one file system. Log files are identified with the <code>.ldf</code> extension.<ref name="storageengine" />
 
Storage space allocated to a database is divided into sequentially numbered ''pages'', each 8&nbsp;kBKB in size. A ''page'' is the basic unit of [[Input/output|I/O]] for SQL Server operations. A page is marked with a 96-byte header which stores metadata about the page including the page number, page type, free space on the page and the ID of the object that owns it. The page type defines the data contained in the page. This data includes: data stored in the database, an index, an allocation map, which holds information about how pages are allocated to tables and indexes; and a change map which holds information about the changes made to other pages since last backup or logging, or contain large data types such as image or text. While a page is the basic unit of an I/O operation, space is actually managed in terms of an ''extent'' which consists of 8 pages. A database object can either span all 8 pages in an extent ("uniform extent") or share an extent with up to 7 more objects ("mixed extent"). A row in a database table cannot span more than one page, so is limited to 8&nbsp;kBKB in size. However, if the data exceeds 8&nbsp;kBKB and the row contains ''varchar'' or ''varbinary'' data, the data in those columns are moved to a new page (or possibly a sequence of pages, called an ''allocation unit'') and replaced with a pointer to the data.<ref name="pages" />
 
For physical storage of a table, its rows are divided into a series of partitions (numbered 1 to n). The partition size is user defined; by default all rows are in a single partition. A table is split into multiple partitions in order to spread a database over a [[cluster computing|computer cluster]]. Rows in each partition are stored in either [[B-tree]] or [[heap (data structure)|heap]] structure. If the table has an associated, clustered [[index (database)|index]] to allow fast retrieval of rows, the rows are stored in-order according to their index values, with a B-tree providing the index. The data is in the leaf node of the leaves, and other nodes storing the index values for the leaf data reachable from the respective nodes. If the index is non-clustered, the rows are not sorted according to the index keys. An indexed [[view (database)|view]] has the same storage structure as an indexed table. A table without a clustered index is stored in an unordered heap structure. However, the table may have non-clustered indices to allow fast retrieval of rows. In some situations the heap structure has performance advantages over the clustered structure. Both heaps and B-trees can span multiple allocation units.<ref name="table" />
 
=== Buffer management ===
SQL Server [[data buffer|buffer]]s pages in RAM to minimize disk I/O. Any 8&nbsp;kBKB page can be buffered in-memory, and the set of all pages currently buffered is called the buffer cache. The amount of memory available to SQL Server decides how many pages will be cached in memory. The buffer cache is managed by the ''Buffer Manager''. Either reading from or writing to any page copies it to the buffer cache. Subsequent reads or writes are redirected to the in-memory copy, rather than the on-disc version. The page is updated on the disc by the Buffer Manager only if the in-memory cache has not been referenced for some time. While writing pages back to disc, [[asynchronous I/O]] is used whereby the [[I/O operation]] is done in a background thread so that other operations do not have to wait for the I/O operation to complete. Each page is written along with its [[checksum]] when it is written. When reading the page back, its checksum is computed again and matched with the stored version to ensure the page has not been damaged or tampered with in the meantime.<ref name="Buffer Management" />
 
=== Concurrency and locking ===