Data structure alignment: Difference between revisions

Roboto de Ajvol (talk | contribs)
{{context}}
 
'''Data structure alignment''' is the way data is arranged and accessed in computer memory. It consists of two separate but related issues: ''data alignment'' and ''data structure padding''. Data alignment is the placement of a particular [[Data (computing)|datum]] in computer memory relative to address boundaries that depend on the [[Data (computing)|datum]] type and processor characteristics. Aligning data usually means allocating memory addresses such that each primitive [[Data (computing)|datum]] is assigned a memory address that is a multiple of its size. Data structure padding is the insertion of unnamed members into a data structure to preserve the alignment of the structure members.
 
Although data structure alignment is a fundamental issue for all modern computers, many programming languages and language implementations handle data alignment automatically. Certain C and C++ implementations, as well as assembly language, allow at least partial control of data structure padding, which may be useful in special circumstances.
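
The padding described above can be observed from C. The following is a minimal sketch, assuming a C11 compiler (for <code>_Alignof</code>); the structure <code>sample</code> and both helper functions are hypothetical names introduced only for illustration. The exact offsets and sizes are implementation-dependent, so the code reports them instead of assuming particular values.

```c
#include <assert.h>
#include <stddef.h>   /* offsetof, size_t */
#include <stdint.h>   /* int32_t */

/* Hypothetical record used only for illustration. */
struct sample {
    char    tag;    /* 1 byte */
    int32_t count;  /* 4 bytes; a compiler typically inserts padding before it */
    char    flag;   /* 1 byte */
    double  value;  /* a compiler typically inserts padding before it */
};

/* Number of unnamed padding bytes between `tag` and `count`. */
size_t padding_before_count(void) {
    return offsetof(struct sample, count)
         - (offsetof(struct sample, tag) + sizeof(char));
}

/* Returns 1 if each checked member starts at a multiple of its alignment. */
int members_are_aligned(void) {
    return offsetof(struct sample, count) % _Alignof(int32_t) == 0
        && offsetof(struct sample, value) % _Alignof(double) == 0;
}
```

On many common implementations <code>padding_before_count()</code> returns 3, but this is not guaranteed. What can be relied on is that <code>members_are_aligned()</code> returns 1 and that <code>sizeof(struct sample)</code> is at least the sum of the member sizes.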
A [[computer memory|memory]] address ''a'' is said to be ''n''-byte aligned when ''n'' is a power of two and ''a'' is a multiple of ''n'' [[byte|bytes]]. In this context a byte is the smallest unit of memory access, i.e. each memory address specifies a different byte. An ''n''-byte aligned address has at least ''log<sub>2</sub> n'' least-significant zero bits when expressed in [[Binary numeral system|binary]].
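
The definition above translates directly into a bit test: because ''n'' is a power of two, ''a'' is a multiple of ''n'' exactly when the low ''log<sub>2</sub> n'' bits of ''a'' are zero. The following is a minimal sketch in C; the function names <code>is_power_of_two</code> and <code>is_aligned</code> are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stdint.h>   /* uintptr_t */

/* Returns 1 if n is a nonzero power of two. */
int is_power_of_two(uintptr_t n) {
    return n != 0 && (n & (n - 1)) == 0;
}

/* Returns 1 if address a is n-byte aligned.
   n - 1 is a mask of the low log2(n) bits, which must all be zero. */
int is_aligned(uintptr_t a, uintptr_t n) {
    assert(is_power_of_two(n));   /* the definition requires a power of two */
    return (a & (n - 1)) == 0;
}
```

For example, the address <code>0x1004</code> is 4-byte aligned but not 8-byte aligned, and every address is 1-byte aligned.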
 
A memory access is said to be ''aligned'' when the [[Data (computing)|datum]] being accessed is ''n'' bytes long and the [[Data (computing)|datum]] address is ''n''-byte aligned. When a memory access is not aligned, it is said to be ''misaligned''. Note that by definition byte memory accesses are always aligned.
 
A memory pointer that refers to primitive data that is ''n'' bytes long is said to be ''aligned'' if it is only allowed to contain addresses that are ''n''-byte aligned; otherwise it is said to be ''unaligned''. A memory pointer that refers to a data aggregate (a data structure or array) is ''aligned'' if (and only if) each primitive [[Data (computing)|datum]] in the aggregate is aligned.
 
Note that the definitions above assume that the size of each primitive [[Data (computing)|datum]] is a power of two bytes. When this is not the case (as with 80-bit floating-point on x86, which is 10 bytes long), the context determines the conditions under which the [[Data (computing)|datum]] is considered aligned.
 
== Problems ==
A computer accesses memory a single memory word at a time. As long as the memory word size is at least as large as the largest primitive data type supported by the computer, aligned accesses will always access a single memory word. This may not be true for misaligned data accesses.
 
If the highest and lowest bytes of a [[Data (computing)|datum]] are not within the same memory word, the computer must split the [[Data (computing)|datum]] access into multiple memory accesses. Generating and coordinating these accesses requires additional, complex circuitry. To handle the case where the memory words are in different memory pages, the processor must either verify that both pages are present before executing the instruction or be able to handle a [[translation lookaside buffer|TLB]] miss or a [[page fault]] on any memory access during the instruction execution.
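
The condition in the first sentence of the preceding paragraph can be expressed as a comparison of word indices. The following is a minimal sketch in C, assuming a word size that is a power of two; the function name <code>spans_multiple_words</code> and its parameters are hypothetical and serve only as illustration. It models the check arithmetically and does not perform any memory access.

```c
#include <assert.h>
#include <stdint.h>   /* uintptr_t */

/* Returns 1 if a datum of `size` bytes starting at `addr` has its lowest
   and highest bytes in different memory words of `word_size` bytes. */
int spans_multiple_words(uintptr_t addr, uintptr_t size, uintptr_t word_size) {
    assert(size > 0);
    /* Assumption of this sketch: word_size is a nonzero power of two. */
    assert(word_size != 0 && (word_size & (word_size - 1)) == 0);

    uintptr_t first_word = addr / word_size;               /* word holding the lowest byte  */
    uintptr_t last_word  = (addr + size - 1) / word_size;  /* word holding the highest byte */
    return first_word != last_word;
}
```

With 4-byte words, a 4-byte datum at address <code>0x1000</code> lies in a single word, whereas the same datum at <code>0x1002</code> spans two words and would need two memory accesses.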
 
When a single memory word is accessed, the operation is atomic, i.e. the whole memory word is read or written at once, and other devices must wait until the read or write operation completes before they can access it. This may not be true for misaligned accesses that span multiple memory words: for example, the first word might be read by one device, then both words written by another device, and then the second word read by the first device, so that the value read is neither the original value nor the updated value. Although such failures are rare, they can be very difficult to identify.