Data structure alignment: Difference between revisions

Computing padding: Just call it a bitwise AND; yes, it's a Boolean AND on each of the bits, but it's not a Boolean operation on the entire address. Add a second "is" to make it a bit clearer.
m unpiped links using script
Line 6:
{{anchor|1|2|4|8|16|256|4096|Page|Inpage}}<!-- parked anchors for common alignments/boundaries to improve incoming redirects -->
{{Use dmy dates|date=January 2020|cs1-dates=y}}
 
'''Data structure alignment''' is the way data is arranged and accessed in [[computer memory]]. It consists of three separate but related issues: '''data alignment''', '''data structure padding''', and '''packing'''.
 
The [[Central processing unit|CPU]] in modern computer hardware performs reads and writes to memory most efficiently when the data is ''naturally aligned'', which generally means that the data's memory address is a multiple of the data size. For instance, in a 32-bit architecture, a 4-byte datum is naturally aligned if it is stored in four consecutive bytes and the first byte lies on a 4-byte boundary.
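
The "multiple of the data size" rule can be written as a one-line check. The following is a minimal sketch; the function name <code>is_naturally_aligned</code> is hypothetical and chosen only for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: reports whether `address` is a multiple of `size`,
   which is the usual meaning of "naturally aligned" for a datum of that size. */
bool is_naturally_aligned(uintptr_t address, size_t size) {
    return size != 0 && address % size == 0;
}
```

Under this definition, address 0x1000 is naturally aligned for a 4-byte datum, while 0x1002 is aligned only for data of 2 bytes or less.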
 
''Data alignment'' is the aligning of elements according to their natural alignment. To ensure natural alignment, it may be necessary to insert some ''padding'' between structure elements or after the last element of a structure. For example, on a 32-bit machine, a data structure containing a 16-bit value followed by a 32-bit value could have 16 bits of ''padding'' between the 16-bit value and the 32-bit value to align the 32-bit value on a 32-bit boundary. Alternatively, one can ''pack'' the structure, omitting the padding, which may lead to slower access, but uses three quarters as much memory (6 bytes instead of 8).
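
The 16-bit/32-bit example can be sketched in C as follows. The type and function names are hypothetical, the padded layout is what common compilers produce rather than something the C standard guarantees, and the <code>packed</code> attribute is a GCC/Clang extension, not standard C.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical example type: a 16-bit value followed by a 32-bit value. */
struct padded {
    uint16_t a;   /* 2 bytes at offset 0 */
                  /* compilers typically insert 2 bytes of padding here */
    uint32_t b;   /* 4 bytes, typically placed at offset 4 */
};

/* Packed variant. Assumption: the compiler supports the GCC/Clang
   `packed` attribute; other compilers use different syntax. */
struct __attribute__((packed)) packed_pair {
    uint16_t a;   /* 2 bytes at offset 0 */
    uint32_t b;   /* 4 bytes at offset 2, no longer naturally aligned */
};

/* Small accessors so the layout can be inspected or tested. */
size_t padded_size(void)     { return sizeof(struct padded); }
size_t padded_offset_b(void) { return offsetof(struct padded, b); }
size_t packed_size(void)     { return sizeof(struct packed_pair); }
size_t packed_offset_b(void) { return offsetof(struct packed_pair, b); }
```

On typical targets where a 32-bit integer has 4-byte alignment, the padded structure occupies 8 bytes and the packed one 6 bytes.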
Line 79 ⟶ 80:
 
===Computing padding===
The following formulas provide the number of padding bytes required to align the start of a data structure (where ''mod'' is the [[Modulo operation|modulo]] operator):
 padding = (align - (offset mod align)) mod align
 aligned = offset + padding
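
The two formulas translate directly into C. This is a minimal sketch; the function names <code>padding_for</code> and <code>align_up</code> are hypothetical, and <code>align</code> is assumed to be nonzero.

```c
#include <stddef.h>

/* Number of padding bytes needed so that `offset` becomes a multiple of
   `align`. Assumes align != 0. The outer modulo makes the result 0
   (rather than `align`) when `offset` is already aligned. */
size_t padding_for(size_t offset, size_t align) {
    return (align - (offset % align)) % align;
}

/* Next offset at or after `offset` that is a multiple of `align`. */
size_t align_up(size_t offset, size_t align) {
    return offset + padding_for(offset, align);
}
```

For example, with <code>offset = 5</code> and <code>align = 4</code>, the padding is 3 bytes and the aligned offset is 8; an offset of 8 needs no padding.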
Line 254 ⟶ 255:
Alignment concerns can affect areas much larger than a C structure when the purpose is the efficient mapping of that area through a hardware [[CPU cache#Address translation|address translation]] mechanism (PCI remapping, operation of a [[memory management unit|MMU]]).
 
For instance, on a 32-bit operating system, a 4&nbsp;[[kibibyte|KiB]] (4096 bytes) page is not just an arbitrary 4&nbsp;KiB chunk of data. Instead, it is usually a region of memory that is aligned on a 4&nbsp;KiB boundary. This is because aligning a page on a page-sized boundary lets the hardware map a virtual address to a physical address by substituting the higher bits in the address, rather than doing complex arithmetic.
 
Example: Assume that we have a TLB mapping of virtual address 0x2CFC7000 to physical address 0x12345000. (Note that both these addresses are aligned at 4&nbsp;KiB boundaries.) Accessing data located at virtual address va=0x2CFC7ABC causes a TLB resolution of 0x2CFC7 to 0x12345 to issue a physical access to pa=0x12345ABC. Here, the 20/12-bit split conveniently matches the hexadecimal representation split at 5/3 digits. The hardware can implement this translation by simply combining the upper 20&nbsp;bits of the physical address (0x12345) and the lower 12&nbsp;bits of the virtual address (0xABC). This is also referred to as virtually indexed (ABC) physically tagged (12345).
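
The bit substitution in this example can be imitated in software. The sketch below assumes 32-bit addresses and 4&nbsp;KiB pages; the names <code>PAGE_SHIFT</code>, <code>PAGE_OFFSET_MASK</code> and <code>translate</code> are hypothetical, and a real MMU does this in hardware after a TLB or page-table lookup.

```c
#include <stdint.h>

/* Assumption: 4 KiB pages, so the low 12 bits are the in-page offset. */
#define PAGE_SHIFT       12u
#define PAGE_OFFSET_MASK ((UINT32_C(1) << PAGE_SHIFT) - 1u)   /* 0xFFF */

/* Combine the upper 20 bits of the physical page base with the lower
   12 bits of the virtual address. No addition is needed because the
   page base is aligned on a 4 KiB boundary. */
uint32_t translate(uint32_t va, uint32_t phys_page_base) {
    return (phys_page_base & ~PAGE_OFFSET_MASK) | (va & PAGE_OFFSET_MASK);
}
```

With the values from the example, <code>translate(0x2CFC7ABC, 0x12345000)</code> yields 0x12345ABC.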