Data structure alignment: Difference between revisions

Content deleted Content added
Fourdee (talk | contribs)
Compatibility: alignment on x86 and x64
Fourdee (talk | contribs)
m moving x86 info, re-arranging a few things
Line 21:
When a single memory word is accessed, the operation is atomic: the whole memory word is read or written at once, and other devices must wait until the read or write operation completes before they can access it. This may not be true for unaligned accesses that span multiple memory words. For example, the first word might be read by one device, both words written by another device, and then the second word read by the first device, so that the value read is neither the original value nor the updated value. Although such failures are rare, they can be very difficult to identify.
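The interleaving described above can be replayed deterministically in a single thread. The following is only a sketch: the type <code>two_words</code> and the functions <code>write_both</code> and <code>torn_read_demo</code> are hypothetical names introduced for illustration, and a 32-bit value is assumed to occupy two 16-bit memory words.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical memory holding one 32-bit value as two 16-bit words. */
struct two_words {
    uint16_t word[2];   /* word[0] = high half, word[1] = low half */
};

/* Writes both halves of the value, one word at a time. */
static void write_both(struct two_words *m, uint32_t v)
{
    m->word[0] = (uint16_t)(v >> 16);
    m->word[1] = (uint16_t)(v & 0xFFFFu);
}

/* Replays the sequence: device A reads the first word, device B writes
   both words, then device A reads the second word. */
static uint32_t torn_read_demo(uint32_t old_value, uint32_t new_value)
{
    struct two_words mem;
    uint16_t high, low;

    assert(old_value != new_value);   /* the demo needs two distinct values */
    write_both(&mem, old_value);
    high = mem.word[0];               /* A reads the first word (still old) */
    write_both(&mem, new_value);      /* B updates both words */
    low = mem.word[1];                /* A reads the second word (now new) */
    return ((uint32_t)high << 16) | low;
}
```

Calling <code>torn_read_demo(0x11112222, 0x33334444)</code> combines the old high half with the new low half, so the result matches neither input.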
 
==Architectures==
==Data Structure Padding==
===RISC===
Although the [[compiler]] (or [[interpreter (computing)|interpreter]]) normally allocates individual data items on aligned boundaries, data structures often have members with different alignment requirements. To maintain proper alignment the translator normally inserts additional unnamed data members so that each member is properly aligned. In addition the data structure as a whole may be padded with a final unnamed member. This allows each member of an array of structures to be properly aligned.
 
Padding is only inserted when a structure member is followed by a member with a larger alignment requirement or at the end of the structure. By changing the ordering of members in a structure, it is possible to change the amount of padding required to maintain alignment. For example, if members are sorted by ascending or descending alignment requirements, a minimal amount of padding is required. The minimal amount of padding required is always less than the largest alignment in the structure. Computing the maximum amount of padding required is more complicated, but is always less than the sum of the alignment requirements for all members minus twice the sum of the alignment requirements for the least aligned half of the structure members.
 
Although C and C++ do not allow the compiler to reorder structure members to save space, other languages might. It is also possible to tell most C and C++ compilers to "pack" the members of a structure to a certain level of alignment, e.g. "pack(2)" means align data members larger than a byte to a two-byte boundary so that any padding members are at most one byte long.
 
One use for such "packed" structures is to conserve memory. For example, a structure containing a single byte and a four-byte integer would require three additional bytes of padding. A large array of such structures would use 37.5% less memory if they are packed, although accessing each structure might take longer. This compromise may be considered a form of [[space-time tradeoff]].
 
Although "packed" structures are most frequently used to conserve memory space, they may also be used to format a data structure for transmission using a standard protocol. Since this depends upon the native byte ordering ([[endianness]]) of the processor matching the byte ordering of the protocol, this usage is not recommended.
 
==Unaligned Pointer Support==
Most RISC processors will generate an alignment fault when a load or store instruction accesses a misaligned address. This allows the operating system to emulate the misaligned access using other instructions. For example, the alignment fault handler might use byte loads or stores (which are always aligned) to emulate a larger load or store instruction.
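The byte-wise emulation mentioned above might look roughly like the following in C. This is a sketch of the idea rather than the code of any real fault handler; <code>load_u32_bytewise</code> and <code>store_u32_bytewise</code> are hypothetical names, and a 32-bit access is assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: reads a 32-bit value from any address using only
   byte accesses, which are always aligned. */
static uint32_t load_u32_bytewise(const void *address)
{
    const unsigned char *src = address;
    uint32_t value;
    unsigned char *dst = (unsigned char *)&value;
    size_t i;

    assert(address != NULL);
    for (i = 0; i < sizeof value; i++)
        dst[i] = src[i];              /* one byte load, one byte store */
    return value;                     /* same bytes, native byte order */
}

/* Hypothetical counterpart: writes a 32-bit value one byte at a time. */
static void store_u32_bytewise(void *address, uint32_t value)
{
    unsigned char *dst = address;
    const unsigned char *src = (const unsigned char *)&value;
    size_t i;

    assert(address != NULL);
    for (i = 0; i < sizeof value; i++)
        dst[i] = src[i];
}
```

Because the bytes are copied in memory order, the result is the same value an aligned load of those bytes would produce, whatever the byte order of the processor.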
 
Line 41 ⟶ 31:
Because these instructions are larger and slower than the normal memory load and store instructions, they should only be used when necessary. Most C and C++ compilers have an “unaligned” attribute that can be applied to pointers that need the unaligned instructions.
 
===x86 and x64===
While the [[x86]] architecture originally did not require aligned memory access and still works without it, many [[SSE2]] instructions on [[x86]] and [[x64]] CPUs ''do'' require their memory operands to be 128-bit (16-byte) aligned, and there can be substantial performance advantages from using aligned data on these architectures.
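A 16-byte alignment requirement of this kind can be requested and checked in C11 roughly as follows. This is a sketch under assumptions: <code>is_aligned</code> and <code>sse_buffer</code> are hypothetical names, and the check relies on the common (but implementation-defined) behaviour that converting a pointer to <code>uintptr_t</code> yields its address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns nonzero when ptr is a multiple of alignment.
   alignment must be a power of two. */
static int is_aligned(const void *ptr, size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return ((uintptr_t)ptr & (alignment - 1)) == 0;
}

/* C11 _Alignas asks the compiler to place this buffer on a 16-byte
   boundary, the alignment a 128-bit vector load would expect. */
static _Alignas(16) float sse_buffer[4];
```

Here <code>is_aligned(sse_buffer, 16)</code> holds, while the address four bytes further on is only 4-byte aligned and would not satisfy a 16-byte requirement.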
 
===Compatibility===
The advantage to supporting unaligned access is that it is easier to write compilers that do not need to align memory, at the cost of slower access. One way to increase performance in [[RISC]] processors, which are designed to maximize raw performance, is to require data to be loaded or stored on a word boundary. So although memory is commonly addressed in 8-bit bytes, a 32-bit integer or 64-bit floating-point number would be required to start on a 64-bit boundary on a 64-bit machine. The processor could flag a fault if it were asked to load a number that was not on such a boundary, but this would result in a slower call to a routine that would need to figure out which word or words contained the data and extract the equivalent value.
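Working out which word or words contain a given datum is simple address arithmetic. The sketch below assumes byte addresses and a fixed word size; <code>words_spanned</code> is a hypothetical helper, not part of any real fault handler.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Number of memory words touched by an access of `size` bytes starting
   at byte address `address`, for words of `word_size` bytes. */
static size_t words_spanned(uintptr_t address, size_t size, size_t word_size)
{
    uintptr_t first, last;

    assert(size > 0 && word_size > 0);
    first = address / word_size;               /* word holding the first byte */
    last = (address + size - 1) / word_size;   /* word holding the last byte */
    return (size_t)(last - first + 1);
}
```

With 8-byte words, a 4-byte access at address 4 stays within one word, whereas the same access at address 6 straddles two words and would need two loads plus shifting and masking to assemble the value.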
 
==Data Structure Padding==
While the [[x86]] originally did not require aligned memory access and still works without it, many [[SSE2]] instructions on [[x86]] and [[x64]] CPUs ''do'' require their memory operands to be 128-bit (16-byte) aligned, and there can be substantial performance advantages from using aligned data on these architectures.
Although the [[compiler]] (or [[interpreter (computing)|interpreter]]) normally allocates individual data items on aligned boundaries, data structures often have members with different alignment requirements. To maintain proper alignment the translator normally inserts additional unnamed data members so that each member is properly aligned. In addition the data structure as a whole may be padded with a final unnamed member. This allows each member of an array of structures to be properly aligned.
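The inserted padding can be observed with <code>offsetof</code> and <code>sizeof</code>. The structure <code>mixed</code> and the helper functions below are hypothetical names used only for illustration; the exact amounts depend on the platform, so the code measures them rather than assuming them (C11 is assumed for <code>_Alignof</code>).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct mixed {
    uint8_t  tag;     /* 1 byte */
    uint32_t value;   /* stricter alignment, so padding may precede it */
    uint8_t  flag;    /* 1 byte, possibly followed by trailing padding */
};

/* Bytes of unnamed padding between tag and value. */
static size_t padding_before_value(void)
{
    return offsetof(struct mixed, value) - sizeof(uint8_t);
}

/* Bytes of unnamed padding after the last member. */
static size_t trailing_padding(void)
{
    return sizeof(struct mixed)
         - (offsetof(struct mixed, flag) + sizeof(uint8_t));
}

/* Checks the two properties the padding exists to guarantee. */
static void check_mixed_layout(void)
{
    /* each member sits on a multiple of its own alignment */
    assert(offsetof(struct mixed, value) % _Alignof(uint32_t) == 0);
    /* the total size keeps every element of an array aligned */
    assert(sizeof(struct mixed) % _Alignof(struct mixed) == 0);
}
```

On a typical platform where <code>uint32_t</code> is 4-byte aligned, both helper functions would be expected to report three bytes, giving a 12-byte structure.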
 
Padding is only inserted when a structure member is followed by a member with a larger alignment requirement or at the end of the structure. By changing the ordering of members in a structure, it is possible to change the amount of padding required to maintain alignment. For example, if members are sorted by ascending or descending alignment requirements, a minimal amount of padding is required. The minimal amount of padding required is always less than the largest alignment in the structure. Computing the maximum amount of padding required is more complicated, but is always less than the sum of the alignment requirements for all members minus twice the sum of the alignment requirements for the least aligned half of the structure members.
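The effect of member ordering can be worked through with a small layout calculator. The sketch below models the usual rule (round each offset up to the member's alignment, then round the total up to the largest alignment); <code>member</code>, <code>round_up</code> and <code>struct_size</code> are hypothetical names, and a real compiler's layout may differ.

```c
#include <assert.h>
#include <stddef.h>

/* Size and alignment requirement of one structure member, in bytes. */
struct member {
    size_t size;
    size_t align;
};

/* Rounds offset up to the next multiple of align. */
static size_t round_up(size_t offset, size_t align)
{
    assert(align != 0);
    return (offset + align - 1) / align * align;
}

/* Total size of a structure whose members are laid out in the given
   order, including padding between members and at the end. */
static size_t struct_size(const struct member *m, size_t count)
{
    size_t offset = 0, max_align = 1, i;

    for (i = 0; i < count; i++) {
        offset = round_up(offset, m[i].align) + m[i].size;
        if (m[i].align > max_align)
            max_align = m[i].align;
    }
    return round_up(offset, max_align);
}
```

For members with (size, alignment) of (1,1), (4,4), (1,1), (2,2) in that order, the model gives 12 bytes, four of them padding. Sorted by descending alignment as (4,4), (2,2), (1,1), (1,1), the same members fit in 8 bytes with no padding.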
 
Although C and C++ do not allow the compiler to reorder structure members to save space, other languages might. It is also possible to tell most C and C++ compilers to "pack" the members of a structure to a certain level of alignment, e.g. "pack(2)" means align data members larger than a byte to a two-byte boundary so that any padding members are at most one byte long.
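As an illustration of the "pack(2)" case, the sketch below uses the <code>#pragma pack</code> directive, a compiler extension accepted by GCC, Clang and Microsoft's compiler rather than part of standard C. The structure <code>packed2</code> and the helper functions are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#pragma pack(push, 2)   /* cap member alignment at two bytes */
struct packed2 {
    uint8_t  tag;       /* offset 0 */
    uint32_t value;     /* placed on a two-byte boundary, not four */
};
#pragma pack(pop)       /* restore the previous packing */

/* Bytes of padding between tag and value. */
static size_t packed2_padding(void)
{
    return offsetof(struct packed2, value) - sizeof(uint8_t);
}

/* With two-byte packing, no padding gap should exceed one byte. */
static void check_packed2(void)
{
    assert(packed2_padding() <= 1);
}
```

With compilers that honour the pragma, <code>value</code> would be expected at offset 2 and the whole structure to occupy 6 bytes instead of the usual 8.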
 
One use for such "packed" structures is to conserve memory. For example, a structure containing a single byte and a four-byte integer would require three additional bytes of padding. A large array of such structures would use 37.5% less memory if they are packed, although accessing each structure might take longer. This compromise may be considered a form of [[space-time tradeoff]].
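The 37.5% figure follows from comparing 8 padded bytes with 5 packed bytes. The sketch below again relies on the <code>#pragma pack</code> compiler extension; <code>record_packed</code>, <code>record_padded</code> and <code>percent_saved</code> are hypothetical names, and the padded size of 8 assumes a 4-byte-aligned <code>uint32_t</code>.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#pragma pack(push, 1)   /* no padding at all */
struct record_packed {
    uint8_t  tag;
    uint32_t value;
};
#pragma pack(pop)

struct record_padded {  /* default layout chosen by the compiler */
    uint8_t  tag;
    uint32_t value;
};

/* Percentage of memory saved by going from padded_size to packed_size. */
static double percent_saved(size_t padded_size, size_t packed_size)
{
    assert(padded_size > 0 && packed_size <= padded_size);
    return 100.0 * (double)(padded_size - packed_size) / (double)padded_size;
}
```

Evaluating <code>percent_saved(sizeof(struct record_padded), sizeof(struct record_packed))</code> on such a platform reproduces the saving of three bytes in eight. The cost is that <code>value</code> is no longer aligned, so each access to it may be slower.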
 
Although "packed" structures are most frequently used to conserve memory space, they may also be used to format a data structure for transmission using a standard protocol. Since this depends upon the native byte ordering ([[endianness]]) of the processor matching the byte ordering of the protocol, this usage is not recommended.
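One common alternative to overlaying a packed structure on a protocol buffer is to write each field byte by byte in the order the protocol defines. The sketch below assumes a protocol that sends 32-bit fields most significant byte first; <code>put_u32_be</code> and <code>get_u32_be</code> are hypothetical helper names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Writes v into out[0..3], most significant byte first, regardless of
   the byte order or alignment rules of the host processor. */
static void put_u32_be(unsigned char *out, uint32_t v)
{
    assert(out != NULL);
    out[0] = (unsigned char)(v >> 24);
    out[1] = (unsigned char)(v >> 16);
    out[2] = (unsigned char)(v >> 8);
    out[3] = (unsigned char)v;
}

/* Reads a 32-bit value stored most significant byte first. */
static uint32_t get_u32_be(const unsigned char *in)
{
    assert(in != NULL);
    return ((uint32_t)in[0] << 24) | ((uint32_t)in[1] << 16)
         | ((uint32_t)in[2] << 8)  |  (uint32_t)in[3];
}
```

Because only shifts and byte accesses are involved, the same bytes are produced on little-endian and big-endian processors, and the buffer needs no particular alignment.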
 
==Typical alignment of C structs on x86==