A common problem in computer programming is word alignment. One way to increase performance, especially on RISC processors, which are designed to maximize raw performance, is to require that data be loaded or stored on a word boundary. Although memory is commonly addressed in 8-bit bytes, such a processor requires a multi-byte value to start at an address that is a multiple of its size: a 32-bit integer at a multiple of 4 bytes, a 64-bit floating point number at a multiple of 8 bytes. If asked to load a number that is not on such a boundary, the processor can either flag a fault or call a routine that works out which word or words contain the data and extracts the equivalent value.
This caused difficulty when the team from Mosaic Software ported their Twin Spreadsheet to the 68000-based Atari ST. The Intel 8086 architecture, by contrast, had no such restriction. It also caused difficulties in porting Microsoft Office to Windows NT on MIPS and PowerPC for NEC and IBM. Since the software was not written with such restrictions in mind, designers had to set a bit in the operating system to enable access to non-aligned data. However, since this bit was masked together with other flags, it was impossible to keep the operating system from faulting on non-aligned data when other modules used the other flags. This may have been a major factor in the abandonment of Windows NT on non-Intel processors, which failed as platforms for hosting common Windows applications, and one more reason for the dominance of the x86 architecture over technologically more elegant rivals.
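The software fallback described above, reading a value that does not sit on its natural boundary, can be approximated in portable C by copying the bytes into an aligned variable. This is a minimal sketch, not what any particular operating system does; is_aligned and load_u32_unaligned are hypothetical helper names introduced here for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Returns 1 if address p is a multiple of alignment, 0 otherwise.
   alignment must be a power of two. */
static int is_aligned(const void *p, size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return ((uintptr_t)p & (alignment - 1)) == 0;
}

/* Reads a 32-bit value from any address, aligned or not.
   memcpy is defined in terms of bytes, so the program itself never
   performs a misaligned word access. */
static uint32_t load_u32_unaligned(const void *p)
{
    uint32_t value;
    memcpy(&value, p, sizeof value);
    return value;
}
```

On processors that tolerate non-aligned loads a compiler may turn the memcpy into a single load instruction; on stricter processors it would typically emit byte-wise accesses instead.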
Technical View
Data structure members are stored sequentially in memory, so that in the structure below the member Data1 will always precede Data2 and Data2 will always precede Data3:
    struct MyData
    {
        short Data1;
        short Data2;
        short Data3;
    };
If the type "short" is stored in two bytes of memory, then each member of the data structure depicted above would be aligned to a 2-byte boundary. Data1 would be at offset 0, Data2 at offset 2 and Data3 at offset 4. The size of this structure would be 6 bytes.
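The offsets quoted above can be checked with the standard offsetof macro. The following is a small sketch that assumes a platform where short occupies two bytes; print_mydata_layout is a hypothetical helper name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

struct MyData {
    short Data1;
    short Data2;
    short Data3;
};

/* The C standard guarantees that the first member starts at offset 0. */
static_assert(offsetof(struct MyData, Data1) == 0,
              "first member starts at offset 0");

/* Prints the offset of each member and the total size of the structure. */
static void print_mydata_layout(void)
{
    printf("Data1 at offset %zu\n", offsetof(struct MyData, Data1));
    printf("Data2 at offset %zu\n", offsetof(struct MyData, Data2));
    printf("Data3 at offset %zu\n", offsetof(struct MyData, Data3));
    printf("size of MyData: %zu\n", sizeof(struct MyData));
}
```

Calling print_mydata_layout() from main reports the layout that the compiler in use actually chose.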
The type of each member of the structure usually has a required alignment, meaning that it will, unless otherwise requested by the programmer, be aligned on a pre-determined boundary. As a rule of thumb an integral data member will align to a boundary equal to its own size. The following typical requirements are valid for compilers from Microsoft and Borland:
A byte aligns to any byte boundary.
A short word (consisting of two bytes) aligns to a two byte boundary.
A long word (four bytes) aligns to a four byte boundary.
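The alignment requirements listed above can be queried directly in C11 with alignof from <stdalign.h>. This sketch uses the fixed-width types int8_t, int16_t and int32_t as stand-ins for byte, short word and long word (an assumption, since the widths of short and long vary between platforms); print_alignments is a hypothetical helper name.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdint.h>
#include <stdio.h>

/* char always has an alignment requirement of 1. */
static_assert(alignof(char) == 1, "a byte aligns to any byte boundary");

/* Prints the size and the required alignment of three integer types. */
static void print_alignments(void)
{
    printf("int8_t : size %zu, alignment %zu\n",
           sizeof(int8_t), alignof(int8_t));
    printf("int16_t: size %zu, alignment %zu\n",
           sizeof(int16_t), alignof(int16_t));
    printf("int32_t: size %zu, alignment %zu\n",
           sizeof(int32_t), alignof(int32_t));
}
```

The values are chosen by the compiler and the platform ABI, so the rule of thumb should be checked rather than assumed when a layout matters.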
Here is a structure with members of various types whose sizes total 8 bytes before padding (byte stands for a one-byte type such as char, and long is assumed to be four bytes):
    struct MixedData
    {
        byte Data1;
        short Data2;
        long Data3;
        byte Data4;
    };
After compilation the data structure will be supplemented with padding bytes to ensure proper alignment for each of its members:
    struct MixedData  /* after compilation */
    {
        byte Data1;
        byte Padding1[1];  /* aligns Data2 to a 2-byte boundary */
        short Data2;
        long Data3;
        byte Data4;
        byte Padding2[3];  /* rounds the size up to a multiple of 4 bytes */
    };
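The inserted padding is invisible in source code, but its effect shows up in the member offsets. This sketch declares the structure without explicit padding and lets the compiler add it; uint8_t, int16_t and int32_t replace byte, short and long so that the sizes are 1, 2 and 4 bytes (an assumption of this example), and print_mixeddata_layout is a hypothetical helper name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

struct MixedData {
    uint8_t Data1;
    int16_t Data2;
    int32_t Data3;
    uint8_t Data4;
};

/* The C standard guarantees that the first member starts at offset 0. */
static_assert(offsetof(struct MixedData, Data1) == 0,
              "first member starts at offset 0");

/* Prints where the compiler placed each member and the padded size. */
static void print_mixeddata_layout(void)
{
    printf("Data1 at offset %zu\n", offsetof(struct MixedData, Data1));
    printf("Data2 at offset %zu\n", offsetof(struct MixedData, Data2));
    printf("Data3 at offset %zu\n", offsetof(struct MixedData, Data3));
    printf("Data4 at offset %zu\n", offsetof(struct MixedData, Data4));
    printf("size of MixedData: %zu\n", sizeof(struct MixedData));
}
```

With the alignment rules given earlier, Data2 lands at offset 2 rather than 1, which is the one byte of padding after Data1.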
The compiled size of the structure is now 12 bytes. It is important to note that padding is also added after the last member, so that the total size of the structure is a multiple of the alignment of its largest member; this keeps every element of an array of such structures properly aligned. In this case 3 bytes are added after the last member to round the size of the structure up to a multiple of the size of a long word.
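The arithmetic behind these numbers can be sketched as a small layout calculator. It assumes that every member aligns to a boundary equal to its own size, as in the rule of thumb given earlier, which real compilers do not always follow; align_up and padded_size are hypothetical helper names.

```c
#include <assert.h>
#include <stddef.h>

/* Rounds offset up to the next multiple of alignment. */
static size_t align_up(size_t offset, size_t alignment)
{
    assert(alignment > 0);
    return (offset + alignment - 1) / alignment * alignment;
}

/* Computes the padded size of a structure whose members have the given
   sizes, assuming each member aligns to its own size. If offsets is not
   NULL, the offset of each member is stored in it. */
static size_t padded_size(const size_t *sizes, size_t count, size_t *offsets)
{
    size_t offset = 0;
    size_t max_align = 1;

    for (size_t i = 0; i < count; i++) {
        size_t alignment = sizes[i];

        /* Padding before the member, if its boundary is not yet reached. */
        offset = align_up(offset, alignment);
        if (offsets != NULL)
            offsets[i] = offset;
        offset += sizes[i];

        if (alignment > max_align)
            max_align = alignment;
    }

    /* Trailing padding: round up to the largest member alignment. */
    return align_up(offset, max_align);
}
```

For member sizes 1, 2, 4 and 1 the members end at byte 9, and rounding up to a multiple of 4 gives 12. Because element i of an array starts at i times the structure size, a size that is a multiple of the largest alignment keeps each element's members on their boundaries.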
It’s possible to change the alignment of structures to reduce the memory they require (or to conform to an existing format) by changing the compiler’s alignment (or “packing”) of structure members.
Requesting that the “MixedData” structure above be aligned to a one-byte boundary makes the compiler discard the pre-determined alignment of the members, so no padding bytes are inserted and the structure occupies 8 bytes.
While there is no standard way of defining the alignment of structure members, some compilers use #pragma directives to specify packing inside source files. Here is an example:
    #pragma pack( push )  /* push current alignment to stack */
    #pragma pack( 1 )     /* set alignment to 1 byte boundary */
    struct MyPackedData
    {
        byte Data1;
        long Data2;
        byte Data3;
    };
    #pragma pack( pop )   /* restore original alignment from stack */
This structure would have a compiled size of 6 bytes. The above directives are available in compilers from Microsoft, Borland and many others.
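The effect of packing can be confirmed at compile time. This sketch assumes a compiler that accepts the pack pragmas shown above (GCC and Clang do, in addition to Microsoft's compiler) and uses uint8_t and int32_t in place of byte and long so that the member sizes are fixed; MyUnpackedData is a hypothetical comparison structure that does not appear in the example above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#pragma pack(push)   /* save the current packing setting */
#pragma pack(1)      /* pack members on 1-byte boundaries */
struct MyPackedData {
    uint8_t Data1;
    int32_t Data2;
    uint8_t Data3;
};
#pragma pack(pop)    /* restore the saved packing setting */

/* The same members with default alignment, for comparison. */
struct MyUnpackedData {
    uint8_t Data1;
    int32_t Data2;
    uint8_t Data3;
};

/* Compilation fails here if the pragma did not take effect. */
static_assert(sizeof(struct MyPackedData) == 6,
              "packed size is 1 + 4 + 1 bytes");
static_assert(offsetof(struct MyPackedData, Data2) == 1,
              "no padding before Data2");
```

With default alignment the same members would usually occupy 12 bytes. Packing trades that space for non-aligned members, which may be slower to access, and taking a pointer to a packed member such as Data2 can reintroduce the faults described at the top of the article.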