Reverted to version as of 18:31, 6 October 2006, links added to promote a site are not good external links per 3rd guideline. |
No edit summary |
Line 3:
'''Data structure alignment''' is the way [[Data (computing)|data]] is arranged in [[physical memory]].
== Definitions ==
A memory access is said to be ''aligned'' when the datum being accessed is ''n'' bytes long and the datum address is ''n''-byte aligned. When a memory access is not aligned, it is said to be ''misaligned''. Note that by definition byte memory accesses are always aligned.
A memory pointer that refers to primitive data that is ''n'' bytes long is said to be ''aligned'' if it is only allowed to contain addresses that are ''n''-byte aligned; otherwise it is said to be ''unaligned''. A memory pointer that refers to a data aggregate (a data structure or array) is ''aligned'' if (and only if) each primitive datum in the aggregate is aligned.
Note that the definitions above assume that the length of each primitive datum is a power of two bytes. When this is not the case (as with 80-bit floating-point on x86), the context determines whether the datum is considered aligned or not.
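The definition of an ''n''-byte aligned address above can be illustrated with a short C sketch. The function name <code>is_aligned</code> is hypothetical and not taken from any library, and the check assumes, as the definitions do, that ''n'' is a power of two.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns true if addr is a multiple of n.
   Assumes n is a power of two, so n - 1 is a mask of the low bits
   that must all be zero in an n-byte aligned address. */
bool is_aligned(uintptr_t addr, size_t n)
{
    return (addr & (n - 1)) == 0;
}
```

Under this sketch, a 4-byte access at address 0x1002 is misaligned, while a 2-byte access at the same address is aligned; with ''n'' = 1 the mask is zero, which matches the remark that byte accesses are always aligned.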
== Problems ==
A computer accesses memory a single memory word at a time. If the highest and lowest bytes in a datum are not within the same memory word, the computer must split the datum access into multiple memory accesses. This requires a lot of complex circuitry to generate the memory accesses and coordinate them.
If the highest and lowest bytes are in different memory-management pages, the problems can be even worse, because accessing either page could result in a [[translation lookaside buffer|TLB]] miss or a [[page fault]].
As long as the memory word size is at least as large as the largest primitive data type supported by the computer, aligned accesses will always access a single memory word.
Some of the problems caused by unaligned access are:
* Extra transistors on the CPU are required to support accesses which are not word-aligned.
* Accesses across [[cache-lines]] may require evicting two cache-lines.
* Accesses across page boundaries can incur two [[translation lookaside buffer|TLB]] misses and could even require swapping in both pages from disk.
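The condition described above, where the highest and lowest bytes of a datum fall in different memory words, cache lines or pages, can be sketched in C as follows. The function name <code>crosses_boundary</code> and its parameters are hypothetical; the block size (word, cache line or page) is passed in because it varies between machines.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: true if the size bytes starting at addr do not
   all lie within one unit-byte block (a memory word, cache line or page).
   Assumes size >= 1 and unit >= 1. */
bool crosses_boundary(uintptr_t addr, size_t size, size_t unit)
{
    uintptr_t first_block = addr / unit;              /* block of lowest byte  */
    uintptr_t last_block  = (addr + size - 1) / unit; /* block of highest byte */
    return first_block != last_block;
}
```

For example, a 4-byte access at address 2 with 4-byte words touches bytes 2 to 5 and so spans two words, whereas the same access at address 4 stays within one word.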
==Compatibility==
The advantage of supporting unaligned access is that it is easier to write compilers that do not need to align memory, at the cost of slower access. One way to increase performance in [[RISC]] processors, which are designed to maximize raw performance, is to require data to be loaded or stored on a word boundary. So although memory is commonly addressed in 8-bit bytes, loading a 32-bit integer or 64-bit floating-point number would be required to start on a 64-bit boundary on a 64-bit machine. The processor could flag a fault if it were asked to load a number which was not on such a boundary, but this would result in a slower call to a routine which would need to figure out which word or words contained the data and extract the equivalent value.
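The kind of routine described above, which works out which word or words contain the data and extracts the value, might look like the following C sketch. It is an illustration under stated assumptions, not the handler of any particular operating system: memory is modelled as an array of 32-bit words, byte order is assumed to be little-endian, and the name <code>load32_emulated</code> is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sketch: read a 32-bit value at an arbitrary byte address
   using only aligned word loads. Assumes little-endian byte order, i.e.
   byte address a occupies bits 8*(a % 4) and up of word a / 4. */
uint32_t load32_emulated(const uint32_t *words, size_t byte_addr)
{
    size_t   index = byte_addr / 4;                 /* word holding the lowest byte */
    unsigned shift = (unsigned)(byte_addr % 4) * 8; /* bit offset within that word  */
    uint32_t lo    = words[index];

    if (shift == 0)
        return lo;                                  /* aligned: one load suffices   */

    uint32_t hi = words[index + 1];                 /* second aligned load          */
    return (lo >> shift) | (hi << (32 - shift));    /* splice the two pieces        */
}
```

The misaligned path costs two loads plus shifting and masking, which is why handling the fault in software is slower than an aligned access.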
Line 23 ⟶ 30:
This caused difficulty when the team from [[Mosaic Software]] ported their [[Twin Spreadsheet]] to the [[68000]]-based [[Atari ST]]. The Intel [[8086]] architecture had no such restrictions. {{fact}}
It would also cause difficulties in porting Microsoft Office to Windows NT on [[MIPS]], [[DEC Alpha|Alpha]] and [[PowerPC]] for [[NEC]], [[Digital Equipment Corporation|DEC]] and [[IBM]] respectively. Since the software was not written with such restrictions in mind, designers had to set a bit in the operating system to enable non-aligned data access. However, since this bit was masked with other flags which were used elsewhere, it was impossible to keep the operating system in a state in which it would not fault on non-aligned data.
==Typical alignment of C structs on x86==