In computing, '''[[file system]] [[fragmentation (computer)|fragmentation]]''', sometimes called '''file system aging''', is the inability of a file system to lay out related data sequentially (contiguously), an inherent phenomenon in [[computer storage|storage]]-backed file systems that allow in-place modification of their contents. It is a special case of [[fragmentation (computer)#Data fragmentation|data fragmentation]]. File system fragmentation introduces disk head seeks, which are known to hinder [[throughput]].
==Why fragmentation occurs==
Initially, when a file system is initialized on a partition (the partition is formatted for the file system), the entire space allotted is empty.<ref>The partition is not ''completely empty'': some internal file system structures are always created. However, these are typically contiguous, and their existence is negligible. Some file systems, such as [[NTFS]] and [[ext2]]+, might also preallocate empty contiguous regions for special purposes.</ref> This means that the allocator algorithm is completely free to position newly created files anywhere on the disk. For some time after creation, files on the file system can be laid out near-optimally. When the [[operating system]] and [[application software|application]]s are installed, or [[archive (computing)|archive]]s are unpacked, laying out separate files sequentially also means that related files are likely to be positioned close to each other.
However, as existing files are deleted or truncated, new regions of free space are created. When existing files are appended to, it is often impossible to resume the write exactly where the file used to end, as another file may already be allocated there — thus, a new fragment has to be allocated. As time goes on, and the same factors are continuously present, free space as well as frequently appended files tend to fragment more. Shorter regions of free space also mean that the allocator is no longer able to allocate new files contiguously, and has to break them into fragments.
To summarize, factors that typically cause or facilitate fragmentation include:
* low free space.
* frequent deletion, truncation or extending of files.
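The interplay of these factors can be illustrated with a toy simulation. The sketch below (all sizes, probabilities, and the first-fit policy are illustrative assumptions, not taken from any real file system) models a block device on which files are created, appended to, and deleted; after enough churn, appended files end up with non-adjacent blocks:

```python
import random

def simulate_aging(disk_blocks=1000, steps=2000, seed=42):
    """Toy first-fit block allocator: create, delete, and append files,
    then count how many files ended up split into multiple fragments."""
    rng = random.Random(seed)
    disk = [None] * disk_blocks        # None = free, else owning file id
    files = {}                         # file id -> list of block indices
    next_id = 0

    def allocate(file_id, nblocks):
        # First-fit: take free blocks left to right; contiguity not guaranteed.
        got = []
        for i, owner in enumerate(disk):
            if owner is None:
                disk[i] = file_id
                got.append(i)
                if len(got) == nblocks:
                    break
        files.setdefault(file_id, []).extend(got)

    for _ in range(steps):
        op = rng.random()
        if op < 0.5 or not files:          # create a small new file
            allocate(next_id, rng.randint(1, 8))
            next_id += 1
        elif op < 0.8:                     # append to an existing file
            allocate(rng.choice(list(files)), rng.randint(1, 4))
        else:                              # delete a file, freeing its blocks
            for b in files.pop(rng.choice(list(files))):
                disk[b] = None

    def fragments(blocks):
        # Number of contiguous runs in a sorted block list.
        blocks = sorted(blocks)
        return 1 + sum(1 for a, b in zip(blocks, blocks[1:]) if b != a + 1)

    fragmented = sum(1 for blks in files.values()
                     if blks and fragments(blks) > 1)
    return fragmented, len(files)
```

Running the simulation shows the qualitative effect described above: early allocations are contiguous, but once deletions punch holes into a nearly full disk, subsequent creates and appends must scatter across those holes.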
==Performance implications==
File system fragmentation is projected to become more problematic with newer hardware due to the increasing disparity between sequential access speed and [[rotational delay]] (and to a lesser extent [[seek time]]), of consumer-grade [[hard disk]]s,<ref name=seagate-future>{{cite conference |author=Dr. Mark H. Kryder |publisher=[[Seagate Technology]] |date=2006-04-03 |title=Future Storage Technologies: A Look Beyond the Horizon |booktitle=Storage Networking World conference |url=http://www.snwusa.com/documents/presentations-s06/MarkKryder.pdf |format=[[PDF]] |accessdate=2006-12-14 }}</ref> which file systems are usually placed on. Thus, fragmentation is an important problem in recent file system research and design. The containment of fragmentation not only depends on the on-disk format of the file system, but also heavily on its implementation.<ref name=mcvoy-extent>{{cite conference |author=L. W. McVoy, S. R. Kleiman |date=1991 winter |title=Extent-like Performance from a UNIX File System |booktitle=Proceedings of [[USENIX]] winter '91 |pages=pages 33–43 |___location=Dallas, Texas |publisher=[[Sun Microsystems, Inc.]] |url=http://www.cis.upenn.edu/~bcpierce/courses/dd/papers/mcvoy-extent.ps |format=[[PostScript]] |accessdate=2006-12-14 }}</ref>
In simple file system [[benchmark (computing)|benchmark]]s, the fragmentation factor is often omitted, as realistic aging and fragmentation are difficult to model. Rather, for simplicity of comparison, file system benchmarks are often run on empty file systems, and unsurprisingly, the results may vary heavily from real-life access patterns.<ref name=workload-benchmarks>{{cite paper |author=Keith Arnold Smith |date=2001-01 |title=Workload-Specific File System Benchmarks |publisher=[[Harvard University]] |url=http://www.eecs.harvard.edu/vino/fs-perf/papers/keith_a_smith_thesis.pdf |format=[[PDF]] |accessdate=2006-12-14 }}</ref>
<!-- TODO: Explain how the efficiency of page cache/buffer cache combined with readahead decreases with fragmentation -->
==Types of fragmentation==
====File fragmentation====
Individual file fragmentation occurs when a single file has been broken into multiple pieces (called [[extent]]s on extent-based file systems). While disk file systems attempt to keep individual files contiguous, this is not often possible without significant performance penalties. File system check and defragmentation tools typically only account for file fragmentation in their "fragmentation percentage" statistic.
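A "fragmentation percentage" of the kind reported by such tools can be computed from per-file extent lists. The sketch below is a minimal illustration; the extent representation and the exact metric (share of files with more than one extent) are assumptions, since real tools define and weight the statistic differently:

```python
def file_fragmentation_percentage(extent_table):
    """extent_table: dict of file name -> list of (start_block, length) extents.
    Returns the percentage of files stored in more than one extent."""
    if not extent_table:
        return 0.0
    fragmented = sum(1 for extents in extent_table.values() if len(extents) > 1)
    return 100.0 * fragmented / len(extent_table)

# Illustrative data: two of four files are split into multiple extents.
table = {
    "a.log": [(0, 10), (40, 5)],      # 2 extents -> fragmented
    "b.bin": [(10, 30)],              # contiguous
    "c.txt": [(45, 1), (90, 2)],      # fragmented
    "d.dat": [(50, 40)],              # contiguous
}
print(file_fragmentation_percentage(table))  # 50.0
```

Note that this per-file count ignores free space fragmentation entirely, which is one reason the headline percentage can understate the practical impact of aging.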
====Free space fragmentation====
* [[Locality of reference]]
==Notes and references==
<!--See http://en.wikipedia.org/wiki/Wikipedia:Footnotes for an explanation of how to generate footnotes using the <ref> and </ref> tags and the tag below -->
<references />