File system fragmentation: Difference between revisions

 
In simple file system [[benchmark (computing)|benchmark]]s, the fragmentation factor is often omitted, because realistic aging and fragmentation are difficult to model. Instead, for simplicity of comparison, file system benchmarks are often run on empty file systems, so the results may differ considerably from those seen under real-life access patterns.<ref name=workload-benchmarks>{{cite paper |author=Keith Arnold Smith |date=2001-01 |title=Workload-Specific File System Benchmarks |publisher=[[Harvard University]] |url=http://www.eecs.harvard.edu/vino/fs-perf/papers/keith_a_smith_thesis.pdf |format=[[PDF]] |accessdate=2006-12-14 }}</ref>
<!-- TODO: Explain how the efficiency of page cache/buffer cache combined with readahead decreases with fragmentation -->
 
==Types of fragmentation==
Most of today's file systems attempt to preallocate longer chunks, or chunks from different free space fragments, to files that are actively appended to. This mainly avoids file fragmentation when several files are being appended to concurrently, preventing them from becoming excessively intertwined.<ref name=mcvoy-extent/>
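
The effect of preallocation can be illustrated with a toy simulation. The sketch below is not taken from any real file system: the function names, the round-robin append pattern, and the one-block append size are all assumptions chosen for illustration. It models a disk as a sequence of block numbers and counts how many contiguous runs each file ends up with.

```python
def count_fragments(blocks):
    """Count the contiguous runs in a list of block numbers."""
    blocks = sorted(blocks)
    return sum(1 for i, b in enumerate(blocks)
               if i == 0 or b != blocks[i - 1] + 1)


def append_round_robin(num_files, appends_per_file, prealloc):
    """Simulate several files being appended to in turn, one block per append.

    prealloc is the number of blocks reserved for a file whenever it runs
    out of reserved space (1 means no preallocation at all).
    """
    next_free = 0
    reserved = {f: [] for f in range(num_files)}  # reserved but unused blocks
    layout = {f: [] for f in range(num_files)}    # blocks that hold file data
    for _ in range(appends_per_file):
        for f in range(num_files):
            if not reserved[f]:
                # Reserve a whole chunk so later appends stay contiguous.
                reserved[f] = list(range(next_free, next_free + prealloc))
                next_free += prealloc
            layout[f].append(reserved[f].pop(0))
    return layout


naive = append_round_robin(3, 8, prealloc=1)
chunked = append_round_robin(3, 8, prealloc=4)
print([count_fragments(b) for b in naive.values()])    # [8, 8, 8]
print([count_fragments(b) for b in chunked.values()])  # [2, 2, 2]
```

Without preallocation every appended block lands between blocks of the other files, so each eight-block file is split into eight fragments. With four-block chunks the same workload leaves each file in two fragments.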
 
A relatively recent technique is [[delayed allocation]] in [[XFS]] and [[ZFS]]; the same technique is also called allocate-on-flush in [[reiser4]] and [[ext4]]. This means that when the file system is being written to, file system blocks are reserved, but the locations of specific files are not laid down yet. Later, when the file system is forced to flush changes as a result of memory pressure or a transaction commit, the allocator will have much better knowledge of the files' characteristics. Most file systems with this approach try to flush files in a single directory contiguously. Assuming that multiple reads from a single directory are common, locality of reference is improved.<ref name=xfs-scalability>{{cite conference |author=Adam Sweeney, Doug Doucette, Wei Hu, Curtis Anderson, Mike Nishimoto, Geoff Peck |date=1996-01 |title=Scalability in the XFS File System |publisher=[[Silicon Graphics]] |booktitle=Proceedings of the USENIX 1996 Annual Technical Conference |location=San Diego, California |url=http://www.soe.ucsc.edu/~sbrandt/290S/xfs.pdf |format=[[PDF]] |accessdate=2006-12-14 }}</ref> Reiser4 also orders the layout of files according to the directory [[hash table]], so that when files are being accessed in the natural file system order (as dictated by [[readdir]]), they are always read sequentially.<ref name=reiser4-google>{{cite web |author=Hans Reiser |date=2006-02-06 |title=The Reiser4 Filesystem |work=A lecture given by the author, Hans Reiser |url=http://video.google.com/videoplay?docid=6866770590245111825&q=reiser4 |format=[[Google Video]] |accessdate=2006-12-14 }}</ref>
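
The idea of deferring placement until flush time can be sketched in a few lines. The class below is a hypothetical model, not the allocator of XFS, ZFS, reiser4, or ext4: the name `DelayedAllocator`, its methods, and the policy of sorting by directory and then file name are assumptions made for illustration. Writes only record how many blocks each file needs; block numbers are chosen at flush, when the total size of every pending file is known.

```python
from collections import defaultdict


class DelayedAllocator:
    """Toy model: reserve space on write, choose block locations on flush."""

    def __init__(self):
        self.pending = defaultdict(int)  # path -> buffered blocks, no location yet
        self.next_free = 0               # first unallocated block on the "disk"
        self.extents = {}                # path -> list of (start, length)

    def write(self, path, nblocks):
        # Only account for the space; nothing is placed on disk yet.
        self.pending[path] += nblocks

    def flush(self):
        # Order by (directory, file name) so files that share a directory
        # are laid out next to each other, each as one contiguous extent.
        def by_directory(path):
            directory, _, name = path.rpartition("/")
            return (directory, name)

        for path in sorted(self.pending, key=by_directory):
            length = self.pending[path]
            self.extents.setdefault(path, []).append((self.next_free, length))
            self.next_free += length
        self.pending.clear()


fs = DelayedAllocator()
fs.write("/a/x", 2)
fs.write("/b/y", 1)
fs.write("/a/x", 3)
fs.write("/a/z", 2)
fs.write("/b/y", 1)
fs.flush()
print(fs.extents)
# {'/a/x': [(0, 5)], '/a/z': [(5, 2)], '/b/y': [(7, 2)]}
```

Although the writes to the three files were interleaved, each file receives a single extent, and the two files in `/a` are adjacent on disk.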
<!-- TODO: Cylinder groups and locality of reference; XFS allocation groups (are they actually relevant?) -->
 
====Retroactive techniques====
Retroactive techniques attempt to reduce fragmentation, or the negative effects of fragmentation, after it has occurred. Many file systems provide [[defragmentation]] tools, which attempt to reorder fragments of files, and often also increase [[locality of reference]] by keeping smaller files in [[directory (file systems)|directories]], or directory trees, close to each other on the disk. Some file systems, such as [[HFS Plus]], utilize idle time to defragment data on the disk in the background.
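
As a rough illustration of what such a tool computes, the following sketch takes a fragmented layout and produces a new one in which every file is contiguous and files under the same directory are neighbours. It is a simplification under stated assumptions: the function name `defragment` and the layout representation are hypothetical, ordering by path string is only one possible policy, and the actual copying of data and handling of free space are ignored.

```python
def defragment(layout):
    """Return a new layout in which every file occupies one contiguous run.

    layout maps a path to the list of block numbers holding its data, in
    file order. Files are placed in path order, so files in the same
    directory end up next to each other.
    """
    new_layout = {}
    next_free = 0
    for path in sorted(layout):
        length = len(layout[path])
        new_layout[path] = list(range(next_free, next_free + length))
        next_free += length
    return new_layout


fragmented = {"/a/x": [0, 3, 6], "/b/y": [1, 4], "/a/z": [2, 5, 7]}
compacted = defragment(fragmented)
print(compacted)
# {'/a/x': [0, 1, 2], '/a/z': [3, 4, 5], '/b/y': [6, 7]}
```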
 
==See also==
* [[Fragmentation (computer)|Fragmentation]]
* [[Defragmentation]]
* [[Locality of reference]]
 
==References==