File system fragmentation: Difference between revisions

m links
Line 23:
Related file fragmentation refers to the lack of [[locality of reference]] between related files. Unlike the previous two types of fragmentation, related file fragmentation is a much vaguer concept, as it depends heavily on the access pattern of specific applications. This also makes it very difficult to measure or estimate objectively. However, it is arguably the most critical type of fragmentation, as studies have found that the most frequently accessed files tend to be small compared to available disk throughput per second.<ref name=filesys-contents>{{cite journal |author=John R. Douceur, William J. Bolosky |date=1999-06 |title=A Large-Scale Study of File-System Contents |publisher=[[Microsoft Research]] |pages=59&ndash;70 |journal=[[ACM SIGMETRICS]] Performance Evaluation Review |volume=27 |issue=1 |url=http://research.microsoft.com/~bolosky/papers/SIGMETRICS99/filesys.pdf |format=[[PDF]] |issn=0163-5999 |accessdate=2006-12-14 }}</ref>
 
To avoid related file fragmentation and improve locality of reference, assumptions have to be made about how applications operate. A common assumption is that it is worthwhile to keep smaller files within a single [[file directory|directory]] together, and to lay them out in the natural file system order. While this is often a reasonable assumption, it does not always hold. For example, an application might read several different files, perhaps in different directories, in exactly the same order in which they were written. A file system that simply orders all writes successively might therefore work faster for that application.
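
The trade-off between the two layout assumptions can be illustrated with a small model. The following is a minimal sketch, not a description of any real file system: it assumes each file occupies exactly one block, uses the distance between block numbers as a stand-in for seek cost, and the file names and the <code>total_seek_distance</code> helper are hypothetical.

```python
def total_seek_distance(layout, access_order):
    """Sum the block distances travelled when files are read in access_order.

    layout is a list of file names in on-disk order. Each file is assumed
    to occupy a single block, which is a deliberate simplification.
    """
    position = {name: block for block, name in enumerate(layout)}
    distance = 0
    current = position[access_order[0]]
    for name in access_order[1:]:
        distance += abs(position[name] - current)
        current = position[name]
    return distance


# Hypothetical files in two directories, in the order an application wrote them.
write_order = ["a/1", "b/1", "a/2", "b/2", "a/3", "b/3"]

directory_layout = sorted(write_order)  # files grouped by directory
write_layout = list(write_order)        # files laid out in write order

# The application later reads the files in the order it wrote them.
seek_dir = total_seek_distance(directory_layout, write_order)
seek_write = total_seek_distance(write_layout, write_order)

print(seek_dir, seek_write)  # 13 5
```

In this model the write-ordered layout costs less for an application that reads in write order, while a directory-by-directory scan would favour the directory-grouped layout instead, which is why neither assumption holds for every access pattern.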
 
==Techniques for mitigating fragmentation==
Line 37:
 
====Retroactive techniques====
Retroactive techniques attempt to reduce fragmentation, or its negative effects, after it has occurred. Many file systems provide [[defragmentation]] tools, which attempt to reorder fragments of files, and often also increase [[locality of reference]] by keeping smaller files in [[directory (file systems)|directories]], or directory trees, close to one another on the disk. Some file systems, such as [[HFS Plus]], use idle time to defragment data on the disk in the background.
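
The basic idea of reordering fragments can be sketched on a toy model of a disk. This is an illustrative simplification under stated assumptions, not the algorithm of any particular defragmentation tool: the disk is a list in which each entry names the file owning that block (<code>None</code> for free space), only ownership is tracked rather than data, and the functions <code>count_fragments</code> and <code>defragment</code> are hypothetical.

```python
def count_fragments(disk):
    """Return {file_id: number of contiguous runs} for a list of block owners."""
    fragments = {}
    previous = None
    for owner in disk:
        # A new run starts whenever the owner changes to a non-free block.
        if owner is not None and owner != previous:
            fragments[owner] = fragments.get(owner, 0) + 1
        previous = owner
    return fragments


def defragment(disk):
    """Return a new layout with each file contiguous and free space at the end.

    Files keep the order in which they first appear on the original disk.
    A real tool must move data in place within the free space available;
    this sketch simply builds a fresh layout.
    """
    order = []
    counts = {}
    for owner in disk:
        if owner is None:
            continue
        if owner not in counts:
            order.append(owner)
            counts[owner] = 0
        counts[owner] += 1
    compacted = []
    for owner in order:
        compacted.extend([owner] * counts[owner])
    compacted.extend([None] * (len(disk) - len(compacted)))
    return compacted


disk = ["A", "B", "A", None, "B", "C", "A", None]
before = count_fragments(disk)
after_layout = defragment(disk)
after = count_fragments(after_layout)

print(before)        # {'A': 3, 'B': 2, 'C': 1}
print(after_layout)  # ['A', 'A', 'A', 'B', 'B', 'C', None, None]
```

After the pass every file occupies a single contiguous run and the free blocks are consolidated, which also reduces free space fragmentation in this model.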
 
==See also==