In computing, file system fragmentation, sometimes called file system aging, is the inability of a file system to lay out related data sequentially (contiguously), an inherent phenomenon in storage-backed file systems that allow in-place modification of their contents. It is a special case of data fragmentation.
File system fragmentation is projected to become more problematic with newer hardware because of the increasing disparity between the sequential access speed and the rotational delay (and, to a lesser extent, the seek time) of the consumer-grade hard disks[1] on which file systems are usually placed. Fragmentation is therefore an important problem in recent file system research and design. Containing fragmentation depends not only on the on-disk format of the file system but also heavily on its implementation.[2]
In simple file system benchmarks, the fragmentation factor is often omitted, as realistic aging and fragmentation are difficult to model. Rather, for simplicity of comparison, file system benchmarks are often run on empty file systems, and unsurprisingly, the results may differ heavily from real-life access patterns.[3]
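The effect of this disparity can be illustrated with a rough cost model. The sketch below is only a back-of-the-envelope estimate: the function name and all default figures (seek time, rotational delay, throughput) are hypothetical values chosen for illustration, not measurements from the cited sources. It assumes that every fragment costs one seek plus one rotational delay before data can be transferred.

```python
def read_time_ms(size_mb, fragments, seek_ms=8.0, rotation_ms=4.0,
                 throughput_mb_s=100.0):
    """Estimate the time to read a file stored in `fragments` pieces.

    All default figures are illustrative assumptions, not real drive data.
    """
    positioning = fragments * (seek_ms + rotation_ms)  # paid once per fragment
    transfer = size_mb * 1000.0 / throughput_mb_s      # paid once per byte
    return positioning + transfer


# A 10 MB file, contiguous versus split into 50 fragments.
print(read_time_ms(10, 1))    # 112.0
print(read_time_ms(10, 50))   # 700.0

# Doubling the sequential throughput barely helps the fragmented file.
print(read_time_ms(10, 1, throughput_mb_s=200.0))    # 62.0
print(read_time_ms(10, 50, throughput_mb_s=200.0))   # 650.0
```

Under these assumed figures, faster sequential transfer shortens the contiguous read considerably but leaves the fragmented read dominated by positioning time, which is why the relative penalty of fragmentation grows.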
File system fragmentation may occur on several levels:
- Fragmentation within individual files and their metadata.
- Free space fragmentation, making it increasingly difficult to lay out new files contiguously.
- The decrease of locality of reference between separate but related files.
Types of fragmentation
File fragmentation
Individual file fragmentation occurs when a single file has been broken into multiple pieces (called extents on extent-based file systems). While disk file systems attempt to keep individual files contiguous, this is often not possible without significant performance penalties.
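One simple way to quantify file fragmentation is to count the contiguous runs in a file's block list. The following is a minimal sketch, assuming the file's blocks are available as an ordered list of block numbers; the function name is hypothetical and does not correspond to any particular file system's tooling.

```python
def count_extents(blocks):
    """Return the number of contiguous runs in an ordered list of block numbers."""
    if not blocks:
        return 0
    extents = 1
    for prev, cur in zip(blocks, blocks[1:]):
        if cur != prev + 1:   # a gap or a backward jump starts a new piece
            extents += 1
    return extents


print(count_extents([10, 11, 12, 13]))          # 1 (fully contiguous)
print(count_extents([10, 11, 12, 40, 41, 7]))   # 3
```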
Free space fragmentation
Free (unallocated) space fragmentation occurs when the unused space of the file system, where new files or metadata can be written, is split into several separate areas. Unwanted free space fragmentation is generally caused by the deletion or truncation of files, but file systems may also intentionally insert fragments (sometimes called "bubbles") of free space in order to facilitate extending nearby files (see proactive techniques below).
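Free space fragmentation can be made concrete by listing the runs of free blocks in an allocation bitmap. The sketch below assumes a hypothetical bitmap represented as a list of booleans, where True marks a block in use; real file systems use more compact structures.

```python
def free_extents(bitmap):
    """Return (start, length) for each run of free blocks.

    bitmap[i] is True when block i is in use.
    """
    runs = []
    start = None
    for i, used in enumerate(bitmap):
        if not used and start is None:
            start = i                         # a free run begins
        elif used and start is not None:
            runs.append((start, i - start))   # the free run ends
            start = None
    if start is not None:                     # a free run reaches the end
        runs.append((start, len(bitmap) - start))
    return runs


bitmap = [True, False, False, True, False, True, True, False, False, False]
runs = free_extents(bitmap)
print(runs)                                # [(1, 2), (4, 1), (7, 3)]
print(sum(length for _, length in runs))   # 6 free blocks in total
print(max(length for _, length in runs))   # 3 is the largest contiguous run
```

In this example six blocks are free, yet a new four-block file cannot be stored contiguously because the largest free run is only three blocks long.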
Related file fragmentation
Related file fragmentation refers to the lack of locality of reference between related files. Unlike the previous two types of fragmentation, it is a much vaguer concept, as it depends heavily on the access patterns of specific applications, which also makes it very difficult to measure or estimate objectively. Arguably, however, it is the most critical type of fragmentation, as studies have found that the most frequently accessed files tend to be small.[citation needed]
Techniques for mitigating fragmentation
Several techniques have been developed to fight fragmentation. They can usually be classified into two categories: proactive and retroactive. Because access patterns are hard to predict, these techniques are most often heuristic in nature and may degrade performance under unexpected workloads.
Proactive techniques
Proactive techniques attempt to keep fragmentation to a minimum at the time data is written to the disk. Perhaps the simplest of these is to append data to an existing fragment in place where possible, instead of allocating new blocks to a new fragment.
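Appending in place can be sketched as an allocation policy that first tries the block immediately following the file's last block. This is an illustrative sketch only: the function name, the representation of free space as a Python set, and the lowest-free-block fallback are assumptions, not the policy of any specific file system.

```python
def allocate_block(free, last_block):
    """Pick a block for an append, preferring to extend the file in place.

    `free` is a set of free block numbers and is modified in place.
    `last_block` is the file's current last block, or None for an empty file.
    Raises ValueError if no free block is left.
    """
    candidate = None if last_block is None else last_block + 1
    if candidate in free:
        chosen = candidate    # extends the existing fragment
    else:
        chosen = min(free)    # assumed fallback: this starts a new fragment
    free.remove(chosen)
    return chosen


free = {3, 4, 9}
print(allocate_block(free, 8))   # 9, the file grows contiguously
print(allocate_block(free, 9))   # 3, block 10 is taken, so a new fragment starts
```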
Most of today's file systems attempt to preallocate longer chunks for files that are actively appended to. This mainly avoids file fragmentation when several files are appended to concurrently, preventing them from becoming excessively intertwined.[2]
A more recent technique is allocate-on-flush in XFS and ZFS, also called delayed allocation in reiser4 and ext4. Block allocation is deferred until buffered data is flushed to disk, by which time the file system has a better idea of how much space the file needs.
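The benefit of preallocation for concurrent appends can be shown with a small simulation. The sketch below is a simplified model, not the allocator of any real file system: it assumes an empty disk, a simple bump allocator, and two hypothetical files, A and B, that are appended to alternately one block at a time.

```python
def count_extents(blocks):
    """Return the number of contiguous runs in an ordered list of block numbers."""
    return sum(1 for i, b in enumerate(blocks) if i == 0 or b != blocks[i - 1] + 1)


def interleaved_appends(appends_per_file, prealloc=1):
    """Simulate files A and B being appended to alternately.

    Each file reserves `prealloc` consecutive blocks whenever its current
    reservation is used up. Returns each file's list of block numbers.
    """
    next_free = 0                     # bump allocator over an empty disk
    files = {"A": [], "B": []}
    reserved = {"A": [], "B": []}     # blocks reserved but not yet written
    for _ in range(appends_per_file):
        for name in ("A", "B"):
            if not reserved[name]:
                reserved[name] = list(range(next_free, next_free + prealloc))
                next_free += prealloc
            files[name].append(reserved[name].pop(0))
    return files


without = interleaved_appends(8, prealloc=1)
with_prealloc = interleaved_appends(8, prealloc=4)
print(without["A"], count_extents(without["A"]))
# [0, 2, 4, 6, 8, 10, 12, 14] 8
print(with_prealloc["A"], count_extents(with_prealloc["A"]))
# [0, 1, 2, 3, 8, 9, 10, 11] 2
```

In this model, reserving four blocks at a time reduces file A from eight fragments to two. The trade-off, not modeled here, is that reserved blocks that are never written have to be released again.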
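The general idea of allocating on flush can be sketched as follows. This is a toy model under stated assumptions: the class, the 4096-byte block size, and the bump allocator are all hypothetical and do not mirror the actual implementation in XFS, ZFS, reiser4, or ext4. Writes are only buffered in memory, and a single allocation request sized to all buffered data is made when the data is flushed.

```python
class DelayedAllocFile:
    """Buffers appended data and allocates blocks only at flush time."""

    def __init__(self, block_size=4096):
        self.block_size = block_size
        self.pending = bytearray()   # written by the application, not yet on disk
        self.extents = []            # (start_block, block_count) pairs

    def write(self, data):
        self.pending += data         # no block allocation happens here

    def flush(self, allocator):
        if not self.pending:
            return
        blocks = -(-len(self.pending) // self.block_size)   # ceiling division
        start = allocator(blocks)    # one request covering all buffered data
        self.extents.append((start, blocks))
        self.pending.clear()


def make_bump_allocator():
    """Return a hypothetical allocator that hands out consecutive block runs."""
    next_free = 0

    def allocate(count):
        nonlocal next_free
        start = next_free
        next_free += count
        return start

    return allocate


allocator = make_bump_allocator()
f = DelayedAllocFile()
for _ in range(10):
    f.write(b"x" * 1000)   # ten small writes, 10,000 bytes in total
f.flush(allocator)
print(f.extents)           # [(0, 3)]
```

Because the ten small writes are allocated together, the file receives one three-block extent instead of being allocated piecemeal while other files compete for the same region.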
Retroactive techniques
Retroactive techniques attempt to reduce fragmentation after it has occurred. Many file systems provide defragmentation tools, which attempt to reorder the fragments of files, and often also increase locality of reference by keeping smaller files in the same directory, or directory tree, close to one another on the disk. Some file systems, such as HFS Plus, exploit idle time to defragment data on the disk in the background.
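The goal of such a tool can be sketched by computing a target layout in which every file is contiguous. The following is a deliberately simplified illustration: the function name and the dictionary-based layout are assumptions, sorting by path is only a crude stand-in for directory locality, and a real defragmenter must also move the data safely using whatever free space is available.

```python
def defragment(layout):
    """Return a new layout in which every file occupies one contiguous run.

    `layout` maps a file path to its ordered list of block numbers.
    Files are placed in path order, so files in the same directory
    tend to end up next to each other.
    """
    new_layout = {}
    next_block = 0
    for path in sorted(layout):
        length = len(layout[path])
        new_layout[path] = list(range(next_block, next_block + length))
        next_block += length
    return new_layout


layout = {"/a/x": [5, 9], "/b/z": [0, 1, 2], "/a/y": [7]}
print(defragment(layout))
# {'/a/x': [0, 1], '/a/y': [2], '/b/z': [3, 4, 5]}
```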
References
- Dr. Mark H. Kryder (2006-04-03). "Future Storage Technologies: A Look Beyond the Horizon" (PDF). Storage Networking World conference. Seagate Technology. Retrieved 2006-12-14.
- L. W. McVoy, S. R. Kleiman (Winter 1991). "Extent-like Performance from a UNIX File System". Proceedings of USENIX winter '91. Dallas, Texas: Sun Microsystems, Inc. pp. 33–43. Retrieved 2006-12-14.
- Keith Arnold Smith (January 2001). "Workload-Specific File System Benchmarks" (PDF). Harvard University. Retrieved 2006-12-14.