 
===File scattering===
File scattering, also called related-file fragmentation or application-level (file) fragmentation, refers to a lack of [[locality of reference]] (within the storage medium) between related files. Unlike the previous two types of fragmentation, file scattering is a much vaguer concept, as it depends heavily on the access patterns of specific applications. This also makes it very difficult to measure or estimate objectively. However, it is arguably the most critical type of fragmentation, as studies have found that the most frequently accessed files tend to be small compared to available disk throughput per second.<ref name="filesys-contents">{{cite journal | title=A Large-Scale Study of File-System Contents | date=June 1999 | last=Douceur | first=John R. | last2=Bolosky | first2=William J. | journal=ACM SIGMETRICS Performance Evaluation Review | volume=27 | issue=1 | pages=59–70 | doi=10.1145/301464.301480| doi-access=free }}</ref>
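
Because scattering is a property of physical layout, any estimate has to inspect where files actually live on the medium. The following is a minimal, Linux-only sketch, not a standard metric: it uses the <code>filefrag</code> tool from e2fsprogs to read each file's first physical extent offset, then reports the median gap between consecutive offsets of a set of related files. The median-gap heuristic is an assumption made purely for illustration.

<syntaxhighlight lang="python">
# Rough estimate of "file scattering" for a set of related files.
# Assumptions: Linux, an extent-based file system (e.g. ext4), and
# filefrag(8) from e2fsprogs installed. There is no agreed-upon
# scattering metric; the median gap here is illustrative only.
import re
import subprocess
from statistics import median

def first_physical_block(path: str) -> int | None:
    """Physical start block of the file's first extent, per filefrag -v."""
    out = subprocess.run(["filefrag", "-v", path],
                         capture_output=True, text=True, check=True).stdout
    # Extent lines look like: "   0:    0..  244:   34816..  35060:  245:"
    m = re.search(r"^\s*0:\s*\d+\.\.\s*\d+:\s*(\d+)\.\.", out, re.MULTILINE)
    return int(m.group(1)) if m else None

def scattering_estimate(paths: list[str]) -> float:
    """Median gap, in file system blocks, between consecutive files."""
    offsets = sorted(b for p in paths
                     if (b := first_physical_block(p)) is not None)
    gaps = [b - a for a, b in zip(offsets, offsets[1:])]
    return median(gaps) if gaps else 0.0
</syntaxhighlight>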
 
To avoid related-file fragmentation and improve locality of reference (in this case called ''file contiguity''), assumptions or active observations about the operation of applications have to be made. A very common assumption is that it is worthwhile to keep smaller files within a single [[file directory|directory]] together, and lay them out in the natural file system order. While this is often a reasonable assumption, it does not always hold. For example, an application might read several different files, perhaps in different directories, in exactly the same order they were written. Thus, a file system that simply orders all writes successively might work faster for the given application.
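
As an illustration of the directory-locality assumption above, an application (rather than the file system) can approximate on-disk order by sorting a directory's entries by [[inode]] number before reading them, a heuristic used by some archivers and backup tools. It is only a proxy, and whether it helps depends on the file system's allocator; the sketch below shows the idea.

<syntaxhighlight lang="python">
# Reading a directory's files in inode-number order, a common but
# imperfect proxy for on-disk allocation order. Whether this improves
# throughput depends on the file system and the workload.
import os

def read_in_inode_order(directory: str) -> dict[str, bytes]:
    entries = [e for e in os.scandir(directory) if e.is_file()]
    entries.sort(key=lambda e: e.stat().st_ino)  # approximate disk order
    data = {}
    for e in entries:
        with open(e.path, "rb") as f:
            data[e.name] = f.read()
    return data
</syntaxhighlight>
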
===Retroactive techniques===
{{Main|Defragmentation}}
 
Retroactive techniques attempt to reduce fragmentation, or the negative effects of fragmentation, after it has occurred. Many file systems provide [[defragmentation]] tools, which attempt to reorder the fragments of files, and sometimes also decrease their scattering (i.e. improve their contiguity, or [[locality of reference]]) by keeping smaller files in directories, directory trees, or even sequences of files close to each other on the disk.
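
The simplest retroactive approach available to an ordinary program is to rewrite a fragmented file sequentially and let the allocator choose fresh, hopefully contiguous, space. The sketch below illustrates that idea only; real defragmenters such as <code>e4defrag</code> relocate extents in place, and this naive version ignores ownership, extended attributes, and hard links.

<syntaxhighlight lang="python">
# Naive "defragmentation by rewriting": copy the file sequentially and
# atomically replace the original. The new copy is contiguous only if the
# allocator happens to find contiguous free space for it.
import os
import shutil

def rewrite_file(path: str) -> None:
    tmp = path + ".defrag.tmp"  # hypothetical temporary name
    with open(path, "rb") as src, open(tmp, "wb") as dst:
        shutil.copyfileobj(src, dst, length=1024 * 1024)  # 1 MiB chunks
        dst.flush()
        os.fsync(dst.fileno())  # make sure the data reaches the disk
    shutil.copystat(path, tmp)  # preserve timestamps and permission bits
    os.replace(tmp, path)       # atomic rename on POSIX
</syntaxhighlight>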
 
The [[HFS Plus]] file system transparently defragments files that are smaller than 20 [[MiB]] and broken into 8 or more fragments, at the time the file is opened.<ref name=osx-intern>{{cite book |first=Amit |last=Singh |year=2007 |title=Mac OS X Internals: A Systems Approach |publisher=[[Addison Wesley]] |chapter=12 The HFS Plus File System |isbn=0321278542<!--Surprisingly, ISBN lookup on Google Books returns nothing. Hence, I supplied a URL.--> |chapter-url=https://books.google.com/books?id=UZ7AmAEACAAJ}}</ref>
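
The decision HFS Plus makes can be paraphrased as the sketch below. The thresholds come from the description above, while <code>fragment_count</code> is a hypothetical input; the real check happens inside the kernel when the file is opened and involves further conditions not shown here.

<syntaxhighlight lang="python">
# Paraphrase of the HFS Plus on-open defragmentation heuristic described
# above. Constants are from the text; this is not Apple's implementation.
MAX_SIZE_BYTES = 20 * 1024 * 1024  # 20 MiB
MIN_FRAGMENTS = 8

def should_defragment_on_open(size_bytes: int, fragment_count: int) -> bool:
    return size_bytes < MAX_SIZE_BYTES and fragment_count >= MIN_FRAGMENTS
</syntaxhighlight>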