{{See also|File sequence}}
File segmentation, also called related-file fragmentation or application-level (file) fragmentation, refers to the lack of [[locality of reference]] (within the storing medium) between related files (see [[file sequence]] for more detail). Unlike the previous two types of fragmentation, file segmentation is a much vaguer concept, as it depends heavily on the access pattern of specific applications, which also makes it very difficult to measure or estimate objectively. However, it is arguably the most critical type of fragmentation, as studies have found that the most frequently accessed files tend to be small compared to the amount of data a disk can transfer per second, so the time spent accessing them is dominated by seeking rather than by data transfer.<ref name="filesys-contents">{{cite journal |title=A Large-Scale Study of File-System Contents}}</ref>
To avoid related-file fragmentation and improve locality of reference (in this case called ''file contiguity''), assumptions or active observations about the operation of applications have to be made. A frequent assumption is that it is worthwhile to keep smaller files within a single [[file directory|directory]] together and lay them out in the natural file system order. While this is often reasonable, it does not always hold. For example, an application might read several different files, perhaps in different directories, in exactly the same order they were written. Thus, a file system that simply lays out all writes successively might work faster for that application.
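The trade-off can be illustrated with a simplified model. The allocator, file sizes, block numbers and access patterns below are invented for illustration only and do not describe any real file system; the sketch merely shows that whichever layout policy is "better" depends on the order in which an application later reads the files.

<syntaxhighlight lang="python">
# Toy model (not any real file-system allocator): compare two layout policies
# for the same set of small files under two different read patterns.
# All block counts and file names are made up for illustration.

def layout_by_directory(files):
    """Place files contiguously, grouped by their directory."""
    order = sorted(files, key=lambda f: (f["dir"], f["name"]))
    return assign_blocks(order)

def layout_by_write_order(files):
    """Place files contiguously in the order they were written."""
    order = sorted(files, key=lambda f: f["write_seq"])
    return assign_blocks(order)

def assign_blocks(order):
    """Return {name: first_block} for a simple bump allocator."""
    placement, next_block = {}, 0
    for f in order:
        placement[f["name"]] = next_block
        next_block += f["blocks"]
    return placement

def seek_cost(placement, access_pattern):
    """Total head movement (in blocks) for reading files in the given order."""
    cost, pos = 0, 0
    for name in access_pattern:
        cost += abs(placement[name] - pos)
        pos = placement[name]
    return cost

files = [
    {"name": "a.cfg", "dir": "etc", "blocks": 2, "write_seq": 0},
    {"name": "b.dat", "dir": "var", "blocks": 8, "write_seq": 1},
    {"name": "c.cfg", "dir": "etc", "blocks": 2, "write_seq": 2},
    {"name": "d.dat", "dir": "var", "blocks": 8, "write_seq": 3},
]

by_dir = layout_by_directory(files)
by_write = layout_by_write_order(files)

# An application that scans one directory at a time favours the first policy...
per_directory_scan = ["a.cfg", "c.cfg", "b.dat", "d.dat"]
# ...while one that replays files in the order they were created favours the second.
replay_write_order = ["a.cfg", "b.dat", "c.cfg", "d.dat"]

for label, placement in [("by directory", by_dir), ("by write order", by_write)]:
    print(label,
          "dir scan:", seek_cost(placement, per_directory_scan),
          "write replay:", seek_cost(placement, replay_write_order))
</syntaxhighlight>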
If the final size of a file subject to modification is known, storage for the entire file may be preallocated. For example, the [[Microsoft Windows]] [[swap file]] (page file) can be resized dynamically under normal operation, and therefore can become highly fragmented. This can be prevented by specifying a page file with the same minimum and maximum sizes, effectively preallocating the entire file.
[[BitTorrent (protocol)|BitTorrent]] and other [[peer-to-peer]] [[filesharing]] applications limit fragmentation by preallocating the full space needed for a file when initiating [[download]]s.<ref>{{cite journal |date=29 March 2009 |first=Jeffrey |last=Layton |title=From ext3 to ext4: An Interview with Theodore Ts'o}}</ref>
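On Unix-like systems, an application can ask the file system to reserve all of the blocks for a file of known final size before writing its contents. The following minimal sketch uses Python's <code>os.posix_fallocate</code> (available since Python 3.3); the file name and size are arbitrary values chosen only for illustration.

<syntaxhighlight lang="python">
import os

# Minimal sketch of application-level preallocation: reserve all the blocks
# for a file of known final size up front, so later writes anywhere in the
# file do not have to extend it piecemeal. File name and size are arbitrary.
FINAL_SIZE = 700 * 1024 * 1024  # e.g. the advertised size of a download

fd = os.open("download.part", os.O_RDWR | os.O_CREAT, 0o644)
try:
    # Ask the file system to allocate blocks for the whole range now.
    os.posix_fallocate(fd, 0, FINAL_SIZE)
finally:
    os.close(fd)
</syntaxhighlight>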
A relatively recent technique is [[delayed allocation]] in [[XFS]], [[HFS+]]<ref>{{cite web |first=Amit |last=Singh |date=May 2004 |title=Fragmentation in HFS Plus Volumes |work=Mac OS X Internals |url=http://osxbook.com/software/hfsdebug/fragmentation.html }}</ref> and [[ZFS]]; the same technique is also called allocate-on-flush in [[reiser4]] and [[ext4]]. When the file system is being written to, file system blocks are reserved, but the locations of specific files are not laid down yet. Later, when the file system is forced to flush changes as a result of memory pressure or a transaction commit, the allocator will have much better knowledge of the files' characteristics. Most file systems with this approach try to flush files in a single directory contiguously. Assuming that multiple reads from a single directory are common, locality of reference is improved.<ref name=xfs-scalability>{{cite conference |first=Adam |last=Sweeney |first2=Doug |last2=Doucette |first3=Wei |last3=Hu |first4=Curtis |last4=Anderson |first5=Mike |last5=Nishimoto |first6=Geoff |last6=Peck |date=January 1996 |title=Scalability in the XFS File System |publisher=[[Silicon Graphics]] |booktitle=Proceedings of the USENIX 1996 Annual Technical Conference |___location=[[San Diego, California]] |url=http://www.soe.ucsc.edu/~sbrandt/290S/xfs.pdf |format=[[PDF]] |accessdate=2006-12-14 }}</ref> Reiser4 also orders the layout of files according to the directory [[hash table]], so that when files are being accessed in the natural file system order (as dictated by [[readdir]]), they are always read sequentially.<ref name=reiser4-google>{{cite web |first=Hans |last=Reiser |date=2006-02-06 |title=The Reiser4 Filesystem |url=http://video.google.com/videoplay?docid=6866770590245111825&q=reiser4 |work=Google TechTalks |accessdate=2006-12-14 |archiveurl=https://web.archive.org/web/20110519215817/http://video.google.com/videoplay?docid=6866770590245111825&q=reiser4 |archivedate=19 May 2011 |deadurl=yes |df= }}</ref>
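A toy sketch of the idea follows; it is not how XFS, ext4 or any other file system actually implements delayed allocation, but it illustrates the key point that only a count of reserved blocks is tracked at write time, while on-disk locations are chosen at flush time, when the sizes of all dirty files in a directory are known.

<syntaxhighlight lang="python">
# Toy sketch of delayed allocation (allocate-on-flush). Block counts are
# reserved at write time, but on-disk locations are chosen only at flush,
# when all dirty files in a directory can be laid out together.

class DelayedAllocator:
    def __init__(self, total_blocks):
        self.free_blocks = total_blocks   # only a count is tracked pre-flush
        self.dirty = {}                   # path -> reserved block count
        self.placement = {}               # path -> (first_block, length)
        self.next_block = 0               # simple bump allocator at flush time

    def write(self, path, blocks):
        """Reserve space for buffered data without choosing a ___location."""
        if blocks > self.free_blocks:
            raise OSError("no space left")
        self.free_blocks -= blocks
        self.dirty[path] = self.dirty.get(path, 0) + blocks

    def flush(self):
        """Lay out all dirty files, keeping each directory's files adjacent."""
        for path in sorted(self.dirty, key=lambda p: p.rsplit("/", 1)[0]):
            length = self.dirty[path]
            self.placement[path] = (self.next_block, length)
            self.next_block += length
        self.dirty.clear()

alloc = DelayedAllocator(total_blocks=1000)
# Writes from different directories arrive interleaved...
alloc.write("/var/log/a.log", 4)
alloc.write("/etc/app.conf", 1)
alloc.write("/var/log/b.log", 6)
alloc.flush()
# ...but each directory's files still end up in adjacent block ranges.
print(alloc.placement)
</syntaxhighlight>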
Retroactive techniques attempt to reduce fragmentation, or its negative effects, after it has occurred. Many file systems provide [[defragmentation]] tools, which attempt to reorder the fragments of files, and sometimes also to decrease their scattering (i.e. improve their contiguity, or [[locality of reference]]) by keeping smaller files that belong to the same [[file directory|directory]], the same directory tree, or the same [[file sequence]] close to each other on the disk.
When a file is opened, the [[HFS Plus]] file system transparently defragments it if it is smaller than 20 [[MiB]] and broken into eight or more fragments.<ref name=osx-intern>{{cite book |first=Amit |last=Singh |year=2007 |title=Mac OS X Internals: A Systems Approach |publisher=[[Addison Wesley]] |chapter=12 The HFS Plus File System |isbn=0321278542<!--Surprisingly, ISBN lookup on Google Books returns nothing. Hence, I supplied a URL.--> |chapter-url=https://books.google.com/books?id=UZ7AmAEACAAJ}}</ref>
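The following sketch outlines that kind of on-open policy in simplified form; the 20 MiB and eight-fragment thresholds come from the description above, while the file record and allocator hook are hypothetical stand-ins, not the real HFS Plus machinery.

<syntaxhighlight lang="python">
# Simplified sketch of an "on-open" defragmentation policy: a small, badly
# fragmented file is rewritten into one contiguous extent when it is opened.
# FileRecord and allocate_contiguous are invented for illustration.

from dataclasses import dataclass
from typing import List, Tuple

MAX_SIZE = 20 * 1024 * 1024   # 20 MiB
MIN_FRAGMENTS = 8

@dataclass
class FileRecord:
    size: int
    extents: List[Tuple[int, int]]   # (start_block, block_count) pairs

def should_defragment(f: FileRecord) -> bool:
    return f.size < MAX_SIZE and len(f.extents) >= MIN_FRAGMENTS

def open_file(f: FileRecord, allocate_contiguous):
    """On open, transparently relocate a qualifying file into one extent."""
    if should_defragment(f):
        total_blocks = sum(count for _, count in f.extents)
        start = allocate_contiguous(total_blocks)  # hypothetical allocator hook
        f.extents = [(start, total_blocks)]        # data copy omitted here
    return f
</syntaxhighlight>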
The now obsolete Commodore Amiga [[Smart File System]] (SFS) defragmented itself while the file system was in use. The defragmentation process was almost completely stateless (apart from the ___location it was working on), so it could be stopped and started instantly. Data integrity was ensured during defragmentation for both metadata and normal data.