Orlov block allocator: Difference between revisions

Content deleted Content added
m Etymology: formatting fix
 
(25 intermediate revisions by 19 users not shown)
Line 1:
The '''Orlov block allocator''' is an [[algorithm]] to define where a particular [[computer file|file]] will reside on a given [[filesystemfile system]] (blockwise), so as to speed up disk operations.
{{Orphan|date=February 2009}}
The ''Orlov block allocator'' is an [[algorithm]] to define where a particular [[computer file|file]] will reside on a given [[filesystem]] (blockwise), so as to speed up disk operations.
 
== Etymology ==
The scheme is named after its creator GrigoryGrigoriy Orlov, who first posted, in 2000, a brief description and implementation for [[OpenBSD]]<ref>{{cite web | url=http://www.ptci.ru/gluk/dirpref/old/dirpref.html brief description]| title=Directory Allocation Algorithm For FFS | author=Grigoriy Orlov |archiveurl = https://web.archive.org/web/20080131082712/http://www.ptci.ru/gluk/dirpref/old/dirpref.html |archivedate = 2008-01-31}}</ref> of the technique, aswhich itwas islater used in the [[Berkeley Software Distribution|BSD]] Fast Filesystem [[Kernel (computeroperating sciencesystem)|kernelskernel]] variants.
 
== Background ==
The performance of a file system is dependent on many things; one of the crucial factors is just how that filesystem lays out files on the disk. In general, it is best to keep related items together. The Linux [[ext2]] and [[ext3]] filesystems, for instance, have tried to spread directories on the cylinders of the disk. Imagine setting up a system with users' [[Home directory|home directories]] in /home.: Ifif all the first-level directories within /home (i.e. the home directories for numerous users) are placed next to each other, there may be no space left for the contents of those directories. User files thus end up being placed far from the directories that contain them, and performance suffers.
 
Spreading directories on the disc allows files in the same directory to remain more or less contiguous as their number and/or size grows, but there are some situations where this causes excessive spreading of the data on the discdisk's surface.
 
== How it works ==
Essentially, the Orlov algorithm tries to spread outdistribute "top-level" directories, on the assumption that theyeach areis unrelated to eachthe otherothers. Directories created in the root directory of a filesystem are considered top-level directories; [[Theodore_TsTheodore Ts'o|Ted]] has added a special [[inode]] flag that allows the system administrator to mark other directories as being top-level directories as well. If <code>/home</code> lives in the root filesystem (and people do set up systems that way), a simple <code>chattr</code> command will make the system treat it as a top-level directory.
 
When creating a directory whichthat is not in a top-level directory, the Orlov algorithm tries to put it into the same cylinder group as its parent. A little more care is taken, however, to ensure that the directory's contents will also be able to fit into that cylinder group; if there are not many inodes or blocks available in the group, the directory will be placed in a different cylinder group whichthat has more resources available. The result of all this, hopefully, is much better locality for files whichthat are truly related to each other and likely to be accessed together.
 
== Performance ==
AsThe ofOrlov thisblock writingallocator (octwas shown to offer performance gains on workloads that traverse directory trees<ref>{{citation|url=http://static.usenix.org/events/usenix02/tech/freenix/dowse.html|title=Recent Filesystem Optimisations in FreeBSD}}</ref> on FreeBSD. {{As of|2007)|10}}, [httponly one benchmark result<ref>{{citation|url=https://lwn.net/Articles/14631/|title=Naive onlybut onespectacular ext3 HTREE+Orlov benchmark|author=Bert result]Hubert}}</ref> withfor theext3, newusing the allocator seems to have have been posted. The results are promising: the time required to traverse through a Linux kernel tree (a dauntingly big thing, these days) was reduced by roughly 30% or so.
 
The speedup, depending on the tests and the conditions, is always positive and may even approach extreme values (60 times faster!) for some operations.
 
== Evolution ==
The Orlov scheme needs more rigorous benchmarking; it also needs some serious stress testing to demonstrate that performance does not degrade as the filesystem is changed over time.
 
== References ==
{{reflist}}
* [http://www.ptci.ru/gluk/dirpref/old/dirpref.html original description] by Grigory Orlov
 
* [http://lwn.net/Articles/14633/ The Orlov block allocator]
== External links ==
* [http://lwn.net/Articles/14631/ naive but spectacular ext3 HTREE+Orlov benchmark]
* [httphttps://lwn.net/Articles/14633/ The Orlov block allocator]
* [https://lwn.net/Articles/14447/ Orlov block allocator for ext3] e-mail from Theodore Ts'o to [[Linus Torvalds]] and [[Alexander Viro]]
 
[[Category:ComputerUnix file systemssystem technology]]
[[Category:Unix]]