Comparison of distributed file systems: Difference between revisions

Chkno (talk | contribs)
FOSS: Fix "|section= ignored" citation template error
removed stub template
 
(44 intermediate revisions by 31 users not shown)
Line 1:
{{short description|None}}
 
{{See also|List of file systems#Distributed parallel fault-tolerant file systems|l1=Comparison of distributed parallel fault-tolerant file systems}}
 
In computing, a [[distributed file system]] (DFS) or network file system is any [[file system]] that allows access from multiple hosts to [[computer file|files]] [[resource sharing|shared]] via a [[computer network]]. This makes it possible for multiple users on multiple machines to share files and storage resources.
 
Distributed file systems differ in their performance, mutability of content, handling of concurrent writes, handling of permanent or temporary loss of nodes or storage, and their policy of storing content.
Line 15 ⟶ 17:
! [[High availability]]
! [[Shard (database architecture)|Shards]]
! [[Erasure code|Efficient Redundancy]]
! Redundancy Granularity
! Initial release year
! Memory requirements (GB)
Line 25 ⟶ 28:
| {{no|hot standby}}
| {{no}}
| {{no|Replication}}<ref>{{cite web |url=https://docs.alluxio.io/os/user/stable/en/core-services/Caching.html#managing-data-replication-in-alluxio |title=Caching: Managing Data Replication in Alluxio}}</ref>
| {{no|Replication}}
| {{yes|File}}<ref>{{cite web |url=https://docs.alluxio.io/os/user/stable/en/core-services/Caching.html#managing-data-replication-in-alluxio |title=Caching: Managing Data Replication in Alluxio}}</ref>
| 2013
|
Line 36 ⟶ 40:
| {{yes}}
| {{yes|Pluggable erasure codes}}<ref>{{cite web |url=https://docs.ceph.com/en/latest/rados/operations/erasure-code-profile/ |title=Erasure Code Profiles }}</ref>
| {{no|Pool}}<ref>{{cite web |url=https://docs.ceph.com/en/latest/rados/operations/pools/ |title=Pools}}</ref>
| 2010
| 1 per TB of storage
Line 46 ⟶ 51:
| {{yes}}
| {{no|Replication}}
| {{no|Volume}}<ref>{{cite journal
|first1=Mahadev |last1=Satyanarayanan
|first2=James J. |last2=Kistler
|first3=Puneet |last3=Kumar
|first4=Maria E. |last4=Okasaki
|first5=Ellen H. |last5=Siegel
|first6=David C. |last6=Steere
|title=Coda: A Highly Available File System for a Distributed Workstation Environment
|url=https://www.csee.umbc.edu/courses/graduate/CMSC621/fall2006/lectures/coda.pdf
}}</ref>
| 1987
|
Line 53 ⟶ 68:
| {{free|GPLv3}}
| libglusterfs, [[Filesystem in Userspace|FUSE]], NFS, SMB, Swift, libgfapi
| {{no|mirror}}
| {{yes}}
| {{yes|Reed-Solomon}}<ref>{{cite web |url=https://github.com/gluster/glusterfs/blob/master/doc/developer-guide/ec-implementation.md |title=Erasure coding implementation |website=[[GitHub]] |date=2 November 2021 }}</ref>
| {{yes}}
| {{no|Volume}}<ref>{{cite web |url=https://docs.gluster.org/en/latest/Administrator%20Guide/Setting%20Up%20Volumes/ |title=Setting up GlusterFS Volumes}}</ref>
| 2005
|
|-
! {{rh}} |[[HDFS]]
| Java
| {{free|GPLv2}}
| [[Posix#POSIX.1|POSIX]], [[Filesystem in Userspace|FUSE]]
| {{no|master}}
| {{no}}
| {{no|Replication}}<ref>Only available in the proprietary version 4.x {{cite web |url=https://github.com/moosefs/moosefs/issues/8 |title=[feature] erasure-coding #8}}</ref>
| 2008
|
|-
! {{rh}} |[[Quantcast File System]]
| C
| {{free|Apache License 2.0}}
| Java and C client, HTTP, FUSE<ref>{{cite web |url=https://cwiki.apache.org/confluence/display/HADOOP2/MountableHDFS |title=MountableHDFS}}</ref>
| C++ client, [[Filesystem in Userspace|FUSE]] (C++ server: MetaServer and ChunkServer are both in C++)
| {{no|transparent master failover}}
| {{no}}
| {{yes|Reed-Solomon}}<ref>{{cite web |url=https://issues.apache.org/jira/browse/HDFS-7285 |title=HDFS-7285 Erasure Coding Support inside HDFS}}</ref>
| {{yes|File}}<ref>{{cite web |url=https://hadoop.apache.org/docs/r2.4.1/hadoop-project-dist/hadoop-common/FileSystemShell.html#setrep |title=Apache Hadoop: setrep}}</ref>
| 2012
| 2005
|
|-
! {{rh}} |[[IPFS]]
! {{rh}} |[https://github.com/freakmaxi/kertish-dfs Kertish-DFS]
| Go
| {{free|Apache 2.0 or MIT}}
| [https://docs.ipfs.io/concepts/ipfs-gateway/ HTTP gateway], [https://github.com/ipfs/go-ipfs/blob/master/docs/fuse.md FUSE], [https://github.com/ipfs/go-ipfs Go client], [https://js.ipfs.io/ Javascript client], [https://docs.ipfs.io/reference/cli/ command line tool]
|HTTP(REST), CLI, C# Client, Go Client
| {{yes}}
| {{yes|with [https://cluster.ipfs.io/ IPFS Cluster]}}
|
| {{no|Replication}}<ref>Erasure coding plan: {{cite web |url=https://github.com/ipfs/notes/issues/196 |title=Reed-Solomon layer over IPFS #196|website=[[GitHub]]}}, {{cite web |url=https://github.com/ipfs/ipfs-cluster/issues/6 |title=Erasure Coding Layer #6|website=[[GitHub]]}}</ref>
| {{no|Replication}}
| {{yes|Block}}<ref>{{cite web |url=https://docs.ipfs.io/reference/cli/#ipfs-bitswap-wantlist |title=CLI Commands: ipfs bitswap wantlist}}</ref>
|2020
| 2015<ref>{{cite web |url=https://techcrunch.com/2015/10/04/why-the-internet-needs-ipfs-before-its-too-late/ |title=Why The Internet Needs IPFS Before It's Too Late|date=4 October 2015 }}</ref>
|
|-
! {{rh}} |[[LizardFS]]<ref>{{cite web |url=https://github.com/lizardfs/lizardfs/issues/805#issuecomment-2238866486 | title=Is LizardFS development still alive?| website=[[GitHub]]}}</ref>
! {{rh}} |[[LizardFS]]
| C++
| {{free|GPLv3}}
Line 96 ⟶ 104:
| {{no}}
| {{yes|Reed-Solomon}}<ref>{{cite web |url=https://docs.lizardfs.com/adminguide/replication.html |title=Configuring Replication Modes}}</ref>
| {{yes|File}}<ref>{{cite web |url=https://docs.lizardfs.com/adminguide/replication.html#set-and-show-the-goal-of-a-file-directory |title=Configuring Replication Modes: Set and show the goal of a file/directory}}</ref>
| 2013
|
Line 105 ⟶ 114:
| {{yes}}
| {{yes}}
| {{no|No redundancy}}<ref>{{cite web |url=https://doc.lustre.org/lustre_manual.xhtml#understandinglustre.whatislustre |title=Lustre Operations Manual: What a Lustre File System Is (and What It Isn't)}} </ref><ref>Reed-Solomon in progress: {{cite web |url=https://jira.whamcloud.com/browse/LU-10911 |title=LU-10911 FLR2: Erasure coding}}</ref>
| {{no|No redundancy}}<ref>{{cite web |url=https://doc.lustre.org/lustre_manual.xhtml#idm139974537188976 |title=Lustre Operations Manual: Lustre Features}}</ref><ref>File-level redundancy plan: {{cite web |url=https://wiki.lustre.org/File_Level_Redundancy_Solution_Architecture |title=File Level Redundancy Solution Architecture}}</ref>
| 2003
|
Line 112 ⟶ 122:
! {{rh}} |[[MinIO]]
| Go
| {{free|AGPL3.0}}
|[[Amazon S3|AWS S3 API]], [[File Transfer Protocol|FTP]], [[SSH File Transfer Protocol|SFTP]]
| {{yes}}
| {{yes}}
| {{yes|Reed-Solomon}}<ref>{{cite web |url=https://docs.min.io/docs/minio-erasure-code-quickstart-guide.html |title=MinIO Erasure Code Quickstart Guide}}</ref>
| {{yes|Object}}<ref>{{cite web |url=https://github.com/minio/minio/tree/master/docs/erasure/storage-class |title=MinIO Storage Class Quickstart Guide|website=[[GitHub]]}}</ref>
| 2014
|
|-
! {{rh}} |[[Moose File System|MooseFS]]
| C
| {{free|GPLv2}}
| [[Posix#POSIX.1|POSIX]], [[Filesystem in Userspace|FUSE]]
| {{no|master}}
| {{no}}
| {{no|Replication}}<ref>Only available in the proprietary version 4.x {{cite web |url=https://github.com/moosefs/moosefs/issues/8 |title=[feature] erasure-coding #8|website=[[GitHub]]}}</ref>
| {{yes|File}}<ref>{{cite web |url=https://fossies.org/linux/moosefs/mfsmanpages/mfsgoal.1 |title=mfsgoal(1)}}</ref>
| 2008
|
|-
Line 127 ⟶ 149:
|
| {{no|Replication}}
| {{no|Volume}}<ref>{{cite web |url=http://docs.openafs.org/AdminGuide/HDRWQ192.html |title=Replicating Volumes (Creating Read-only Volumes)
}}</ref>
| 2000<ref>{{Cite web|url=https://www.openafs.org/release/openafs-1.0.html|title=OpenAFS}}</ref>
|
|-
Line 137 ⟶ 161:
|
| {{yes|Pluggable erasure codes}}<ref>{{cite web |url=https://docs.openio.io/latest/source/admin-guide/configuration_ec.html |title=Erasure Coding}}</ref>
| {{yes|Object}}<ref>{{cite web |url=https://docs.openio.io/latest/source/admin-guide/configuration_storagepolicies.html |title=Declare Storage Policies}}</ref>
| 2015
| 0.5
|-
! {{rh}} |[[Quantcast File System]]
| C
| {{free|Apache License 2.0}}
| C++ client, [[Filesystem in Userspace|FUSE]] (C++ server: MetaServer and ChunkServer are both in C++)
| {{no|master}}
| {{no}}
| {{yes|Reed-Solomon}}<ref>{{cite web |url=https://www.cs.utah.edu/~hari/teaching/bigdata/qfs-ovsiannikov.pdf |title=The Quantcast File System}}</ref>
| {{yes|File}}<ref>{{cite web |url=https://github.com/quantcast/qfs/blob/2.2.2/src/cc/tools/cptoqfs_main.cc#L259 |title=qfs/src/cc/tools/cptoqfs_main.cc|website=[[GitHub]]|date=8 December 2021}}</ref>
| 2012
|
|-
! {{rh}} |[[RozoFS]]
Line 147 ⟶ 183:
|
| {{yes|Mojette}}<ref>{{cite web |url=http://rozofs.github.io/rozofs/develop/AboutRozoFS.html#mojette-transform |title=About RozoFS: Mojette Transform}}</ref>
| {{no|Volume}}<ref>{{cite web |url=http://rozofs.github.io/rozofs/develop/SettingUpRozoFS.html#exportd-configuration-file |title=Setting up RozoFS: Exportd Configuration File}}</ref>
| 2011<ref>{{cite web |url=https://github.com/rozofs/rozofs/commit/9818e92f73fe4432c8d29236158e271da9ee3bf2 |title=Initial commit.|website=[[GitHub]]}}</ref>
|
|-
! {{rh}} |[https://github.com/chrislusf/seaweedfs SeaweedFS]
| Go, Java
| {{free|Apache License 2.0}}
| HTTP ([[REST]]), [[POSIX]], [[Filesystem in Userspace|FUSE]], [[Amazon S3|S3]], [[HDFS]]
| {{no|requires CockroachDB, undocumented config}}
|
| {{yes|Reed-Solomon}}<ref>{{cite web |url=https://github.com/chrislusf/seaweedfs/wiki/Erasure-coding-for-warm-storage |title=Erasure Coding for warm storage}}</ref>
| 2015
|
|-
! {{rh}} |[[Tahoe-LAFS]]
| Python
| {{free|[[GNU GPL]] <ref>{{cite web
| url=https://github.com/tahoe-lafs/tahoe-lafs/blob/master/README.rst#licence
| title=About Tahoe-LAFS| website=[[GitHub]]| date=24 February 2022}}</ref>}}
| HTTP (browser or [[Command-line interface|CLI]]), [[SSH File Transfer Protocol|SFTP]], [[File Transfer Protocol|FTP]], [[Filesystem in Userspace|FUSE]] via [[SSHFS]], pyfilesystem
|
|
| {{yes|Reed-Solomon}}<ref>{{cite web |url=https://github.com/tahoe-lafs/zfec |title=zfec -- a fast C implementation of Reed-Solomon erasure coding|website=[[GitHub]]|date=24 February 2022}}</ref>
| {{yes|File}}<ref>{{cite web |url=https://tahoe-lafs.readthedocs.io/en/latest/architecture.html#file-encoding |title=Tahoe-LAFS Architecture: File Encoding}}</ref>
| 2007
|
|-
! {{rh}} |[[HDFS]]
| Java
| {{free|Apache License 2.0}}
| Java and C client, HTTP
| {{yes|transparent master failover}}
| {{no}}
| {{yes|Reed-Solomon}}<ref>{{cite web |url=https://issues.apache.org/jira/browse/HDFS-7285 |title=HDFS-7285 Erasure Coding Support inside HDFS}}</ref>
| 2005
|
|-
Line 189 ⟶ 207:
|
| {{no|Replication}}<ref>{{cite web |url=http://www.xtreemfs.org/how_replication_works.php |title=Under the Hood: File Replication}}</ref>
| {{yes|File}}<ref>{{cite web |url=http://www.xtreemfs.org/quickstart_repl.php |title=Quickstart: Replicate A File}}</ref>
| 2009
|
|-
! {{rh}} |Ori<ref>{{cite web
| url=http://ori.scs.stanford.edu
| title=Ori: A Secure Distributed File System}}</ref>
| C, C++
| {{free|MIT}}
| libori, [[Filesystem in Userspace|FUSE]]
|
|
| {{no|Replication}}
| 2012
|
|}
Line 223 ⟶ 230:
GPLv2 client
| [[Posix#POSIX.1|POSIX]]
|-
! {{rh}} |[[Cloudian]]
|C++
| {{proprietary}}
|[[Amazon S3|AWS S3]], NFS, [[SMB/CIFS]], Rest API
|-
! {{rh}} |[[ObjectiveFS]]<ref>{{cite web
Line 234 ⟶ 246:
| C, C++
| {{proprietary}}
| [[POSIX]], NFS, [[Server Message Block|SMB]], Swift, [[Amazon S3|S3]], [[HDFS]]
|-
! {{rh}} |[[MapR FS|MapR-FS]]
| C, C++
| {{proprietary}}
| [[Posix#POSIX.1|POSIX]], [[Network File System|NFS]], [[Filesystem in Userspace|FUSE]], [[Amazon S3|S3]], [[Apache Hadoop#HDFS|HDFS]], CLI
|-
! {{rh}} |[https://www.panasas.com/panfs-architecture/panfs/ PanFS]
| C, C++
| {{proprietary}}
| [[Panasas#DirectFlow|DirectFlow]], [[POSIX]], [[Network File System|NFS]], [[Server Message Block|SMB/CIFS]], [[Hypertext Transfer Protocol|HTTP]], [[Command-line interface|CLI]]
|-
! {{rh}} |[[Infinit (file system)|Infinit]]<ref>{{cite web
| url=http://infinit.sh
| title=The Infinit Storage Platform}}</ref>
| C++
| {{proprietary}} (to be open sourced)<ref>{{cite web
| url=http://infinit.sh/open-source
| title=Infinit's Open Source Projects}}</ref>
| [[Filesystem in Userspace|FUSE]], [[Installable File System]], [[Network File System|NFS]]/[[Server Message Block|SMB]], [[Posix#POSIX.1|POSIX]], [[Command-line interface|CLI]], [[Software Development Kit|SDK]] (libinfinit)
|-
! {{rh}} |[[OneFS distributed file system|Isilon OneFS]]
Line 259 ⟶ 257:
| {{Proprietary}}
| [[Posix#POSIX.1|POSIX]], [[Network File System|NFS]], [[SMB/CIFS]], [[HDFS]], [[HTTP]], [[FTP]], SWIFT Object, [[Command-line interface|CLI]], Rest API
|-
! {{rh}} |[[Qumulo]]
| C/C++
| {{Proprietary}}
| [[Posix#POSIX.1|POSIX]], [[Network File System|NFS]], [[SMB/CIFS]], [[Command-line interface|CLI]], [[Amazon S3|S3]], Rest API
|-
! {{rh}} |[[Scality]]
Line 265 ⟶ 268:
| [[Filesystem in Userspace|FUSE]], [[Network File System|NFS]], [[Representational state transfer|REST]], [[AWS S3]]
|-
! {{rh}} |[[VaultFS]]
| Java, C/C++
| {{proprietary}}
| [[Posix#POSIX.1|POSIX]], [[Filesystem in Userspace|FUSE]], [[Network File System|NFS]], [[SMB/CIFS]], [[Command-line interface|CLI]], [[Amazon S3|S3]], Rest API
|}
 
Line 294 ⟶ 297:
| [[HTTP]] ([[REST]])
|-
| {{rh}} |[[IBM Cloud Object Storage]]
|[[IBM]] (formerly [[Cleversafe Inc.|Cleversafe]])<ref>{{cite web|url=https://www-03.ibm.com/press/us/en/pressrelease/47776.wss|archive-url=https://web.archive.org/web/20151008001155/http://www-03.ibm.com/press/us/en/pressrelease/47776.wss|url-status=dead|archive-date=October 8, 2015|title=IBM Plans to Acquire Cleversafe for Object Storage in Cloud|date=2015-10-05|website=www-03.ibm.com|language=en-US|access-date=2019-05-06}}</ref>
| [[HTTP]] ([[REST]])
|}
Line 302 ⟶ 305:
Researchers have published a functional and experimental analysis of several distributed file systems, including HDFS, Ceph, Gluster, Lustre and an old (1.6.x) version of MooseFS. The document dates from 2013, and much of its information is outdated (e.g. MooseFS had no HA for its Metadata Server at that time).<ref>{{cite web|last1=Séguin|first1=Cyril|last2=Depardon|first2=Benjamin|last3=Le Mahec|first3=Gaël|title=Analysis of Six Distributed File Systems|url=https://hal.archives-ouvertes.fr/file/index/docid/789086/filename/a_survey_of_dfs.pdf|website=HAL}}</ref>
 
Cloud-based remote distributed storage services from major vendors have different APIs and different consistency models.<ref>{{cite web|title=Data Consistency Models of Public Cloud Storage Services: Amazon S3, Google Cloud Storage and Windows Azure Storage|url=https://www.systutorials.com/3551/data-consistency-models-of-public-cloud-storage-services-amazon-s3-google-cloud-storage-and-windows-azure-storage/|website=SysTutorials|date=4 February 2014|access-date=19 June 2017}}</ref>
 
==See also==
Line 316 ⟶ 319:
[[Category:Network file systems]]
[[Category:Software comparisons|Distributed file systems]]
[[Category:Long stubs with short prose]]
{{compu-storage-stub}}