{{short description|None}}
 
{{See also|List of file systems#Distributed parallel fault-tolerant file systems|l1=Comparison of distributed parallel fault-tolerant file systems}}
 
In computing, a [[distributed file system]] (DFS) or network file system is any [[file system]] that allows access to [[computer file|files]] from multiple hosts [[resource sharing|shared]] via a [[computer network]]. This makes it possible for multiple users on multiple machines to share files and storage resources.
 
Distributed file systems differ in their performance, the mutability of their content, their handling of concurrent writes, their handling of permanent or temporary loss of nodes or storage, and their policies for storing content.
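In the common POSIX-style case, the shared namespace is exposed as an ordinary mount point, so unmodified programs on different machines cooperate simply by reading and writing the same paths. The sketch below illustrates only this shared-namespace idea; the mount point <code>/mnt/dfs</code> is hypothetical, and no particular system's API is implied.

<syntaxhighlight lang="python">
# Illustrative sketch: once a distributed file system (for example, a
# FUSE-mounted volume) is mounted at the same hypothetical path on every
# host, plain file operations on any host act on the same underlying file.
import os

MOUNT = "/mnt/dfs"  # hypothetical DFS mount point, identical on all hosts


def publish(relpath: str, data: bytes) -> None:
    """Write a file on one host; other hosts see it at the same path."""
    path = os.path.join(MOUNT, relpath)
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "wb") as f:
        f.write(data)


def consume(relpath: str) -> bytes:
    """Read the same file from another host through the shared namespace."""
    with open(os.path.join(MOUNT, relpath), "rb") as f:
        return f.read()
</syntaxhighlight>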
== FOSS ==

{| class="wikitable sortable"
|-
! Name !! Written in !! License !! Access API !! High availability !! Shards !! Efficient Redundancy !! Redundancy Granularity !! Initial release year !! Memory requirements (GB)
|-
! {{rh}} |[[GlusterFS]]
| C
| {{free|GPLv3}}
| libglusterfs, [[Filesystem in Userspace|FUSE]], NFS, SMB, Swift, libgfapi
| {{no|mirror}}
| {{yes}}
| {{yes|Reed-Solomon}}<ref>{{cite web |url=https://github.com/gluster/glusterfs/blob/master/doc/developer-guide/ec-implementation.md |title=Erasure coding implementation |website=[[GitHub]] |date=2 November 2021 }}</ref>
| {{no|Volume}}<ref>{{cite web |url=https://docs.gluster.org/en/latest/Administrator%20Guide/Setting%20Up%20Volumes/ |title=Setting up GlusterFS Volumes}}</ref>
| 2005
|
|-
! {{rh}} |[[HDFS]]
| Java
| {{free|Apache License 2.0}}
| Java and C client, HTTP, FUSE<ref>{{cite web |url=https://cwiki.apache.org/confluence/display/HADOOP2/MountableHDFS |title=MountableHDFS}}</ref>
| {{yes|transparent master failover}}
| {{no}}
| {{yes|Reed-Solomon}}<ref>{{cite web |url=https://issues.apache.org/jira/browse/HDFS-7285 |title=HDFS-7285 Erasure Coding Support inside HDFS}}</ref>
| {{yes|File}}<ref>{{cite web |url=https://hadoop.apache.org/docs/r2.4.1/hadoop-project-dist/hadoop-common/FileSystemShell.html#setrep |title=Apache Hadoop: setrep}}</ref>
| 2005
|
|-
! {{rh}} |[[InterPlanetary File System|IPFS]]
| Go
| {{free|MIT}}
| HTTP gateway, [[Filesystem in Userspace|FUSE]], Go client, JavaScript client, [[Command-line interface|CLI]]
| {{yes}}
| {{yes|with [https://cluster.ipfs.io/ IPFS Cluster]}}
| {{no|Replication}}<ref>Erasure coding plan: {{cite web |url=https://github.com/ipfs/notes/issues/196 |title=Reed-Solomon layer over IPFS #196|website=[[GitHub]]}}, {{cite web |url=https://github.com/ipfs/ipfs-cluster/issues/6 |title=Erasure Coding Layer #6|website=[[GitHub]]}}</ref>
| {{yes|Block}}<ref>{{cite web |url=https://docs.ipfs.io/reference/cli/#ipfs-bitswap-wantlist |title=CLI Commands: ipfs bitswap wantlist}}</ref>
| 2015<ref>{{cite web |url=https://techcrunch.com/2015/10/04/why-the-internet-needs-ipfs-before-its-too-late/ |title=Why The Internet Needs IPFS Before It's Too Late|date=4 October 2015 }}</ref>
|
|-
! {{rh}} |[https://github.com/freakmaxi/kertish-dfs Kertish-DFS]
| Go
| {{free|GPLv3}}
| HTTP ([[REST]]), CLI, C# Client, Go Client
| {{yes}}
|
| {{no|Replication}}
|
| 2020
|
|-
! {{rh}} |[[LizardFS]]<ref>{{cite web |url=https://github.com/lizardfs/lizardfs/issues/805#issuecomment-2238866486 | title=Is LizardFS development still alive?| website=[[GitHub]]}}</ref>
| C++
| {{free|GPLv3}}
| {{yes}}
| {{yes}}
| {{no|No redundancy}}<ref>{{cite web |url=https://doc.lustre.org/lustre_manual.xhtml#understandinglustre.whatislustre |title=Lustre Operations Manual: What a Lustre File System Is (and What It Isn't)}} </ref><ref>Reed-Solomon in progress: {{cite web |url=https://jira.whamcloud.com/browse/LU-10911 |title=LU-10911 FLR2: Erasure coding}}</ref>
| 2003
|
|-
! {{rh}} |[[MinIO]]
| Go
| {{free|AGPL3.0}}
|[[Amazon S3|AWS S3 API]], [[File Transfer Protocol|FTP]], [[SSH File Transfer Protocol|SFTP]]
| {{yes}}
| {{yes}}
| {{yes|Reed-Solomon}}<ref>{{cite web |url=https://docs.min.io/docs/minio-erasure-code-quickstart-guide.html |title=MinIO Erasure Code Quickstart Guide}}</ref>
| {{yes|Object}}<ref>{{cite web |url=https://github.com/minio/minio/tree/master/docs/erasure/storage-class |title=MinIO Storage Class Quickstart Guide|website=[[GitHub]]}}</ref>
| 2014
|
|-
! {{rh}} |[[Moose File System|MooseFS]]
| C
| {{free|GPLv2}}
| [[Posix#POSIX.1|POSIX]], [[Filesystem in Userspace|FUSE]]
| {{no|master}}
| {{no}}
| {{no|Replication}}<ref>Only available in the proprietary version 4.x {{cite web |url=https://github.com/moosefs/moosefs/issues/8 |title=[feature] erasure-coding #8|website=[[GitHub]]}}</ref>
| {{yes|File}}<ref>{{cite web |url=https://fossies.org/linux/moosefs/mfsmanpages/mfsgoal.1 |title=mfsgoal(1)}}</ref>
| 2008
|
|-
| {{no|Volume}}<ref>{{cite web |url=http://docs.openafs.org/AdminGuide/HDRWQ192.html |title=Replicating Volumes (Creating Read-only Volumes)
}}</ref>
| 2000 <ref>{{Cite web|url=https://www.openafs.org/release/openafs-1.0.html|title=OpenAFS}}</ref>
|
|-
| 2015
| 0.5
|-
! {{rh}} |[[Quantcast File System]]
| C
| {{free|Apache License 2.0}}
| C++ client, [[Filesystem in Userspace|FUSE]] (C++ server: MetaServer and ChunkServer are both in C++)
| {{no|master}}
| {{no}}
| {{yes|Reed-Solomon}}<ref>{{cite web |url=https://www.cs.utah.edu/~hari/teaching/bigdata/qfs-ovsiannikov.pdf |title=The Quantcast File System}}</ref>
| {{yes|File}}<ref>{{cite web |url=https://github.com/quantcast/qfs/blob/2.2.2/src/cc/tools/cptoqfs_main.cc#L259 |title=qfs/src/cc/tools/cptoqfs_main.cc|website=[[GitHub]]|date=8 December 2021}}</ref>
| 2012
|
|-
! {{rh}} |[[RozoFS]]
| {{yes|Mojette}}<ref>{{cite web |url=http://rozofs.github.io/rozofs/develop/AboutRozoFS.html#mojette-transform |title=About RozoFS: Mojette Transform}}</ref>
| {{no|Volume}}<ref>{{cite web |url=http://rozofs.github.io/rozofs/develop/SettingUpRozoFS.html#exportd-configuration-file |title=Setting up RozoFS: Exportd Configuration File}}</ref>
| 2011<ref>{{cite web |url=https://github.com/rozofs/rozofs/commit/9818e92f73fe4432c8d29236158e271da9ee3bf2 |title=Initial commit.|website=[[GitHub]]}}</ref>
|
|-
! {{rh}} |[https://github.com/chrislusf/seaweedfs SeaweedFS]
| Go, Java
| {{free|Apache License 2.0}}
| HTTP ([[REST]]), [[POSIX]], [[Filesystem in Userspace|FUSE]], [[Amazon S3|S3]], [[HDFS]]
| {{no|requires CockroachDB, undocumented config}}
|
| {{yes|Reed-Solomon}}<ref>{{cite web |url=https://github.com/chrislusf/seaweedfs/wiki/Erasure-coding-for-warm-storage |title=Erasure Coding for warm storage}}</ref>
| {{no|Volume}}<ref>{{cite web |url=https://github.com/chrislusf/seaweedfs/wiki/Replication |title=Replication}}</ref>
| 2015
|
|-
! {{rh}} |[[Tahoe-LAFS]]
| Python
| {{free|[[GNU GPL]] <ref>{{cite web
| url=https://github.com/tahoe-lafs/tahoe-lafs/blob/master/README.rst#licence
| title=About Tahoe-LAFS| website=[[GitHub]]| date=24 February 2022}}</ref>}}
| HTTP (browser or [[Command-line interface|CLI]]), [[SSH File Transfer Protocol|SFTP]], [[File Transfer Protocol|FTP]], [[Filesystem in Userspace|FUSE]] via [[SSHFS]], pyfilesystem
|
|
| {{yes|Reed-Solomon}}<ref>{{cite web |url=https://github.com/tahoe-lafs/zfec |title=zfec -- a fast C implementation of Reed-Solomon erasure coding|website=[[GitHub]]|date=24 February 2022}}</ref>
| {{yes|File}}<ref>{{cite web |url=https://tahoe-lafs.readthedocs.io/en/latest/architecture.html#file-encoding |title=Tahoe-LAFS Architecture: File Encoding}}</ref>
| 2007
|
|-
| {{yes|File}}<ref>{{cite web |url=http://www.xtreemfs.org/quickstart_repl.php |title=Quickstart: Replicate A File}}</ref>
| 2009
|
|-
! {{rh}} |Ori<ref>{{cite web
| url=http://ori.scs.stanford.edu
| title=Ori: A Secure Distributed File System}}</ref>
| C, C++
| {{free|MIT}}
| libori, [[Filesystem in Userspace|FUSE]]
|
|
| {{no|Replication}}
| {{no|Filesystem}}<ref>{{cite journal
|first1=Ali Jose |last1=Mashtizadeh
|first2=Andrea |last2=Bittau
|first3=Yifeng Frank |last3=Huang
|first4=David |last4=Mazières
|title=Replication, History, and Grafting in the Ori File System
|url=http://sigops.org/s/conferences/sosp/2013/papers/p151-mashtizadeh.pdf
}}</ref>
| 2012
|
|}
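The ''Efficient Redundancy'' and ''Redundancy Granularity'' columns above distinguish plain replication from erasure coding such as Reed-Solomon. Replication stores whole extra copies, while a Reed-Solomon code splits data into ''k'' data blocks plus ''m'' parity blocks and can reconstruct the data from any ''k'' of the ''k''&nbsp;+&nbsp;''m'' blocks. The sketch below compares the storage cost and fault tolerance of the two approaches; the RS profiles shown are illustrative examples, not any system's defaults (though RS(10,&nbsp;4) resembles one of the profiles offered by HDFS erasure coding).

<syntaxhighlight lang="python">
# Illustrative comparison of storage overhead and fault tolerance for
# n-way replication versus Reed-Solomon erasure coding RS(k, m):
# replication stores n full copies and survives n - 1 losses, while
# RS(k, m) stores (k + m) / k times the data and survives m losses.

def replication(copies: int) -> tuple[float, int]:
    """Return (storage multiplier, block losses tolerated)."""
    return float(copies), copies - 1


def reed_solomon(k: int, m: int) -> tuple[float, int]:
    """Return (storage multiplier, block losses tolerated) for RS(k, m)."""
    return (k + m) / k, m


if __name__ == "__main__":
    schemes = {
        "3-way replication": replication(3),
        "RS(10, 4)": reed_solomon(10, 4),  # similar to an HDFS-EC profile
        "RS(4, 2)": reed_solomon(4, 2),
    }
    for name, (multiplier, losses) in schemes.items():
        print(f"{name}: {multiplier:.2f}x storage, survives {losses} lost block(s)")
</syntaxhighlight>

For equal tolerance of four lost blocks, RS(10,&nbsp;4) stores 1.4 times the data where 5-way replication would store 5 times, which is why the tables flag erasure coding rather than replication as ''efficient'' redundancy.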
== Proprietary ==

{| class="wikitable sortable"
|-
! Name !! Written in !! License !! Access API
|-
! {{rh}} |[[BeeGFS]]
| C / C++
| {{Proprietary}};
GPLv2 client
| [[Posix#POSIX.1|POSIX]]
|-
! {{rh}} |[[Cloudian]]
| C++
| {{proprietary}}
| [[Amazon S3|AWS S3]], NFS, [[SMB/CIFS]], Rest API
|-
! {{rh}} |[[ObjectiveFS]]<ref>{{cite web |url=https://objectivefs.com |title=ObjectiveFS official website}}</ref>
| C, C++
| {{proprietary}}
| [[POSIX]], NFS, [[Server Message Block|SMB]], Swift, [[Amazon S3|S3]], [[HDFS]]
|-
! {{rh}} |[[MapR FS|MapR-FS]]
| C, C++
| {{proprietary}}
| [[Posix#POSIX.1|POSIX]], [[Network File System|NFS]], [[Filesystem in Userspace|FUSE]], [[Amazon S3|S3]], [[Apache Hadoop#HDFS|HDFS]], CLI
|-
! {{rh}} |[https://www.panasas.com/panfs-architecture/panfs/ PanFS]
| C, C++
| {{proprietary}}
| [[Panasas#DirectFlow|DirectFlow]], [[POSIX]], [[Network File System|NFS]], [[Server Message Block|SMB/CIFS]], [[Hypertext Transfer Protocol|HTTP]], [[Command-line interface|CLI]]
|-
! {{rh}} |[[Infinit (file system)|Infinit]]<ref>{{cite web
| url=http://infinit.sh
| title=The Infinit Storage Platform}}</ref>
| C++
| {{proprietary}} (to be open sourced)<ref>{{cite web
| url=http://infinit.sh/open-source
| title=Infinit's Open Source Projects}}</ref>
| [[Filesystem in Userspace|FUSE]], [[Installable File System]], [[Network File System|NFS]]/[[Server Message Block|SMB]], [[Posix#POSIX.1|POSIX]], [[Command-line interface|CLI]], [[Software Development Kit|SDK]] (libinfinit)
|-
! {{rh}} |[[OneFS distributed file system|Isilon OneFS]]
| {{Proprietary}}
| [[Posix#POSIX.1|POSIX]], [[Network File System|NFS]], [[SMB/CIFS]], [[HDFS]], [[HTTP]], [[FTP]], SWIFT Object, [[Command-line interface|CLI]], Rest API
|-
! {{rh}} |[[Qumulo]]
| C/C++
| {{Proprietary}}
| [[Posix#POSIX.1|POSIX]], [[Network File System|NFS]], [[SMB/CIFS]], [[Command-line interface|CLI]], [[Amazon S3|S3]], Rest API
|-
! {{rh}} |[[Scality]]
Line 312 ⟶ 268:
| [[Filesystem in Userspace|FUSE]], [[Network File System|NFS]], [[Representational state transfer|REST]], [[AWS S3]]
|-
! {{rh}} |[[VaultFS]]
| Java, C/C++
| {{proprietary}}
| [[Posix#POSIX.1|POSIX]], [[Filesystem in Userspace|FUSE]], [[Network File System|NFS]], [[SMB/CIFS]], [[HDFS]], [[Command-line interface|CLI]], [[Amazon S3|S3]], Rest API
|}
 
| [[HTTP]] ([[REST]])
|-
| {{rh}} |[[IBM Cloud Object Storage]]
|[[IBM]] (formerly [[Cleversafe Inc.|Cleversafe]])<ref>{{cite web|url=https://www-03.ibm.com/press/us/en/pressrelease/47776.wss|archive-url=https://web.archive.org/web/20151008001155/http://www-03.ibm.com/press/us/en/pressrelease/47776.wss|url-status=dead|archive-date=October 8, 2015|title=IBM Plans to Acquire Cleversafe for Object Storage in Cloud|date=2015-10-05|website=www-03.ibm.com|language=en-US|access-date=2019-05-06}}</ref>
| [[HTTP]] ([[REST]])
|}
Some researchers have published a functional and experimental analysis of several distributed file systems, including HDFS, Ceph, Gluster, Lustre and an old (1.6.x) version of MooseFS. The analysis dates from 2013, however, and much of its information is outdated (e.g. MooseFS then had no HA for its metadata server).<ref>{{cite web|last1=Séguin|first1=Cyril|last2=Depardon|first2=Benjamin|last3=Le Mahec|first3=Gaël|title=Analysis of Six Distributed File Systems|url=https://hal.archives-ouvertes.fr/file/index/docid/789086/filename/a_survey_of_dfs.pdf|website=HAL}}</ref>
 
Cloud-based remote distributed storage offerings from major vendors have different APIs and different consistency models.<ref>{{cite web|title=Data Consistency Models of Public Cloud Storage Services: Amazon S3, Google Cloud Storage and Windows Azure Storage|url=https://www.systutorials.com/3551/data-consistency-models-of-public-cloud-storage-services-amazon-s3-google-cloud-storage-and-windows-azure-storage/|website=SysTutorials|date=4 February 2014|access-date=19 June 2017}}</ref>
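As an illustration of what differing consistency models mean in practice, the sketch below writes a fresh object and polls until a read returns it: under read-after-write consistency the first read already succeeds, while under eventual consistency the observed lag may be non-zero or the poll may time out. The <code>put</code> and <code>get</code> callables are hypothetical placeholders for a vendor-specific client, since the actual APIs differ between services.

<syntaxhighlight lang="python">
# Illustrative probe of read-after-write consistency for an object store.
# `put` and `get` stand in for a vendor client's write and read calls.
import time
import uuid


def read_after_write_lag(put, get, attempts=20, delay=0.25):
    """Write a unique object, then poll reads until it becomes visible.

    Returns the observed lag in seconds, or None if the object never
    became visible within the polling window (eventual consistency).
    """
    key = f"probe-{uuid.uuid4()}"
    value = uuid.uuid4().bytes
    put(key, value)
    start = time.monotonic()
    for _ in range(attempts):
        if get(key) == value:
            return time.monotonic() - start
        time.sleep(delay)
    return None
</syntaxhighlight>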
 
==See also==
[[Category:Network file systems]]
[[Category:Software comparisons|Distributed file systems]]