m →Bibliography: HTTP → HTTPS for Carnegie Mellon CS, replaced: http://www.cs.cmu.edu/ → https://www.cs.cmu.edu/ |
m Minor Clean Up and Fixes, typo(s) fixed: ’s → 's (2) |
Line 65:
Each file is split into multiple chunks of 64 megabytes. Each chunk is stored on a chunk server. A chunk is identified by a chunk handle, a globally unique 64-bit number that the master assigns when the chunk is first created.
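The chunk arithmetic described above can be sketched as follows. This is a minimal illustration, not GFS code: the names <code>chunk_count</code>, <code>chunk_index</code> and <code>HandleAllocator</code> are hypothetical, the megabyte is assumed to be binary (2<sup>20</sup> bytes), and the sequential counter is only one possible way for a master to hand out unique 64-bit handles.

```python
CHUNK_SIZE = 64 * 1024 * 1024  # 64 MiB per chunk (assumes binary megabytes)
MAX_HANDLE = 2**64             # chunk handles are 64-bit numbers


def chunk_count(file_size: int) -> int:
    """Number of chunks needed to hold file_size bytes (ceiling division)."""
    return (file_size + CHUNK_SIZE - 1) // CHUNK_SIZE


def chunk_index(offset: int) -> int:
    """Index of the chunk that contains the given byte offset."""
    return offset // CHUNK_SIZE


class HandleAllocator:
    """Hypothetical stand-in for the master's handle assignment.

    A plain counter is assumed here; it guarantees uniqueness only
    because a single allocator issues every handle.
    """

    def __init__(self) -> None:
        self._next = 0

    def new_handle(self) -> int:
        if self._next >= MAX_HANDLE:
            raise OverflowError("64-bit handle space exhausted")
        handle = self._next
        self._next += 1
        return handle
```

Under these assumptions, a 200 MiB file needs <code>chunk_count(200 * 1024 * 1024)</code> = 4 chunks, the last of which is only partly filled.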
The master maintains all of the files' metadata, including file names, directories, and the mapping of files to the list of chunks that contain each
===== Fault tolerance =====
Line 94:
On an HDFS cluster, a file is split into one or more equal-size blocks, except that the last block may be smaller. Each block is stored on a DataNode and may be replicated on multiple DataNodes to guarantee availability. By default, each block is replicated three times, a process called "Block Level Replication".<ref name="admaov_2">{{harvnb|Adamov|2012|p=2}}</ref>
The NameNode manages the file system namespace operations such as opening, closing, and renaming files and directories, and regulates file access. It also determines the mapping of blocks to DataNodes. The DataNodes are responsible for servicing read and write requests from the file
When a client wants to read or write data, it contacts the NameNode, which determines where the data should be read from or written to. The client then has the location of the DataNode and can send read or write requests to it.
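The lookup flow above can be illustrated with a toy model. This is a sketch under stated assumptions, not the Hadoop API: the class <code>NameNode</code> and its methods <code>add_file</code> and <code>locate</code> are hypothetical, the block size is a constructor parameter because the text does not fix it, and replicas are placed round-robin for simplicity, which is not how a real cluster chooses DataNodes.

```python
class NameNode:
    """Toy model of the block-to-DataNode mapping (hypothetical, not Hadoop code)."""

    def __init__(self, datanodes, block_size, replication=3):
        if len(datanodes) < replication:
            raise ValueError("need at least as many DataNodes as replicas")
        self.datanodes = list(datanodes)
        self.block_size = block_size
        self.replication = replication  # three copies by default, as in the text
        self.files = {}  # file name -> (size, list of replica lists, one per block)

    def add_file(self, name, size):
        """Split a file into equal-size blocks (the last may be smaller)
        and assign each block to `replication` distinct DataNodes."""
        if size <= 0:
            raise ValueError("size must be positive")
        n_blocks = (size + self.block_size - 1) // self.block_size
        n = len(self.datanodes)
        blocks = [
            # Round-robin placement: an assumption made for this sketch only.
            [self.datanodes[(i + r) % n] for r in range(self.replication)]
            for i in range(n_blocks)
        ]
        self.files[name] = (size, blocks)

    def locate(self, name, offset):
        """Return the DataNodes holding the block that contains `offset`."""
        size, blocks = self.files[name]
        if not 0 <= offset < size:
            raise IndexError("offset outside file")
        return blocks[offset // self.block_size]


# The client asks the NameNode once, then talks to a DataNode from the answer.
namenode = NameNode(["dn1", "dn2", "dn3", "dn4"], block_size=100)
namenode.add_file("example.log", 250)        # 3 blocks: 100 + 100 + 50 bytes
replicas = namenode.locate("example.log", 249)
```

In this model the NameNode only answers the location query; the data itself would then be read from or written to one of the returned DataNodes.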