Simple file verification: Difference between revisions

GreenC bot (talk | contribs)
Rescued 1 archive link; reformat 1 link. Wayback Medic 2.5 per WP:USURPURL and JUDI batch #27am
wording: "CRC-32"
Line 17:
| latest release version =
| latest release date =
| genre = [[Plain text]] list of [[CRC-32]] [[checksum]]s
| container for =
| contained by =
Line 25:
| url =
}}
'''Simple file verification''' ('''SFV''') is a file format for storing [[CRC-32]] [[checksum]]s of files so that their integrity can be verified. SFV is used to verify that a file has not been [[data corruption|corrupted]], but it does not otherwise verify the file's [[Information security#Authenticity|authenticity]]. The <code>.sfv</code> [[file extension]] is usually used for SFV files.<ref name="stealthisbook"/>
 
== Checksum ==
Files can become corrupted for a variety of reasons, including faulty [[Computer Storage|storage media]], errors in [[Transmission (telecommunications)|transmission]], write errors during [[copying]] or moving, and [[software bug]]s. SFV verification detects such corruption by comparing the file's [[cyclic redundancy check|CRC]] [[Hash function|hash]] value to a previously calculated value.<ref name="stealthisbook"/> Because the number of possible checksums is large but finite, many different files share the same checksum under any checksum scheme, and a [[hash collision]] can produce a [[false positive]] in which a corrupted file passes verification. With random corruption, however, the probability that a corrupted file has the same checksum as its original is exceedingly small, unless the file was deliberately constructed to preserve the checksum.
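
The comparison described above can be sketched in Python. This is a minimal illustration, not the behaviour of any particular SFV utility: the function names <code>crc32_of_file</code> and <code>matches</code> and the chunk size are assumptions chosen for the example, and the CRC-32 computation relies on the standard library's <code>zlib.crc32</code>.

```python
import zlib


def crc32_of_file(path, chunk_size=65536):
    """Return the CRC-32 of the file at `path` as an unsigned 32-bit integer.

    The file is read in chunks so that large files need not fit in memory.
    """
    crc = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            # zlib.crc32 accepts a running value, so chunks can be chained.
            crc = zlib.crc32(chunk, crc)
    return crc & 0xFFFFFFFF


def matches(path, expected_hex):
    """Return True if the file's CRC-32 equals the previously recorded value.

    `expected_hex` is the checksum as a hexadecimal string (hypothetical
    helper input; case is ignored by int()).
    """
    return crc32_of_file(path) == int(expected_hex, 16)
```

A mismatch indicates that the file differs from the one whose checksum was recorded; a match only indicates that no difference was detected.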
 
SFV cannot be used to verify the authenticity of files, as CRC-32 is not a [[collision resistance|collision resistant]] hash function; even if the hash sum file is not tampered with, it is computationally trivial for an attacker to cause deliberate hash collisions, so a malicious change to the file can go undetected by a hash comparison. In cryptography, this attack is called a [[collision attack]]. For this reason, the [[md5sum]] and [[sha1sum]] utilities, which use the [[MD5]] and [[SHA-1]] [[cryptographic hash function]]s respectively, are often preferred in [[Unix]] operating systems.
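
One way to see why CRC-32 offers no collision resistance is its algebraic structure: for inputs of equal length, the checksum of the bytewise XOR of three messages equals the XOR of their three checksums. The Python sketch below only checks this property using the standard library's <code>zlib.crc32</code>; it is an illustration of the predictability of the checksum, not a forgery tool, and the helper names are assumptions made for the example.

```python
import zlib


def xor_bytes(a, b, c):
    """Bytewise XOR of three equal-length byte strings."""
    return bytes(x ^ y ^ z for x, y, z in zip(a, b, c))


def crc_is_affine(a, b, c):
    """Check that crc32(a ^ b ^ c) == crc32(a) ^ crc32(b) ^ crc32(c).

    The identity holds only when all three inputs have the same length,
    so unequal lengths are rejected.
    """
    if not (len(a) == len(b) == len(c)):
        raise ValueError("inputs must have equal length")
    combined = zlib.crc32(xor_bytes(a, b, c))
    return combined == (zlib.crc32(a) ^ zlib.crc32(b) ^ zlib.crc32(c))
```

Because the effect of a change on the checksum can be computed in advance, an attacker can choose a further change that cancels it out. A cryptographic hash function is designed so that no such shortcut is known.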
 
Even a single-bit error causes both SFV's CRC and md5sum's cryptographic hash to fail, requiring the entire file to be re-fetched.
The [[Parchive]] and [[rsync]] utilities are often preferred for verifying that a file has not been accidentally corrupted in transmission, since they can correct common small errors with a much shorter download.
 
Despite the weaknesses of the SFV format, it is popular due to the relatively small amount of time taken by SFV utilities to calculate the CRC-32 checksums when compared to the time taken to calculate cryptographic hashes such as MD5 or SHA-1.
 
SFV uses a [[plain text]] file containing one line for each file and its checksum<ref name="stealthisbook"/> in the format ''FILENAME<whitespaces>CHECKSUM''. Any line starting with a semicolon ';' is considered to be a comment and is ignored for the purposes of file verification. The delimiter between the filename and checksum is always one or more spaces; tabs are never used. A sample SFV file is: