Content deleted Content added
Undid revision 1115426231 by 67.162.198.56 (talk) |
|||
(22 intermediate revisions by 15 users not shown) | |||
Line 1:
{{Short description|Type of data transmission method}}
{{Distinguish|Elias delta coding|delta modulation}}
Line 5 ⟶ 6:
The differences are recorded in discrete files called "deltas" or "diffs". In situations where differences are small – for example, the change of a few words in a large document or the change of a few records in a large table – delta encoding greatly reduces data redundancy. Collections of unique deltas are substantially more space-efficient than their non-encoded equivalents.
From a logical point of view, the difference between two data values is the information required to obtain one value from the other – see [[relative entropy]]. The difference between identical values (under some [[equivalence relation|equivalence]]) is often called ''0'' or the neutral element.
==Simple example==
Perhaps the simplest example is storing values of bytes as differences (deltas) between sequential values, rather than the values themselves. So, instead of 2, 4, 6, 9, 7, we would store 2, 2, 2, 3, −2. This reduces the [[variance]] (range) of the values when neighbor samples are correlated, enabling a lower bit usage for the same data. [[Interchange File Format|IFF]] [[8SVX]] sound format applies this encoding to raw sound data before applying compression to it.
==Definition==
A delta can be defined in 2 ways, ''symmetric delta'' and ''directed delta''. A ''symmetric delta'' can be expressed as
: <math>\Delta(
A ''directed delta'', also called a change, is a sequence of (elementary) change operations which, when applied to
===Variants===
Line 55:
==Examples==
===Binary delta compression===
{{excerpt|Binary delta compression}}
===Delta encoding in HTTP===
Line 69 ⟶ 71:
=== Delta copying ===
''Delta copying'' is a fast way of copying a file that is partially changed, when a previous version is present on the destination ___location. With delta copying, only the changed part of a file is copied. It is usually used in [[backup]] or [[file copying]] software, often to save [[Bandwidth (computing)|bandwidth]] when copying between computers over a private network or the internet. One notable open-source example is [[rsync]].<ref>{{Cite web |url=http://www.2brightsparks.com/bb/viewtopic.php?t=4473 |title=Feature request: Delta copying - 2BrightSparks |access-date=2016-04-29 |archive-date=2016-03-13 |archive-url=https://web.archive.org/web/20160313021725/http://www.2brightsparks.com/bb/viewtopic.php?t=4473 |url-status=dead }}</ref><ref>{{Cite web|url=https://www.bvckup2.com/support/forum/topic/739|title = Bvckup 2 | Forum | How delta copying works}}</ref><ref>[http://www.eggheadcafe.com/software/aspnet/33678264/delta-copying.aspx]{{Dead link|date=July 2019 |bot=InternetArchiveBot |fix-attempted=yes }}</ref>
==== Online backup ====
Line 86 ⟶ 88:
{{main|Git (software)}}
The Git source code control system employs delta compression in an auxiliary "[http://git-scm.com/docs/git-repack git repack]" operation. Objects in the repository that have not yet been delta-compressed ("loose objects") are compared against a heuristically chosen subset of all other objects, and the common data and differences are concatenated into a "pack file" which is then compressed using conventional methods. In common use cases, where source or data files are changed incrementally between commits, this can result in significant space savings. The repack operation is typically performed as part of the "
The format is documented in the pack-format page of the Git documentation. It implements a directed delta.<ref>{{cite web |title=Git - pack-format Documentation |url=https://git-scm.com/docs/pack-format |website=Git documentation |access-date=13 January 2020}}</ref>
Line 95 ⟶ 97:
===GDIFF===
Generic Diff Format (GDIFF) is another directed delta encoding format. It was submitted to [[W3C]] in 1997.<ref>
===bsdiff===
Bsdiff is a binary diff program using [[Suffix array|suffix sorting]]. For executables that contain many changes in pointer addresses, it performs better than VCDIFF-type "copy and literal" encodings. The intent is to find a way to generate a small diff without needing to parse assembly code (as in Google's Courgette). Bsdiff
Bsdiff is useful for delta updates. Google uses bsdiff in Chromium and Android. The ''deltarpm'' feature of the [[RPM Package Manager]] is based on a heavily
Since the 4.3 release of bsdiff in 2005, various improvements or fixes have been produced for it. Google maintains multiple versions of the code for each of its products.<ref>{{cite web |title=xtraeme/bsdiff-chromium: README.chromium |url=https://github.com/xtraeme/bsdiff-chromium/blob/master/README.chromium |website=GitHub |language=en |date=2012}}; {{cite web |title=courgette/third_party/bsdiff/README.chromium - chromium/src |url=https://chromium.googlesource.com/chromium/src/+/master/courgette/third_party/bsdiff/README.chromium |website=Git at Google}}; {{cite web |title= android/platform/external/bsdiff/|url=https://android.googlesource.com/platform/external/bsdiff/+/refs/heads/master|website=Git at Google}}</ref> FreeBSD takes many of Google's compatible changes, mainly a vulnerability fix and a switch to the faster {{code|divsufsort}} suffix-sorting routine.<ref>{{cite web|url=https://github.com/freebsd/freebsd/commits/master/usr.bin/bsdiff|website=GitHub|title=History for freebsd/usr.bin/bsdiff}}</ref> [[Debian]] has a series of performance tweaks to the program.<ref>{{cite web|url=https://sources.debian.org/patches/bsdiff/ |website=Debian Patch Tracker |title=Package: bsdiff}}</ref>
Line 107 ⟶ 109:
==See also==
*
*
*
*
* [[Xdelta]]: open-source delta encoder
Line 117 ⟶ 120:
==External links==
*
{{Compression methods}}
Line 126 ⟶ 129:
[[Category:Data differencing]]
[[Category:Articles with example C code]]
[[Category:Data compression]]
|