Revision as of 18:47, 12 March 2025 by Eelmealdeal (274 edits), minor edit: Added mention of Hamming codes
Revision as of 17:13, 19 April 2025 by FrozenBirdXD (15 edits): Clean up/copyedit, improve readability (Tag: Visual edit)
Line 40:
|} ]]
In [[computer science]] and [[information theory]], a '''Huffman code''' is a particular type of optimal [[prefix code]] that is commonly used for [[lossless data compression]]. The process of finding or using such a code is '''Huffman coding''', an algorithm developed by [[David A. Huffman]] while he was a [[Doctor of Science|Sc.D.]] student at [[Massachusetts Institute of Technology|MIT]], and published in the 1952 paper "A Method for the Construction of Minimum-Redundancy Codes".<ref name=":0">{{Cite journal | last1 = Huffman | first1 = D. |author-link1=David A. Huffman| title = A Method for the Construction of Minimum-Redundancy Codes | doi = 10.1109/JRPROC.1952.273898 | journal = [[Proceedings of the IRE]]| volume = 40 | issue = 9 | pages = 1098–1101 | year = 1952 | url = http://compression.ru/download/articles/huff/huffman_1952_minimum-redundancy-codes.pdf}}</ref> The output from Huffman's algorithm can be viewed as a [[variable-length code]] table for encoding a source symbol (such as a character in a file). The algorithm derives this table from the estimated probability or frequency of occurrence (''weight'') for each possible value of the source symbol. As in other [[entropy encoding]] methods, more common symbols are generally represented using fewer bits than less common symbols.
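The construction described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation: the function names <code>huffman_code</code>, <code>encode</code>, and <code>decode</code> are hypothetical, the weights are raw character counts, ties are broken by insertion order (an arbitrary choice; other tie-breaks give different but equally short codes), and a binary heap is used for simplicity.

```python
import heapq
from collections import Counter
from itertools import count


def huffman_code(weights):
    """Return {symbol: bitstring} for a mapping of symbol -> positive weight."""
    if not weights:
        return {}
    if len(weights) == 1:
        # A lone symbol still needs one bit so the output can be decoded.
        return {next(iter(weights)): "0"}

    tie = count()  # unique tie-breaker so tree nodes are never compared
    heap = [(w, next(tie), ("leaf", sym)) for sym, w in weights.items()]
    heapq.heapify(heap)

    # Repeatedly merge the two lowest-weight subtrees into one.
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (w1 + w2, next(tie), ("node", left, right)))

    # Walk the finished tree: left edge = "0", right edge = "1".
    codes = {}
    stack = [(heap[0][2], "")]
    while stack:
        node, prefix = stack.pop()
        if node[0] == "leaf":
            codes[node[1]] = prefix
        else:
            stack.append((node[1], prefix + "0"))
            stack.append((node[2], prefix + "1"))
    return codes


def encode(text, codes):
    """Concatenate the codeword of every symbol in text."""
    return "".join(codes[ch] for ch in text)


def decode(bits, codes):
    """Invert encode(); works because no codeword is a prefix of another."""
    inverse = {code: sym for sym, code in codes.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:
            out.append(inverse[current])
            current = ""
    return "".join(out)


if __name__ == "__main__":
    text = "abracadabra"
    table = huffman_code(Counter(text))
    print(table)
    print(len(encode(text, table)), "bits")
```

For the string "abracadabra", the most frequent symbol ''a'' receives a one-bit codeword and the encoded message is 23 bits long, compared with 33 bits for a fixed three-bit code over the five distinct symbols.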
Huffman's method can be efficiently implemented, finding a code in time [[linear time|linear]] in the number of input weights if these weights are sorted.<ref>{{cite journal | first = Jan | last = Van Leeuwen | author-link = Jan van Leeuwen | url = http://www.staff.science.uu.nl/~leeuw112/huffman.pdf | title = On the construction of Huffman trees | journal = ICALP | year =1976 | pages = 382–410 | access-date = 20 February 2014}}</ref> However, although optimal among methods encoding symbols separately, Huffman coding [[#Optimality|is not always optimal]] among all compression methods – it is replaced with [[arithmetic coding]]<ref name="LiDrew2014">{{cite book|author1=Ze-Nian Li|author2=Mark S. Drew|author3=Jiangchuan Liu|title=Fundamentals of Multimedia|url=https://books.google.com/books?id=R6vBBAAAQBAJ|date=9 April 2014|publisher=Springer Science & Business Media|isbn=978-3-319-05290-8}}</ref> or [[asymmetric numeral systems]]<ref name=PCS2015>[https://ieeexplore.ieee.org/document/7170048 J. Duda, K. Tahboub, N. J. Gadil, E. J. Delp, ''The use of asymmetric numeral systems as an accurate replacement for Huffman coding''], Picture Coding Symposium, 2015.</ref> if a better compression ratio is required.

Line 248:
=== ''n''-ary Huffman coding ===
− The '''''n''-ary Huffman''' algorithm uses the {0, 1, ..., ''n'' − 1} alphabet to encode message and build an ''n''-ary tree. This approach was considered by Huffman in his original paper. The same algorithm applies as for binary (<math alt="''n'' equals 2">n = 2</math>) codes, except that the ''n'' least probable symbols are taken together, instead of just the 2 least probable. Note that for ''n'' greater than 2, not all sets of source words can properly form an ''n''-ary tree for Huffman coding. In these cases, additional 0-probability place holders must be added. This is because the tree must form an ''n'' to 1 contractor;{{clarify|date=September 2023}} for binary coding, this is a 2 to 1 contractor, and any sized set can form such a contractor. If the number of source words is congruent to 1 modulo ''n'' − 1, then the set of source words will form a proper Huffman tree.
+ The '''''n''-ary Huffman''' algorithm uses an alphabet of size ''n'', typically {0, 1, ..., ''n'' − 1}, to encode messages and build an ''n''-ary tree. This approach was considered by Huffman in his original paper. The same algorithm applies as for binary (<math alt="''n'' equals 2">n = 2</math>) codes, but instead of combining the two least likely symbols, the ''n'' least likely symbols are grouped together. Note that for ''n'' > 2, not all sets of source words can properly form a complete ''n''-ary tree for Huffman coding. In these cases, additional placeholder symbols with 0 probability may need to be added. This is because the structure of the tree needs to repeatedly join ''n'' branches into one, also known as an "''n'' to 1" combination. For binary coding, this is a "2 to 1" combination, which works with any number of symbols. For ''n''-ary coding, a complete tree is only possible when the total number of symbols (real + placeholders) leaves a remainder of 1 when divided by (''n'' − 1).<ref name=":0" />
=== Adaptive Huffman coding ===

Huffman coding: Difference between revisions