Audio coding format: Difference between revisions

Citation bot (talk | contribs)
Added bibcode. Removed URL that duplicated identifier. Removed parameters. | Use this bot. Report bugs. | Suggested by Headbomb | Linked from Wikipedia:WikiProject_Academic_Journals/Journals_cited_by_Wikipedia/Sandbox | #UCB_webform_linked 203/1032
 
(20 intermediate revisions by 6 users not shown)
Line 1:
{{short description|Digitally coded format for audio signals}}
[[File:Opus quality comparison colorblind compatible.svg|thumb|Comparison of coding efficiency between popular audio formats]]
An '''audio coding format'''<ref>The term "audio coding" can be seen in e.g. the name [[Advanced Audio Coding]], and is analogous to the term [[video coding format|video coding]]</ref> (or sometimes '''audio compression format''') is a [[encoded format|content representation format]] for storage or transmission of [[digital audio]] (such as in [[digital television]], [[digital radio]], and audio and video files). Examples of audio coding formats include [[MP3]], [[Advanced Audio Coding|AAC]], [[Vorbis]], [[FLAC]], and [[Opus (audio format)|Opus]]. A specific software or hardware implementation capable of [[Data_compression#Audio|audio compression]] and decompression to/from a specific audio coding format is called an ''[[audio codec]]''; an example of an audio codec is [[LAME]], one of several codecs that implement encoding and decoding of audio in the [[MP3]] audio coding format in software.
 
Some audio coding formats are documented by a detailed [[technical specification]] document known as an '''audio coding specification'''. Some such specifications are written and approved by [[standardization organization]]s as [[technical standard]]s, and are thus known as '''audio coding standards'''. The term "standard" is also sometimes used for [[de facto standard|''de facto'' standards]] as well as formal standards.
Line 21:
[[File:Placa-audioPC-925.jpg|right|thumb|Solidyne 922: The world's first commercial audio bit compression [[sound card]] for PC, 1990]]
 
In 1950, [[Bell Labs]] filed a patent on [[differential pulse-code modulation]] (DPCM).<ref name="DPCM">{{US patent reference|inventor=C. Chapin Cutler|title=Differential Quantization of Communication Signals|number=2605361|A-Datum=1950-06-29|issue-date=1952-07-29}}</ref> [[Adaptive DPCM]] (ADPCM) was introduced by P. Cummiskey, [[Nikil Jayant|Nikil S. Jayant]] and [[James L. Flanagan]] at [[Bell Labs]] in 1973.<ref>{{cite journal|doi=10.1002/j.1538-7305.1973.tb02007.x|url=https://ieeexplore.ieee.org/document/6770730|title=Adaptive Quantization in Differential PCM Coding of Speech|year=1973|last1=Cummiskey|first1=P.|last2=Jayant|first2=N. S.|last3=Flanagan|first3=J. L.|journal=Bell System Technical Journal|volume=52|issue=7|pages=1105–1118|issn=0005-8580|url-access=subscription}}</ref>
 
[[Perceptual coding]] was first used for [[speech coding]], with [[linear predictive coding]] (LPC).<ref name="Schroeder2014">{{cite book |last1=Schroeder |first1=Manfred R. |title=Acoustics, Information, and Communication: Memorial Volume in Honor of Manfred R. Schroeder |date=2014 |publisher=Springer |isbn=9783319056609 |chapter=Bell Laboratories |page=388 |chapter-url=https://books.google.com/books?id=d9IkBAAAQBAJ&pg=PA388}}</ref> Initial concepts for LPC date back to the work of [[Fumitada Itakura]] ([[Nagoya University]]) and Shuzo Saito ([[Nippon Telegraph and Telephone]]) in 1966.<ref>{{cite journal |last1=Gray |first1=Robert M. |title=A History of Realtime Digital Speech on Packet Networks: Part II of Linear Predictive Coding and the Internet Protocol |journal=Found. Trends Signal Process. |date=2010 |volume=3 |issue=4 |pages=203–303 |doi=10.1561/2000000036 |url=https://ee.stanford.edu/~gray/lpcip.pdf |issn=1932-8346|doi-access=free }}</ref> During the 1970s, [[Bishnu S. Atal]] and [[Manfred R. Schroeder]] at [[Bell Labs]] developed a form of LPC called [[adaptive predictive coding]] (APC), a perceptual coding algorithm that exploited the masking properties of the human ear. It was followed in the early 1980s by the [[code-excited linear prediction]] (CELP) algorithm, which achieved a significant compression ratio for its time.<ref name="Schroeder2014"/> Perceptual coding is used by modern audio compression formats such as [[MP3]]<ref name="Schroeder2014"/> and [[Advanced Audio Coding|AAC]].
 
[[Discrete cosine transform]] (DCT), developed by [[Nasir Ahmed (engineer)|Nasir Ahmed]], T. Natarajan and [[K. R. Rao]] in 1974,<ref name="DCT">{{cite journal |author1=Nasir Ahmed |author1-link=N. Ahmed |author2=T. Natarajan |author3=Kamisetty Ramamohan Rao |journal=IEEE Transactions on Computers |title=Discrete Cosine Transform |volume=C-23 |issue=1 |pages=90–93 |date=January 1974 |doi=10.1109/T-C.1974.223784 |s2cid=149806273 |url=https://www.ic.tu-berlin.de/fileadmin/fg121/Source-Coding_WS12/selected-readings/Ahmed_et_al.__1974.pdf |access-date=2019-10-20 |archive-date=2016-12-08 |archive-url=https://web.archive.org/web/20161208075733/https://www.ic.tu-berlin.de/fileadmin/fg121/Source-Coding_WS12/selected-readings/Ahmed_et_al.__1974.pdf |url-status=dead }}</ref> provided the basis for the [[modified discrete cosine transform]] (MDCT) used by modern audio compression formats such as MP3<ref name="Guckert">{{cite web |last1=Guckert |first1=John |title=The Use of FFT and MDCT in MP3 Audio Compression |url=http://www.math.utah.edu/~gustafso/s2012/2270/web-projects/Guckert-audio-compression-svd-mdct-MP3.pdf |website=[[University of Utah]] |date=Spring 2012 |access-date=14 July 2019}}</ref> and AAC. MDCT was proposed by J. P. Princen, A. W. Johnson and A. B. Bradley in 1987,<ref>{{cite book|doi=10.1109/ICASSP.1987.1169405|chapter-url=https://ieeexplore.ieee.org/document/1169405|chapter=Subband/Transform coding using filter bank designs based on time domain aliasing cancellation|title=ICASSP '87. IEEE International Conference on Acoustics, Speech, and Signal Processing|year=1987|last1=Princen|first1=J.|last2=Johnson|first2=A.|last3=Bradley|first3=A.|volume=12|pages=2161–2164|s2cid=58446992}}</ref> following earlier work by Princen and Bradley in 1986.<ref>{{cite journal|doi=10.1109/TASSP.1986.1164954|url=https://ieeexplore.ieee.org/document/1164954|title=Analysis/Synthesis filter bank design based on time domain aliasing cancellation|year=1986|last1=Princen|first1=J.|last2=Bradley|first2=A.|journal=IEEE Transactions on Acoustics, Speech, and Signal Processing|volume=34|issue=5|pages=1153–1161}}</ref> The MDCT is used by modern audio compression formats such as [[Dolby Digital]],<ref name="Luo">{{cite book |last1=Luo |first1=Fa-Long |title=Mobile Multimedia Broadcasting Standards: Technology and Practice |date=2008 |publisher=[[Springer Science & Business Media]] |isbn=9780387782638 |page=590 |url=https://books.google.com/books?id=l6PovWat8SMC&pg=PA590}}</ref><ref>{{cite journal |last1=Britanak |first1=V. |title=On Properties, Relations, and Simplified Implementation of Filter Banks in the Dolby Digital (Plus) AC-3 Audio Coding Standards |journal=IEEE Transactions on Audio, Speech, and Language Processing |date=2011 |volume=19 |issue=5 |pages=1231–1241 |doi=10.1109/TASL.2010.2087755|bibcode=2011ITASL..19.1231B |s2cid=897622 }}</ref> [[MP3]],<ref name="Guckert"/> and [[Advanced Audio Coding]] (AAC).<ref name=brandenburg>{{cite web|url=http://graphics.ethz.ch/teaching/mmcom12/slides/mp3_and_aac_brandenburg.pdf|title=MP3 and AAC Explained|last=Brandenburg|first=Karlheinz|year=1999|url-status=live|archive-url=https://web.archive.org/web/20170213191747/https://graphics.ethz.ch/teaching/mmcom12/slides/mp3_and_aac_brandenburg.pdf|archive-date=2017-02-13}}</ref>
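
The time domain aliasing cancellation named in the Princen and Bradley titles can be illustrated with a small sketch. The code below is not taken from any codec or from the cited papers; the function names (<code>sine_window</code>, <code>mdct</code>, <code>imdct</code>, <code>analyze</code>, <code>synthesize</code>) are hypothetical, the direct O(N²) summation stands in for the fast algorithms real encoders use, and a sine window is assumed because it satisfies the Princen–Bradley condition. Each frame of 2N samples yields only N coefficients, so a single inverse transform is aliased; adding the windowed outputs of neighbouring 50%-overlapping frames cancels that aliasing.

```python
import math


def sine_window(length):
    """Sine window; satisfies w[t]**2 + w[t + length // 2]**2 == 1."""
    return [math.sin(math.pi * (t + 0.5) / length) for t in range(length)]


def mdct(frame):
    """Map a frame of 2N (already windowed) samples to N coefficients."""
    n = len(frame) // 2
    return [
        sum(frame[t] * math.cos(math.pi / n * (t + 0.5 + n / 2) * (k + 0.5))
            for t in range(2 * n))
        for k in range(n)
    ]


def imdct(coeffs):
    """Map N coefficients back to 2N time-aliased samples (2/N scaling)."""
    n = len(coeffs)
    return [
        2.0 / n * sum(coeffs[k] * math.cos(math.pi / n * (t + 0.5 + n / 2) * (k + 0.5))
                      for k in range(n))
        for t in range(2 * n)
    ]


def analyze(signal, n):
    """Window 50%-overlapping frames of 2n samples and transform each one."""
    if len(signal) % n != 0:
        raise ValueError("this sketch assumes len(signal) is a multiple of n")
    window = sine_window(2 * n)
    # Pad n zeros on each side so every real sample lies in two frames.
    padded = [0.0] * n + list(signal) + [0.0] * n
    blocks = []
    for start in range(0, len(padded) - 2 * n + 1, n):
        frame = [padded[start + t] * window[t] for t in range(2 * n)]
        blocks.append(mdct(frame))
    return blocks


def synthesize(blocks, n):
    """Inverse-transform, window again, and overlap-add with hop size n."""
    window = sine_window(2 * n)
    out = [0.0] * (n * (len(blocks) + 1))
    for i, coeffs in enumerate(blocks):
        frame = imdct(coeffs)
        for t in range(2 * n):
            out[i * n + t] += frame[t] * window[t]
    return out[n:-n]  # drop the padding added in analyze()


if __name__ == "__main__":
    samples = [math.sin(0.3 * t) + 0.5 * math.cos(1.1 * t) for t in range(32)]
    restored = synthesize(analyze(samples, 8), 8)
    print(max(abs(a - b) for a, b in zip(samples, restored)))
```

The sketch omits quantization of the coefficients, which is where a lossy format actually discards information; without it the overlap-add output matches the input up to floating-point rounding.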
 
==List of lossy formats==
Line 31:
{| class="wikitable sortable"
|-
! rowspan="2" | Basic compression algorithm
! rowspan="2" | Audio coding standard
! rowspan="2" | Abbreviation
! rowspan="2" | Introduction
! colspan="2" | Market share {{small|(2023)}}<ref name="Bitmovin">{{cite web |url=https://cdn2.hubspot.net/hubfs/3411032/Bitmovin%20Magazine/Video%20Developer%20Report%202019/bitmovin-video-developer-report-2019.pdf |title=Video Developer Report 2019 |website=[[Bitmovin]] |year=2019 |access-date=5 November 2019}}</ref>
! {{Abbr|Ref|Reference(s)}}
|-
!Production
!Streaming
!
|-
| rowspan="11" | [[Modified discrete cosine transform]] (MDCT)
| [[Dolby Digital]] (AC-3)
| AC3
| 1991
| rowspan="2" | 36–54%{{refn|group=n|name=MarketShareNote|The report combines AC-3 & E-AC-3 and separates [[Dolby Atmos]] from its market share calculation. Dolby Atmos can be encoded either lossily with E-AC-3/[[Dolby AC-4|AC-4]]<ref>{{Cite web |date=2023-05-23 |title=Does Dolby AC-4 support Dolby Atmos? |url=https://professionalsupport.dolby.com/s/article/Does-Dolby-AC-4-support-Dolby-Atmos |access-date=2024-11-08 |website=Dolby Professional Support}}</ref> or losslessly with [[Dolby TrueHD]]. [[Music streaming service|Music]] and [[Video on demand|video streaming]] providers typically use Dolby Digital Plus augmented with Dolby Atmos, whereas [[Music download|digital downloads]] and [[Blu-ray|Blu-ray discs]] typically use Dolby TrueHD augmented with Dolby Atmos.<ref>{{Cite web |date=2023-05-03 |title=Just wait until you hear lossless Dolby Atmos Music |url=https://www.digitaltrends.com/home-theater/lossless-spatial-audio-dolby-atmos-music/ |access-date=2024-11-08 |website=Digital Trends |language=en}}</ref>}}
| rowspan="2" |37–61%{{refn|group=n|name=MarketShareNote}}
| <ref name="Luo" /><ref name="Britanak2011">{{cite journal |last1=Britanak |first1=V. |title=On Properties, Relations, and Simplified Implementation of Filter Banks in the Dolby Digital (Plus) AC-3 Audio Coding Standards |journal=IEEE Transactions on Audio, Speech, and Language Processing |date=2011 |volume=19 |issue=5 |pages=1231–1241 |doi=10.1109/TASL.2010.2087755|bibcode=2011ITASL..19.1231B |s2cid=897622 }}</ref>
|-
|[[Dolby Digital|Dolby Digital Plus]] (E-AC-3)
|EAC3
|2004
|<ref>{{cite web |last1=Andersen |first1=Robert Loring |last2=Crockett |first2=B. |last3=Davidson |first3=G. |last4=Davis |first4=Mark |last5=Fielder |first5=L. |last6=Turner |first6=Stephen C. |last7=Vinton |first7=M. |last8=Williams |first8=P. |date=1 October 2004 |title=Introduction to Dolby Digital Plus, an Enhancement to the Dolby Digital Coding System |url=https://www.dolby.com/us/en/technologies/aes-convention-paper-intro-to-dolby-digital-plus.pdf |archive-url=https://web.archive.org/web/20161119192949/https://www.dolby.com/us/en/technologies/aes-convention-paper-intro-to-dolby-digital-plus.pdf |archive-date=2016-11-19 |website=Journal of The Audio Engineering Society}}</ref><ref>{{Citation |title=Digital Audio Compression (AC-3, Enhanced AC-3) Standard |date=20 September 2017 |url=https://www.etsi.org/deliver/etsi_ts/102300_102399/102366/01.04.01_60/ts_102366v010401p.pdf |access-date=21 September 2023 |publisher=European Telecommunications Standards Institute |id=ETSI TS 102 366 V1.4.1 (2017-09)}}</ref>
|-
| [[ATRAC|Adaptive Transform Acoustic Coding]]
Line 49 ⟶ 59:
| 1992
| {{unk}}
| {{unk}}
| <ref name="Luo" />
|-
| [[MPEG Layer III]]
| MP3
| 1993
| 15%
|19%
| <ref name="Guckert" /><ref name="Stankovic">{{cite journal |last1=Stanković |first1=Radomir S. |last2=Astola |first2=Jaakko T. |title=Reminiscences of the Early Work in DCT: Interview with K.R. Rao |journal=Reprints from the Early Days of Information Sciences |date=2012 |volume=60 |url=http://ticsp.cs.tut.fi/reports/ticsp-report-60-reprint-rao-corrected.pdf |access-date=13 October 2019}}</ref>
|-
| [[Advanced Audio Coding]] ([[MPEG-2]] / [[MPEG-4]])
| AAC
| 1997
| 83%
|87%
| <ref name="brandenburg" /><ref name="Luo" />
|-
| [[Windows Media Audio]]
Line 67 ⟶ 80:
| 1999
| {{unk}}
| {{unk}}
| <ref name="Luo" />
|-
| [[Ogg]] [[Vorbis]]
| Ogg
| 2000
| 6%
|4%
| <ref name="vorbis-mdct">{{cite web |author=Xiph.Org Foundation |publisher=Xiph.Org Foundation |url=http://www.xiph.org/vorbis/doc/Vorbis_I_spec.html#x1-50001.1.2 |title=Vorbis I specification - 1.1.2 Classification |date=2009-06-02 |access-date=2009-09-22}}</ref><ref name="Luo" />
|-
| [[CELT|Constrained Energy Lapped Transform]]
| CELT
| 2011
| {{n/a}}
| {{n/a}}
| <ref name="presentation">{{cite AV media|url=http://people.xiph.org/~greg/video/linux_conf_au_CELT_2.ogv|title=Presentation of the CELT codec|first=Timothy B.|last=Terriberry|transcript-url=http://www.celt-codec.org/presentations/misc/lca-celt.pdf|transcript=Presentation}}</ref>
Line 84 ⟶ 100:
| Opus
| 2012
| 12%
|9%
| <ref>{{cite conference|last1=Valin|first1=Jean-Marc|last2=Maxwell|first2=Gregory|last3=Terriberry|first3=Timothy B.|last4=Vos|first4=Koen|date=October 2013|title=High-Quality, Low-Delay Music Coding in the Opus Codec|conference=135th AES Convention|publisher=[[Audio Engineering Society]]|arxiv=1602.04845}}</ref>
|-
|[[Dolby AC-4]]
|AC4
|2014
| {{unk}}
| {{unk}}
|<ref name="DolbyAC4ServicesJune2015Dolby">{{cite news |date=2015-06-01 |title=Dolby AC-4: Audio Delivery for Next-Generation Entertainment Services |url=https://www.dolby.com/in/en/technologies/ac-4/Next-Generation-Entertainment-Services.pdf |url-status=dead |archive-url=https://web.archive.org/web/20151204095813/http://www.dolby.com/in/en/technologies/ac-4/Next-Generation-Entertainment-Services.pdf |archive-date=2015-12-04 |access-date=2016-04-26 |publisher=[[Dolby Laboratories]]}}</ref>
|-
| [[LDAC (codec)|LDAC]]
| LDAC
| 2015
| {{unk}}
| {{unk}}
| <ref name="Darko 2017">{{cite web | last=Darko | first=John H. | title=The inconvenient truth about Bluetooth audio | website=DAR__KO | date=2017-03-29 | url=http://www.digitalaudioreview.net/2017/03/the-inconvenient-truth-about-bluetooth-audio/ | access-date=2018-01-13 | archive-url=https://web.archive.org/web/20180114020200/http://www.digitalaudioreview.net/2017/03/the-inconvenient-truth-about-bluetooth-audio/ | archive-date=2018-01-14 | url-status=dead }}</ref><ref name="AVHub 2015">{{cite web|url=http://www.avhub.com.au/news/sound-image/what-is-sony-ldac-and-how-does-it-do-it-408285|title=What is Sony LDAC, and how does it do it?|last=Ford|first=Jez|date=2015-08-24|website=AVHub|access-date=2018-01-13}}</ref>
Line 97 ⟶ 122:
| aptX
| 1989
| {{unk}}
| {{unk}}
| <ref name="AVHub 2016">{{cite web|url=http://www.avhub.com.au/news/sound-image/aptx-hd---lossless-or-lossy-442124|title=aptX HD - lossless or lossy?|last=Ford|first=Jez|date=2016-11-22|website=AVHub|access-date=2018-01-13}}</ref>
|-
| [[DTS (company)#DTS_Digital_Surround|Digital Theater Systems]]
| DTS
| 1990
| 8%
|6%
| <ref>{{cite web |title=Digital Theater Systems Audio Formats |url=https://www.loc.gov/preservation/digital/formats/fdd/fdd000232.shtml |website=[[Library of Congress]] |access-date=10 November 2019 |date=27 December 2011}}</ref><ref>{{cite book |last1=Spanias |first1=Andreas |last2=Painter |first2=Ted |last3=Atti |first3=Venkatraman |title=Audio Signal Processing and Coding |date=2006 |publisher=[[John Wiley & Sons]] |isbn=9780470041963 |page=338 |url=https://books.google.com/books?id=a1RULRErhOYC&pg=PA338}}</ref>
|-
Line 110 ⟶ 137:
| 2014
| {{unk}}
| {{unk}}
|
|-
| rowspan="3" | [[Sub-band coding]] (SBC)
| [[MPEG-1 Audio Layer II]]
| MP2
| 1993
| rowspan="2" {{unk}}
| rowspan="2" | {{unk}}
|<ref name="11172-32">{{cite web |year=1993 |title=ISO/IEC 11172-3:1993 – Information technology — Coding of moving pictures and associated audio for digital storage media at up to about 1,5 Mbit/s — Part 3: Audio |url=http://www.iso.org/iso/iso_catalogue/catalogue_tc/catalogue_detail.htm?csnumber=22412 |access-date=2010-07-14 |publisher=ISO}}</ref>
|-
| [[Musepack]]
| MPC
| 1997
|
|-
|[[SBC (codec)|SBC]]
|SBC
|2003
| {{unk}}
| {{unk}}
|<ref name="a2dp">Bluetooth SIG, Specification of the Bluetooth System, Profiles, Advanced Audio Distribution Profile version 1.3. https://www.bluetooth.org/docman/handlers/DownloadDoc.ashx?doc_id=260859&vId=290074</ref>
|}
 
Line 134 ⟶ 171:
** [[Low-delay CELP]] (LD-CELP)
** [[Adaptive Multi-Rate audio codec|Adaptive Multi-Rate]] (used in [[GSM]] and [[3GPP]])
** [[Codec 2]] (noted for its lack of patent restrictions)
** [[Speex]] (noted for its lack of patent restrictions)
* [[Modified discrete cosine transform]] (MDCT)
** [[AAC-LD]]
** [[CELT|Constrained Energy Lapped Transform]] (CELT)
** [[Opus (audio format)|Opus]] (mostly for real-time applications)
 
== List of lossless formats ==
* [[Apple Lossless Audio Codec|Apple Lossless]] (ALAC – Apple Lossless Audio Codec)
* [[ATRAC|Adaptive Transform Acoustic Coding]] (ATRAC)
* [[Audio Lossless Coding]] (also known as MPEG-4 ALS)
* [[Super Audio CD#DST|Direct Stream Transfer]] (DST)
* [[Dolby TrueHD]]
* [[DTS-HD Master Audio]]
* [[FLAC|Free Lossless Audio Codec]] (FLAC)
* [[Discrete cosine transform|Lossless discrete cosine transform]] (LDCT)
* [[Meridian Lossless Packing]] (MLP)
Line 156 ⟶ 193:
* [[Original Sound Quality]] (OSQ)
* [[RealPlayer]] (RealAudio Lossless)
* [[Shorten (codec)|Shorten]] (SHN)
* TTA (True Audio Lossless)
* [[WavPack]] (WavPack lossless)
Line 166 ⟶ 203:
* [[Audio file format]]
* [[List of audio compression formats]]
 
== Notes ==
<references group="n" />
 
==References==