Revision as of 09:33, 18 January 2025 by Fasterthanlime: →TDAC for the windowed MDCT: Fixed typo
Revision as of 15:29, 25 February 2025 by Citation bot: Added bibcode. Suggested by Dominic3203, Category:Fourier analysis, #UCB_Category 54/126
The '''modified discrete cosine transform''' ('''MDCT''') is a transform based on the type-IV [[discrete cosine transform]] (DCT-IV), with the additional property of being [[lapped transform\|lapped]]: it is designed to be performed on consecutive blocks of a larger [[dataset]], where subsequent blocks are overlapped so that the last half of one block coincides with the first half of the next block. This overlapping, in addition to the energy-compaction qualities of the DCT, makes the MDCT especially attractive for signal compression applications, since it helps to avoid [[compression artifact\|artifacts]] stemming from the block boundaries. As a result of these advantages, the MDCT is the most widely used [[lossy compression]] technique in [[audio data compression]]. It is employed in most modern [[audio coding standards]], including [[MP3]], [[Dolby Digital]] (AC-3), [[Vorbis]] (Ogg), [[Windows Media Audio]] (WMA), [[ATRAC]], [[Cook codec\|Cook]], [[Advanced Audio Coding]] (AAC),<ref name="Luo">{{cite book \|last1=Luo \|first1=Fa-Long \|title=Mobile Multimedia Broadcasting Standards: Technology and Practice \|date=2008 \|publisher=[[Springer Science & Business Media]] \|isbn=9780387782638 \|page=590 \|url=https://books.google.com/books?id=l6PovWat8SMC&pg=PA590}}</ref> [[High-Definition Coding]] (HDC),<ref>{{cite book \|last1=Jones \|first1=Graham A. \|last2=Layer \|first2=David H. \|last3=Osenkowsky \|first3=Thomas G. 
\|title=National Association of Broadcasters Engineering Handbook: NAB Engineering Handbook \|date=2013 \|publisher=[[Taylor & Francis]] \|isbn=978-1-136-03410-7 \|pages=558–9 \|url=https://books.google.com/books?id=K9N1TVhf82YC&pg=PA558}}</ref> [[LDAC (codec)\|LDAC]], [[Dolby AC-4]],<ref>{{cite web \|title=Dolby AC-4: Audio Delivery for Next-Generation Entertainment Services \|url=https://www.dolby.com/us/en/technologies/ac-4/Next-Generation-Entertainment-Services.pdf \|website=[[Dolby Laboratories]] \|date=June 2015 \|access-date=11 November 2019}}</ref> and [[MPEG-H 3D Audio]],<ref>{{cite journal \|last1=Bleidt \|first1=R. L. \|last2=Sen \|first2=D. \|last3=Niedermeier \|first3=A. \|last4=Czelhan \|first4=B. \|last5=Füg \|first5=S. \|display-authors=etal \|title=Development of the MPEG-H TV Audio System for ATSC 3.0 \|journal=IEEE Transactions on Broadcasting \|date=2017 \|volume=63 \|issue=1 \|pages=202–236 \|doi=10.1109/TBC.2017.2661258 \|s2cid=30821673 \|url=https://www.iis.fraunhofer.de/content/dam/iis/en/doc/ame/Conference-Paper/BleidtR-IEEE-2017-Development-of-MPEG-H-TV-Audio-System-for-ATSC-3-0.pdf}}</ref> as well as [[speech coding]] standards such as [[AAC-LD]] (LD-MDCT),<ref>{{cite conference \|last1=Schnell \|first1=Markus \|last2=Schmidt \|first2=Markus \|last3=Jander \|first3=Manuel \|last4=Albert \|first4=Tobias \|last5=Geiger \|first5=Ralf \|last6=Ruoppila \|first6=Vesa \|last7=Ekstrand \|first7=Per \|last8=Bernhard \|first8=Grill \|title=MPEG-4 Enhanced Low Delay AAC - A New Standard for High Quality Communication \|conference=125th AES Convention \|date=October 2008 \|publisher=[[Audio Engineering Society]] \|url=https://www.iis.fraunhofer.de/content/dam/iis/de/doc/ame/conference/AES-125-Convention_AAC-ELD-NewStandardForHighQualityCommunication_AES7503.pdf \|website=[[Fraunhofer IIS]] \|access-date=20 October 2019}}</ref> [[G.722.1]],<ref>{{cite conference \|last1=Lutzky \|first1=Manfred \|last2=Schuller \|first2=Gerald \|last3=Gayer 
\|first3=Marc \|last4=Krämer \|first4=Ulrich \|last5=Wabnik \|first5=Stefan \|title=A guideline to audio codec delay \|url=https://www.iis.fraunhofer.de/content/dam/iis/de/doc/ame/conference/AES-116-Convention_guideline-to-audio-codec-delay_AES116.pdf \|website=[[Fraunhofer IIS]] \|conference=116th AES Convention \|publisher=[[Audio Engineering Society]] \|date=May 2004 \|access-date=24 October 2019}}</ref> [[G.729.1]],<ref name="Nagireddi">{{cite book \|last1=Nagireddi \|first1=Sivannarayana \|title=VoIP Voice and Fax Signal Processing \|date=2008 \|publisher=[[John Wiley & Sons]] \|isbn=9780470377864 \|page=69 \|url=https://books.google.com/books?id=5AneeZFE71MC&pg=PA69}}</ref> [[CELT]],<ref name="presentation">[http://people.xiph.org/~greg/video/linux_conf_au_CELT_2.ogv Presentation of the CELT codec] {{Webarchive\|url=https://web.archive.org/web/20110807182250/http://people.xiph.org/~greg/video/linux_conf_au_CELT_2.ogv \|date=2011-08-07 }} by Timothy B. Terriberry (65 minutes of video, see also [http://www.celt-codec.org/presentations/misc/lca-celt.pdf presentation slides] {{Webarchive\|url=https://web.archive.org/web/20231116105544/http://www.celt-codec.org/presentations/misc/lca-celt.pdf \|date=2023-11-16}} in PDF)</ref> and [[Opus (audio format)\|Opus]].<ref name="homepage">{{cite web \|url=http://opus-codec.org/ \|title=Opus Codec \|work=Opus \|publisher=Xiph.org Foundation \|type=Home page \|access-date=July 31, 2012}}</ref><ref name="ars-role">{{cite web \|url=https://arstechnica.com/gadgets/2012/09/newly-standardized-opus-audio-codec-fills-every-role-from-online-chat-to-music/ \|title=Newly standardized Opus audio codec fills every role from online chat to music \|first=Peter \|last=Bright \|work=[[Ars Technica]] \|date=2012-09-12 \|access-date=2014-05-28}}</ref> The [[discrete cosine transform]] (DCT) was first proposed by [[N. Ahmed\|Nasir Ahmed]] in 1972,<ref name="Ahmed">{{cite journal \|last=Ahmed \|first=Nasir \|author-link=N. 
Ahmed \|title=How I Came Up With the Discrete Cosine Transform \|journal=[[Digital Signal Processing (journal)\|Digital Signal Processing]] \|date=January 1991 \|volume=1 \|issue=1 \|pages=4–5 \|doi=10.1016/1051-2004(91)90086-Z \|bibcode=1991DSP.....1....4A \|url=https://www.cse.iitd.ac.in/~pkalra/col783-2017/DCT-History.pdf}}</ref> and demonstrated by Ahmed with T. Natarajan and [[K. R. Rao]] in 1974.<ref name="pubDCT">{{Citation \|first1=Nasir \|last1=Ahmed \|author1-link=N. Ahmed \|first2=T. \|last2=Natarajan \|first3=K. R. \|last3=Rao \|title=Discrete Cosine Transform \|journal=IEEE Transactions on Computers \|date=January 1974 \|volume=C-23 \|issue=1 \|pages=90–93 \|doi=10.1109/T-C.1974.223784\|s2cid=149806273 }}</ref> The MDCT was later proposed by John P. Princen, A.W. Johnson and Alan B. Bradley at the [[University of Surrey]] in 1987,<ref>{{cite book \|last1=Princen \|first1=John P. \|last2=Johnson \|first2=A.W. \|last3=Bradley \|first3=Alan B. \|title=ICASSP '87. IEEE International Conference on Acoustics, Speech, and Signal Processing \|chapter=Subband/Transform coding using filter bank designs based on time domain aliasing cancellation \|date=1987 \|volume=12 \|pages=2161–2164 \|doi=10.1109/ICASSP.1987.1169405\|s2cid=58446992 }}</ref> following earlier work by Princen and Bradley (1986)<ref>John P. Princen, Alan B. Bradley: ''Analysis/synthesis filter bank design based on time domain aliasing cancellation'', IEEE Trans. Acoust. Speech Signal Processing, ''ASSP-34'' (5), 1153–1161, 1986. Described a precursor to the MDCT using a combination of discrete cosine and sine transforms.</ref> to develop the MDCT's underlying principle of '''time-domain aliasing cancellation''' (TDAC), described below. (There also exists an analogous transform, the MDST, based on the [[discrete sine transform]], as well as other, rarely used, forms of the MDCT based on different types of DCT or DCT/DST combinations.) 
In MP3, the MDCT is not applied to the audio signal directly, but rather to the output of a 32-band [[polyphase quadrature filter]] (PQF) bank. The output of this MDCT is postprocessed by an alias reduction formula to reduce the typical aliasing of the PQF filter bank. Such a combination of a filter bank with an MDCT is called a ''hybrid'' filter bank or a ''subband'' MDCT. AAC, on the other hand, normally uses a pure MDCT; only the (rarely used) [[MPEG-4 AAC-SSR]] variant (by [[Sony]]) uses a four-band PQF bank followed by an MDCT. Similar to MP3, [[ATRAC]] uses stacked [[quadrature mirror filter]]s (QMF) followed by an MDCT.
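
The lapped block structure and the TDAC principle mentioned above can be illustrated with a minimal Python sketch. It assumes the commonly used unwindowed definition, in which a block of 2''N'' inputs yields ''N'' coefficients and the inverse carries a 1/''N'' normalization; practical codecs apply a window function and a fast algorithm instead. The function names and the zero-padding strategy are illustrative choices, not taken from any particular codec.

```python
import math


def mdct(block):
    """Direct O(N^2) forward MDCT: 2N real inputs -> N real coefficients."""
    n_out = len(block) // 2
    shift = 0.5 + n_out / 2  # phase offset n0 = 1/2 + N/2
    return [
        sum(x * math.cos(math.pi / n_out * (n + shift) * (k + 0.5))
            for n, x in enumerate(block))
        for k in range(n_out)
    ]


def imdct(coeffs):
    """Inverse MDCT: N coefficients -> 2N time-aliased samples (1/N scaling)."""
    n_out = len(coeffs)
    shift = 0.5 + n_out / 2
    return [
        sum(c * math.cos(math.pi / n_out * (n + shift) * (k + 0.5))
            for k, c in enumerate(coeffs)) / n_out
        for n in range(2 * n_out)
    ]


def overlap_add_roundtrip(signal, n_out):
    """Transform 50%-overlapped blocks, invert each, and overlap-add.

    Assumes len(signal) is a multiple of n_out and that n_out is even.
    The signal is zero-padded by N on both sides (an assumption of this
    sketch) so that every real sample is covered by two blocks.
    """
    padded = [0.0] * n_out + list(signal) + [0.0] * n_out
    out = [0.0] * len(padded)
    # Hop size N with block length 2N: the last half of one block
    # coincides with the first half of the next block.
    for start in range(0, len(padded) - n_out, n_out):
        block = padded[start:start + 2 * n_out]
        for i, y in enumerate(imdct(mdct(block))):
            out[start + i] += y  # aliasing terms of adjacent blocks cancel
    return out[n_out:-n_out]


signal = [1.0, -2.0, 3.5, 0.25, 4.0, -1.5, 0.0, 2.0, -3.0, 1.25, 0.5, -0.75]
restored = overlap_add_roundtrip(signal, 4)
```

A single block passed through <code>mdct</code> and <code>imdct</code> does not return the original samples, since 2''N'' inputs are reduced to ''N'' coefficients; only the sum of overlapping halves from adjacent blocks recovers the signal, which is what <code>restored</code> demonstrates up to floating-point error.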

Modified discrete cosine transform: Difference between revisions