Audio compression (data): Difference between revisions

TMC1221 (talk | contribs)
m add ATRAC
''Note: This article is about audio data compression, which reduces the data rate of digital audio signals. This should not be confused with [[audio level compression]] (also known as [[companding]]), which reduces the dynamic range of audio signals.''
 
[[Category:Audio compression]]
------------------------
 
'''Audio compression''' is a form of [[data compression]] designed to reduce the size of audio data files. Audio compression algorithms are typically referred to as ''audio codecs''. As with other specific forms of data compression, there exist many "[[Lossless data compression|lossless]]" and "[[Lossy data compression|lossy]]" algorithms to achieve the compression effect.

== Lossless Compression ==
 
Lossless algorithms are used far less widely in audio compression than in [[image compression]]. The primary users of lossless compression are [[audio engineer|audio engineers]] and those consumers who disdain the quality loss from lossy compression techniques such as [[Vorbis]] and [[MP3]].
 
There are two main reasons for this. First, the vast majority of sound recordings are natural sounds, recorded from the real world, and such data does not compress well. In a similar manner, [[photo]]s compress less efficiently with lossless methods than computer-generated images do. Worse, even computer-generated sounds can contain very complicated [[waveform]]s that present a challenge to many compression algorithms. This is due to the nature of audio waveforms, which are generally difficult to simplify without converting them to frequency information, as the human ear does; in practice such a conversion is usually lossy.
 
The second reason is that the values of audio [[sample (signal)|sample]]s change very quickly, so repeated strings of consecutive bytes seldom appear and generic data compression [[algorithm]]s do not work well for audio. However, [[convolution]] with the filter [-1 1] (that is, taking the first difference) tends to [[white noise|whiten]] the spectrum a bit and allows traditional lossless compression to do its job; integration restores the original signal. More advanced codecs such as Shorten ([[SHN]]) and [[FLAC]] use [[linear prediction]] to estimate a more effective whitening filter.
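
The first-difference idea can be illustrated with a minimal Python sketch. It is not taken from any particular codec: the test signal, the 32-bit packing and the use of <code>zlib</code> as the "traditional" lossless compressor are all assumptions chosen for illustration. The sketch differences the samples, checks that integration (a running sum) restores them exactly, and compares the compressed sizes.

```python
import math
import struct
import zlib

def first_difference(samples):
    """Convolve with [-1 1]: each output is the current sample minus the previous one."""
    previous = 0  # assumption: silence before the first sample
    diffs = []
    for s in samples:
        diffs.append(s - previous)
        previous = s
    return diffs

def integrate(diffs):
    """Running sum; exactly undoes first_difference for integer samples."""
    total = 0
    samples = []
    for d in diffs:
        total += d
        samples.append(total)
    return samples

def pack(values):
    """Serialize integers as 32-bit little-endian so differences cannot overflow."""
    return struct.pack("<%di" % len(values), *values)

# Hypothetical test signal: one second of two slow, incommensurate sine tones.
RATE = 44100
samples = [
    round(8000 * math.sin(2 * math.pi * 55 * n / RATE)
          + 4000 * math.sin(2 * math.pi * 55 * math.sqrt(2) * n / RATE))
    for n in range(RATE)
]

diffs = first_difference(samples)
assert integrate(diffs) == samples  # lossless round trip

raw_size = len(zlib.compress(pack(samples)))
diff_size = len(zlib.compress(pack(diffs)))
print("zlib on raw samples:      ", raw_size, "bytes")
print("zlib on first differences:", diff_size, "bytes")
```

On a smooth signal such as this one the differenced stream compresses to fewer bytes; on noisy real-world recordings the gain can be much smaller, which is one motivation for the adaptive predictors used by codecs such as FLAC.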
 
== Examples ==
 
Some examples of popular lossless audio codecs:
* [[SHN|Shorten]]
* [[FLAC|Free Lossless Audio Codec (FLAC)]]
* [[Lossless Predictive Audio Compression (LPAC)]]
* [[Lossless Transform Audio Compression (LTAC)]]
* OptimFROG
* Monkey's Audio
* LA (Lossless Audio)
 
Lossless audio codecs have no quality issues, so their usability can be judged by:
* Speed of compression and decompression
* Compression factor
* Software support
 
See http://web.inter.nl.net/users/hvdh/lossless/All.htm
 
== Lossy Compression ==
 
:''Note:'' Strictly speaking, this is not '''compression''' in the sense of reversible redundancy reduction, but '''irrelevance coding''', that is, the removal of information judged irrelevant to the listener.
 
Most lossy audio compression algorithms are based on simple transforms like the [[modified discrete cosine transform]] (MDCT), which convert sampled waveforms into their component frequencies. Some modern algorithms use [[wavelet]]s, but it is still not certain whether such algorithms will work significantly better than those based on MDCT because of the inherent periodicity of audio signals, which wavelets seem not to handle well. Some algorithms try to merge the two approaches.
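
As a rough illustration of the transform step, the following Python sketch implements a direct (slow, O(N²) per block) windowed MDCT and its inverse with 50% overlap-add. It is a sketch under stated assumptions, not the method of any specific codec: the block size, the sine window, the test signal and the uniform quantization step are all chosen for illustration, and the helper names are hypothetical. Without quantization the overlap-add reconstruction is exact up to floating-point error; rounding the coefficients is what makes the scheme lossy.

```python
import math

def sine_window(length):
    """Sine window; satisfies w[i]**2 + w[i + N]**2 == 1 for length 2N."""
    return [math.sin(math.pi * (i + 0.5) / length) for i in range(length)]

def mdct(block):
    """Map 2N (already windowed) samples to N frequency coefficients."""
    n = len(block) // 2
    return [
        sum(block[i] * math.cos(math.pi / n * (i + 0.5 + n / 2) * (k + 0.5))
            for i in range(2 * n))
        for k in range(n)
    ]

def imdct(coeffs):
    """Map N coefficients back to 2N time-aliased samples (scaled by 2/N)."""
    n = len(coeffs)
    return [
        (2.0 / n) * sum(coeffs[k] * math.cos(math.pi / n * (i + 0.5 + n / 2) * (k + 0.5))
                        for k in range(n))
        for i in range(2 * n)
    ]

def analyze(signal, n):
    """Transform 50%-overlapping blocks of 2N samples; len(signal) must be a multiple of N."""
    window = sine_window(2 * n)
    padded = [0.0] * n + list(signal) + [0.0] * n
    frames = []
    for start in range(0, len(padded) - n, n):
        block = [padded[start + i] * window[i] for i in range(2 * n)]
        frames.append(mdct(block))
    return frames

def synthesize(frames, n):
    """Inverse-transform each frame, window again, and overlap-add."""
    window = sine_window(2 * n)
    out = [0.0] * (n * (len(frames) + 1))
    for j, coeffs in enumerate(frames):
        block = imdct(coeffs)
        for i in range(2 * n):
            out[j * n + i] += block[i] * window[i]
    return out[n:-n]  # drop the zero padding added in analyze

def quantize(frames, step):
    """Round every coefficient to a multiple of step (the lossy part)."""
    return [[step * round(c / step) for c in coeffs] for coeffs in frames]

N = 8  # assumed block size, far smaller than in real codecs
signal = [math.sin(2 * math.pi * 3 * t / 64) + 0.5 * math.sin(2 * math.pi * 7 * t / 64)
          for t in range(64)]
frames = analyze(signal, N)
exact = synthesize(frames, N)
coarse = synthesize(quantize(frames, 0.25), N)
print("max error, no quantization:", max(abs(a - b) for a, b in zip(signal, exact)))
print("max error, step 0.25:      ", max(abs(a - b) for a, b in zip(signal, coarse)))
```

A real encoder would choose the quantization step per frequency band rather than use one fixed value, and would use a fast algorithm instead of the direct sums shown here.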
 
Most algorithms do not try to minimize mathematical error, but instead try to maximize the fidelity perceived by a human listener. As the human ear cannot analyze all components of an incoming sound, a file can be considerably modified without changing the subjective experience of a listener. For example, a codec can drop some information about very low and very high frequencies, which are almost inaudible to humans. Similarly, frequencies which are "masked" by other frequencies due to the nature of the human [[cochlea]] are represented with decreased accuracy. Another form of masking is that a quiet sound is not discernible if it is immediately preceded or followed by a loud sound.
A model of the human ear-brain combination incorporating such effects is often called a [[psychoacoustic model]] or "psycho-model" for short.
 
Due to the nature of lossy algorithms, [[audio quality]] suffers when a file is decompressed and recompressed (generation losses). This makes lossily compressed files less than ideal for audio engineering applications, such as sound editing and multitrack recording. However, lossy formats (particularly [[MP3]]) are very popular with end users, as a megabyte can store about a minute's worth of music at adequate quality.
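
The "megabyte per minute" figure can be checked with simple arithmetic. The short Python sketch below assumes a bitrate of 128 kbit/s as a typical MP3 setting (the text does not state a bitrate) and compares it with uncompressed CD audio at 44,100 samples per second, 16 bits per sample and 2 channels; the function name is hypothetical.

```python
def megabytes_per_minute(bitrate_kbps):
    """Size of one minute of audio in megabytes (1 MB = 1,000,000 bytes)."""
    bits = bitrate_kbps * 1000 * 60
    return bits / 8 / 1_000_000

# Uncompressed CD audio: 44,100 samples/s x 16 bits x 2 channels.
cd_kbps = 44100 * 16 * 2 / 1000   # 1411.2 kbit/s
mp3_kbps = 128                    # assumed typical MP3 bitrate, not from the text

print("CD audio:", megabytes_per_minute(cd_kbps), "MB per minute")   # about 10.58
print("MP3:     ", megabytes_per_minute(mp3_kbps), "MB per minute")  # 0.96
```

Under these assumptions the lossy file is roughly one eleventh the size of the uncompressed audio.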
 
== Examples ==
 
Some examples of popular lossy audio codecs:
* [[MP2]] Layer 2 audio codec ([[MPEG]]-1, [[MPEG]]-2 and non-ISO MPEG-2.5)
* [[MP3]] Layer 3 audio codec ([[MPEG]]-1, [[MPEG]]-2 and non-ISO MPEG-2.5)
* [[MPC]] Musepack
* [[Vorbis]] Ogg Vorbis
* [[Advanced audio coding|AAC]] Advanced Audio Coding ([[MPEG]]-2 and [[MPEG]]-4)
* [[WMA]] Windows Media Audio
* [[ATRAC]] Adaptive TRansform Acoustic Coding (used in [[MiniDisc]])
* [[DTS]] DTS Coherent Acoustics
* [[AC3]] AC-3 or Dolby Digital A/52
 
Other examples can be found on the [[codec]] page.
 
== See also ==
*[[Lossless Transform Audio Compression (LTAC)]]
*[[psychoacoustics]]
*[[audio file format]]
*[[audio signal processing]]
*[[data compression]]
*[[video file formats]]
*[[audio storage]]
*[[codec]]
*[[digital signal processing]]
*[[speech encoding]]
*[[digital rights management]]
*[[subband encoding]]