Code-excited linear prediction: Difference between revisions

Revision as of 07:18, 11 June 2014 by Mikhail Ryazanov (minor edit: →top). Latest revision as of 23:51, 5 December 2024 by IznoRepeat (minor edit: add WP:TEMPLATECAT to remove from template; genfixes; tag: AWB).
(26 intermediate revisions by 18 users not shown)
{{Short description|Speech coding algorithm}}
{{No footnotes|date=May 2022}}
'''Code-excited linear prediction''' ('''CELP''') is a [[linear predictive coding|linear predictive]] [[speech coding]] algorithm originally proposed by [[Manfred R. Schroeder]] and [[Bishnu S. Atal]] in 1985. At the time, it provided significantly better quality than existing low bit-rate algorithms, such as [[residual-excited linear prediction]] (RELP) and [[linear predictive coding]] (LPC) [[vocoders]] (e.g., [[FS-1015]]). Along with its variants, such as [[algebraic CELP]], [[relaxed CELP]], [[low-delay CELP]] and [[vector sum excited linear prediction]], it is currently the most widely used speech coding algorithm.{{Citation needed|reason=No sources to back this claim up.|date=November 2016}} It is also used in [[MPEG-4 Audio]] speech coding. CELP is commonly used as a generic term for a class of algorithms and not for a particular codec.

==Background==
The CELP algorithm is based on four main ideas:
* Using the [[source-filter model of speech production]] through [[linear prediction]] (LP) (see the textbook "speech coding algorithm");
* Using an adaptive and a fixed codebook as the input (excitation) of the LP model;
* Performing a closed-loop search in a "perceptually weighted domain";
* Applying [[vector quantization]] (VQ).

[[File:Celp decoder.svg|300px|thumb|Figure 1: CELP decoder]]
Before exploring the complex encoding process of CELP, we introduce the decoder here. Figure 1 describes a generic CELP decoder. The excitation is produced by summing the contributions from the fixed (a.k.a. stochastic or innovation) and adaptive (a.k.a. pitch) codebooks:
:<math>e[n]=e_f[n]+e_a[n]\,</math>
where <math>e_f[n]</math> is the fixed (a.k.a. stochastic or innovation) codebook contribution and <math>e_a[n]</math> is the adaptive ([[Pitch (music)|pitch]]) codebook contribution. The fixed codebook is a [[vector quantization]] dictionary that is (implicitly or explicitly) hard-coded into the codec. This codebook can be algebraic ([[ACELP]]) or be stored explicitly (e.g. [[Speex]]). The entries in the adaptive codebook consist of delayed versions of the excitation. This makes it possible to efficiently code periodic signals, such as voiced sounds.

The filter that shapes the excitation has an all-pole model of the form <math>1/A(z)</math>, where <math>A(z)</math> is called the prediction filter and is obtained using linear prediction ([[Levinson recursion|Levinson–Durbin algorithm]]). An all-pole filter is used because it is a good representation of the human vocal tract and because it is easy to compute.

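The decoder described above can be illustrated with a minimal Python sketch. It is not taken from any particular codec: the function names, the per-codebook gains, and the sign convention <math>A(z) = 1 + \sum_k a_k z^{-k}</math> are assumptions made for illustration only. The sketch builds the adaptive contribution as a delayed copy of the past excitation, adds the fixed codebook contribution, and passes the sum through the all-pole filter <math>1/A(z)</math>.

```python
def adaptive_codebook_vector(past_excitation, lag, length):
    """Return `length` samples of the past excitation delayed by `lag`.

    When lag < length, the delayed segment is repeated periodically
    (an assumed, commonly used convention for short pitch lags).
    """
    if not 1 <= lag <= len(past_excitation):
        raise ValueError("lag must be between 1 and the history length")
    start = len(past_excitation) - lag
    out = []
    for n in range(length):
        if n < lag:
            out.append(past_excitation[start + n])
        else:
            out.append(out[n - lag])  # reuse samples generated in this subframe
    return out


def synthesize(excitation, lpc, memory):
    """All-pole filter 1/A(z), assuming A(z) = 1 + sum_k lpc[k-1] * z^-k.

    `memory` holds previous output samples, most recent first.
    Returns the output samples and the updated memory.
    """
    mem = list(memory)
    out = []
    for e in excitation:
        s = e - sum(a * m for a, m in zip(lpc, mem))
        out.append(s)
        mem = ([s] + mem)[:len(lpc)]
    return out, mem


def decode_subframe(fixed_vec, fixed_gain, lag, adaptive_gain,
                    past_excitation, lpc, memory):
    """Hypothetical decoder step: e[n] = e_f[n] + e_a[n], then 1/A(z)."""
    n = len(fixed_vec)
    delayed = adaptive_codebook_vector(past_excitation, lag, n)
    e_f = [fixed_gain * v for v in fixed_vec]    # fixed (innovation) contribution
    e_a = [adaptive_gain * v for v in delayed]   # adaptive (pitch) contribution
    excitation = [f + a for f, a in zip(e_f, e_a)]
    speech, memory = synthesize(excitation, lpc, memory)
    return speech, excitation, memory
```

In a running decoder, the returned excitation would be appended to the excitation history so that later subframes can draw their adaptive codebook vectors from it. The gains are shown as explicit parameters here, although they could equally be folded into <math>e_f[n]</math> and <math>e_a[n]</math>.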
==CELP encoder==
The main principle behind CELP is called analysis-by-synthesis (AbS) and means that the encoding (analysis) is performed by perceptually optimizing the decoded (synthesis) signal in a closed loop. In theory, the best CELP stream would be produced by trying all possible bit combinations and selecting the one that produces the best-sounding decoded signal. This is not possible in practice for two reasons: the required complexity is beyond any currently available hardware, and the "best-sounding" selection criterion implies a human listener.

In order to achieve real-time encoding using limited computing resources, the CELP search is broken down into smaller, more manageable, sequential searches using a simple perceptual weighting function. Typically, the encoding is performed in the following order:
* [[Linear predictive coding|Linear prediction coefficients]] (LPC) are computed and quantized, usually as [[line spectral pairs]] (LSPs).
* The adaptive (pitch) codebook is searched and its contribution removed.
* The fixed (innovation) codebook is searched.

===Noise weighting===

==See also==
* [[MPEG-4 Part 3]] (CELP as an MPEG-4 Audio Object Type)
* [[G.728]] – Coding of speech at 16 kbit/s using low-delay code excited linear prediction
* [[G.718]] – uses CELP for the lower two layers for the band (50–6400 Hz) in a two-stage coding structure
* [[G.729.1]] – uses CELP coding for the lower band (50–4000 Hz) in a three-stage coding structure
* [[Comparison of audio coding formats]]
* [[CELT]] is a related audio codec that borrows some ideas from CELP.

==External links==
* This article is based on a [http://people.xiph.org/~jm/papers/speex_lca2006.pdf paper] presented at [http://linux.conf.au/ Linux.Conf.Au]
* Some parts based on the [[Speex]] codec [https://www.speex.org/docs/ manual]
* [http://www.speech.cs.cmu.edu/comp.speech/Section3/Software/celp-3.2a.html reference implementations] of CELP 1016A (CELP 3.2a) and LPC 10e. {{Webarchive|url=https://web.archive.org/web/20161212000335/http://www.speech.cs.cmu.edu/comp.speech/Section3/Software/celp-3.2a.html |date=2016-12-12 }}
* [https://web.archive.org/web/20090602220112/http://www.otolith.com/otolith/olt/lpc.html Linear Predictive Coding (LPC)]

=== Selected readings ===
* [https://www.speex.org/docs/manual/speex-manual/node9.html Introduction to CELP Coding]
* [https://cnx.org/content/m10482/latest/ Speech Processing: Theory of LPC Analysis and Synthesis] {{Webarchive|url=https://web.archive.org/web/20140615041652/http://cnx.org/content/m10482/latest/ |date=2014-06-15 }}

==References==
* B. S. Atal, "The History of Linear Prediction," ''IEEE Signal Processing Magazine'', vol. 23, no. 2, March 2006, pp. 154–161.
* M. R. Schroeder and B. S. Atal, "Code-excited linear prediction (CELP): high-quality speech at very low bit rates," in ''Proceedings of the IEEE [[International Conference on Acoustics, Speech, and Signal Processing]]'' (ICASSP), vol. 10, pp. 937–940, 1985.

{{Compression Methods}}

[[Category:Speech codecs]]
[[Category:Data compression]]