Revision as of 15:51, 25 February 2002 by Conversion script (Automated conversion); revision as of 01:01, 20 March 2002 by Tbackstr (major rewrite)
'''Speech coding''' is the [[audio compression|compression]] of speech (into a [[code]]) for [[telecommunication|transmission]] using [[audio signal processing]] and [[speech processing]] techniques. The two most important applications of speech coding are [[mobile phone|mobile phones]] and [[internet phone|internet phones]].

Generally, the [[spectrum|spectral]] [[spectral envelope|envelope]] of the input [[signal]] is represented by an all-pole filter that is excited by a pulse train. The most common method of estimating the filter is [[linear predictive coding]] (LPC) by the autocorrelation method. However, the filter coefficients are sensitive to errors and their range is largely unknown. The coefficients are therefore converted into another representation that is more tolerant of errors. Such representations include the [[line spectrum pair]] (LSP), [[log-area ratios]] (LAR) and reflection coefficients (related to [[lattice filter|lattice filters]] and the [[Levinson-Durbin recursion]]). The most widely used of these is the LSP, which is used for example in the [[GSM]] standard.

The techniques used in speech coding are similar to those in [[audio compression]] and [[audio coding]], where knowledge of [[psychoacoustics]] is used to transmit only data that is relevant to the human auditory system. For example, in narrow-band speech coding, only information in the frequency band 400 Hz to 3500 Hz is transmitted, but the reconstructed signal is still adequate for intelligibility. However, speech coding differs from audio coding in that much more statistical information is available about the properties of speech.
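
The autocorrelation method of linear prediction mentioned above, together with the Levinson-Durbin recursion and the conversion to log-area ratios, can be sketched roughly as follows. This is only an illustrative sketch and is not taken from any standard: the function names (<code>autocorrelation</code>, <code>levinson_durbin</code>, <code>log_area_ratios</code>) are hypothetical, the predictor is assumed to be written as A(z) = 1 + a<sub>1</sub>z<sup>-1</sup> + ... + a<sub>p</sub>z<sup>-p</sup>, no analysis window is applied, and the sign convention of the reflection coefficients and log-area ratios differs between texts.

```python
import math


def autocorrelation(frame, max_lag):
    """Return the autocorrelation values r[0..max_lag] of one frame."""
    n = len(frame)
    return [sum(frame[i] * frame[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the LPC normal equations by the Levinson-Durbin recursion.

    Assumed convention: A(z) = 1 + a[1] z^-1 + ... + a[order] z^-order.
    Returns (a, reflection_coefficients, prediction_error_energy).
    """
    a = [1.0] + [0.0] * order
    reflection = []
    error = r[0]
    for m in range(1, order + 1):
        # Correlation between the current predictor and lag m.
        acc = r[m] + sum(a[i] * r[m - i] for i in range(1, m))
        k = -acc / error
        reflection.append(k)
        prev = a[:]
        # Order update of the predictor coefficients.
        for i in range(1, m):
            a[i] = prev[i] + k * prev[m - i]
        a[m] = k
        # The prediction error energy shrinks by (1 - k^2) at each order.
        error *= 1.0 - k * k
    return a, reflection, error


def log_area_ratios(reflection):
    """Map reflection coefficients (|k| < 1) to log-area ratios.

    One common convention; other texts use the reciprocal ratio.
    """
    return [math.log((1.0 + k) / (1.0 - k)) for k in reflection]


# Toy input: the impulse response of the all-pole filter 1 / (1 - 0.9 z^-1).
frame = [0.9 ** n for n in range(200)]
r = autocorrelation(frame, 2)
a, reflection, error = levinson_durbin(r, 2)
lar = log_area_ratios(reflection)
```

For this toy input the recursion should recover a predictor close to 1 - 0.9z<sup>-1</sup>. The reflection coefficients of a stable filter always lie between -1 and 1, which is one reason such representations are easier to quantise than the raw filter coefficients.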
Some auditory information that is relevant in audio coding can also be unnecessary in the speech coding context. In speech coding, the most important criterion is always the preservation of the intelligibility of speech with a constrained amount of transmitted data. It should be emphasised, however, that intelligibility of speech includes, besides the actual literal content, also speaker identity, emotions, intonation, timbre etc., all of which are important for perfect intelligibility.

The most common speech coding scheme is Code-Excited Linear Predictive (CELP) coding, which is used for example in the [[GSM]] standard. In CELP, the modelling is divided into two stages: a [[linear prediction|linear predictive]] stage that models the spectral envelope, and a codebook-based model of the residual of the linear predictive model.

In addition to the actual speech coding of the signal, it is often necessary to use [[channel coding]] for transmission, to avoid losses due to transmission errors. Usually, speech coding and channel coding methods have to be chosen in pairs in order to get the best overall coding results.

Major subfields:
* [[Wide-band speech coding]]
* [[Narrow-band speech coding]]

See also: [[Digital signal processing]], [[Speech processing]], [[Audio signal processing]], [[Data compression]], [[Telecommunication]], [[Mobile phone]], [[Psychoacoustic model]], [[GSM]], [[NMT]].
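
The two-stage CELP structure described above (a linear predictive stage followed by a codebook model of the residual) can be illustrated with a very reduced codebook search. This is a sketch under strong assumptions and not the procedure of any particular standard: the names <code>synthesize</code> and <code>search_codebook</code> are hypothetical, the codebook is random, and practical coders also use perceptual weighting, an adaptive (pitch) codebook and gain quantisation, all of which are omitted here.

```python
import random


def synthesize(excitation, a):
    """Filter an excitation through the all-pole filter 1 / A(z).

    Assumed convention: A(z) = 1 + a[1] z^-1 + ..., zero initial state.
    """
    out = []
    for n, e in enumerate(excitation):
        y = e - sum(a[i] * out[n - i] for i in range(1, len(a)) if n - i >= 0)
        out.append(y)
    return out


def search_codebook(target, codebook, a):
    """Return (index, gain, error) of the best-matching codebook entry.

    Each entry is synthesized, scaled by its least-squares optimal gain,
    and compared with the target (analysis by synthesis).
    """
    best = None
    for index, code in enumerate(codebook):
        synth = synthesize(code, a)
        energy = sum(s * s for s in synth)
        if energy == 0.0:
            continue
        gain = sum(t * s for t, s in zip(target, synth)) / energy
        err = sum((t - gain * s) ** 2 for t, s in zip(target, synth))
        if best is None or err < best[2]:
            best = (index, gain, err)
    return best


# Toy setup: 64 random excitation vectors of 40 samples each.
rng = random.Random(0)
a = [1.0, -0.9]
codebook = [[rng.gauss(0.0, 1.0) for _ in range(40)] for _ in range(64)]

# Build a target from a known entry so the search has an exact match.
target = [2.5 * s for s in synthesize(codebook[17], a)]
index, gain, err = search_codebook(target, codebook, a)
```

Only the codebook index and the gain would need to be transmitted in such a scheme, since the decoder can hold the same codebook and filter.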

Speech coding: Difference between revisions