'''Speech coding''' is the [[audio compression|compression]] of speech (into a [[code]]) for [[telecommunication|transmission]] using [[audio signal processing]] and [[speech processing]] techniques.
 
The two most important applications using speech coding are [[mobile phone|mobile phones]] and [[internet phone|internet phones]].
 
The techniques used in speech coding are similar to those in [[audio compression]] and [[audio coding]], where knowledge of [[psychoacoustics]] is used to transmit only data that is relevant to the human auditory system. For example, in narrow-band speech coding only information in the frequency band from roughly 300 Hz to 3400 Hz is transmitted, yet the reconstructed signal remains adequate for intelligibility.
 
 
However, speech coding differs from audio coding in that far more statistical information is available about the properties of speech. In addition, some auditory information that is relevant in audio coding can be unnecessary in the speech coding context. In speech coding, the most important criterion is preserving the intelligibility of speech under a constrained amount of transmitted data.
 
 
It should be emphasised that the intelligibility of speech covers not only the literal content but also speaker identity, emotion, intonation and timbre, all of which matter for perfect intelligibility.
 
In addition, most speech applications require low coding delay, as long coding delays interfere with speech interaction.
 
 
The most common speech coding scheme is Code-Excited Linear Predictive (CELP) coding, which is used for example in the [[GSM]] standard. In CELP, the modelling is divided into two stages: a [[linear prediction|linear predictive]] stage that models the spectral envelope, and a codebook-based model of the residual of the linear predictive model.
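The two stages can be illustrated with a minimal Python sketch. It is not the algorithm of any particular standard: the function names, the random codebook, the frame length and the predictor order are all assumptions made for illustration. The first stage estimates predictor coefficients from the frame's autocorrelation using the Levinson-Durbin recursion. The second stage picks the codebook entry and gain whose output through the synthesis filter best matches the frame, so that the chosen excitation stands in for the prediction residual. Real coders additionally use perceptual weighting and an adaptive (pitch) codebook, which are omitted here.

```python
import random


def autocorrelation(frame, max_lag):
    """Return the autocorrelation values r[0..max_lag] of one frame."""
    n = len(frame)
    return [sum(frame[i] * frame[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve for predictor coefficients a_1..a_order.

    The predictor is x_hat[n] = sum_k a_k * x[n - k].
    Returns (coefficients, prediction error energy).
    """
    a = [0.0] * (order + 1)          # a[0] is unused
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:               # degenerate (e.g. all-zero) frame
            break
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / err                # reflection coefficient
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= 1.0 - k * k
    return a[1:], err


def synthesize(excitation, coeffs):
    """Run the all-pole synthesis filter from a zero initial state."""
    out = []
    for n, e in enumerate(excitation):
        pred = sum(c * out[n - 1 - k]
                   for k, c in enumerate(coeffs) if n - 1 - k >= 0)
        out.append(e + pred)
    return out


def search_codebook(target, codebook, coeffs):
    """Return (index, gain, squared error) of the best-matching entry."""
    best = None
    for index, code in enumerate(codebook):
        y = synthesize(code, coeffs)
        energy = sum(v * v for v in y)
        if energy == 0.0:
            continue
        gain = sum(t * v for t, v in zip(target, y)) / energy
        error = sum((t - gain * v) ** 2 for t, v in zip(target, y))
        if best is None or error < best[2]:
            best = (index, gain, error)
    return best


# Hypothetical demo: a 40-sample frame and a 16-entry random codebook.
rng = random.Random(0)
codebook = [[rng.gauss(0.0, 1.0) for _ in range(40)] for _ in range(16)]
frame = synthesize([2.5 * v for v in codebook[3]], [1.3, -0.6])

coeffs, err = levinson_durbin(autocorrelation(frame, 2), 2)   # stage 1
index, gain, error = search_codebook(frame, codebook, coeffs)  # stage 2
print("estimated coefficients:", coeffs)
print("chosen codebook index:", index, "gain:", gain)
```

Only the predictor coefficients, the codebook index and the gain would need to be transmitted; the decoder holds the same codebook and rebuilds the frame with `synthesize`.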
 
 
In addition to the actual speech coding of the signal, it is often necessary to use [[channel coding]] for transmission, to avoid losses due to transmission errors. Usually, speech coding and channel coding methods have to be chosen in pairs to get the best overall coding results.
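One way the pairing can work is to give more protection to the bits of a coded frame that matter most for the decoded speech. The following Python sketch is a toy illustration under stated assumptions, not a scheme from any standard: the helper names `protect` and `recover` are hypothetical, and a three-fold repetition code with majority voting stands in for the stronger codes used in practice.

```python
def protect(bits, n_important):
    """Repeat each of the first n_important bits three times.

    The remaining bits are sent once, without protection.
    """
    coded = []
    for i, b in enumerate(bits):
        coded.extend([b] * 3 if i < n_important else [b])
    return coded


def recover(coded, n_important):
    """Majority-vote the protected bits and pass the rest through."""
    bits = []
    pos = 0
    for _ in range(n_important):
        triple = coded[pos:pos + 3]
        bits.append(1 if sum(triple) >= 2 else 0)
        pos += 3
    bits.extend(coded[pos:])
    return bits


# Hypothetical 7-bit frame whose first 3 bits are treated as important.
frame = [1, 0, 1, 1, 0, 0, 1]
sent = protect(frame, 3)
sent[4] ^= 1                      # a single channel error in a protected bit
print(recover(sent, 3) == frame)
```

A single error inside a protected triple is corrected by the vote, while an error in an unprotected bit passes straight through to the speech decoder, which is why the split between the two groups depends on the speech coder in use.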
 
The [[Speex]] project is an attempt to create a [[free software]] speech coder, unencumbered by patent restrictions.
 
 
Major subfields:
 
 
* [[Wide-band speech coding]]
** [[GSM]]
** [[NMT]]
* [[Narrow-band speech coding]]
 
 
 
 
See also: [[Digital signal processing]], [[Speech processing]], [[Audio signal processing]], [[Data compression]], [[Telecommunication]], [[Mobile phone]], [[Psychoacoustic model]], [[Vector quantization]].