Speech coding: Difference between revisions

m rvv to Gene
m not only intelligibility is optimised but also "pleasantness"
Line 5:
The techniques used in speech coding are similar to those in [[audio data compression]] and [[audio coding]], where knowledge of [[psychoacoustics]] is used to transmit only data that is relevant to the human auditory system. For example, in [[narrowband]] speech coding, only information in the frequency band from 400 Hz to 3500 Hz is transmitted, but the reconstructed signal is still adequate for intelligibility.
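
The band-limiting idea can be illustrated with a toy example. The sketch below is not how any particular speech codec is implemented: it assumes an 8 kHz sampling rate and a short three-tone test signal, and it uses a naive DFT with hard zeroing of bins outside 400 Hz to 3500 Hz. The names `dft`, `idft`, `band_limit` and `tone` are hypothetical helpers introduced only for this illustration.

```python
import cmath
import math


def dft(x):
    """Naive O(n^2) discrete Fourier transform of a sequence."""
    n = len(x)
    # Precompute the n-th roots of unity so each term is a table lookup.
    w = [cmath.exp(-2j * math.pi * m / n) for m in range(n)]
    return [sum(x[t] * w[(k * t) % n] for t in range(n)) for k in range(n)]


def idft(X):
    """Inverse DFT via the conjugation identity idft(X) = conj(dft(conj(X))) / n."""
    n = len(X)
    return [v.conjugate() / n for v in dft([c.conjugate() for c in X])]


def band_limit(x, fs, lo_hz, hi_hz):
    """Zero every DFT bin whose frequency lies outside [lo_hz, hi_hz]."""
    n = len(x)
    X = dft(x)
    for k in range(n):
        # Bins above n/2 represent negative frequencies; fold them back.
        f = min(k, n - k) * fs / n
        if not (lo_hz <= f <= hi_hz):
            X[k] = 0j
    return [v.real for v in idft(X)]


FS = 8000   # assumed sampling rate in Hz
N = 800     # 0.1 s of signal, giving 10 Hz per DFT bin


def tone(freq_hz):
    """Unit-amplitude sine wave of N samples at the assumed rate FS."""
    return [math.sin(2 * math.pi * freq_hz * t / FS) for t in range(N)]


# Test signal: one tone below the band, one inside it, one above it.
signal = [a + b + c for a, b, c in zip(tone(100), tone(1000), tone(3800))]
filtered = band_limit(signal, FS, 400, 3500)

# Largest deviation of the filtered signal from the in-band tone alone.
residual = max(abs(y - r) for y, r in zip(filtered, tone(1000)))
```

Because all three tones fall exactly on DFT bins here, only the 1000 Hz component survives and `residual` is at the level of rounding error. A practical system would use a proper filter design rather than hard bin zeroing, which causes ringing on real speech.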
 
However, speech coding differs from audio coding in that considerably more statistical information is available about the properties of speech. In addition, some auditory information that is relevant in audio coding can be unnecessary in the speech coding context. In speech coding, the most important criterion is always the preservation of intelligibility and "pleasantness" of speech under a constrained amount of transmitted data.
 
It should be emphasised that the intelligibility of speech includes, besides the literal content, also speaker identity, emotions, intonation, [[timbre]] etc., all of which are important for perfect intelligibility. The more abstract concept of pleasantness of degraded speech is a property distinct from intelligibility, since degraded speech can be completely intelligible yet subjectively annoying to the listener.
 
In addition, most speech applications require low coding delay, because long coding delays interfere with speech interaction.
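
One contribution to coding delay is easy to quantify: a frame-based coder cannot start encoding a frame until all of its samples (plus any look-ahead) have arrived. The sketch below computes only this buffering component, under assumed example values; the function name and the figures used are illustrative and not taken from any specific codec.

```python
def frame_buffering_delay_ms(frame_samples, sample_rate_hz, lookahead_samples=0):
    """Minimum buffering delay in milliseconds for a frame-based coder.

    Counts only the time needed to collect one frame plus look-ahead;
    processing and transmission delays are deliberately ignored.
    """
    return 1000.0 * (frame_samples + lookahead_samples) / sample_rate_hz


# Assumed example: 160-sample frames at 8 kHz, with and without look-ahead.
delay_plain = frame_buffering_delay_ms(160, 8000)
delay_lookahead = frame_buffering_delay_ms(160, 8000, lookahead_samples=40)
```

Under these assumptions the buffering delay is 20 ms without look-ahead and 25 ms with 40 samples of look-ahead, which shows why longer frames trade conversational responsiveness for coding efficiency.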