Pulse-code modulation: Difference between revisions

m unpiped links using script
 
(8 intermediate revisions by 6 users not shown)
Line 8:
| screenshot =
| caption =
| extension = .L16, .WAV, .AIFF, .AU, .PCM<ref name="rfc2586">{{cite journal|first1=Harald Tveit |last1=Alvestrand |last2=Salsman |first2=James |url=http://tools.ietf.org/html/rfc2586 |title=RFC 2586 – The Audio/L16 MIME content type |date=May 1999 |publisher=The Internet Society |doi=10.17487/RFC2586 |access-date=2010-03-16|url-access=subscription }}</ref>
| mime = audio/L16, audio/L8,<ref name="rfc4856">{{cite journal|first=S. |last=Casner |url=http://tools.ietf.org/html/rfc4856#page-17 |title=RFC 4856 – Media Type Registration of Payload Formats in the RTP Profile for Audio and Video Conferences – Registration of Media Type audio/L8 |date=March 2007 |publisher=The IETF Trust |doi=10.17487/RFC4856 |access-date=2010-03-16}}</ref> audio/L20, audio/L24<ref name="rfc3190">{{cite journal |last1=Bormann |first1=C. |last2=Casner |first2=S. |last3=Kobayashi |first3=K. |last4=Ogawa |first4=A.
|url=http://tools.ietf.org/html/rfc3190 |title=RFC 3190 – RTP Payload Format for 12-bit DAT Audio and 20- and 24-bit Linear Sampled Audio |date=January 2002 |publisher=The Internet Society |doi=10.17487/RFC3190 |access-date=2010-03-16|doi-access=free }}</ref><ref>{{cite web |url=https://www.iana.org/assignments/media-types/audio/ |title=Audio Media Types |publisher=Internet Assigned Numbers Authority |access-date=2010-03-16}}</ref>
Line 30:
{{Modulation techniques}}
 
'''Pulse-code modulation''' ('''PCM''') is a method used to [[Digital signal (signal processing)|digitally]] represent [[analog signal]]s. It is the standard form of [[digital audio]] in computers, [[compact disc]]s, [[digital telephony]] and other digital audio applications. In a PCM [[Stream (computing)|stream]], the [[amplitude]] of the analog signal is [[Sampling (signal processing)|sampled]] at uniform intervals, and each sample is [[Quantization (signal processing)|quantized]] to the nearest value within a range of digital steps. [[Alec Reeves]], [[Claude Shannon]], [[Barney Oliver]] and [[John R. Pierce]] are credited with its invention.<ref>{{Cite book |last=Noll |first=A. Michael |url=https://books.google.com/books?id=rpkuAgAAQBAJ&dq=pulse+code+modulation+claude+shannon&pg=PA50 |title=Highway of Dreams: A Critical View Along the Information Superhighway |date=1997 |publisher=Erlbaum |isbn=978-0-8058-2557-2 |edition=Revised |series=Telecommunications |location=Mahwah, NJ |pages=50 |language=en}}</ref><ref>{{Cite web |last=Leibson |first=Steven |date=2021-09-07 |title=A Brief History of the Single-Chip DSP, Part I |url=https://www.eejournal.com/article/a-brief-history-of-the-single-chip-dsp-part-i/ |access-date=2024-09-19 |website=EEJournal |language=en-US}}</ref><ref>{{Cite book |last=Barrett |first=G. Douglas |url=https://books.google.com/books?id=r9-SEAAAQBAJ&dq=Audio+Engineering+claude+shannon&pg=PA102 |title=Experimenting the Human: Art, Music, and the Contemporary Posthuman |publisher=[[The University of Chicago Press]] |year=2023 |isbn=978-0-226-82340-9 |location=Chicago London |pages=102 |language=en}}</ref>
 
'''Linear pulse-code modulation''' ('''LPCM''') is a specific type of PCM in which the quantization levels are linearly uniform.<ref name="LOC_LPCM" /> This is in contrast to PCM encodings in which quantization levels vary as a function of amplitude (as with the [[A-law algorithm]] or the [[μ-law algorithm]]). Though ''PCM'' is a more general term, it is often used to describe data encoded as LPCM.
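
The difference between uniform (LPCM) quantization and amplitude-dependent quantization can be illustrated with a short sketch. The function names below are hypothetical, input samples are assumed to be normalized to the range −1.0 to 1.0, and the μ-law curve shown is the continuous companding formula with μ = 255 rather than the segmented bit-level encoding used in telephony standards.

```python
import math

def quantize_linear(x, bits):
    """Quantize x in [-1.0, 1.0] to a signed integer code with uniform steps."""
    max_code = 2 ** (bits - 1) - 1
    x = max(-1.0, min(1.0, x))  # clip out-of-range input
    return round(x * max_code)

def dequantize_linear(code, bits):
    """Map a signed integer code back to the range [-1.0, 1.0]."""
    return code / (2 ** (bits - 1) - 1)

def mu_law_compress(x, mu=255):
    """Continuous mu-law curve (assumed form); maps [-1, 1] onto [-1, 1]."""
    x = max(-1.0, min(1.0, x))
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)

def mu_law_expand(y, mu=255):
    """Inverse of mu_law_compress."""
    return math.copysign(math.expm1(abs(y) * math.log1p(mu)) / mu, y)

def quantize_mu_law(x, bits=8, mu=255):
    """Compress first, then quantize uniformly: steps are finer near zero."""
    return quantize_linear(mu_law_compress(x, mu), bits)

def dequantize_mu_law(code, bits=8, mu=255):
    """Undo the uniform quantization, then expand."""
    return mu_law_expand(dequantize_linear(code, bits), mu)

# Compare reconstruction error for a quiet sample at 8 bits.
quiet = 0.01
linear_error = abs(dequantize_linear(quantize_linear(quiet, 8), 8) - quiet)
mu_law_error = abs(dequantize_mu_law(quantize_mu_law(quiet)) - quiet)
print(f"linear error: {linear_error:.6f}, mu-law error: {mu_law_error:.6f}")
```

In this sketch the companded encoding spends more of its codes on low amplitudes, so a quiet sample is reconstructed more closely than with uniform steps of the same bit depth, at the cost of coarser steps near full scale.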
Line 45:
The first transmission of [[speech]] by digital techniques, the [[SIGSALY]] encryption equipment, conveyed high-level [[Allies of World War II|Allied communications]] during [[World War II]]. In 1943 the [[Bell Labs]] researchers who designed the SIGSALY system became aware of the use of PCM binary coding as already proposed by Reeves. In 1949, for the Canadian Navy's [[DATAR]] system, [[Ferranti Canada]] built a working PCM radio system that was able to transmit digitized radar data over long distances.<ref>{{cite book |author=Porter, Arthur |title=So Many Hills to Climb |date=2004 |publisher=Beckham Publications Group |isbn=9780931761188}}{{page needed|date=September 2017}}</ref>
 
PCM in the late 1940s and early 1950s used a [[Cathode ray tube|cathode-ray]] [[:File:US02632058 Gray.png|coding tube]] with a [[plate electrode]] having encoding perforations.<ref>{{cite journal|url=https://archive.org/details/bstj27-1-44 |author=Sears, R. W. |journal=Bell System Technical Journal |volume=27 |title=Electron Beam Deflection Tube for Pulse Code Modulation |pages=44–57 |publisher=[[Bell Labs]] |date=January 1948 |doi=10.1002/j.1538-7305.1948.tb01330.x |access-date=14 May 2017}}</ref> As in an [[oscilloscope]], the beam was swept horizontally at the sample rate while the vertical deflection was controlled by the input analog signal, causing the beam to pass through higher or lower portions of the perforated plate. The plate collected or passed the beam, producing current variations in binary code, one bit at a time. Rather than natural binary, the grid of Goodall's later tube was perforated to produce a glitch-free [[Gray code]] and produced all bits simultaneously by using a fan beam instead of a scanning beam.<ref>{{cite journal |url=https://archive.org/details/bstj30-1-33 |author=Goodall, W. M. |journal=Bell System Technical Journal |volume=30 |title=Television by Pulse Code Modulation |pages=33–49 |publisher=[[Bell Labs]] |date=January 1951 |doi=10.1002/j.1538-7305.1951.tb01365.x |access-date=14 May 2017}}</ref>
 
In the United States, the [[National Inventors Hall of Fame]] has honored [[Bernard M. Oliver]]<ref>
Line 133:
 
==Standard sampling precision and rates==
Common sample depths for LPCM are 8, 16, 20 or 24 bits per [[sample (signal)|sample]].<ref name="rfc2586" /><ref name="rfc4856" /><ref name="rfc3190" /><ref>{{cite journal |url=http://tools.ietf.org/html/rfc3108#page-62 |title=RFC 3108 – Conventions for the use of the Session Description Protocol (SDP) for ATM Bearer Connections |date=May 2001 |access-date=2010-03-16|last1=Mostafa |first1=Mohamed |last2=Kumar |first2=Rajesh |doi=10.17487/RFC3108 |url-access=subscription }}</ref>
 
LPCM encodes a single sound channel. Support for multichannel audio depends on file format and relies on synchronization of multiple LPCM streams.<ref name=LOC_LPCM/><ref>{{Cite web|publisher=Library of Congress |url=https://www.loc.gov/preservation/digital/formats/fdd/fdd000016.shtml |title=PCM, Pulse Code Modulated Audio |date=April 6, 2022 |access-date=2022-09-05}}</ref> While two channels (stereo) is the most common format, systems can support up to 8 audio channels (7.1 surround)<ref name="rfc4856"/><ref name="rfc3190"/> or more.
 
Common sampling frequencies are 48&nbsp;[[kHz]], as used with [[DVD]]-format video, and 44.1&nbsp;kHz, as used in CDs. Sampling frequencies of 96&nbsp;kHz or 192&nbsp;kHz can be used on some equipment, but the [[High-resolution audio#Controversy|benefits have been debated]].<ref>{{Cite web|last=Montgomery|first=Christopher|title=24/192 Music Downloads, and why they do not make sense|url=http://people.xiph.org/~xiphmont/demo/neil-young.html|url-status=dead|archive-url=https://web.archive.org/web/20140906115306/http://people.xiph.org/~xiphmont/demo/neil-young.html|archive-date=2014-09-06|access-date=2013-03-16|publisher=Chris "Monty" Montgomery}}</ref>
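
The sample depth, channel count and sampling frequency together determine the data rate of an uncompressed LPCM stream. The following sketch multiplies the three; the function names are illustrative, and the CD figures (16 bits, two channels) and the telephony figures (8 bits, one channel) are assumed typical values rather than parameters stated above.

```python
def lpcm_bit_rate(sample_rate_hz, bits_per_sample, channels):
    """Raw LPCM data rate in bits per second, excluding any container overhead."""
    return sample_rate_hz * bits_per_sample * channels

def lpcm_size_bytes(sample_rate_hz, bits_per_sample, channels, seconds):
    """Raw LPCM payload size in bytes for a given duration."""
    return lpcm_bit_rate(sample_rate_hz, bits_per_sample, channels) * seconds // 8

# Assumed parameters: CD audio as 44.1 kHz, 16-bit, stereo.
cd_rate = lpcm_bit_rate(44_100, 16, 2)
# Assumed parameters: a telephone channel as 8 kHz, 8-bit, mono.
phone_rate = lpcm_bit_rate(8_000, 8, 1)

print(cd_rate, "bit/s for CD-style stereo audio")
print(phone_rate, "bit/s for a telephone-style channel")
print(lpcm_size_bytes(44_100, 16, 2, 60), "bytes per minute of CD-style audio")
```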
 
==Limitations==
The [[Nyquist–Shannon sampling theorem]] shows that PCM devices can operate without introducing distortions within their designed frequency bands if they provide a sampling frequency at least twice that of the highest frequency contained in the input signal. For example, in [[telephony]], the usable [[voice frequency]] band ranges from approximately 300&nbsp;[[Hz]] to 3400&nbsp;Hz.<ref>https://www.its.bldrdoc.gov/fs-1037/dir-039/_5829.htm{{fv|reason=This source says 4k|date=August 2020}}</ref> For effective reconstruction of the voice signal, telephony applications therefore typically use an 8000&nbsp;Hz sampling frequency, which is more than twice the highest usable voice frequency.
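
The telephony figures can be checked with a minimal sketch. Both function names are hypothetical. The check uses the strict form of the criterion (highest signal frequency below half the sampling frequency), and the folding formula assumes an ideal sampler with no anti-aliasing filter in front of it.

```python
def satisfies_nyquist(sample_rate_hz, max_signal_hz):
    """True if the highest signal frequency lies strictly below half the sample rate."""
    return max_signal_hz < sample_rate_hz / 2

def alias_frequency(tone_hz, sample_rate_hz):
    """Apparent frequency of a pure tone after ideal sampling (spectral folding)."""
    remainder = tone_hz % sample_rate_hz
    return min(remainder, sample_rate_hz - remainder)

# Telephony example from the text: 3400 Hz voice band sampled at 8000 Hz.
print(satisfies_nyquist(8000, 3400))   # voice band fits below 4000 Hz
print(alias_frequency(3400, 8000))     # an in-band tone keeps its frequency
print(alias_frequency(5000, 8000))     # an out-of-band tone folds back into the band
```

A tone above half the sampling frequency is indistinguishable, after sampling, from a lower-frequency tone, which is why the input is normally band-limited before conversion.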
 
Regardless, there are potential sources of impairment implicit in any PCM system:
* Choosing a discrete value that is near but not exactly at the analog signal level for each sample leads to [[quantization error]]. When [[dither]]ing is used to compensate for this, it introduces additional noise.<ref group=note>Quantization error swings between −''q''/2 and ''q''/2. In the ideal case (with a fully linear ADC and signal level ≫ ''q'') it is [[uniform distribution (continuous)|uniformly distributed]] over this interval, with zero mean and variance of ''q''<sup>2</sup>/12.</ref>
* Between samples no measurement of the signal is made. The sampling theorem guarantees unambiguous representation and recovery of the signal only if it has no energy at or above ''f<sub>s</sub>''/2 (one half the sampling frequency, known as the [[Nyquist frequency]]). Higher frequencies are not correctly represented or recovered, and they add aliasing distortion to the signal below the Nyquist frequency.
* As samples are dependent on time, an accurate clock is required for accurate reproduction. If either the encoding or decoding clock is not stable, these imperfections will directly affect the output quality of the device.<ref group=note>A slight difference between the encoding and decoding clock frequencies is not generally a major concern; a small constant error is not noticeable. Clock error does become a major issue if the clock contains significant [[jitter]], however.</ref>
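
The statistics quoted for quantization error (zero mean, variance of ''q''<sup>2</sup>/12) can be checked numerically. The sketch below is a simulation under idealized assumptions: a uniform quantizer, input samples drawn uniformly from −1 to 1, no clipping and no dither. The function name and parameters are illustrative.

```python
import random

def quantization_error_stats(bits=8, n_samples=100_000, seed=0):
    """Estimate mean and variance of the error of an ideal uniform quantizer."""
    rng = random.Random(seed)          # fixed seed so the run is repeatable
    q = 2.0 / (2 ** bits)              # step size for a full-scale range of [-1, 1]
    errors = []
    for _ in range(n_samples):
        x = rng.uniform(-1.0, 1.0)     # assumed input: signal level much larger than q
        x_quantized = q * round(x / q)  # snap to the nearest quantization level
        errors.append(x_quantized - x)
    mean = sum(errors) / n_samples
    variance = sum((e - mean) ** 2 for e in errors) / n_samples
    return q, mean, variance

q, mean, variance = quantization_error_stats()
print(f"step q             = {q:.6f}")
print(f"measured mean      = {mean:.2e}")
print(f"measured variance  = {variance:.3e}")
print(f"predicted q^2 / 12 = {q * q / 12:.3e}")
```

With these assumptions the measured variance should land close to the predicted value; signals that are small relative to ''q'', or strongly correlated with the sampling grid, need not follow the uniform model.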