Pulse-code modulation: Difference between revisions

Revision as of 16:19, 12 August 2023 by OAbot (Open access bot: doi added to citation with #oabot), compared with the latest revision as of 17:23, 27 July 2025 by Kvng (unpiped links using script).
(31 intermediate revisions by 21 users not shown)
\| screenshot =
\| caption =
\| extension = .L16, .WAV, .AIFF, .AU, .PCM<ref name="rfc2586">{{cite journal\|first1=Harald Tveit \|last1=Alvestrand \|last2=Salsman \|first2=James \|url=http://tools.ietf.org/html/rfc2586 \|title=RFC 2586 – The Audio/L16 MIME content type \|date=May 1999 \|publisher=The Internet Society \|doi=10.17487/RFC2586 \|access-date=2010-03-16\|url-access=subscription }}</ref>
\| mime = audio/L16, audio/L8,<ref name="rfc4856">{{cite journal\|first=S. \|last=Casner \|url=http://tools.ietf.org/html/rfc4856#page-17 \|title=RFC 4856 – Media Type Registration of Payload Formats in the RTP Profile for Audio and Video Conferences – Registration of Media Type audio/L8 \|date=March 2007 \|publisher=The IETF Trust \|doi=10.17487/RFC4856 \|access-date=2010-03-16}}</ref> audio/L20, audio/L24<ref name="rfc3190">{{cite journal \|last1=Bormann \|first1=C. \|last2=Casner \|first2=S. \|last3=Kobayashi \|first3=K. \|last4=Ogawa \|first4=A. \|url=http://tools.ietf.org/html/rfc3190 \|title=RFC 3190 – RTP Payload Format for 12-bit DAT Audio and 20- and 24-bit Linear Sampled Audio \|date=January 2002 \|publisher=The Internet Society \|doi=10.17487/RFC3190 \|access-date=2010-03-16\|doi-access=free }}</ref><ref>{{cite web \|url=https://www.iana.org/assignments/media-types/audio/ \|title=Audio Media Types \|publisher=Internet Assigned Numbers Authority \|access-date=2010-03-16}}</ref>

\| type = Uncompressed [[Audio file format\|audio]]
\| container for =
\| contained by = [[Red Book (audio CD standard)\|Audio CD]], [[AES3]], [[WAV]], [[Audio Interchange File Format\|AIFF]], [[Au file format\|AU]], [[M2TS]], [[VOB]], and many others
\| extended from =
\| extended to =

{{Modulation techniques}}
'''Pulse-code modulation''' ('''PCM''') is a method used to [[Digital signal (signal processing)\|digitally]] represent [[analog signal]]s.
It is the standard form of [[digital audio]] in computers, [[compact disc]]s, [[digital telephony]] and other digital audio applications. In a PCM [[Stream (computing)\|stream]], the [[amplitude]] of the analog signal is [[Sampling (signal processing)\|sampled]] at uniform intervals, and each sample is [[Quantization (signal processing)\|quantized]] to the nearest value within a range of digital steps. [[Alec Reeves]], [[Claude Shannon]], [[Barney Oliver]] and [[John R. Pierce]] are credited with its invention.<ref>{{Cite book \|last=Noll \|first=A. Michael \|url=https://books.google.com/books?id=rpkuAgAAQBAJ&pg=PA50 \|title=Highway of Dreams: A Critical View Along the Information Superhighway \|date=1997 \|publisher=Erlbaum \|isbn=978-0-8058-2557-2 \|edition=Revised \|series=Telecommunications \|location=Mahwah, NJ \|pages=50 \|language=en}}</ref><ref>{{Cite web \|last=Leibson \|first=Steven \|date=2021-09-07 \|title=A Brief History of the Single-Chip DSP, Part I \|url=https://www.eejournal.com/article/a-brief-history-of-the-single-chip-dsp-part-i/ \|access-date=2024-09-19 \|website=EEJournal \|language=en-US}}</ref><ref>{{Cite book \|last=Barrett \|first=G. Douglas \|url=https://books.google.com/books?id=r9-SEAAAQBAJ&pg=PA102 \|title=Experimenting the Human: Art, Music, and the Contemporary Posthuman \|publisher=[[The University of Chicago Press]] \|year=2023 \|isbn=978-0-226-82340-9 \|location=Chicago London \|pages=102 \|language=en}}</ref>

'''Linear pulse-code modulation''' ('''LPCM''') is a specific type of PCM in which the quantization levels are linearly uniform.<ref name="LOC_LPCM" /> This is in contrast to PCM encodings in which quantization levels vary as a function of amplitude (as with the [[A-law algorithm]] or the [[μ-law algorithm]]). Though ''PCM'' is a more general term, it is often used to describe data encoded as LPCM.
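
The sampling and uniform quantization described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `sample_sine` and `lpcm_quantize`, the normalized input range of −1.0 to 1.0, and the signed-integer code layout are choices made for this example, not part of any PCM standard.

```python
import math


def sample_sine(freq_hz, sample_rate_hz, n_samples):
    """Sample a unit-amplitude sine wave at uniform intervals (hypothetical helper)."""
    return [
        math.sin(2 * math.pi * freq_hz * n / sample_rate_hz)
        for n in range(n_samples)
    ]


def lpcm_quantize(x, bits):
    """Quantize x (assumed to lie in [-1.0, 1.0]) to the nearest uniform step.

    Codes are signed integers in [-2**(bits-1), 2**(bits-1) - 1];
    out-of-range inputs are clipped. This layout is an assumption of the sketch.
    """
    levels = 2 ** bits
    max_code = levels // 2 - 1   # 7 for 4 bits
    min_code = -(levels // 2)    # -8 for 4 bits
    code = round(x * max_code)   # uniform (linear) step size
    return max(min_code, min(max_code, code))


# A 1 kHz sine sampled at 8 kHz and quantized to 4 bits.
samples = sample_sine(1000.0, 8000.0, 8)
codes = [lpcm_quantize(s, 4) for s in samples]
print(codes)
```

Raising `bits` shrinks the step size and therefore the quantization error; raising the sample rate shortens the interval between samples.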
A PCM stream has two basic properties that determine the stream's fidelity to the original analog signal: the [[sampling rate]], which is the number of times per second that samples are taken; and the [[Audio bit depth\|bit depth]], which determines the number of possible digital values that can be used to represent each sample.

Early electrical communications started to [[Sampling (signal processing)\|sample]] signals in order to [[Multiplexing\|multiplex]] samples from multiple [[telegraphy]] sources and to convey them over a single telegraph cable. The American inventor [[Moses G. Farmer]] conceived telegraph [[time-division multiplexing]] (TDM) as early as 1853. Electrical engineer W. M. Miner, in 1903, used an electro-mechanical [[Commutator (electric)\|commutator]] for time-division multiplexing multiple telegraph signals; he also applied this technology to [[telephony]]. He obtained intelligible speech from channels sampled at a rate above 3500–4300 Hz; lower rates proved unsatisfactory.

In 1920, the [[Bartlane cable picture transmission system]] used telegraph signaling of characters punched in paper tape to send samples of images [[Quantization (image processing)\|quantized]] to 5 levels.<ref name="digicamhistory">{{cite web \|url=http://www.digicamhistory.com/1906_1920.html \|title=The Bartlane Transmission System \|publisher=DigicamHistory.com \|access-date=7 January 2010\| archive-url = https://web.archive.org/web/20100210053055/http://www.digicamhistory.com/1906_1920.html\| archive-date=February 10, 2010}}</ref> In 1926, Paul M. Rainey of [[Western Electric]] patented a [[facsimile machine]] that transmitted its signal using 5-bit PCM, encoded by an opto-mechanical [[analog-to-digital converter]].<ref>U.S. patent number 1,608,527; also see p.
8, ''Data conversion handbook'', Walter Allan Kester, ed., Newnes, 2005, {{ISBN\|0-7506-7841-0}}.</ref> The machine did not go into production.<ref name=Vardalas>{{citation \|publisher=[[IEEE]] \|title=Pulse Code Modulation: It all Started 75 Years Ago with Alec Reeves \|url= https://insight.ieeeusa.org/articles/your-engineering-heritage-pulse-code-modulation-it-all-started-75-years-ago-with-alec-reeves/ \|date= June 2013 \|author=John Vardalas}}</ref>

British engineer [[Alec Reeves]], unaware of previous work, conceived the use of PCM for voice communication in 1937 while working for [[International Telephone and Telegraph]] in France. He described the theory and its advantages, but no practical application resulted. Reeves filed for a French patent in 1938, and his US patent was granted in 1943.<ref>{{cite patent \|country=US \|number=2272070}}</ref> By this time Reeves had started working at the [[Telecommunications Research Establishment]].<ref name=Vardalas/>

The first transmission of [[speech]] by digital techniques, the [[SIGSALY]] encryption equipment, conveyed high-level [[Allies of World War II\|Allied communications]] during [[World War II]]. In 1943 the [[Bell Labs]] researchers who designed the SIGSALY system became aware of the use of PCM binary coding as already proposed by Reeves.

In 1949, for the Canadian Navy's [[DATAR]] system, [[Ferranti Canada]] built a working PCM radio system that was able to transmit digitized radar data over long distances.<ref>{{cite book \|author=Porter, Arthur \|title=So Many Hills to Climb \|date=2004 \|publisher=Beckham Publications Group \|isbn=9780931761188}}{{page needed\|date=September 2017}}</ref>

PCM in the late 1940s and early 1950s used a [[Cathode ray tube\|cathode-ray]] [[:File:US02632058 Gray.png\|coding tube]] with a [[plate electrode]] having encoding perforations.<ref>{{cite journal\|url=https://archive.org/details/bstj27-1-44 \|author=Sears, R. W.
\|journal=Bell System Technical Journal \|volume=27 \|title=Electron Beam Deflection Tube for Pulse Code Modulation \|pages=44–57 \|publisher=[[Bell Labs]] \|date=January 1948 \|doi=10.1002/j.1538-7305.1948.tb01330.x \|access-date=14 May 2017}}</ref> As in an [[oscilloscope]], the beam was swept horizontally at the sample rate while the vertical deflection was controlled by the input analog signal, causing the beam to pass through higher or lower portions of the perforated plate. The plate collected or passed the beam, producing current variations in binary code, one bit at a time. Rather than natural binary, the grid of Goodall's later tube was perforated to produce a glitch-free [[Gray code]] and produced all bits simultaneously by using a fan beam instead of a scanning beam.<ref>{{cite journal \|url=https://archive.org/details/bstj30-1-33 \|author=Goodall, W. M. \|journal=Bell System Technical Journal \|volume=30 \|title=Television by Pulse Code Modulation \|pages=33–49 \|publisher=[[Bell Labs]] \|date=January 1951 \|doi=10.1002/j.1538-7305.1951.tb01365.x \|access-date=14 May 2017}}</ref>

In the United States, the [[National Inventors Hall of Fame]] has honored [[Bernard M. Oliver]]<ref>

The [[T-carrier]] system, introduced in 1961, uses two twisted-pair transmission lines to carry 24 PCM [[telephone]] calls sampled at 8 kHz and 8-bit resolution. This development improved capacity and call quality compared to the previous [[frequency-division multiplexing]] schemes.

In 1973, [[adaptive differential pulse-code modulation]] (ADPCM) was developed by P. Cummiskey, [[Nikil Jayant]] and [[James L. Flanagan]].<ref>P. Cummiskey, N. S. Jayant, and J. L. Flanagan, "Adaptive quantization in differential PCM coding of speech," Bell Syst. Tech. J., vol. 52, pp. 1105–1118, Sept.
1973.</ref>

===Digital audio recordings===
{{Main\|Digital audio\|Digital recording}}
In 1967, the first PCM recorder was developed by [[NHK]]'s research facilities in Japan.<ref name="Fine">{{cite journal \|author=Thomas Fine \|year=2008 \|title=The dawn of commercial digital recording \|journal=[[ARSC Journal]] \|volume=39 \|issue=1 \|pages=1–17 \|url=http://www.aes.org/aeshc/pdf/fine_dawn-of-digital.pdf}}</ref> The 30 kHz 12-bit device used a [[compander]] (similar to [[Dbx (noise reduction)\|DBX Noise Reduction]]) to extend the dynamic range, and stored the signals on a [[video tape recorder]]. In 1969, NHK expanded the system's capabilities to 2-channel [[stereo]] and 32 kHz 13-bit resolution. In January 1971, using NHK's PCM recording system, engineers at [[Denon]] recorded the first commercial digital recordings.<ref group=note>Among the first recordings was ''Uzu: The World Of Stomu Yamash'ta 2'' by [[Stomu Yamashta]].</ref><ref name="Fine"/>

In 1972, Denon unveiled the first 8-channel digital recorder, the DN-023R, which used a 4-head open reel broadcast video tape recorder to record in 47.25 kHz, 13-bit PCM audio.<ref group=note>The first recording with this new system was recorded in [[Tokyo]] during April 24–26, 1972.</ref> In 1977, Denon developed the portable PCM recording system, the DN-034R. Like the DN-023R, it recorded 8 channels at 47.25 kHz, but it used 14 bits "with [[Emphasis (telecommunications)\|emphasis]], making it equivalent to 15.5 bits."<ref name="Fine"/>

* [[AES3]] (specified in 1985, upon which [[S/PDIF]] is based) is a particular format using LPCM.
* [[LaserDisc]]s with digital sound have an LPCM track on the digital channel.
* On PCs, PCM and LPCM often refer to the format used in [[WAV]] (defined in 1991) and [[Audio Interchange File Format\|AIFF]] audio container formats (defined in 1988).
LPCM data may also be stored in other formats such as [[Au file format\|AU]], [[raw audio format]] (header-less file) and various multimedia [[Digital container format\|container formats]].
* LPCM has been defined as a part of the [[DVD]] (since 1995) and [[Blu-ray]] (since 2006) standards.<ref name="bd">{{citation \|url=http://www.blu-raydisc.com/Assets/Downloadablefile/2b_bdrom_audiovisualapplication_0305-12955-15269.pdf \|title=White paper Blu-ray Disc Format – 2.B Audio Visual Application Format Specifications for BD-ROM \|author=Blu-ray Disc Association \|date=March 2005 \|access-date=2009-07-26}}</ref><ref>{{cite web \|url=http://www.mpeg.org/MPEG/DVD/Book_B/Audio.html \|title=DVD Technical Notes (DVD Video – "Book B") – Audio data specifications \|date=1996-07-21 \|access-date=2010-03-16}}</ref><ref>{{cite web \|url=http://dvddemystified.com/dvdfaq.html#3.6.2 \|title=DVD Frequently Asked Questions (and Answers) – Audio details of DVD-Video \|author=Jim Taylor \|access-date=2010-03-20}}</ref> It is also defined as a part of various digital video and audio storage formats (e.g. [[DV (video format)\|DV]] since 1995,<ref>{{cite web \|url=http://seaspray.trinity-bris.ac.uk/~altwfaq/graphics/video/1394/1394formats.html \|title=How DV works \|archive-url=https://web.archive.org/web/20071206032412/http://seaspray.trinity-bris.ac.uk/~altwfaq/graphics/video/1394/1394formats.html \|archive-date=2007-12-06 \|access-date=2010-03-21}}</ref> [[AVCHD]] since 2006<ref>{{cite web \|url=http://www.avchd-info.org/format/index.html \|title=AVCHD Information Website – AVCHD format specification overview \|access-date=2010-03-21}}</ref>).
* LPCM is used by [[HDMI]] (defined in 2002), a single-cable digital audio/video connector interface for transmitting uncompressed digital data.
* [[RF64]] container format (defined in 2007) uses LPCM and also allows non-PCM bitstream storage: various compression formats contained in the RF64 file as data bursts (Dolby E, Dolby AC3, DTS, MPEG-1/MPEG-2 Audio) can be "disguised" as PCM linear.<ref>{{citation \|url=http://tech.ebu.ch/docs/tech/tech3306-2009.pdf \|title=EBU Tech 3306 – MBWF / RF64: An Extended File Format for Audio \|date=July 2009 \|author=EBU \|access-date=2010-01-19 \|archive-date=November 22, 2009 \|archive-url=https://web.archive.org/web/20091122155436/http://tech.ebu.ch/docs/tech/tech3306-2009.pdf \|url-status=dead }}</ref>

==Modulation==
[[File:Pcm.svg\|250px\|thumb\|right\|Sampling and quantization of a signal (red) for 4-bit LPCM over a time domain at specific frequency.]]
In the diagram, a [[sine wave]] (red curve) is sampled and quantized for PCM. The sine wave is sampled at regular intervals, shown as vertical lines. For each sample, one of the available values (on the y-axis) is chosen. The PCM process is commonly implemented on a single [[integrated circuit]] called an [[analog-to-digital converter]] (ADC). This produces a fully discrete representation of the input signal (blue points) that can be easily encoded as digital data for storage or manipulation. Several PCM streams could also be multiplexed into a larger aggregate [[data stream]], generally for transmission of multiple streams over a single physical link. One technique is called [[time-division multiplexing]] (TDM) and is widely used, notably in the modern public telephone system.

The electronics involved in producing an accurate analog signal from the discrete data are similar to those used for generating the digital signal. These devices are [[digital-to-analog converter]]s (DACs). They produce a [[voltage]] or [[Electric current\|current]] (depending on type) that represents the value presented on their digital inputs. This output would then generally be filtered and amplified for use.
To recover the original signal from the sampled data, a ''demodulator'' can apply the procedure of modulation in reverse. After each sampling period, the demodulator reads the next value and transitions the output signal to the new value. As a result of these transitions, the signal retains a significant amount of high-frequency energy due to imaging effects. To remove these undesirable frequencies, the demodulator passes the signal through a [[reconstruction filter]] that suppresses energy outside the expected frequency range (greater than the [[Nyquist frequency]] <math>f_s / 2 </math>).<ref group=note>Some systems use [[digital filter]]ing to remove some of the aliasing, converting the signal from digital to analog at a higher sample rate such that the analog [[anti-aliasing filter]] is much simpler. In some systems, no explicit filtering is done at all; as it is impossible for any system to reproduce a signal with infinite bandwidth, inherent losses in the system compensate for the artifacts, or the system simply does not require much precision.</ref>

==Standard sampling precision and rates==
Common sample depths for LPCM are 8, 16, 20 or 24 bits per [[sample (signal)\|sample]].<ref name="rfc2586" /><ref name="rfc4856" /><ref name="rfc3190" /><ref>{{cite journal \|url=http://tools.ietf.org/html/rfc3108#page-62 \|title=RFC 3108 – Conventions for the use of the Session Description Protocol (SDP) for ATM Bearer Connections \|date=May 2001 \|access-date=2010-03-16\|last1=Mostafa \|first1=Mohamed \|last2=Kumar \|first2=Rajesh \|doi=10.17487/RFC3108 \|url-access=subscription }}</ref>

LPCM encodes a single sound channel.
Support for multichannel audio depends on file format and relies on synchronization of multiple LPCM streams.<ref name=LOC_LPCM/><ref>{{Cite web\|publisher=Library of Congress \|url=https://www.loc.gov/preservation/digital/formats/fdd/fdd000016.shtml \|title=PCM, Pulse Code Modulated Audio \|date=April 6, 2022 \|access-date=2022-09-05}}</ref> While two channels (stereo) is the most common format, systems can support up to 8 audio channels (7.1 surround)<ref name="rfc4856"/><ref name="rfc3190"/> or more.

Common sampling frequencies are 48 [[kHz]] as used with [[DVD]] format videos, or 44.1 kHz as used in CDs. Sampling frequencies of 96 kHz or 192 kHz can be used on some equipment, but the [[High-resolution audio#Controversy\|benefits have been debated]].<ref>{{Cite web\|last=Montgomery\|first=Christopher\|title=24/192 Music Downloads, and why they do not make sense\|url=http://people.xiph.org/~xiphmont/demo/neil-young.html\|url-status=dead\|archive-url=https://web.archive.org/web/20140906115306/http://people.xiph.org/~xiphmont/demo/neil-young.html\|archive-date=2014-09-06\|access-date=2013-03-16\|publisher=Chris "Monty" Montgomery}}</ref>

==Limitations==
The [[Nyquist–Shannon sampling theorem]] shows PCM devices can operate without introducing distortions within their designed frequency bands if they provide a sampling frequency at least twice that of the highest frequency contained in the input signal. For example, in [[telephony]], the usable [[voice frequency]] band ranges from approximately 300 to 3400 [[Hz]].<ref>https://www.its.bldrdoc.gov/fs-1037/dir-039/_5829.htm{{fv\|reason=This source says 4k\|date=August 2020}}</ref> For effective reconstruction of the voice signal, telephony applications therefore typically use an 8000 Hz sampling frequency, which is more than twice the highest usable voice frequency.
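
The relationship between sampling rate, bit depth and channel count, and the sampling-rate condition above, can be made concrete with a short calculation. The sketch below is illustrative only: the function names are hypothetical, the 16-bit stereo figures for the CD are stated as an assumption in the code rather than taken from this section, and the folding formula is the standard result for a pure tone sampled without any anti-aliasing filter.

```python
def pcm_bit_rate(sample_rate_hz, bit_depth, channels=1):
    """Raw LPCM payload rate in bits per second (no framing or container overhead)."""
    return sample_rate_hz * bit_depth * channels


def is_adequately_sampled(sample_rate_hz, max_signal_hz):
    """True if the highest signal frequency lies strictly below half the sample rate."""
    return max_signal_hz < sample_rate_hz / 2


def alias_frequency(tone_hz, sample_rate_hz):
    """Apparent frequency of a pure tone after sampling, folded into [0, fs/2]."""
    return abs(tone_hz - sample_rate_hz * round(tone_hz / sample_rate_hz))


# Telephony: 8 kHz sampling, 8 bits, one channel.
print(pcm_bit_rate(8000, 8))            # 64000 bit/s per call
# 24 such calls, as carried by the T-carrier system (payload only).
print(pcm_bit_rate(8000, 8, 24))        # 1536000 bit/s
# Audio CD: 44.1 kHz, assuming 16-bit stereo.
print(pcm_bit_rate(44100, 16, 2))       # 1411200 bit/s

print(is_adequately_sampled(8000, 3400))   # True: 3400 Hz < 4000 Hz
print(alias_frequency(5000, 8000))         # 3000: a 5 kHz tone folds to 3 kHz
```

The last line shows why an input above half the sampling rate cannot be recovered: after sampling, a 5 kHz tone is indistinguishable from a 3 kHz one.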
Regardless, there are potential sources of impairment implicit in any PCM system:
* Choosing a discrete value that is near but not exactly at the analog signal level for each sample leads to [[quantization error]]. When [[dither]]ing is used to compensate for this, it introduces additional noise.<ref group=note>Quantization error swings between -''q''/2 and ''q''/2. In the ideal case (with a fully linear ADC and signal level >> ''q'') it is [[uniform distribution (continuous)\|uniformly distributed]] over this interval, with zero mean and variance of ''q''<sup>2</sup>/12.</ref>
* Between samples no measurement of the signal is made; the sampling theorem guarantees non-ambiguous representation and recovery of the signal only if it has no energy at frequency ''f<sub>s</sub>''/2 or higher (one half the sampling frequency, known as the [[Nyquist frequency]]); higher frequencies will not be correctly represented or recovered and add aliasing distortion to the signal below the Nyquist frequency.
* As samples are dependent on time, an accurate clock is required for accurate reproduction. If either the encoding or decoding clock is not stable, these imperfections will directly affect the output quality of the device.<ref group=note>A slight difference between the encoding and decoding clock frequencies is not generally a major concern; a small constant error is not noticeable. Clock error does become a major issue if the clock contains significant [[jitter]], however.</ref>

* Linear PCM (LPCM) is PCM with linear quantization.<ref name="LOC_LPCM" />
* [[Differential PCM]] (DPCM) encodes the PCM values as differences between the current and the predicted value. An algorithm predicts the next sample based on the previous samples, and the encoder stores only the difference between this prediction and the actual value. If the prediction is reasonable, fewer bits can be used to represent the same information.
For audio, this type of encoding reduces the number of bits required per sample by about 25% compared to PCM.
* [[Adaptive differential pulse-code modulation]] (ADPCM) is a variant of DPCM that varies the size of the quantization step, to allow further reduction of the required bandwidth for a given [[signal-to-noise ratio]].
* [[Delta modulation]] is a form of DPCM that uses one bit per sample to indicate whether the signal is increasing or decreasing compared to the previous sample.

{{See also\|T-carrier\|E-carrier}}
PCM can be either [[return-to-zero]] (RZ) or [[non-return-to-zero]] (NRZ). For an NRZ system to be synchronized using in-band information, there must not be long sequences of identical symbols, such as ones or zeroes. For binary PCM systems, the density of 1-symbols is called ''ones-density''.<ref>Stallings, William, [https://ieeexplore.ieee.org/document/1091872 Digital Signaling Techniques], December 1984, Vol. 22, No. 12, [[IEEE]] [[IEEE Communications Magazine\|Communications Magazine]]</ref>

Ones-density is often controlled using precoding techniques such as [[run-length limited]] encoding, where the PCM code is expanded into a slightly longer code with a guaranteed bound on ones-density before modulation into the channel. In other cases, extra [[framing bit]]s are added into the stream, which guarantees at least occasional symbol transitions.

Another technique used to control ones-density is the use of a [[scrambler]] on the data, which will tend to turn the data stream into a stream that looks [[pseudo-random]], but where the data can be recovered exactly by a complementary descrambler. In this case, long runs of zeroes or ones are still possible on the output but are considered unlikely enough to allow reliable synchronization.
In other cases, the long-term DC value of the modulated signal is important, as building up a [[DC bias]] will tend to move communications circuits out of their operating range. In this case, special measures are taken to keep a count of the cumulative DC bias and to modify the codes if necessary to make the DC bias always tend back to zero.
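
The quantities discussed above (ones-density, runs of identical symbols, and cumulative DC bias) are simple to measure on a bit sequence. The following is a minimal sketch, not a line-coding implementation: the function names are hypothetical, and the mapping of 1 to +1 and 0 to −1 used for the DC count is an assumed bipolar NRZ signal model.

```python
def ones_density(bits):
    """Fraction of symbols in the sequence that are 1."""
    if not bits:
        raise ValueError("bit sequence must not be empty")
    return sum(bits) / len(bits)


def longest_run(bits):
    """Length of the longest run of identical consecutive symbols."""
    longest = current = 0
    previous = None
    for b in bits:
        current = current + 1 if b == previous else 1
        longest = max(longest, current)
        previous = b
    return longest


def running_dc_bias(bits):
    """Cumulative DC count after each symbol, with 1 -> +1 and 0 -> -1 (assumed mapping)."""
    total = 0
    trace = []
    for b in bits:
        total += 1 if b else -1
        trace.append(total)
    return trace


bits = [1, 0, 0, 0, 0, 1, 1, 0]
print(ones_density(bits))      # 0.375
print(longest_run(bits))       # 4
print(running_dc_bias(bits))   # [1, 0, -1, -2, -3, -2, -1, -2]
```

A line code that bounds the longest run keeps the receiver's clock recovery supplied with transitions, while one that steers the running count back toward zero limits DC bias.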