Quantization (image processing): Difference between revisions

Content deleted Content added
Curps (talk | contribs)
m Reverted edits by 220.247.227.74 (talk) to last version by Omegatron
m add WP:TEMPLATECAT to remove from template; genfixes
 
(63 intermediate revisions by 51 users not shown)
Line 1:
{{Short description|Lossy compression technique}}
'''Quantization''', involved in [[image processing]]. Quantization techniques generally compress by compressing a range of values to a single quantum value. By reducing the number of discrete symbols in a given stream, the stream becomes more compressible. For example seeking to reduce the number of colors required to represent an [[image]]. Another widely used example — [[DCT]] data quantization in [[JPEG]] and [[DWT]] data quantization in [[JPEG 2000]].
{{More citations needed|date=November 2012}}
 
'''Quantization''', involved in [[image processing]]., Quantizationis techniquesa [[lossy compression]] generallytechnique compressachieved by compressing a range of values to a single quantum (discrete) value. By reducingWhen the number of discrete symbols in a given stream is reduced, the stream becomes more compressible. For example, seeking to reducereducing the number of colors required to represent ana digital [[image]]. Anothermakes widelyit usedpossible exampleto reduce its file size. Specific applications include [[Discrete cosine transform|DCT]] data quantization in [[JPEG]] and [[Discrete wavelet transform|DWT]] data quantization in [[JPEG 2000]].
== Quantization in file formats ==
 
== Color quantization ==
Although more frequent in the past, nowadays color quantization with pallets of lower than 256 colours is mainly used in [[GIF]] and [[PNG]] images. Using "nearest-neighbor" quantization and allowing fewer colours usually results in smaller file sizes, however sophisticated "random dithering" can actually inflate the final size.
{{main|Color quantization}}
 
Color quantization reduces the number of colors used in an image; this is important for displaying images on devices that support a limited number of colors and for efficiently compressing certain kinds of images. Most bitmap editors and many operating systems have built-in support for color quantization. Popular modern color quantization algorithms include the nearest color algorithm (for fixed palettes), the [[Median cut|median cut algorithm]], and an algorithm based on [[octree]]s.
The infinite number of colors available through the lens of a camera is impossible to display on a computer screen. Since a computer can display only a finite number of colors, quantization is always necessary.
 
It is common to combine color quantization with [[dither]]ing to create an impression of a larger number of colors and eliminate [[colour banding|banding]] artifacts.
Many early computers were limited in the number of colors they could display at one time — commonly 16 (and later 256) colours. Modern computers can now display millions of colours at once, far more than can be distinguished by the human eye.
 
== Grayscale quantization ==
Most quantization algorithms allow you to set exactly how many colors you want to use. With the few colors available on early computers, different quantization algorithms produced very different-looking output images. As a result, a lot of time was spent on writing sophisticated algorithms to be more lifelike. Nowadays almost every algorithm produces an output indistinguishable from the view through the camera lens.
Grayscale quantization, also known as gray level quantization, is a process in digital image processing that involves reducing the number of unique intensity levels (shades of gray) in an image while preserving its essential visual information. This technique is commonly used for simplifying images, reducing storage requirements, and facilitating processing operations. In grayscale quantization, an image with ''N'' intensity levels is converted into an image with a reduced number of levels, typically ''L'' levels, where ''L''<''N''. The process involves mapping each pixel's original intensity value to one of the new intensity levels. One of the simplest methods of grayscale quantization is uniform quantization, where the intensity range is divided into equal intervals, and each interval is represented by a single intensity value. Let's say we have an image with intensity levels ranging from 0 to 255 (8-bit grayscale). If we want to quantize it to 4 levels, the intervals would be [0-63], [64-127], [128-191], and [192-255]. Each interval would be represented by the midpoint intensity value, resulting in intensity levels of 31, 95, 159, and 223 respectively.
 
The formula for uniform quantization is:
=== Quantization algorithms ===
 
<math>Q(x) = \left \lfloor \frac{x}{\Delta} \right \rfloor \times \Delta + \frac{\Delta}{2} </math>
A standard quantization [[algorithm]] works in 2 steps:
Where:
# Analyze the image's color usage to select the new color palette.
# [[Dithering|Dither]] the image to the new color palette.
 
* ''Q''(''x'') is the quantized intensity value.
A common quantization algorithm is the Octree-based algorithm, as described in the [http://msdn.microsoft.com MSDN] article on [[ASP.NET]] color quantization.
* ''x'' is the original intensity value.
<!-- FIXME: is there a link directly to the referenced article? -->
* Δ is the size of each quantization interval.
 
Let's quantize an original intensity value of 147 to 3 intensity levels.
Another popular quantization algorithm is the "median cut" algorithm.
 
Original intensity value: ''x''=147
== Quantization in image compression ==
 
Desired intensity levels: ''L''=3
The human eye is fairly good at seeing small differences in [[brightness]] over a relatively large area, but not so good at distinguishing the exact strength of a high frequency brightness variation. This fact allows one to get away with greatly reducing the amount of information in the high frequency components. This is done by simply dividing each component in the frequency ___domain by a constant for that component, and then rounding to the nearest integer. This is the main lossy operation in the whole process. As a result of this, it is typically the case that many of the higher frequency components are rounded to zero, and many of the rest become small positive or negative numbers.
 
We first need to calculate the size of each quantization interval:
This is an example DCT coefficient matrix: <!--NOTE: this matrix was generated using random numbers and the other two matricies. It may not actually work well with an iDCT. -->
 
<math>\Delta = \frac{255}{L-1} = \frac{255}{3-1} = 127.5</math>
 
Using the uniform quantization formula:
 
<math>Q(x) = \left \lfloor \frac{147}{127.5} \right \rfloor \times 127.5 + \frac{127.5}{2}</math>
 
<math>Q(x) = \left \lfloor 1.15294118 \right \rfloor \times 127.5 + \frac{127.5}{2}</math>
 
<math>Q(x) = 1 \times 127.5 + 63.75 = 191.25</math>
 
Rounding 191.25 to the nearest integer, we get <math>Q(x) = 191</math>
 
So, the quantized intensity value of 147 to 3 levels is 191.
 
== QuantizationFrequency inquantization for image compression ==
 
The human eye is fairly good at seeing small differences in [[brightness]] over a relatively large area, but not so good at distinguishing the exact strength of a high frequency (rapidly varying) brightness variation. This fact allows one to get away with greatly reducingreduce the amount of information inrequired by ignoring the high frequency components. This is done by simply dividing each component in the frequency ___domain by a constant for that component, and then rounding to the nearest integer. This is the main lossy operation in the whole process. As a result of this, it is typically the case that many of the higher frequency components are rounded to zero, and many of the rest become small positive or negative numbers.
 
As human vision is also more sensitive to [[luminance]] than [[chrominance]], further compression can be obtained by working in a non-RGB color space which separates the two (e.g., [[YCbCr]]), and quantizing the channels separately.<ref name="wiseman">John Wiseman, ''An Introduction to MPEG Video Compression'', https://web.archive.org/web/20111115004238/http://www.john-wiseman.com/technical/MPEG_tutorial.htm</ref>
 
=== Quantization in file formatsmatrices ===
 
A typical video codec works by breaking the picture into discrete blocks (8×8 pixels in the case of MPEG<ref name="wiseman"/>). These blocks can then be subjected to [[discrete cosine transform]] (DCT) to calculate the frequency components, both horizontally and vertically.<ref name="wiseman"/> The resulting block (the same size as the original block) is then pre-multiplied by the quantization scale code and divided element-wise by the quantization matrix, and rounding each resultant element. The quantization matrix is designed to provide more resolution to more perceivable frequency components over less perceivable components (usually lower frequencies over high frequencies) in addition to transforming as many components to 0, which can be encoded with greatest efficiency. Many video encoders (such as [[DivX]], [[Xvid]], and [[3ivx]]) and compression standards (such as [[MPEG-2]] and [[H.264/AVC]]) allow custom matrices to be used. The extent of the reduction may be varied by changing the quantizer scale code, taking up much less bandwidth than a full quantizer matrix.<ref name="wiseman"/>
 
This is an example of DCT coefficient matrix: <!--NOTE: this matrix was generated using random numbers and the other two matricies. It may not actually work well with an iDCT. -->
 
:<math>
Line 41 ⟶ 70:
</math>
 
A common quantization matrix is: <!--FIXME: whereA does this matrix come from? afaict this issource exactlyfor the part"common" thatjustification would changebe withnice. encoderWiseman andgives qualitythe settingdefault soMPEG callingmatrix, itwhich commonvaries seemsfrom a bit of a stretch especially with no source [[User:Plugwash|Plugwash]] 12:30, 23the Mayone 2005below. (UTC)-->
 
:<math>
Line 56 ⟶ 85:
</math>
 
UsingDividing thisthe quantizationDCT coefficient matrix element-wise with thethis DCT coefficientquantization matrix, and rounding fromto aboveintegers results in:
 
:<math>
\begin{bmatrix}
-26 & -3 & -6 & 2 & 2 & -1 & 0 & 0 \\
0 & -3 & -4 & 1 & 1 & 0 & 0 & 0 \\
-3 & 1 & 5 & -1 & -1 & 0 & 0 & 0 \\
-4 & 1 & 2 & -1 & 0 & 0 & 0 & 0 \\
Line 83 ⟶ 112:
-25.9375
\right)
=-26
-26
</math>
 
Typically this process will result in matrices with values primarily in the upper left (low frequency) corner. By using a zig-zag ordering to group the non-zero entries and [[run length encoding]], the quantized matrix can be much more efficiently stored than the non-quantized version.<ref name="wiseman"/>
== External links ==
 
==See also==
* [[Image segmentation]]
* [[Image-based meshing]]
* [[Range segmentation]]
 
==References==
* [http://www.compression-links.info/Quantization List of papers and resources about quantization]
{{reflist}}<ref>{{Cite book |last=Smith |first=Steven W. |title=Digital signal processing: a practical guide for engineers and scientists |date=2003 |publisher=Newnes |isbn=978-0-7506-7444-7 |series=Demystifying technology series |___location=Amsterdam Boston}}</ref>{{Compression Methods}}
 
{{DEFAULTSORT:Quantization (Image Processing)}}
[[Category:Lossy compression algorithms]]
[[Category:Image compression]]
[[Category:Data compression]]