Quantization (image processing)

{{Short description|Lossy compression technique}}
{{More citations needed|date=November 2012}}
 
'''Quantization''', in [[image processing]], is a [[lossy compression]] technique achieved by compressing a range of values to a single quantum (discrete) value. When the number of discrete symbols in a given stream is reduced, the stream becomes more compressible. For example, reducing the number of colors required to represent a digital [[image]] makes it possible to reduce its file size. Specific applications include [[Discrete cosine transform|DCT]] data quantization in [[JPEG]] and [[Discrete wavelet transform|DWT]] data quantization in [[JPEG 2000]].
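The idea above can be sketched with a minimal uniform quantizer. This is an illustrative example, not a method from the article: it maps 8-bit intensities onto a handful of evenly spaced representative values, shrinking the symbol alphabet and making the data more compressible.

```python
import numpy as np

def reduce_levels(channel: np.ndarray, levels: int) -> np.ndarray:
    """Map 8-bit values onto `levels` evenly spaced representative values."""
    step = 256 / levels                 # width of each quantization bin
    bins = np.floor(channel / step)     # which bin each pixel falls into
    # Reconstruct each pixel at the midpoint of its bin.
    return (bins * step + step / 2).astype(np.uint8)

pixels = np.array([0, 50, 100, 200, 255], dtype=np.uint8)
print(reduce_levels(pixels, 4))  # at most 4 distinct output values remain
```

A stream containing at most 4 distinct symbols can then be entropy-coded far more compactly than one with 256.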
* Δ is the size of each quantization interval.
 
As an example, quantize an original intensity value of 147 to 3 intensity levels.
 
Original intensity value: ''x''=147
 
Desired intensity levels: ''L''=3
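The same computation can be sketched in a few lines, assuming an 8-bit input range [0, 255] and a uniform quantizer that reconstructs at interval midpoints; the article's exact formula may use slightly different constants.

```python
import math

x = 147          # original intensity value
L = 3            # desired number of intensity levels
delta = 256 / L  # size of each quantization interval (Δ)

index = math.floor(x / delta)   # which interval x falls into (0, 1, or 2)
q = index * delta + delta / 2   # reconstruct at the interval midpoint
print(index, round(q))
```

Here Δ ≈ 85.3, so 147 falls into the middle interval and is reconstructed near its midpoint.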
== Frequency quantization for image compression ==
 
The human eye is fairly good at seeing small differences in [[brightness]] over a relatively large area, but not so good at distinguishing the exact strength of a high frequency (rapidly varying) brightness variation. This fact allows one to reduce the amount of information required by ignoring the high frequency components. This is done by simply dividing each component in the frequency ___domain by a constant for that component, and then rounding to the nearest integer. This is the main lossy operation in the whole process. As a result of this, it is typically the case that many of the higher frequency components are rounded to zero, and many of the rest become small positive or negative numbers.
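The division-and-rounding step described above can be shown directly. The coefficient values and divisors below are illustrative, not taken from any standard: each frequency-___domain coefficient is divided by a per-component constant and rounded, which drives small high-frequency components to zero.

```python
import numpy as np

coeffs = np.array([[-415.4,  30.2],
                   [  56.7,   1.9]])   # hypothetical frequency-___domain values
quant  = np.array([[16.0, 11.0],
                   [12.0, 99.0]])      # larger divisors for less visible components

# The lossy step: element-wise division followed by rounding.
quantized = np.round(coeffs / quant).astype(int)
print(quantized)  # the weak high-frequency component collapses to 0
```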
 
As human vision is also more sensitive to [[luminance]] than [[chrominance]], further compression can be obtained by working in a non-RGB color space which separates the two (e.g., [[YCbCr]]), and quantizing the channels separately.<ref name="wiseman">John Wiseman, ''An Introduction to MPEG Video Compression'', https://web.archive.org/web/20111115004238/http://www.john-wiseman.com/technical/MPEG_tutorial.htm</ref>
=== Quantization matrices ===
 
A typical video codec works by breaking the picture into discrete blocks (8×8 pixels in the case of MPEG<ref name="wiseman"/>). These blocks can then be subjected to the [[discrete cosine transform]] (DCT) to calculate the frequency components, both horizontally and vertically.<ref name="wiseman"/> The resulting block (the same size as the original block) is then scaled by the quantizer scale code and divided element-wise by the quantization matrix, and each resulting element is rounded. The quantization matrix is designed to provide more resolution to more perceivable frequency components than to less perceivable ones (usually lower frequencies over high frequencies), while also reducing as many components as possible to zero, which can be encoded with the greatest efficiency. Many video encoders (such as [[DivX]], [[Xvid]], and [[3ivx]]) and compression standards (such as [[MPEG-2]] and [[H.264/AVC]]) allow custom matrices to be used. The extent of the reduction may be varied by changing the quantizer scale code, which takes up much less bandwidth than transmitting a full quantizer matrix.<ref name="wiseman"/>
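A hedged sketch of this per-block step follows. The 2×2 matrices are illustrative stand-ins for an 8×8 DCT block and quantization matrix, not values from any standard, and the sketch uses one common convention in which the quantizer scale code multiplies the quantization matrix (so a larger scale code means coarser quantization); individual codecs define the exact arithmetic differently.

```python
import numpy as np

def quantize_block(dct_block, quant_matrix, scale_code=1):
    """Quantize a DCT block: divide element-wise by the scaled matrix, then round."""
    # Division and rounding is the lossy step; entries that round to zero
    # can be entropy-coded very efficiently.
    return np.round(dct_block / (quant_matrix * scale_code)).astype(int)

block = np.array([[-76.0, 14.5],
                  [ 22.3,  3.1]])   # hypothetical corner of a DCT block
qm    = np.array([[16.0, 11.0],
                  [12.0, 99.0]])    # illustrative quantization matrix
print(quantize_block(block, qm, scale_code=2))
```

Doubling the scale code coarsens every component at once, which is why signaling a one-number scale change costs far less bandwidth than sending a new matrix.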
 
This is an example of a DCT coefficient matrix: <!--NOTE: this matrix was generated using random numbers and the other two matrices. It may not actually work well with an iDCT. -->
[[Category:Lossy compression algorithms]]
[[Category:Image compression]]
[[Category:Data compression]]