Block-matching algorithm: Difference between revisions

Content deleted Content added
m Disambiguating links to Image restoration (link changed to Digital photograph restoration) using DisamAssist.
 
(83 intermediate revisions by 42 users not shown)
Line 1:
{{Short description|System used in computer graphics applications}}
{{Unreferenced|date=June 2008}}
A '''Block Matching Algorithm''' (BMA) is a way of locating matching blocks[[macroblock]]s in a sequence of [[digital video]] frames for the purposes of [[motion estimation]]. The underlying supposition behind [[motion estimation]] is that the patterns corresponding to objects and background in a frame of video sequence move within the frame to form corresponding objects on the subsequent frame. This can be used to discover temporal redundancy in the video sequence, increasing the effectiveness of inter-frame [[video compression]] andby televisiondefining the contents of a macroblock by reference to the contents of a known macroblock which is standardsminimally conversiondifferent.
[[File:Block-matching algorithm.png|thumb|405x405px]]
A block matching algorithm involves dividing the current [[Film frame|frame]] of a video into macroblocks and comparing each of the macroblocks with a corresponding block and its adjacent neighbors in a nearby frame of the video (sometimes just the previous one). A [[vector (mathematics and physics)|vector]] is created that models the movement of a macroblock from one ___location to another. This movement, calculated for all the macroblocks comprising a frame, constitutes the motion estimated in a frame.
 
The search area for a good macroblock match is decided by the ‘search parameter’, p, where p is the number of [[pixels]] on all four sides of the corresponding macro-block in the previous frame. The search parameter is a measure of motion. The larger the value of p, larger is the potential motion and the possibility for finding a good match. A full search of all potential blocks however is a computationally expensive task. Typical inputs are a macroblock of size 16 pixels and a search area of p = 7 pixels.
 
[[Block-matching and 3D filtering]] makes use of this approach to solve various [[Digital photograph restoration|image restoration]] [[inverse problems]] such as [[noise reduction]]<ref>{{cite journal |last1= Dabov |first1= Kostadin |last2= Foi |first2= Alessandro |first3= Vladimir |last3= Katkovnik |first4= Karen |last4= Egiazarian |date= 16 July 2007 |title= Image denoising by sparse 3D transform-___domain collaborative filtering |journal= IEEE Transactions on Image Processing |volume=16 |issue= 8 |pages= 2080–2095 |doi= 10.1109/TIP.2007.901238 |pmid= 17688213 |bibcode= 2007ITIP...16.2080D |citeseerx= 10.1.1.219.5398 |s2cid= 1475121 }}</ref> and [[deblurring]]<ref>{{Cite journal|last1= Danielyan|first1= Aram|last2= Katkovnik|first2= Vladimir|last3= Egiazarian|first3= Karen|arxiv=1106.6180 |title= BM3D Frames and Variational Image Deblurring |journal= IEEE Transactions on Image Processing|volume= 21|issue= 4|pages= 1715–28|date=30 June 2011 |doi= 10.1109/TIP.2011.2176954|pmid= 22128008|bibcode= 2012ITIP...21.1715D|s2cid= 11204616}}</ref> in both still images and [[digital video]].
 
== Motivation ==
 
[[Motion estimation]] is the process of determining [[motion vector]]s that describe the transformation from one 2D image to another; usually from adjacent frames in a video sequence. The motion vectors may relate to the whole image (global motion estimation) or specific parts, such as rectangular blocks, arbitrary shaped patches or even per pixel. The motion vectors may be represented by a translational model or many other models that can approximate the motion of a real video camera, such as rotation and translation in all three dimensions and zoom.
 
Applying the motion vectors to an image to predict the transformation to another image, on account of moving camera or object in the image is called [[motion compensation]]. The combination of motion estimation and motion compensation is a key part of [[video compression]] as used by [[MPEG]] 1, 2 and 4 as well as many other [[video codecs]].
 
Motion estimation based video compression helps in saving bits by sending encoded difference images which have inherently less energyentropy as opposed to sending a fully coded frame. However, the most computationally expensive and resource extensive operation in the entire compression process is motion estimation. Hence, fast and computationally inexpensive algorithms for motion estimation is a need for video compression.
 
== Block Matching Algorithm ==
 
Block matching algorithm involves dividing the current [[Film frame|frame]] of video into ‘macro blocks’ and comparing each of the macro-block with corresponding block and its adjacent neighbors in the previous frame of the video. A [[vector (mathematics and physics)|vector]] is created that captures the movement of macro-block from one ___location to another in the previous frame. This movement calculated for all the macro blocks comprising a frame, constitutes the motion estimated in the current frame.
 
The search area for a good macro-block match is decided by the ‘search parameter’, p, where p is the number of [[pixels]] on all four sides of the corresponding macro-block in the previous frame. The search parameter is a measure of motion. Larger the value of p, larger is the motion, however this becomes a computationally extensive task. Usually the macro-block is taken to be of size 16 pixels and the search parameter is set to 7 pixels
 
== Evaluation Metrics ==
MetricA metric for matching a macro-blockmacroblock with another block is based on a cost function,. theThe most popular in terms of computationallycomputational inexpensiveexpense is:
 
[[Mean absolute difference|Mean difference]] or Mean Absolute Difference (MAD) = <math>\frac{1}{N^2}\sum_{i=0}^{n-1}\sum_{j=0}^{n-1} |C_{ij}-R_{ij}|</math>
 
[[Mean Squared Error]] (MSE) = <math>\frac{1}{N^2}\sum_{i=0}^{n-1}\sum_{j=0}^{n-1} (C_{ij}-R_{ij})^2</math>
 
where N is the size of the macro-block, and <math>C_{ij}</math> and <math>R_{ij}</math> are the pixels being compared in current macro blockmacroblock and reference macro blockmacroblock, respectively.
 
The motion compensated image that is created using the [[motion vector]]svectors and macro-blocksmacroblocks from the reference frame is characterized by [[Peak signal-to-noise ratio]] (PSNR),
 
<math>\text{PSNR} = <math>10Log_10 \log_{10}[\frac {(\text{peak to peak value of original data})^2}{\text{MSE}]}</math>
 
== Algorithms ==
{{More citations needed|date=January 2024}}
Block Matching algorithms have been developed since mid-80 to the most recent fast algorithms in the year 2002. The various algorithms developed so far have been discussed below.
Block Matching algorithms have been researched since mid-1980s. Many algorithms have been developed, but only some of the most basic or commonly used have been described below.
 
=== Exhaustive Search (ES) or Full Search (FS)===
This algorithm calculates the [[Loss function|cost function]] at each possible ___location in the search window. This leads to the best possible match of the macro-block in the reference frame with a block in another frame. The resulting motion compensated image has highest peak signal-to-noise ratio as compared to any other block matching algorithm.
 
• Three Step Search (TSS)
 
• Two Dimensional Logarithmic Search (TDLS)
 
• New Three Step Search (NTSS)
 
• Simple and Efficient Search (SES)
 
• Four Step Search (FSS)
 
• Diamond Search (DS)
 
• Adaptive Rood Pattern Search (ARPS)
 
=== Exhaustive Search (ES) ===
This algorithm calculates the [[Loss function|cost function]] at each possible ___location in the search window. This leads to the best possible match of the macro-block in the reference frame with a block in another frame. The resulting motion compensated image has highest [[PSNR]] as compared to any other block matching algorithm.
However this is the most computationally extensive block matching algorithm among all. A larger search window requires greater number of computations.
 
=== ThreeOptimized StepHierarchical SearchBlock Matching (TSSOHBM) ===
 
The optimized hierarchical block matching (OHBM) algorithm speeds up the exhaustive search based on the optimized image pyramids.<ref name="Je_spic13_ohbm">{{Cite journal |doi = 10.1016/j.image.2013.04.002|title = Optimized hierarchical block matching for fast and accurate image registration|journal = Signal Processing: Image Communication|volume = 28|issue = 7|pages = 779–791|year = 2013|last1 = Je|first1 = Changsoo|last2 = Park|first2 = Hyung-Min}}</ref>
It is one of the earliest fast block matching algorithm. It runs as follows:
 
=== Three Step Search ===
• Start with search ___location at center
 
It is one of the earliest fast block matching algorithms. It runs as follows:
• Set step size ‘S’ = 4 and search parameter ‘p’ = 7
 
# Start with search ___location at center
• Search 8 locations +/- S [[pixels]] around ___location (0,0)
# Set step size S = 4 and search parameter p = 7
 
# Search 8 locations +/- S pixels around ___location (0,0) and the ___location (0,0)
• Pick among the 9 locations searched, the one with minimum cost function
# Pick among the 9 locations searched, the one with minimum cost function
 
# Set the new search origin to the above picked ___location
# Set the new step size as S = S/2
 
# SetRepeat the newsearch stepprocedure size asuntil S = S/21
 
• Repeat the search procedure until S = 1
 
The resulting ___location for S=1 is the one with minimum cost function and the macro block at this ___location is the best match.
 
There is a reduction in computation by a factor of 9 in this algorithm. For p=7, while ES evaluates cost for 255225 macro-blocks, TSS evaluates only for 25 macro blocks.
 
=== Two Dimensional Logarithmic Search (TDLS) ===
 
TDLS is closely related to TSS however it is more accurate for estimating [[motion vector]]svectors for a large search window size. The algorithm can be described as follows,
 
# Start with search ___location at the center
# Select an initial step size say, S = 8
# Search for 4 locations at a distance of S from center on the X and Y axes
# Find the ___location of point with least cost function
# If a point other than center is the best matching point,
## Select this point as the new center
# If the best matching point is at the center, set S = S/2
# Repeat steps 2 to 3
# If S = 1, all 8 locations around the center at a [[distance]] S are searched
# Set the motion vector as the point with least cost function
 
=== New Three Step Search ===
• Select an initial step size say, S = 8
 
TSS uses a uniformly allocated checking pattern and is prone to miss small motions. NTSS <ref name=tss>{{cite journal|last1=Li|first1=Renxiang|last2=Zeng|first2=Bing|last3=Liou|first3=Ming|title=A New Three-Step Search Algorithm for Block Motion Estimation|journal= IEEE Transactions on Circuits and Systems for Video Technology|date=August 1994|volume=4|issue=4|pages=438–442|doi=10.1109/76.313138}}</ref> is an improvement over TSS as it provides a center biased search scheme and has provisions to stop halfway to reduce the computational cost. It was one of the first widely accepted fast algorithms and frequently used for implementing earlier standards like [[MPEG]] 1 and H.261.
• Search for 4 locations at a distance of S from center on the X and Y axes
 
• Find the ___location of point with least cost function
 
• If a point other than center is the best matching point,
• Select this point as the new center
• Repeat steps 2 to 3
 
• If the best matching point is at the center, set S = S/2
 
• If S = 1, all 8 locations around the center at a [[distance]] S are searched
 
• Set the [[motion vector]] as the point with least cost function
 
=== New Three Step Search (NTSS) ===
 
TSS uses a uniformly allocated checking pattern and is prone to miss small motions. NTSS <ref name=tss>{{cite journal|last1=Li|first1=Renxiang|last2=Zeng|first2=Bing|last3=Liou|first3=Ming|title=A New Three-Step Search Algorithm for Block Motion Estimation|journal=IEEE Trans. Circuits And Systems For Video Technology|date=August 1994|volume=4|issue=4|pages=438-442}}</ref> is an improvement over TSS as it provides a center biased search scheme and has provisions to stop half way to reduce the computational cost. It was one of the first widely accepted fast algorithms and frequently used for implementing earlier standards like [[MPEG]] 1 and H.261.
 
The algorithm runs as follows:
 
# Start with search ___location at center
# Search 8 locations +/- S pixels with S = 4 and 8 locations +/- S pixels with S = 1 around ___location (0,0)
 
# Pick among the 16 locations searched, the one with minimum cost function
• Search 8 locations +/- S [[pixels]] with S = 4 and 8 locations +/- S [[pixels]] with S = 1 around ___location (0,0)
# If the minimum cost function occurs at origin, stop the search and set motion vector to (0,0)
 
# PickIf amongthe minimum cost function occurs at one of the 168 locations searchedat S = 1, set the onenew withsearch minimumorigin costto functionthis ___location
## Check adjacent weights for this ___location, depending on ___location it may check either 3 or 5 points
 
# IfThe theone minimumthat costgives functionlowest occursweight at origin, stopis the searchclosest andmatch, set [[the motion vector]] to (0,0)that ___location
# If the lowest weight after the first step was one of the 8 locations at S = 4, the normal TSS procedure follows
 
## IfPick the minimum cost function occurs at one ofamong the 89 locations at S = 1searched, set the newone searchwith originminimum tocost this ___locationfunction
## Set the new search origin to the above picked ___location
• Check adjacent weights for this ___location, depending on ___location it may check either 3 or 5 points
## Set the new step size as S = S/2
 
## Repeat the search procedure until S = 1
• The one that gives lowest weight is the closest match, set the [[motion vector]] to that ___location
 
• If the lowest weight after the first step was one of the 8 locations at S = 4, the normal TSS procedure follows
• Pick among the 9 locations searched, the one with minimum cost function
• Set the new search origin to the above picked ___location
• Set the new step size as S = S/2
• Repeat the search procedure until S = 1
 
Thus this algorithm checks 17 points for each macro-block and the worst-case scenario involves checking 33 locations, which is still much faster than TSS
 
=== Simple and Efficient Search (SES) ===
 
The idea behind TSS is that the error surface due to motion in every macro block is [[unimodal]]. A [[unimodal]] surface is a bowl shaped surface such that the weights generated by the cost function increase monotonically from the global minimum. However a [[unimodal]] surface cannot have two minimums in opposite directions and hence the 8 point fixed pattern search of TSS can be further modified to incorporate this and save computations. SES [<ref>{{cite journal|last1=Lu|first1=Jianhua|last2=Liou|first2=Ming|title=A Simple and Efficient Search Algorithm for Block-Matching Motion Estimation|journal= IEEE Transactions on Circuits and Systems for Video Technology|date=April 1997|volume=7|issue=2]|pages=429–433|doi=10.1109/76.564122}}</ref> is the extension of TSS that incorporates this assumption.
 
SES algorithm improves upon TSS algorithm as each search step in SES is divided into two phases:
Line 131 ⟶ 101:
• First Phase :
• Divide the area of search in four [[quadrant (plane geometry)|quadrant]]s
• Start search with three locations, one at center (A) and others (B and C), S=4 locations away from A in orthogonal directions
S=4 locations away from A in orthogonal directions
• Find points in search quadrant for second phase using the weight distribution for A, B, C:
• If (MAD(A)>=MAD(B) and MAD(A)>=MAD(C)), select points in second phase quadrant IV
Line 147 ⟶ 116:
• Repeat the SES search procedure until S=1
 
• Select the ___location with lowest weight as [[motion vector]]
SES is computationally very efficient as compared to TSS. However the peak signal-to-noise ratio achieved is poor as compared to TSS as the error surfaces are not strictly unimodal in reality.
 
=== Four Step Search ===
SES is computationally very efficient as compared to TSS. However the [[peak signal-to-noise ratio]] achieved is poor as compared to TSS as the error surfaces are not strictly [[unimodal]] in reality.
Four Step Search is an improvement over TSS in terms of lower computational cost and better peak signal-to-noise ratio. Similar to NTSS, FSS <ref>{{cite journal|last1=Po|first1=Lai-Man|last2=Ma|first2=Wing-Chung|title=A Novel Four-Step Search Algorithm for Fast Block Motion Estimation|journal= IEEE Transactions on Circuits and Systems for Video Technology|date=June 1996|volume=6|issue=3|pages=313–317|doi=10.1109/76.499840}}</ref> also employs center [[bias (statistics)|bias]]ed searching and has a halfway stop provision.
 
=== Four Step Search (FSS) ===
Four Step Search is an improvement over TSS in terms of lower computational cost and better peak signal-to-noise ratio. Similar to NTSS, FSS [3] also employs center [[bias (statistics)|bias]]ed searching and has a halfway stop provision.
 
The algorithm runs as follows:
 
# Start with search ___location at center
# Set step size S = 2, (irrespective of search parameter p)
# Search 8 locations +/- S pixels around ___location (0,0)
# Pick among the 9 locations searched, the one with minimum cost function
# If the minimum weight is found at center for search window:
## Set the new step size as S = S/2 (that is S = 1)
## Repeat the search procedure from steps 3 to 4
## Select ___location with the least weight as motion vector
# If the minimum weight is found at one of the 8 locations other than the center:
## Set the new origin to this ___location
## Fix the step size as S = 2
## Repeat the search procedure from steps 3 to 4. Depending on ___location of new origin, search through 5 locations or 3 locations
## Select the ___location with the least weight
## If the least weight ___location is at the center of new window go to step 5, else go to step 6
 
=== Diamond Search ===
• Set step size ‘S’ = 2, (irrespective of search parameter ‘p’)
Diamond Search (DS)<ref>{{cite journal|last1=Zhu|first1=Shan|last2=Ma|first2=Kai-Kuang|title=A New Diamond Search Algorithm for Fast Block-Matching Motion Estimation|journal= IEEE Transactions on Image Processing|date=February 2000|volume=9|issue=12|pages=287–290|doi=10.1109/83.821744|pmid=18255398|bibcode=2000ITIP....9..287Z}}</ref> algorithm uses a diamond search point pattern and the algorithm runs exactly the same as 4SS. However, there is no limit on the number of steps that the algorithm can take.
 
• Search 8 locations +/- S pixels around ___location (0,0) as shown in figure
 
• Pick among the 9 locations searched, the one with minimum cost function
 
• If the minimum weight is found at center for search window:
• Set the new search origin as shown in figure 7(d)
• Set the new step size as S = S/2 = 1
• Repeat the search procedure from steps 3 to 4
• Select ___location with the least weight as motion vector
 
• If the minimum weight is found at one of the 8 locations other than the center:
• Set the new origin to this ___location
• Fix the step size as S = 2
• Repeat the search procedure from steps 3 to 4. Depending on ___location of new origin,
search through 5 locations or 3 locations
• Select the ___location with the least weight
• If the least weight ___location is at the center of new window go to step 5, else go to step 6
 
=== Diamond Search (DS) ===
Diamond Search (DS) [4] algorithm uses a diamond search point pattern and the algorithm runs exactly the same as 4SS. However, there is no limit on the number of steps that the algorithm can take.
 
Two different types of fixed patterns are used for search,
 
* Large Diamond Search Pattern (LDSP)
* Small Diamond Search Pattern (SDSP)
 
• Small Diamond Search Pattern (SDSP)
 
The algorithm runs as follows:
* LDSP:
## Start with search ___location at center
## Set step size ‘S’S = 2
## Search 8 locations pixels (X,Y) such that (|X|+|Y|=S) around ___location (0,0) using a diamond search point pattern
## Pick among the 9 locations searched, the one with minimum cost function
around ___location (0,0) using a diamond search point pattern
## If the minimum weight is found at center for search window, go to SDSP step
• Pick among the 9 locations searched, the one with minimum cost function
## If the minimum weight is found at centerone forof searchthe window8 locations other than the center, goset the new origin to SDSPthis step___location
## Repeat LDSP
• If the minimum weight is found at one of the 8 locations other than the center,
* SDSP:
set the new origin to this ___location
## Set the new search origin
• Repeat LDSP
## Set the new step size as S = S/2 (that is S = 1)
• SDSP:
## Repeat the search procedure to find ___location with least weight
• Set the new search origin
## Select ___location • Setwith the newleast step sizeweight as S = S/2 =motion 1vector
• Repeat the search procedure to find ___location with least weight
• Select ___location with the least weight as [[motion vector]]
 
This algorithm finds the global minimum very accurately as the search pattern is neither too big nor too small. Diamond Search algorithm has a peak signal-to-noise ratio close to that of Exhaustive Search with significantly less computational expense.
 
=== Adaptive Rood Pattern Search (ARPS) ===
Adaptive rood pattern search (ARPS) <ref>{{cite journal|last1=Nie|first1=Yao|last2=Ma|first2=Kai-Kuang|title=Adaptive Rood Pattern Search for Fast Block-Matching Motion Estimation|journal= IEEE Trans.Transactions on Image Processing|date=December 2002|volume=11|issue=12|pages=1442–1448|doi=10.1109/TIP.2002.806251|pmid=18249712|bibcode=2002ITIP...11.1442N|url=http://www3.ntu.edu.sg/home/ekkma/1_Publications_files/Adaptive%20rood%20pattern%20search%20for%20fast%20block-matching%20motion%20estimation%20%28IEEE%20TIP%20Dec%202002%29.pdf}}</ref> algorithm makes use of the fact that the general motion in a frame is usually [[coherence (physics)|coherent]], i.e. if the macro blocks around the current macro block moved in a particular direction then there is a high [[probability]] that the current macro block will also have a similar [[motion vector]]. This algorithm uses the motion vector of the macro block to its immediate left to predict its own motion vector.
 
ARPSAdaptive rood pattern search runs as follows:
 
# Start with search ___location at the center (origin)
# Find the predicted motion vector for the block
 
# Set step size S = max (|X|,|Y|), where (X,Y) is the [[coordinate]] of predicted motion vector
• Find the predicted motion vector for the block, (refer figure 10)
# Search for rood pattern distributed points around the origin at step size S
 
# Set the point with least weight as origin
• Set step size S = max (|X|,|Y|), where (X,Y) is the [[coordinate]] of predicted motion vector
# Search using small diamond search pattern (SDSP) around the new origin
 
# Repeat SDSP search until least weighted point is at the center of SDSP
• Search for rood pattern distributed points around the origin at step size S
 
• Set the point with least weight as origin
 
• Search using small diamond search pattern (SDSP) around the new origin
 
• Repeat SDSP search until least weighted point is at the center of SDSP
 
Rood pattern search directly puts the search in an area where there is a high probability of finding a good matching block. The main advantage of ARPS over DS is if the predicted motion vector is (0, 0), it does not waste computational time in doing LDSP, but it directly starts using SDSP. Furthermore, if the predicted motion vector is far away from the center, then again ARPS saves on computations by directly jumping to that vicinity and using SDSP, whereas DS takes its time doing LDSP.
Line 229 ⟶ 181:
==References==
{{reflist}}
 
== other references ==
These need to be removed and cited properly
 
[2] Jianhua Lu, and Ming L. Liou, “A Simple and Efficient Search Algorithm for Block-Matching Motion Estimation”, IEEE Trans. Circuits And Systems For Video Technology, vol 7, no. 2, pp.&nbsp;429–433, April 1997
 
[3] Lai-Man Po, and Wing-Chung Ma, “A Novel Four-Step Search Algorithm for Fast Block Motion Estimation”, IEEE Trans. Circuits And Systems For Video Technology, vol 6, no. 3, pp.&nbsp;313–317, June 1996.
 
[4] Shan Zhu, and Kai-Kuang Ma, “ A New Diamond Search Algorithm for Fast Block-Matching Motion Estimation”, IEEE Trans. Image Processing, vol 9, no. 2, pp.&nbsp;287–290, February 2000.
 
[5] Yao Nie, and Kai-Kuang Ma, “Adaptive Rood Pattern Search for Fast Block-Matching Motion Estimation”, IEEE Trans. Image Processing, vol 11, no. 12, pp.&nbsp;1442–1448, December 2002.
 
== External links ==