Block-matching algorithm: Difference between revisions

Browse history interactively

← Previous edit

Content deleted Content added

VisualWikitext

Revision as of 11:34, 5 November 2021 edit Petri Krohn (talk \| contribs) Autopatrolled, Extended confirmed users 37,111 edits Undid revision 1012584688 by 66.75.123.245 (talk) - Undoing old vandalism. Step 1/4 (Must first revert this correction.) Tag: Undo ← Previous edit		Latest revision as of 12:05, 12 September 2024 edit undo ColRad85 (talk \| contribs) Extended confirmed users 38,373 edits m Disambiguating links to Image restoration (link changed to Digital photograph restoration) using DisamAssist.
(16 intermediate revisions by 11 users not shown)
Line 1: {{Short description\|System used in computer graphics applications}} A '''Block Matching Algorithm''' is a way of locating matching [[macroblock]]s in a sequence of [[digital video]] frames for the purposes of [[motion estimation]]. The underlying supposition behind motion estimation is that the patterns corresponding to objects and background in a frame of video sequence move within the frame to form corresponding objects on the subsequent frame. This can be used to discover temporal redundancy in the video sequence, increasing the effectiveness of inter-frame [[video compression]] by defining the contents of a macroblock by reference to the contents of a known macroblock which is minimally different. [[File:Block-matching algorithm.png\|thumb\|405x405px]] Line 5 ⟶ 6: The search area for a good macroblock match is decided by the ‘search parameter’, p, where p is the number of [[pixels]] on all four sides of the corresponding macro-block in the previous frame. The search parameter is a measure of motion. The larger the value of p, larger is the potential motion and the possibility for finding a good match. A full search of all potential blocks however is a computationally expensive task. Typical inputs are a macroblock of size 16 pixels and a search area of p = 7 pixels. [[Block-matching and 3D filtering]] makes use of this approach to solve various [[Digital photograph restoration\|image restoration]] [[inverse problems]] such as [[noise reduction]]<ref>{{cite journal \|last1= Dabov \|first1= Kostadin \|last2= Foi \|first2= Alessandro \|first3= Vladimir \|last3= Katkovnik \|first4= Karen \|last4= Egiazarian \|date= 16 July 2007 \|title= Image denoising by sparse 3D transform-___domain collaborative filtering \|journal= IEEE Transactions on Image Processing \|volume=16 \|issue= 8 \|pages= 2080–2095 \|doi= 10.1109/TIP.2007.901238 \|pmid= 17688213 \|bibcode= 2007ITIP...16.2080D \|citeseerx= 10.1.1.219.5398 \|s2cid= 1475121 }}</ref> and [[deblurring]]<ref>{{Cite journal\|last1= Danielyan\|first1= Aram\|last2= Katkovnik\|first2= Vladimir\|last3= Egiazarian\|first3= Karen\|arxiv=1106.6180 \|title= BM3D Frames and Variational Image Deblurring \|journal= IEEE Transactions on Image Processing\|volume= 21\|issue= 4\|pages= 1715–28\|date=30 June 2011 \|doi= 10.1109/TIP.2011.2176954\|pmid= 22128008\|bibcode= 2012ITIP...21.1715D\|s2cid= 11204616}}</ref> in both still images and [[digital video]]. == Motivation == Line 13 ⟶ 14: Applying the motion vectors to an image to predict the transformation to another image, on account of moving camera or object in the image is called [[motion compensation]]. The combination of motion estimation and motion compensation is a key part of [[video compression]] as used by [[MPEG]] 1, 2 and 4 as well as many other [[video codecs]]. Motion estimation based video compression helps in saving bits by sending encoded difference images which have inherently less entropy as opposed to sending a fully coded frame. However, the most computationally expensive and resource extensive operation in the entire compression process is motion estimation. Hence, fast and computationally inexpensive algorithms for motion estimation is a need for video compression. == Evaluation Metrics == Line 29 ⟶ 30: == Algorithms == {{More citations needed\|date=January 2024}} Block Matching algorithms have been researched since mid-1980s. Many algorithms have been developed, but only some of the most basic or commonly used have been described below. Line 35 ⟶ 37: However this is the most computationally extensive block matching algorithm among all. A larger search window requires greater number of computations. ==== Optimized ~~hierarchical~~Hierarchical ~~block~~Block ~~matching~~Matching (OHBM) ==== The optimized hierarchical block matching (OHBM) algorithm speeds up the exhaustive search based on the optimized image pyramids.<ref name="Je_spic13_ohbm">{{Cite journal \|doi = 10.1016/j.image.2013.04.002\|title = Optimized hierarchical block matching for fast and accurate image registration\|journal = Signal Processing: Image Communication\|volume = 28\|issue = 7\|pages = 779–791\|year = 2013\|last1 = Je\|first1 = Changsoo\|last2 = Park\|first2 = Hyung-Min}}</ref> Line 65 ⟶ 67: # If a point other than center is the best matching point, ## Select this point as the new center ## Repeat steps 2 to 3▼ # If the best matching point is at the center, set S = S/2 ▲## Repeat steps 2 to 3 # If S = 1, all 8 locations around the center at a [[distance]] S are searched # Set the motion vector as the point with least cost function Line 72 ⟶ 74: === New Three Step Search === TSS uses a uniformly allocated checking pattern and is prone to miss small motions. NTSS <ref name=tss>{{cite journal\|last1=Li\|first1=Renxiang\|last2=Zeng\|first2=Bing\|last3=Liou\|first3=Ming\|title=A New Three-Step Search Algorithm for Block Motion Estimation\|journal= IEEE ~~Trans.~~Transactions on Circuits and Systems for Video Technology\|date=August 1994\|volume=4\|issue=4\|pages=438–442\|doi=10.1109/76.313138}}</ref> is an improvement over TSS as it provides a center biased search scheme and has provisions to stop halfway to reduce the computational cost. It was one of the first widely accepted fast algorithms and frequently used for implementing earlier standards like [[MPEG]] 1 and H.261. The algorithm runs as follows: Line 93 ⟶ 95: === Simple and Efficient Search === The idea behind TSS is that the error surface due to motion in every macro block is [[unimodal]]. A unimodal surface is a bowl shaped surface such that the weights generated by the cost function increase monotonically from the global minimum. However a unimodal surface cannot have two minimums in opposite directions and hence the 8 point fixed pattern search of TSS can be further modified to incorporate this and save computations. SES <ref>{{cite journal\|last1=Lu\|first1=Jianhua\|last2=Liou\|first2=Ming\|title=A Simple and Efficient Search Algorithm for Block-Matching Motion Estimation\|journal= IEEE ~~Trans.~~Transactions on Circuits and Systems for Video Technology\|date=April 1997\|volume=7\|issue=2\|pages=429–433\|doi=10.1109/76.564122}}</ref> is the extension of TSS that incorporates this assumption. SES algorithm improves upon TSS algorithm as each search step in SES is divided into two phases: Line 99 ⟶ 101: • First Phase : • Divide the area of search in four [[quadrant (plane geometry)\|quadrant]]s • Start search with three locations, one at center (A) and others (B and C), S=4 locations away from A in orthogonal directions ~~A=34.0444094,-1177977943; B=34.043634,-117.7897651; C=34.04453,-117.7977953~~ • Find points in search quadrant for second phase using the weight distribution for A, B, C: • If (MAD(A)>=MAD(B) and MAD(A)>=MAD(C)), select points in second phase quadrant IV Line 116 ⟶ 117: • Select the ___location with lowest weight as motion vector SES is computationally very efficient as compared to TSS. However the peak signal-to-noise ratio achieved is poor as compared to TSS as the error surfaces are not strictly unimodal in reality. ~~34.0444094~~ === Four Step Search === Four Step Search is an improvement over TSS in terms of lower computational cost and better peak signal-to-noise ratio. Similar to NTSS, FSS <ref>{{cite journal\|last1=Po\|first1=Lai-Man\|last2=Ma\|first2=Wing-Chung\|title=A Novel Four-Step Search Algorithm for Fast Block Motion Estimation\|journal= IEEE ~~Trans.~~Transactions on Circuits and Systems for Video Technology\|date=June 1996\|volume=6\|issue=3\|pages=313–317\|doi=10.1109/76.499840}}</ref> also employs center [[bias (statistics)\|bias]]ed searching and has a halfway stop provision. The algorithm runs as follows: Line 125 ⟶ 126: # Start with search ___location at center # Set step size S = 2, (irrespective of search parameter p) # Search ~~4 locations +/- S pixels search~~ 8 locations +/- S pixels around ___location (0,0)~~locations 1=34.0460965,-117.7955275;~~ ~~2=34.0464443,-117.7990496; 3=34.0446975,-117.7999699; 4=34.0438915,-117.7952174~~ # Pick among the 9 locations searched, the one with minimum cost function # If the minimum weight is found at center for search window: Line 136: ## Fix the step size as S = 2 ## Repeat the search procedure from steps 3 to 4. Depending on ___location of new origin, search through 5 locations or 3 locations ## Select the ___location with the least weight ~~34.043634,-117.7897651~~ ## If the least weight ___location is at the center of new window go to step 5, else go to step 6 === Diamond Search === Diamond Search (DS)<ref>{{cite journal\|last1=Zhu\|first1=Shan\|last2=Ma\|first2=Kai-Kuang\|title=A New Diamond Search Algorithm for Fast Block-Matching Motion Estimation\|journal=~~EEE~~ ~~Trans.~~IEEE Transactions on Image Processing\|date=February 2000\|volume=9\|issue=12\|pages=287–290\|doi=10.1109/83.821744\|pmid=18255398\|bibcode=2000ITIP....9..287Z}}</ref> algorithm uses a diamond search point pattern and the algorithm runs exactly the same as 4SS. However, there is no limit on the number of steps that the algorithm can take. Two different types of fixed patterns are used for search, Line 160: ## Set the new step size as S = S/2 (that is S = 1) ## Repeat the search procedure to find ___location with least weight ## Select ___location with the least weight as motion vector ~~___location of motion vector with least weight 34.0444094,-1177977943~~ This algorithm finds the global minimum very accurately as the search pattern is neither too big nor too small. Diamond Search algorithm has a peak signal-to-noise ratio close to that of Exhaustive Search with significantly less computational expense. === Adaptive Rood Pattern Search === Adaptive rood pattern search (ARPS) <ref>{{cite journal\|last1=Nie\|first1=Yao\|last2=Ma\|first2=Kai-Kuang\|title=Adaptive Rood Pattern Search for Fast Block-Matching Motion Estimation\|journal= IEEE ~~Trans.~~Transactions on Image Processing\|date=December 2002\|volume=11\|issue=12\|pages=1442–1448\|doi=10.1109/TIP.2002.806251\|pmid=18249712\|bibcode=2002ITIP...11.1442N\|url=http://www3.ntu.edu.sg/home/ekkma/1_Publications_files/Adaptive%20rood%20pattern%20search%20for%20fast%20block-matching%20motion%20estimation%20%28IEEE%20TIP%20Dec%202002%29.pdf}}</ref> algorithm makes use of the fact that the general motion in a frame is usually [[coherence (physics)\|coherent]], i.e. if the macro blocks around the current macro block moved in a particular direction then there is a high [[probability]] that the current macro block will also have a similar [[motion vector]]. This algorithm uses the motion vector of the macro block to its immediate left to predict its own motion vector. Adaptive rood pattern search runs as follows: