Pyramid (image processing): Difference between revisions

Revision as of 22:21, 26 June 2022 by Citation bot (Add: s2cid. Upgrade ISBN10 to 13. Suggested by AManWithNoPlan. #UCB_CommandLine)
Latest revision as of 05:09, 17 April 2025 by Citation bot (Add: bibcode, journal. Suggested by Dominic3203. Linked from User:LinguisticMystic/cs/outline. #UCB_webform_linked 1676/2277)
(7 intermediate revisions by 5 users not shown)
Line 1:
{{Short description\|Type of multi-scale signal representation}}
[[File:image pyramid.svg\|thumb\|upright=1.2\|Visual representation of an image pyramid with 5 levels]]
{{FeatureDetectionCompVisNavbox}}
'''Pyramid''', or '''pyramid representation''', is a type of [[Scale model\|multi-scale]] [[Signal (information theory)\|signal]] [[Knowledge representation\|representation]] developed by the [[computer vision]], [[image processing]] and [[signal processing]] communities, in which a signal or an image is subjected to repeated [[smoothing]] and [[Downsampling\|subsampling]]. Pyramid representation is a predecessor to [[scale space\|scale-space representation]] and [[multiresolution analysis]].
Line 25 ⟶ 27:
\| pages = 20–51 \|date=May 1981 }}</ref><ref name=Crowley1981>{{Cite journal \|last=Crowley \|first=James L. \|title=A representation for visual information \|journal=Interim Report Carnegie-Mellon Univ \|publisher=Carnegie-Mellon University, Robotics Institute \|date=November 1981 \|bibcode=1981cmu..reptR....C \|id=tech. report CMU-RI-TR-82-07 \|url=http://www.ri.cmu.edu/publication_view.html?pub_id=37}}</ref><ref>{{cite journal \| last1 = Burt \| first1 = Peter \| last2 = Adelson \| first2 = Ted \| year = 1983 \| title = The Laplacian Pyramid as a Compact Image Code \| url = http://persci.mit.edu/pub_pdfs/pyramid83.pdf\| journal = IEEE Transactions on Communications\| volume = 31 \| issue = 4\| pages = 532–540 \| doi = 10.1109/TCOM.1983.1095851 \| citeseerx = 10.1.1.54.299 \| s2cid = 8018433 }}</ref><ref>{{Cite journal \| last1 = Crowley \| first1 = J. L. \| last2 = Parker \| first2 = A. C. \| author2-link = Alice C. Parker
Line 38 ⟶ 40:
\| citeseerx = 10.1.1.161.3102 \| s2cid = 14348919 }}</ref><ref>{{cite journal \| last1 = Crowley \| first1 = J. L. \| last2 = Sanderson \| first2 = A. C. 
\| year = 1987 \| title = Multiple resolution representation and probabilistic matching of 2-D gray-scale shape \| url = http://www-prima.inrialpes.fr/Prima/Homepages/jlc/papers/Crowley-Sanderson-PAMI87.pdf\| journal = IEEE Transactions on Pattern Analysis and Machine Intelligence \| volume = 9 \| issue = 1\| pages = 113–121 \| doi = 10.1109/tpami.1987.4767876 \| pmid = 21869381 \| citeseerx = 10.1.1.1015.9294 \| s2cid = 14999508 }}</ref><ref>{{cite journal \| last1 = Meer \| first1 = P. \| last2 = Baugher \| first2 = E. S. \| last3 = Rosenfeld \| first3 = A. \| year = 1987 \| title = Frequency domain analysis and synthesis of image generating kernels \| doi = 10.1109/tpami.1987.4767939 \| journal = IEEE Transactions on Pattern Analysis and Machine Intelligence \| volume = 9 \| issue = 4\| pages = 512–522 \| pmid = 21869409 \| s2cid = 5978760 }}</ref> Among the kernels that have been proposed, the ''binomial kernels'' arising from the [[binomial coefficient]]s stand out as a particularly useful and theoretically well-founded class.<ref name=Crowley1981/><ref>Lindeberg, Tony, "[http://kth.diva-portal.org/smash/record.jsf?pid=diva2%3A472968&dswid=77 Scale-space for discrete signals]," PAMI(12), No. 3, March 1990, pp. 234–254.</ref><ref>{{cite journal \| last1 = Haddad \| first1 = R. A. \| last2 = Akansu \| first2 = A. N. \| date = March 1991 \| title = A Class of Fast Gaussian Binomial Filters for Speech and Image Processing \| url = https://web.njit.edu/~akansu/PAPERS/Haddad-AkansuFastGaussianBinomialFiltersIEEE-TSP-March1991.pdf \| journal = IEEE Transactions on Signal Processing \| volume = 39 \| issue = 3\| pages = 723–727\| doi = 10.1109/78.80892 \| bibcode = 1991ITSP...39..723H }}</ref><ref>Lindeberg, Tony. 
[http://www.csc.kth.se/~tony/book.html Scale-Space Theory in Computer Vision], Kluwer Academic Publishers, 1994, {{ISBN\|0-7923-9418-6}} (see specifically Chapter 2 for an overview of Gaussian and Laplacian image pyramids and Chapter 3 for theory about generalized binomial kernels and discrete Gaussian kernels)</ref><ref name=LinBre03-ScSp/><ref>See the article on [[multi-scale approaches]] for a very brief theoretical statement</ref>

Thus, given a two-dimensional image, we may apply the (normalized) binomial filter (1/4, 1/2, 1/4), typically twice or more along each spatial dimension, and then subsample the image by a factor of two. This operation may then be repeated as many times as desired, leading to a compact and efficient multi-scale representation. If motivated by specific requirements, intermediate scale levels may also be generated in which the subsampling stage is sometimes left out, leading to an ''oversampled'' or ''hybrid pyramid''.<ref name=LinBre03-ScSp/> With the increasing computational efficiency of [[CPU]]s available today, it is in some situations also feasible to use [[Gaussian filter]]s with wider support as smoothing kernels in the pyramid generation steps.

===Gaussian pyramid===
Line 44 ⟶ 46:
===Laplacian pyramid===
A Laplacian pyramid is very similar to a Gaussian pyramid but stores the difference image between the blurred versions at successive levels. Only the smallest level is not a difference image, which makes it possible to reconstruct the high-resolution image from the difference images at the finer levels. This technique can be used in [[image compression]].<ref>{{cite journal \| last1 = Burt \| first1 = Peter J. \| last2 = Adelson \| first2 = Edward H. 
\| year = 1983 \| title = The Laplacian Pyramid as a Compact Image Code \| url = http://persci.mit.edu/pub_pdfs/pyramid83.pdf \| journal = IEEE Transactions on Communications \| volume = 31\| issue = 4\| pages = 532–540\| doi = 10.1109/TCOM.1983.1095851 \| citeseerx = 10.1.1.54.299 \| s2cid = 8018433 }}</ref>

===Steerable pyramid===
A steerable pyramid, developed by [[Eero Simoncelli\|Simoncelli]] and others, is an implementation of a multi-scale, multi-orientation [[band-pass filter]] bank used for applications including [[image compression]], [[texture synthesis]], and [[Outline of object recognition\|object recognition]]. It can be thought of as an orientation-selective version of a Laplacian pyramid, in which a bank of [[steerable filter]]s is used at each level of the pyramid instead of a single Laplacian or [[Gaussian filter]].<ref>{{Cite web \|first=Eero \|last=Simoncelli \|url=http://www.cns.nyu.edu/~eero/STEERPYR/ \|title=The Steerable Pyramid \|publisher=cns.nyu.edu }}</ref><ref>{{Cite web \|first1=Roberto \|last1=Manduchi \|first2=Pietro \|last2=Perona \|first3=Doug \|last3=Shy \|title=Efficient Deformable Filter Banks \|url=http://www.vision.caltech.edu/publications/ManduchiPeronaShy_efficient_deformable.pdf \|publisher=[[California Institute of Technology]]/[[University of Padua]] \|year=1997 }} <br />Also in {{Cite journal \|journal= IEEE Transactions on Signal Processing \|title=Efficient Deformable Filter Banks \|volume=46 \|issue=4 \|pages=1168–1173 \|year=1998 \|doi=10.1109/78.668570\|last1=Manduchi \|first1=R. \|last2=Perona \|first2=P. \|last3=Shy \|first3=D. \|bibcode=1998ITSP...46.1168M \|citeseerx=10.1.1.5.3102 }}</ref><ref>{{cite book \| doi=10.1117/12.274510 \| chapter=Seven models of masking \| title=Human Vision and Electronic Imaging II \| date=1997 \| editor-last1=Rogowitz \| editor-first1=Bernice E. \| last1=Klein \| first1=Stanley A. \| last2=Carney \| first2=Thom \| last3=Barghout-Stein \| first3=Lauren \| last4=Tyler \| first4=Christopher W. \| volume=3016 \| pages=13–24 \| s2cid=8366504 \| editor-first2=Thrasyvoulos N. \| editor-last2=Pappas }}</ref>

==Applications of pyramids==