Pyramid (image processing): Difference between revisions

Content deleted Content added
Citation bot (talk | contribs)
Add: s2cid. Upgrade ISBN10 to 13. | Use this bot. Report bugs. | Suggested by AManWithNoPlan | #UCB_CommandLine
Citation bot (talk | contribs)
Add: bibcode, journal. | Use this bot. Report bugs. | Suggested by Dominic3203 | Linked from User:LinguisticMystic/cs/outline | #UCB_webform_linked 1676/2277
 
(7 intermediate revisions by 5 users not shown)
Line 1:
{{Short description|Type of multi-scale signal representation}}
[[File:image pyramid.svg|thumb|upright=1.2|Visual representation of an image pyramid with 5 levels]]
{{FeatureDetectionCompVisNavbox}}
 
'''Pyramid''', or '''pyramid representation''', is a type of [[Scale model|multi-scale]] [[Signal (information theory)|signal]] [[Knowledge representation|representation]] developed by the [[computer vision]], [[image processing]] and [[signal processing]] communities, in which a signal or an image is subject to repeated [[smoothing]] and [[Downsampling|subsampling]]. Pyramid representation is a predecessor to [[scale space|scale-space representation]] and [[multiresolution analysis]].
 
Line 25 ⟶ 27:
| pages = 20–51
|date=May 1981
}}</ref><ref name=Crowley1981>{{Cite journal |last=Crowley |first=James L. |title=A representation for visual information |journal=Interim Report Carnegie-Mellon Univ |publisher=Carnegie-Mellon University, Robotics Institute |date=November 1981 |bibcode=1981cmu..reptR....C |id=tech. report CMU-RI-TR-82-07 |url=http://www.ri.cmu.edu/publication_view.html?pub_id=37}}</ref><ref>{{cite journal | last1 = Burt | first1 = Peter | last2 = Adelson | first2 = Ted | year = 1983 | title = The Laplacian Pyramid as a Compact Image Code | url = http://persci.mit.edu/pub_pdfs/pyramid83.pdf| journal = IEEE Transactions on Communications| volume = 31 | issue = 4| pages = 532–540 | doi = 10.1109/TCOM.1983.1095851 | citeseerx = 10.1.1.54.299 | s2cid = 8018433 }}</ref><ref>{{Cite journal
| last1 = Crowley | first1 = J. L.
| last2 = Parker | first2 = A. C. | author2-link = Alice C. Parker
Line 38 ⟶ 40:
| citeseerx = 10.1.1.161.3102
| s2cid = 14348919
}}</ref><ref>{{cite journal | last1 = Crowley | first1 = J. L. | last2 = Sanderson | first2 = A. C. | year = 1987 | title = Multiple resolution representation and probabilistic matching of 2-D gray-scale shape | url = http://www-prima.inrialpes.fr/Prima/Homepages/jlc/papers/Crowley-Sanderson-PAMI87.pdf| journal = IEEE Transactions on Pattern Analysis and Machine Intelligence | volume = 9 | issue = 1| pages = 113–121 | doi = 10.1109/tpami.1987.4767876 | pmid = 21869381 | citeseerx = 10.1.1.1015.9294 | s2cid = 14999508 }}</ref><ref>{{cite journal | last1 = Meer | first1 = P. | last2 = Baugher | first2 = E. S. | last3 = Rosenfeld | first3 = A. | year = 1987 | title = Frequency domain analysis and synthesis of image generating kernels | doi = 10.1109/tpami.1987.4767939 | journal = IEEE Transactions on Pattern Analysis and Machine Intelligence | volume = 9 | issue = 4| pages = 512–522 | pmid = 21869409 | s2cid = 5978760 }}</ref> Among the suggestions that have been given, the ''binomial kernels'' arising from the [[binomial coefficient]]s stand out as a particularly useful and theoretically well-founded class.<ref name=Crowley1981/><ref>Lindeberg, Tony, "[http://kth.diva-portal.org/smash/record.jsf?pid=diva2%3A472968&dswid=77 Scale-space for discrete signals]," PAMI(12), No. 3, March 1990, pp. 234-254.</ref><ref>{{cite journal | last1 = Haddad | first1 = R. A. | last2 = Akansu | first2 = A. N. | date = March 1991 | title = A Class of Fast Gaussian Binomial Filters for Speech and Image Processing | url = https://web.njit.edu/~akansu/PAPERS/Haddad-AkansuFastGaussianBinomialFiltersIEEE-TSP-March1991.pdf | journal = IEEE Transactions on Signal Processing | volume = 39 | issue = 3| pages = 723–727| doi = 10.1109/78.80892 | bibcode = 1991ITSP...39..723H }}</ref><ref>Lindeberg, Tony. 
[http://www.csc.kth.se/~tony/book.html Scale-Space Theory in Computer Vision], Kluwer Academic Publishers, 1994, {{ISBN|0-7923-9418-6}} (see specifically Chapter 2 for an overview of Gaussian and Laplacian image pyramids and Chapter 3 for theory about generalized binomial kernels and discrete Gaussian kernels)</ref><ref name=LinBre03-ScSp/><ref>See the article on [[multi-scale approaches]] for a very brief theoretical statement</ref> Thus, given a two-dimensional image, we may apply the (normalized) binomial filter (1/4, 1/2, 1/4), typically twice or more along each spatial dimension, and then subsample the image by a factor of two. This operation may then be repeated as many times as desired, leading to a compact and efficient multi-scale representation. If motivated by specific requirements, intermediate scale levels may also be generated with the subsampling stage omitted at some levels, leading to an ''oversampled'' or ''hybrid pyramid''.<ref name=LinBre03-ScSp/> With the increasing computational efficiency of [[CPU]]s available today, it is in some situations also feasible to use [[Gaussian filter]]s with wider support as smoothing kernels in the pyramid generation steps.
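
The smooth-then-subsample step described above can be sketched in a few lines of Python. This is only an illustrative sketch, not a reference implementation: the function names (<code>smooth_1d</code>, <code>reduce_level</code>, <code>build_pyramid</code>), the nested-list image type and the replicated-edge boundary handling are assumptions made for the example and do not come from the cited sources.

```python
from typing import List

Image = List[List[float]]

KERNEL = (0.25, 0.5, 0.25)  # normalized binomial filter (1/4, 1/2, 1/4)


def smooth_1d(row: List[float]) -> List[float]:
    """Filter one row with KERNEL; border samples are replicated (assumption)."""
    n = len(row)
    out = []
    for i in range(n):
        left = row[max(i - 1, 0)]
        right = row[min(i + 1, n - 1)]
        out.append(KERNEL[0] * left + KERNEL[1] * row[i] + KERNEL[2] * right)
    return out


def transpose(img: Image) -> Image:
    """Swap rows and columns so the same 1-D filter can run along y."""
    return [list(col) for col in zip(*img)]


def smooth_2d(img: Image, passes: int = 2) -> Image:
    """Apply the separable filter `passes` times along each spatial dimension."""
    for _ in range(passes):
        img = [smooth_1d(r) for r in img]                        # along x
        img = transpose([smooth_1d(c) for c in transpose(img)])  # along y
    return img


def reduce_level(img: Image, passes: int = 2) -> Image:
    """Smooth, then keep every second sample in each dimension."""
    blurred = smooth_2d(img, passes)
    return [row[::2] for row in blurred[::2]]


def build_pyramid(img: Image, levels: int, passes: int = 2) -> List[Image]:
    """Return up to `levels` images, from full resolution to coarsest."""
    pyramid = [img]
    for _ in range(levels - 1):
        if len(pyramid[-1]) < 2 or len(pyramid[-1][0]) < 2:
            break  # nothing left to subsample
        pyramid.append(reduce_level(pyramid[-1], passes))
    return pyramid


if __name__ == "__main__":
    image = [[float((x + y) % 8) for x in range(16)] for y in range(16)]
    for level, im in enumerate(build_pyramid(image, 4)):
        print(level, len(im), "x", len(im[0]))
```

Omitting the <code>[::2]</code> subsampling in <code>reduce_level</code> at selected levels would give the oversampled (hybrid) variant mentioned above.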
 
===Gaussian pyramid===
Line 44 ⟶ 46:
 
===Laplacian pyramid===
A Laplacian pyramid is very similar to a Gaussian pyramid, but each level stores the difference image between the blurred versions at successive levels. Only the smallest level is not a difference image, so that the high-resolution image can be reconstructed from it together with the difference images at the other levels. This technique can be used in [[image compression]].<ref>{{cite journal | last1 = Burt | first1 = Peter J. | last2 = Adelson | first2 = Edward H. | year = 1983 | title = The Laplacian Pyramid as a Compact Image Code | url = http://persci.mit.edu/pub_pdfs/pyramid83.pdf | journal = IEEE Transactions on Communications | volume = 31| issue = 4| pages = 532–540| doi = 10.1109/TCOM.1983.1095851 | citeseerx = 10.1.1.54.299 | s2cid = 8018433 }}</ref>
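
The decomposition and reconstruction can be illustrated with a small Python sketch on a one-dimensional signal. It rests on several assumptions that are not taken from the cited paper: the helper names (<code>reduce_level</code>, <code>expand</code>, <code>build_laplacian</code>, <code>reconstruct</code>) are hypothetical, edges are handled by replication, and upsampling is done by plain sample repetition rather than by an interpolating filter. Reconstruction is still exact up to floating-point rounding, because the same <code>expand</code> is used when building and when inverting the pyramid.

```python
from typing import List, Tuple

Signal = List[float]


def smooth(x: Signal) -> Signal:
    """Binomial (1/4, 1/2, 1/4) filter with replicated edges (assumption)."""
    n = len(x)
    return [0.25 * x[max(i - 1, 0)] + 0.5 * x[i] + 0.25 * x[min(i + 1, n - 1)]
            for i in range(n)]


def reduce_level(x: Signal) -> Signal:
    """Blur, then keep every second sample."""
    return smooth(x)[::2]


def expand(x: Signal, size: int) -> Signal:
    """Upsample to `size` samples by sample repetition (simplifying assumption)."""
    return [x[min(i // 2, len(x) - 1)] for i in range(size)]


def build_laplacian(x: Signal, levels: int) -> Tuple[List[Signal], Signal]:
    """Return (difference signals from fine to coarse, coarsest blurred level)."""
    diffs = []
    current = x
    for _ in range(levels - 1):
        coarse = reduce_level(current)
        predicted = expand(coarse, len(current))
        diffs.append([a - b for a, b in zip(current, predicted)])
        current = coarse
    return diffs, current


def reconstruct(diffs: List[Signal], coarsest: Signal) -> Signal:
    """Invert build_laplacian: start at the coarsest level and add back differences."""
    current = coarsest
    for diff in reversed(diffs):
        predicted = expand(current, len(diff))
        current = [d + p for d, p in zip(diff, predicted)]
    return current


if __name__ == "__main__":
    signal = [float((3 * i) % 7) for i in range(16)]
    diffs, top = build_laplacian(signal, 4)
    restored = reconstruct(diffs, top)
    print(max(abs(a - b) for a, b in zip(signal, restored)))
```

The same scheme carries over to images by applying the filter and the subsampling along both spatial dimensions.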
 
===Steerable pyramid===
A steerable pyramid, developed by [[Eero Simoncelli|Simoncelli]] and others, is an implementation of a multi-scale, multi-orientation [[band-pass filter]] bank used for applications including [[image compression]], [[texture synthesis]], and [[Outline of object recognition|object recognition]]. It can be thought of as an orientation-selective version of a Laplacian pyramid, in which a bank of [[steerable filter]]s is used at each level of the pyramid instead of a single Laplacian or [[Gaussian filter]].<ref>{{Cite web |first=Eero |last=Simoncelli |url=http://www.cns.nyu.edu/~eero/STEERPYR/ |title=The Steerable Pyramid |publisher=cns.nyu.edu }}</ref><ref>{{Cite web |first1=Roberto |last1=Manduchi |first2=Pietro |last2=Perona |first3=Doug |last3=Shy |title=Efficient Deformable Filter Banks |url=http://www.vision.caltech.edu/publications/ManduchiPeronaShy_efficient_deformable.pdf |publisher=[[California Institute of Technology]]/[[University of Padua]] |year=1997 }} <br />Also in {{Cite journal |journal= IEEE Transactions on Signal Processing |title=Efficient Deformable Filter Banks |volume=46 |issue=4 |pages=1168–1173 |year=1998 |doi=10.1109/78.668570|last1=Manduchi |first1=R. |last2=Perona |first2=P. |last3=Shy |first3=D. |bibcode=1998ITSP...46.1168M |citeseerx=10.1.1.5.3102 }}</ref><ref>{{cite book | doi=10.1117/12.274510 | chapter=Seven models of masking | title=Human Vision and Electronic Imaging II | date=1997 | editor-last1=Rogowitz | editor-first1=Bernice E. | last1=Klein | first1=Stanley A. | last2=Carney | first2=Thom | last3=Barghout-Stein | first3=Lauren | last4=Tyler | first4=Christopher W. | volume=3016 | pages=13–24 | s2cid=8366504 | editor-first2=Thrasyvoulos N. | editor-last2=Pappas }}</ref>
 
==Applications of pyramids==