Object recognition: differences between revisions

Revision as of 11:16, 3 Feb 2010 by Escherblau (4 edits; minor edit, no edit summary). Current revision as of 22:07, 25 Aug 2023 by InternetArchiveBot (1,845,993 edits; edit summary: "Rescuing 1 source(s) and tagging 0 link(s) as dead. #IABot (v2.0.9.5)").
(45 intermediate revisions by 26 users not shown)
Line 1:

[[File:Detected-with-YOLO--Schreibtisch-mit-Objekten.jpg|thumb|Object detection by means of [[apprendimento profondo|deep learning]], using a network model based on [[YOLO (algoritmo)|YOLOv3]] and trained on the COCO dataset, which can detect up to 80 different types of objects.]]

In [[visione artificiale|computer vision]], '''object recognition''' is the ability to find a given object in a sequence of images or in a video. Humans recognize a multitude of objects in images with little effort, even though the image of an object may vary somewhat with viewpoint, size or scale, and rotation; objects can be recognized even when they are partially hidden from view. This task is still a challenge for [[visione artificiale|computer vision]] in general.

The computer scientist [[David G. Lowe|David Lowe]] pioneered the [[visione artificiale|computer vision]] approach of extracting and using scale-invariant [[Scale-invariant feature transform|SIFT]] features to make recognition more reliable.

For any object in an image there are many [[Caratteristica (apprendimento automatico)|features]], that is, interesting points on the object, which can be extracted to provide a "feature" description of the object. This description, extracted from a training image, can then be used to identify the object when attempting to locate it in a test image that contains several objects.
It is important that the set of features extracted from the training image be insensitive to changes in image scale, noise, illumination and geometric distortion, so that recognition is reliable. Lowe's patented method<ref>{{US patent|6,711,293}}, "Method and apparatus for identifying scale invariant features in an image and use of same for locating an object in an image", David Lowe's patent for the SIFT algorithm</ref> can recognize objects reliably, even among clutter and under partial occlusion, because the SIFT method is invariant to scale, orientation and distortion, and partially invariant to illumination changes.<ref name="lowe">{{en}} Lowe, D. G., “Object recognition from local scale-invariant features”, International Conference on Computer Vision, Corfu, Greece, September 1999.</ref> This article presents Lowe's method and mentions some competing techniques available for object recognition under clutter and partial occlusion.

== Lowe's method ==
The [[Scale-invariant feature transform|SIFT]] keypoints of objects are first extracted from a set of reference images<ref name="lowe" /> and stored in a database. An object is recognized in a new image by individually comparing each feature of the new image with the database and finding the most similar candidate according to the [[distanza euclidea|Euclidean distance]] between their feature vectors. From the full set of matches, subsets of keypoints that agree on the object and on its location, scale and orientation in the new image are identified in order to keep only the good matches. The most consistent clusters are determined rapidly using an efficient [[hash table]] implementation of the generalized [[trasformata di Hough|Hough transform]].
Each cluster of 3 or more features that agree on an object and its pose is then subject to further verification, and the worst candidates are subsequently discarded. Finally, the probability that a particular set of features indicates the presence of an object is computed, given the accuracy of the fit and the number of probable false matches. Object matches that pass all these tests can be identified as correct with high confidence.<ref name="lowe04">Lowe, D. G., “Distinctive Image Features from Scale-Invariant Keypoints”, International Journal of Computer Vision, 60, 2, pp. 91-110, 2004.</ref>

Line 30:

| Reliable models
|-
| Model verification / outlier detection
| Linear least squares
| Better error tolerance with fewer matches

Line 40:

== Main stages ==
=== Scale-invariant feature detection ===
Lowe's method for image feature generation, called the '''Scale Invariant Feature Transform''' (SIFT), transforms an image into a large collection of feature vectors, each of which is invariant to translation, scaling and rotation, and partially invariant to illumination. The method is robust to geometric distortion. These features share similar properties with neurons in the [[lobo occipitale|occipital lobe]], which are used for object recognition in primate vision systems.<ref>{{en}} Serre, T., Kouh, M., Cadieu, C., Knoblich, U., Kreiman, G., Poggio, T., “A Theory of Object Recognition: Computations and Circuits in the Feedforward Path of the Ventral Stream in Primate Visual Cortex”, Computer Science and Artificial Intelligence Laboratory Technical Report, December 19, 2005 MIT-CSAIL-TR-2005-082.</ref>
Keypoint locations are defined as the maxima and minima of the result of a [[differenza delle gaussiane|difference of Gaussians]] function applied to a series of images in [[spazio-scala|scale space]]. Low-contrast candidate points and edge response points lying along an edge are discarded. Dominant orientations are assigned to the localized keypoints. These steps ensure that the keypoints are more stable for recognition.

The robustness of SIFT to distortion is then obtained by considering the [[pixel]]s in a neighbourhood of the keypoint and by blurring and resampling the local image.
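
The keypoint definition above (extrema of the difference of Gaussians across position and scale, followed by a contrast check) can be illustrated with a minimal pure-Python sketch. It is not Lowe's implementation: the function names, the border clamping, the 3-sigma kernel truncation and the <code>contrast_threshold</code> value are assumptions made for illustration, and the rejection of edge responses and the orientation assignment are left out.

```python
import math


def gaussian_kernel(sigma):
    """Return a normalized 1D Gaussian kernel truncated at about 3 sigma."""
    radius = max(1, math.ceil(3 * sigma))
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def blur(image, sigma):
    """Separable Gaussian blur of a list-of-lists image, clamping at the borders."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    height, width = len(image), len(image[0])

    def clamp(value, size):
        return min(max(value, 0), size - 1)

    horizontal = [[sum(kernel[k + radius] * image[y][clamp(x + k, width)]
                       for k in range(-radius, radius + 1))
                   for x in range(width)] for y in range(height)]
    return [[sum(kernel[k + radius] * horizontal[clamp(y + k, height)][x]
                 for k in range(-radius, radius + 1))
             for x in range(width)] for y in range(height)]


def dog_keypoints(image, sigmas, contrast_threshold=0.03):
    """Return (x, y, sigma) for strict extrema of the difference of Gaussians.

    The threshold is an illustrative assumption, not a value from the article.
    """
    height, width = len(image), len(image[0])
    blurred = [blur(image, s) for s in sigmas]
    # Each DoG level is the difference between two adjacent blur levels.
    dogs = [[[b[y][x] - a[y][x] for x in range(width)] for y in range(height)]
            for a, b in zip(blurred, blurred[1:])]
    keypoints = []
    for level in range(1, len(dogs) - 1):
        for y in range(1, height - 1):
            for x in range(1, width - 1):
                value = dogs[level][y][x]
                if abs(value) < contrast_threshold:
                    continue  # discard low-contrast candidates
                # Compare with the 26 neighbours in position and scale.
                neighbours = [dogs[level + dl][y + dy][x + dx]
                              for dl in (-1, 0, 1)
                              for dy in (-1, 0, 1)
                              for dx in (-1, 0, 1)
                              if (dl, dy, dx) != (0, 0, 0)]
                if value > max(neighbours) or value < min(neighbours):
                    keypoints.append((x, y, sigmas[level]))
    return keypoints
```

Called on a small synthetic image that contains a single Gaussian blob, with a geometric series of blur levels such as <code>[1.0, 1.6, 2.56, 4.096]</code>, the sketch reports the centre of the blob among its keypoints.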
=== Search and indexing ===
Indexing is the problem of storing SIFT keypoints and of identifying them in a new image. Lowe used a modification of the [[k-d tree]] algorithm called the '''best-bin-first search''' (BBF) method<ref>{{en}} Beis, J., and Lowe, D.G “Shape indexing using approximate nearest-neighbour search in high-dimensional spaces”, Conference on Computer Vision and Pattern Recognition, Puerto Rico, 1997, pp. 1000–1006.</ref> that can identify the ''[[K-nearest neighbors|nearest neighbors]]'' with high probability using only a limited amount of computation. The BBF algorithm uses a modified search ordering for the [[k-d tree]], so that bins in feature space are searched in the order of their closest distance from the query location. This search order requires a ''[[Heap (struttura dati)|heap]]''-based [[coda di priorità|priority queue]] for efficient determination of the search order. The best candidate match for each keypoint is found by identifying its nearest neighbors in the database of keypoints from the training images. The ''nearest neighbors'' are defined as the keypoints with the minimum [[distanza euclidea|Euclidean distance]] from the given descriptor vector.
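
The search order described above can be sketched as follows. This is a simplified illustration under stated assumptions rather than the algorithm of Beis and Lowe as published: the <code>Node</code> class, the median-split tree construction and the names <code>build_kdtree</code> and <code>bbf_nearest</code> are hypothetical, and the cutoff <code>max_checks</code> is only tested between descents of the tree.

```python
import heapq
import math
from itertools import count


class Node:
    """One k-d tree node holding the index of a database point."""

    def __init__(self, index, axis, left, right):
        self.index = index
        self.axis = axis
        self.left = left
        self.right = right


def build_kdtree(points, indices=None, depth=0):
    """Build a k-d tree by median split, cycling through the axes."""
    if indices is None:
        indices = list(range(len(points)))
    if not indices:
        return None
    axis = depth % len(points[0])
    indices = sorted(indices, key=lambda i: points[i][axis])
    mid = len(indices) // 2
    return Node(indices[mid], axis,
                build_kdtree(points, indices[:mid], depth + 1),
                build_kdtree(points, indices[mid + 1:], depth + 1))


def bbf_nearest(root, points, query, max_checks=200):
    """Approximate nearest neighbour: visit bins closest to the query first."""
    best_index, best_dist = None, math.inf
    tie = count()  # tie-breaker so the heap never compares Node objects
    heap = [(0.0, next(tie), root)]  # (lower bound on distance, tie, subtree)
    checks = 0
    while heap and checks < max_checks:
        bound, _, node = heapq.heappop(heap)
        if bound >= best_dist:
            break  # every remaining bin is at least this far away
        while node is not None:
            point = points[node.index]
            dist = math.dist(point, query)
            checks += 1
            if dist < best_dist:
                best_index, best_dist = node.index, dist
            diff = query[node.axis] - point[node.axis]
            near, far = ((node.left, node.right) if diff < 0
                         else (node.right, node.left))
            if far is not None:
                # The splitting plane gives a lower bound for the far side.
                heapq.heappush(heap, (max(bound, abs(diff)), next(tie), far))
            node = near
    return best_index, best_dist
```

With a large <code>max_checks</code> the search runs to completion and returns the exact nearest neighbour; with a small value it stops early and may return a slightly farther point, which is the trade-off the cutoff is meant to exploit.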
The probability that a match is correct can be determined from the ratio of the distance to the closest neighbor to the distance to the second closest.

Lowe<ref name="lowe04" /> rejected all matches in which the distance ratio is greater than 0.8, which eliminates 90% of the false matches while discarding less than 5% of the correct matches. To further improve the efficiency of the best-bin-first algorithm, the search is cut off after checking the first 200 [[nearest neighbor]] candidates. For a database of 100,000 keypoints, this speeds up the search for the correct [[nearest neighbor]] by about two orders of magnitude, at the cost of less than 5% of the correct matches.
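
The distance-ratio criterion can be written down in a few lines. The sketch below is illustrative only: it uses a brute-force search in place of best-bin-first, and the function name <code>ratio_test_matches</code> and the tuple-based descriptors are assumptions; the threshold 0.8 is the one quoted above.

```python
import math


def ratio_test_matches(query_descriptors, database_descriptors, max_ratio=0.8):
    """Return (query_index, database_index) pairs that pass the ratio test.

    Brute-force search is used here for clarity; it stands in for the
    approximate best-bin-first search.
    """
    matches = []
    for query_index, query in enumerate(query_descriptors):
        distances = sorted((math.dist(query, candidate), database_index)
                           for database_index, candidate
                           in enumerate(database_descriptors))
        if len(distances) < 2:
            continue  # the ratio needs two neighbours
        (closest, closest_index), (second, _) = distances[0], distances[1]
        # Reject the match when closest / second exceeds max_ratio.
        if second > 0 and closest <= max_ratio * second:
            matches.append((query_index, closest_index))
    return matches
```

A query whose two nearest database descriptors are almost equally distant is ambiguous and is dropped, while a query with one clearly closest descriptor is kept.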
=== Cluster identification by Hough transform voting ===
The [[Hough transform]] is used to cluster reliable model hypotheses, searching for keys that agree upon a particular model [[pose]]. The [[Hough transform]] identifies clusters of features with a consistent interpretation by using each feature to vote for all object [[pose]]s that are consistent with the feature.
When clusters of features are found to vote for the same pose of an object, the probability of the interpretation being correct is much higher than for any single feature. An entry in a [[hash table]] is created predicting the model location, orientation, and scale from the match hypothesis. The [[hash table]] is searched to identify all clusters of at least 3 entries in a bin, and the bins are sorted into decreasing order of size.

Each of the [[Scale-invariant feature transform|SIFT]] keypoints specifies 2D location, scale, and orientation, and each matched keypoint in the database has a record of the keypoint’s parameters relative to the training image in which it was found. The similarity transform implied by these 4 parameters is only an approximation to the full 6 degree-of-freedom pose space for a 3D object and also does not account for any non-rigid deformations. Therefore, Lowe<ref name="lowe04" /> used broad bin sizes of 30 degrees for orientation, a factor of 2 for scale, and 0.25 times the maximum projected training image dimension (using the predicted scale) for location. The [[Scale-invariant feature transform|SIFT]] key samples generated at the larger scale are given twice the weight of those at the smaller scale. This means that the larger scale is in effect able to filter the most likely neighbours for checking at the smaller scale. This also improves recognition performance by giving more weight to the least-noisy scale. To avoid the problem of boundary effects in bin assignment, each keypoint match votes for the 2 closest bins in each dimension, giving a total of 16 entries for each hypothesis and further broadening the pose range.
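
The voting scheme can be sketched with a dictionary standing in for the hash table. The bin widths (30 degrees, a factor of 2 in scale, 0.25 times the training image dimension) come from the text above; everything else is an assumption made for illustration: the pose hypotheses are taken as given tuples <code>(model_id, x, y, scale, orientation_deg)</code>, the location bin width is derived from the nominal scale of each scale bin so that indices stay comparable, a single <code>max_training_dim</code> is used for all models, and the double weighting of larger-scale keys is omitted.

```python
import math
from collections import defaultdict
from itertools import product


def two_closest_bins(value, bin_width):
    """Indices of the two bins whose centres are nearest to value."""
    lower = math.floor(value / bin_width - 0.5)
    return (lower, lower + 1)


def hough_clusters(hypotheses, max_training_dim, min_votes=3):
    """Vote each pose hypothesis into 16 bins and return the populated clusters.

    hypotheses: list of (model_id, x, y, scale, orientation_deg) tuples.
    Returns a list of (bin_key, [hypothesis indices]) sorted by decreasing size.
    """
    table = defaultdict(list)  # plays the role of the hash table
    for match_id, (model_id, x, y, scale, orientation) in enumerate(hypotheses):
        # Orientation: 30-degree bins, wrapping around at 360 degrees.
        o_bins = [b % 12 for b in two_closest_bins(orientation % 360.0, 30.0)]
        # Scale: one bin per factor of 2, hence bins of width 1 in log2(scale).
        for s_bin in two_closest_bins(math.log2(scale), 1.0):
            # Location: 0.25 x training image dimension at the bin's scale.
            location_width = 0.25 * max_training_dim * 2.0 ** s_bin
            x_bins = two_closest_bins(x, location_width)
            y_bins = two_closest_bins(y, location_width)
            for x_bin, y_bin, o_bin in product(x_bins, y_bins, o_bins):
                table[(model_id, x_bin, y_bin, s_bin, o_bin)].append(match_id)
    clusters = [(key, ids) for key, ids in table.items() if len(ids) >= min_votes]
    clusters.sort(key=lambda item: len(item[1]), reverse=True)
    return clusters
```

Three hypotheses with nearly identical poses for the same model end up together in shared bins, whereas an isolated hypothesis never reaches the minimum of 3 entries and produces no cluster.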
=== Model verification by linear least squares ===
Each identified cluster is then subject to a verification procedure in which a [[linear least squares]] solution is performed for the parameters of the [[affine transformation]] relating the model to the image. The [[affine transformation]] of a model point [x y]<sup>T</sup> to an image point [u v]<sup>T</sup> can be written as

:<math>
\begin{bmatrix} u \\ v \end{bmatrix} = \begin{bmatrix} m1 & m2 \\ m3 & m4 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} + \begin{bmatrix} tx \\ ty \end{bmatrix}
</math>

where the model translation is [tx ty]<sup>T</sup> and the affine rotation, scale, and stretch are represented by the parameters m1, m2, m3 and m4. To solve for the transformation parameters, the equation above can be rewritten to gather the unknowns into a column vector:

:<math>
\begin{bmatrix} x & y & 0 & 0 & 1 & 0 \\ 0 & 0 & x & y & 0 & 1 \\ ....\\ ....\end{bmatrix} \begin{bmatrix}m1 \\ m2 \\ m3 \\ m4 \\ tx \\ ty \end{bmatrix} = \begin{bmatrix} u \\ v \\ . \\ . \end{bmatrix}
</math>

This equation shows a single match, but any number of further matches can be added, with each match contributing two more rows to the first and last matrix. At least 3 matches are needed to provide a solution. We can write this linear system as

:<math>A\hat{\mathbf{x}} \approx \mathbf{b},</math>

where ''A'' is a known ''m''-by-''n'' [[Matrix (mathematics)|matrix]] (usually with ''m'' > ''n''), '''x''' is an unknown ''n''-dimensional parameter [[vector space|vector]], and '''b''' is a known ''m''-dimensional measurement vector.

Therefore the minimizing vector <math>\hat{\mathbf{x}}</math> is a solution of the '''normal equation'''

:<math> A^T \! A \hat{\mathbf{x}} = A^T \mathbf{b}. </math>

The solution of the system of linear equations is given in terms of the matrix <math>(A^TA)^{-1}A^T</math>, called the [[Moore-Penrose pseudoinverse|pseudoinverse]] of ''A'', by

:<math> \hat{\mathbf{x}} = (A^T\!A)^{-1} A^T \mathbf{b}, </math>

which minimizes the sum of the squares of the distances from the projected model locations to the corresponding image locations.

=== Outlier detection ===
[[Outlier]]s can now be removed by checking for agreement between each image feature and the model, given the parameter solution. Given the [[linear least squares]] solution, each match is required to agree within half the error range that was used for the parameters in the [[Hough transform]] bins. As outliers are discarded, the [[linear least squares]] solution is re-solved with the remaining points, and the process is iterated. If fewer than 3 points remain after discarding [[outlier]]s, then the match is rejected. In addition, a top-down matching phase is used to add any further matches that agree with the projected model position, which may have been missed from the [[Hough transform]] bin due to the similarity transform approximation or other errors.

The final decision to accept or reject a model hypothesis is based on a detailed probabilistic model.<ref>Lowe, D.G., Local feature view clustering for 3D object recognition. IEEE Conference on Computer Vision and Pattern Recognition, Kauai, Hawaii, 2001, pp. 682-688.</ref> This method first computes the expected number of false matches to the model pose, given the projected size of the model, the number of features within the region, and the accuracy of the fit. A [[Bayesian probability]] analysis then gives the probability that the object is present based on the actual number of matching features found. A model is accepted if the final probability for a correct interpretation is greater than 0.98.
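
The least-squares fit and the iterative outlier removal described above can be illustrated with a short pure-Python sketch. It builds the two rows per match, solves the normal equations by Gaussian elimination, and refits after discarding matches whose residual is too large. The function names are hypothetical, and the single pixel <code>tolerance</code> is a simplifying assumption that replaces the per-parameter rule (half the Hough bin error range); the top-down matching phase and the probabilistic acceptance test are not included.

```python
import math


def solve_linear_system(matrix, rhs):
    """Solve a square linear system by Gaussian elimination with pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular system: matches may be collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    solution = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * solution[c] for c in range(r + 1, n))
        solution[r] = (aug[r][n] - tail) / aug[r][r]
    return solution


def fit_affine(matches):
    """Least-squares [m1, m2, m3, m4, tx, ty] from ((x, y), (u, v)) matches."""
    rows, targets = [], []
    for (x, y), (u, v) in matches:
        rows.append([x, y, 0.0, 0.0, 1.0, 0.0])
        targets.append(u)
        rows.append([0.0, 0.0, x, y, 0.0, 1.0])
        targets.append(v)
    # Normal equations: (A^T A) x = A^T b.
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(6)] for i in range(6)]
    atb = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(6)]
    return solve_linear_system(ata, atb)


def fit_with_outlier_rejection(matches, tolerance):
    """Refit until every remaining match agrees with the model within tolerance.

    Returns (parameters, inliers), or (None, []) if fewer than 3 matches remain.
    """
    current = list(matches)
    while len(current) >= 3:
        m1, m2, m3, m4, tx, ty = params = fit_affine(current)
        inliers = [((x, y), (u, v)) for (x, y), (u, v) in current
                   if math.hypot(m1 * x + m2 * y + tx - u,
                                 m3 * x + m4 * y + ty - v) <= tolerance]
        if len(inliers) == len(current):
            return params, current
        current = inliers
    return None, []
```

Because an ordinary least-squares fit is pulled towards gross outliers, the tolerance has to be loose enough for the correct matches to survive the first pass; the sketch makes no attempt to choose it automatically.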
Lowe's SIFT-based object recognition gives excellent results except under wide illumination variations and under non-rigid transformations.

== Competing methods for scale invariant object recognition under clutter / partial occlusion ==
RIFT<ref>Lazebnik, S., Schmid, C., and Ponce, J., Semi-Local Affine Parts for Object Recognition, BMVC, 2004.</ref> is a rotation-invariant generalization of SIFT. The RIFT descriptor is constructed using circular normalized patches divided into concentric rings of equal width, and within each ring a gradient orientation histogram is computed. To maintain rotation invariance, the orientation is measured at each point relative to the direction pointing outward from the center.

G-RIF<ref>Sungho Kim, Kuk-Jin Yoon, In So Kweon, "Object Recognition Using a Generalized Robust Invariant Feature and Gestalt’s Law of Proximity and Similarity," Conference on Computer Vision and Pattern Recognition Workshop (CVPRW'06), 2006</ref> (Generalized Robust Invariant Feature) is a general context descriptor which encodes edge orientation, edge density and hue information in a unified form combining perceptual information with spatial encoding. The object recognition scheme uses neighbouring context based voting to estimate object models.

[[SURF]]<ref>Bay, H., Tuytelaars, T., Gool, L.V., "SURF: Speeded Up Robust Features", Proceedings of the ninth European Conference on Computer Vision, May 2006.</ref> (Speeded Up Robust Features) is a high-performance scale- and rotation-invariant interest point detector / descriptor claimed to approximate or even outperform previously proposed schemes with respect to repeatability, distinctiveness, and robustness. [[SURF]] relies on integral images for image convolutions to reduce computation time, and builds on the strengths of the leading existing detectors and descriptors (using a fast [[Hessian matrix]]-based measure for the detector and a distribution-based descriptor).
It describes a distribution of [[Haar wavelet]] responses within the interest point neighbourhood. Integral images are used for speed, and only 64 dimensions are used, reducing the time for feature computation and matching. The indexing step is based on the sign of the [[Laplacian]], which increases the matching speed and the robustness of the descriptor.

PCA-SIFT<ref>Ke, Y., and Sukthankar, R., PCA-SIFT: A More Distinctive Representation for Local Image Descriptors, Computer Vision and Pattern Recognition, 2004.</ref> and [[GLOH]]<ref>Mikolajczyk, K., and Schmid, C., "A performance evaluation of local descriptors", IEEE Transactions on Pattern Analysis and Machine Intelligence, 10, 27, pp 1615--1630, 2005.</ref> are variants of [[Scale-invariant feature transform|SIFT]]. The PCA-SIFT descriptor is a vector of image gradients in the x and y directions computed within the support region. The gradient region is sampled at 39x39 locations, therefore the vector is of dimension 3042. The dimension is reduced to 36 with [[PCA]]. Gradient location-orientation histogram ([[GLOH]]) is an extension of the [[Scale-invariant feature transform|SIFT]] descriptor designed to increase its robustness and distinctiveness. The [[Scale-invariant feature transform|SIFT]] descriptor is computed for a log-polar location grid with three bins in radial direction (the radius set to 6, 11, and 15) and 8 in angular direction, which results in 17 location bins. The central bin is not divided in angular directions. The gradient orientations are quantized in 16 bins, resulting in a 272-bin histogram. The size of this descriptor is reduced with [[PCA]]. The [[covariance matrix]] for [[PCA]] is estimated on image patches collected from various images. The 128 largest [[eigenvector]]s are used for description.

== Applications ==
Object recognition methods have the following applications:
* image panoramas;<ref>Brown, M., and Lowe, D.G., "Recognising Panoramas," ICCV, p. 1218, Ninth IEEE International Conference on Computer Vision (ICCV'03) - Volume 2, Nice, France, 2003</ref>
* image [[Watermark (informatica)|watermarking]];<ref>Li, L., Guo, B., and Shao, K., "Geometrically robust image watermarking using scale-invariant feature transform and Zernike moments," Chinese Optics Letters, Volume 5, Issue 6, pp. 332-335, 2007.</ref>
* global robot localization.<ref>Se, S., Lowe, D.G., and Little, J.J., "Vision-based global localization and mapping for mobile robots", IEEE Transactions on Robotics, 21, 3 (2005), pp. 364-375.</ref>

== Notes ==
<references/>

== See also ==
* [[Riconoscimento di pattern|Pattern recognition]]
* [[YOLO (algoritmo)|YOLO]]

== External links ==
* {{cita web|http://citeseer.ist.psu.edu/lowe04distinctive.html|Lowe, D. G., “Distinctive Image Features from Scale-Invariant Keypoints”, International Journal of Computer Vision, 60, 2, pp. 91-110, 2004.}}
* {{cita web|http://www.cs.ubc.ca/spider/lowe/pubs.html|David Lowe's Publications}}
* {{cita web|http://www.cs.ubc.ca/~lowe/keypoints/|David Lowe's Demo Software: SIFT keypoint detector}}
* {{cita web|http://www.vision.ee.ethz.ch/~surf/index.html|SURF: Speeded up robust features}}
* {{cita web | 1 = http://lear.inrialpes.fr/pubs/2005/MS05/ | 2 = Mikolajczyk, K., and Schmid, C., "A performance evaluation of local descriptors", IEEE Transactions on Pattern Analysis and Machine Intelligence, 10, 27, pp 1615--1630, 2005. | accesso = 10 giugno 2009 | dataarchivio = 6 aprile 2019 | urlarchivio = https://web.archive.org/web/20190406222721/http://lear.inrialpes.fr/pubs/2005/MS05/ | urlmorto = sì }}
* {{cita web|https://www.cs.cmu.edu/~yke/pcasift/|PCA-SIFT: A More Distinctive Representation for Local Image Descriptors}}
* {{cita web | 1 = http://www-cvr.ai.uiuc.edu/ponce_grp/publication/paper/bmvc04.pdf | 2 = Lazebnik, S., Schmid, C., and Ponce, J., Semi-Local Affine Parts for Object Recognition, BMVC, 2004. | accesso = 10 giugno 2009 | dataarchivio = 11 ottobre 2017 | urlarchivio = https://web.archive.org/web/20171011044539/http://www-cvr.ai.uiuc.edu/ponce_grp/publication/paper/bmvc04.pdf | urlmorto = sì }}
* {{cita web |1=http://user.cs.tu-berlin.de/~nowozin/libsift/ |2=libsift: Scale Invariant Feature Transform implementation |accesso=10 giugno 2009 |urlarchivio=https://web.archive.org/web/20090418005923/http://user.cs.tu-berlin.de/~nowozin/libsift/ |dataarchivio=18 aprile 2009 |urlmorto=sì }}

{{Apprendimento automatico}}
{{portale|informatica|statistica|matematica}}

[[Categoria:Apprendimento automatico]]
[[Categoria:Visione artificiale]]