Features from accelerated segment test: Difference between revisions

Browse history interactively

← Previous edit

Content deleted Content added

VisualWikitext

Revision as of 21:40, 9 November 2022 edit Olexa Riznyk (talk \| contribs) Extended confirmed users 2,365 edits →Improvement with machine learning: Adding/removing wikilinks ← Previous edit		Latest revision as of 21:34, 25 June 2024 edit undo 192.19.222.250 (talk) No edit summary Tag: Visual edit
(7 intermediate revisions by 3 users not shown)
Line 1: '''Features from accelerated segment test (FAST)''' is a [[corner detection]] method, which could be used to extract [[Feature (computer vision)\|feature]] points and later used to track and map objects in many [[computer vision]] tasks. The FAST corner detector was originally developed by Edward Rosten and Tom Drummond, and was published in 2006.<ref> {{cite book \|last1=Rosten \|first1=Edward Line 13: \|isbn=978-3-540-33832-1 }} </ref> The most promising advantage of the FAST [[corner detector]] is its computational efficiency. Referring to its name, it is indeed faster than many other well-known feature extraction methods, such as [[difference of Gaussians]] (DoG) used by the [[Scale-invariant feature transform\|SIFT]], [[Corner detection#The SUSAN corner detector\|SUSAN]] and [[Harris affine region detector\|Harris]] detectors. Moreover, when machine learning techniques are applied, superior performance in terms of computation time and resources can be realised. The FAST corner detector is very suitable for real-time video processing application because of this high-speed performance. == Segment test detector == [[File:FAST Corner Detector.jpg\|thumb\|The pixels used by the FAST corner detector]] FAST corner detector uses a circle of 16 pixels (a [[Midpoint circle algorithm\|Bresenham circle]] of radius 3) to classify whether a candidate point p is actually a corner. Each pixel in the circle is labeled from integer number 1 to 16 clockwise. If a set of N contiguous pixels in the circle are all brighter than the intensity of candidate pixel p (denoted by I<sub>p</sub>) plus a threshold value t or all darker than the intensity of candidate pixel p minus threshold value t, then p is classified as corner. The conditions can be written as: Condition 1: A set of N contiguous pixels S, <math>\forall x \in S</math>, ~~the intensity of x > I<sub>p</sub> + threshold, or~~ <math>I_x > I_p + t</math> (a dark corner on a bright background) Condition 2: A set of N contiguous pixels S, <math>\forall x \in S</math>, <math>I_x < I_p - t</math> (a bright corner on a dark background) So when either of the two conditions is met, candidate p can be classified as a corner. There is a tradeoff of choosing N, the number of contiguous pixels and the threshold value t. On one hand the number of detected corner points should not be too many, on the other hand, the high performance should not be achieved by sacrificing computational efficiency. Without the improvement of [[machine learning]], N is usually chosen as 12. A high-speed test method could be applied to exclude non-corner points. Line 46: Secondly, a [[decision tree]] algorithm, the [[ID3 algorithm]] is applied to the 16 locations in order to achieve the maximum [[information gain]]. Let K<sub>p</sub> be a boolean variable which indicates whether p is a corner, then the [[Entropy (information theory)\|entropy]] of K<sub>p</sub> is used to measure the information of p being a corner. For a set of pixels Q, the total entropy of K<sub>Q</sub> (not normalized) is: H(Q) = ( c + n ) log<sub>2</sub>( c + n ) - clog<sub>2</sub>c - nlog<sub>2</sub>n * where c = \|{ i ∈ Q: K<sub>i</sub> is true}\| (number of corners) ** where n = \|{ i ∈ Q: K<sub>i</sub> is false}\| (number of non-corners) Line 55: A recursive process is applied to each subsets in order to select each x that could maximize the information gain. For example, at first an x is selected to partition P into P<sub>d</sub>, P<sub>s</sub>, P<sub>b</sub> with the most information; then for each subset P<sub>d</sub>, P<sub>s</sub>, P<sub>b</sub>, another y is selected to yield the most information gain (notice that the y could be the same as x ). This recursive process ends when the entropy is zero so that either all pixels in that subset are corners or non-corners. This generated [[decision tree]] can then be converted into programming code, such as [[C (programming language)\|C]] and [[C++]], which is just a bunch of nested if-else statements. For optimization purpose, [[profile-guided optimization]] is used to compile the code. The ~~complied~~compiled code is used as corner detector later for other images. Notice that the corners detected using this decision tree algorithm should be slightly different from the results using segment test detector. This is because that [[decision tree model]] depends on the training data, which could not cover all possible corners. Line 65: == FAST-ER: Enhanced repeatability == FAST-ER detector is an improvement of the FAST detector using a [[metaheuristic]] algorithm, in this case [[simulated annealing]]. So that after the optimization, the structure of the decision tree would be optimized and suitable for points with high repeatability. However, since [[simulated annealing]] is a metaheurisic algorithm, each time the algorithm would generate a different optimized decision tree. So it is better to take efficiently large amount of iterations to find a solution that is close to the real optimal. According to Rosten, it takes about 200 hours on a [[Pentium 4]] at 3 GHz which is 100 repeats of 100,000 iterations to optimize the FAST detector. == Comparison with other detectors == In Rosten's research,<ref>Edward Rosten, [http://arXiv.org/pdf/0810.2434 FASTER and better: A machine learning approach to corner detection]</ref> FAST and FAST-ER detector are evaluated on several different datasets and compared with the [[DoG]], [[Harris affine region detector\|Harris]], [[Harris–Laplace detector\|Harris-Laplace]], [[Shi-Tomasi]], and [[Corner detection#The SUSAN corner detector\|SUSAN]] corner detectors. The parameter settings for the detectors (other than FAST) are as follows: Line 138: \|} * Speed tests were performed on a 3.0 GHz [[Pentium D\|Pentium 4-D]] computer. The dataset are divided into a training set and a test set. The training set consists of 101 [[monochrome]] images with a resolution of 992×668 pixels. The test set consists of 4968 frames of [[monochrome]] 352×288 video. And the result is: {\| class="wikitable" \|- Line 164: == Bibliography == * {{cite book \| last=Rosten \| first=Edward \|author2=Tom Drummond \| title=Tenth IEEE International Conference on Computer Vision (ICCV'05) Volume 1 \| chapter=Fusing points and lines for high performance tracking \| ~~journal~~url=~~IEEE International Conference on Computer Vision~~http://edwardrosten.com/work/rosten_2005_tracking.pdf \| year=2005 \| doi=10.1109/ICCV.2005.104 \| volume=2\| pages=1508–1511\| isbn=978-0-7695-2334-7 \| citeseerx=10.1.1.60.4715 \| s2cid=1505168 }}▼ ~~\| url=http://edwardrosten.com/work/rosten_2005_tracking.pdf~~ ▲\| year=2005 \| doi=10.1109/ICCV.2005.104 \| volume=2\| pages=1508–1511\| isbn=978-0-7695-2334-7 \| citeseerx=10.1.1.60.4715 }} * {{cite journal \| last=Rosten \| first=Edward \|author2=Reid Porter \|author3=Tom Drummond \| title=FASTER and better: A machine learning approach to corner detection \| journal=IEEE Transactions on Pattern Analysis and Machine Intelligence \| arxiv=0810.2434\| year=2010 \| doi=10.1109/TPAMI.2008.275 \| pmid=19926902 \| volume=32\| issue=1 \| pages=105–119\| s2cid=206764370 }} * {{cite book \| last=Rosten \| first=Edward \|author2=Tom Drummond \| title=Computer Vision – ECCV 2006 \| chapter=Machine ~~learning~~Learning for ~~high~~High-~~speed~~Speed ~~corner~~Corner ~~detection~~Detection \| journal=European Conference on Computer Vision▼ ▲* {{cite book \| last=Rosten \| first=Edward \|author2=Tom Drummond \| title=Machine learning for high-speed corner detection \| journal=European Conference on Computer Vision \| url=http://edwardrosten.com/work/rosten_2006_machine.pdf \| year=2006 \| doi=10.1007/11744023_34 \| volume=1\| pages=430–443\| series=Lecture Notes in Computer Science \| isbn=978-3-540-33832-1 \| citeseerx=10.1.1.64.8513 \| s2cid=1388140 }} == External links ==