Changing short description from "small object detection" to "Detecting small objects in digital images" |
m Cleanup, curly → straight quotes, punctuation before citation |
{{Short description|Detecting small objects in digital images}}
'''Small object detection''' is a particular case of [[object detection]] where various techniques are employed to detect small objects in digital images and videos.
== Uses ==
[[File:Track_Results.webm|thumb|An example of object tracking]]
== Problems with small objects ==
* Modern-day object detection algorithms such as You Only Look Once (YOLO)<ref>{{Cite journal |last=Redmon |first=Joseph |last2=Divvala |first2=Santosh |last3=Girshick |first3=Ross |last4=Farhadi |first4=Ali |date=2016-05-09 |title=You Only Look Once: Unified, Real-Time Object Detection |url=http://arxiv.org/abs/1506.02640 |journal=arXiv:1506.02640 [cs] |doi=10.48550/arxiv.1506.02640}}</ref><ref>{{Cite journal |last=Redmon |first=Joseph |last2=Farhadi |first2=Ali |date=2016-12-25 |title=YOLO9000: Better, Faster, Stronger |url=http://arxiv.org/abs/1612.08242 |journal=arXiv:1612.08242 [cs] |doi=10.48550/arxiv.1612.08242}}</ref><ref>{{Cite journal |last=Redmon |first=Joseph |last2=Farhadi |first2=Ali |date=2018-04-08 |title=YOLOv3: An Incremental Improvement |url=http://arxiv.org/abs/1804.02767 |journal=arXiv:1804.02767 [cs] |doi=10.48550/arxiv.1804.02767}}</ref><ref>{{Cite journal |last=Bochkovskiy |first=Alexey |last2=Wang |first2=Chien-Yao |last3=Liao |first3=Hong-Yuan Mark |date=2020-04-22 |title=YOLOv4: Optimal Speed and Accuracy of Object Detection |url=http://arxiv.org/abs/2004.10934 |journal=arXiv:2004.10934 [cs, eess] |doi=10.48550/arxiv.2004.10934}}</ref><ref>{{Cite journal |last=Wang |first=Chien-Yao |last2=Bochkovskiy |first2=Alexey |last3=Liao |first3=Hong-Yuan Mark |date=2021-02-21 |title=Scaled-YOLOv4: Scaling Cross Stage Partial Network |url=http://arxiv.org/abs/2011.08036 |journal=arXiv:2011.08036 [cs] |doi=10.48550/arxiv.2011.08036}}</ref><ref>{{Cite journal |last=Li |first=Chuyi |last2=Li |first2=Lulu |last3=Jiang |first3=Hongliang |last4=Weng |first4=Kaiheng |last5=Geng |first5=Yifei |last6=Li |first6=Liang |last7=Ke |first7=Zaidan |last8=Li |first8=Qingyuan |last9=Cheng |first9=Meng |last10=Nie |first10=Weiqiang |last11=Li |first11=Yiduo |last12=Zhang |first12=Bo |last13=Liang |first13=Yufei |last14=Zhou |first14=Linyuan |last15=Xu |first15=Xiaoming |date=2022-09-07 |title=YOLOv6: A Single-Stage Object Detection Framework for Industrial Applications |url=http://arxiv.org/abs/2209.02976 |journal=arXiv:2209.02976 [cs] |doi=10.48550/arxiv.2209.02976}}</ref><ref>{{Cite journal |last=Wang |first=Chien-Yao |last2=Bochkovskiy |first2=Alexey |last3=Liao |first3=Hong-Yuan Mark |date=2022-07-06 |title=YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors |url=http://arxiv.org/abs/2207.02696 |journal=arXiv:2207.02696 [cs] |doi=10.48550/arxiv.2207.02696}}</ref> rely heavily on convolution layers to learn [[Feature (computer vision)|features]]. As an object's representation passes through successive convolution and downsampling layers, its spatial size is reduced. A small object can therefore shrink to almost nothing after several layers and become undetectable.
* Sometimes, the shadow of an object is detected as part of the object itself.<ref>{{Cite journal |last=Zhang |first=Mingrui |last2=Zhao |first2=Wenbing |last3=Li |first3=Xiying |last4=Wang |first4=Dan |date=2020-12-11 |title=Shadow Detection Of Moving Objects In Traffic Monitoring Video |url=https://ieeexplore.ieee.org/document/9338958/ |journal=2020 IEEE 9th Joint International Information Technology and Artificial Intelligence Conference (ITAIC) |location=Chongqing, China |publisher=IEEE |pages=1983–1987 |doi=10.1109/ITAIC49862.2020.9338958 |isbn=978-1-7281-5244-8}}</ref> As a result, the bounding box tends to be centred on the shadow rather than on the object. In vehicle detection, the detection of [[pedestrian]]s and two-wheelers suffers in particular because of this.
* At present, [[Unmanned aerial vehicle|drones]] are very widely used in aerial imagery.<ref>{{Cite journal |title=Interactive workshop "How drones are changing the world we live in" |url=http://ieeexplore.ieee.org/document/7486437/ |journal=2016 Integrated Communications Navigation and Surveillance (ICNS) |location=Herndon, VA |publisher=IEEE |pages=1–17 |doi=10.1109/ICNSURV.2016.7486437 |isbn=978-1-5090-2149-9}}</ref> They are equipped with
[[File:Disp_shadow.jpg|thumb|Shadow and drone movement effect]]
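The first problem listed above can be illustrated with simple arithmetic. The sketch below is not taken from any cited paper; it assumes a detector whose feature maps are downsampled by total strides of 8, 16 and 32 (values common in YOLO-style networks, used here only as an illustration), and the function name is hypothetical.

```python
def footprint_after_downsampling(object_px, strides=(8, 16, 32)):
    """Approximate side length, in feature-map cells, of an object
    that is `object_px` pixels wide in the input image.

    The strides are assumed values for illustration, not a property
    of any specific model.
    """
    return {stride: object_px / stride for stride in strides}


# A 16-pixel object covers half a cell at stride 32,
# while a 256-pixel object still covers 8 cells.
small = footprint_after_downsampling(16)
large = footprint_after_downsampling(256)
print(small)
print(large)
```

Once the footprint drops below one cell, the object shares that cell with background, which is one way to see why it becomes hard to detect.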
== Methods ==
Various methods<ref>{{Cite web |title=An Evaluation of Deep Learning Methods for Small Object Detection |url=https://www.hindawi.com/journals/jece/2020/3189691/ |access-date=2022-09-14 |website=www.hindawi.com |language=en |doi=10.1155/2020/3189691}}</ref> are available to detect small objects, which
[[File:Yolov5.jpg|thumb|YOLOv5 detection result]]
[[File:Y5_sahi.jpg|thumb|YOLOv5 and SAHI interface]]
==== Choosing a data set that has small objects ====
The [[machine learning]] model's output depends on
==== Generating more data via augmentation, if required ====
[[Deep learning]] models can have millions or even billions of parameters, whose values (weights) are set during training. They therefore require a large quantity of high-quality data to train well.<ref>{{Cite web |title=The Size and Quality of a Data Set {{!}} Machine Learning |url=https://developers.google.com/machine-learning/data-prep/construct/collect/data-size-quality |access-date=2022-09-14 |website=Google Developers |language=en}}</ref> [[Data augmentation]] is a useful technique for generating more diverse data<ref name=":0" /> from an existing data set.
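As a minimal sketch of one augmentation, a horizontal flip has to update the bounding boxes along with the pixels. The example assumes NumPy, images stored as height × width arrays, and boxes given as [x_min, y_min, x_max, y_max] in pixels; the function name is hypothetical and other augmentations (scaling, cropping, mosaic) are not shown.

```python
import numpy as np


def hflip_with_boxes(image, boxes):
    """Flip an image left-right and mirror its bounding boxes.

    image: array of shape (H, W) or (H, W, C).
    boxes: array-like of [x_min, y_min, x_max, y_max] in pixels (assumed format).
    """
    width = image.shape[1]
    flipped = image[:, ::-1].copy()
    boxes = np.asarray(boxes, dtype=float).reshape(-1, 4)
    new_boxes = boxes.copy()
    # The old right edge becomes the new left edge, and vice versa.
    new_boxes[:, 0] = width - boxes[:, 2]
    new_boxes[:, 2] = width - boxes[:, 0]
    return flipped, new_boxes


# A 4x10 image with a single bright pixel inside a one-pixel box.
image = np.zeros((4, 10))
image[1, 2] = 1.0
flipped, new_boxes = hflip_with_boxes(image, [[2, 1, 3, 2]])
print(new_boxes)
```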
==== Increasing image capture resolution and model's input resolution ====
==== Auto learning anchors ====
Selecting anchor size plays a vital role in small object detection.<ref>{{Cite journal |last=Zhong |first=Yuanyi |last2=Wang |first2=Jianfeng |last3=Peng |first3=Jian |last4=Zhang |first4=Lei |date=2020-01-26 |title=Anchor Box Optimization for Object Detection |url=http://arxiv.org/abs/1812.00469 |journal=arXiv:1812.00469 [cs] |doi=10.48550/arxiv.1812.00469}}</ref> Instead of hand picking it, use algorithms that identify it
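One common way to learn anchors from data, shown here only as a sketch and not necessarily the method of the cited paper, is to cluster the widths and heights of the training boxes with k-means, using IoU instead of Euclidean distance. The function names, the seed and the sample boxes below are assumptions for illustration.

```python
import numpy as np


def iou_wh(boxes, anchors):
    """IoU between (w, h) pairs, assuming boxes and anchors share a corner."""
    inter = (np.minimum(boxes[:, None, 0], anchors[None, :, 0])
             * np.minimum(boxes[:, None, 1], anchors[None, :, 1]))
    area_b = boxes[:, 0] * boxes[:, 1]
    area_a = anchors[:, 0] * anchors[:, 1]
    return inter / (area_b[:, None] + area_a[None, :] - inter)


def kmeans_anchors(boxes, k, iters=50, seed=0):
    """Cluster (w, h) pairs into k anchors; returns anchors sorted by area."""
    boxes = np.asarray(boxes, dtype=float)
    rng = np.random.default_rng(seed)
    anchors = boxes[rng.choice(len(boxes), size=k, replace=False)]
    for _ in range(iters):
        # Assign each box to the anchor it overlaps most.
        assign = np.argmax(iou_wh(boxes, anchors), axis=1)
        new_anchors = anchors.copy()
        for j in range(k):
            members = boxes[assign == j]
            if len(members):
                new_anchors[j] = members.mean(axis=0)
        if np.allclose(new_anchors, anchors):
            break
        anchors = new_anchors
    return anchors[np.argsort(anchors[:, 0] * anchors[:, 1])]


# Hypothetical box sizes: three small objects and three large ones.
sizes = [(10, 10), (12, 11), (9, 12), (100, 120), (110, 100), (95, 105)]
anchors = kmeans_anchors(sizes, k=2)
print(anchors)
```

With a data set dominated by small objects, the learned anchors shift towards small sizes, which hand-picked defaults may not do.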
==== Tiling approach during training and inference ====
State-of-the-art object detectors accept only a fixed image size and
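A minimal sketch of the tiling step, assuming square tiles and a fractional overlap between neighbouring tiles, is given below. The tile size of 640 pixels, the 20% overlap and the function name are illustrative assumptions rather than values from the source.

```python
def tile_coordinates(width, height, tile=640, overlap=0.2):
    """Return (x0, y0, x1, y1) windows that cover a width x height image.

    `tile` and `overlap` are assumed defaults for illustration.
    """
    step = max(1, int(tile * (1 - overlap)))

    def starts(length):
        if length <= tile:
            return [0]
        positions = list(range(0, length - tile, step))
        positions.append(length - tile)  # last tile sits flush with the border
        return positions

    return [(x, y, min(x + tile, width), min(y + tile, height))
            for y in starts(height)
            for x in starts(width)]


# A 1000x640 image is covered by two overlapping 640x640 tiles.
tiles = tile_coordinates(1000, 640)
print(tiles)
```

Each tile can then be passed to the detector at its native input size, so small objects keep more pixels than they would after shrinking the whole image.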
==== Feature Pyramid Network (FPN) ====
Use a feature [[Pyramid (image processing)|pyramid]] network<ref>{{Cite journal |last=Lin |first=Tsung-Yi |last2=Dollár |first2=Piotr |last3=Girshick |first3=Ross |last4=He |first4=Kaiming |last5=Hariharan |first5=Bharath |last6=Belongie |first6=Serge |date=2017-04-19 |title=Feature Pyramid Networks for Object Detection |url=http://arxiv.org/abs/1612.03144 |journal=arXiv:1612.03144 [cs] |doi=10.48550/arxiv.1612.03144}}</ref> to learn features at a multi-scale
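The top-down pathway of a feature pyramid can be sketched as follows. This is a simplified illustration using NumPy only: it assumes every level already has the same number of channels and that each level is half the size of the previous one, and it leaves out the 1×1 lateral and 3×3 smoothing convolutions used in the published network. The function names are hypothetical.

```python
import numpy as np


def upsample2x(x):
    """Nearest-neighbour upsampling of a (C, H, W) array by a factor of 2."""
    return x.repeat(2, axis=-2).repeat(2, axis=-1)


def top_down_merge(features):
    """Merge backbone features from coarse to fine.

    features: list of (C, H, W) arrays ordered fine to coarse (assumed layout).
    Returns a list in the same order, where each level also carries
    information from all coarser levels.
    """
    merged = [features[-1]]
    for lateral in reversed(features[:-1]):
        merged.append(lateral + upsample2x(merged[-1]))
    return merged[::-1]


# Three hypothetical backbone levels with constant values 1, 2 and 3.
c3 = np.full((4, 8, 8), 1.0)
c4 = np.full((4, 4, 4), 2.0)
c5 = np.full((4, 2, 2), 3.0)
p3, p4, p5 = top_down_merge([c3, c4, c5])
print(p3.shape, p4.shape, p5.shape)
```

The finest level keeps its high resolution while gaining context from coarser levels, which is the property that helps with small objects.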
=== Add-on techniques ===
Instead of modifying existing methods, some add-on techniques can be placed directly on top of existing approaches to detect smaller objects. One such technique is Slicing Aided Hyper Inference (SAHI).<ref>{{Cite journal |last=Akyon |first=Fatih Cagatay |last2=Altinuc |first2=Sinan Onur |last3=Temizel |first3=Alptekin |date=2022-07-12 |title=Slicing Aided Hyper Inference and Fine-tuning for Small Object Detection |url=http://arxiv.org/abs/2202.06934 |journal=arXiv:2202.06934 [cs] |doi=10.48550/arxiv.2202.06934}}</ref>
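Slicing-based inference runs the detector on overlapping slices and then merges the per-slice detections. The sketch below illustrates only that merge step with greedy [[non-maximum suppression]]; it is not the API of the SAHI library, and the function names, the input layout and the 0.5 IoU threshold are assumptions for illustration.

```python
import numpy as np


def iou(box, others):
    """IoU between one box and an (N, 4) array of [x0, y0, x1, y1] boxes."""
    x0 = np.maximum(box[0], others[:, 0])
    y0 = np.maximum(box[1], others[:, 1])
    x1 = np.minimum(box[2], others[:, 2])
    y1 = np.minimum(box[3], others[:, 3])
    inter = np.clip(x1 - x0, 0, None) * np.clip(y1 - y0, 0, None)
    area_box = (box[2] - box[0]) * (box[3] - box[1])
    area_others = (others[:, 2] - others[:, 0]) * (others[:, 3] - others[:, 1])
    return inter / (area_box + area_others - inter)


def merge_slice_detections(slice_results, iou_threshold=0.5):
    """slice_results: list of ((x_offset, y_offset), boxes, scores) per slice.

    Boxes are in slice coordinates; the result is in full-image coordinates.
    """
    all_boxes, all_scores = [], []
    for (x_off, y_off), boxes, scores in slice_results:
        boxes = np.asarray(boxes, dtype=float).reshape(-1, 4)
        # Shift slice-local boxes back into full-image coordinates.
        all_boxes.append(boxes + np.array([x_off, y_off, x_off, y_off]))
        all_scores.append(np.asarray(scores, dtype=float))
    boxes = np.concatenate(all_boxes)
    scores = np.concatenate(all_scores)

    order = np.argsort(-scores)  # highest score first
    keep = []
    while order.size:
        best, rest = order[0], order[1:]
        keep.append(best)
        # Drop remaining boxes that overlap the kept box too much.
        order = rest[iou(boxes[best], boxes[rest]) < iou_threshold]
    return boxes[keep], scores[keep]


# The same object is seen in two overlapping slices; one copy is suppressed.
results = [
    ((0, 0), [[600, 100, 630, 130]], [0.9]),
    ((512, 0), [[90, 101, 119, 131], [300, 300, 320, 320]], [0.8, 0.7]),
]
merged_boxes, merged_scores = merge_slice_detections(results)
print(merged_boxes)
```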
=== Well-optimised techniques for small object detection ===
Various deep learning techniques are available that focus on such object detection problems
== Other applications ==