m Bot: Change redirected category Imaging sensors to Image sensors |
m Various citation & identifier cleanup, plus AWB genfixes (arxiv version pointless when published) |
== Uses ==
[[File:Track_Results.webm|thumb|An example of object tracking]]
Small object detection has applications in various fields such as video [[surveillance]] (including traffic video surveillance),<ref>{{Cite journal |last=Saran K B |last2=Sreelekha G |title=Traffic video surveillance: Vehicle detection and classification |url=http://ieeexplore.ieee.org/document/7432948/ |journal=2015 International Conference on Control Communication & Computing India (ICCC) |location=Trivandrum, Kerala, India |publisher=IEEE |pages=516–521 |doi=10.1109/ICCC.2015.7432948 |isbn=978-1-4673-7349-4}}</ref><ref>{{Cite journal |last=Nemade |first=Bhushan |date=2016-01-01 |title=Automatic Traffic Surveillance Using Video Tracking |url=https://www.sciencedirect.com/science/article/pii/S1877050916001836 |journal=Procedia Computer Science |series=Proceedings of International Conference on Communication, Computing and Virtualization (ICCCV) 2016 |language=en |volume=79 |pages=402–409 |doi=10.1016/j.procs.2016.03.052 |issn=1877-0509}}</ref> [[Content-based image retrieval|small object retrieval]],<ref>{{Cite journal |last=Guo |first=Haiyun |last2=Wang |first2=Jinqiao |last3=Xu |first3=Min |last4=Zha |first4=Zheng-Jun |last5=Lu |first5=Hanqing |date=2015-10-13 |title=Learning Multi-view Deep Features for Small Object Retrieval in Surveillance Scenarios |url=https://doi.org/10.1145/2733373.2806349 |journal=Proceedings of the 23rd ACM international conference on Multimedia |series=MM '15 |location=New York, NY, USA |publisher=Association for Computing Machinery |pages=859–862 |doi=10.1145/2733373.2806349 |isbn=978-1-4503-3459-4}}</ref><ref>{{Cite journal |last=Galiyawala |first=Hiren |last2=Raval |first2=Mehul S. |last3=Patel |first3=Meet |date=2022-05-20 |title=Person retrieval in surveillance videos using attribute recognition |url=https://doi.org/10.1007/s12652-022-03891-0 |journal=Journal of Ambient Intelligence and Humanized Computing |language=en |doi=10.1007/s12652-022-03891-0 |issn=1868-5145}}</ref> [[anomaly detection]],<ref>{{Cite journal |last=Ingle |first=Palash Yuvraj |last2=Kim |first2=Young-Gab |date=2022-05-19 |title=Real-Time Abnormal Object Detection for Video Surveillance in Smart Cities |url=https://www.mdpi.com/1424-8220/22/10/3862 |journal=Sensors |language=en |volume=22 |issue=10 |pages=3862 |doi=10.3390/s22103862 |issn=1424-8220 |pmc=9143895 |pmid=35632270}}</ref> [[maritime surveillance]], [[Aerial survey|drone surveying]], [[Traffic flow|traffic flow analysis]],<ref>{{Cite journal |last=Tsuboi |first=Tsutomu |last2=Yoshikawa |first2=Noriaki |date=2020-03-01 |title=Traffic flow analysis in Ahmedabad (India) |url=https://www.sciencedirect.com/science/article/pii/S2213624X18301974 |journal=Case Studies on Transport Policy |language=en |volume=8 |issue=1 |pages=215–228 |doi=10.1016/j.cstp.2019.06.001 |issn=2213-624X}}</ref> and [[Video tracking|object tracking]].
== Problems with small objects ==
* Modern-day object detection algorithms such as You Only Look Once (YOLO)<ref>{{
* Sometimes the shadow of an object is detected as part of the object itself.<ref>{{Cite journal |last=Zhang |first=Mingrui |last2=Zhao |first2=Wenbing |last3=Li |first3=Xiying |last4=Wang |first4=Dan |date=2020-12-11 |title=Shadow Detection Of Moving Objects In Traffic Monitoring Video |url=https://ieeexplore.ieee.org/document/9338958/ |journal=2020 IEEE 9th Joint International Information Technology and Artificial Intelligence Conference (ITAIC) |location=Chongqing, China |publisher=IEEE |pages=1983–1987 |doi=10.1109/ITAIC49862.2020.9338958 |isbn=978-1-7281-5244-8}}</ref> As a result, the bounding box tends to centre on the shadow rather than on the object. In the case of vehicle detection, [[pedestrian]] and two-wheeler detection suffer because of this.
* At present, [[Unmanned aerial vehicle|drones]] are very widely used in aerial imagery.<ref>{{Cite journal |title=Interactive workshop "How drones are changing the world we live in" |url=http://ieeexplore.ieee.org/document/7486437/ |journal=2016 Integrated Communications Navigation and Surveillance (ICNS) |location=Herndon, VA |publisher=IEEE |pages=1–17 |doi=10.1109/ICNSURV.2016.7486437 |isbn=978-1-5090-2149-9}}</ref> They are equipped with hardware ([[
[[File:Disp_shadow.jpg|thumb|Shadow and drone movement effect|alt=Both images are from the same video. The shadows of objects affect detection accuracy, and the drone's own movement changes the scene near the boundary (see the car at the bottom-left corner).]]
== Methods ==
[[File:Yolov5.jpg|thumb|YOLOv5 detection result]]
[[File:Y5_sahi.jpg|thumb|YOLOv5 and SAHI interface]]
[[File:Yolov7.jpg|thumb|YOLOv7 detection output]]
=== Improving existing techniques ===
==== Generating more data via augmentation, if required ====
[[Deep learning]] models can have millions or even billions of parameters whose weights are set during training. They therefore need a large amount of good-quality data to train well.<ref>{{Cite web |title=The Size and Quality of a Data Set {{!}} Machine Learning |url=https://developers.google.com/machine-learning/data-prep/construct/collect/data-size-quality |access-date=2022-09-14 |website=Google Developers |language=en}}</ref> [[Data augmentation]] is a useful technique for generating more diverse data<ref name=":0" /> from an existing data set.
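One of the simplest augmentations is a horizontal flip. The sketch below is illustrative only and is not taken from any particular library: it assumes an image stored as a list of pixel rows and bounding boxes given as <code>(x_min, y_min, x_max, y_max)</code> pixel coordinates, and the function name <code>hflip_with_boxes</code> is hypothetical. For detection data the boxes must be mirrored together with the image, otherwise the labels no longer match the objects.

```python
def hflip_with_boxes(image, boxes):
    """Mirror an image left-to-right and move its boxes to match.

    image: list of rows, each row a list of pixel values (assumed layout).
    boxes: list of (x_min, y_min, x_max, y_max) in pixel coordinates.
    """
    width = len(image[0])
    # Reverse every row to mirror the picture.
    flipped_image = [row[::-1] for row in image]
    # The left edge of a box becomes its right edge, measured from the far side.
    flipped_boxes = [
        (width - x_max, y_min, width - x_min, y_max)
        for (x_min, y_min, x_max, y_max) in boxes
    ]
    return flipped_image, flipped_boxes


# A 4 x 2 toy image with one box covering the leftmost column.
image = [[1, 2, 3, 4],
         [5, 6, 7, 8]]
flipped, new_boxes = hflip_with_boxes(image, [(0, 0, 1, 2)])
print(flipped)    # [[4, 3, 2, 1], [8, 7, 6, 5]]
print(new_boxes)  # [(3, 0, 4, 2)]
```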
==== Increasing image capture resolution and model’s input resolution ====
Higher resolutions give the model more pixels, and therefore more features, per object to learn from. For example, a bike in a 1280 × 1280 [[Image resolution|resolution]] image covers more pixels than the same bike in a 640 × 640 image.
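The effect can be shown with simple arithmetic. In the sketch below the object size of 32 × 32 pixels is an assumed value chosen for illustration, and <code>object_area_after_resize</code> is a hypothetical helper, not part of any detector. Halving the side length of a square image leaves the object with a quarter of its pixels.

```python
def object_area_after_resize(obj_w, obj_h, src_side, dst_side):
    """Pixel area an object covers after a square image is resized."""
    scale = dst_side / src_side
    return (obj_w * scale) * (obj_h * scale)


# A hypothetical 32 x 32 pixel bike captured in a 1280 x 1280 frame.
native = object_area_after_resize(32, 32, 1280, 1280)
downscaled = object_area_after_resize(32, 32, 1280, 640)
print(native, downscaled)  # 1024.0 256.0
```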
==== Auto learning anchors ====
Selecting anchor size plays a vital role in small object detection.<ref>{{
==== Tiling approach during training and inference ====
State-of-the-art object detectors accept only a fixed input size and resize the input image to match it. This resizing may deform the small objects in the image. The tiling approach<ref>{{Cite journal |last=Unel |first=F. Ozge |last2=Ozkalayci |first2=Burak O. |last3=Cigla |first3=Cevahir |title=The Power of Tiling for Small Object Detection |url=https://ieeexplore.ieee.org/document/9025422/ |journal=2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) |location=Long Beach, CA, USA |publisher=IEEE |pages=582–591 |doi=10.1109/CVPRW.2019.00084 |isbn=978-1-7281-2506-0}}</ref> helps when an image has a higher resolution than the model's fixed input size: instead of being scaled down, the image is broken into tiles, which are then used in training. The same approach is used during inference.
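A minimal sketch of how tile positions might be computed is given below. It is not the procedure from the cited paper: the function name <code>tile_coordinates</code>, the default tile size of 640 and the overlap fraction are assumptions made for illustration. Overlapping tiles are a common way to avoid cutting an object in half at a tile border, and the last tile in each direction is aligned with the image edge so that no pixels are skipped.

```python
def tile_coordinates(width, height, tile_size=640, overlap=0.2):
    """Return (x0, y0, x1, y1) boxes that together cover the whole image."""
    if not 0 <= overlap < 1:
        raise ValueError("overlap must be in [0, 1)")
    # Distance between the left (or top) edges of neighbouring tiles.
    stride = max(1, round(tile_size * (1 - overlap)))

    def starts(length):
        # An image no larger than one tile needs a single tile at 0.
        if length <= tile_size:
            return [0]
        positions = list(range(0, length - tile_size, stride))
        # Add a final tile flush with the far edge so nothing is left out.
        positions.append(length - tile_size)
        return positions

    return [
        (x, y, min(x + tile_size, width), min(y + tile_size, height))
        for y in starts(height)
        for x in starts(width)
    ]


# A 1280 x 1280 image split into non-overlapping 640 x 640 tiles.
for box in tile_coordinates(1280, 1280, tile_size=640, overlap=0.0):
    print(box)
```

Each tile can then be cropped and passed to the detector at its native input size. Detections from a tile have to be shifted by that tile's <code>(x0, y0)</code> offset to map them back to the coordinates of the full image.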
==== Feature Pyramid Network (FPN) ====
Use a feature [[Pyramid (image processing)|pyramid]] network<ref>{{
=== Add-on techniques ===
Rather than modifying existing methods, some add-on techniques can be placed directly on top of existing approaches to detect smaller objects. One such technique is Slicing Aided Hyper Inference (SAHI).<ref>{{
=== Well-optimised techniques for small object detection ===
Various deep learning techniques are available that focus on such object detection problems: e.g., Feature-Fused SSD,<ref>{{Cite journal |last=Cao |first=Guimei |last2=Xie |first2=Xuemei |last3=Yang |first3=Wenzhe |last4=Liao |first4=Quan |last5=Shi |first5=Guangming |last6=Wu |first6=Jinjian |date=2018-04-10 |title=Feature-fused SSD: fast detection for small objects |url=https://www.spiedigitallibrary.org/conference-proceedings-of-spie/10615/106151E/Feature-fused-SSD-fast-detection-for-small-objects/10.1117/12.2304811.full |journal=Ninth International Conference on Graphic and Image Processing (ICGIP 2017) |publisher=SPIE |volume=10615 |pages=381–388 |doi=10.1117/12.2304811}}</ref> YOLO-Z.<ref>{{
== Other applications ==
== External links ==
* [https://github.com/VisDrone/VisDrone-Dataset VisDrone] dataset by AISKYEYE team at Lab of Machine Learning and Data Mining, Tianjin University, China.
[[Category:Image sensors]]