Small object detection: Difference between revisions

== Problems with small objects ==
 
* Modern object detection algorithms such as [[You Only Look Once(YOLO)]]<ref>{{cite arXiv |last1=Redmon |first1=Joseph |last2=Divvala |first2=Santosh |last3=Girshick |first3=Ross |last4=Farhadi |first4=Ali |date=2016-05-09 |title=You Only Look Once: Unified, Real-Time Object Detection |class=cs.CV |eprint=1506.02640}}</ref><ref>{{cite arXiv |last1=Redmon |first1=Joseph |last2=Farhadi |first2=Ali |date=2016-12-25 |title=YOLO9000: Better, Faster, Stronger |class=cs.CV |eprint=1612.08242}}</ref><ref>{{cite arXiv |last1=Redmon |first1=Joseph |last2=Farhadi |first2=Ali |date=2018-04-08 |title=YOLOv3: An Incremental Improvement |class=cs.CV |eprint=1804.02767}}</ref><ref>{{cite arXiv |last1=Bochkovskiy |first1=Alexey |last2=Wang |first2=Chien-Yao |last3=Liao |first3=Hong-Yuan Mark |date=2020-04-22 |title=YOLOv4: Optimal Speed and Accuracy of Object Detection |class=cs.CV |eprint=2004.10934}}</ref><ref>{{cite arXiv |last1=Wang |first1=Chien-Yao |last2=Bochkovskiy |first2=Alexey |last3=Liao |first3=Hong-Yuan Mark |date=2021-02-21 |title=Scaled-YOLOv4: Scaling Cross Stage Partial Network |class=cs.CV |eprint=2011.08036}}</ref><ref>{{cite arXiv |last1=Li |first1=Chuyi |last2=Li |first2=Lulu |last3=Jiang |first3=Hongliang |last4=Weng |first4=Kaiheng |last5=Geng |first5=Yifei |last6=Li |first6=Liang |last7=Ke |first7=Zaidan |last8=Li |first8=Qingyuan |last9=Cheng |first9=Meng |last10=Nie |first10=Weiqiang |last11=Li |first11=Yiduo |last12=Zhang |first12=Bo |last13=Liang |first13=Yufei |last14=Zhou |first14=Linyuan |last15=Xu |first15=Xiaoming |date=2022-09-07 |title=YOLOv6: A Single-Stage Object Detection Framework for Industrial Applications |class=cs.CV |eprint=2209.02976}}</ref><ref>{{cite arXiv |last1=Wang |first1=Chien-Yao |last2=Bochkovskiy |first2=Alexey |last3=Liao |first3=Hong-Yuan Mark |date=2022-07-06 |title=YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors |class=cs.CV |eprint=2207.02696}}</ref> rely heavily on convolution layers to learn [[Feature (computer vision)|features]]. As an image passes through strided convolution and pooling layers, the spatial resolution of the feature maps is reduced, so an object occupies fewer and fewer feature-map cells. A small object can therefore shrink to less than one cell after several layers and become undetectable.
* Sometimes, the shadow of an object is detected as part of the object itself.<ref>{{Cite book |last1=Zhang |first1=Mingrui |last2=Zhao |first2=Wenbing |last3=Li |first3=Xiying |last4=Wang |first4=Dan |title=2020 IEEE 9th Joint International Information Technology and Artificial Intelligence Conference (ITAIC) |chapter=Shadow Detection of Moving Objects in Traffic Monitoring Video |date=2020-12-11 |chapter-url=https://ieeexplore.ieee.org/document/9338958 |volume=9 |location=Chongqing, China |publisher=IEEE |pages=1983–1987 |doi=10.1109/ITAIC49862.2020.9338958 |isbn=978-1-7281-5244-8|s2cid=231824327 }}</ref> The bounding box then tends to be centred on the shadow rather than on the object. In vehicle detection settings, [[pedestrian]] and two-wheeler detection suffers because of this.
* [[Unmanned aerial vehicle|Drones]] are widely used for aerial imagery.<ref>{{Cite book |chapter-url=https://ieeexplore.ieee.org/document/7486437 |year=2016 |location=Herndon, VA |publisher=IEEE |pages=1–17 |doi=10.1109/ICNSURV.2016.7486437 |isbn=978-1-5090-2149-9|s2cid=21388151 |chapter=Interactive workshop "How drones are changing the world we live in" |title=2016 Integrated Communications Navigation and Surveillance (ICNS) }}</ref> They are equipped with hardware ([[sensor]]s) and software ([[algorithm]]s) that help them hold a stable position during flight. In windy conditions, the drone automatically makes small corrective movements to maintain its position, which shifts the view near the image boundary, so new objects may appear there from one frame to the next. Overall, these effects reduce classification and detection accuracy and, eventually, tracking accuracy.
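
The first problem in the list, the loss of small objects through downsampling, can be illustrated with simple arithmetic. The sketch below is only an illustration: the strides of 8, 16 and 32 are an assumption (they match the common prediction scales of detectors such as YOLOv3), the one-cell threshold is an arbitrary rule of thumb, and the function names are hypothetical rather than part of any library.

```python
# Illustrative sketch: how many feature-map cells an object spans
# after the network has downsampled the image by a given total stride.

# Assumed total strides of the prediction layers (not taken from the article).
STRIDES = (8, 16, 32)


def feature_map_footprint(object_px: float, stride: int) -> float:
    """Side length, in feature-map cells, of an object that is
    object_px pixels wide in the input image."""
    return object_px / stride


def survives(object_px: float, stride: int, min_cells: float = 1.0) -> bool:
    """True if the object still spans at least min_cells cells.
    The one-cell default is a rule of thumb, not a hard limit."""
    return feature_map_footprint(object_px, stride) >= min_cells


def footprint_table(object_px: float, strides=STRIDES) -> dict:
    """Footprint of the object at each assumed stride."""
    return {s: feature_map_footprint(object_px, s) for s in strides}


if __name__ == "__main__":
    for size in (12, 64):
        print(f"{size}px object:", footprint_table(size))
```

Under these assumptions, a 12-pixel object spans 1.5 cells at stride 8 but less than one cell at strides 16 and 32, whereas a 64-pixel object still spans two cells at stride 32.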