Transformer object detection in street view images based on improved YOLOv8
LIAO Fangzhou1, YANG Xiaoxia1, YANG Ronghao2, SHI Qiqi2
1. College of Geography and Planning, Chengdu University of Technology, Chengdu 610059; 2. College of Earth and Planetary Science, Chengdu University of Technology, Chengdu 610059
Abstract: Street view images are a form of geospatial big data collected at the urban street level. Using them for transformer inspection allows large areas to be covered efficiently and reduces inspection costs. However, transformers in street view images often occupy few pixels, appear at low resolution, and sit against complex backgrounds, so existing object detection methods achieve unsatisfactory precision. To address these issues, this paper proposes an improved YOLOv8 algorithm named YOLOv8-WSX. First, wise intersection over union (WIoU) is used as the loss function to strengthen detection performance on difficult samples. Second, the spatial group-wise enhance (SGE) attention module is introduced to improve feature extraction. Finally, an extra-small object detection head is added to reduce missed detections of extra-small transformer objects. Experimental results show that, compared with YOLOv8, YOLOv8-WSX increases the F1 score by 5.9 percentage points, the mean average precision at an IoU threshold of 50% by 6.3 percentage points, and the mean average precision averaged over IoU thresholds from 50% to 95% by 3.2 percentage points. The improved model also has fewer parameters.
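The evaluation quantities named in the abstract (IoU and the F1 score) can be illustrated with a minimal Python sketch. This is not the authors' evaluation code: the function names `iou` and `f1_score`, the `(x1, y1, x2, y2)` corner box format, and the example counts are assumptions made for illustration only. It shows plain IoU, not the WIoU loss used for training.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes.

    Boxes are (x1, y1, x2, y2) corner tuples; this format is an
    assumption for the sketch, not taken from the paper.
    """
    # Corners of the overlapping rectangle (if any).
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])

    # Clamp to zero when the boxes do not overlap.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall from detection counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # A prediction counts as a true positive at "IoU at 50%" when
    # its IoU with a ground-truth box is at least 0.5.
    predicted = (0, 0, 2, 2)      # hypothetical boxes
    ground_truth = (1, 1, 3, 3)
    print(iou(predicted, ground_truth) >= 0.5)
    print(f1_score(tp=8, fp=2, fn=2))
```

Under this convention, a detection is matched to a ground-truth transformer only if their IoU reaches the chosen threshold, which is why small, low-resolution objects are hard: a shift of a few pixels can push the IoU below 0.5.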
Received: 14 June 2024
Cite this article:
LIAO Fangzhou, YANG Xiaoxia, YANG Ronghao, et al. Transformer object detection in street view images based on improved YOLOv8[J]. Electrical Engineering, 2024, 25(12): 12-20.
URL:
https://dqjs.cesmedia.cn/EN/Y2024/V25/I12/12