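Journal of Shenzhen Institute of Information Technology

2023, Vol. 21, No. 05, pp. 1-8

U²-Net Image Semantic Segmentation Based on an Attention Mechanism

1. School of Computer Science, Shenzhen Institute of Information Technology; 2. College of Applied Technology, Shenzhen University; 3. School of Electronics and Information, Guangdong Polytechnic Normal University; 4. College of Electronics and Information Engineering, Shenzhen University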



Abstract:

Image semantic segmentation is an important technique in computer vision, with wide application in autonomous driving, medical image analysis, smart homes, and security monitoring. In recent years, deep-learning-based methods for image semantic segmentation have attracted extensive attention and research. However, deep learning models are prone to overfitting and tend to mispredict on images that contain occlusion or noise, which lowers segmentation accuracy. To address this problem, this paper proposes an optimized U²-Net image semantic segmentation method that incorporates an attention mechanism: a CBAM attention module is added to a U²-Net model that uses VGG as its backbone network, so that the network attends more to regions relevant to the segmentation task and ignores irrelevant or noisy regions. This strengthens the feature-map representation and thereby improves the model's performance and generalization ability. Experimental results show that adding the CBAM module improves the MIoU and accuracy of the U²-Net model by 8.21% and 4%, respectively.
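For orientation, CBAM is usually written as channel attention followed by spatial attention, as in its original formulation (Woo et al., ECCV 2018). The block below restates that general formulation only. The symbols are assumptions taken from that paper, not from this abstract, and the abstract does not say where in U²-Net the module is inserted. Here $F$ is an input feature map, $\otimes$ is element-wise multiplication (with broadcasting), $\sigma$ is the sigmoid function, $\mathrm{MLP}$ is a shared two-layer perceptron, and $f^{7\times 7}$ is a convolution with a $7\times 7$ kernel.

```latex
\begin{aligned}
F'     &= M_c(F) \otimes F, \\
F''    &= M_s(F') \otimes F', \\
M_c(F) &= \sigma\big(\mathrm{MLP}(\mathrm{AvgPool}(F)) + \mathrm{MLP}(\mathrm{MaxPool}(F))\big), \\
M_s(F) &= \sigma\big(f^{7\times 7}([\mathrm{AvgPool}(F);\ \mathrm{MaxPool}(F)])\big).
\end{aligned}
```

In $M_c$ the pooling runs over the spatial dimensions and yields one weight per channel. In $M_s$ the pooling runs over the channel dimension and yields one weight per spatial location.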

Keywords: image semantic segmentation; attention mechanism; U²-Net; deep learning
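The abstract reports gains in MIoU and accuracy without defining either metric. The sketch below computes both from a confusion matrix using the commonly used definitions. It assumes that "accuracy" means overall pixel accuracy, and the function names are hypothetical and do not come from the paper.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """Count pixels: rows are ground-truth classes, columns are predicted classes."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def pixel_accuracy(cm):
    """Fraction of all pixels whose predicted class matches the ground truth."""
    correct = sum(cm[i][i] for i in range(len(cm)))
    total = sum(sum(row) for row in cm)
    return correct / total


def mean_iou(cm):
    """Average of per-class IoU = TP / (TP + FP + FN)."""
    n = len(cm)
    ious = []
    for c in range(n):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                       # class-c pixels predicted as something else
        fp = sum(cm[r][c] for r in range(n)) - tp  # other pixels predicted as class c
        union = tp + fp + fn
        if union > 0:  # skip classes absent from both labels and predictions
            ious.append(tp / union)
    return sum(ious) / len(ious)


# Toy example: six flattened pixels, three classes.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
cm = confusion_matrix(y_true, y_pred, num_classes=3)
print(pixel_accuracy(cm), mean_iou(cm))
```

In this toy example the per-class IoU values are 1/3, 2/3, and 1/2, so the mean IoU is 0.5 while the pixel accuracy is 4/6. The gap shows why the two metrics can move by different amounts for the same model.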

For the full text, visit cnki.net.


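Basic information:

CLC number: TP391.41

Citation:

LIU Shuai, DENG Xiaobing, YANG Huoxiang, et al. U²-Net image semantic segmentation based on an attention mechanism[J]. Journal of Shenzhen Institute of Information Technology, 2023, 21(05): 1-8.

Release date:

2023-10-28

Publication date:

2023-10-28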
