Authors
Aorui GOU Jingjing LIU Xiaoxiang CHEN Xiaoyang ZENG Yibo FAN
Publisher
The Institute of Electronics, Information and Communication Engineers
Journal
IEICE TRANSACTIONS on Fundamentals of Electronics, Communications and Computer Sciences (ISSN:09168508)
Volume, issue, pages, and publication date
vol.E107-A, no.1, pp.141-156, 2024-01-01

Convolutional Neural Networks (CNNs) and Transformers have achieved remarkable performance in detection and classification tasks. Nevertheless, their feature extraction cannot consider local and global information at the same time, so detection and classification performance can be further improved. In addition, deep learning networks are becoming increasingly complex, which significantly increases the computation and storage they require. This paper proposes a combination of CNN and Transformer, and designs a local feature enhancement module and a global context modeling module to enhance the cascade network. The local feature enhancement module increases the range of feature extraction, while the global context modeling module captures the global information of the feature maps. To decrease model complexity, a shared sublayer is designed to share weight parameters between adjacent convolutional layers or across convolutional layers, thereby reducing the number of convolutional weight parameters. Moreover, to improve the detection performance of neural networks without increasing network parameters, an optimal transport assignment approach is proposed to resolve the problem of label assignment. The classification loss and regression loss are the summations of the cost between the demander and the supplier. The experimental results demonstrate that the proposed Combination of CNN and Transformer with Shared Sublayer (CCTSS) performs better than state-of-the-art methods on various datasets and applications.
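The parameter saving from sharing convolutional weights can be illustrated with a small counting sketch. This is not the CCTSS implementation: the layer shapes, the helper names `conv_params` and `stack_params`, and the two sharing patterns are assumptions chosen for illustration, since the abstract does not specify the shared sublayer's design.

```python
def conv_params(in_ch, out_ch, k, bias=True):
    """Number of learnable parameters in one k x k convolution layer."""
    return in_ch * out_ch * k * k + (out_ch if bias else 0)


def stack_params(layers, shared_with=None):
    """Total parameters for a list of (in_ch, out_ch, k) layers.

    shared_with maps a layer index to the index of the layer whose
    weights it reuses (hypothetical scheme); a layer that reuses
    weights adds no new parameters.
    """
    shared_with = shared_with or {}
    total = 0
    for i, shape in enumerate(layers):
        owner = shared_with.get(i, i)
        if owner != i:
            # Weights can only be reused between layers of equal shape.
            if layers[owner] != shape:
                raise ValueError("shared layers must have identical shapes")
            continue
        total += conv_params(*shape)
    return total


# Assumed example: four 3x3 convolutions with 64 input and output channels.
layers = [(64, 64, 3)] * 4

independent = stack_params(layers)
# Sharing between adjacent layers: layer 1 reuses layer 0, layer 3 reuses layer 2.
adjacent_shared = stack_params(layers, {1: 0, 3: 2})
# Sharing across layers: layer 2 reuses layer 0, layer 3 reuses layer 1.
cross_shared = stack_params(layers, {2: 0, 3: 1})

print(independent, adjacent_shared, cross_shared)
```

In this assumed configuration, both sharing patterns halve the number of convolutional parameters, because two of the four layers own their weights and the other two reuse them.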
Authors
Weina ZHOU Xinxin HUANG Xiaoyang ZENG
Publisher
The Institute of Electronics, Information and Communication Engineers
Journal
IEICE TRANSACTIONS on Information and Systems (ISSN:09168532)
Volume, issue, pages, and publication date
vol.E105-D, no.8, pp.1393-1400, 2022-08-01
Citation count
4

Unmanned Surface Vehicles (USVs), a kind of marine vehicle, are widely used in military and civilian fields because of their low cost, good concealment, strong mobility, and high speed. High-precision obstacle detection plays an important role in USV autonomous navigation because it underpins subsequent path planning. To further improve obstacle detection performance, we propose an encoder-decoder architecture named Fusion Refinement Network (FRN). The encoder uses a deeper network structure, which enables it to extract richer visual features. In particular, a dilated convolution layer is used in the encoder to obtain obstacle features over a large range in complex marine environments. The decoder performs multi-path feature fusion. Attention Refinement Modules (ARM) are added to optimize features, and a learnable fusion algorithm called Feature Fusion Module (FFM) is used to fuse visual information. Experimental validation on three different datasets of real marine images shows that FRN outperforms state-of-the-art semantic segmentation networks. The MIoU and MPA of FRN peak at 97.01% and 98.37%, respectively. Moreover, FRN maintains high accuracy with only 27.67M parameters, far fewer than the latest obstacle detection network for USVs (WaSR).
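For readers unfamiliar with the two reported metrics, the sketch below computes mean pixel accuracy (MPA) and mean intersection over union (MIoU) from a confusion matrix using their commonly used definitions. The function names and the toy labels are assumptions for illustration. The abstract does not state the exact evaluation protocol, so this may differ in detail from how the authors computed their figures.

```python
def confusion_matrix(truth, pred, num_classes):
    """cm[t][p] counts pixels of true class t predicted as class p."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(truth, pred):
        cm[t][p] += 1
    return cm


def mean_pixel_accuracy(cm):
    """Average over classes of (correct pixels) / (pixels of that class)."""
    scores = []
    for i, row in enumerate(cm):
        total = sum(row)
        if total:  # skip classes absent from the ground truth
            scores.append(row[i] / total)
    return sum(scores) / len(scores)


def mean_iou(cm):
    """Average over classes of intersection / union."""
    n = len(cm)
    scores = []
    for i in range(n):
        true_count = sum(cm[i])
        pred_count = sum(cm[j][i] for j in range(n))
        union = true_count + pred_count - cm[i][i]
        if union:  # skip classes that never appear
            scores.append(cm[i][i] / union)
    return sum(scores) / len(scores)


# Toy flattened label maps (assumed data): 0 = water, 1 = obstacle.
truth = [0, 0, 0, 0, 1, 1, 1, 1]
pred = [0, 0, 0, 1, 1, 1, 1, 0]

cm = confusion_matrix(truth, pred, num_classes=2)
print(mean_pixel_accuracy(cm), mean_iou(cm))
```

On this toy input each class has three of four pixels correct, so MPA is 0.75, while each class has an intersection of 3 over a union of 5, so MIoU is 0.6. MIoU is the stricter of the two because false positives enlarge the union.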