2024 Vol. 50, No. 2

An overview of visual SLAM methods
WANG Peng, HAO Weilong, NI Cui, ZHANG Guangyuan, GONG Hui
2024, 50(2): 359-367. doi: 10.13700/j.bh.1001-5965.2022.0376
Abstract:

Simultaneous localization and mapping (SLAM) enables mobile robots to calculate their position and pose by independently building an environment model during movement without any environmental prior conditions by carrying specific sensors. It can greatly improve the autonomous navigation ability of ...

Zero-shot object detection based on multi-modal joint semantic perception
DUAN Lijuan, YUAN Ying, WANG Wenjian, LIANG Fangfang
2024, 50(2): 368-375. doi: 10.13700/j.bh.1001-5965.2022.0392
Abstract:

Existing zero-shot object detection maps visual features and category semantic embeddings of unseen items to the same space using semantic embeddings as guiding information, and then classifies the objects based on how close together the visual features and semantic embeddings are in the mapped spac...

Accurate license plate location based on synchronous vertex and body region detection
XU Guangzhu, LIU Gaofei, KUANG Wan, WAN Qiubo, MA Guoliang, LEI Bangjun
2024, 50(2): 376-387. doi: 10.13700/j.bh.1001-5965.2022.0396
Abstract:

A novel unconstrained license plate accurate location algorithm is designed by simultaneously detecting the four local vertex regions and the body of a license plate and fusing the results to address the issue that the widely used rectangular bounding boxes in mainstream target detection methods can...

Image super-resolution reconstruction network based on expectation maximization self-attention residual
HUANG Shuying, HU Hanyang, YANG Yong, WAN Weiguo, WU Zheng
2024, 50(2): 388-397. doi: 10.13700/j.bh.1001-5965.2022.0401
Abstract:

In recent years, most deep learning-based image super-resolution (SR) reconstruction methods mainly improve the quality of image reconstruction by increasing the depth of the model, while also increasing the computational cost of the model. Additionally, a lot of networks have implemented the attent...

Discrete sparrow search algorithm incorporating rough data-deduction for solving hybrid flow-shop scheduling problems
ZHOU Ning, ZHANG Songlin, ZHANG Chen
2024, 50(2): 398-408. doi: 10.13700/j.bh.1001-5965.2022.0424
Abstract:

To address the shortcomings of the sparrow search algorithm (SSA), such as its tendency to fall into local optima and its inability to solve discrete optimization problems, an improved discrete sparrow search algorithm (IDSSA) is proposed. Firstly, the position update formula of the original sparrow search algori...

Aspect sentiment triple extraction for grammar-weighted graph text
HAN Hu, MENG Tiantian
2024, 50(2): 409-418. doi: 10.13700/j.bh.1001-5965.2022.0443
Abstract:

Aspect sentiment triple extraction includes three tasks: aspect term extraction, opinion term extraction, and aspect sentiment classification. However, research methods that solve this task in a pipeline way cannot utilize the interaction information between elements, and will also cause error propa...

Real-time robust visual tracking based on spatial attention mechanism
MA Sugang, ZHANG Zixian, PU Lei, HOU Zhiqiang
2024, 50(2): 419-432. doi: 10.13700/j.bh.1001-5965.2022.0329
Abstract:

A real-time object tracking method coupled with a spatial attention mechanism is suggested in order to enhance the fully convolutional Siamese network (SiamFC) tracker’s tracking capability in complex settings and alleviate the target drift problem in the tracking process. The improved visual geomet...

Cross-modality nearest neighbor loss for visible-infrared person re-identification
ZHAO Sanyuan, A Qi, GAO Yu
2024, 50(2): 433-441. doi: 10.13700/j.bh.1001-5965.2022.0422
Abstract:

The goal of the visible-infrared person re-identification task is, given an image of a specific person in one modality, to find the corresponding images of the same person in an image set captured by other cameras in a different modality. Due to the different imaging methods, there are obvious ...

A Transformer based deep conditional video compression
LU Guo, ZHONG Tianxiong, GENG Jing
2024, 50(2): 442-448. doi: 10.13700/j.bh.1001-5965.2022.0374
Abstract:

Convolutional neural networks (CNN) are the foundation of most recent learning-based video compression algorithms, which also use residual coding and motion compensation architectures. It is difficult to attain the best compression performance given that typical CNN can only use local correlations a...

Solidly mounted resonator based on optimized Bragg structure
ZHANG Shifeng, XUAN Weipeng, SHI Linhao, DONG Shurong, PU Shiliang
2024, 50(2): 449-455. doi: 10.13700/j.bh.1001-5965.2022.0436
Abstract:

The effective coupling coefficient and the quality factor of the bulk acoustic wave resonators determine the overall performance of bulk acoustic wave filters. The effective coupling coefficient is dependent on the layer stack structure, especially the piezoelectric material, while the quality facto...

Image captioning model based on divergence-based and spatial consistency constraints
JIANG Wenhui, CHEN Zhiliang, CHENG Yibo, FANG Yuming, ZUO Yifan
2024, 50(2): 456-465. doi: 10.13700/j.bh.1001-5965.2022.0400
Abstract:

The multi-head attention mechanism has been widely adopted in image captioning. It is appealing for the ability to jointly attend to information from different representation subspaces. However, as each head captures distinct properties of the input individually, the diversity between heads’ represe...

A person re-identification method for fusing convolutional attention and Transformer architecture
WANG Jing, LI Peitong, ZHAO Rongfeng, ZHANG Yun, MA Zhenling
2024, 50(2): 466-476. doi: 10.13700/j.bh.1001-5965.2022.0456
Abstract:

Person re-identification technology is one of the important methods in intelligent security systems. In order to build a person re-identification model suitable for various complex scenarios, this article proposes a method of Fusing Convolutional Attention and Transformer architecture (FCAT) based o...

Periodic pattern-enhanced multi-view short-term load prediction
SU Wei, XIAO Xiaolong, SHI Mingming, FANG Xin, SI Xinyao
2024, 50(2): 477-486. doi: 10.13700/j.bh.1001-5965.2022.0399
Abstract:

Short-term load prediction is essential to ensure the proper operation of the power system. Existing efforts have two limitations: they do not mine the dependencies between features, and they ignore the periodic pattern of load changes. To address these limitations, we propose periodic Pattern-enhance...

Optimization of office process task allocation based on deep reinforcement learning
LIAO Chenyang, YU Jinsong, LE Xiangli
2024, 50(2): 487-498. doi: 10.13700/j.bh.1001-5965.2022.0290
Abstract:

In the office platform, we often need to face a large number of parallel heterogeneous process tasks. This not only tests the ability of task executors but also puts forward requirements for the performance of the scheduling system. The multi-agent game model based on Markov game theory is proposed ...

Improved YOLOv5s low-light underwater biological target detection algorithm
CHEN Yuliang, DONG Shaojiang, SUN Shizheng, YAN Kaibo
2024, 50(2): 499-507. doi: 10.13700/j.bh.1001-5965.2022.0322
Abstract:

A real-time detection method for low-light underwater biological targets based on an improved YOLOv5s, known as YOLOv5s-underwater, was proposed to address the issue of low biometric recognition accuracy caused by the significant attenuation of light in water, the complex image environment, and the movem...

Multi group sparrow search algorithm based on K-means clustering
YAN Shaoqiang, LIU Weidong, YANG Ping, WU Fengxuan, YAN Zhe
2024, 50(2): 508-518. doi: 10.13700/j.bh.1001-5965.2022.0328
Abstract:

A K-means multi-group sparrow search algorithm (KSSA) based on K-means clustering is proposed in order to improve the convergence speed of the sparrow search algorithm (SSA) in single population search, which causes redundancy in its convergence speed and makes it simple to ignore the flaw that the ...

Multi-agent coverage control based on communication connectivity maintenance constraints
ZHANG Yunlin, MA Zhuangzhuang, SHI Lei, SHAO Jinliang
2024, 50(2): 519-528. doi: 10.13700/j.bh.1001-5965.2022.0340
Abstract:

Coverage control will disperse the agents as much as possible according to the environmental information to achieve a better spatial coverage effect and realize the optimal monitoring of the task area. In this process, the cooperation between agents depends on the connected communication network. Li...

Multi-hop knowledge graph question answering based on deformed graph matching
LI Xiangyue, FANG Quan, HU Jun, QIAN Shengsheng, XU Changsheng
2024, 50(2): 529-534. doi: 10.13700/j.bh.1001-5965.2022.0375
Abstract:

Knowledge Graph Question Answering (KGQA) is a process in which a given natural language question is semantically understood and parsed, and the knowledge graph is then queried and reasoned over to obtain the answer. However, knowledge graphs with missing links bring many challenges to multi-hop question answ...

A smooth path planning method based on Dijkstra algorithm
GONG Hui, NI Cui, WANG Peng, CHENG Nuo
2024, 50(2): 535-541. doi: 10.13700/j.bh.1001-5965.2022.0377
Abstract:

When the mobile robot moves along the path planned by the Dijkstra algorithm in a complex environment, due to the planned path having many turning points and some turning angles being small, the mobile robot has to turn frequently or even pause to complete the turning, which seriously affects the wo...

Language-guided target segmentation method based on multi-granularity feature fusion
TAN Quange, WANG Rong, WU Ao
2024, 50(2): 542-550. doi: 10.13700/j.bh.1001-5965.2022.0384
Abstract:

The objective of language-guided target segmentation is to match the targets described in the text with the entities they refer to, thereby achieving an understanding of the relationships between text and entities, as well as the localization of the referred targets. This task has significant applic...

Image-text matching algorithm based on multi-level semantic alignment
LI Yiru, YAO Tao, ZHANG Linliang, SUN Yujuan, FU Haiyan
2024, 50(2): 551-558. doi: 10.13700/j.bh.1001-5965.2022.0385
Abstract:

The regional features in the image tend to pay more attention to the regional features in the image, and the environmental information is often ignored. How to effectively combine local features and global features has not been fully studied. An image-text matching algorithm based on multi-level seman...

Graph pooling method based on multilevel union
DONG Xiaolong, HUANG Jun, QIN Feng, HONG Xudong
2024, 50(2): 559-568. doi: 10.13700/j.bh.1001-5965.2022.0386
Abstract:

Graph pooling methods have been widely used in bioinformatics, chemistry, social networks, recommendation systems and other fields. At present, graph pooling methods do not solve the problems of node selection and node information loss caused by pooling. A new graph pooling method is proposed, nam...

Image-text aspect emotion recognition based on joint aspect attention interaction
ZHAO Yicheng, WANG Suge, LIAO Jian, HE Donghuan
2024, 50(2): 569-578. doi: 10.13700/j.bh.1001-5965.2022.0387
Abstract:

Due to the quick development of social media, the sentiment conveyed by users cannot be reliably identified by an Aspect-Category Sentiment Analysis of the text alone. However, the existing Aspect-Category Sentiment Analysis methods for image and text data only consider the interaction between image...

Multi-modal mask Transformer network for social event classification
CHEN Hong, QIAN Shengsheng, LI Zhangming, FANG Quan, XU Changsheng
2024, 50(2): 579-587. doi: 10.13700/j.bh.1001-5965.2022.0388
Abstract:

Utilizing both the properties of the text and image modalities to the fullest extent possible is essential for multi-modal social event classification. However, most of the existing methods have the following limitations: they simply concatenate the image features and textual features of events. The ...

Region-aware real-time portrait super resolution reconstruction network
GONG Kecun, ZHOU Menglin, TANG Dongming
2024, 50(2): 588-595. doi: 10.13700/j.bh.1001-5965.2022.0394
Abstract:

Conventional techniques typically process the entire image uniformly, which leads to low efficiency in the field of portrait super-resolution reconstruction. To reduce the inference latency of the model, this research proposes a real-time super-resolution reconstruction model RASR. The model first us...

Multimodal bidirectional information enhancement network for RGBT tracking
ZHAO Wei, LIU Lei, WANG Kunpeng, TU Zhengzheng, LUO Bin
2024, 50(2): 596-605. doi: 10.13700/j.bh.1001-5965.2022.0395
Abstract:

The goal of RGB-thermal infrared (RGBT) visual object tracking, which has drawn increasing interest in recent years, is to take advantage of the complementary strengths of RGB and thermal infrared image data to accomplish reliable visual tracking. For obtaining a robust appearance representation o...

Multi-source remote sensing image classification based on Transformer and dynamic 3D-convolution
GAO Feng, MENG Desen, XIE Zhengyuan, QI Lin, DONG Junyu
2024, 50(2): 606-614. doi: 10.13700/j.bh.1001-5965.2022.0397
Abstract:

Benefiting from the complementarity and synergy of multi-source remote sensing data, deep learning-based methods have made significant progress in remote sensing image classification in recent years. Building a powerful multi-source data joint classification model is typically difficult for the follo...

Cross-modal hashing network based on self-attention similarity transfer
LIANG Huan, WANG Hairong, WANG Dong
2024, 50(2): 615-622. doi: 10.13700/j.bh.1001-5965.2022.0402
Abstract:

To further improve the performance of cross-modal retrieval, a cross-modal hashing network model is proposed based on self-attention similarity transfer. A channel spatial hybrid self-attention mechanism is designed to strengthen the key information of the concerned image, and the common attention m...

Multi-input Fourier neural network and its sparrow search optimization
LI Liangliang, ZHANG Zhuhong, ZHANG Yongdan
2024, 50(2): 623-633. doi: 10.13700/j.bh.1001-5965.2022.0404
Abstract:

In engineering applications, the back-propagation (BP) neural network often encounters many limitations due to its slow convergence and high noise sensitivity, and the reported Fourier neural networks cannot extract the features of multi-attribute input data. Hereby, this work proposes a gradient de...

Lossy point cloud geometry compression based on Transformer
LIU Gexin, ZHANG Junteng, DING Dandan
2024, 50(2): 634-642. doi: 10.13700/j.bh.1001-5965.2022.0412
Abstract:

Point clouds are widely used for 3D object representation; however, real-world captured point clouds often contain huge amounts of data, which is unfavorable for transmission and storage. To address the redundancy problem of point cloud data, an end-to-end Transformer-based multiscale point cloud geometry compres...

Software defect prediction algorithm for intra-membrane sparrow optimizing ELM
TANG Yu, DAI Qi, YANG Mengyuan, CHEN Lifang
2024, 50(2): 643-654. doi: 10.13700/j.bh.1001-5965.2022.0438
Abstract:

The original sparrow search algorithm easily falls into local extrema in the later stage of iteration and therefore suffers from low optimization accuracy. Combining the improved sparrow search algorithm with efficient optimization performance and the membrane computing with parallel computing capability...

Person re-identification method based on attention mechanism and CondConv
JI Guangkai, WANG Rong, PENG Shufan
2024, 50(2): 655-662. doi: 10.13700/j.bh.1001-5965.2022.0454
Abstract:

Person Re-identification is an important part of the field of computer vision, but it is easily affected by the actual collection environment of person images, resulting in insufficient expression of person features and further leading to low model accuracy. An improved person re-identification meth...

Adversarial attack method based on loss smoothing
LI Meihong, JIN Shuang, DU Ye
2024, 50(2): 663-670. doi: 10.13700/j.bh.1001-5965.2022.0478
Abstract:

Deep neural networks (DNNs) are susceptible to attacks from adversarial samples. Most existing momentum-based adversarial attack methods achieve nearly 100% attack success rates under the white-box setting, but only achieve relatively low attack success rates under the black-box setting. An adversa...

Remote sensing image-text retrieval based on layout semantic joint representation
ZHANG Ruoyu, NIE Jie, SONG Ning, ZHENG Chengyu, WEI Zhiqiang
2024, 50(2): 671-683. doi: 10.13700/j.bh.1001-5965.2022.0527
Abstract:

Remote sensing image-text retrieval can retrieve valuable information from remote sensing data. It is of great significance to environmental assessment, urban planning and disaster prediction. However, there is a key problem that the spatial layout information of remote sensing images is ignored, wh...

Multilevel relation analysis and mining method of image-text
GUO Ruiping, WANG Hairong, WANG Dong
2024, 50(2): 684-694. doi: 10.13700/j.bh.1001-5965.2022.0599
Abstract:

How to efficiently mine the hidden semantic associations between multi-modal data is one of the key tasks of multi-modal knowledge extraction. In order to mine fine-grained relations between image and text, a multilevel relation analysis and mining method of image-text (MRAM) was proposed. BERT-Large (b...