Zero-shot image classification has become a research focus in the field of image classification. In this paper, a method based on a generative adversarial network (GAN) is used to solve the zero-shot image classification problem. By generating image features of unseen classes, the ze...
The purpose of remote sensing image fusion is to fuse multispectral (MS) images, which have high spectral resolution but low spatial resolution, with panchromatic (PAN) images, which have high spatial resolution but low spectral resolution, to obtain fused images with both high spectral resolution and high spatial resolution. ...
The disadvantages of tongue image segmentation in traditional Chinese medicine are low accuracy, slow segmentation speed, and manual calibration of candidate regions. To solve these problems, we propose an end-to-end tongue image segmentation algorithm. Compared with the traditional tongue segmentation...
Three-dimensional human pose estimation is an active research topic in the field of computer vision. To address the lack of labels in depth images and the low generalization ability of models caused by a single human pose, this paper proposes a method of 3D human pose estimation based on multi-...
In virtual reality (VR) video streaming, further improving users' quality of experience (QoE) under bandwidth-constrained conditions is a major challenge. To improve resource utilization and user QoE, a multi-user QoE-driven uplink and downlink joint VR video str...
Currently, underwater robots lack an intelligent detection and autonomous capture system to guide them, which makes autonomous capture hard to achieve. To this end, this paper proposes an intelligent detection and autonomous capture system to achieve intelligent det...
Watermarking is a technique that embeds a mark, called a watermark, in an image to prove the copyright of that image. This paper proposes an image-texture-based adaptive watermarking algorithm that exploits the fact that a watermark is easy to hide in the textured regions of an image. Firstly, a texture...
Merge mode reduces the number of bits required to encode motion information by sharing the motion vectors (MVs) of neighboring blocks, which effectively improves the rate-distortion performance of encoders. However, motion compensation prediction (MCP) in merge mode is currently not accurate enough....
To display the mesoscale structure of a network effectively, the force-directed layout algorithm is combined with the features of the network's community structure, and a network visualization compression layout method based on the importance of each node in the community structure is proposed. Fir...
When a prediction model of total phenol content in grape seeds is built from hyperspectral data, the large number of variables leads to high model complexity, so the dimension of the data must be reduced effectively according to its characteristics. In this paper, a M...
Owing to the development of generative adversarial networks (GANs), much progress has been made in text-to-image synthesis. However, most research focuses on improving the stability and resolution of generated images rather than their aesthetic quality. On the other hand, image aest...
A screen content image (SCI) differs greatly from a natural image: it contains more text, graphics, and special layouts. Considering the influence of text, graphics, pictures, and layout on the quality of an SCI, a blind quality assessment metric for SCIs based on edge and structur...
The key to 3D video network services is improving users' quality of experience (QoE), which, however, can be affected by changing network conditions and video content. For conventional 2D videos, the HTTP adaptive streaming (HAS) technique has demonstrated its significance in improving user Q...
Existing 3D human pose estimation methods focus on estimating the 3D pose of the human body from a single frame, while ignoring the correlation between preceding and following frames in a video. Therefore, by investigating the information of the video in the time dimension, the accuracy o...
Monocular vision-based simultaneous localization and mapping (SLAM) has become a popular technology in robotics in recent years. However, owing to the huge computational resources required for reconstruction, mainstream methods cannot generate a meaningful reconstruction of a scene in real time ...
The video thumbnail, as the most intuitive representation of video content, plays an important role on video sharing sites and is one of the key elements that attract users to click and watch a video. However, a descriptive statement related to video content with a video thumbnail associated with the content of t...
The subjective visual experience of viewing images on a variety of display devices is usually affected by the image retargeting operation. To improve the consistency between subjective perception and objective assessment of retargeted images, we present an objective retargeted image quality asse...
In view of the synergistic relationship among different parts of the body when a person performs actions, a behavior recognition method based on human body spatial cooperative motion structural features is proposed. Firstly, the contribution of different parts of the human body to the completion o...
Convolutional neural networks (CNNs) have achieved great success in the field of video enhancement. Existing video enhancement methods mainly explore pixel correlations in the spatial domain of an image and ignore the temporal similarity between consecutive frames. To address this issue, t...
Automatic classification of image aesthetics has been a popular research field in recent years. Traditional Chinese painting is a pivotal embodiment of traditional Chinese art, so its aesthetics is well worth studying. In this paper, the automatic classification study and releva...
Two-dimensional shape recognition is a fundamental problem in object recognition, which is widely used in trademark retrieval, fingerprint recognition, object location, image retrieval and other fields. Recently, two-dimensional shape recognition based on bioinformatics has become a new research dir...
This paper proposes a stock price prediction model based on particle swarm optimization long short-term memory (PSO-LSTM). This model improves and optimizes the LSTM model, which makes it more appropriate for analyzing relationships such as long-term dependency and for solving complex nonlinear prob...
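The abstract above names particle swarm optimization (PSO) as the optimizer wrapped around the LSTM but, as excerpted here, does not say which quantities are tuned or how. The following is only a minimal sketch of a generic PSO loop, not the paper's method. The function names `pso_minimize` and `toy_loss`, the inertia and acceleration coefficients, and the search bounds are all assumptions made for illustration. The toy objective stands in for whatever validation loss an LSTM would report for a given pair of hyperparameters.

```python
import random


def pso_minimize(objective, bounds, n_particles=20, n_iters=100,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Generic PSO sketch (hypothetical helper, not the paper's code).

    objective: maps a list of floats to a float to be minimized.
    bounds: list of (low, high) pairs, one per dimension.
    w, c1, c2: inertia and acceleration coefficients (assumed values).
    """
    rng = random.Random(seed)
    dim = len(bounds)

    # Random initial positions inside the bounds, zero initial velocities.
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]

    # Personal bests start at the initial positions.
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]

    # The global best is the best of the personal bests.
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity update: inertia + pull toward personal best
                # + pull toward global best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                lo, hi = bounds[d]
                # Move the particle and clamp it to the search bounds.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)

            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val

    return gbest, gbest_val


def toy_loss(params):
    """Stand-in for an LSTM validation loss; its minimum is at (3, -1)."""
    x, y = params
    return (x - 3.0) ** 2 + (y + 1.0) ** 2


if __name__ == "__main__":
    best, best_val = pso_minimize(toy_loss, [(-10.0, 10.0), (-10.0, 10.0)])
    print(best, best_val)
```

In a PSO-LSTM setting, `toy_loss` would be replaced by a function that trains an LSTM with the candidate hyperparameters and returns its validation error. That evaluation is far more expensive than the toy objective, so the swarm size and iteration count would need to be chosen accordingly.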