2019 Vol. 45, No. 12

Zero-shot image classification based on generative adversarial network
WEI Hongxi, ZHANG Yue
2019, 45(12): 2345-2350. doi: 10.13700/j.bh.1001-5965.2019.0363
Abstract:

The problem of zero-shot image classification has become a research focus in the field of image classification. In this paper, a method based on a generative adversarial network (GAN) is used to solve the zero-shot image classification problem. By generating image features of unseen classes, the zero-shot classification task is transformed into a conventional image classification task. In addition, the discriminator network of the GAN is modified to make the discrimination process more accurate. The experimental results show that the performance of the proposed method is improved by 0.4%, 0.4% and 0.5% on the AWA, CUB and SUN datasets, respectively. Therefore, the proposed method generates better features by improving the generative adversarial network, which effectively solves the problem of zero-shot image classification.
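As a minimal illustration of the feature-generation idea (not the paper's exact architecture), the sketch below shows a conditional generator that maps class attribute vectors plus noise to visual features; the layer sizes, attribute dimension, and feature dimension are assumptions.

```python
# Minimal PyTorch sketch: a conditional generator synthesizes visual features for
# unseen classes from their semantic attributes, so a conventional classifier can
# then be trained on the synthesized features. Dimensions are illustrative only.
import torch
import torch.nn as nn

class FeatureGenerator(nn.Module):
    def __init__(self, attr_dim=85, noise_dim=128, feat_dim=2048):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(attr_dim + noise_dim, 1024), nn.LeakyReLU(0.2),
            nn.Linear(1024, feat_dim), nn.ReLU())

    def forward(self, attributes, noise):
        return self.net(torch.cat([attributes, noise], dim=1))

gen = FeatureGenerator()
unseen_attrs = torch.rand(32, 85)          # semantic vectors of unseen classes
noise = torch.randn(32, 128)
fake_feats = gen(unseen_attrs, noise)      # synthesized features, shape (32, 2048)
# fake_feats can now be used to train an ordinary softmax classifier for unseen classes.
```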

Remote sensing image fusion method based on adaptive injection model
YANG Yong, LU Hangyuan, HUANG Shuying, TU Wei, LI Luyi
2019, 45(12): 2351-2363. doi: 10.13700/j.bh.1001-5965.2019.0372
Abstract:

The purpose of remote sensing image fusion is to fuse high-spectral-resolution, low-spatial-resolution multispectral (MS) images with high-spatial-resolution, low-spectral-resolution panchromatic (PAN) images, so as to obtain fused images with both high spectral and high spatial resolution. How to determine the injected details and the injection coefficients in the injection model is the key problem in image fusion research. For detail optimization, a multi-scale Gaussian filter is designed by simulating the characteristics of the MS sensor; convolving the PAN image with this filter extracts details that are highly correlated with the MS image. To optimize the injection coefficient, an adaptive injection coefficient based on spectral and detail information is proposed. To better preserve edge information, a new edge-preserving weight matrix is proposed to achieve dual fidelity of spectral and spatial information. Finally, the optimized injection coefficient is multiplied by the details and injected into the up-sampled MS image to obtain the final fusion result. The proposed method has been analyzed and extensively tested on several satellite datasets. The experimental results show that, compared with state-of-the-art methods, the proposed method performs best in both subjective and comprehensive objective assessment.
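The following numpy sketch illustrates a generic detail-injection fusion step under stated assumptions; the Gaussian scale and the per-band injection coefficient here are placeholders, not the adaptive design proposed in the paper.

```python
# Illustrative numpy sketch of detail-injection fusion: extract PAN high-frequency
# details with a Gaussian low-pass filter and inject them into up-sampled MS bands
# with a simple per-band coefficient (a stand-in for the adaptive coefficient).
import numpy as np
from scipy.ndimage import gaussian_filter, zoom

def inject_fusion(ms, pan, ratio=4, sigma=2.0):
    # ms: (bands, h, w) low-resolution multispectral; pan: (H, W) panchromatic
    ms_up = np.stack([zoom(b, ratio, order=3) for b in ms])   # up-sample MS bands
    details = pan - gaussian_filter(pan, sigma)               # high-frequency PAN details
    fused = np.empty_like(ms_up)
    for k, band in enumerate(ms_up):
        g = band.std() / (pan.std() + 1e-8)                   # simple injection coefficient
        fused[k] = band + g * details
    return fused

ms = np.random.rand(4, 64, 64)
pan = np.random.rand(256, 256)
print(inject_fusion(ms, pan).shape)   # (4, 256, 256)
```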

Tongue image segmentation algorithm based on deep convolutional neural network and fully connected conditional random fields
ZHANG Xinfeng, GUO Yutong, CAI Yiheng, SUN Meng
2019, 45(12): 2364-2374. doi: 10.13700/j.bh.1001-5965.2019.0370
Abstract:

Tongue image segmentation in traditional Chinese medicine suffers from low accuracy, slow segmentation speed, and the need for manual calibration of candidate regions. To solve these problems, we propose an end-to-end tongue image segmentation algorithm. Compared with traditional tongue segmentation algorithms, the proposed method obtains more accurate segmentation results without any manual operation. Firstly, atrous convolution is used to increase the resolution of the network's feature maps without increasing the number of parameters. Secondly, the atrous spatial pyramid pooling (ASPP) module enables the network to learn multi-scale features of the tongue image through different receptive fields. Finally, the deep convolutional neural network (DCNN) is combined with fully connected conditional random fields (CRF) to refine the edges of the segmented tongue image. The experimental results show that the proposed method outperforms traditional tongue image segmentation algorithms and popular DCNNs with higher segmentation accuracy, and the mean intersection over union reaches 95.41%.
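A compact PyTorch sketch of an ASPP-style block is given below; the channel sizes and dilation rates are illustrative assumptions rather than the paper's configuration.

```python
# ASPP sketch: parallel atrous convolutions with different dilation rates capture
# multi-scale context while keeping the feature-map resolution unchanged.
import torch
import torch.nn as nn

class ASPP(nn.Module):
    def __init__(self, in_ch=256, out_ch=256, rates=(1, 6, 12, 18)):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Conv2d(in_ch, out_ch, 3, padding=r, dilation=r) for r in rates])
        self.project = nn.Conv2d(out_ch * len(rates), out_ch, 1)

    def forward(self, x):
        feats = torch.cat([b(x) for b in self.branches], dim=1)
        return self.project(feats)

x = torch.randn(1, 256, 32, 32)     # feature map of a tongue image
print(ASPP()(x).shape)              # torch.Size([1, 256, 32, 32])
```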

Three-dimensional human pose estimation based on multi-source image weakly-supervised learning
CAI Yiheng, WANG Xueyan, HU Shaobin, LIU Jiaqi
2019, 45(12): 2375-2384. doi: 10.13700/j.bh.1001-5965.2019.0387
Abstract:

Three-dimensional human pose estimation is a hot research topic in the field of computer vision. To address the lack of labels in depth images and the low generalization ability of models caused by single human poses, this paper proposes a 3D human pose estimation method based on weakly-supervised learning from multi-source images. The method includes the following points. First, a multi-source image fusion training strategy is used to improve the generalization ability of the model. Second, a weakly-supervised learning approach is proposed to solve the problem of insufficient labels. Third, to improve the pose estimation results, the design of the residual module is improved. The experimental results show that the regression accuracy of the improved network increases by 0.2%, while the training time is reduced by 28% compared with the original network. In short, the proposed method obtains excellent estimation results with both depth images and color images.

QoE driven adaptation for VR video capturing and transmission
LI Jie, FENG Ransheng, YANG Yangzhao, SUN Wei, LI Qiyue
2019, 45(12): 2385-2392. doi: 10.13700/j.bh.1001-5965.2019.0364
Abstract:

In virtual reality (VR) video streaming, further improving users' quality of experience (QoE) under bandwidth-constrained conditions is a major challenge. To improve resource utilization and user QoE, a multi-user, QoE-driven, uplink-downlink joint adaptive VR video capturing and transmission system is proposed, which differs from traditional VR video wireless transmission systems in that it also considers the uplink transmission. The video server performs rate selection and resource allocation based on the bandwidth of the uplink and downlink channels and the users' real-time viewport information. In addition, we formulate a QoE-driven rate selection and resource allocation problem that maximizes the total QoE of all users in the system. Finally, we propose an optimal adaptive rate selection algorithm combining the KKT conditions and the branch-and-bound method. The experimental results show that the system can effectively improve the total user QoE, improving system performance by 14.27% compared with average uplink allocation and improving the performance of the VR video rate adaptation algorithm by 23.47%.
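To make the optimization problem concrete, the toy sketch below enumerates per-user bitrate choices under a shared bandwidth budget and picks the combination with the highest total QoE; the bitrates, QoE values, and exhaustive search are assumptions used only to show the problem form (the paper solves it with KKT conditions plus branch-and-bound).

```python
# Toy rate-selection problem: choose one bitrate per user to maximize total QoE
# subject to a shared bandwidth budget. Numbers are made up for illustration.
from itertools import product

bitrates = [2.0, 4.0, 8.0]                 # Mbps options per user
qoe = {2.0: 55, 4.0: 75, 8.0: 90}          # assumed QoE score per bitrate
num_users, budget = 3, 14.0                # total downlink budget in Mbps

best = None
for choice in product(bitrates, repeat=num_users):
    if sum(choice) <= budget:
        total = sum(qoe[r] for r in choice)
        if best is None or total > best[0]:
            best = (total, choice)

print(best)   # (225, (4.0, 4.0, 4.0)) -- best total QoE within the budget
```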

Intelligent detection and autonomous capture system of seafood based on underwater robot
XU Fengqiang, DONG Peng, WANG Huibing, FU Xianping
2019, 45(12): 2393-2402. doi: 10.13700/j.bh.1001-5965.2019.0377
Abstract:

Currently, underwater robot faces the tough challenges of lacking intelligent detection and autonomous capture system to guide. Therefore, autonomous capture is hard to be achieved. Toward this end, this paper proposes an intelligent detection and autonomous capture system to achieve intelligent detection of marine target and guide the underwater robot to autonomously capture seafood. First, we employ convolutional neural network to perform object detection task in underwater scene and train the DSOD with underwater dataset to accurately detect marine objects. What's more, the short baseline positioning system is built to locate the underwater robot. To calculate the position of the object relative to robot, this paper proposes a coordinate transforming method to transform the target's location from camera coordinates system to underwater positioning coordinates. Furthermore, this paper designs a multi-signal analysis method based on feedback mechanism to command the robot to move ahead to the seafood until grasping them. To verify the effectiveness of the system, we develop an underwater picking robot and successfully apply the proposed methods to the robot to autonomously detect and capture the marine object.
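The kind of rigid coordinate transformation involved can be sketched as below; the robot pose (rotation and translation) used here is an illustrative assumption, not measured calibration data from the paper.

```python
# numpy sketch: map a detected target from the camera coordinate system into the
# underwater positioning coordinate system using the robot's pose (R, t).
import numpy as np

def camera_to_positioning(p_cam, R, t):
    """p_cam: 3-vector in camera coords; R: 3x3 rotation; t: 3-vector translation."""
    return R @ p_cam + t

yaw = np.deg2rad(30.0)                               # robot heading from the positioning system
R = np.array([[np.cos(yaw), -np.sin(yaw), 0.0],
              [np.sin(yaw),  np.cos(yaw), 0.0],
              [0.0,          0.0,         1.0]])
t = np.array([12.5, -3.0, 4.2])                      # robot position in the positioning frame
p_cam = np.array([0.4, 0.1, 1.5])                    # detected seafood in camera coordinates
print(camera_to_positioning(p_cam, R, t))            # target in positioning coordinates
```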

Image texture based adaptive watermarking algorithm
HUANG Ying, NIU Baoning, GUAN Hu, ZHANG Shuwu
2019, 45(12): 2403-2414. doi: 10.13700/j.bh.1001-5965.2019.0369
Abstract:

Watermarking is a technique of embedding a mark, called a watermark, in an image to prove the copyright of that image. This paper proposes an image-texture-based adaptive watermarking algorithm that exploits the fact that the textured regions of an image can easily hide a watermark. Firstly, a texture measurement method is proposed to reflect the richness of an image's texture, and the concepts of global texture value and local texture value are introduced to comprehensively analyze the image's texture distribution. The textured regions of an image are located using a sliding window and judged by the local texture value of the inner window area; the watermark is then embedded only in the textured regions, which preserves the visual quality of the watermarked image. The functional relationship among the global texture value, the local texture value of the textured regions, and the embedding parameter is obtained by multiple regression analysis, so the embedding parameter can be adjusted adaptively according to the texture values of the regions to maximize the imperceptibility and robustness of the watermark. Moreover, embedding the same watermark in multiple non-overlapping textured regions further improves the accuracy of the extracted watermark. Simulation experiments on 100 images show the superiority of the proposed method over state-of-the-art methods in terms of imperceptibility, adaptivity and robustness.
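A sliding-window texture measure of the general kind described can be sketched as follows; the local variance measure and the threshold are placeholders, not the texture metric or regression model defined in the paper.

```python
# Sketch: slide a window over the image, use local variance as a stand-in
# "local texture value", and keep the most textured windows as candidate
# embedding regions.
import numpy as np

def textured_regions(img, win=64, step=32, thresh=0.01):
    regions = []
    for y in range(0, img.shape[0] - win + 1, step):
        for x in range(0, img.shape[1] - win + 1, step):
            block = img[y:y + win, x:x + win]
            local_texture = block.var()              # simple local texture value
            if local_texture > thresh:
                regions.append((y, x, local_texture))
    return sorted(regions, key=lambda r: -r[2])      # most textured first

img = np.random.rand(256, 256)
print(textured_regions(img)[:3])
```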

Manhattan distance based inter-frame weighted prediction algorithm
GUO Hongwei, ZHU Ce, LI Shuai, WANG Yonghua
2019, 45(12): 2415-2422. doi: 10.13700/j.bh.1001-5965.2019.0371
Abstract:

Merge mode saves the bits required to encode motion information by sharing the motion vectors (MV) of neighboring blocks, which effectively improves the rate-distortion performance of encoders. However, motion compensation prediction (MCP) in the current merge mode is not accurate enough. Therefore, this paper analyzes the characteristics of the residual distribution after MCP in the merge mode and presents a Manhattan distance based weighted prediction method as an additional candidate for the merge mode. First, several predicted blocks are obtained by MCP with the motion vectors of neighboring candidates. Second, the additional candidate is obtained by weighted averaging, with weights determined by the Manhattan distances from the neighboring candidates to the pixel positions in the predicted blocks. Finally, the best merge candidate is selected by rate-distortion optimization (RDO) among the additional candidate and the original candidates. The experimental results show that, on the joint exploration test model 7.0 (JEM 7.0), the proposed method improves rate-distortion performance under different encoder configurations, with an average bitrate saving of 1.34% under the low-delay P configuration.
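The weighting idea can be illustrated with the toy numpy sketch below; the inverse-distance weighting function and the candidate anchor positions are assumptions chosen only to show per-pixel blending of motion-compensated blocks.

```python
# Toy sketch: blend several motion-compensated prediction blocks per pixel with
# weights that decay with the Manhattan distance from the pixel to each
# neighboring candidate's position.
import numpy as np

def manhattan_weighted_prediction(pred_blocks, anchor_points):
    h, w = pred_blocks[0].shape
    ys, xs = np.mgrid[0:h, 0:w]
    weights = []
    for (ay, ax) in anchor_points:
        d = np.abs(ys - ay) + np.abs(xs - ax)        # Manhattan distance map
        weights.append(1.0 / (1.0 + d))              # closer pixels trust this candidate more
    weights = np.stack(weights)
    weights /= weights.sum(axis=0, keepdims=True)    # normalize per pixel
    return np.sum(weights * np.stack(pred_blocks), axis=0)

blocks = [np.full((8, 8), 100.0), np.full((8, 8), 120.0)]   # two MCP predictions
anchors = [(0, 0), (0, 7)]                                  # left and above-right neighbors
print(manhattan_weighted_prediction(blocks, anchors).round(1))
```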

Compression layout for network visualization based on node importance for community structure
WU Lingda, ZHANG Xitao, MENG Xiangli
2019, 45(12): 2423-2430. doi: 10.13700/j.bh.1001-5965.2019.0385
Abstract:

In order to effectively display the mesoscale structure of a network, the force-directed layout algorithm is combined with the network's community structure, and a network visualization compression layout method based on node importance within the community structure is proposed. First, the Louvain algorithm is used to divide the network into communities at multiple granularities. Then, the importance of the nodes is evaluated by calculating their topological potential within the community structure; each community is compressed by preserving its important nodes and merging its boundary nodes. Finally, the force-directed layout algorithm is adopted to lay out the network and achieve a compressed visual layout. The experimental results show that the proposed method completely preserves the original community structure while compressing nodes and edges, and clearly displays the internal structure of each community by retaining its representative nodes, highlighting the position and role of communities and important nodes in the network structure.
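One way to compute a topological-potential importance score is sketched below; the Gaussian-decay form, the influence factor sigma, and the toy graph are assumptions used for illustration.

```python
# Sketch of topological-potential node importance inside one community: each node
# accumulates a Gaussian-decayed contribution from nodes within a few hops, and
# the highest-potential nodes are kept as community representatives.
import math
import networkx as nx

def topological_potential(g, sigma=1.0, max_hops=3):
    potential = {}
    for v in g:
        dists = nx.single_source_shortest_path_length(g, v, cutoff=max_hops)
        potential[v] = sum(math.exp(-(d / sigma) ** 2) for u, d in dists.items() if u != v)
    return potential

community = nx.karate_club_graph()                    # stand-in for one detected community
phi = topological_potential(community)
keep = sorted(phi, key=phi.get, reverse=True)[:5]     # representative nodes to preserve
print(keep)
```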

Selection of measurement variables for hyperspectra of total phenol content in grape seeds based on Monte Carlo frequency method
CHENG Yunling, YANG Shuqin
2019, 45(12): 2431-2437. doi: 10.13700/j.bh.1001-5965.2019.0361
Abstract:

When establishing a prediction model of the total phenol content in grape seeds from hyperspectral data, too many variables lead to high model complexity, so the dimensionality of the data must be reduced effectively according to its characteristics. In this paper, a Monte Carlo frequency (MCF) method is proposed to select wavelengths from hyperspectral data, and a support vector regression (SVR) model is established to predict grape seed total phenols. The method uses Monte Carlo sampling to select wavelength subsets, establishes a large number of SVR sub-models, and uses the sub-models with smaller root mean square error (RMSE) to count the frequency of each wavelength. Finally, the number of wavelengths is determined by an exponentially decreasing function, and the most frequent wavelengths are selected as the characteristic wavelengths. The results show that the MCF method improves prediction performance while reducing dimensionality: the number of wavelengths is reduced from 196 to 9, the selected wavelengths lie between 950 and 1400 nm, and the RMSE is reduced from 0.42 to 0.37. The prediction accuracy is better than that of other wavelength selection methods such as SPA. These results show that the proposed MCF method can effectively select characteristic wavelengths in hyperspectral data processing, providing an effective approach for building accurate prediction models.
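A minimal scikit-learn sketch of the Monte Carlo frequency idea follows; the synthetic dataset, subset size, number of runs, and the cut-off for "good" sub-models are assumptions, not the paper's settings.

```python
# MCF sketch: repeatedly sample random wavelength subsets, fit an SVR sub-model on
# each, keep the low-RMSE subsets, and count how often each wavelength appears.
import numpy as np
from sklearn.svm import SVR
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)
X = rng.random((80, 196))                      # 80 samples, 196 wavelengths (synthetic)
y = X[:, 10] * 2 + X[:, 50] + rng.normal(0, 0.05, 80)

runs = []
for _ in range(300):                           # Monte Carlo sampling of wavelength subsets
    idx = rng.choice(196, size=20, replace=False)
    Xtr, Xte, ytr, yte = train_test_split(X[:, idx], y, test_size=0.3, random_state=0)
    rmse = mean_squared_error(yte, SVR().fit(Xtr, ytr).predict(Xte)) ** 0.5
    runs.append((rmse, idx))

runs.sort(key=lambda r: r[0])
counts = np.zeros(196)
for _, idx in runs[:60]:                       # only the low-RMSE sub-models vote
    counts[idx] += 1
print(np.argsort(counts)[::-1][:9])            # nine most frequent wavelengths
```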

Text-to-image synthesis optimization based on aesthetic assessment
XU Tianyu, WANG Zhi
2019, 45(12): 2438-2448. doi: 10.13700/j.bh.1001-5965.2019.0366
Abstract:

With the development of generative adversarial networks (GAN), much progress has been made in text-to-image synthesis. However, most research focuses on improving the stability and resolution of generated images rather than their aesthetic quality. On the other hand, image aesthetic assessment is a classic task in computer vision, and several state-of-the-art aesthetic assessment models exist. In this work, we propose to improve the aesthetic quality of images generated by a text-to-image GAN by incorporating an image aesthetic assessment model into a conditional GAN. We choose StackGAN++, a state-of-the-art text-to-image synthesis model, assess the aesthetic quality of its generated images with a chosen aesthetic assessment model, define a new loss function, the aesthetic loss, and use it to improve StackGAN++. Compared with the original model, the total aesthetic score of the generated images is improved by 3.17% and the inception score by 2.68%, indicating that the proposed optimization is effective, though it still has weaknesses to be addressed in future work.
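The sketch below shows one way an aesthetic term could be added to a generator loss; the tiny scorer module, the loss form, and the weight 0.1 are all assumptions standing in for the pretrained aesthetic assessment model and StackGAN++ losses.

```python
# PyTorch sketch: a (stand-in) aesthetic scorer rates generated images, and the
# generator loss is augmented with a term that pushes scores higher.
import torch
import torch.nn as nn

class TinyAestheticScorer(nn.Module):          # placeholder for a pretrained assessor
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Conv2d(3, 8, 3, stride=2, padding=1), nn.ReLU(),
                                 nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, 1))

    def forward(self, img):
        return torch.sigmoid(self.net(img))    # aesthetic score in (0, 1)

scorer = TinyAestheticScorer().eval()
fake_images = torch.rand(4, 3, 64, 64, requires_grad=True)   # generator output
adv_loss = torch.tensor(0.7)                                  # usual conditional GAN loss
aesthetic_loss = (1.0 - scorer(fake_images)).mean()           # penalize low aesthetic scores
total_g_loss = adv_loss + 0.1 * aesthetic_loss                # weight 0.1 is an assumption
total_g_loss.backward()
```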

Blind quality assessment for screen content images based on edge and structure
WEI Lesong, CHEN Junhao, NIU Yuzhen
2019, 45(12): 2449-2455. doi: 10.13700/j.bh.1001-5965.2019.0367
Abstract:

A screen content image (SCI) differs greatly from a natural image: it contains more text, graphics, and special layouts. Considering the influence of text, graphics, pictures, and layout on SCI quality, a blind quality assessment metric for SCIs based on edge and structure (BES) is proposed. Since text, graphics, and pictures contain a large number of edges and the human visual system is highly sensitive to edges, the BES metric first extracts edges using the imaginary part of the Gabor filter and computes an edge feature for each SCI. Second, a structure feature is extracted to represent the layout of an SCI; specifically, the Scharr filter is used to compute a local binary pattern (LBP) map from which the structure feature is obtained. Finally, a random forest regression algorithm maps the edge and structure features to subjective scores. The experimental results show that on the SIQAD and SCID databases, the Pearson linear correlation coefficient (PLCC) of the proposed BES metric is 2.63% and 11.22% higher, respectively, than that of the latest no-reference metric in the comparison, and even higher than some full-reference metrics.
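A rough sketch of the feature-then-regression pipeline is shown below; a plain gradient magnitude stands in for the Gabor/Scharr filtering, the histogram features are simplified, and the training data are synthetic.

```python
# Sketch: per-image edge-strength and LBP histograms are mapped to a quality score
# with random forest regression (simplified stand-in for the BES features).
import numpy as np
from skimage.feature import local_binary_pattern
from sklearn.ensemble import RandomForestRegressor

def sci_features(img):
    gy, gx = np.gradient(img)
    edge_hist, _ = np.histogram(np.hypot(gx, gy), bins=16, density=True)   # edge feature
    lbp = local_binary_pattern(img, P=8, R=1, method="uniform")
    struct_hist, _ = np.histogram(lbp, bins=10, density=True)              # structure feature
    return np.concatenate([edge_hist, struct_hist])

rng = np.random.default_rng(1)
images = rng.random((50, 64, 64))               # synthetic "screen content images"
scores = rng.uniform(1, 5, 50)                  # synthetic subjective scores
X = np.array([sci_features(im) for im in images])
model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, scores)
print(model.predict(X[:3]))
```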

3D video quality evaluation based on adaptive streaming over HTTP
ZHAI Yuxuan, LIU Yisang, XU Yiwen, CHEN Zhonghui, FANG Ying, ZHAO Tiesong
2019, 45(12): 2456-2462. doi: 10.13700/j.bh.1001-5965.2019.0383
Abstract:

The key to 3D video network services is to improve users' quality of experience (QoE), which can, however, be affected by changing network conditions and video content. For conventional 2D videos, HTTP adaptive streaming (HAS) has demonstrated its value in improving user QoE by dynamically switching bitrates, while for 3D video transmission with at least two video streams this technique has not yet been extensively explored. The dynamic switching policy of video quality levels is the core of HAS. In this work, we investigate the impact on user QoE of introducing dynamic bitrates for different 3D videos. A subjective database is built to illustrate the connection between block-level objective quality, which changes with bitrate, and the QoE of 3D vision. Based on this, we propose a convolutional neural network (CNN) based QoE model that effectively assesses QoE from block-level objective quality. The Pearson linear correlation coefficient (PLCC) between the model's predictions and the mean opinion score (MOS) is 0.906. The proposed framework can guide inter-view bitrate balancing in HAS for 3D video transmission.
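The mapping the model learns can be sketched as a small regression CNN; the input size, depth, and channel counts below are assumptions, not the paper's network.

```python
# PyTorch sketch: a small CNN regresses a QoE/MOS value from a block-level
# objective-quality map (one value per block, one channel per view).
import torch
import torch.nn as nn

qoe_net = nn.Sequential(
    nn.Conv2d(2, 16, 3, padding=1), nn.ReLU(),      # 2 channels: left and right views
    nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Linear(32, 1))                                # predicted MOS

quality_map = torch.rand(8, 2, 30, 17)               # per-block quality for 8 video segments
print(qoe_net(quality_map).shape)                    # torch.Size([8, 1])
```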

Three-dimensional human pose estimation based on video
YANG Bin, LI Heping, ZENG Hui
2019, 45(12): 2463-2469. doi: 10.13700/j.bh.1001-5965.2019.0384
Abstract:

Existing 3D human pose estimation methods focus on estimating the 3D pose from a single frame, while ignoring the correlation between consecutive frames in a video. Therefore, exploiting the temporal information of the video can further improve the accuracy of 3D human pose estimation. Based on this, a convolutional neural network structure that fully extracts the temporal information in the video is designed. It has the advantages of low computational cost and high precision, and the complete 3D human pose can be recovered using only 2D joint coordinates as input. Furthermore, a new loss function is proposed, which uses the continuity of human poses between adjacent frames to improve the smoothness of 3D pose estimation in video sequences and alleviates the accuracy degradation caused by the lack of inter-frame information. Testing on the Human3.6M dataset shows that the average test error of the proposed method is 1.2 mm lower than that of the current standard 3D pose estimation algorithm, and that the proposed method achieves high accuracy for 3D human pose estimation in video sequences.
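A temporal-consistency term of the kind described can be sketched as below; the exact penalty form and the weight are assumptions, not the paper's loss function.

```python
# PyTorch sketch: per-frame joint error plus a smoothness term that penalizes
# large pose changes between adjacent predicted frames.
import torch

def pose_loss(pred, target, smooth_weight=0.1):
    # pred, target: (frames, joints, 3) sequences of 3D joint coordinates
    joint_err = ((pred - target) ** 2).mean()
    smoothness = ((pred[1:] - pred[:-1]) ** 2).mean()    # inter-frame pose change
    return joint_err + smooth_weight * smoothness

pred = torch.randn(16, 17, 3, requires_grad=True)
target = torch.randn(16, 17, 3)
pose_loss(pred, target).backward()
```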

A low-cost indoor passable area modeling method for robots
ZHANG Fukai, RUI Ting, HE Lei, YANG Chengsong
2019, 45(12): 2470-2478. doi: 10.13700/j.bh.1001-5965.2019.0393
Abstract:

Monocular vision-based simultaneous localization and mapping (SLAM) has been a popular technology in robotics in recent years. However, due to the large computational resources required by reconstruction, mainstream methods cannot generate a meaningful reconstruction of the scene in real time on platforms with low computing power. This paper proposes a fast passable-area modeling method for the specific situation of indoor environments and small robots, built on monocular feature-based SLAM. Firstly, a road segmentation image is obtained by segmentation in the HSV color space with an adaptive threshold. Then, the system cross-matches the segmentation with the sparse point cloud generated by SLAM to obtain the ground plane and an accurate ground segmentation area. Finally, the ground segmentation area is projected onto the ground plane for dense modeling of the floor. In indoor-scene experiments, the average processing speed of the proposed method reaches 21 frames per second, about 70% of the speed of ORB-SLAM, which meets the real-time requirements of mobile platforms. The average position error of the floor plane is 5.8%, and the modeling error of the road width is between 3.5% and 12.8%.
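Two pieces of the pipeline are sketched below under stated assumptions: a fixed HSV threshold stands in for the adaptive threshold, and the sparse map points are synthetic.

```python
# numpy sketch: (1) an HSV threshold that keeps floor-like pixels, and (2) a
# least-squares plane fit to the sparse SLAM points matched to that segmentation.
import numpy as np

def floor_mask(hsv, h_range=(0.05, 0.20), s_max=0.4):
    h, s = hsv[..., 0], hsv[..., 1]
    return (h > h_range[0]) & (h < h_range[1]) & (s < s_max)

def fit_ground_plane(points):
    # plane z = a*x + b*y + c fitted to 3D points on the floor
    A = np.c_[points[:, 0], points[:, 1], np.ones(len(points))]
    coeffs, *_ = np.linalg.lstsq(A, points[:, 2], rcond=None)
    return coeffs                                     # (a, b, c)

hsv = np.random.rand(120, 160, 3)                     # stand-in HSV image
pts = np.random.rand(200, 3) * [5, 5, 0.02]           # near-flat sparse map points
print(floor_mask(hsv).mean(), fit_ground_plane(pts))
```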

Video thumbnail recommendation based on deep visual-semantic embedding
ZHANG Mengqin, MENG Quanling, ZHANG Weigang
2019, 45(12): 2479-2486. doi: 10.13700/j.bh.1001-5965.2019.0415
Abstract:

A video thumbnail, as the most intuitive summary of video content, plays an important role on video sharing sites and is one of the key elements that attract users to click and watch a video. Pairing a descriptive sentence about the video content with a thumbnail that matches that sentence is often even more attractive to users. Therefore, a complete video thumbnail recommendation framework based on a deep visual-semantic embedding model is proposed in this paper. The model uses a convolutional neural network to extract the visual features of video keyframes and a recurrent neural network to extract the semantic features of descriptive sentences. After embedding the visual and semantic features into a visual-semantic latent space of the same dimension, keyframes related to the content of the descriptive sentence are recommended as video thumbnails by comparing the correlation between the visual and semantic features. Experiments on different categories of web videos show that the proposed method can effectively recommend content-related video thumbnail sequences for given descriptive sentences and enhance the user experience.
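The ranking step can be sketched in a few lines; the random vectors below stand in for the CNN keyframe embeddings and the RNN sentence embedding, which are assumptions for illustration.

```python
# numpy sketch: rank keyframes by cosine similarity to the sentence embedding in
# the shared visual-semantic space and return the top-k as thumbnail candidates.
import numpy as np

def recommend_thumbnails(frame_embs, sentence_emb, top_k=3):
    f = frame_embs / np.linalg.norm(frame_embs, axis=1, keepdims=True)
    s = sentence_emb / np.linalg.norm(sentence_emb)
    similarity = f @ s                                  # cosine similarity per keyframe
    return np.argsort(similarity)[::-1][:top_k]         # indices of recommended keyframes

frame_embs = np.random.randn(40, 300)                   # 40 keyframes embedded by the CNN branch
sentence_emb = np.random.randn(300)                     # query description embedded by the RNN branch
print(recommend_thumbnails(frame_embs, sentence_emb))
```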

Retargeted image quality assessment based on multi-scale distortion-aware feature
WU Zhishan, ZHANG Shuai, NIU Yuzhen
2019, 45(12): 2487-2494. doi: 10.13700/j.bh.1001-5965.2019.0368
Abstract:

The subjective visual experience of viewing images on various display devices is usually affected by image retargeting. To improve the consistency between subjective perception and objective assessment of retargeted images, we present an objective retargeted image quality assessment (RIQA) method based on multi-scale distortion-aware (MSDA) features. Because semantic distortion and detail distortion appear at different scales of the image, we propose to extract distortion-aware features from multiple scales. Specifically, we present an accurate measurement of the aspect ratio similarity (ARS) between the original and retargeted images, and we use a fused visual attention map to simulate the subjective attention of the human visual system. The experimental results on two benchmark databases show that the Kendall rank correlation coefficient (KRCC), Pearson linear correlation coefficient (PLCC), and Spearman rank-order correlation coefficient (SRCC) of the proposed MSDA method are 4.1%, 1.8%, and 4.5% higher, respectively, than those of the best of the compared methods.

Structural feature representation and fusion of human spatial cooperative motion for behavior recognition
MO Yujian, HOU Zhenjie, CHANG Xingzhi, LIANG Jiuzhen, CHEN Chen, HUAN Juan
2019, 45(12): 2495-2505. doi: 10.13700/j.bh.1001-5965.2019.0373
Abstract:

In view of the synergistic relationship among different parts of the body when the human body performs actions, a behavior recognition method based on structural features of human spatial cooperative motion is proposed. Firstly, the contribution of different body parts to the completion of an action is measured, and these contributions are transformed into a structural feature model of cooperative motion. Then, the model is used to adaptively constrain the motion characteristics of different body parts without supervision. On this basis, feature selection and multi-modal feature fusion are carried out using JFSSL, a cross-media retrieval method. The experiments show that the proposed method clearly improves the recognition rate in open tests on a self-built behavior database, and that the method's computation is simple and easy to implement.

Video multi-frame quality enhancement method via spatial-temporal context learning
TONG Junchao, WU Xilin, DING Dandan
2019, 45(12): 2506-2513. doi: 10.13700/j.bh.1001-5965.2019.0374
Abstract:

Convolutional neural networks (CNN) have achieved great success in the field of video enhancement. Existing video enhancement methods mainly explore pixel correlations in the spatial domain of an image and ignore the temporal similarity between consecutive frames. To address this issue, this paper proposes a multi-frame quality enhancement method, spatial-temporal multi-frame video enhancement (STMVE), which learns the spatial-temporal context of the current frame. The basic idea of STMVE is to use frames adjacent to the current frame to help enhance its quality. To this end, virtual frames of the current frame are first predicted from its neighboring frames, and the current frame is then enhanced using these virtual frames. An adaptive separable convolutional neural network (ASCNN) is employed to generate the virtual frames. In the subsequent enhancement stage, a multi-frame CNN (MFCNN) with an early-fusion structure is designed to extract the temporal and spatial correlation between the current and virtual frames and output the enhanced current frame. The experimental results show that the proposed STMVE method obtains PSNR gains of 0.47 dB, 0.43 dB, 0.38 dB and 0.28 dB over H.265/HEVC at quantization parameter values 37, 32, 27 and 22, respectively. Compared with the multi-frame quality enhancement (MFQE) method, an average PSNR gain of 0.17 dB is obtained.
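An early-fusion enhancement network of the general shape described can be sketched as follows; the depth, channel counts, and residual output are assumptions, and virtual-frame synthesis (ASCNN) is not shown.

```python
# PyTorch sketch: stack the current decoded frame and its virtual frames along the
# channel axis (early fusion) and predict a residual correction.
import torch
import torch.nn as nn

class EarlyFusionEnhancer(nn.Module):
    def __init__(self, num_inputs=3, channels=32):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(num_inputs, channels, 3, padding=1), nn.ReLU(),
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(),
            nn.Conv2d(channels, 1, 3, padding=1))

    def forward(self, current, virtual_frames):
        stacked = torch.cat([current] + virtual_frames, dim=1)   # early fusion
        return current + self.body(stacked)                      # residual enhancement

current = torch.rand(1, 1, 64, 64)                 # luma of the decoded current frame
virtual = [torch.rand(1, 1, 64, 64) for _ in range(2)]
print(EarlyFusionEnhancer()(current, virtual).shape)
```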

Aesthetic feature analysis and classification of Chinese traditional painting
ZHAN Ying, GAO Yan, XIE Lingyun
2019, 45(12): 2514-2522. doi: 10.13700/j.bh.1001-5965.2019.0375
Abstract:

Automatic classification of image aesthetics has been a popular research field in recent years. Chinese traditional painting is a pivotal embodiment of Chinese traditional art, so its aesthetics shows great potential for research. In this paper, automatic classification and feature analysis of aesthetics were conducted on a Chinese painting database annotated with 5 aesthetic classes. First, based on the subjective annotations, feature extraction and selection were employed to filter out 33 optimal image features for aesthetic classification. Then, a mapping analysis was conducted on the relationships among objective features, subjective aesthetics, and artistic elements of the images. Finally, automatic recognition using a variety of mainstream classifiers was performed on the optimal feature set, and an acceptable performance was obtained, which demonstrates the feasibility and effectiveness of automatic aesthetic classification of Chinese painting. The results show that the main artistic elements for the aesthetic classification of Chinese traditional painting are, in order: color, brushwork, brightness, and lines.
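A generic scikit-learn sketch of the "select features, then classify" step is shown below; the candidate feature count, the classifier choice, and the synthetic data are assumptions (the paper selects 33 features over 5 aesthetic classes).

```python
# Sketch: pick the most informative features and train a mainstream classifier.
import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(2)
X = rng.random((200, 120))                      # 120 candidate image features per painting
y = rng.integers(0, 5, 200)                     # 5 aesthetic classes

clf = make_pipeline(SelectKBest(f_classif, k=33), SVC(kernel="rbf"))
print(cross_val_score(clf, X, y, cv=5).mean())
```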

Two-dimensional shape recognition based on contour and skeleton sequence coding
LU Yongqiang, LI Zhiyang, CHEN Yinan, LIU Zhaobin, HUANG Yiming
2019, 45(12): 2523-2532. doi: 10.13700/j.bh.1001-5965.2019.0376
Abstract:

Two-dimensional shape recognition is a fundamental problem in object recognition and is widely used in trademark retrieval, fingerprint recognition, object localization, image retrieval, and other fields. Recently, two-dimensional shape recognition based on bioinformatics has become a new research direction: the contour of a planar shape is transformed into a biological information sequence, and shape matching and recognition are then achieved with standard alignment tools for such sequences. However, the classic coding method still suffers from code redundancy and low accuracy. In this paper, we present a new coding method based on both the shape contour and the skeleton sequence. Firstly, skeletons are used to represent the slender branches of a shape to reduce coding redundancy. Secondly, the contour and the skeleton are coded in different ways to compact the code and improve matching accuracy. Finally, extensive shape recognition experiments are conducted on three public datasets, and the proposed method is compared with a variety of shape recognition methods. The experimental results demonstrate that the proposed method achieves higher performance in several experiments, and its recognition accuracy is improved by nearly 5% compared with basic shape feature description methods.
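The general idea of turning a contour into an alignable symbol sequence can be sketched as below; this chain-code-style encoding over an amino-acid-like alphabet is a generic illustration, not the paper's contour/skeleton coding scheme.

```python
# Toy sketch: sample points along a contour, compute turning angles, and quantize
# each angle into a small alphabet, giving a string that sequence-alignment tools
# can compare.
import numpy as np

def contour_to_sequence(points, alphabet="ACDEFGHIKLMNPQRSTVWY"):
    v = np.diff(points, axis=0)
    angles = np.arctan2(v[:, 1], v[:, 0])
    turns = np.diff(angles)
    turns = (turns + np.pi) % (2 * np.pi) - np.pi          # wrap to [-pi, pi)
    bins = np.linspace(-np.pi, np.pi, len(alphabet) + 1)
    codes = np.clip(np.digitize(turns, bins) - 1, 0, len(alphabet) - 1)
    return "".join(alphabet[c] for c in codes)

theta = np.linspace(0, 2 * np.pi, 50, endpoint=False)
circle = np.c_[np.cos(theta), np.sin(theta)]                # a circular contour
print(contour_to_sequence(circle))
```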

Stock prediction model based on particle swarm optimization LSTM
SONG Gang, ZHANG Yunfeng, BAO Fangxun, QIN Chao
2019, 45(12): 2533-2542. doi: 10.13700/j.bh.1001-5965.2019.0388
Abstract:

This paper proposes a stock price prediction model based on particle swarm optimization of long short-term memory networks (PSO-LSTM). The model improves and optimizes the LSTM model, making it more suitable for analyzing long-term dependencies and solving complex nonlinear problems. A PSO algorithm with an adaptive learning strategy is used to find the key parameters of the LSTM model, so that the stock data features match the network topology and the model's prediction accuracy for stock prices is improved. In the experiments, PSO-LSTM models are constructed on stock datasets from Shanghai, Shenzhen, and Hong Kong and compared with other prediction models. The comparison results show that the PSO-LSTM stock price prediction model achieves higher prediction accuracy and is generally applicable.
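A bare-bones PSO loop over two LSTM hyperparameters is sketched below; the fitness function is a synthetic stand-in for "validation RMSE of an LSTM trained with these settings", and the inertia/acceleration constants are generic values rather than the paper's adaptive strategy.

```python
# PSO sketch over (hidden units, look-back window). Replace `fitness` with a real
# LSTM training/validation run to reproduce the PSO-LSTM idea.
import numpy as np

rng = np.random.default_rng(3)

def fitness(params):                      # stand-in for validation RMSE (lower is better)
    hidden, window = params
    return (hidden - 64) ** 2 / 1e3 + (window - 20) ** 2 / 1e2 + rng.normal(0, 0.01)

n, dims, iters = 20, 2, 50
lo, hi = np.array([8, 5]), np.array([256, 60])
pos = rng.uniform(lo, hi, (n, dims))
vel = np.zeros((n, dims))
pbest, pbest_val = pos.copy(), np.array([fitness(p) for p in pos])
gbest = pbest[pbest_val.argmin()]

for _ in range(iters):
    r1, r2 = rng.random((n, dims)), rng.random((n, dims))
    vel = 0.7 * vel + 1.5 * r1 * (pbest - pos) + 1.5 * r2 * (gbest - pos)
    pos = np.clip(pos + vel, lo, hi)
    vals = np.array([fitness(p) for p in pos])
    improved = vals < pbest_val
    pbest[improved], pbest_val[improved] = pos[improved], vals[improved]
    gbest = pbest[pbest_val.argmin()]

print(np.round(gbest))                    # near (64, 20) for this toy fitness
```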

2019, 45(12): 2543-2561.