Journal of Beijing University of Aeronautics and Astronautics, 2019, Vol. 45, Issue (12): 2431-2437. doi: 10.13700/j.bh.1001-5965.2019.0361

Selection of measurement variables for hyperspectra of total phenol content in grape seeds based on Monte Carlo frequency method

CHENG Yunling1,2, YANG Shuqin1,2

  1. College of Mechanical and Electronic Engineering, Northwest A&F University, Xianyang 712100, China;
  2. Key Laboratory of Agricultural Internet of Things, Ministry of Agriculture and Rural Affairs, Xianyang 712100, China
  • Received: 2019-07-08 Online: 2019-12-20 Published: 2019-12-31
  • Corresponding author: YANG Shuqin E-mail: yangshuqin1978@163.com
  • About the authors: CHENG Yunling, female, master's student; main research interest: applications of hyperspectral technology in agricultural information. YANG Shuqin, female, Ph.D., associate professor and master's supervisor; main research interests: computer vision and pattern recognition.
  • Supported by:
    National Natural Science Foundation of China (31501228, 61876153); the Fundamental Research Funds for the Central Universities (2452019180)

Abstract: When building a hyperspectral prediction model for the total phenol content of grape seeds, the data must be reduced in dimension according to their spectral characteristics in order to address the large number of variables and the high model complexity. A Monte Carlo frequency (MCF) method is proposed to select wavelengths from hyperspectral data, and a support vector regression (SVR) prediction model of grape seed total phenols is established. The method first uses Monte Carlo sampling (MCS) to draw wavelength subsets. It then builds a large number of SVR sub-models, keeps those with a smaller root mean square error (RMSE), and counts how often each wavelength occurs in them. Finally, the number of wavelengths is determined by an exponentially decreasing function, and the most frequent wavelengths are taken as the characteristic wavelengths. The results show that MCF reduces dimensionality while improving prediction performance: the number of wavelengths drops from the original 196 to 9, all between 950 and 1400 nm, and the RMSE falls from 0.42 to 0.37. The prediction accuracy is better than that of other wavelength selection methods such as SPA. The proposed MCF method can therefore effectively select characteristic wavelengths in hyperspectral data processing and offers an effective way to build accurate prediction models.
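The selection procedure summarized in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. The function names (`mcf_select`, `rmse_of_mean_band_fit`) and all parameter values are hypothetical. To stay dependency-free, the sub-model here is a simple one-variable least-squares fit rather than SVR; a scorer that fits an SVR (for example scikit-learn's `sklearn.svm.SVR`) could be passed in through the `scorer` argument instead. The abstract does not give the form of the exponentially decreasing function that fixes the number of wavelengths, so the sketch takes that number as the input `n_keep`.

```python
import math
import random
from collections import Counter
from statistics import fmean


def rmse_of_mean_band_fit(X_train, y_train, X_val, y_val, subset):
    """Stand-in sub-model (NOT SVR): regress y on the mean of the selected
    bands by ordinary least squares and return the validation RMSE."""
    def feature(row):
        return fmean(row[j] for j in subset)

    f_train = [feature(row) for row in X_train]
    mean_f, mean_y = fmean(f_train), fmean(y_train)
    sxx = sum((f - mean_f) ** 2 for f in f_train)
    sxy = sum((f - mean_f) * (y - mean_y) for f, y in zip(f_train, y_train))
    slope = sxy / sxx if sxx > 0 else 0.0
    intercept = mean_y - slope * mean_f
    sq_errors = [(intercept + slope * feature(row) - y) ** 2
                 for row, y in zip(X_val, y_val)]
    return math.sqrt(fmean(sq_errors))


def mcf_select(X_train, y_train, X_val, y_val, n_keep,
               scorer=rmse_of_mean_band_fit,
               n_runs=500, subset_size=10, n_best=50, seed=0):
    """Monte Carlo frequency selection (illustrative sketch).

    1. Draw `n_runs` random wavelength subsets (Monte Carlo sampling).
    2. Score one sub-model per subset by validation RMSE.
    3. Keep the `n_best` sub-models with the smallest RMSE.
    4. Count how often each wavelength index occurs in those sub-models.
    5. Return the `n_keep` most frequent indices and the full counts.
    """
    rng = random.Random(seed)
    n_bands = len(X_train[0])
    runs = []
    for _ in range(n_runs):
        subset = rng.sample(range(n_bands), subset_size)
        runs.append((scorer(X_train, y_train, X_val, y_val, subset), subset))
    runs.sort(key=lambda run: run[0])  # smallest RMSE first
    counts = Counter(j for _, subset in runs[:n_best] for j in subset)
    selected = [j for j, _ in counts.most_common(n_keep)]
    return selected, counts
```

With spectra stored as a list of rows (one row per sample, one column per wavelength), a call such as `mcf_select(X_train, y_train, X_val, y_val, n_keep=9)` would return nine wavelength indices, matching the number of wavelengths reported in the abstract. The subset size, number of runs, and number of retained sub-models are tuning choices that the abstract does not specify.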

Key words: variable selection, Monte Carlo sampling (MCS), near infrared hyperspectra, grape seeds, total phenol content
