Support-vector-based iteratively adjusted centroid classifier for text categorization

Wang Deqing, Zhang Hui

Citation: Wang Deqing, Zhang Hui. Support-vector-based iteratively adjusted centroid classifier for text categorization[J]. Journal of Beijing University of Aeronautics and Astronautics, 2013, 39(2): 269-274. (in Chinese)

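Funding: National Science and Technology Major Project for Core Electronic Devices, High-end Generic Chips and Basic Software (2010ZX01042-002)
Details
  • CLC number: TP181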

Support-vector-based iteratively adjusted centroid classifier for text categorization

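  • Abstract: Centroid classifiers are prone to inductive bias or model misfit. To address this shortcoming, a support-vector-based iteratively adjusted centroid classifier is proposed. The method constructs the centroid vectors using only the support vectors selected by support vector machines (SVMs), and then iteratively adjusts these initial centroid vectors using the misclassified samples in the training set. Compared with other classification algorithms, the proposed algorithm achieves better macro-averaged F1 and micro-averaged F1. Experiments on 8 commonly used text classification datasets verify its effectiveness, particularly on imbalanced text corpora.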
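
The two stages described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the update rule, so the code assumes a simple "drag and push" correction in which a misclassified sample pulls the centroid of its true class toward itself and pushes the wrongly predicted centroid away. The names `initial_centroids`, `adjust_centroids`, `rate`, and `max_iter` are hypothetical, and the support-vector indices are assumed to come from an SVM trained beforehand (for example, the `support_` attribute of scikit-learn's `SVC`); here they are hard-coded for a toy data set.

```python
import math


def normalize(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def dot(u, v):
    """Dot product; equals cosine similarity when both vectors have unit length."""
    return sum(a * b for a, b in zip(u, v))


def predict(centroids, x):
    """Return the label whose centroid has the highest cosine similarity to x."""
    xn = normalize(x)
    return max(centroids, key=lambda c: dot(normalize(centroids[c]), xn))


def initial_centroids(X, y, sv_indices):
    """Build one centroid per class from the support vectors only.

    sv_indices is assumed to be produced by a separately trained SVM.
    """
    dim = len(X[0])
    centroids = {label: [0.0] * dim for label in set(y)}
    for i in sv_indices:
        for j, xj in enumerate(normalize(X[i])):
            centroids[y[i]][j] += xj
    return centroids


def adjust_centroids(centroids, X, y, rate=0.5, max_iter=20):
    """Iteratively correct centroids using misclassified training samples.

    Assumed update rule (not taken from the paper): drag the true-class
    centroid toward the sample and push the predicted-class centroid away.
    """
    centroids = {c: list(v) for c, v in centroids.items()}  # leave input intact
    for _ in range(max_iter):
        errors = 0
        for x, label in zip(X, y):
            guess = predict(centroids, x)
            if guess == label:
                continue
            errors += 1
            for j, xj in enumerate(normalize(x)):
                centroids[label][j] += rate * xj
                centroids[guess][j] -= rate * xj
        if errors == 0:  # stop once the training set is classified correctly
            break
    return centroids


# Toy data: two classes in a 2-D term space.
X = [[1.0, 0.0], [0.6, 0.4], [0.0, 1.0], [0.4, 0.6]]
y = ["a", "a", "b", "b"]
sv_indices = [0, 3]  # hypothetical support-vector indices

initial = initial_centroids(X, y, sv_indices)
before = [predict(initial, x) for x in X]
adjusted = adjust_centroids(initial, X, y)
after = [predict(adjusted, x) for x in X]
print(before, after)
```

On this toy data the initial centroids assign the second sample to the wrong class, and one pass of the assumed correction is enough to classify all four training samples correctly. The step size `rate` and the stopping criterion are design choices of the sketch; the paper may use different ones.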

     

Publication history
  • Received: 2012-01-11
  • Published online: 2013-02-28
