
Coreference resolution based on graph structure and multitask learning

李开阳 王耀影 朱天佑 李继伟 任俊达 陈振宇

李开阳,王耀影,朱天佑,等. 基于图结构与多任务学习的指代消解[J]. 北京航空航天大学学报,2024,50(12):3825-3833. doi: 10.13700/j.bh.1001-5965.2022.0941
Citation: LI K Y, WANG Y Y, ZHU T Y, et al. Coreference resolution based on graph structure and multitask learning[J]. Journal of Beijing University of Aeronautics and Astronautics, 2024, 50(12): 3825-3833 (in Chinese). doi: 10.13700/j.bh.1001-5965.2022.0941

doi: 10.13700/j.bh.1001-5965.2022.0941

Funds: Technology Project of Big Data Center of State Grid Corporation of China (SGSJ0000YFJS2200066)

    Corresponding author. E-mail: kaiyang2@163.com

  • CLC number: TP391.1

  • Abstract:

    Coreference resolution is an important task in natural language processing applications, and learning effective mention representations is its core problem. Most existing studies treat mention span detection and coreference prediction as two separate stages, which cannot effectively capture the intrinsic connections between coreferent pairs and span-level information such as named entities. This paper therefore proposes a new coreference resolution model based on graph structure and multi-task learning. The model combines sequential semantics with structural information to learn mention feature vectors, and adopts a multi-task learning framework to couple the two tasks of coreference resolution and named entity recognition: through a parameter-shared bottom network, the two tasks learn from and improve each other. Extensive experiments on a public dataset and a manually constructed dataset verify the superiority of the proposed model.
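    The parameter-shared bottom network described above can be sketched as follows. This is a minimal NumPy illustration of the multi-task idea only, not the paper's actual architecture: all dimensions and names (`W_shared`, `ner_scores`, `coref_score`) are hypothetical, and the real model uses a pretrained sequence encoder plus graph structure rather than a single linear layer.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions, chosen only for illustration.
d_token, d_hidden, n_ner_tags = 8, 16, 5

# Parameter-shared bottom network: both task heads read representations
# produced by the same weight matrix, so training either task updates
# W_shared and (indirectly) benefits the other task.
W_shared = rng.normal(size=(d_token, d_hidden))

# Task-specific heads on top of the shared encoder.
W_ner = rng.normal(size=(d_hidden, n_ner_tags))  # per-token NER tag scores
W_coref = rng.normal(size=(d_hidden, d_hidden))  # bilinear mention-pair score

def encode(tokens):
    """Shared representation used by both tasks."""
    return np.tanh(tokens @ W_shared)

def ner_scores(tokens):
    """NER head: one score per token per tag."""
    return encode(tokens) @ W_ner

def coref_score(tokens, i, j):
    """Coreference head: compatibility score of mentions i and j."""
    h = encode(tokens)
    return float(h[i] @ W_coref @ h[j])

tokens = rng.normal(size=(10, d_token))  # 10 dummy token embeddings
print(ner_scores(tokens).shape)          # (10, 5)
print(coref_score(tokens, 2, 7))
```

    In a trained system, losses from both heads would be summed and backpropagated through the shared encoder, which is what allows the two tasks to regularize and improve each other.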

     

  • Figure 1.  Framework of the proposed model

    Table 1.  SCIERC[23] dataset size

    Annotation category      Count
    Task                      1284
    Method                    2091
    Metric                     340
    Material                   771
    Generic                   1338
    Other entities            2270
    Coreference links         2754
    Coreference clusters      1015

    Table 2.  CSGC-3 dataset statistics

    Annotation category           Count
    Power generation equipment     2204
    Electric equipment             2992
    Environment information        3321
    Power cables                   1046
    Display devices                1543
    Control devices                 865
    Other entities                 4134
    Coreference links              4589
    Coreference clusters           1657

    Table 3.  Coreference resolution results (F1, %) on the SCIERC[23] dataset

    Model            MUC     B³      CEAF_φ4
    SpanRe[14]       69.31   58.54   53.67
    c2f-coref[7]     70.32   59.37   55.53
    SECT[15]         71.64   61.12   59.71
    coref-HGAT[17]   72.45   61.98   60.43
    GMCor            73.82   63.23   61.69

    Table 4.  Coreference resolution results (F1, %) on the CSGC-3 dataset

    Model            MUC     B³      CEAF_φ4
    SpanRe[14]       65.12   55.09   51.23
    c2f-coref[7]     66.45   56.21   53.78
    SECT[15]         68.24   58.05   56.81
    coref-HGAT[17]   69.31   59.13   57.67
    GMCor            70.32   60.51   58.68

    Table 5.  Named entity recognition results (%) on the SCIERC[23] dataset

    Model               Precision   Recall   F1
    CRF[20]             78.61       75.35    76.95
    Bi-LSTM+CRF[18]     80.55       77.12    78.80
    LSTM-CNNs+CRF[26]   81.12       78.17    79.62
    WCL-BBCD[27]        82.12       78.57    80.30
    GMCor               82.89       79.45    81.13

    Table 6.  Named entity recognition results (%) on the CSGC-3 dataset

    Model               Precision   Recall   F1
    CRF[20]             87.31       86.50    86.90
    Bi-LSTM+CRF[18]     89.03       88.12    88.57
    LSTM-CNNs+CRF[26]   89.34       88.56    88.95
    WCL-BBCD[27]        90.54       88.96    89.74
    GMCor               91.03       89.78    90.40

    Table 7.  Ablation results (F1, %) on the SCIERC[23] dataset

    Model          MUC     B³      CEAF_φ4   NER
    GMCor-BERT     71.23   61.55   59.48     —
    GMCor-Graph    70.67   59.87   58.83     —
    GMCor-Entity   68.97   58.34   57.45     —
    GMCor-Core     —       —       —         78.34
    GMCor          73.82   63.23   61.69     81.13

    Table 8.  Ablation results (F1, %) on the CSGC-3 dataset

    Model          MUC     B³      CEAF_φ4   NER
    GMCor-BERT     67.33   58.22   56.24     —
    GMCor-Graph    66.12   56.34   55.52     —
    GMCor-Entity   64.45   55.21   54.73     —
    GMCor-Core     —       —       —         85.79
    GMCor          70.32   60.51   58.68     90.40
  • [1] DODDINGTON G R, MITCHELL A, PRZYBOCKI M A, et al. The automatic content extraction (ACE) program-tasks, data, and evaluation[C]//Proceedings of the International Conference on Language Resources and Evaluation. Brussels: European Language Resources Association, 2004: 837-840.
    [2] HOBBS J R. Resolving pronoun references[J]. Lingua, 1978, 44(4): 311-338. doi: 10.1016/0024-3841(78)90006-2
    [3] GE N, HALE J, CHARNIAK E. A statistical approach to anaphora resolution[C]//Proceedings of the 6th Workshop on Very Large Corpora. Stroudsburg: Association for Computational Linguistics, 1998: 161-170.
    [4] ZHENG J P, CHAPMAN W W, MILLER T A, et al. A system for coreference resolution for the clinical narrative[J]. Journal of the American Medical Informatics Association, 2012, 19(4): 660-667. doi: 10.1136/amiajnl-2011-000599
    [5] ZHANG R, DOS SANTOS C N, YASUNAGA M, et al. Neural coreference resolution with deep biaffine attention by joint mention detection and mention clustering[C]//Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics. Stroudsburg: Association for Computational Linguistics, 2018: 2450-2355.
    [6] LEE K, HE L H, ZETTLEMOYER L. Higher-order coreference resolution with coarse-to-fine inference[EB/OL]. (2018-04-15)[2022-08-01]. http://arxiv.org/abs/1804.05392.
    [7] JOSHI M, LEVY O, WELD D S, et al. BERT for coreference resolution: Baselines and analysis[EB/OL]. (2019-08-24)[2022-08-01]. http://arxiv.org/abs/1908.09091.
    [8] LAPPIN S, LEASS H J. An algorithm for pronominal anaphora resolution[J]. Computational Linguistics, 1994, 20(4): 535-561.
    [9] DAGAN I, ITAI A. Automatic acquisition of constraints for the resolution of anaphoric references and syntactic ambiguities[C]//Proceedings of the 28th Annual Meeting of the Association for Computational Linguistics. Stroudsburg: Association for Computational Linguistics, 1990: 122-129.
    [10] CLARK K, MANNING C D. Improving coreference resolution by learning entity-level distributed representations[C]//Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics. Stroudsburg: Association for Computational Linguistics, 2016: 1231-1241.
    [11] LEE K, HE L, LEWIS M, et al. End-to-end neural coreference resolution[C]//Proceedings of the Conference on Empirical Methods in Natural Language Processing. Stroudsburg: Association for Computational Linguistics, 2017: 561-570.
    [12] WU W, WANG F, YUAN A, et al. CorefQA: Coreference resolution as query-based span prediction[C]//Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. Stroudsburg: Association for Computational Linguistics, 2020: 6953-6963.
    [13] LUAN Y, WADDEN D, HE L H, et al. A general framework for information extraction using dynamic span graphs[EB/OL]. (2019-05-05)[2022-08-01]. http://arxiv.org/abs/1904.03296.
    [14] GANDHI N, FIELD A, TSVETKOV Y. Improving span representation for domain-adapted coreference resolution[EB/OL]. (2021-09-20)[2022-08-01]. http://arxiv.org/abs/2109.09811.
    [15] 付健, 孔芳. 融入结构化信息的端到端中文指代消解[J]. 计算机工程, 2020, 46(1): 45-51.
    FU J, KONG F. End-to-end Chinese coreference resolution with structural information[J]. Computer Engineering, 2020, 46(1): 45-51 (in Chinese).
    [16] AL-RFOU R, PEROZZI B, SKIENA S. Polyglot: Distributed word representations for multilingual NLP[EB/OL]. (2013-07-05)[2022-08-01]. http://arxiv.org/abs/1307.1662.
    [17] JIANG F, COHN T. Incorporating syntax and semantics in coreference resolution with heterogeneous graph attention network[C]// Proceedings of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Stroudsburg: Association for Computational Linguistics, 2021: 1584-1591.
    [18] HUANG Z H, XU W, YU K. Bidirectional LSTM-CRF models for sequence tagging[EB/OL]. (2015-08-09)[2022-08-01]. http://arxiv.org/abs/1508.01991.
    [19] LIU Y H, OTT M, GOYAL N, et al. RoBERTa: A robustly optimized BERT pretraining approach[EB/OL]. (2019-07-26)[2022-08-01]. http://arxiv.org/abs/1907.11692.
    [20] LI J, WANG X, TU Z P, et al. On the diversity of multi-head attention[J]. Neurocomputing, 2021, 454: 14-24. doi: 10.1016/j.neucom.2021.04.038
    [21] HUANG G, LIU Z, VAN DER MAATEN L, et al. Densely connected convolutional networks[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2017: 2261-2269.
    [22] LAFFERTY J, MCCALLUM A, PEREIRA F. Conditional random fields: Probabilistic models for segmenting and labeling sequence data[C]//Proceedings of the Eighteenth International Conference on Machine Learning. Williamstown: Morgan Kaufmann, 2001: 282-289.
    [23] RAM R V S, AKILANDESWARI A, DEVI S L. Linguistic features for named entity recognition using CRFs[C]//Proceedings of the International Conference on Asian Language Processing. Piscataway: IEEE Press, 2010: 158-161.
    [24] AUGENSTEIN I, DAS M, RIEDEL S, et al. SemEval 2017 task 10: ScienceIE-extracting keyphrases and relations from scientific publications[EB/OL]. (2017-04-10)[2022-08-01]. http://arxiv.org/abs/1704.02853.
    [25] GÁBOR K, BUSCALDI D, SCHUMANN A K, et al. SemEval2018 task 7: Semantic relation extraction and classification in scientific papers[C]//Proceedings of the 12th International Workshop on Semantic Evaluation. Stroudsburg: Association for Computational Linguistics, 2018: 679-688.
    [26] MA X Z, HOVY E. End-to-end sequence labeling via bi-directional LSTM-CNNs-CRF[EB/OL]. (2016-03-04)[2022-08-01]. http://arxiv.org/abs/1603.01354.
    [27] ZHOU R J, HU Q, WAN J, et al. WCL-BBCD: A contrastive learning and knowledge graph approach to named entity recognition[EB/OL]. (2022-03-14)[2022-08-01]. http://arxiv.org/abs/2203.06925.
Publication history
  • Received: 2022-11-24
  • Accepted: 2023-05-29
  • Published online: 2023-07-06
  • Issue published: 2024-12-31
