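Prediction of microbe-drug association based on graph attention stacked autoencoder

Wang Bo, He Yang, Du Xiaoxin, Zhang Jianfei, Xu Jingran, Jia Na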

Citation: WANG B, HE Y, DU X X, et al. Prediction of microbe-drug association based on graph attention stacked autoencoder[J]. Journal of Beijing University of Aeronautics and Astronautics, 2026, 52(1): 61-72 (in Chinese)

doi: 10.13700/j.bh.1001-5965.2023.0730
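Funds: General Program for National Natural Science Foundation Cultivation under the Basic Research Fund for Provincial Universities in Heilongjiang (145409324)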

Corresponding author e-mail: bowangdr@qqhru.edu.cn

  • CLC number: TP391

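  • Abstract:

    Discovering new microbe-drug associations has traditionally relied on biological experiments, which are time-consuming and very costly. To address this, a microbe-drug association prediction method based on a graph attention stacked autoencoder, GATSAE, is proposed. First, a heterogeneous microbe-drug network is built to enrich the association information. Second, a graph convolutional network (GCN) extracts multi-layer latent features, yielding convolution-fused matrices for microbes and drugs. Third, an improved stacked autoencoder learns unsupervised low-dimensional representations of meaningful high-order similarity features; graph convolution and an attention mechanism are added on top of the stacked autoencoder to further improve the extraction of high-order feature information. Finally, the low-dimensional features are concatenated with the association features, and a multilayer perceptron (MLP) scores the candidate microbe-drug pairs. In the performance evaluation, GATSAE achieves an area under the receiver operating characteristic curve (AUROC) of 0.9619 and an area under the precision-recall curve (AUPR) of 0.9577, outperforming classical machine learning methods and common deep learning methods. Case studies show that GATSAE can accurately predict candidate drugs associated with SARS-CoV-2 and Escherichia coli, as well as candidate microbes associated with aspirin.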
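The first two steps of the abstract (heterogeneous network, then GCN feature extraction) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the block layout of the adjacency matrix, the identity similarity matrices, the random weights, and the function names `build_hetero_adjacency`, `normalize_adjacency` and `gcn_layer` are all assumptions made for illustration, and the attention-based stacked autoencoder and the MLP scorer are left out.

```python
import numpy as np

def build_hetero_adjacency(assoc, sim_microbe, sim_drug):
    """Assemble a block adjacency matrix [[S_m, A], [A^T, S_d]].

    assoc is assumed to be an (n_microbe, n_drug) 0/1 matrix of known associations.
    """
    top = np.hstack([sim_microbe, assoc])
    bottom = np.hstack([assoc.T, sim_drug])
    return np.vstack([top, bottom])

def normalize_adjacency(adj):
    """Symmetric normalization D^{-1/2} (A + I) D^{-1/2} of a standard GCN."""
    a_hat = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def gcn_layer(a_norm, features, weight):
    """One propagation step: ReLU(A_norm @ H @ W)."""
    return np.maximum(a_norm @ features @ weight, 0.0)

# Toy data (hypothetical): 3 microbes, 4 drugs, identity similarities.
rng = np.random.default_rng(0)
n_microbe, n_drug, hidden = 3, 4, 2
assoc = np.array([[1, 0, 0, 1],
                  [0, 1, 0, 0],
                  [0, 0, 1, 1]], dtype=float)
adj = build_hetero_adjacency(assoc, np.eye(n_microbe), np.eye(n_drug))
a_norm = normalize_adjacency(adj)

# Use the adjacency rows as initial node features (an assumption).
weight = rng.normal(size=(adj.shape[1], hidden))
embeddings = gcn_layer(a_norm, adj, weight)
microbe_emb, drug_emb = embeddings[:n_microbe], embeddings[n_microbe:]
print(microbe_emb.shape, drug_emb.shape)
```

In a full pipeline, stacking several such layers would give the multi-layer latent features that the abstract says are fused before the autoencoder stage.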

     

  • Figure 1.  Framework of the GATSAE model

    Figure 2.  Structure of the graph attention stacked autoencoder

    Figure 3.  ROC and PR curves of GATSAE under cross-validation

    Figure 4.  Model AUPR for different convolution kernel bandwidths γ

    Figure 5.  Growth rate of AUPR over the non-convolutional model for different numbers of convolutional layers l

    Figure 6.  Change in training loss under different learning rates

    Figure 7.  Comparison of ROC and PR curves in the ablation experiments

    Figure 8.  Comparison of GATSAE prediction results on different datasets

    Figure 9.  Comparison between MLP and common classification models

    Figure 10.  Comparison of model metrics

    Table 1.  Data records of the three datasets

    Dataset     Drugs   Microbes   Known associations
    MDAD        627     142        1152
    aBiofilm    1720    140        2884
    DrugVirus   175     95         933

    Table 2.  Grouping of the comparison experiments

    Group  GCN  Graph attention stacked autoencoder  Data concatenation
    Group 1  Included
    Group 2  Included
    Group 3  Included
    Group 4
    GATSAE  Included  Included  Included

    Table 3.  Evaluation metrics of MLP and common classification models

    Classifier   AUC      AUPR     Pre      Rec      F1
    DT           0.9163   0.9164   0.9126   0.9203   0.9162
    RF           0.9366   0.9309   0.9124   0.8915   0.9214
    KNN          0.9489   0.9486   0.9041   0.9161   0.9098
    SVM          0.8744   0.8469   0.8212   0.8516   0.8357
    MLP          0.9619   0.9577   0.9166   0.9500   0.9329
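For readers who want to relate the Pre, Rec and F1 columns, here is a small sketch of the standard F1 definition (the harmonic mean of precision and recall) applied to the MLP row. It assumes the table uses this definition. Reported values may be averages over cross-validation folds, so they need not match this formula exactly for every row.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# MLP row of Table 3: Pre = 0.9166, Rec = 0.9500.
mlp_f1 = f1_score(0.9166, 0.9500)
print(round(mlp_f1, 4))
```

The result is about 0.9330, close to the reported F1 of 0.9329.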

    Table 4.  Comparison of model metrics

    Model      AUC      AUPR     Pre      Rec      F1
    GCNMDA     0.9240   0.9124   0.8817   0.9257   0.9029
    EGATMDA    0.9434   0.9335   0.9015   0.9213   0.9118
    HGATDVA    0.9159   0.9008   0.8976   0.8989   0.8971
    HNERMDA    0.8977   0.9026   0.8537   0.8430   0.8463
    NIRBMMDA   0.8485   0.8327   0.8391   0.8388   0.8268
    GATSAE     0.9619   0.9577   0.9166   0.9300   0.9329
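The AUC and AUPR columns summarize how well a model ranks true associations above non-associations. The sketch below shows one common way to compute both from labels and scores in plain NumPy. AUPR is estimated here by average precision, which is an assumption: the paper may integrate the PR curve differently. The function names and toy data are hypothetical, and both classes are assumed to be present.

```python
import numpy as np

def auroc(labels, scores):
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    pos, neg = scores[labels], scores[~labels]
    diff = pos[:, None] - neg[None, :]
    return float(((diff > 0).sum() + 0.5 * (diff == 0).sum()) / diff.size)

def average_precision(labels, scores):
    """Mean of precision@k taken at the rank of each positive (score ties not handled)."""
    labels = np.asarray(labels, dtype=bool)
    order = np.argsort(-np.asarray(scores, dtype=float), kind="stable")
    hits = labels[order]
    precision_at_k = np.cumsum(hits) / np.arange(1, hits.size + 1)
    return float(precision_at_k[hits].mean())

# Toy example: two true associations, two non-associations.
toy_labels = [1, 0, 1, 0]
toy_scores = [0.9, 0.8, 0.7, 0.1]
toy_auroc = auroc(toy_labels, toy_scores)
toy_aupr = average_precision(toy_labels, toy_scores)
print(toy_auroc, toy_aupr)
```

On the toy data, three of the four positive-negative pairs are ordered correctly, so the AUROC is 0.75.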

    Table 5.  Top 10 drugs related to Escherichia coli

    Rank   Drug                  PMID
    1      Ceftizoxime           6299968
    2      Aminosalicylic Acid   33468700
    3      Citral                35776056
    4      Clozapine             25448498
    5      Palmitic acid         29719215
    6      Esculin               15137927
    7      Azlocillin            7033199
    8      Azidocillin           4563142
    9      Aspirin               30658983
    10     Glipizide             32995125

    Table 6.  Top 10 microbes related to aspirin

    Rank   Microbe                        PMID
    1      Candida albicans               33242673
    2      Pseudomonas aeruginosa         25088031
    3      Staphylococcus epidermidis     12555346
    4      Human immunodeficiency virus   28480270
    5      Streptococcus mutans           unconfirmed
    6      Staphylococcus aureus          34692677
    7      Mycobacterium tuberculosis     23997233
    8      Escherichia coli               30658983
    9      Clostridium perfringens        31865684
    10     Human herpesvirus              unconfirmed

    Table 7.  Top 20 drugs related to SARS-CoV-2

    Rank   Drug                 PMID
    1      Chloroquine          35859449
    2      ABT                  37414987
    3      Favipiravir          33108587
    4      BCX                  35062212
    5      Luteolin             32389723
    6      Amodiaquine          32486229
    7      Cyclosporine         34081806
    8      Emetine              33302852
    9      Gemcitabine          32432977
    10     Hydroxychloroquine   32373993
    11     Amiodarone           36426888
    12     Obatoclax            34989664
    13     Remdesivir           33436624
    14     Chlorpromazine       32773341
    15     Nelfinavir           35390430
    16     EIPA                 37632140
    17     Arbidol              32955901
    18     Niclosamide          35348204
    19     Dasatinib            36704839
    20     Eflornithine         34055746
Publication history
  • Received:  2023-11-07
  • Accepted:  2024-01-12
  • Published online:  2024-03-09
  • Issue published:  2026-01-15
