Abstract
Objective: To explore a machine learning model based on chest CT images that combines radiomic and semantic features for the differential diagnosis of nontuberculous mycobacterial lung disease (NTM-LD) and pulmonary tuberculosis (PTB).
Methods: Chest CT images of 120 patients with NTM-LD and 120 patients with PTB diagnosed at Tianjin Haihe Hospital from January 2017 to December 2020 were retrospectively collected. By stratified random sampling, 168 cases (70%) were assigned to the training set and 72 cases (30%) to the testing set. Chest CT images of 25 patients with NTM-LD and 25 patients with PTB diagnosed at Xi'an Chest Hospital were collected as an external validation set. A total of 12 radiologist-assessed semantic features and 2107 radiomic features were extracted from all chest CT images, and 40 radiomic features were retained after feature dimensionality reduction. Three classification models were built with the support vector machine (SVM) algorithm: a semantic model, a radiomics model, and a radiomics-semantic model combining both feature types. Diagnostic performance was evaluated with the receiver operating characteristic (ROC) curve and the area under the curve (AUC), and the statistical significance of differences between the three models was assessed with the DeLong test.
Results: In the testing set, the AUCs of the radiomics-semantic model, the radiomics model, and the semantic model were 0.9853, 0.9282, and 0.7901, respectively. The semantic model differed significantly from both the radiomics-semantic model (Z=2.759, P=0.006) and the radiomics model (Z=2.230, P=0.026), whereas the difference between the radiomics-semantic model and the radiomics model was not statistically significant (Z=0.761, P=0.502). In the external validation set, the AUCs of the radiomics-semantic model, the radiomics model, and the semantic model were 0.9216, 0.9024, and 0.7624, respectively. The difference between the radiomics-semantic model and the semantic model was statistically significant (Z=2.126, P=0.034), whereas the difference between the radiomics-semantic model and the radiomics model was not (Z=0.368, P=0.713).
Conclusion: Compared with the semantic model, the machine learning model combining radiomic and semantic features showed good diagnostic efficiency and clinical application value in distinguishing PTB from NTM-LD, although its improvement over the radiomics model was not statistically significant.
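The model comparison reported above relies on the DeLong test for two correlated AUCs, that is, two models scored on the same cases. The abstract does not say which software was used, so the following is only a minimal sketch of the standard DeLong procedure in NumPy. The function name `delong_test` and its arguments are hypothetical and chosen for illustration; they are not taken from the study.

```python
import math
import numpy as np


def delong_test(y_true, scores_a, scores_b):
    """Compare two correlated AUCs measured on the same cases (DeLong test).

    y_true: binary labels (1 = positive class, 0 = negative class).
    scores_a, scores_b: decision scores from two models for the same cases.
    Returns (auc_a, auc_b, z, p), where p is the two-sided p-value.
    Illustrative sketch only; not the implementation used in the study.
    """
    y = np.asarray(y_true).astype(bool)
    m, n = int(y.sum()), int((~y).sum())  # number of positives, negatives
    aucs, v10, v01 = [], [], []
    for scores in (scores_a, scores_b):
        s = np.asarray(scores, dtype=float)
        pos, neg = s[y], s[~y]
        # psi[i, j] = 1 if positive i outranks negative j, 0.5 on a tie, else 0
        psi = (pos[:, None] > neg[None, :]) + 0.5 * (pos[:, None] == neg[None, :])
        aucs.append(float(psi.mean()))   # AUC as the Mann-Whitney statistic
        v10.append(psi.mean(axis=1))     # structural components per positive
        v01.append(psi.mean(axis=0))     # structural components per negative
    # 2x2 covariance matrices of the structural components across the two models
    s10 = np.cov(np.vstack(v10))
    s01 = np.cov(np.vstack(v01))
    cov = s10 / m + s01 / n
    var = cov[0, 0] + cov[1, 1] - 2.0 * cov[0, 1]  # variance of AUC_a - AUC_b
    if var <= 0:
        # Degenerate case (e.g. identical scores): no evidence of a difference
        return aucs[0], aucs[1], 0.0, 1.0
    z = (aucs[0] - aucs[1]) / math.sqrt(var)
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail probability
    return aucs[0], aucs[1], z, p


def demo():
    """Run the test on synthetic scores (not the study's data)."""
    rng = np.random.default_rng(0)
    y = np.repeat([1, 0], 36)                     # 72 cases, as in the testing set
    strong = y + rng.normal(0.0, 0.5, size=y.size)  # hypothetical better model
    weak = y + rng.normal(0.0, 1.5, size=y.size)    # hypothetical weaker model
    return delong_test(y, strong, weak)
```

Calling `demo()` returns the two AUCs together with the Z statistic and two-sided P value for synthetic scores. With real data, `scores_a` and `scores_b` would be the decision values of two fitted classifiers on the same test cases. The pairing matters: because both models are scored on identical patients, the covariance term reduces the variance of the AUC difference compared with treating the two AUCs as independent.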
Authors
仲玲珊
王莉
张硕
李楠
杨晴媛
丁文龙
陈星枝
黄陈翠
邢志珩
Zhong Lingshan; Wang Li; Zhang Shuo; Li Nan; Yang Qingyuan; Ding Wenlong; Chen Xingzhi; Huang Chencui; Xing Zhiheng (Haihe Hospital, Tianjin University, Department of Radiology, Tianjin Haihe Hospital, Tianjin Institute of Respiratory Diseases, TCM Key Research Laboratory for Infectious Disease Prevention for State Administration of Traditional Chinese Medicine, Tianjin 300350, China; Deepwise AI Lab, Beijing Deepwise and League of PHD Technology Co., Ltd, Beijing 100080, China)
Source
《中国防痨杂志》
CAS
CSCD
Peking University Core Journals (北大核心)
2024, Issue 9, pp. 1042-1049 (8 pages)
Chinese Journal of Antituberculosis
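Funding
Tianjin Science and Technology Program: Research on the Application of NTM Diagnosis Based on Annotated CT Big Data Resources (21JCYBJC00510)
Tianjin Haihe Hospital Science and Technology Fund: Research on AI-Assisted NTM-LD Diagnosis Based on Annotated CT Big Data Resources (HHYY-202007)
Tianjin Key Medical Discipline (Specialty) Construction Project (TJYXZDXK-067C, TJYXZDXK-063B)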
Keywords
Tuberculosis, pulmonary
Mycobacterium infections
Diagnosis, differential
Tomography, X-ray computed
Radiomics