Abstract
This work focuses on preprocessing medical data and training neural network models. First, combining data mining and natural language processing techniques, the corpus of the word segmentation tool is supplemented so that the semantics of the medical data are preserved; the Chinese text is then segmented, the large amount of redundant information in it is cleaned out, and the text is converted into an encoding that a computer can process. Second, several classic, widely used neural network models are trained on the medical data, and their results are compared with those of a GBDT model, which is based on traditional decision trees. Finally, the experimental results show that, for the diagnosis of multiple diseases, the neural network models outperform the other models, with a diagnostic accuracy close to 90%.
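The preprocessing steps named in the abstract (supplementing the segmenter's dictionary, segmenting the Chinese text, removing redundant tokens, and encoding the result) can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the segmentation tool, so a simple forward maximum matching segmenter stands in for it, and the dictionary, stop-word list, sample record, and all function names are hypothetical.

```python
# Illustrative sketch only. The dictionary, stop words, and sample record are
# hypothetical; forward maximum matching stands in for the unnamed segmenter.

def segment(text, dictionary, max_len=6):
    """Forward maximum matching: at each position take the longest dictionary word."""
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first; fall back to a single character.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in dictionary:
                tokens.append(piece)
                i += size
                break
    return tokens


def clean(tokens, stop_words):
    """Drop stop words, punctuation listed as stop words, and whitespace tokens."""
    return [t for t in tokens if t not in stop_words and not t.isspace()]


def build_vocab(token_lists):
    """Assign each distinct token an integer id; 0 and 1 are reserved."""
    vocab = {"<pad>": 0, "<unk>": 1}
    for tokens in token_lists:
        for t in tokens:
            vocab.setdefault(t, len(vocab))
    return vocab


def encode(tokens, vocab, length):
    """Map tokens to ids, then truncate or pad to a fixed length."""
    ids = [vocab.get(t, vocab["<unk>"]) for t in tokens][:length]
    return ids + [vocab["<pad>"]] * (length - len(ids))


base_dictionary = {"患者", "咳嗽", "发热"}
medical_terms = {"上呼吸道感染"}  # hypothetical supplement to the segmenter's corpus
dictionary = base_dictionary | medical_terms
stop_words = {"，", "。", "的", "有"}

record = "患者有咳嗽，发热，上呼吸道感染。"
tokens = clean(segment(record, dictionary), stop_words)
vocab = build_vocab([tokens])
ids = encode(tokens, vocab, length=6)
print(tokens)  # ['患者', '咳嗽', '发热', '上呼吸道感染']
print(ids)     # [2, 3, 4, 5, 0, 0]
```

Without the supplementary medical term, the same segmenter would break 上呼吸道感染 into single characters, which is one way a general-purpose dictionary can lose the meaning of a clinical phrase.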
Authors
欧明望
叶春杨
Ou Mingwang; Ye Chunyang (College of Computer and Cybersecurity, Hainan University, Haikou 570228, China; State Key Laboratory of Marine Resources Utilization in South China Sea, Hainan University, Haikou 570228, China)
Source
《海南大学学报(自然科学版)》
CAS
2019, Issue 3, pp. 219-226 (8 pages)
Natural Science Journal of Hainan University
Funding
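National Natural Science Foundation of China (61562019, 61379047)
Key Research and Development Program of Hainan Province (ZDYF2017010)
Natural Science Foundation of Hainan Province (20156223)
Key Project of Education and Teaching Reform in Higher Education Institutions of Hainan Province (hnjg2017ZD-1)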
Keywords
neural network
medical diagnosis
natural language processing
data preprocessing
machine learning
personalized medicine