Extraction of Construction Quality Hazard Information Based on NLP Technology (cited by: 1)

NLP-based Construction Hazard Information Extraction for Construction On-site Management
Abstract In construction quality management, quality hazard rectification forms hold considerable data value and potential for quality management. Because these records are unstructured, numerous, and scattered, processing them manually is time-consuming and labor-intensive, which hampers quality control and management efficiency on construction projects. To address this, the paper proposes a natural language processing (NLP) based method for extracting information from construction quality hazard rectification forms. The method has three steps: (1) data preprocessing, in which the raw data undergo word segmentation, word vector representation, and data annotation; (2) model construction, in which a quality hazard entity recognition model is built on Bi-LSTM-CRF; and (3) model training and visualization, in which the recognition results are evaluated internally and externally and the extracted semantic information is analyzed visually. The model was applied to and validated on 1,741 quality hazard rectification records from projects under construction by China Construction Eighth Engineering Division Co., Ltd. in Wuhan. The results indicate that the model extracts and identifies the semantic information in quality hazard rectification forms well.
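To make step (2) more concrete: in a Bi-LSTM-CRF tagger, the CRF layer usually selects the highest-scoring tag sequence with Viterbi decoding, and the BIO tags are then grouped into entities. The sketch below is a minimal illustration and not the authors' implementation. The tag labels (`LOC`, `HAZ`), the example tokens, all scores, and the function names `viterbi_decode` and `bio_to_spans` are assumptions; the emission scores stand in for what a trained Bi-LSTM would produce.

```python
from typing import Dict, List, Tuple

def viterbi_decode(emissions: List[Dict[str, float]],
                   transitions: Dict[Tuple[str, str], float],
                   tags: List[str]) -> List[str]:
    """Return the tag sequence maximizing emission + transition scores."""
    if not emissions:
        return []
    # best[t][tag]: best score of any path that ends in `tag` at position t
    best = [{tag: emissions[0][tag] for tag in tags}]
    back: List[Dict[str, str]] = []  # back[t-1][tag]: best previous tag
    for t in range(1, len(emissions)):
        scores, pointers = {}, {}
        for cur in tags:
            prev = max(tags, key=lambda p: best[-1][p] + transitions.get((p, cur), 0.0))
            scores[cur] = best[-1][prev] + transitions.get((prev, cur), 0.0) + emissions[t][cur]
            pointers[cur] = prev
        best.append(scores)
        back.append(pointers)
    # Follow the back-pointers from the best final tag.
    path = [max(tags, key=lambda tag: best[-1][tag])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path

def bio_to_spans(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Group BIO-tagged tokens into (entity text, label) pairs."""
    spans: List[Tuple[str, str]] = []
    buf: List[str] = []
    label = None
    for token, tag in zip(tokens, tags):
        if tag.startswith("I-") and label == tag[2:]:
            buf.append(token)  # continue the current entity
            continue
        if buf:
            spans.append((" ".join(buf), label))
        if tag.startswith("B-"):
            buf, label = [token], tag[2:]  # start a new entity
        else:
            buf, label = [], None
    if buf:
        spans.append((" ".join(buf), label))
    return spans

# Hypothetical tag set and scores, for illustration only.
TAGS = ["B-LOC", "I-LOC", "B-HAZ", "O"]
tokens = ["wall", "surface", "has", "cracks"]
emissions = [
    {"B-LOC": 2.0, "I-LOC": 0.5, "B-HAZ": 0.1, "O": 0.3},
    {"B-LOC": 0.5, "I-LOC": 1.8, "B-HAZ": 0.2, "O": 0.4},
    {"B-LOC": 0.1, "I-LOC": 0.2, "B-HAZ": 0.3, "O": 2.0},
    {"B-LOC": 0.2, "I-LOC": 1.1, "B-HAZ": 1.0, "O": 0.3},
]
# Penalize I-LOC following a tag that cannot precede it.
transitions = {("O", "I-LOC"): -10.0, ("B-HAZ", "I-LOC"): -10.0}

decoded = viterbi_decode(emissions, transitions, TAGS)
entities = bio_to_spans(tokens, decoded)
```

In this toy input, the last token's highest emission score is for `I-LOC`, but the transition penalty after `O` makes `B-HAZ` the better choice on the full path. That is the kind of sequence-level constraint a CRF layer adds on top of per-token Bi-LSTM scores.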
Authors ZHONG Xueyan (钟雪妍); ZHONG Botao (钟波涛); SHEN Luoxin (沈罗昕); XIANG Ran (向然); PAN Xing (潘杏) (School of Civil and Hydraulic Engineering, Huazhong University of Science and Technology, Wuhan 430074, China)
Source Journal of Civil Engineering and Management (《土木工程与管理学报》), 2023, No. 5, pp. 113-120, 128 (9 pages in total)
Funding National Key Research and Development Program of China (2022YFC3801700).
Keywords construction quality and safety management; quality hazard information; information extraction; NLP technology; information visualization analysis