Abstract
As digital transformation sweeps the globe, manufacturing enterprises generate large volumes of chart data every day. Traditional chart analysis methods struggle to analyze this data efficiently and accurately, so automated chart analysis has become an important tool. However, existing automated methods often fail to meet specific needs in practical applications. To address this, an automated chart analysis method for manufacturing enterprises based on natural language generation is proposed. The method uses an LSTM to analyze chart data. Because redundant data can mislead the LSTM during analysis, a discriminator layer is added after the embedding layer so that the LSTM can perform more targeted semantic understanding and text prediction according to the chart type. To address the poor quality of generated description sentences, a random beam sampling strategy is proposed, drawing on beam search and random sampling, and knowledge distillation is introduced to optimize the LSTM and further improve the quality of the description text. Experiments show that, compared with the LSTM, the method improves text quality by 8.9%. To make the method usable in practice, an automated chart analysis system for manufacturing enterprises was designed and developed, with the method integrated as its chart analysis tool. Experimental results show that the proposed method improves both the quality and the efficiency of chart analysis in manufacturing enterprises.
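The abstract does not spell out how the proposed decoding strategy combines beam search with random sampling. The sketch below shows one plausible reading, offered purely as an illustration and not as the authors' algorithm: at each step every beam draws a few candidate tokens at random from the model's next-token distribution (instead of taking the top-k), and the best `beam_width` candidates by cumulative log-probability are kept. The names `toy_next_token_probs`, `random_beam_sample`, and `samples_per_beam`, as well as the toy vocabulary, are hypothetical; in practice the LSTM would supply the distribution.

```python
import math
import random


def toy_next_token_probs(prefix):
    """Hypothetical stand-in for the LSTM: maps a token prefix to {token: prob}."""
    vocab = ["output", "rose", "fell", "in", "Q1", "<eos>"]
    # Deterministic weights that shift with the prefix length (illustration only).
    weights = [((len(prefix) + i) % len(vocab)) + 1 for i in range(len(vocab))]
    total = sum(weights)
    return {tok: w / total for tok, w in zip(vocab, weights)}


def random_beam_sample(next_probs, beam_width=3, samples_per_beam=4,
                       max_len=6, seed=0):
    """Beam search whose expansion step samples tokens instead of taking top-k.

    This is an assumed interpretation of combining beam search with random
    sampling, not the method from the paper.
    """
    rng = random.Random(seed)
    beams = [([], 0.0)]  # each beam is (tokens, cumulative log-probability)
    for _ in range(max_len):
        candidates = {}
        for tokens, score in beams:
            if tokens and tokens[-1] == "<eos>":
                # Finished beams are carried forward unchanged.
                candidates[tuple(tokens)] = score
                continue
            probs = next_probs(tokens)
            toks = list(probs)
            drawn = rng.choices(toks, weights=[probs[t] for t in toks],
                                k=samples_per_beam)
            # Deduplicate the draws; sort for a reproducible iteration order.
            for tok in sorted(set(drawn)):
                candidates[tuple(tokens) + (tok,)] = score + math.log(probs[tok])
        # Keep the highest-scoring candidates, as in ordinary beam search.
        ranked = sorted(candidates.items(), key=lambda kv: kv[1], reverse=True)
        beams = [(list(t), s) for t, s in ranked[:beam_width]]
        if all(t[-1] == "<eos>" for t, _ in beams):
            break
    return beams


for tokens, score in random_beam_sample(toy_next_token_probs):
    print(round(score, 3), " ".join(tokens))
```

Sampling during expansion adds diversity to the candidate set, while the pruning step still favors high-probability sequences; setting `samples_per_beam` high makes the behavior approach ordinary beam search.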
Authors
王旭
刘昌宏
李生春
刘爽
赵康廷
陈亮
WANG Xu; LIU Changhong; LI Shengchun; LIU Shuang; ZHAO Kangting; CHEN Liang (School of Computer Science, Xi’an Polytechnic University, Xi’an 710048, China; China Tobacco Chongqing Industrial Co., Ltd., Qianjiang Cigarette Factory, Chongqing 409000, China; School of Mathematics and Statistics, Shaanxi Normal University, Xi’an 710119, China)
Source
《计算机科学》
CSCD
Peking University Core Journals (北大核心)
2024, No. 4, pp. 174-181 (8 pages)
Computer Science
Funding
Key Scientific Research Program of the Education Department of Shaanxi Province (22JS021).
Keywords
Chart analysis
Natural language generation
LSTM
Knowledge distillation