Abstract
To improve data quality and make data suitable for data analysis and data mining algorithms, acquired data must first be cleaned. This paper discusses methods for handling missing values and uses the Python programming language to implement Lagrange interpolation for cleaning a dataset. Line charts of the dataset before and after cleaning are provided for comparison and analysis.
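The record does not reproduce the paper's code, so the following is only a minimal sketch of how missing values might be filled by Lagrange interpolation in Python. The function names (`lagrange_eval`, `is_missing`, `fill_missing`), the neighbour window `k=2`, and the sample series are all assumptions made for illustration, not the authors' implementation.

```python
import math


def lagrange_eval(xs, ys, x):
    """Evaluate the Lagrange polynomial through the points (xs[i], ys[i]) at x."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total


def is_missing(v):
    """Treat None and NaN as missing (an assumption about how gaps are encoded)."""
    return v is None or (isinstance(v, float) and math.isnan(v))


def fill_missing(values, k=2):
    """Return a copy of values with each gap replaced by a Lagrange estimate.

    Up to k known neighbours on each side of the gap are used as nodes;
    the window size k is a hypothetical tuning parameter.
    """
    filled = list(values)
    for idx, v in enumerate(values):
        if not is_missing(v):
            continue
        left = [i for i in range(idx - 1, -1, -1) if not is_missing(values[i])][:k]
        right = [i for i in range(idx + 1, len(values)) if not is_missing(values[i])][:k]
        xs = sorted(left + right)
        if not xs:
            continue  # no known neighbours, leave the gap as it is
        ys = [values[i] for i in xs]
        filled[idx] = lagrange_eval(xs, ys, idx)
    return filled


# Hypothetical sample data with one missing observation at index 2.
series = [1.0, 4.0, None, 16.0, 25.0]
cleaned = fill_missing(series)
print(cleaned)
```

Keeping the window small is a deliberate choice in this sketch: high-degree Lagrange polynomials can oscillate strongly between nodes, so using only a few neighbours around each gap tends to give more stable estimates.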
Authors
赵莉
孙娜
李丽萍
崔杰
ZHAO Li; SUN Na; LI Li-ping; CUI Jie (School of Electronics & Information Engineering, Liaoning University of Technology, Jinzhou 121001, China)
Source
《辽宁工业大学学报(自然科学版)》
2022, No. 2, pp. 102-105, 117 (5 pages in total)
Journal of Liaoning University of Technology (Natural Science Edition)
Keywords
data cleaning
Lagrangian
interpolation