Abstract
To counter the growing volume of pornographic content on the Internet, this paper analyzes the KNN algorithm as used in the vector space model, addresses its shortcomings, and applies the improved algorithm to pornographic web page filtering, yielding a filtering scheme for such pages. The method first optimizes feature-term selection and weight computation, and then classifies web pages with the improved KNN algorithm. Experiments show that the improvements effectively reduce the dimensionality of the vector space, raise both the accuracy and the speed of web page classification, and identify and filter pornographic web pages effectively.
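For readers unfamiliar with the baseline, the following is a minimal sketch of plain KNN classification over TF-IDF vectors with cosine similarity. It is not the improved algorithm of the paper, whose feature-selection and weighting details are not given in this abstract. The function names, the smoothed IDF formula, the similarity-weighted vote, the labels "blocked"/"allowed", and the toy token lists are all assumptions made for illustration.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Build sparse TF-IDF vectors (dict: term -> weight) from tokenized docs.

    Assumption: smoothed IDF, ln((1 + n) / (1 + df)) + 1, chosen for this sketch.
    """
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    vectors = [{t: tf * idf[t] for t, tf in Counter(doc).items()} for doc in docs]
    return vectors, idf


def cosine(u, v):
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def knn_classify(query_tokens, train_vectors, labels, idf, k=3):
    """Label a tokenized page by a similarity-weighted vote of its k nearest neighbours."""
    # Terms never seen in training carry no weight and are dropped.
    query = {t: tf * idf[t] for t, tf in Counter(query_tokens).items() if t in idf}
    ranked = sorted(
        ((cosine(query, vec), label) for vec, label in zip(train_vectors, labels)),
        reverse=True,
    )
    votes = Counter()
    for similarity, label in ranked[:k]:
        votes[label] += similarity
    return votes.most_common(1)[0][0]


# Hypothetical toy corpus: each page is already reduced to a list of tokens.
TRAIN_DOCS = [
    ["adult", "explicit", "video"],
    ["explicit", "adult", "content", "gallery"],
    ["news", "weather", "sports"],
    ["sports", "score", "news", "video"],
]
LABELS = ["blocked", "blocked", "allowed", "allowed"]

TRAIN_VECTORS, IDF = tfidf_vectors(TRAIN_DOCS)
print(knn_classify(["explicit", "gallery", "video"], TRAIN_VECTORS, LABELS, IDF, k=3))
```

In this baseline every training vector is compared against the query and every term is a dimension, which is why reducing the feature set, as the abstract describes, would directly affect classification speed.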
Source
Network & Computer Security (《计算机安全》)
2009, No. 9, pp. 17-19, 22 (4 pages in total)
Keywords
KNN algorithm
Vector space model
Feature selection
Weight computation
Pornographic web page filtering