Abstract
To extract the content of e-mail messages accurately, the structure of e-mail is analyzed in detail, the characteristic features of the message body are summarized, and an e-mail preprocessing system based on the MIME message structure is designed. The system uses block-wise processing and feature identification to cope with the irregular formatting of e-mail, and it filters reply lines and advertising lines out of the message body, so that message content can be extracted quickly and accurately.
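The abstract gives no implementation details, so the following is only a minimal sketch of the general idea, using Python's standard `email` package: walk the MIME parts as blocks, keep the plain-text body, and drop lines that look like replies or advertising. The function names and both regular expressions are hypothetical illustrations, not the features identified in the paper.

```python
import re
from email import message_from_string
from email.message import Message
from typing import List

# Hypothetical heuristics; the paper's actual features are not given in the abstract.
REPLY_LINE = re.compile(
    r"\s*(>|On .+ wrote:\s*$|-{2,}\s*Original Message\s*-{2,})", re.IGNORECASE
)
AD_LINE = re.compile(r"unsubscribe|click here|free trial", re.IGNORECASE)


def extract_text_blocks(msg: Message) -> List[str]:
    """Walk the MIME tree and return each decoded text/plain part as one block."""
    blocks = []
    for part in msg.walk():
        if part.is_multipart():
            continue  # containers hold no body text themselves
        if part.get_content_type() != "text/plain":
            continue  # skip HTML alternatives, images, and so on
        if part.get_content_disposition() == "attachment":
            continue  # skip attached text files
        payload = part.get_payload(decode=True)  # undoes base64 / quoted-printable
        if payload is None:
            continue
        charset = part.get_content_charset() or "utf-8"
        try:
            blocks.append(payload.decode(charset, errors="replace"))
        except LookupError:  # unknown charset label in a malformed message
            blocks.append(payload.decode("utf-8", errors="replace"))
    return blocks


def filter_lines(text: str) -> str:
    """Drop blank lines, reply lines, and advertising lines from one block."""
    kept = [
        line
        for line in text.splitlines()
        if line.strip() and not REPLY_LINE.match(line) and not AD_LINE.search(line)
    ]
    return "\n".join(kept)


def preprocess(raw: str) -> str:
    """Parse a raw message and return the filtered body text."""
    msg = message_from_string(raw)
    return "\n".join(filter_lines(block) for block in extract_text_blocks(msg))
```

Calling `preprocess(raw_message)` on a raw RFC 822 string returns the remaining body lines joined by newlines. A real system would need far richer features than these two patterns, particularly for non-English mail.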
Source
《现代图书情报技术》
CSSCI
Peking University Core Journals (北大核心)
2008, No. 5, pp. 85-88 (4 pages)
New Technology of Library and Information Service