Abstract
Visual text analytics is an information analysis technique and process that applies interactive graphical representations to text data in order to support knowledge discovery. Its application generally proceeds in three stages: text processing, visual presentation, and interactive interpretation. In practice, the analyst selects tools suited to the characteristics of the study object, extracts feature attributes or metadata from the raw text, uses an appropriate visual encoding to depict and summarize the content, structure, and relationships of the texts, and then, through user interaction, reveals the characteristics and patterns of the textual information. Existing studies show that, by drawing on computational and visualization capabilities, visual text analytics can compensate for the long processing time and strong subjectivity of manual analysis, improve the efficiency of text processing and comprehension, and probe more deeply into the features, relationships, and patterns hidden in the data. A case study of the educational technology research projects of the National Education Science Plan from 2006 to 2013 verified these advantages. The approach is gradually attracting research attention and is becoming a major trend. The case study also found that, because Chinese natural language processing technology is not yet mature, the application of visual text analytics to Chinese text remains limited, with shortcomings in the word segmentation of project titles, in tool support for Chinese text, and in the depth of analysis, including the use of text features.
Source
《现代远程教育研究》
CSSCI
2015, No. 3, pp. 104-112 (9 pages)
Modern Distance Education Research
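Funding
Key Project of the Ministry of Education (2013) under the National Education Science "Twelfth Five-Year Plan": "Construction and Application of Learning Activity Flows and Their Information Model from the Perspective of Smart Education" (DCA130222)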
Keywords
Information Visualization
Text Analytics
Visualization Tools
Operation Method
Case Study