Funding: Project (No. 60082003) supported by the National Natural Science Foundation of China
Abstract: This paper presents an improved term frequency/inverse document frequency (TF-IDF) approach that uses confidence, support, and characteristic words to raise the recall and precision of text classification. Synonyms defined by a lexicon are also handled in the improved approach. We discuss and analyze in detail the relationship among confidence, recall, and precision. Experiments in the science and technology domain gave promising results: the new TF-IDF approach improves the precision and recall of text classification compared with the conventional TF-IDF approach.
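The abstract does not give the improved weighting formula, so the following is only a minimal sketch of the ingredients it names: conventional TF-IDF, synonym folding through a lexicon, and the support and confidence of a term with respect to a class. The function names, the toy lexicon, and the association-rule style definitions of support and confidence are assumptions made for illustration, not the authors' method.

```python
import math
from collections import Counter

def fold_synonyms(doc, lexicon):
    """Map each token to its canonical form; `lexicon` is a hypothetical synonym table."""
    return [lexicon.get(tok, tok) for tok in doc]

def tf_idf(docs):
    """Conventional TF-IDF. `docs` is a list of token lists; returns one {term: weight} dict per document."""
    n = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({t: (c / len(doc)) * math.log(n / df[t]) for t, c in counts.items()})
    return weights

def support_confidence(docs, labels, term, label):
    """Assumed definitions, borrowed from association rules for the rule term -> label:
    support    = P(term in doc and class == label)
    confidence = P(class == label | term in doc)
    """
    with_term = [lab for doc, lab in zip(docs, labels) if term in doc]
    both = sum(1 for lab in with_term if lab == label)
    support = both / len(docs)
    confidence = both / len(with_term) if with_term else 0.0
    return support, confidence

# Toy corpus: "cells" is folded into "cell" before weighting.
lexicon = {"cells": "cell"}
docs = [fold_synonyms(d, lexicon) for d in (
    ["gene", "cells", "gene"],
    ["cell", "laser"],
    ["laser", "optics"],
)]
labels = ["bio", "bio", "physics"]
weights = tf_idf(docs)
```

In this toy corpus, "cell" appears only in documents labelled "bio", so its confidence for that class is 1.0, while "laser" is split between the two classes. A term with high confidence and adequate support is a natural candidate for a characteristic word, although how the paper combines these quantities with the TF-IDF weight is not stated in the abstract.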
Abstract: The watershed algorithm suffers from over-segmentation, and this paper presents an efficient method to resolve the problem. First, the image is pre-processed with a median filter to reduce the effect of noise. Second, the watershed algorithm is applied to produce the initial regions. Third, regions are merged according to both region information and boundary information. In the region-based merging, an adaptive threshold on the difference between neighboring regions, derived from the characteristics of human vision, serves as the merge criterion. In the boundary-based merging, the gradient is used to identify the true boundaries of the image, so that foreground regions are not merged with background regions. Finally, the regions are post-processed with morphological opening and closing filters to smooth the object boundaries. The experimental results show that the method is efficient.
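Two of the steps described above can be sketched in plain Python: the median pre-filter and the merging of neighboring regions. This is an illustration under stated assumptions, not the paper's implementation. The watershed step itself is not shown (the initial label map is taken as given), the two fixed thresholds `max_mean_diff` and `max_edge_gradient` stand in for the paper's adaptive, vision-based threshold, and all function names are hypothetical.

```python
from statistics import median

def median_filter3(img):
    """3x3 median filter on a 2D list of grey levels.
    Border pixels use only the neighbors that exist, so their window is smaller."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [img[j][i]
                      for j in range(max(0, y - 1), min(h, y + 2))
                      for i in range(max(0, x - 1), min(w, x + 2))]
            out[y][x] = median(window)
    return out

def merge_regions(img, labels, max_mean_diff, max_edge_gradient):
    """Merge 4-adjacent regions when their mean grey levels are close (region criterion)
    and the average grey-level step across their shared boundary is small (boundary criterion).
    Simplification: region means are computed once and not updated after each merge."""
    h, w = len(img), len(img[0])
    total, count = {}, {}        # per-region grey-level sum and pixel count
    edge_sum, edge_n = {}, {}    # per region pair: summed boundary step and number of boundary pixel pairs
    for y in range(h):
        for x in range(w):
            a = labels[y][x]
            total[a] = total.get(a, 0) + img[y][x]
            count[a] = count.get(a, 0) + 1
            for ny, nx in ((y + 1, x), (y, x + 1)):
                if ny < h and nx < w and labels[ny][nx] != a:
                    pair = tuple(sorted((a, labels[ny][nx])))
                    edge_sum[pair] = edge_sum.get(pair, 0) + abs(img[y][x] - img[ny][nx])
                    edge_n[pair] = edge_n.get(pair, 0) + 1

    parent = {r: r for r in count}   # union-find over region labels

    def find(r):
        while parent[r] != r:
            parent[r] = parent[parent[r]]
            r = parent[r]
        return r

    for (a, b), s in sorted(edge_sum.items()):
        mean_diff = abs(total[a] / count[a] - total[b] / count[b])
        if mean_diff <= max_mean_diff and s / edge_n[(a, b)] <= max_edge_gradient:
            parent[find(b)] = find(a)
    return [[find(r) for r in row] for row in labels]

# A single noisy pixel is removed by the median filter.
noisy = [[10, 10, 10], [10, 200, 10], [10, 10, 10]]
smoothed = median_filter3(noisy)

# Three over-segmented columns: regions 1 and 2 are similar, region 3 is a different object.
img = [[10, 12, 90], [11, 13, 91]]
labels = [[1, 2, 3], [1, 2, 3]]
merged = merge_regions(img, labels, max_mean_diff=5, max_edge_gradient=5)
```

Regions 1 and 2 differ by about two grey levels and share a weak boundary, so they are merged, while the strong step between regions 2 and 3 keeps the third region separate. Requiring both criteria is what prevents a foreground region from being absorbed into the background across a true edge.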
Funding: The National Hi-Tech Research and Development Program (863) of China (No. 2002AA119050)
Abstract: This paper discusses the recognition of proper names in Chinese text and presents an automatic approach, based on association analysis, for extracting rules from a corpus. The method uses association analysis to discover rules relevant to external evidence, without additional manual effort. These rules can then be used to recognize proper nouns in Chinese texts. The experimental results show that the method is practical in some applications. Moreover, the method is language independent.
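The abstract does not specify the rule form, so the sketch below assumes one simple kind of external evidence: the token immediately to the left of a proper name. It mines rules of the form "the token after this context word is a proper name" from a labelled corpus and keeps those that pass support and confidence thresholds. The function name, the thresholds, and the romanized toy tokens are all hypothetical.

```python
from collections import Counter

def mine_context_rules(sentences, min_support, min_confidence):
    """`sentences` is a list of sentences, each a list of (token, is_name) pairs.
    Returns {context_token: (support, confidence)} for rules of the form
    'the token following context_token is a proper name'."""
    seen = Counter()   # times each token occurs as a left context
    hits = Counter()   # times the token after it is a proper name
    total = 0          # number of (left, right) token pairs in the corpus
    for sent in sentences:
        for (left, _), (_, is_name) in zip(sent, sent[1:]):
            seen[left] += 1
            total += 1
            if is_name:
                hits[left] += 1
    rules = {}
    for left, h in hits.items():
        support = h / total          # fraction of all pairs that match the rule
        confidence = h / seen[left]  # fraction of this context's uses that precede a name
        if support >= min_support and confidence >= min_confidence:
            rules[left] = (support, confidence)
    return rules

# Toy corpus with romanized tokens: "jiaoshou" (professor) precedes a name twice.
corpus = [
    [("jiaoshou", False), ("Wang_Ming", True), ("shuo", False)],
    [("jiaoshou", False), ("Li_Hua", True), ("lai", False)],
    [("shuo", False), ("hua", False)],
]
rules = mine_context_rules(corpus, min_support=0.2, min_confidence=0.8)
```

Because the procedure only counts co-occurrences of tokens and labels, nothing in it depends on Chinese specifically, which is consistent with the abstract's claim that the method is language independent.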