Funding: The National Natural Science Foundation of China (No. 60673087)
Abstract: Web pages carry information in multiple fields of varying form, which is hard to exploit in large-scale web search. To address this problem, a novel approach is proposed that automatically acquires features from web pages based on a set of well-defined rules. The features describe the contents of web pages from different aspects and can be used to improve ranking performance in web search. The acquired features have a unified form and less noise, so they can easily be used in web page relevance ranking. A special specification for judging the relevance between user queries and the acquired features is also proposed. Experimental results show that the features acquired by the proposed approach, together with the feature relevance specification, can significantly improve relevance ranking performance in web search.
Abstract: This paper describes a new method for active learning in content-based image retrieval. The proposed method first uses support vector machine (SVM) classifiers to learn an initial query concept. The active learning scheme then employs a similarity measure to check the current version space and selects the images with maximum expected information gain to solicit the user's labels. Finally, the learned query is refined based on the user's further feedback. By combining the SVM classifier with the similarity measure, the proposed method can alleviate the model bias present in each of them. Experiments on several query concepts show that the proposed method learns the user's query concept quickly and effectively within only a few iterations.
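The selection step described above can be illustrated with a minimal sketch. It assumes uncertainty sampling as a stand-in for "maximum expected information gain": each unlabeled image gets a signed classifier score, the score is squashed into a probability, and the images whose predicted label has the highest entropy are shown to the user. The function names, the logistic squashing, and the example scores are all hypothetical and are not taken from the paper, whose actual criterion also involves a similarity measure over the version space.

```python
import math


def binary_entropy(p):
    """Entropy in bits of a Bernoulli(p) label: 0 at p in {0, 1}, 1 at p = 0.5."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


def select_queries(decision_scores, k):
    """Return the ids of the k unlabeled images with the most uncertain label.

    decision_scores: dict mapping image id -> signed classifier score
    (hypothetical input, e.g. the signed distance to an SVM hyperplane).
    """
    def uncertainty(item):
        _, score = item
        p = 1.0 / (1.0 + math.exp(-score))  # assumed logistic squashing into (0, 1)
        return binary_entropy(p)

    ranked = sorted(decision_scores.items(), key=uncertainty, reverse=True)
    return [image_id for image_id, _ in ranked[:k]]


# Made-up scores for four unlabeled images.
scores = {"img_a": 2.3, "img_b": -0.1, "img_c": 0.4, "img_d": -1.8}
queries = select_queries(scores, 2)  # images to show the user for labeling
```

Images scored close to zero sit near the decision boundary, so their labels are the least predictable and, under this proxy, the most informative to request.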
Funding: The National Natural Science Foundation of China (Nos. 60435020, 60302021).
Abstract: The popularity of blogs and the amount of information in the blogosphere are increasing so fast that it is difficult for Internet users to find the information they care about. Compared with the conventional web, links in the blogosphere are more abundant and conversations between bloggers are more frequent. This paper proposes a method of ranking bloggers based on link analysis, which reflects the characteristics of blogs and reduces the influence of link spamming. The method also makes it more convenient for users to read blogs, and it supplies a new methodology for information retrieval in the blogosphere. To ensure the reliability of the ranking results, some evaluation indicators of important bloggers are proposed, and the grading results obtained with the proposed method are compared with those obtained with other indicators. Finally, correlation analysis is used to verify the consistency between the proposed method and the evaluation indicators.
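As a rough illustration of ranking bloggers by link analysis, the sketch below runs a standard PageRank power iteration over a small blogger link graph. PageRank is used here only as a familiar example of link analysis. The abstract does not say which algorithm the paper uses or how it counters link spamming, so the function name, the damping factor, and the sample graph are all assumptions.

```python
def rank_bloggers(links, damping=0.85, iterations=50):
    """Score bloggers with PageRank-style power iteration.

    links: dict mapping each blogger to the list of bloggers they link to
    (hypothetical input format).
    """
    bloggers = set(links)
    for targets in links.values():
        bloggers.update(targets)
    bloggers = sorted(bloggers)
    n = len(bloggers)
    rank = {b: 1.0 / n for b in bloggers}

    for _ in range(iterations):
        # Rank held by bloggers with no outgoing links is spread evenly.
        dangling = sum(rank[b] for b in bloggers if not links.get(b))
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {b: base for b in bloggers}
        for source, targets in links.items():
            if targets:
                share = damping * rank[source] / len(targets)
                for target in targets:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Made-up link graph: who links to whom.
links = {
    "alice": ["bob"],
    "bob": ["carol"],
    "carol": ["bob"],
    "dave": ["bob", "carol"],
}
scores = rank_bloggers(links)
ordering = sorted(scores, key=scores.get, reverse=True)  # most important first
```

In this toy graph "bob" receives the most incoming links and ends up ranked first. A ranking like `ordering` could then be compared against independent indicators with a rank correlation, in the spirit of the correlation analysis the abstract mentions.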