Textual data streams are widely used in practical applications in which consumers of online products express their views about those products. Because the data distribution changes over time, a phenomenon commonly referred to as concept drift, mining such streams is a challenging problem. Most existing drift detection techniques rely on classification errors, which makes them prone to false positives and missed detections. To improve classification accuracy, more intuitive detection techniques are needed that can identify more of the drifts in a data stream. This paper presents an adaptive unsupervised learning technique, an ensemble classifier based on drift detection, for opinion mining and sentiment classification. To improve classification performance, the approach uses four different dissimilarity measures to determine the degree of concept drift in the data stream. Whenever a drift is detected, the proposed method builds a new classifier and adds it to the ensemble. Before the new classifier is added, the total number of classifiers in the ensemble is checked against the limit; if the limit is exceeded, the classifier with the least weight is removed from the ensemble. To this end, a weighting mechanism calculates the weight of each classifier, which determines that classifier's contribution to the final classification result. Several experiments were conducted on real-world datasets, and the results were evaluated in terms of false positive rate, missed detection rate, and accuracy. The proposed method is also compared with state-of-the-art methods, namely DDM, EDDM, and PageHinkley combined with support vector machine (SVM) and Naive Bayes classifiers, which are frequently used in concept drift detection studies. In all cases, the results show the efficiency of the proposed method.
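The ensemble maintenance described in this abstract (add a classifier on drift, evict the lowest-weight member when the limit is reached, combine predictions by weight) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the names `Member`, `DriftEnsemble`, `on_drift`, and `update_weights` are hypothetical, the drift signal is assumed to come from an external detector, and the weighting rule (accuracy on a recent labelled chunk) is an assumption, since the abstract does not specify how weights are computed.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class Member:
    """One ensemble member (hypothetical structure)."""
    predict: Callable[[str], str]  # maps a text to a sentiment label
    weight: float = 1.0


class DriftEnsemble:
    """Bounded ensemble that grows on drift and votes by weight."""

    def __init__(self, max_size: int = 3) -> None:
        self.max_size = max_size
        self.members: List[Member] = []

    def on_drift(self, new_member: Member) -> None:
        # Called when a drift is detected. If the ensemble is already at
        # its limit, drop the lowest-weight member before adding the new one.
        if len(self.members) >= self.max_size:
            weakest = min(self.members, key=lambda m: m.weight)
            self.members.remove(weakest)
        self.members.append(new_member)

    def update_weights(self, texts: Sequence[str], labels: Sequence[str]) -> None:
        # ASSUMPTION: weight = accuracy on the most recent labelled chunk.
        # The source only says that a weighting mechanism is used.
        if not texts:
            return
        for m in self.members:
            correct = sum(m.predict(t) == y for t, y in zip(texts, labels))
            m.weight = correct / len(texts)

    def predict(self, text: str) -> str:
        # Weighted vote: each member adds its weight to the label it predicts.
        if not self.members:
            raise ValueError("ensemble is empty")
        votes = defaultdict(float)
        for m in self.members:
            votes[m.predict(text)] += m.weight
        return max(votes, key=votes.get)
```

Keeping eviction inside `on_drift` means the ensemble never exceeds `max_size`, and because each vote is scaled by the member's weight, a classifier trained on an outdated concept loses influence before it is finally removed.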
With the advent of Web 2.0, several social recommendation methods that use social network information have been proposed and have achieved notable progress. However, most of these methods face two critical challenges: (1) they tend to use only the available social relations between users and therefore address only the cold-start user issue; (2) they do not exploit content information such as social tagging, which can provide various sources for extracting item information to overcome the cold-start item issue and improve recommendation quality. In this paper, we investigate the efficiency of data fusion by integrating multiple sources of information. First, two essential factors, user-side information and item-side information, are identified. Second, we develop a novel social recommendation model called Two-Sided Regularization (TSR), which is based on the probabilistic matrix factorization method. Finally, an effective quantum-based similarity method is adapted to measure the similarity between users and between items within the proposed model. Experimental results on a real dataset show that the proposed TSR model addresses both the cold-start user and cold-start item issues and outperforms state-of-the-art recommendation methods. These results indicate the importance of incorporating various sources of information in the recommendation process.
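To make the idea of regularizing a matrix factorization model from both the user side and the item side concrete, the sketch below computes a generic similarity-regularized factorization loss. It is an illustration under stated assumptions, not the TSR objective from the paper: the function name `two_sided_loss`, the hyperparameters `lam`, `alpha`, and `beta`, and the exact form of the penalty terms are hypothetical, and the user and item similarities are taken as given inputs rather than computed with the quantum-based method the abstract mentions.

```python
from typing import Dict, List, Tuple

Vector = List[float]
Pairs = Dict[Tuple[int, int], float]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def sq_dist(a: Vector, b: Vector) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def two_sided_loss(
    ratings: Pairs,    # (user, item) -> observed rating
    U: List[Vector],   # latent user factors
    V: List[Vector],   # latent item factors
    user_sim: Pairs,   # (user, user) -> similarity, assumed precomputed
    item_sim: Pairs,   # (item, item) -> similarity, assumed precomputed
    lam: float = 0.1,
    alpha: float = 0.1,
    beta: float = 0.1,
) -> float:
    # Squared reconstruction error on the observed ratings.
    fit = sum((r - dot(U[u], V[i])) ** 2 for (u, i), r in ratings.items())
    # Standard L2 penalty on all latent factors.
    l2 = lam * (sum(dot(u, u) for u in U) + sum(dot(v, v) for v in V))
    # User-side term: similar users are pulled toward similar factors.
    user_reg = alpha * sum(
        s * sq_dist(U[a], U[b]) for (a, b), s in user_sim.items()
    )
    # Item-side term: similar items are pulled toward similar factors.
    item_reg = beta * sum(
        s * sq_dist(V[i], V[j]) for (i, j), s in item_sim.items()
    )
    return 0.5 * (fit + l2 + user_reg + item_reg)
```

In a formulation of this kind, a user or item with no ratings still receives a gradient signal through its similarity term, which is one way a two-sided model can address both cold-start cases.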
Funding: The authors extend their appreciation to the Deanship of Scientific Research at King Khalid University for funding this work through Large Groups (Project under Grant Number RGP.2/49/43).