Sentiment analysis, or opinion mining (OM), has become familiar owing to advances in networking technologies and social media. A massive amount of text is now generated over the Internet daily, which makes pattern recognition and decision making difficult. Since OM is useful in business sectors for improving the quality of products and services, machine learning (ML) and deep learning (DL) models can be applied to it. The hyperparameters of DL models, however, require proper adjustment to boost classification performance. Therefore, this paper develops a new Artificial Fish Swarm Optimization with Bidirectional Long Short Term Memory (AFSO-BLSTM) model for OM. The main aim of the AFSO-BLSTM model is to effectively mine the opinions present in textual data. The model first applies pre-processing and TF-IDF based feature extraction. The BLSTM model is then employed for the detection and classification of opinions. Finally, the AFSO algorithm is used to adjust the hyperparameters of the BLSTM model, which constitutes the novelty of the work. A complete simulation study of the AFSO-BLSTM model was carried out on a benchmark dataset, and the experimental values obtained revealed the high potential of the AFSO-BLSTM model for mining opinions.
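The TF-IDF feature extraction step mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `tf_idf`, the toy documents, and the particular weighting variant (relative term frequency times the natural log of the inverse document frequency, no smoothing) are assumptions chosen for illustration, and documents are assumed to be non-empty token lists.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized, non-empty document.

    Assumed variant: tf = count / document length, idf = ln(N / df).
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical toy reviews, already lower-cased and tokenized.
docs = [
    "the battery life is great".split(),
    "the screen is dim".split(),
    "great screen great price".split(),
]
weights = tf_idf(docs)
```

Terms that occur in fewer documents receive larger weights: here `battery` outweighs `the` in the first document. The resulting vectors grow with the vocabulary, which is why such pipelines are often paired with feature selection or a learned classifier.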
Opinion Mining (OM) studies in Arabic are limited, although it is one of the most widely spoken languages worldwide. Interest in Arabic OM is growing among researchers, but many more investigations are needed because of the unique morphological principles of the language. Arabic OM studies face multiple challenges owing to the scarcity of language resources and to Arabic-specific linguistic features. Comparative OM studies in English are wide-ranging and novel, whereas comparative OM studies in Arabic are yet to be established and remain at a nascent stage. The unique features of the Arabic language, such as diacritics, elongation, inflection and word length, make it essential to expand the studies of Arabic text. The current study proposes a Political Optimizer with Probabilistic Neural Network-based Comparative Opinion Mining (POPNN-COM) model for Arabic text. The proposed POPNN-COM model aims to recognize comparative and non-comparative Arabic texts in the context of social media. Initially, the POPNN-COM model applies different levels of data pre-processing to transform the input data into a useful format. The pre-processed data is then fed into the PNN model, which classifies it under different class labels. Finally, the PO algorithm is employed to fine-tune the parameters of the model to achieve enhanced results. The proposed POPNN-COM model was experimentally validated on two standard datasets, and the outcomes established its promising performance over other recent approaches.
Sentiment analysis, or Opinion Mining (OM), has gained significant interest among research communities and entrepreneurs in recent years. Likewise, Machine Learning (ML) approaches form an active research domain and are increasingly applied in several business domains. Against this background, the current research paper focuses on the design of an automated opinion mining model using the Deer Hunting Optimization Algorithm (DHOA) with a Fuzzy Neural Network (FNN), abbreviated as the DHOA-FNN model. The proposed DHOA-FNN technique involves four stages, namely preprocessing, feature extraction, classification, and parameter tuning. In addition, the proposed DHOA-FNN model has two stages of feature extraction, namely the GloVe and N-gram approaches. Moreover, the FNN model is used as the classification model, whereas GTOA is used for the optimization of parameters. The novelty of the current work is that the GTOA is designed to tune the parameters of the FNN model. An extensive range of simulations was carried out on a benchmark dataset and the results were examined under diverse measures. The experimental results highlighted the promising performance of the DHOA-FNN model over recent state-of-the-art techniques, with a maximum accuracy of 0.9928.
This paper focuses on how to improve aspect-level opinion mining for online customer reviews. We first propose a novel generative topic model, the Joint Aspect/Sentiment (JAS) model, to jointly extract aspects and aspect-dependent sentiment lexicons from online customer reviews. An aspect-dependent sentiment lexicon consists of the opinion words specific to an aspect, together with their sentiment polarities with respect to that aspect. We then apply the extracted aspect-dependent sentiment lexicons to a series of aspect-level opinion mining tasks, including implicit aspect identification, aspect-based extractive opinion summarization, and aspect-level sentiment classification. Experimental results demonstrate the effectiveness of the JAS model in learning aspect-dependent sentiment lexicons and the practical value of the extracted lexicons when applied to these tasks.
At present, the immense development of social networks generates a significant amount of textual data, which has enabled researchers to explore the field of opinion mining. However, processing textual opinions with the term frequency-inverse document frequency method gives rise to a dimensionality problem. This study aims to detect the nature of opinions in the Arabic language by employing a swarm intelligence (SI)-based algorithm, the Harris hawks algorithm, to select the most relevant terms. The experimental study was conducted on two datasets: Arabic Jordanian General Tweets and Opinion Corpus for Arabic. In terms of accuracy and number of features, the results are better than those of other SI-based algorithms, such as the grey wolf optimizer and the grasshopper optimization algorithm, and of other algorithms in the literature, such as differential evolution, the genetic algorithm, particle swarm optimization, the basic and enhanced whale optimizer algorithms, the salp swarm algorithm, and the ant lion optimizer.
Opinion summarization automatically recapitulates the opinions expressed about a common topic. The primary aim of summarization is to shorten a text while preserving its properties and without losing its semantics. The need for efficient automatic summarization has increased interest among the Natural Language Processing and Text Mining communities. This paper focuses on building an extractive summarization system that combines principal component analysis for dimensionality reduction with a bidirectional Recurrent Neural Network and Long Short-Term Memory (RNN-LSTM) deep learning model, producing short and exact synopses using a seq2seq model. It presents a paradigm shift in the way extractive summaries are generated. Novel algorithms for word extraction using assertions are proposed. The semantic framework is well grounded in this research, supporting correct decision making after reviewing a huge number of online reviews and taking all their important features into account. The advantages of the proposed solution include greater computational efficiency, better inferences from social media, data understanding, robustness and the handling of sparse data. In experiments on different datasets, the approach outperforms previous research, and its accuracy is reported to exceed the baselines, showing the efficiency and novelty of the work. The comparisons are made by calculating accuracy against different baselines using the ROUGE tool.
Global changes have taken place at breakneck speed in many fields along with the Web 2.0 era, which can be described as the new Internet trend. Web pages, which once had a static structure, have become dynamic pages created by users, and in this respect they can be said to have been democratized as they evolved. Social media, which took shape alongside this era, provide a large data flow for businesses and so present new opportunities for creating effective strategies. Today's Internet environment contains many blogs that include customers' ideas about the products and services they own. This environment, which in a way globalizes customer ideas, is a new medium suitable for examination because it increases business-customer interaction and because of its role as a carrier: it provides businesses with text data that may be analyzed in the field of Customer Relationship Management. Businesses should therefore follow blog environments to see how the products and services they provide are received from the customer's point of view, and should treat this as an important task on which they can conduct effective analyses. For this purpose, the study proposes a model that assigns polarity to the ideas expressed on Turkish blogs. Opinion mining methods were used in the model, and a methodology was devised that assigns the text-based opinion data on Turkish blogs to polarity poles in order to obtain a general view of products and services. The success of the model's pole assignment is evaluated with the precision measure.
The proliferation of forums and blogs leads to challenges and opportunities for processing large amounts of information. The information shared on various topics often contains opinionated words which are qualitative in nature. These qualitative words need statistical computations to convert them into useful quantitative data. This data should be processed properly since it expresses opinions. Each of these opinion-bearing words differs based on the significant meaning it conveys. To process the linguistic meaning of words into data and to enhance opinion mining analysis, we propose a novel weighting scheme, referred to as inferred word weighting (IWW). IWW is computed based on the significance of the word in the document (SWD) and the significance of the word in the expression (SWE). The proposed weighting methods give an analytic view and assign more appropriate weights to the words than existing methods. In addition to the new weighting methods, the effect of including stop-words on text classification performance is examined. Generally, stop-words are removed in text processing. When stop-words are included with the proposed and existing weighting methods, two facts are observed: (1) classification performance is enhanced; (2) the difference in outcome between including and excluding stop-words is smaller for the proposed methods and larger for existing methods. The inferences drawn from these observations are discussed. Experimental results on benchmark data sets show the potential enhancement in terms of classification accuracy.
Sentiment lexicons (SL), also known as lexical resources, are repositories of one or several dictionaries that consist of known and precompiled sentiment terms. These lexicons play an important role in performing several different opinion mining tasks. The efficacy of lexicon-based approaches to opinion mining (OM) tasks depends solely on selecting an appropriate opinion lexicon to analyze the text. Therefore, one has to explore the available sentiment lexicons and then select the most suitable resource. Among the available resources, SentiWordNet (SWN) is the most widely used lexicon for tasks related to opinion mining. In SWN, each synset of WordNet is assigned three numerical sentiment scores (positive, negative and objective) that are calculated by a set of classifiers. In this paper, a detailed and comprehensive review of the work related to opinion mining using SentiWordNet is provided. This survey will be useful for researchers contributing to the field of opinion mining. The following features make our contribution worthwhile and unique among reviews of a similar kind: (i) our review classifies the existing literature with respect to opinion mining tasks and subtasks; (ii) it offers a different outlook on the opinion mining field by providing in-depth discussions of the existing works at different granularity levels (word, sentence, document, aspect, clause, and concept levels); (iii) this state-of-the-art review covers each article along the following dimensions: the designated task performed, the granularity level of the task completed, the results obtained, and the feature dimensions; and (iv) lastly, it summarizes the related articles according to granularity levels, publishing years, related tasks (or subtasks), and types of classifiers used. In the end, major challenges and tasks related to lexicon-based approaches to opinion mining are also discussed.
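As a rough illustration of how a lexicon with positive, negative and objective scores can drive a polarity decision, the sketch below uses a tiny hand-made lexicon. The entries and their values in `TOY_LEXICON` are hypothetical and are not taken from SentiWordNet, and the aggregation rule (sum the positive and negative scores of matched tokens, then compare) is only one simple choice among many. The objective score is derived as one minus the other two, following the convention that the three scores of a SentiWordNet synset sum to one.

```python
# Hypothetical (positive, negative) scores per word; NOT real SentiWordNet values.
TOY_LEXICON = {
    "good": (0.75, 0.0),
    "reliable": (0.5, 0.0),
    "bad": (0.0, 0.625),
    "slow": (0.0, 0.5),
}


def objective_score(pos, neg):
    """Objective score under the assumption that pos + neg + obj = 1."""
    return 1.0 - pos - neg


def classify(tokens, lexicon=TOY_LEXICON):
    """Label a token list by comparing summed positive and negative scores."""
    pos_total = sum(lexicon[t][0] for t in tokens if t in lexicon)
    neg_total = sum(lexicon[t][1] for t in tokens if t in lexicon)
    if pos_total > neg_total:
        return "positive"
    if neg_total > pos_total:
        return "negative"
    return "neutral"  # no lexicon hits, or a tie


label = classify("good and reliable but slow".split())
```

A real system would also need word-sense disambiguation, since SentiWordNet scores are attached to synsets rather than to surface words, plus handling of negation and intensifiers.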
Nowadays, the Internet has penetrated into all aspects of people's lives. A large number of online customer reviews have accumulated in product forums, and these are valuable resources for analysis. However, customer reviews are unstructured textual data containing many ambiguities, so analyzing them is a challenging task. At present, effective deep semantic or fine-grained analysis of customer reviews is rare in the literature, and the analysis quality of most studies is low. Therefore, this paper introduces a fine-grained opinion mining method that extracts the detailed semantic information of opinions from multiple perspectives and aspects of Chinese automobile reviews. The method uses the conditional random field (CRF) model, with semantic roles divided into two groups. One group relates to the objects being reviewed and includes the roles of manufacturer, brand, type, and aspects of cars. The other group concerns the opinions about those objects and includes the sentiment description, the aspect value, the conditions of opinions and the sentiment tendency. The overall framework of the method comprises three major steps. The first step distinguishes relevant sentences from irrelevant sentences in the reviews. In the second step, the relevant sentences are further classified into different aspects. In the third step, fine-grained semantic roles are extracted from the sentences of each aspect. The data used in training is manually annotated at the fine granularity of semantic roles. The features used in the CRF model include basic word features, part-of-speech (POS) features, position features and dependency syntactic features. Different combinations of these features are investigated. Experimental results are analyzed and future directions are discussed.
The Internet provides a large number of tools and resources, such as social media sites, online newsgroups, blogs, electronic forums, virtual communities, and online travel sites, for consumers to express their views or opinions on various issues. These opinions can help organizations, such as those in tourism, to improve their products and services for their consumers. Opinion mining refers to the process of identifying emotions by applying Natural Language Processing (NLP) techniques to a pool of texts. This paper focuses on mining public opinion in the hotel reviews domain. To do so, we propose a novel technique called the Attention-Based Long Short Term Memory (Attention-LSTM) Network using a transfer learning approach. We empirically analyzed several machine learning and deep learning methods and observed that our proposed technique provided adequate performance for mining public opinion in the hotel reviews domain.
Social media is an essential component of our personal and professional lives. We use it extensively to share many things, including our opinions on daily topics and our feelings about different subjects. These posts provide insight into a person's current emotions. In artificial intelligence (AI) and deep learning (DL), researchers emphasize opinion mining and sentiment analysis, particularly on social media platforms such as Twitter (currently known as X), which has a global user base. This research compares two popular approaches: the lexicon-based approach and the deep learning-based approach. The study uses a Twitter dataset called sentiment140, which contains over 1.5 million data points. The primary focus was the Long Short-Term Memory (LSTM) deep learning sequence model. First, we preprocessed the data. The dataset was then divided into training and test data, and we evaluated the performance of our model on the test data. We also applied the lexicon-based approach to the same test data and recorded the outputs. Finally, we compared the two approaches by creating confusion matrices from their respective outputs. This allows us to assess their precision, recall, and F1-score, and so to determine which approach yields better accuracy. This research achieved 98% model accuracy for the deep learning algorithm and 95% model accuracy for the lexicon-based approach.
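The comparison step described in this abstract, deriving precision, recall and F1-score from a confusion matrix, can be sketched as follows for the binary case. The function names and the toy label vectors are illustrative assumptions; they are not the study's code or data.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary task."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def precision_recall_f1(tp, fp, fn):
    """Return (precision, recall, F1), using 0.0 when a ratio is undefined."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical gold labels and predictions (1 = positive, 0 = negative).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 1, 0, 1, 0]
tp, fp, fn, tn = confusion_counts(y_true, y_pred)
precision, recall, f1 = precision_recall_f1(tp, fp, fn)
```

Running the same two functions on the outputs of each approach over an identical test set gives directly comparable scores, which is the point of evaluating both on the same split.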
Blog opinion retrieval aims to find blogs with opinionated information related to a given topic. Its main problem is computing the opinion score, which balances topic relevance and opinion relevance. To deal with this problem, a generative model derived by a Bayesian approach is proposed, and an improved mixture model is proposed to estimate the opinion relevance between a blog and a given topic in our retrieval framework. Moreover, pointwise mutual information is used to expand sentiment words for different topics based on a general sentiment lexicon. The correlation between topic and candidate words is applied both in expanding sentiment words and in estimating sentence opinion scores. Experimental results show that the proposed approaches improve upon the state-of-the-art opinion retrieval method on the TREC2010 dataset.
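Pointwise mutual information, which this abstract uses to expand sentiment words per topic, measures how much more often two words co-occur than independence would predict: PMI(a, b) = log2(p(a, b) / (p(a) p(b))). The sketch below estimates it from sentence-level co-occurrence. The estimation choices (sentence as the co-occurrence window, no smoothing) and the toy sentences are assumptions, not the paper's procedure.

```python
import math
from collections import Counter
from itertools import combinations


def pmi_table(sentences):
    """Map each co-occurring word pair (sorted tuple) to its PMI in bits."""
    n = len(sentences)
    word_counts = Counter()
    pair_counts = Counter()
    for sent in sentences:
        vocab = sorted(set(sent))  # count each word once per sentence
        word_counts.update(vocab)
        pair_counts.update(combinations(vocab, 2))
    return {
        (a, b): math.log2((count / n) / ((word_counts[a] / n) * (word_counts[b] / n)))
        for (a, b), count in pair_counts.items()
    }


def pmi(table, a, b):
    """Look up PMI regardless of argument order; None if never co-occurring."""
    return table.get(tuple(sorted((a, b))))


# Hypothetical tokenized sentences about two topics.
sentences = [
    ["camera", "sharp", "good"],
    ["camera", "sharp"],
    ["battery", "weak"],
    ["battery", "good"],
]
table = pmi_table(sentences)
```

In an expansion setting, candidate words with high PMI against both a topic term and a seed sentiment word would be added to the topic-specific lexicon; pairs that never co-occur have undefined PMI here and are simply absent from the table.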
Community-based churn prediction, the task of recognising the influence of a customer's community on churn, has become an important concern for firms in many different industries. Until recently, churn prediction has focused only on transactional datasets (the targeted approach), while the untargeted approach, which draws on product advisement, digital marketing and the opinions customers express on social media such as Twitter, has not been fully harnessed, even though this data source has become an important influencing factor with a lasting impact on churn management. Since Social Network Analysis (SNA) has become a blended approach to churn prediction and management in the modern era, customers residing online predominantly and collectively determine the momentum of churn prediction, retention and decision support. In existing SNA approaches, customers are classified as churner or non-churner (1 or 0). Often, the customer's opinion is neglected and the network structure of community members is not exploited. Consequently, the pattern of customers' opinions and their influence on related members of the community are not analysed. This research therefore developed a Churn Service Information Graph (CSIG) to define a quadruple churn category (churner, potential churner, inertia customer, premium customer) for non-opinionated customers via the power of relative affinity around opinionated customers in a direct node-to-node SNA. The essence is to use data mining techniques to investigate the patterns of opinion between people in a network or group. Consequently, every member of the online social network community is dynamically classified into a churn category for improved targeted customer acquisition, retention and decision support in churn management.
Movies are a major source of entertainment. Every year, a great number of movies are released. People comment on movies in the form of reviews after watching them. Since it is difficult to read all of the reviews for a movie, summarizing them helps people reach a decision without spending time reading every review. Opinion mining, also known as sentiment analysis, is the process of extracting subjective information from textual data. It involves identifying and extracting the opinions of individuals, which can be positive, neutral, or negative, and is performed here to understand people's emotions and attitudes in movie reviews. Movie reviews are an important source of opinion data because they provide insight into the general public's opinions about a particular movie, and a summary of all reviews can give a general idea of the movie. This study compares the baseline techniques Logistic Regression, Random Forest Classifier, Decision Tree, K-Nearest Neighbor, Gradient Boosting Classifier, and Passive Aggressive Classifier with Linear Support Vector Machines and Multinomial Naïve Bayes on the IMDB Dataset of 50K reviews and the Sentiment Polarity Dataset Version 2.0. Before these classifiers are applied, both datasets are cleaned in pre-processing, duplicate data is dropped and chat words are treated for better results. On the IMDB Dataset of 50K reviews, Linear Support Vector Machines achieve the highest accuracy of 89.48%, and after hyperparameter tuning the Passive Aggressive Classifier achieves the highest accuracy of 90.27%, while on the Sentiment Polarity Dataset Version 2.0 Multinomial Naïve Bayes achieves the highest accuracy, 70.69% before and 71.04% after hyperparameter tuning. This study highlights the importance of sentiment analysis as a tool for understanding the emotions and attitudes in movie reviews and predicts the performance of a movie based on the average sentiment of all its reviews.
Sentiment analysis of online reviews and other user-generated content is an important research problem because of its wide range of applications. In this paper, we propose a feature-based vector model and a novel weighting algorithm for sentiment analysis of Chinese product reviews. Specifically, an opinionated document is modeled by a set of feature-based vectors and corresponding weights. Unlike previous work, our model considers modifying relationships between words and contains rich descriptions of sentiment strength, which are represented by adverbs of degree and punctuation. Dependency parsing is applied to construct the feature vectors. A novel feature weighting algorithm is proposed for supervised sentiment classification based on rich information related to sentiment strength. The experimental results demonstrate the effectiveness of the proposed method compared with a state-of-the-art method using term-level weighting algorithms.
Sentiment analysis is attracting the attention of Egyptian decision makers in the education sector. It offers a viable method for assessing the quality of education services based on students' feedback, and it provides an understanding of their needs. As machine learning techniques offer automated strategies for processing big data derived from social media and other digital channels, this research uses a dataset of tweet sentiments to assess several machine learning techniques. After dataset preprocessing to remove symbols, the necessary stemming and lemmatization are performed for feature extraction. Several machine learning techniques and a proposed Long Short-Term Memory (LSTM) classifier optimized by the Salp Swarm Algorithm (SSA) are then applied and their performance is measured. The validity and accuracy of commonly used classifiers, namely Support Vector Machine (SVM), Logistic Regression (LR), and Naive Bayes (NB), were reviewed, and the SSA-based LSTM classification model was compared with them. Finally, as the SSA-based LSTM achieved the highest accuracy, it was applied to predict the sentiments of students' feedback and to evaluate their association with the course outcome evaluations for education quality purposes.
Currently, sentiment analysis research in the Malaysian context is held back by the limited availability of sentiment lexicons. This paper addresses that issue in order to enhance the accuracy of sentiment analysis. In this study, a new lexicon for sentiment analysis is constructed. A detailed review of existing approaches has been conducted, and a new bilingual sentiment lexicon known as MELex (Malay-English Lexicon) has been generated. Constructing MELex involves three activities: seed word selection, polarity assignment, and synonym expansion. Our approach differs from previous works in that MELex can analyze text in the two most widely used languages in Malaysia, Malay and English, with an achieved accuracy of 90%. It is evaluated through experimentation and a case study in which affordable housing projects in Malaysia are selected as case projects. This finding has implications for the ability of MELex to analyze public sentiment in the Malaysian context. The novel aspects of this paper are two-fold. First, it introduces a new technique for assigning the polarity score, and second, it improves performance on the classification of mixed-language content.
The sentiment of a text depends on the clausal structure of the sentence and the discourse arguments of the connectives. In this work, the clause boundary, the discourse argument, and the syntactic and semantic information of the sentence are used to assign the text's sentiment. The clause boundaries identify the span of the text, and the discourse connectives identify the arguments. Since the lexicon-based analysis used in traditional sentiment analysis can assign the wrong sentiment to a sentence, a deeper-level semantic analysis is required for the correct analysis of sentiments. Hence, in this study, explicit connectives in Malayalam are considered to identify the discourse arguments. A supervised method, conditional random fields, is used to identify the clause boundary and discourse arguments. For the study, 1,000 sentiment sentences from Malayalam documents were analyzed. Experimental results show that integrating discourse structure considerably improves sentiment analysis performance over the baseline system.
In the field of sentiment analysis, extracting aspects or opinion targets from user reviews about a product is a key task. Extracting the polarity of an opinion is much more useful if we also know the targeted aspect or feature. Rule-based approaches, such as dependency-based rules, are quite popular and effective for this purpose. However, they depend heavily on the accuracy of the employed parts-of-speech (POS) tagger and dependency parser. Another popular rule-based approach is to use sequential rules, in which the rules are formulated by learning from the user's behavior. In general, however, sequential rule-based approaches have poor generalization capability. Moreover, existing approaches mostly consider an aspect to be a noun or noun phrase, so they are unable to extract verb aspects. In this article, we propose a multi-layered rule-based (ML-RB) technique that uses syntactic dependency parser-based rules along with selected sequential rules in separate layers to extract noun aspects. Additionally, after rigorous analysis, we have also constructed rules for the extraction of verb aspects. These verb rules are primarily based on the association between verbs and opinion words. The proposed multi-layer technique compensates for the weaknesses of individual layers and yields improved results on two publicly available customer review datasets. The F1 scores for the two datasets are 0.90 and 0.88, respectively, which are better than those of existing approaches. These improved results can be attributed to the application of sequential and syntactic rules in a layered manner as well as the capability to extract both noun and verb aspects.
Funding: The authors extend their appreciation to the Deanship of Scientific Research at King Khalid University for funding this work under grant number (RGP 2/142/43).
Abstract: Sentiment analysis, or opinion mining (OM), has become familiar owing to advances in networking technologies and social media. A massive amount of text is now generated over the Internet every day, which makes pattern recognition and decision making difficult. Since OM is useful in business sectors for improving the quality of products and services, machine learning (ML) and deep learning (DL) models can be applied to it. The hyperparameters of DL models, however, require proper adjustment to boost classification performance. Therefore, this paper develops a new Artificial Fish Swarm Optimization with Bidirectional Long Short Term Memory (AFSO-BLSTM) model for OM. The main aim of the AFSO-BLSTM model is to effectively mine the opinions present in textual data. The model first applies pre-processing and TF-IDF based feature extraction. A BLSTM model is then employed to detect and classify opinions. Finally, the AFSO algorithm is used to adjust the hyperparameters of the BLSTM model, which constitutes the novelty of the work. A complete simulation study of the AFSO-BLSTM model was carried out on a benchmark dataset, and the experimental values obtained revealed the high potential of the AFSO-BLSTM model for mining opinions.
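The TF-IDF feature extraction step mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not state which TF-IDF variant the authors use, so the code below assumes the textbook form (relative term frequency multiplied by log(N / document frequency)); the function name `tf_idf` and the toy reviews are hypothetical.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed variant: tf = count / document length, idf = ln(N / df).
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Toy corpus of three tokenized reviews (illustrative only).
docs = [
    "great phone great battery".split(),
    "poor battery".split(),
    "great screen".split(),
]
weights = tf_idf(docs)
for doc_weights in weights:
    print(doc_weights)
```

Terms that occur in fewer documents (such as "poor" or "screen" here) receive a larger weight than terms shared across documents, which is the property a downstream classifier relies on.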
Funding: Princess Nourah bint Abdulrahman University Researchers Supporting Project Number (PNURSP2022R263), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia. The authors would like to thank the Deanship of Scientific Research at Umm Al-Qura University for supporting this work by Grant Code: 22UQU4310373DSR56.
Abstract: Opinion Mining (OM) studies in Arabic are limited, even though it is one of the most widely spoken languages worldwide. Although interest in Arabic OM is growing among researchers, many more investigations are needed because of the unique morphological principles of the language. Arabic OM studies face multiple challenges owing to the scarcity of language resources and to Arabic-specific linguistic features. Comparative OM studies in English are extensive, but comparative OM studies in Arabic are yet to be established and remain at a nascent stage. The unique features of the Arabic language, such as diacritics, elongation, inflection, and word length, make it essential to expand the studies on Arabic text. The current study proposes a Political Optimizer with Probabilistic Neural Network-based Comparative Opinion Mining (POPNN-COM) model for Arabic text. The proposed POPNN-COM model aims to recognize comparative and non-comparative texts in Arabic in the context of social media. Initially, the POPNN-COM model applies different levels of data pre-processing to transform the input data into a useful format. Then, the pre-processed data is fed into the PNN model, which classifies it under different class labels. Finally, the PO algorithm is employed to fine-tune the parameters of the model to achieve enhanced results. The proposed POPNN-COM model was experimentally validated using two standard datasets, and the outcomes established its promising performance over other recent approaches.
Funding: Taif University Researchers Supporting Project Number (TURSP-2020/216), Taif University, Taif, Saudi Arabia.
Abstract: Sentiment analysis, or Opinion Mining (OM), has gained significant interest among research communities and entrepreneurs in recent years. Likewise, Machine Learning (ML) approaches are among the interesting research domains that are highly helpful and increasingly applied in several business domains. Against this background, the current research paper focuses on the design of an automated opinion mining model using the Deer Hunting Optimization Algorithm (DHOA) with a Fuzzy Neural Network (FNN), abbreviated as the DHOA-FNN model. The proposed DHOA-FNN technique involves four stages, namely preprocessing, feature extraction, classification, and parameter tuning. In addition, the proposed DHOA-FNN model has two stages of feature extraction, namely the GloVe and N-gram approaches. Moreover, the FNN model is utilized as the classification model, whereas GTOA is used for the optimization of parameters. The novelty of the current work is that the GTOA is designed to tune the parameters of the FNN model. An extensive range of simulations was carried out on the benchmark dataset and the results were examined under diverse measures. The experimental results highlighted the promising performance of the DHOA-FNN model over recent state-of-the-art techniques, with a maximum accuracy of 0.9928.
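As a rough illustration of the N-gram stage of feature extraction named in this abstract, the sketch below counts word unigrams and bigrams in a tokenized sentence. The abstract does not specify the N-gram order or how the counts are combined with GloVe vectors, so `max_n=2` and the helper names `ngrams` and `ngram_features` are assumptions made for this example.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-token windows as tuples (empty if n is too large)."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_features(tokens, max_n=2):
    """Count every n-gram of order 1..max_n in one tokenized text."""
    features = Counter()
    for n in range(1, max_n + 1):
        features.update(ngrams(tokens, n))
    return features


# Toy sentence (illustrative only); bigrams keep the negation "not good" together.
tokens = "not good , not good at all".split()
features = ngram_features(tokens)
print(features.most_common(3))
```

Bigrams such as ("not", "good") preserve local word order that a bag of single words would lose, which is one common motivation for adding N-gram features in opinion mining.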
Funding: Supported by the National Natural Science Foundation of China under Grants No. 61232010, No. 60903139, No. 60933005, No. 61202215, and No. 61100083; the National 242 Project under Grant No. 2011F65; and the China Information Technology Security Evaluation Center Program under Grant No. Z1277.
Abstract: This paper focuses on how to improve aspect-level opinion mining for online customer reviews. We first propose a novel generative topic model, the Joint Aspect/Sentiment (JAS) model, to jointly extract aspects and aspect-dependent sentiment lexicons from online customer reviews. An aspect-dependent sentiment lexicon refers to the aspect-specific opinion words along with their aspect-aware sentiment polarities with respect to a specific aspect. We then apply the extracted aspect-dependent sentiment lexicons to a series of aspect-level opinion mining tasks, including implicit aspect identification, aspect-based extractive opinion summarization, and aspect-level sentiment classification. Experimental results demonstrate the effectiveness of the JAS model in learning aspect-dependent sentiment lexicons and the practical value of the extracted lexicons when applied to these tasks.
Funding: This research was supported by Misr International University (MIU) (Grant Number DSA28211231302952) to Diaa Salama, https://www.miuegypt.edu.eg/.
Abstract: At present, the immense development of social networks generates a significant amount of textual data, which has enabled researchers to explore the field of opinion mining. However, processing textual opinions with the term frequency-inverse document frequency method gives rise to a dimensionality problem. This study aims to detect the nature of opinions in the Arabic language by employing a swarm intelligence (SI)-based algorithm, the Harris hawks algorithm, to select the most relevant terms. The experimental study was conducted on two datasets: Arabic Jordanian General Tweets and Opinion Corpus for Arabic. In terms of accuracy and number of features, the results are better than those of other SI-based algorithms, such as grey wolf optimizer and grasshopper optimization algorithm, and other algorithms in the literature, such as differential evolution, genetic algorithm, particle swarm optimization, basic and enhanced whale optimizer algorithm, salp swarm algorithm, and ant-lion optimizer.
Funding: Financial support from the Deanship of Scientific Research at King Faisal University, research grant number 216082.
Abstract: Opinion summarization automatically recapitulates the opinions about a common topic. The primary aim of summarization is to preserve the properties of the text while shortening it with no loss of semantics. The need for efficient automatic summarization has increased interest among the Natural Language Processing and Text Mining communities. This paper focuses on building an extractive summarization system that combines principal component analysis for dimensionality reduction with a bidirectional Recurrent Neural Network and Long Short-Term Memory (RNN-LSTM) deep learning model to produce short and exact synopses using a seq2seq model. It presents a paradigm shift in the way extractive summaries are generated. Novel algorithms for word extraction using assertions are proposed. The semantic framework is well grounded in this research, facilitating correct decision making after reviewing a huge number of online reviews while taking all their important features into account. The advantages of the proposed solution include greater computational efficiency, better inferences from social media, data understanding, robustness, and handling of sparse data. In experiments on different datasets, the approach also outperforms previous work, with accuracy reported to exceed the baselines, showing the efficiency and novelty of the research. The comparisons are made by calculating accuracy against different baselines using the ROUGE tool.
Abstract: Global changes have taken place at breakneck speed in many fields along with the Web 2.0 era, which can be regarded as the new Internet trend. Web pages, which once had a static structure, have become dynamic pages created by users, and in this respect they can be said to have been democratized. Social media, which took shape alongside this era, provide a large data flow for businesses and thereby present new and improvable opportunities for creating effective strategies. Today's Internet environment contains many blogs that include customers' ideas about the products and services they own. This environment, which in a way globalizes customer ideas, is a new medium suitable for examination because it increases business-customer interaction and, because it carries customer opinions, provides businesses with text data that can be analyzed in the field of Customer Relationship Management. Thus, businesses should follow blog environments to see how the products and services they provide are received from the customer's point of view, and this should be seen as an important task on which they can conduct effective analyses. For this purpose, this study proposes a model that assigns the opinions in Turkish blogs to poles. Opinion mining methods were used in the model, and, to obtain a general view of products and services, a methodology was devised that assigns the text-based opinion data in Turkish blogs to the poles. The success of the model's pole assignment is evaluated with the precision measure.
Abstract: The proliferation of forums and blogs leads to challenges and opportunities for processing large amounts of information. The information shared on various topics often contains opinionated words, which are qualitative in nature. These qualitative words need statistical computations to convert them into useful quantitative data. This data should be processed properly since it expresses opinions. Each of these opinion-bearing words differs in the significance of the meaning it conveys. To process the linguistic meaning of words into data and to enhance opinion mining analysis, we propose a novel weighting scheme, referred to as inferred word weighting (IWW). IWW is computed based on the significance of the word in the document (SWD) and the significance of the word in the expression (SWE) to enhance their performance. The proposed weighting methods give an analytic view and provide appropriate weights to the words compared to existing methods. In addition to the new weighting methods, the effect of including stop-words on text classification performance is also examined. Generally, stop-words are removed in text processing. When stop-words are included in the proposed and existing weighting methods, two facts are observed: (1) classification performance is enhanced; (2) the difference in outcome between inclusion and exclusion of stop-words is smaller in the proposed methods and larger in existing methods. The inferences drawn from these observations are discussed. Experimental results on the benchmark data sets show the potential enhancement in terms of classification accuracy.
Funding: This work was supported by the Department of Computer Science & IT, The Islamia University of Bahawalpur, Pakistan, in collaboration with Laboratoire Informatique, Image et Interaction (L3i), University of La Rochelle, France.
Abstract: Sentiment lexicons (SL), also known as lexical resources, are repositories of one or several dictionaries that consist of known and precompiled sentiment terms. These lexicons play an important role in several different opinion mining tasks. The efficacy of lexicon-based approaches to opinion mining (OM) tasks depends solely on selecting an appropriate opinion lexicon to analyze the text. Therefore, one has to explore the available sentiment lexicons and then select the most suitable resource. Among the available resources, SentiWordNet (SWN) is the most widely used lexicon for tasks related to opinion mining. In SWN, each synset of WordNet is assigned three numerical sentiment scores (positive, negative, and objective) that are calculated by a set of classifiers. In this paper, a detailed and comprehensive review of the work related to opinion mining using SentiWordNet is provided in a distinctive way. This survey will be useful for researchers contributing to the field of opinion mining. The following features make our contribution worthwhile and unique among reviews of a similar kind: (i) our review classifies the existing literature with respect to opinion mining tasks and subtasks; (ii) it covers a very different outlook on the opinion mining field by providing in-depth discussions of the existing works at different granularity levels (word, sentence, document, aspect, clause, and concept levels); (iii) this state-of-the-art review covers each article along the following dimensions: the designated task performed, the granularity level of the task, the results obtained, and the feature dimensions; and (iv) lastly, it summarizes the related articles according to granularity levels, publishing years, related tasks (or subtasks), and types of classifiers used. In the end, major challenges and tasks related to lexicon-based approaches to opinion mining are also discussed.
Funding: The National Natural Science Foundation of China (No. 61375053) and the Project of Shanghai University of Finance and Economics (Nos. 2018110565 and 2016110743).
Abstract: Nowadays, the Internet has penetrated all aspects of people's lives. A large number of online customer reviews have accumulated in product forums, and they are valuable resources for analysis. However, these customer reviews are unstructured textual data containing many ambiguities, so analyzing them is a challenging task. At present, effective deep semantic or fine-grained analysis of customer reviews is rare in the existing literature, and the analysis quality of most studies is also low. Therefore, this paper introduces a fine-grained opinion mining method to extract the detailed semantic information of opinions from multiple perspectives and aspects from Chinese automobile reviews. The method uses the conditional random field (CRF) model, in which semantic roles are divided into two groups. One group relates to the objects being reviewed and includes the roles of manufacturer, brand, type, and aspects of cars. The other group concerns the opinions about the objects and includes the sentiment description, the aspect value, the conditions of opinions, and the sentiment tendency. The overall framework of the method includes three major steps. The first step distinguishes the relevant sentences from the irrelevant sentences in the reviews. In the second step, the relevant sentences are further classified into different aspects. In the third step, fine-grained semantic roles are extracted from the sentences of each aspect. The data used in the training process is manually annotated with fine-grained semantic roles. The features used in this CRF model include basic word features, part-of-speech (POS) features, position features, and dependency syntactic features. Different combinations of these features are investigated. Experimental results are analyzed and future directions are discussed.
Abstract: The Internet provides a large number of tools and resources, such as social media sites, online newsgroups, blogs, electronic forums, virtual communities, and online travel sites, for consumers to express their views or opinions on various issues. These opinions can help organizations, such as those in tourism, to improve their products and services for their consumers. Opinion mining refers to the process of identifying emotions by applying Natural Language Processing (NLP) techniques to a pool of texts. This paper mainly focuses on mining public opinion from the hotel reviews domain. To do so, we propose a novel technique called the Attention-Based Long Short Term Memory (Attention-LSTM) Network using a transfer learning approach. We empirically analyzed several machine learning and deep learning methods and observed that our proposed technique provided adequate performance for mining public opinion in the hotel reviews domain.
Abstract: Social media is an essential component of our personal and professional lives. We use it extensively to share various things, including our opinions on daily topics and feelings about different subjects. This sharing of posts provides insights into someone’s current emotions. In artificial intelligence (AI) and deep learning (DL), researchers emphasize opinion mining and sentiment analysis, particularly on social media platforms such as Twitter (currently known as X), which has a global user base. This research work compares two popular approaches: the lexicon-based approach and the deep learning-based approach. To conduct this study, we used a Twitter dataset called sentiment140, which contains over 1.5 million data points. The primary focus was the Long Short-Term Memory (LSTM) deep learning sequence model. First, we preprocessed the data. The dataset was then divided into training and test data, and we evaluated the performance of our model using the test data. We also applied the lexicon-based approach to the same test data and recorded the outputs. Finally, we compared the two approaches by creating confusion matrices based on their respective outputs. This allows us to assess their precision, recall, and F1-score, enabling us to determine which approach yields better accuracy. This research achieved 98% model accuracy for the deep learning approach and 95% model accuracy for the lexicon-based approach.
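The comparison described in this abstract rests on confusion-matrix counts, from which precision, recall, and F1-score follow by standard definitions. The sketch below computes them for a binary sentiment task; the function name `binary_metrics` and the toy label vectors are hypothetical and are not taken from the sentiment140 experiments.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute confusion-matrix counts and derived scores for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    # Guard every ratio against an empty denominator.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return {"tp": tp, "fp": fp, "fn": fn, "tn": tn,
            "precision": precision, "recall": recall,
            "f1": f1, "accuracy": accuracy}


# Toy gold labels and predictions (1 = positive, 0 = negative), illustrative only.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

Running both a lexicon-based classifier and an LSTM on the same test labels and passing each set of predictions through such a function would give directly comparable scores.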
Funding: Supported by the National Natural Science Foundation of China (61370137, 61672098, 61272361) and the Ministry of Education-China Mobile Research Foundation Project (2015/5-9, 2016/2-7).
Abstract: Blog opinion retrieval aims to find blogs with opinionated information related to a given topic. Its main problem is computing the opinion score, which balances topic relevance and opinion relevance. To deal with this problem, a generative model derived by a Bayesian approach is proposed, and an improved mixture model is proposed to estimate the opinion relevance between a blog and a given topic in our retrieval framework. Moreover, pointwise mutual information is used to expand sentiment words for different topics based on a general sentiment lexicon. The correlation between topic and candidate words is applied both in expanding sentiment words and in estimating sentence opinion scores. Experimental results show that the proposed approaches improve upon the state-of-the-art opinion retrieval method on the TREC2010 dataset.
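Pointwise mutual information (PMI), which this abstract uses to expand sentiment words per topic, can be sketched as follows. The abstract does not give the exact estimation procedure, so this example assumes sentence-level co-occurrence counts and base-2 logarithms; the function name `pmi_scores` and the toy sentences are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def pmi_scores(sentences):
    """Return {(word_a, word_b): PMI} for word pairs that co-occur in a sentence.

    Probabilities are estimated as the fraction of sentences containing
    the word (or both words); pairs are keyed in alphabetical order.
    """
    n = len(sentences)
    word_counts = Counter()
    pair_counts = Counter()
    for sentence in sentences:
        vocab = sorted(set(sentence))
        word_counts.update(vocab)
        pair_counts.update(combinations(vocab, 2))
    return {
        (a, b): math.log2((count / n) /
                          ((word_counts[a] / n) * (word_counts[b] / n)))
        for (a, b), count in pair_counts.items()
    }


# Toy tokenized sentences (illustrative only).
sentences = [
    ["battery", "lasting"],
    ["battery", "lasting"],
    ["screen", "bright"],
    ["battery", "bright"],
]
scores = pmi_scores(sentences)
for pair, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(pair, round(score, 3))
```

A candidate word with high PMI against a topic word (here "lasting" with "battery") co-occurs with it more often than chance would predict, which is the signal one could use to add topic-specific sentiment words to a general lexicon.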
Abstract: Community-based churn prediction, the task of recognising the influence of a customer’s community on churn, has become an important concern for firms in many different industries. Until recently, churn prediction focused only on transactional datasets (the targeted approach), while the untargeted approach, through product advisement, digital marketing, and customers’ opinions expressed on social media such as Twitter, has not been fully harnessed, although this data source has become an important influencing factor with a lasting impact on churn management. Since Social Network Analysis (SNA) has become a blended approach to churn prediction and management in the modern era, customers who are predominantly online collectively determine the momentum of churn prediction, retention, and decision support. In existing SNA approaches, customers are classified as churner or non-churner (1 or 0). Often, the customer’s opinion is also neglected and the network structure of community members is not exploited. Consequently, the pattern and influence of customers’ opinions on related members of the community are not analysed. This research therefore developed a Churn Service Information Graph (CSIG) to define a quadruple churn category (churner, potential churner, inertia customer, premium customer) for non-opinionated customers via the power of relative affinity around opinionated customers on a direct node-to-node SNA. The essence is to use data mining techniques to investigate the patterns of opinion between people in a network or group. As a result, every member of the online social network community is dynamically classified into a churn category for improved targeted customer acquisition, retention, and decision support in churn management.
Abstract: Movies are a major source of entertainment. Every year, a great number of movies are released, and people comment on them in the form of reviews after watching them. Since it is difficult to read all of the reviews for a movie, summarizing them helps viewers decide without spending time reading every review. Opinion mining, also known as sentiment analysis, is the process of extracting subjective information from textual data. It involves identifying and extracting the opinions of individuals, which can be positive, neutral, or negative, and is performed here to understand people’s emotions and attitudes in movie reviews. Movie reviews are an important source of opinion data because they provide insight into the general public’s opinions about a particular movie. The summary of all reviews can give a general idea about the movie. This study compares baseline techniques (Logistic Regression, Random Forest Classifier, Decision Tree, K-Nearest Neighbor, Gradient Boosting Classifier, and Passive Aggressive Classifier) with Linear Support Vector Machines and Multinomial Naïve Bayes on the IMDB Dataset of 50K reviews and Sentiment Polarity Dataset Version 2.0. Before these classifiers are applied, both datasets are cleaned in pre-processing, duplicate data is dropped, and chat words are treated for better results. On the IMDB Dataset of 50K reviews, Linear Support Vector Machines achieve the highest accuracy of 89.48%, and after hyperparameter tuning, the Passive Aggressive Classifier achieves the highest accuracy of 90.27%, while on the Sentiment Polarity Dataset Version 2.0, Multinomial Naïve Bayes achieves the highest accuracy of 70.69%, and 71.04% after hyperparameter tuning. This study highlights the importance of sentiment analysis as a tool for understanding the emotions and attitudes in movie reviews and predicts the performance of a movie based on the average sentiment of all the reviews.
Funding: This work was supported in part by the National Natural Science Foundation of China under Grant No. 60970052, the Beijing Natural Science Foundation under Grant No. 4133084, the Beijing Educational Committee Science and Technology Development Plan under Grant No. KM201410028017, and the Beijing Key Disciplines of Computer Application Technology.
Abstract: Sentiment analysis of online reviews and other user-generated content is an important research problem because of its wide range of applications. In this paper, we propose a feature-based vector model and a novel weighting algorithm for sentiment analysis of Chinese product reviews. Specifically, an opinionated document is modeled by a set of feature-based vectors and corresponding weights. Unlike previous work, our model considers modifying relationships between words and contains rich sentiment strength descriptions, which are represented by adverbs of degree and punctuation. Dependency parsing is applied to construct the feature vectors. A novel feature weighting algorithm is proposed for supervised sentiment classification based on rich sentiment strength related information. The experimental results demonstrate the effectiveness of the proposed method compared with a state-of-the-art method using term-level weighting algorithms.
Abstract: Sentiment analysis has attracted the attention of Egyptian decision makers in the education sector. It offers a viable method for assessing the quality of education services based on students’ feedback and provides an understanding of their needs. As machine learning techniques offer automated strategies for processing big data derived from social media and other digital channels, this research uses a dataset of tweet sentiments to assess several machine learning techniques. After the dataset is pre-processed to remove symbols, stemming and lemmatization are performed for feature extraction. Several machine learning techniques and a proposed Long Short-Term Memory (LSTM) classifier optimized by the Salp Swarm Algorithm (SSA) are then applied and their performance is measured. The validity and accuracy of commonly used classifiers, namely Support Vector Machine (SVM), Logistic Regression (LR), and Naive Bayes (NB), were reviewed, and the SSA-based LSTM classification model was compared with them. Finally, as the SSA-based LSTM achieved the highest accuracy, it was applied to predict the sentiments of students’ feedback and to evaluate their association with course outcome evaluations for education quality purposes.
Abstract: Currently, sentiment analysis research in the Malaysian context is limited by the lack of an available sentiment lexicon. This paper addresses that issue in order to enhance the accuracy of sentiment analysis. In this study, a new lexicon for sentiment analysis is constructed. A detailed review of existing approaches has been conducted, and a new bilingual sentiment lexicon known as MELex (Malay-English Lexicon) has been generated. Constructing MELex involves three activities: seed word selection, polarity assignment, and synonym expansion. Our approach differs from previous works in that MELex can analyze text in the two most widely used languages in Malaysia, Malay and English, with an achieved accuracy of 90%. It is evaluated through experimentation and a case study in which affordable housing projects in Malaysia are selected as case projects. This finding indicates the ability of MELex to analyze public sentiment in the Malaysian context. The novel aspects of this paper are two-fold. First, it introduces a new technique for assigning the polarity score, and second, it improves performance on the classification of mixed-language content.
Abstract: The sentiment of a text depends on the clausal structure of the sentence and the discourse arguments of the connectives. In this work, the clause boundary, discourse argument, and syntactic and semantic information of the sentence are used to assign the text’s sentiment. The clause boundaries identify the span of the text, and the discourse connectives identify the arguments. Since the lexicon-based analysis of traditional sentiment analysis gives the wrong sentiment for the sentence, a deeper-level semantic analysis is required for the correct analysis of sentiments. Hence, in this study, explicit connectives in Malayalam are considered to identify the discourse arguments. A supervised method, conditional random fields, is used to identify the clause boundary and discourse arguments. For the study, 1,000 sentiment sentences from Malayalam documents were analyzed. Experimental results show that integrating discourse structure considerably improves sentiment analysis performance over the baseline system.
Abstract: In the field of sentiment analysis, extracting aspects or opinion targets from user reviews about a product is a key task. Extracting the polarity of an opinion is much more useful if we also know the targeted aspect or feature. Rule-based approaches, such as dependency-based rules, are quite popular and effective for this purpose. However, they depend heavily on the accuracy of the employed part-of-speech (POS) tagger and dependency parser. Another popular rule-based approach is to use sequential rules, in which the rules are formulated by learning from the user’s behavior. In general, however, sequential rule-based approaches have poor generalization capability. Moreover, existing approaches mostly consider an aspect to be a noun or noun phrase, so they are unable to extract verb aspects. In this article, we propose a multi-layered rule-based (ML-RB) technique that uses syntactic dependency parser based rules along with some selective sequential rules in separate layers to extract noun aspects. Additionally, after rigorous analysis, we have also constructed rules for the extraction of verb aspects. These verb rules are primarily based on the association between verbs and opinion words. The proposed multi-layer technique compensates for the weaknesses of individual layers and yields improved results on two publicly available customer review datasets. The F1 scores for the two datasets are 0.90 and 0.88, respectively, which are better than those of existing approaches. These improved results can be attributed to the application of sequential and syntactic rules in a layered manner as well as the capability to extract both noun and verb aspects.