The number of blogs and other forms of opinionated online content has increased dramatically in recent years. Many fields, including academia and national security, place an emphasis on automated detection of the orientation of political articles. Political articles (especially in the Arab world) differ from other articles in their subjectivity: the author's beliefs and political affiliation can significantly influence the text. With categories representing the main political ideologies, this problem can be treated as a special case of text categorization (classification). In general, the performance of machine learning models for text classification is sensitive to hyperparameter settings. Furthermore, the feature vector used to represent a document must capture, to some extent, the complex semantics of natural language. To this end, this paper presents an intelligent system for detecting the orientation of political Arabic articles that adapts the categorical boosting (CatBoost) method and combines it with a multi-level feature concept. Extracting features at multiple levels can enhance the model's ability to discriminate between classes or patterns, since each level may capture different aspects of the input data and so contributes to a more comprehensive representation. CatBoost, a robust and efficient gradient-boosting algorithm, is used to learn the complex relationships between these features and the political orientation labels associated with the articles. The proposed technique is assessed on a dataset of political Arabic texts collected from diverse sources, including postings and articles. The texts are labeled with three categories: conservative, reform, and revolutionary. The results show that, compared with other frequently used machine learning models for text classification, the CatBoost method with multi-level features performs better, reaching an accuracy of 98.14%.
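The abstract does not say which feature levels the authors use, so the following is only a minimal sketch of the general idea, assuming three hypothetical levels: character trigrams, word unigrams and bigrams, and a few document-level statistics. The function names and the choice of levels are illustrative assumptions, not the paper's method. The resulting dictionary could then be vectorized and passed to a gradient-boosting classifier.

```python
from collections import Counter


def char_ngrams(text, n=3):
    """Count character n-grams after collapsing runs of whitespace."""
    compact = " ".join(text.split())
    return Counter(compact[i:i + n] for i in range(len(compact) - n + 1))


def word_ngrams(text, n=1):
    """Count word n-grams over whitespace-separated tokens."""
    tokens = text.split()
    return Counter(" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def document_stats(text):
    """Coarse document-level statistics (hypothetical third level)."""
    tokens = text.split()
    if not tokens:
        return {"n_tokens": 0.0, "avg_token_len": 0.0, "type_token_ratio": 0.0}
    return {
        "n_tokens": float(len(tokens)),
        "avg_token_len": sum(len(t) for t in tokens) / len(tokens),
        "type_token_ratio": len(set(tokens)) / len(tokens),
    }


def multi_level_features(text):
    """Merge all levels into one sparse feature dict with level-prefixed keys."""
    features = {}
    for gram, count in char_ngrams(text, 3).items():
        features[f"char3={gram}"] = float(count)
    for n in (1, 2):
        for gram, count in word_ngrams(text, n).items():
            features[f"word{n}={gram}"] = float(count)
    for name, value in document_stats(text).items():
        features[f"doc:{name}"] = value
    return features
```

Prefixing each key with its level keeps the levels from colliding, for example when a character trigram and a word bigram happen to be the same string.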
With the rapid growth of the Internet in recent years, the ability to analyze and identify its users has become increasingly important. Authorship analysis provides a means to glean information about the author of a document originating from the Internet or elsewhere, including but not limited to the author's gender. There are well-known linguistic differences between the writing of men and women, and these differences can be used to predict the gender of a document's author. Capitalizing on these linguistic nuances, this study uses a set of stylometric features and a set of word count features for automatic gender discrimination on emails from the popular Enron email dataset. These features are used in conjunction with the Modified Balanced Winnow Neural Network proposed by Carvalho and Cohen, an improvement on the original Balanced Winnow created by Littlestone. Experiments show that the Modified Balanced Winnow can discriminate gender using both stylometric and word count features, with the word count features providing superior results.
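For readers unfamiliar with the algorithm family, here is a minimal sketch of the basic Balanced Winnow update for binary features. It is not the Modified Balanced Winnow used in the study, which adds further refinements not reproduced here. The class name, the default values of `alpha`, `beta`, and `threshold`, and the initial weights are assumptions chosen for illustration.

```python
from collections import defaultdict


class BalancedWinnow:
    """Mistake-driven linear classifier with multiplicative updates.

    Each feature has a positive and a negative weight; the effective
    weight is their difference. Labels are +1 or -1, and an example is
    the set of feature names that are active in it.
    """

    def __init__(self, alpha=1.5, beta=0.5, threshold=1.0):
        self.alpha = alpha          # promotion factor, > 1 (assumed default)
        self.beta = beta            # demotion factor, between 0 and 1 (assumed default)
        self.threshold = threshold
        self.pos = defaultdict(lambda: 2.0)  # illustrative initial weights
        self.neg = defaultdict(lambda: 1.0)

    def score(self, features):
        return sum(self.pos[f] - self.neg[f] for f in features) - self.threshold

    def predict(self, features):
        return 1 if self.score(features) > 0 else -1

    def update(self, features, label):
        """Change weights only when the current prediction is wrong."""
        if self.predict(features) == label:
            return
        for f in features:
            if label == 1:
                self.pos[f] *= self.alpha
                self.neg[f] *= self.beta
            else:
                self.pos[f] *= self.beta
                self.neg[f] *= self.alpha

    def fit(self, examples, epochs=5):
        for _ in range(epochs):
            for features, label in examples:
                self.update(features, label)
        return self
```

Because updates are multiplicative and touch only the active features of misclassified examples, the algorithm suits sparse, high-dimensional inputs such as word counts.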
Cecily Swanson argues that “modernism's Gurdjieff craze in fact played a surprising role in the development of an overlooked canon of popular autobiographies: Muriel Draper's memoir, Music at Midnight; Margaret Anderson's memoir, My Thirty Years' War; and Kathryn Hulme's autobiographical novel, We Lived As Children.” Swanson reads Draper, Anderson, and Hulme because they wrote as esotericists, while she divorces the memoirs from any overt esoteric influences, contents, or aesthetics. There is no need to search further for the source of the mode of the popular autobiographies by Anderson and Draper than what of Loos's novel comes through the Peggy Hopkins Joyce/Zora Neale Hurston memoir. Marriage, Men, and Me appears near the commencement of a line of esoteric memoirs that becomes visible in the best-selling works by Draper and Anderson but then continues expansively.
Social media is a platform for expressing one's views and opinions freely, and it has made communication easier than before. This also gives people an opportunity to spread fake news intentionally. The ease of access to a variety of news sources on the web means that people are exposed to fake news and may believe it. It is therefore important to detect and flag such content on social media. At the current rate at which news is generated on social media, it is difficult to differentiate between genuine news and hoaxes without knowing the source of the news. This paper discusses approaches to detecting fake news using only features of the news text, without any other related metadata. We observe that a combination of stylometric features and text-based word vector representations, combined through ensemble methods, can predict fake news with an accuracy of up to 95.49%.
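The abstract names neither the specific stylometric features nor the ensemble scheme, so the sketch below only illustrates the kind of computation involved. It assumes a handful of common surface features and a weighted soft vote over per-model probabilities. All function names, features, and the voting rule are hypothetical.

```python
import re
import string


def stylometric_features(text):
    """Simple surface-style features computed from raw English text."""
    words = re.findall(r"[A-Za-z']+", text)
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    n_chars = max(len(text), 1)  # avoid division by zero on empty input
    return {
        "avg_word_len": sum(len(w) for w in words) / len(words) if words else 0.0,
        "avg_sentence_len": len(words) / len(sentences) if sentences else 0.0,
        "punct_ratio": sum(ch in string.punctuation for ch in text) / n_chars,
        "upper_ratio": sum(ch.isupper() for ch in text) / n_chars,
        "exclamations": float(text.count("!")),
    }


def soft_vote(probabilities, weights=None):
    """Weighted average of the fake-news probabilities from several models.

    For example, one model could be trained on stylometric features and
    another on word-vector features.
    """
    if weights is None:
        weights = [1.0] * len(probabilities)
    return sum(p * w for p, w in zip(probabilities, weights)) / sum(weights)
```

A soft vote is only one way to combine models; stacking or boosting would fit the phrase "ensemble methods" equally well.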