Journal Articles
5 articles found
1. Representation learning: serial-autoencoder for personalized recommendation
Authors: Yi ZHU, Yishuai GENG, Yun LI, Jipeng QIANG, Xindong WU. Frontiers of Computer Science (SCIE, EI, CSCD), 2024, No. 4, pp. 61-72 (12 pages)
Nowadays, personalized recommendation has become a research hotspot for addressing information overload. Despite this, generating effective recommendations from sparse data remains a challenge. Recently, auxiliary information has been widely used to address data sparsity, but most models using auxiliary information are linear and have limited expressiveness. Due to their advantages in feature extraction and freedom from label requirements, autoencoder-based methods have become quite popular. However, most existing autoencoder-based methods discard the reconstruction of auxiliary information, which poses huge challenges for better representation learning and model scalability. To address these problems, we propose the Serial-Autoencoder for Personalized Recommendation (SAPR), which aims to reduce the loss of critical information and enhance the learning of feature representations. Specifically, we first combine the original rating matrix and item attribute features and feed them into the first autoencoder to generate a higher-level representation of the input. Second, we use a second autoencoder to enhance the reconstruction of the data representation of the predicted rating matrix. The output rating information is used for recommendation prediction. Extensive experiments on the MovieTweetings and MovieLens datasets have verified the effectiveness of SAPR compared to state-of-the-art models.
Keywords: personalized recommendation, autoencoder, representation learning, collaborative filtering
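The serial-autoencoder pipeline the abstract describes can be sketched structurally as follows. This is a minimal illustration with invented toy dimensions and untrained random weights, not the paper's trained model: it only shows how the rating matrix and item attributes are combined and passed through two stacked autoencoders.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy sizes, not taken from the paper.
n_users, n_items, n_attrs = 6, 8, 4
ratings = rng.integers(0, 6, size=(n_users, n_items)).astype(float)  # 0 = unrated
item_attrs = rng.integers(0, 2, size=(n_items, n_attrs)).astype(float)

# Step 1: combine the rating matrix with item attribute features,
# so each item is described by the ratings it received plus its attributes.
combined = np.hstack([ratings.T, item_attrs])  # shape (n_items, n_users + n_attrs)

def autoencoder(x, hidden_dim, rng):
    """One-hidden-layer autoencoder with random (untrained) weights.
    Returns (hidden representation, reconstruction)."""
    d = x.shape[1]
    w_enc = rng.normal(scale=0.1, size=(d, hidden_dim))
    w_dec = rng.normal(scale=0.1, size=(hidden_dim, d))
    h = np.tanh(x @ w_enc)   # higher-level representation
    x_hat = h @ w_dec        # reconstruction
    return h, x_hat

# First autoencoder: higher-level representation of the combined input.
h1, _ = autoencoder(combined, hidden_dim=5, rng=rng)

# Second autoencoder: refines that representation and reconstructs it;
# in SAPR the decoded output drives the rating prediction.
h2, h1_hat = autoencoder(h1, hidden_dim=3, rng=rng)

print(combined.shape, h1.shape, h2.shape)  # (8, 10) (8, 5) (8, 3)
```

In the actual method both autoencoders are trained end to end on a reconstruction loss; here the weights are random to keep the sketch short.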
2. Representation learning via an integrated autoencoder for unsupervised domain adaptation (Cited by: 1)
Authors: Yi ZHU, Xindong WU, Jipeng QIANG, Yunhao YUAN, Yun LI. Frontiers of Computer Science (SCIE, EI, CSCD), 2023, No. 5, pp. 75-87 (13 pages)
The purpose of unsupervised domain adaptation is to use the knowledge of a source domain, whose data distribution differs from that of the target domain, to promote the learning task in the target domain. The key bottleneck in unsupervised domain adaptation is how to obtain higher-level and more abstract feature representations between the source and target domains that can bridge the chasm of domain discrepancy. Recently, deep learning methods based on autoencoders have achieved sound performance in representation learning, and many dual or serial autoencoder-based methods take different characteristics of the data into consideration to improve the effectiveness of unsupervised domain adaptation. However, most existing autoencoder methods just serially connect the features generated by different autoencoders, which poses challenges for discriminative representation learning and fails to find the real cross-domain features. To address this problem, we propose a novel representation learning method based on integrated autoencoders for unsupervised domain adaptation, called IAUDA. To capture the inter- and inner-domain features of the raw data, two different autoencoders, namely a marginalized autoencoder with maximum mean discrepancy (mAE) and a convolutional autoencoder (CAE), are proposed to learn different feature representations. After higher-level features are obtained by these two different autoencoders, a sparse autoencoder is introduced to compact these inter- and inner-domain representations. In addition, a whitening layer is embedded to process features before the mAE, reducing redundant features inside a local area. Experimental results demonstrate the effectiveness of our proposed method compared with several state-of-the-art baseline methods.
Keywords: unsupervised domain adaptation, representation learning, marginalized autoencoder, convolutional autoencoder, sparse autoencoder
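The maximum mean discrepancy (MMD) used by the mAE component above is a standard kernel-based measure of the distance between the source and target feature distributions. A minimal sketch with a Gaussian kernel and invented toy data (the paper's actual architecture and kernel choices may differ):

```python
import numpy as np

def gaussian_kernel(a, b, sigma=1.0):
    """Pairwise Gaussian (RBF) kernel matrix between rows of a and b."""
    sq = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-sq / (2 * sigma ** 2))

def mmd2(x, y, sigma=1.0):
    """Biased estimate of the squared maximum mean discrepancy."""
    kxx = gaussian_kernel(x, x, sigma).mean()
    kyy = gaussian_kernel(y, y, sigma).mean()
    kxy = gaussian_kernel(x, y, sigma).mean()
    return kxx + kyy - 2 * kxy

rng = np.random.default_rng(0)
source = rng.normal(0.0, 1.0, size=(200, 3))  # hypothetical source-domain features
target = rng.normal(1.5, 1.0, size=(200, 3))  # shifted target-domain features

print(mmd2(source, source))                       # 0.0: identical samples
print(mmd2(source, target) > mmd2(source, source))  # True: shifted domains differ
```

Minimizing this quantity over the learned representations pushes the two domains' feature distributions together, which is what "bridging the chasm of domain discrepancy" amounts to in practice.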
3. Unsupervised statistical text simplification using pre-trained language modeling for initialization (Cited by: 1)
Authors: Jipeng QIANG, Feng ZHANG, Yun LI, Yunhao YUAN, Yi ZHU, Xindong WU. Frontiers of Computer Science (SCIE, EI, CSCD), 2023, No. 1, pp. 81-90 (10 pages)
Unsupervised text simplification has attracted much attention due to the scarcity of high-quality parallel text simplification corpora. Recently, an unsupervised statistical text simplification method based on a phrase-based machine translation system (UnsupPBMT) achieved good performance; it initializes the phrase tables using similar words obtained by word embedding modeling. Since word embedding modeling only considers the relevance between words, the phrase table in UnsupPBMT contains a lot of dissimilar words. In this paper, we propose an unsupervised statistical text simplification method that uses the pre-trained language model BERT for initialization. Specifically, we use BERT as a general linguistic knowledge base for predicting similar words. Experimental results show that our method outperforms the state-of-the-art unsupervised text simplification methods on three benchmarks, and even outperforms some supervised baselines.
Keywords: text simplification, pre-trained language modeling, BERT, word embeddings
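The weakness the abstract attributes to embedding-based phrase-table initialization, that embeddings capture relevance rather than similarity, can be seen in a toy sketch. The vectors below are invented; in UnsupPBMT they would come from a trained word-embedding model, and the paper's contribution is to replace this step with BERT's context-aware predictions.

```python
import numpy as np

# Invented toy vectors. Antonyms like "easy" often share contexts with
# "difficult", so real embeddings place them close together -- the flaw
# the paper's BERT-based initialization addresses.
emb = {
    "difficult":   np.array([0.90, 0.10, 0.00]),
    "hard":        np.array([0.80, 0.20, 0.10]),
    "easy":        np.array([0.85, 0.15, 0.05]),
    "complicated": np.array([0.70, 0.30, 0.00]),
}

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def phrase_table_candidates(word, k=2):
    """Rank the other vocabulary words by embedding similarity to `word`,
    mimicking embedding-based phrase-table initialization."""
    scores = {w: cosine(emb[word], v) for w, v in emb.items() if w != word}
    return sorted(scores, key=scores.get, reverse=True)[:k]

print(phrase_table_candidates("difficult"))  # ['easy', 'hard']
```

Note that the antonym "easy" tops the candidate list: it is highly *relevant* to "difficult" but not a valid simplification, which is exactly the kind of dissimilar entry the abstract says pollutes UnsupPBMT's phrase tables.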
4. Lexical simplification via single-word generation (Cited by: 1)
Authors: Jipeng QIANG, Yang LI, Yun LI, Yunhao YUAN, Yi ZHU. Frontiers of Computer Science (SCIE, EI, CSCD), 2023, No. 6, pp. 163-165 (3 pages)
1 Introduction
Lexical simplification (LS) aims to simplify a sentence by replacing complex words with simpler words without changing the meaning of the sentence, which can facilitate comprehension of the text for non-native speakers and children. Traditional LS methods utilize linguistic databases (e.g., WordNet) [1] or word embedding models [2] to extract synonyms or highly similar words for the complex word, and then sort them based on their appropriateness in context. Recently, BERT-based LS methods [3,4] entirely or partially mask the complex word in the original sentence, and then feed the sentence into the pretrained language model BERT [5] to obtain the top-probability tokens corresponding to the masked word as substitute candidates. They have made remarkable progress in generating substitutes by making full use of the context information of complex words, which can effectively alleviate the shortcomings of traditional methods.
Keywords: token, utilize, speakers
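The substitute-ranking step of the LS pipeline described above can be sketched as follows. Candidate generation here is a hand-written stand-in for the BERT masked-prediction step, and the word frequencies are invented; the idea is simply that among meaning-preserving candidates, the more frequent word is usually the simpler one.

```python
# Invented corpus frequencies; in practice these would come from a
# large frequency list, and candidates from a masked language model.
word_freq = {"intelligent": 35, "smart": 120, "clever": 80, "bright": 95}

def simplify(word, candidates):
    """Pick the most frequent (i.e., presumably simplest) candidate
    that is more frequent than the complex word itself."""
    simpler = [c for c in candidates if word_freq.get(c, 0) > word_freq.get(word, 0)]
    return max(simpler, key=word_freq.get) if simpler else word

sentence = "She is a very intelligent student".split()
out = [simplify(w, ["smart", "clever", "bright"]) if w == "intelligent" else w
       for w in sentence]
print(" ".join(out))  # She is a very smart student
```

Real systems additionally check that the substitute fits the context (the "appropriateness" ranking the introduction mentions), which this sketch omits.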
5. Safeguarding text generation API’s intellectual property through meaning-preserving lexical watermarks
Authors: Shiyu ZHU, Yun LI, Xiaoye OUYANG, Xiaocheng HU, Jipeng QIANG. Frontiers of Computer Science (SCIE, EI, CSCD), 2023, No. 6, pp. 195-197 (3 pages)
1 Introduction
Recent advancements in encoder-decoder based text generation technology, like ChatGPT by OpenAI and PaLM [1] by Google, have garnered attention in the AI community. Pay-per-use APIs offer access to these models, but research shows they are prone to imitation attacks, where malicious users train their own models on responses obtained from lawful APIs through skillfully crafted queries. Such attacks violate intellectual property (IP) and deter further research [2]. Recent work introduced lexical watermarking (LW) methods to protect legal APIs' IP. LW modifies the original outputs and uses null-hypothesis testing for ownership verification on imitation models [2,3]. High-frequency words are selected and replaced with WordNet synonyms, but this one-size-fits-all approach neglects rational substitutes.
Keywords: property, preserving, replace
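The lexical watermarking scheme the introduction describes can be sketched in a few lines. The synonym table below is illustrative; the cited work selects high-frequency words, replaces them with WordNet synonyms, and verifies ownership with a null-hypothesis test rather than this simple frequency score.

```python
# Illustrative watermark table: each selected word and its synonym replacement.
WATERMARK = {"movie": "film", "big": "large", "buy": "purchase"}
MARKS = set(WATERMARK.values())

def watermark(text):
    """Rewrite API output, replacing each listed word with its synonym."""
    return " ".join(WATERMARK.get(w, w) for w in text.split())

def watermark_score(text):
    """Fraction of watermark-side variants among all marked word pairs seen.
    An imitation model trained on watermarked outputs inherits the skewed
    word choices and therefore scores high."""
    seen = [w for w in text.split() if w in WATERMARK or w in MARKS]
    return sum(w in MARKS for w in seen) / len(seen) if seen else 0.0

stamped = watermark("i want to buy a big movie poster")
print(stamped)                                                # i want to purchase a large film poster
print(watermark_score(stamped))                               # 1.0
print(watermark_score("i want to buy a big movie poster"))    # 0.0
```

The "one-size-fits-all" criticism in the introduction applies to exactly this kind of fixed table: "film" is not always a rational substitute for "movie" in every context, which is the gap the meaning-preserving variant targets.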