Abstract: Since 2011, we have run a project collecting personal lifelogging data, gathering 40,000 lifelogging entries from 22 volunteers. As the volume of data has grown over time, the collection now includes as many as 3,020 videos, and searching for videos within this lifelogging data has become very difficult. We therefore propose Liu-VTM (Video Tags Model), a video-decomposition and image-analysis model that selects keyframes representative of a lifelogging video's content, applies image recognition to those keyframes to derive video tags, and then lets the corresponding videos be retrieved directly by tag. In our experiments we examined how several keyframe-selection methods affect the model and proposed a new evaluation metric, "Optimal Content Coverage Rate", for assessing keyframe selection in the lifelog domain. Our results show that Liu-VTM can effectively tag the videos in a lifelogging dataset and retrieve the corresponding videos from those tags.
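A minimal sketch of the two stages the abstract describes: pick keyframes with a simple frame-difference heuristic, then tag them with an off-the-shelf image classifier. The threshold, the ResNet-50 tagger, and the function names are illustrative assumptions, not the authors' Liu-VTM implementation or keyframe criteria.

```python
import cv2
import torch
from PIL import Image
from torchvision import models

def select_keyframes(video_path, diff_threshold=30.0):
    """Keep a frame when its mean absolute difference from the last kept
    frame exceeds diff_threshold (one of many possible keyframe heuristics)."""
    cap = cv2.VideoCapture(video_path)
    keyframes, prev_gray = [], None
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        if prev_gray is None or cv2.absdiff(gray, prev_gray).mean() > diff_threshold:
            keyframes.append(frame)
            prev_gray = gray
    cap.release()
    return keyframes

def tag_keyframes(keyframes, top_k=3):
    """Label each keyframe with its top-k ImageNet classes; the union of the
    labels becomes the video's searchable tag set."""
    weights = models.ResNet50_Weights.DEFAULT
    model = models.resnet50(weights=weights).eval()
    preprocess = weights.transforms()
    tags = set()
    with torch.no_grad():
        for frame in keyframes:
            img = Image.fromarray(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
            probs = model(preprocess(img).unsqueeze(0)).softmax(dim=1)[0]
            for idx in probs.topk(top_k).indices:
                tags.add(weights.meta["categories"][idx])
    return tags
```

A retrieval step would then be a plain inverted index from tag to video ID, so a query term maps directly to the videos whose keyframes produced that tag.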
Abstract: Micro-dramas have risen rapidly with the times and become a fashionable entertainment medium both in China and abroad. This paper crawls user reviews of micro-dramas from Tencent's short-drama brand "Shifen Theater" and performs sentiment analysis on this imbalanced sample, comparing the efficiency and effectiveness of several models and model combinations. 1) Using Word2Vec's continuous bag-of-words (CBOW) model to turn the preprocessed text into word vectors and building LSTM/BiLSTM models, the two show no difference in performance, with LSTM taking the least time; 2) a TextCNN + LSTM/BiLSTM model, which uses TextCNN to extract vector features and LSTM/BiLSTM to learn sentiment patterns, raises the F1-Score on the rare class by about 10%; 3) a TextCNN + LSTM + Multi-Head Attention model, which adds multi-head attention to capture the multiple relations between characters, doubles the runtime and raises the F1-Score ceiling on the rare class by a further 1%; 4) random-deletion data augmentation improves recall by 10% at the cost of a 20% drop in precision; 5) adding residual connections to the convolutional layers of model 3 raises the F1-Score ceiling on the rare class by 2%; 6) replacing Word2Vec and the traditional RNNs with BERT/RoBERTa tokenizers and models improves results by about 9%/12% over model 5 and generalizes better, but greatly increases time and hardware cost, and stacking TextCNN, LSTM and multi-head attention on top actually degrades performance. An architectural sketch of the model in steps 3 and 5 follows below.
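A minimal PyTorch sketch of the TextCNN + LSTM + Multi-Head Attention pipeline from step 3, with the convolutional residual connection from step 5. Layer sizes, the class name, and the pooling choice are illustrative assumptions; the paper's exact architecture and hyperparameters may differ.

```python
import torch
import torch.nn as nn

class TextCNNLSTMAttn(nn.Module):
    def __init__(self, vocab_size, emb_dim=128, channels=128, hidden=128,
                 num_heads=4, num_classes=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
        # TextCNN branch: 1-D convolution over the token dimension.
        self.conv = nn.Conv1d(emb_dim, channels, kernel_size=3, padding=1)
        self.lstm = nn.LSTM(channels, hidden, batch_first=True)
        self.attn = nn.MultiheadAttention(hidden, num_heads, batch_first=True)
        self.fc = nn.Linear(hidden, num_classes)

    def forward(self, token_ids):                     # (B, T) integer token ids
        x = self.embed(token_ids)                     # (B, T, emb_dim)
        c = torch.relu(self.conv(x.transpose(1, 2)))  # (B, channels, T)
        c = c.transpose(1, 2)                         # (B, T, channels)
        c = c + x            # residual connection (step 5); needs channels == emb_dim
        h, _ = self.lstm(c)                           # (B, T, hidden)
        a, _ = self.attn(h, h, h)                     # self-attention (step 3)
        return self.fc(a.mean(dim=1))                 # (B, num_classes) logits

# Usage sketch: logits for a batch of 8 reviews, each padded/truncated to 64 tokens.
logits = TextCNNLSTMAttn(vocab_size=20000)(torch.randint(1, 20000, (8, 64)))
```

The BERT/RoBERTa variant in step 6 would replace the embedding-plus-CNN/LSTM front end with a pretrained transformer encoder and feed its pooled output to the same classification head.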
Abstract: Privacy has always been one of the hot topics in Lifelog research. The privacy risks present in current datasets not only keep researchers from releasing Lifelog datasets publicly, but also hinder them from sharing datasets and research findings with one another. With the widespread adoption of wearable devices and smartphones, Lifelog research has entered a new stage and its data types have become increasingly rich, typically spanning GPS, video, images, text, audio and other formats. For such multi-format Lifelog datasets we propose LPPM (Lifelog Privacy Protection Model), a privacy-protection model that can select a different privacy strategy for each data type. The model also introduces SPP (Scene-Based Privacy Protection), a scene-based image privacy strategy that first predicts the scene of a Lifelog image and then chooses a protection method according to that scene. We validated the model on the LiuLifelog dataset; after processing with LPPM, we consider the dataset to have reached a publishable level, with most of the private content in the images well obscured, which further demonstrates that the proposed model and method are effective.
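A hedged sketch of the dispatch idea behind LPPM and the SPP branch: each data type is routed to its own strategy, and for images a scene prediction decides how aggressively to obscure content. The scene labels, the strategy table, and predict_scene() are placeholders and assumptions, not the authors' actual policies or classifier.

```python
import cv2

def predict_scene(image):
    """Placeholder for any scene-recognition model (e.g. one trained on Places365);
    should return a label such as 'office', 'street' or 'bathroom'."""
    raise NotImplementedError("plug in a scene classifier here")

def blur_faces(image):
    """Blur detected faces with OpenCV's bundled Haar cascade (one possible method)."""
    cascade = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    for (x, y, w, h) in cascade.detectMultiScale(gray, 1.1, 5):
        image[y:y + h, x:x + w] = cv2.GaussianBlur(
            image[y:y + h, x:x + w], (51, 51), 0)
    return image

def spp_protect(image):
    """Scene-based image privacy (SPP): stronger protection for sensitive scenes."""
    if predict_scene(image) in {"bathroom", "bedroom", "hospital"}:   # assumed labels
        return cv2.GaussianBlur(image, (51, 51), 0)   # blur the whole frame
    return blur_faces(image)                          # otherwise only obscure faces

# LPPM-style dispatch: one (assumed) strategy per modality.
STRATEGIES = {
    "image": spp_protect,
    "gps": lambda pts: [(round(lat, 2), round(lon, 2)) for lat, lon in pts],  # coarsen
}
```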