
Deepfake Detection Method Integrating Multiple Parameter-Efficient Fine-Tuning Techniques
Abstract: In recent years, as deepfake technology has matured, face-swapping applications and synthesized videos have become commonplace. While deepfake technology provides entertainment, it also gives malicious actors opportunities for abuse, so deepfake detection has become increasingly important. Existing detection methods commonly suffer from poor robustness across compression rates, weak cross-dataset generalization, and high training overhead. To address these problems, this paper proposes a deepfake detection method that integrates multiple parameter-efficient fine-tuning techniques. The method uses a vision Transformer pretrained with the masked image modeling (MIM) self-supervised method as its backbone. A low-rank adaptation (LoRA) method improved with the Kronecker product fine-tunes the parameters of the pretrained model's self-attention modules; in parallel, a convolutional adapter learns local image texture information to strengthen the pretrained model's adaptability to the deepfake detection task; a classical adapter, also attached in a parallel structure, fine-tunes the pretrained model's feed-forward networks to make full use of the knowledge acquired during pretraining; and a multi-layer perceptron replaces the original classification head to perform detection. Experiments on six mainstream datasets show that, with only 2×10^7 trainable parameters, the model achieves an average frame-level AUC of about 0.996. In cross-compression-rate experiments, the average frame-level AUC drop is 0.135; in cross-dataset generalization experiments, the frame-level AUC averages 0.765.
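The abstract describes three parameter-efficient branches grafted onto a frozen, MIM-pretrained vision Transformer: a Kronecker-product low-rank update on the self-attention projections, a parallel convolutional adapter over the patch grid, and a parallel bottleneck adapter beside the feed-forward network. The PyTorch sketch below shows how one encoder block could combine them. It is a minimal illustration assuming a ViT-Base backbone; every module name, shape, and hyperparameter (e.g. the (72,24)×(32,32) Kronecker factorization and the 64-dim adapter bottlenecks) is an assumption for illustration, not the authors' released code.

```python
# Sketch of one fine-tuned encoder block, assuming a ViT-Base backbone
# (768-dim tokens, 12 heads, a 14x14 patch grid plus one [CLS] token).
# All module names, shapes, and hyperparameters are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class KronLoRALinear(nn.Module):
    """Frozen linear layer plus a trainable Kronecker-product low-rank update.

    Delta W = kron(A, B); with A zero-initialized, the layer starts out
    exactly equal to the pretrained layer.
    """

    def __init__(self, base: nn.Linear, a_shape, b_shape, scale=1.0):
        super().__init__()
        out_f, in_f = base.weight.shape
        assert a_shape[0] * b_shape[0] == out_f and a_shape[1] * b_shape[1] == in_f
        self.base, self.scale = base, scale
        for p in self.base.parameters():
            p.requires_grad_(False)                     # freeze pretrained weights
        self.A = nn.Parameter(torch.zeros(a_shape))
        self.B = nn.Parameter(torch.randn(b_shape) * 0.02)

    def forward(self, x):
        delta_w = torch.kron(self.A, self.B)            # rebuilt (out_f, in_f) update
        return self.base(x) + self.scale * F.linear(x, delta_w)


class KronAttention(nn.Module):
    """Multi-head self-attention whose fused qkv projection carries the update."""

    def __init__(self, dim=768, heads=12):
        super().__init__()
        self.heads, self.scale = heads, (dim // heads) ** -0.5
        # (72*32, 24*32) = (2304, 768) matches ViT-Base's fused qkv weight.
        self.qkv = KronLoRALinear(nn.Linear(dim, dim * 3), (72, 24), (32, 32))
        self.proj = nn.Linear(dim, dim)
        for p in self.proj.parameters():                # pretrained output proj, frozen
            p.requires_grad_(False)

    def forward(self, x):
        b, n, c = x.shape
        qkv = self.qkv(x).reshape(b, n, 3, self.heads, c // self.heads)
        q, k, v = qkv.permute(2, 0, 3, 1, 4).unbind(0)
        attn = (q @ k.transpose(-2, -1) * self.scale).softmax(dim=-1)
        return self.proj((attn @ v).transpose(1, 2).reshape(b, n, c))


class ConvAdapter(nn.Module):
    """Parallel convolutional adapter: 3x3 mixing over the patch grid for local texture."""

    def __init__(self, dim=768, hidden=64, grid=14):
        super().__init__()
        self.grid = grid
        self.down, self.up = nn.Linear(dim, hidden), nn.Linear(hidden, dim)
        self.conv = nn.Conv2d(hidden, hidden, kernel_size=3, padding=1)

    def forward(self, tokens):                          # (B, 1 + grid^2, dim)
        cls, patch = tokens[:, :1], tokens[:, 1:]
        b = patch.shape[0]
        h = self.down(patch).transpose(1, 2).reshape(b, -1, self.grid, self.grid)
        h = F.gelu(self.conv(h)).flatten(2).transpose(1, 2)
        return torch.cat([torch.zeros_like(cls), self.up(h)], dim=1)  # [CLS] untouched


class Adapter(nn.Module):
    """Classical bottleneck adapter, attached in parallel with the frozen FFN."""

    def __init__(self, dim=768, hidden=64):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, hidden), nn.GELU(), nn.Linear(hidden, dim))

    def forward(self, x):
        return self.net(x)


class PEFTBlock(nn.Module):
    """One encoder block combining the three parameter-efficient branches."""

    def __init__(self, dim=768):
        super().__init__()
        self.norm1, self.norm2 = nn.LayerNorm(dim), nn.LayerNorm(dim)
        self.attn = KronAttention(dim)
        self.ffn = nn.Sequential(nn.Linear(dim, dim * 4), nn.GELU(), nn.Linear(dim * 4, dim))
        for p in self.ffn.parameters():                 # pretrained FFN stays frozen
            p.requires_grad_(False)
        self.conv_adapter, self.ffn_adapter = ConvAdapter(dim), Adapter(dim)

    def forward(self, x):
        h = self.norm1(x)
        x = x + self.attn(h) + self.conv_adapter(h)     # parallel conv branch
        h = self.norm2(x)
        return x + self.ffn(h) + self.ffn_adapter(h)    # parallel classical adapter


# Shape check on a ViT-Base-sized sequence: 1 [CLS] + 196 patch tokens.
print(PEFTBlock()(torch.randn(2, 197, 768)).shape)      # torch.Size([2, 197, 768])
```

Under these assumptions, the Kronecker factorization kron(A, B) spans a full 2304×768 update from only 72×24 + 32×32 ≈ 2.8×10^3 trainable values, and zero-initializing A keeps each block identical to the pretrained model at the start of fine-tuning, which is the usual motivation for pairing such adapters with a frozen backbone.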
Authors: ZHANG Yiwen (张溢文), CAI Manchun (蔡满春), CHEN Yonghao (陈咏豪), ZHU Yi (朱懿), YAO Lifeng (姚利峰) (College of Information and Cyber Security, People's Public Security University of China, Beijing 100038, China)
Source: Journal of Frontiers of Computer Science and Technology (《计算机科学与探索》; CSCD, Peking University Core Journal), 2024, No. 12, pp. 3335-3347 (13 pages)
Funding: Double First-Class Innovation Research Project in Cyberspace Security Law Enforcement Technology, People's Public Security University of China (2023SYL07)
Keywords: deepfakes; vision Transformer; self-supervised pretrained models; low-rank adaptation (LoRA); parameter-efficient fine-tuning