Evaluating Privacy Leakage and Memorization Attacks on Large Language Models (LLMs) in Generative AI Applications

Abstract: The recent interest in the deployment of Generative AI applications that use large language models (LLMs) has brought to the forefront significant privacy concerns, notably the leakage of Personally Identifiable Information (PII) and other confidential or protected information that may have been memorized during training, specifically during a fine-tuning or customization process. We describe different black-box attacks from potential adversaries and study their impact on the amount and type of information that may be recovered from commonly used and deployed LLMs. Our research investigates the relationship between PII leakage, memorization, and factors such as model size, architecture, and the nature of attacks employed. The study utilizes two broad categories of attacks: PII leakage-focused attacks (auto-completion and extraction attacks) and memorization-focused attacks (various membership inference attacks). The findings from these investigations are quantified using an array of evaluative metrics, providing a detailed understanding of LLM vulnerabilities and the effectiveness of different attacks.

Authors: Harshvardhan Aditya; Siddansh Chawla; Gunika Dhingra; Parijat Rai; Saumil Sood; Tanmay Singh; Zeba Mohsin Wase; Arshdeep Bahga; Vijay K. Madisetti (School of Computer Science Engineering & Technology, Bennett University, Greater Noida, India; Cloudemy Technology Labs, Chandigarh, India; School of Cybersecurity and Privacy, Georgia Institute of Technology, Atlanta, USA)

Affiliations: School of Computer Science Engineering & Technology; Cloudemy Technology Labs; School of Cybersecurity and Privacy

Source: Journal of Software Engineering and Applications, 2024, Issue 5, pp. 421-447 (27 pages)

Keywords: Large Language Models; PII Leakage; Privacy; Memorization; Overfitting; Membership Inference Attack (MIA)

Classification number: H31 [Language and Writing: English]
