Abstract
As cyber attack and defense confrontations intensify, the in-depth mining and effective use of threat intelligence have become key to strengthening network security defense strategies. To address the limitations of traditional information extraction techniques in training data construction and model generalization, a framework based on Large Language Models (LLMs) is proposed for extracting threat intelligence entities and the relationships among them. Drawing on the deep semantic understanding of LLMs, the framework uses prompt engineering to accurately extract threat entities and their relationships, with LangChain used to broaden extraction coverage. In addition, search engine integration improves the timeliness and accuracy of intelligence mining. Experimental results show that the framework performs well in few-shot and zero-shot settings, effectively reduces the generation of misleading information, and achieves real-time, efficient extraction of intelligence knowledge. Overall, the work introduces a flexible and efficient method for intelligent threat intelligence mining, optimizes the knowledge fusion process for threat intelligence, and makes network defense more proactive and advanced.
Authors
马冰琦
周盈海
王梓宇
田志宏
MA Bingqi; ZHOU Yinghai; WANG Ziyu; TIAN Zhihong (Cyberspace Institute of Advanced Technology, Guangzhou University, Guangzhou 510006, China)
Funding
National Natural Science Foundation of China (U20B2046)
National Key Research and Development Program of China (2021YFB2012402)