Funding: This work was funded by the Double Top-Class Innovation Research Project in Cyberspace Security Enforcement Technology of People's Public Security University of China (No. 2023SYL07).
Abstract: In recent years, cyber attacks have intensified, causing great harm to individuals, companies, and countries. Mining cyber threat intelligence (CTI) facilitates intelligence integration and helps combat cyber attacks. Named entity recognition (NER), a crucial component of text mining, can structure complex CTI text and help cybersecurity professionals counter threats effectively. However, current CTI NER research has focused mainly on English CTI, and in the limited studies conducted on Chinese text, existing models have performed poorly. To fully exploit Chinese pre-trained language models (PLMs) and to handle the long, infrequent English words mixed into Chinese CTI, we propose a residual dilated convolutional neural network (RDCNN) with a conditional random field (CRF), built on a robustly optimized bidirectional encoder representation from transformers pre-training approach with whole word masking (RoBERTa-wwm), abbreviated as RoBERTa-wwm-RDCNN-CRF. We are the first to experiment on the relevant open-source dataset and achieve an F1-score of 82.35%, which exceeds the common baseline model in this field, bidirectional encoder representation from transformers (BERT)-bidirectional long short-term memory (BiLSTM)-CRF, by about 19.52% and exceeds the current state-of-the-art model, BERT-RDCNN-CRF, by about 3.53%. In addition, we conducted an ablation study on the encoder part of the model and an in-depth investigation of the PLMs and the encoder to verify the effectiveness of the proposed model. The RoBERTa-wwm-RDCNN-CRF model and the shared pre-processing and augmentation methods can serve subsequent fundamental tasks such as cybersecurity information extraction and knowledge graph construction, contributing to downstream applications such as intrusion detection and advanced persistent threat (APT) attack detection.
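The abstract does not give the RDCNN layer details, so the following is only a minimal sketch of what a residual dilated convolution does in general, written in plain Python over a sequence of scalars. The function names, the ReLU activation, the kernel, and the dilation values are illustrative assumptions, not the authors' configuration.

```python
from typing import List, Sequence


def dilated_conv1d(x: Sequence[float], kernel: Sequence[float], dilation: int) -> List[float]:
    """Zero-padded ('same' length) 1D convolution whose taps are `dilation` steps apart."""
    center = len(kernel) // 2
    out = []
    for t in range(len(x)):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = t + (j - center) * dilation
            if 0 <= idx < len(x):  # positions outside the sequence count as zero
                acc += w * x[idx]
        out.append(acc)
    return out


def residual_dilated_block(x: Sequence[float], kernel: Sequence[float], dilation: int) -> List[float]:
    """Dilated convolution, ReLU, then a skip connection that adds the input back."""
    conv = dilated_conv1d(x, kernel, dilation)
    activated = [max(0.0, v) for v in conv]
    return [xi + ai for xi, ai in zip(x, activated)]


def receptive_field(kernel_size: int, dilations: Sequence[int]) -> int:
    """Number of input positions seen by one output after stacking the given layers."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)


if __name__ == "__main__":
    tokens = [1.0, 2.0, 3.0, 4.0, 5.0]  # stand-in for per-token features
    print(residual_dilated_block(tokens, kernel=[1.0, 0.0, 1.0], dilation=2))
    print(receptive_field(3, [1, 2, 4]))
```

Stacking layers with growing dilation widens the context each token sees without adding parameters per layer, which is one plausible reason such an encoder suits long English tokens embedded in Chinese text; the abstract itself does not state this rationale.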
Abstract: This paper extends the literature on the economics of sharing cybersecurity information by and among profit-seeking firms by modeling the case where a government agency or department publicly shares unclassified cyber threat information with all organizations. In prior cybersecurity information sharing models, a common element was reciprocity, i.e., firms receiving shared information are also asked to share their private cybersecurity information with all other firms (via an information sharing arrangement). In contrast, sharing of unclassified cyber threat intelligence (CTI) by a government agency or department is not based on reciprocal sharing by the recipient organizations. After considering the government's cost of preparing and disseminating CTI, as well as the benefits to the recipients of the CTI, we provide sufficient conditions for sharing of CTI to result in an increase in social welfare. Under a broad set of general conditions, sharing of CTI will increase social welfare gross of the costs to the government agency or department sharing the information. Thus, if the entity can keep the sharing costs low, sharing cybersecurity information will result in an increase in net social welfare.
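The gross versus net distinction in this abstract can be written compactly. The notation below is an assumption introduced for illustration ($B_i$ for the benefit to recipient organization $i$, $C_G$ for the government's cost of preparing and disseminating CTI); the paper's own model and its sufficient conditions are not given in the abstract.

```latex
% Gross welfare change: sum of recipients' benefits (assumed notation)
\Delta W_{\text{gross}} = \sum_{i=1}^{n} B_i
% Net welfare change: gross change minus the government's sharing cost
\Delta W_{\text{net}} = \sum_{i=1}^{n} B_i - C_G
% Net welfare increases exactly when benefits exceed the sharing cost
\Delta W_{\text{net}} > 0 \iff \sum_{i=1}^{n} B_i > C_G
```

Read this way, the closing claim follows directly: if the gross change is positive and $C_G$ is kept small enough, the net change stays positive.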
Funding: This work was supported by China's National Key R&D Program (No. 2019QY1404), the National Natural Science Foundation of China (Grant Nos. U20A20161 and U1836103), and the Basic Strengthening Program Project (No. 2019-JCJQ-ZD-113).
Abstract: The continuous improvement of cyber threat intelligence sharing mechanisms provides new ideas for dealing with advanced persistent threats (APTs). Extracting attack behaviors, i.e., tactics, techniques, and procedures (TTPs), from cyber threat intelligence (CTI) can facilitate the profiling of APT actors for an immediate response. However, it is difficult for traditional manual methods to analyze attack behaviors from CTI because of its heterogeneous nature. Based on the threat behavior descriptions of Adversarial Tactics, Techniques and Common Knowledge (ATT&CK), this paper proposes a threat behavior knowledge extraction framework that integrates a Heterogeneous Text Network (HTN) and a Graph Convolutional Network (GCN) to address this issue. It leverages the hierarchical correlations between attack techniques and tactics in ATT&CK to construct a text network of heterogeneous CTI. Using the Bidirectional Encoder Representations from Transformers (BERT) pre-trained model to analyze the contextual semantics of CTI, the framework transforms threat behavior identification into a text classification task that automatically extracts attack behaviors from CTI and then identifies the malware and advanced threat actors. The experimental results show F1 scores of 94.86% and 92.15% for the multi-label classification tasks of tactics and techniques, respectively. We extend the experiment to verify the method's effectiveness in identifying the malware and threat actors in APT attacks. The F1 scores for the malware and advanced threat actor identification tasks reach 98.45% and 99.48%, respectively, which are better than those of the benchmark models in the experiment and are state of the art. The model can effectively model threat intelligence text data and achieves knowledge and experience transfer by correlating implied features with a priori knowledge, compensating for insufficient sample data and improving the classification performance and recognition of threat behavior in text.
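For readers unfamiliar with how an F1 score is computed when each report can carry several tactic labels at once, here is a minimal sketch. It assumes micro-averaging, which the abstract does not confirm, and the example label sets are made up for illustration (the IDs are ATT&CK tactic identifiers).

```python
from typing import List, Set


def micro_f1(true_labels: List[Set[str]], pred_labels: List[Set[str]]) -> float:
    """Micro-averaged F1: pool true/false positives and false negatives over all samples."""
    tp = fp = fn = 0
    for truth, pred in zip(true_labels, pred_labels):
        tp += len(truth & pred)   # labels predicted and correct
        fp += len(pred - truth)   # labels predicted but wrong
        fn += len(truth - pred)   # labels missed
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical gold and predicted tactic sets for two CTI reports.
    gold = [{"TA0001", "TA0002"}, {"TA0003"}]
    predicted = [{"TA0001"}, {"TA0003", "TA0005"}]
    print(round(micro_f1(gold, predicted), 4))
```

A macro-averaged variant would instead compute F1 per label and average the results, which weights rare tactics more heavily; which of the two the paper reports is not stated here.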
Funding: Our research was supported by the National Key Research and Development Program of China (Nos. 2019QY1301, 2018YFB0805005, 2018YFC0824801).
Abstract: Cybersecurity reports provide unstructured, actionable cyber threat intelligence (CTI) with detailed attack procedures and indicators of compromise (IOCs), e.g., a malware hash or the URL (uniform resource locator) of a command and control server. Actionable CTI, integrated into intrusion detection systems, can not only prioritize the most urgent threats based on the campaign stages of attack vectors (i.e., IOCs) but also support appropriate mitigation measures based on the contextual information of the alerts. However, the dramatic growth in the number of cybersecurity reports makes it nearly impossible for security professionals to use these massive amounts of threat intelligence efficiently. In this paper, we propose a trigger-enhanced actionable CTI discovery system (TriCTI) that portrays the relationship between IOCs and campaign stages and generates actionable CTI from cybersecurity reports through natural language processing (NLP) technology. Specifically, we introduce the "campaign trigger" as an effective explanation of the campaign stages to improve the performance of the classification model. Campaign trigger phrases are the keywords in a sentence that imply the campaign stage. The trained final trigger vectors have representations similar to those of the keywords in an unseen sentence and help correct classification by increasing the weight of those keywords. We also devise a data augmentation method specifically for cybersecurity training sets to cope with the scarcity of annotated data. Compared with state-of-the-art text classification models such as BERT, the trigger-enhanced classification model performs better, with an accuracy of 86.99% and an F1 score of 87.02%. We run TriCTI on more than 29k cybersecurity reports, from which we automatically and efficiently collect 113,543 actionable CTI items. In particular, we verify the actionability of the discovered CTI using large-scale field data from VirusTotal (VT). The results demonstrate that the threat intelligence provided by VT lacks part of the threat context for IOCs, such as the Actions on Objectives campaign stage. By comparison, our proposed method can identify actionable CTI in all campaign stages. Accordingly, cyber threats can be identified and resisted at any campaign stage with the discovered actionable CTI.
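To make the idea of pairing an IOC with a campaign stage concrete, the sketch below tags each IOC in a sentence with a stage inferred from trigger keywords. This is a deliberately simplified stand-in: TriCTI learns trigger vectors with a trained classifier, whereas this example uses a hand-written keyword table. The keyword lists, the `tag_sentence` helper, and the two IOC patterns (MD5-length hex strings and http/https URLs) are all assumptions for illustration.

```python
import re
from typing import List, Optional, Tuple

# Hypothetical trigger keywords per campaign stage (not taken from the paper).
TRIGGERS = {
    "Delivery": ("phishing", "attachment"),
    "Command and Control": ("beacon", "callback"),
    "Actions on Objectives": ("exfiltrate", "encrypts"),
}

# Two simple IOC shapes: a 32-character hex digest or an http(s) URL.
IOC_PATTERN = re.compile(r"\b[a-fA-F0-9]{32}\b|https?://[^\s,]+")


def tag_sentence(sentence: str) -> List[Tuple[str, Optional[str]]]:
    """Return (ioc, stage) pairs; stage is None when no trigger keyword is present."""
    lowered = sentence.lower()
    stage = None
    for name, keywords in TRIGGERS.items():
        if any(word in lowered for word in keywords):
            stage = name
            break  # first matching stage wins in this simplified version
    return [(ioc, stage) for ioc in IOC_PATTERN.findall(sentence)]


if __name__ == "__main__":
    print(tag_sentence("The implant will beacon to http://example.test/gate.php every hour."))
```

A keyword table like this cannot generalize to unseen phrasing, which is the gap the abstract says trained trigger vectors are meant to close.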
Funding: This work was supported by the Federal Ministry of Education and Research, Germany, as part of the BMBF DINGfest project.
Abstract: The ever-increasing number of major security incidents has led to an emerging interest in cooperative approaches to counter cyber threats. To enable cooperation in detecting and preventing attacks, structured and standardized formats for describing an incident are a necessity. Such formats are complex and extensive, as they are often designed for automated processing and exchange. These characteristics hamper readability and therefore prevent humans from understanding the documented incident. This is a major problem, since the success and effectiveness of any security measure rely heavily on the contribution of security experts. To address these shortcomings, we propose a visual analytics concept that enables security experts to analyze and enrich semi-structured cyber threat intelligence information. Our approach combines an innovative way of persisting this data with an interactive visualization component for analyzing and editing the threat information. We demonstrate the feasibility of our concept using Structured Threat Information eXpression (STIX), the state-of-the-art format for reporting cybersecurity issues.
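As a small illustration of why such formats are hard for humans to read, the sketch below builds a single indicator object. It assumes the STIX 2.1 JSON serialization and its required indicator properties; the abstract does not say which STIX version the paper targets, and the `make_indicator` helper and the example URL are hypothetical.

```python
import json
import uuid
from datetime import datetime, timezone
from typing import Optional


def make_indicator(pattern: str, now: Optional[datetime] = None) -> dict:
    """Build a dict with the properties STIX 2.1 requires for an indicator object."""
    now = now or datetime.now(timezone.utc)
    stamp = now.strftime("%Y-%m-%dT%H:%M:%S.000Z")
    return {
        "type": "indicator",
        "spec_version": "2.1",
        "id": f"indicator--{uuid.uuid4()}",  # object type, two hyphens, then a UUID
        "created": stamp,
        "modified": stamp,
        "pattern": pattern,
        "pattern_type": "stix",
        "valid_from": stamp,
    }


if __name__ == "__main__":
    indicator = make_indicator("[url:value = 'http://example.test/gate.php']")
    print(json.dumps(indicator, indent=2))
```

Even this one object carries eight fields for a single URL; a full incident links many such objects through relationships, which is the readability problem the visual analytics concept is meant to ease.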
Funding: The author extends their appreciation to the Deanship of Scientific Research at Majmaah University for funding this work under Project No. 1439-48.
Abstract: Cyber threat intelligence (CTI) has gained massive attention as a way to collect hidden knowledge for a better understanding of various cyber-attacks, eventually paving the way for predicting such attacks. Information exchange and collaborative sharing through different platforms contribute significantly towards a global solution. While CTI and information exchange can help organizations focus on and prioritize the use of large volumes of complex information, there remains a great challenge in effectively processing the large number of different Indicators of Threat (IoT) that appear regularly, which can be solved only through a collaborative approach. Collaboration and intelligence sharing have become mandatory elements of threat processing worldwide. To cover the need for a definite standard of information exchange, various initiatives have been taken in the form of threat information sharing platforms such as MISP and formats such as STIX. This paper proposes a scoring model to address the decay of information shared within threat intelligence sharing platforms (TISPs). The scoring model is implemented using the use case of detecting threat indicators in a phishing data network. The proposed method calculates the rate of decay of an attribute, on the basis of which early entries are removed.
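The abstract does not state the scoring formula, so the following is only a sketch of one common way to model indicator decay: an exponential half-life, with entries pruned once their score falls below a threshold. The half-life form, the function names, and all numeric values are assumptions and should not be read as the paper's model.

```python
from typing import List, Tuple


def decayed_score(base_score: float, age_days: float, half_life_days: float) -> float:
    """Score halves every `half_life_days` (assumed exponential decay)."""
    return base_score * 0.5 ** (age_days / half_life_days)


def prune(
    indicators: List[Tuple[str, float, float]],
    now_day: float,
    half_life_days: float,
    threshold: float,
) -> List[str]:
    """Keep indicator values whose decayed score is still at or above `threshold`.

    Each indicator is (value, base_score, first_seen_day).
    """
    kept = []
    for value, base_score, first_seen_day in indicators:
        age = now_day - first_seen_day
        if decayed_score(base_score, age, half_life_days) >= threshold:
            kept.append(value)
    return kept


if __name__ == "__main__":
    # Hypothetical phishing domains with the same base score but different ages.
    feed = [("a.example", 80.0, 0.0), ("b.example", 80.0, 55.0)]
    print(prune(feed, now_day=60.0, half_life_days=30.0, threshold=30.0))
```

With these made-up numbers, the 60-day-old entry has decayed to a quarter of its base score and is dropped, while the 5-day-old entry is retained.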
Abstract: Humans are commonly seen as the weakest link in corporate information security. This has led to considerable effort being put into security training and awareness campaigns, which has made employees less likely to be the target of successful attacks. Existing approaches, however, do not tap the full potential that can be gained through these campaigns. On the one hand, human perception offers an additional source of contextual information for detected incidents; on the other hand, it serves as an information source for incidents that may not be detectable by automated procedures. Existing approaches allow only text-based reporting of basic incident information. A structured recording of human-delivered information that is also compatible with existing SIEM systems is still missing. In this work, we propose an approach that allows humans to systematically report perceived anomalies or incidents in a structured way. Our approach furthermore supports the integration of such reports into analytics systems. To this end, we identify connecting points to SIEM systems, develop a taxonomy for structuring elements reportable by humans acting as security sensors, and develop a structured data format to record data delivered by humans. A prototypical human-as-a-security-sensor wizard applied to a real-world use case serves as our proof of concept.