14 articles found
IDS-INT: Intrusion detection system using transformer-based transfer learning for imbalanced network traffic (cited by: 3)
1
Authors: Farhan Ullah, Shamsher Ullah, Gautam Srivastava, Jerry Chun-Wei Lin. Digital Communications and Networks (SCIE, CSCD), 2024, No. 1, pp. 190-204 (15 pages)
A network intrusion detection system is critical for cyber security against illegitimate attacks. In terms of feature perspectives, network traffic may include a variety of elements such as attack reference, attack type, a subcategory of attack, host information, malicious scripts, etc. In terms of network perspectives, network traffic may contain an imbalanced number of harmful attacks when compared to normal traffic. It is challenging to identify a specific attack due to complex features and data imbalance issues. To address these issues, this paper proposes an Intrusion Detection System using transformer-based transfer learning for Imbalanced Network Traffic (IDS-INT). IDS-INT uses transformer-based transfer learning to learn feature interactions in both network feature representation and imbalanced data. First, detailed information about each type of attack is gathered from network interaction descriptions, which include network nodes, attack type, reference, host information, etc. Second, the transformer-based transfer learning approach is developed to learn detailed feature representation using their semantic anchors. Third, the Synthetic Minority Oversampling Technique (SMOTE) is implemented to balance abnormal traffic and detect minority attacks. Fourth, the Convolution Neural Network (CNN) model is designed to extract deep features from the balanced network traffic. Finally, the hybrid approach of the CNN-Long Short-Term Memory (CNN-LSTM) model is developed to detect different types of attacks from the deep features. Detailed experiments are conducted to test the proposed approach using three standard datasets, i.e., UNSW-NB15, CIC-IDS2017, and NSL-KDD. An explainable AI approach is implemented to interpret the proposed method and develop a trustable model.
Keywords: Network intrusion detection; Transfer learning; Features extraction; Imbalance data; Explainable AI; Cybersecurity
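The SMOTE step named in the abstract above can be illustrated with a minimal sketch. It shows only the general idea of SMOTE (interpolating between a minority sample and one of its nearest minority neighbours); it is not the IDS-INT implementation, and the function name `smote`, the toy points, and the parameters `k` and `seed` are assumptions made for illustration.

```python
import random

def smote(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # The k nearest minority neighbours of the base point (excluding itself).
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, neighbour)))
    return synthetic

# Hypothetical 2-D minority-class samples (e.g. rare attack records).
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
new_points = smote(minority, n_new=6)
```

Because every synthetic point lies on a segment between two real minority points, the oversampled class stays inside the region the minority class already occupies.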
Clustered Federated Learning with Weighted Model Aggregation for Imbalanced Data
2
Authors: Dong Wang, Naifu Zhang, Meixia Tao. China Communications (SCIE, CSCD), 2022, No. 8, pp. 41-56 (16 pages)
As a promising edge learning framework in future 6G networks, federated learning (FL) faces a number of technical challenges due to the heterogeneous network environment and diversified user behaviors. Data imbalance is one of these challenges that can significantly degrade the learning efficiency. To deal with the data imbalance issue, this work proposes a new learning framework, called clustered federated learning with weighted model aggregation (weighted CFL). Compared with traditional FL, our weighted CFL adaptively clusters the participating edge devices based on the cosine similarity of their local gradients at each training iteration, and then performs weighted per-cluster model aggregation. Therein, the similarity threshold for clustering is adaptive over iterations in response to the time-varying divergence of local gradients. Moreover, the weights for per-cluster model aggregation are adjusted according to the data balance feature so as to speed up the convergence rate. Experimental results show that the proposed weighted CFL achieves a faster model convergence rate and greater learning accuracy than benchmark methods under the imbalanced data scenario.
Keywords: clustered federated learning; data imbalance; convergence rate analysis; model aggregation
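The two mechanisms described in the abstract above (clustering devices by the cosine similarity of their local gradients, then weighted per-cluster aggregation) can be sketched roughly as follows. The greedy grouping rule, the fixed `threshold`, and the `balance_weights` values are assumptions for illustration only; the abstract does not specify how the paper adapts the threshold or derives the weights.

```python
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def cluster_by_gradient(grads, threshold):
    """Greedy grouping: a device joins the first cluster whose first member is similar enough."""
    clusters = []
    for i, g in enumerate(grads):
        for members in clusters:
            if cosine(g, grads[members[0]]) >= threshold:
                members.append(i)
                break
        else:
            clusters.append([i])
    return clusters

def weighted_average(vectors, weights):
    total = sum(weights)
    dims = len(vectors[0])
    return [sum(w * v[d] for v, w in zip(vectors, weights)) / total for d in range(dims)]

# Hypothetical local gradients from four edge devices.
grads = [[1.0, 0.1], [0.9, 0.2], [-1.0, 0.0], [-0.8, -0.1]]
# Hypothetical per-device weights standing in for the "data balance feature".
balance_weights = [1.0, 3.0, 1.0, 1.0]

clusters = cluster_by_gradient(grads, threshold=0.5)
aggregated = [
    weighted_average([grads[i] for i in c], [balance_weights[i] for i in c])
    for c in clusters
]
```

In this toy run the devices with similar gradient directions end up in the same cluster, and each cluster gets its own aggregate rather than one global average.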
Classification of aviation incident causes using LGBM with improved cross-validation
3
Authors: NI Xiaomei, WANG Huawei, CHEN Lingzi, LIN Ruiguan. Journal of Systems Engineering and Electronics (SCIE, CSCD), 2024, No. 2, pp. 396-405 (10 pages)
Aviation accidents are currently one of the leading causes of significant injuries and deaths worldwide. This motivates researchers to investigate aircraft safety using data analysis approaches based on an advanced machine learning algorithm. To assess aviation safety and identify the causes of incidents, a classification model with light gradient boosting machine (LGBM) based on the aviation safety reporting system (ASRS) has been developed. It is improved by k-fold cross-validation with hybrid sampling model (HSCV), which may boost classification performance and maintain data balance. The results show that employing the LGBM-HSCV model can significantly improve accuracy while alleviating data imbalance. Vertical comparison with other cross-validation (CV) methods and lateral comparison with different fold times comprise the comparative approach. Aside from the comparison, two further CV approaches based on the improved method in this study are discussed: one with a different sampling and folding order, and the other with more CV. According to the assessment indices with different methods, the LGBM-HSCV model proposed here is effective at detecting incident causes. The improved model for imbalanced data categorization proposed may serve as a point of reference for similar data processing, and the model's accurate identification of civil aviation incident causes can help improve civil aviation safety.
Keywords: aviation safety; imbalance data; light gradient boosting machine (LGBM); cross-validation (CV)
An Optimal Big Data Analytics with Concept Drift Detection on High-Dimensional Streaming Data (cited by: 1)
4
Authors: Romany F. Mansour, Shaha Al-Otaibi, Amal Al-Rasheed, Hanan Aljuaid, Irina V. Pustokhina, Denis A. Pustokhin. Computers, Materials & Continua (SCIE, EI), 2021, No. 9, pp. 2843-2858 (16 pages)
Big data streams have become ubiquitous in recent years, thanks to the rapid generation of massive volumes of data by different applications. It is challenging to apply existing data mining tools and techniques directly to these big data streams. At the same time, streaming data from several applications results in two major problems: class imbalance and concept drift. The current research paper presents a new Multi-Objective Metaheuristic Optimization-based Big Data Analytics with Concept Drift Detection (MOMBD-CDD) method on High-Dimensional Streaming Data. The presented MOMBD-CDD model has different operational stages such as pre-processing, CDD, and classification. The MOMBD-CDD model overcomes the class imbalance problem by the Synthetic Minority Over-sampling Technique (SMOTE). In order to determine the oversampling rates and neighboring point values of SMOTE, the Glowworm Swarm Optimization (GSO) algorithm is employed. Besides, the Statistical Test of Equal Proportions (STEPD), a CDD technique, is also utilized. Finally, a Bidirectional Long Short-Term Memory (Bi-LSTM) model is applied for classification. In order to improve classification performance and to compute the optimum parameters for the Bi-LSTM model, a GSO-based hyperparameter tuning process is carried out. The performance of the presented model was evaluated using high-dimensional benchmark streaming datasets, namely the intrusion detection (NSL KDDCup) dataset and the ECUE spam dataset. An extensive experimental validation process confirmed the effective outcome of the MOMBD-CDD model. The proposed model attained high accuracy of 97.45% and 94.23% on the applied KDDCup99 Dataset and ECUE Spam datasets respectively.
Keywords: Streaming data; concept drift; classification model; deep learning; class imbalance data
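The drift-detection step mentioned in the abstract above rests on a test of equal proportions: the classifier's accuracy on a recent window is compared with its accuracy on the older part of the stream. The sketch below is a generic two-proportion test with continuity correction, written as an assumption about what such a detector looks like; the function names, the one-sided tail, and the `alpha=0.003` threshold are illustrative choices, not details taken from the paper.

```python
import math

def equal_proportions_p_value(correct_old, n_old, correct_recent, n_recent):
    """One-sided p-value of a two-proportion z-test with continuity correction."""
    p_old = correct_old / n_old
    p_recent = correct_recent / n_recent
    pooled = (correct_old + correct_recent) / (n_old + n_recent)
    spread = pooled * (1.0 - pooled) * (1.0 / n_old + 1.0 / n_recent)
    if spread == 0.0:
        return 1.0  # all predictions right (or all wrong): no evidence of change
    correction = 0.5 * (1.0 / n_old + 1.0 / n_recent)
    z = (abs(p_old - p_recent) - correction) / math.sqrt(spread)
    return 0.5 * math.erfc(z / math.sqrt(2.0))  # upper tail of the standard normal

def drift_detected(correct_old, n_old, correct_recent, n_recent, alpha=0.003):
    """Flag drift only when recent accuracy is significantly lower than before."""
    recent_is_worse = correct_recent / n_recent < correct_old / n_old
    p_value = equal_proportions_p_value(correct_old, n_old, correct_recent, n_recent)
    return recent_is_worse and p_value < alpha

# Hypothetical counts: 950/1000 correct historically, then a 30-sample recent window.
sharp_drop = drift_detected(950, 1000, 15, 30)
small_dip = drift_detected(950, 1000, 28, 30)
```

A drop from 95% to 50% accuracy in the recent window is flagged, while a dip to about 93% is within what sampling noise would explain.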
The study of intelligent algorithm in particle identification of heavy-ion collisions at low and intermediate energies
5
Authors: Gao-Yi Cheng, Qian-Min Su, Xi-Guang Cao, Guo-Qiang Zhang. Nuclear Science and Techniques (SCIE, EI, CAS, CSCD), 2024, No. 2, pp. 170-182 (13 pages)
Traditional particle identification methods are time-consuming, experience-dependent, and poorly repeatable in heavy-ion collisions at low and intermediate energies. Researchers urgently need solutions to the dilemma of traditional particle identification methods. This study explores the possibility of applying intelligent learning algorithms to the particle identification of heavy-ion collisions at low and intermediate energies. Multiple intelligent algorithms, including XGBoost and TabNet, were selected to test datasets from the neutron ion multi-detector for reaction-oriented dynamics (NIMROD-ISiS) and Geant4 simulation. Tree-based machine learning algorithms and deep learning algorithms (e.g., TabNet) show excellent performance and generalization ability. Adding additional data features besides energy deposition can improve the algorithm's performance when the data distribution is nonuniform. Intelligent learning algorithms can be applied to solve the particle identification problem in heavy-ion collisions at low and intermediate energies.
Keywords: Heavy-ion collisions at low and intermediate energies; Machine learning; Ensemble learning algorithm; Particle identification; Data imbalance
An Improved Hilbert Curve for Parallel Spatial Data Partitioning (cited by: 7)
6
Authors: MENG Lingkui, HUANG Changqing, ZHAO Chunyu, LIN Zhiyong. Geo-Spatial Information Science, 2007, No. 4, pp. 282-286 (5 pages)
A novel Hilbert curve is introduced for parallel spatial data partitioning, with consideration of the huge-amount property of spatial information and the variable-length characteristic of vector data items. Based on the improved Hilbert curve, the algorithm can be designed to achieve almost-uniform spatial data partitioning among multiple disks in parallel spatial databases. Thus, the phenomenon of data imbalance can be significantly avoided and search and query efficiency can be enhanced.
Keywords: parallel spatial database; spatial data partitioning; data imbalance; Hilbert curve
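The idea in the abstract above (order spatial items along a Hilbert curve, then cut the ordered sequence so each disk holds a similar amount of variable-length data) can be sketched as follows. This uses the standard Hilbert index, not the paper's improved curve, whose details the abstract does not give; `partition_by_size`, the grid size, and the item sizes are assumptions for illustration.

```python
def hilbert_index(n, x, y):
    """Position of cell (x, y) on the Hilbert curve of an n-by-n grid (n a power of two)."""
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if (x & s) else 0
        ry = 1 if (y & s) else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate or flip the quadrant so the sub-curve has the standard orientation.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s //= 2
    return d

def partition_by_size(items, n, disks):
    """Cut the Hilbert-ordered items into at most `disks` contiguous runs of similar total size.

    Each item is (x, y, size_bytes); sizes vary, so runs are balanced by bytes, not by count.
    """
    ordered = sorted(items, key=lambda it: hilbert_index(n, it[0], it[1]))
    target = sum(it[2] for it in ordered) / disks
    parts, current, filled = [], [], 0.0
    for it in ordered:
        current.append(it)
        filled += it[2]
        if filled >= target and len(parts) < disks - 1:
            parts.append(current)
            current, filled = [], 0.0
    parts.append(current)
    return parts

# Hypothetical variable-length vector items on a 4x4 grid, spread over 4 disks.
items = [(x, y, 10 + 5 * ((x + y) % 3)) for x in range(4) for y in range(4)]
parts = partition_by_size(items, n=4, disks=4)
```

Because neighbouring positions on the curve are neighbouring cells in space, each disk receives a spatially compact run of items while the byte totals stay roughly even.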
MEM-TET: Improved Triplet Network for Intrusion Detection System (cited by: 2)
7
Authors: Weifei Wang, Jinguo Li, Na Zhao, Min Liu. Computers, Materials & Continua (SCIE, EI), 2023, No. 7, pp. 471-487 (17 pages)
With the advancement of network communication technology, network traffic shows explosive growth. Consequently, network attacks occur frequently. Network intrusion detection systems are still the primary means of detecting attacks. However, two challenges continue to stymie the development of a viable network intrusion detection system: imbalanced training data and new undiscovered attacks. Therefore, this study proposes a unique deep learning-based intrusion detection method. We use two independent in-memory autoencoders trained on regular network traffic and attacks to capture the dynamic relationship between traffic features in the presence of unbalanced training data. Then the original data is fed into the triplet network by forming a triplet with the data reconstructed from the two encoders to train. Finally, the distance relationship between the triplets determines whether the traffic is an attack. In addition, to improve the accuracy of detecting unknown attacks, this research proposes an improved triplet loss function that is used to pull the distances of the same class closer while pushing the distances belonging to different classes farther in the learned feature space. The proposed approach's effectiveness, stability, and significance are evaluated against advanced models on the Android Adware and General Malware Dataset (AAGM17), Knowledge Discovery and Data Mining Cup 1999 (KDDCUP99), Canadian Institute for Cybersecurity Group's Intrusion Detection Evaluation Dataset (CICIDS2017), UNSW-NB15, and Network Security Lab-Knowledge Discovery and Data Mining (NSL-KDD) datasets. The achieved results confirmed the superiority of the proposed method for the task of network intrusion detection.
Keywords: Intrusion detection; memory-augmented autoencoder; deep metric learning; imbalance data
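The "pull same-class samples closer, push different-class samples farther" behaviour described in the abstract above is what a triplet loss encodes. The sketch below shows the standard triplet loss with a margin; the abstract does not give the paper's improved variant, so the formula, the `margin` value, and the toy embeddings here are assumptions for illustration.

```python
import math

def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet loss: zero once the negative is at least `margin` farther than the positive."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)

# Hypothetical 2-D embeddings of traffic samples.
anchor, positive, negative = (0.0, 0.0), (0.0, 1.0), (3.0, 4.0)

well_separated = triplet_loss(anchor, positive, negative)  # distances 1 and 5
badly_separated = triplet_loss(anchor, negative, positive)  # roles swapped: distances 5 and 1
```

The first triplet contributes no loss because the classes are already separated by more than the margin; the second is penalised, so training would move those embeddings.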
Enhanced Coyote Optimization with Deep Learning Based Cloud-Intrusion Detection System (cited by: 1)
8
Authors: Abdullah M. Basahel, Mohammad Yamin, Sulafah M. Basahel, E. Laxmi Lydia. Computers, Materials & Continua (SCIE, EI), 2023, No. 2, pp. 4319-4336 (18 pages)
Cloud Computing (CC) is the preference of all information technology (IT) organizations as it offers pay-per-use based and flexible services to its users. But privacy and security become the main hindrances in its achievement due to its distributed and open architecture, which is prone to intruders. An Intrusion Detection System (IDS) is one of the commonly utilized systems for detecting attacks on the cloud. IDS proves to be an effective and promising technique that identifies malicious activities and known threats by observing traffic data in computers, and warnings are given when such threats are identified. The current mainstream IDS are assisted with machine learning (ML) but have issues of low detection rates and demand extensive feature engineering. This article devises an Enhanced Coyote Optimization with Deep Learning based Intrusion Detection System for Cloud Security (ECODL-IDSCS) model. The ECODL-IDSCS model initially addresses the class imbalance data problem by the use of the Adaptive Synthetic (ADASYN) technique. For detection and classification of intrusions, a long short term memory (LSTM) model is exploited. In addition, the ECO algorithm is derived to optimally fine tune the hyperparameters related to the LSTM model to enhance its detection efficiency in the cloud environment. When the presented ECODL-IDSCS model is tested on a benchmark dataset, the experimental results show the promising performance of the ECODL-IDSCS model over the existing IDS models.
Keywords: Intrusion detection system; cloud security; coyote optimization algorithm; class imbalance data; deep learning
End-to-End 2D Convolutional Neural Network Architecture for Lung Nodule Identification and Abnormal Detection in Cloud
9
Authors: Safdar Ali, Saad Asad, Zeeshan Asghar, Atif Ali, Dohyeun Kim. Computers, Materials & Continua (SCIE, EI), 2023, No. 4, pp. 461-475 (15 pages)
The extent of the peril associated with cancer can be perceived from the lack of treatment, ineffective early diagnosis techniques, and most importantly its fatality rate. Globally, cancer is the second leading cause of death and among over a hundred types of cancer, lung cancer is the second most common type of cancer as well as the leading cause of cancer-related deaths. Anyhow, an accurate lung cancer diagnosis in a timely manner can elevate the likelihood of survival by a noticeable margin and medical imaging is a prevalent manner of cancer diagnosis since it is easily accessible to people around the globe. Nonetheless, this is not eminently efficacious considering human inspection of medical images can yield a high false positive rate. Ineffective and inefficient diagnosis is a crucial reason for such a high mortality rate for this malady. However, the conspicuous advancements in deep learning and artificial intelligence have stimulated the development of exceedingly precise diagnosis systems. The development and performance of these systems rely prominently on the data that is used to train these systems. A standard problem witnessed in publicly available medical image datasets is the severe imbalance of data between different classes. This grave imbalance of data can make a deep learning model biased towards the dominant class and unable to generalize. This study aims to present an end-to-end convolutional neural network that can accurately differentiate lung nodules from non-nodules and reduce the false positive rate to a bare minimum. To tackle the problem of data imbalance, we oversampled the data by transforming available images in the minority class. The average false positive rate in the proposed method is a mere 1.5 percent. However, the average false negative rate is 31.76 percent. The proposed neural network has 68.66 percent sensitivity and 98.42 percent specificity.
Keywords: Convolutional neural networks; medical image processing; lung nodule identification; data imbalance; deep learning
An Ensemble Classification Model Based on Imbalanced Data for Aviation Safety
10
Authors: NI Xiaomei, WANG Huawei, LV Shaolan, XIONG Minglan. Wuhan University Journal of Natural Sciences (CAS, CSCD), 2021, No. 5, pp. 437-443 (7 pages)
Nowadays aviation accidents have become one of the major causes of severe injuries and fatalities around the world. This attracts the research community to look into aviation safety by applying data analysis techniques based on an advanced machine learning algorithm. An ensemble classification model based on the Aviation Safety Reporting System (ASRS) has been proposed to analyze aviation safety targeting the people injured in the system. The ensemble classification model contains two modules: the data-driven module consisting of data cleaning, feature selection, and imbalanced data division and reorganization, and the model-driven module stacked by Random Forest (RF), XGBoost (XGB), and Light Gradient Boosting Machine (LGBM) separately. The results indicate that the ensemble model could solve the data imbalance while vastly improving accuracy. LGBM shows higher accuracy and faster running time in the analysis of a single model of the ASRS-based imbalanced data, while the ensemble model has the best performance in classification at the same time. The ensemble model proposed for imbalanced data classification can provide a certain reference for similar data processing while improving the safety of civil aviation.
Keywords: aviation safety; Aviation Safety Reporting System (ASRS); ensemble model; imbalance data; classification; Light Gradient Boosting Machine (LGBM)
Credit Risk Prediction Based on Improved ADASYN Sampling and Optimized LightGBM
11
Authors: Mei Song, He Ma, Yi Zhu, Mengdi Zhang. Journal of Social Computing (EI), 2024, No. 3, pp. 232-241 (10 pages)
A credit risk prediction model named KM-ADASYN-TL-FLLightGBM (KADT-FLightGBM) is proposed in this study. Firstly, to overcome the limitation of traditional sampling methods in dealing with imbalanced datasets, an improved ADASYN sampling with K-means clustering algorithm is constructed. Moreover, the Tomek Links method is used to filter the generated samples. Secondly, an optimized LightGBM algorithm with the Focal Loss is employed to train the model using the datasets obtained by the improved ADASYN sampling. Finally, the comparative analysis between the ensemble model and other different sampling methodologies is conducted on the Lending Club dataset. The results demonstrate that the proposed model effectively minimizes the misclassification of minority classes in credit risk prediction and can be used as a reference for similar studies.
Keywords: imbalance data; credit risk prediction; Focal Loss; adaptive hybrid sampling
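The Focal Loss mentioned in the abstract above down-weights examples the model already classifies confidently, so that hard (often minority-class) examples dominate training. The sketch below is the commonly used binary form, FL = -alpha_t * (1 - p_t)^gamma * log(p_t); the default `gamma` and `alpha` values are conventional choices assumed here, not the settings used in the paper.

```python
import math

def focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Binary focal loss for predicted positive-class probability p and label y in {0, 1}."""
    p_t = p if y == 1 else 1.0 - p              # probability assigned to the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha  # class-balancing weight
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)

# Hypothetical predictions for a defaulting borrower (label 1).
easy_example = focal_loss(0.9, 1)  # confident and correct: heavily down-weighted
hard_example = focal_loss(0.1, 1)  # confident and wrong: dominates the loss
```

With `gamma=0` and `alpha=1` the expression reduces to ordinary cross-entropy for the positive class, which is a quick way to sanity-check an implementation.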
Detecting anomalies in blockchain transactions using machine learning classifiers and explainability analysis
12
Authors: Mohammad Hasan, Mohammad Shahriar Rahman, Helge Janicke, Iqbal H. Sarker. Blockchain (Research and Applications) (EI), 2024, No. 3, pp. 106-122 (17 pages)
As the use of blockchain for digital payments continues to rise, it becomes susceptible to various malicious attacks. Successfully detecting anomalies within blockchain transactions is essential for bolstering trust in digital payments. However, the task of anomaly detection in blockchain transaction data is challenging due to the infrequent occurrence of illicit transactions. Although several studies have been conducted in the field, a limitation persists: the lack of explanations for the model's predictions. This study seeks to overcome this limitation by integrating explainable artificial intelligence (XAI) techniques and anomaly rules into tree-based ensemble classifiers for detecting anomalous Bitcoin transactions. The Shapley additive explanation (SHAP) method is employed to measure the contribution of each feature, and it is compatible with ensemble models. Moreover, we present rules for interpreting whether a Bitcoin transaction is anomalous or not. Additionally, we introduce an under-sampling algorithm named XGBCLUS, designed to balance anomalous and non-anomalous transaction data. This algorithm is compared against other commonly used under-sampling and over-sampling techniques. Finally, the outcomes of various tree-based single classifiers are compared with those of stacking and voting ensemble classifiers. Our experimental results demonstrate that: (i) XGBCLUS enhances true positive rate (TPR) and receiver operating characteristic-area under curve (ROC-AUC) scores compared to state-of-the-art under-sampling and over-sampling techniques, and (ii) our proposed ensemble classifiers outperform traditional single tree-based machine learning classifiers in terms of accuracy, TPR, and false positive rate (FPR) scores.
Keywords: Anomaly detection; Blockchain; Bitcoin transactions; Data imbalance; Data sampling; Explainable AI; Machine learning; Decision tree; Anomaly rules
Imbalanced Problem in Initial Coin Offering Fraud Detection
13
Authors: Yifan Zheng, Maoning Wang. Proceedings of the International Computer Frontiers Conference (国际计算机前沿大会会议论文集), 2022, No. 2, pp. 448-464 (17 pages)
Initial coin offerings (ICOs) are a common way to raise funds for blockchain projects. Fraudulent ICO projects not only cause financial losses to investors but also cause a loss of confidence in the blockchain capital market. Whitepapers are usually the most important information source, so it is feasible to identify fraudulent ICO programs by analyzing whitepapers. However, the fraud samples are difficult to collect, and the classes are imbalanced. In this study, we attempt to solve this problem by extracting linguistic features from the ICO whitepaper and using a variety of cutting-edge machine learning and deep learning algorithms to train the prediction model. To handle the imbalanced samples, we attempt resampling, modifying the weights, and modifying the loss function. Our optimal method achieves an AUC of 0.94 and an accuracy of 82%, which is better than other traditional standard methods, and the results provide important implications for ICO fraud detection.
Keywords: Initial coin offering; Fraud detection; Data imbalance
Review of training-free event-related potential classification approaches in the World Robot Contest 2021 (cited by: 1)
14
Authors: Huanyu Wu, Dongrui Wu. Brain Science Advances, 2022, No. 2, pp. 82-98 (17 pages)
Recently, rapid serial visual presentation (RSVP), as a new event-related potential (ERP) paradigm, has become one of the most popular forms in electroencephalogram signal processing technologies. Several approaches have been proposed to improve the performance of RSVP analysis. In brain-computer interface systems based on RSVP, the family of approaches that do not depend on training specific parameters is essential. The participating teams proposed several effective training-free frameworks of algorithms in the ERP competition of the BCI Controlled Robot Contest in World Robot Contest 2021. This paper discusses the effectiveness of various approaches in improving the performance of the system without requiring training and suggests how to apply these approaches in a practical system. First, appropriate preprocessing techniques will greatly improve the results. Then, the non-deep learning algorithm may be more stable than the deep learning approach. Furthermore, ensemble learning can make the model more stable and robust.
Keywords: brain-computer interfaces; electroencephalogram; rapid serial visual presentation (RSVP); data imbalance; training-free