Federated learning enables data sharing with privacy protection by letting several clients and a central server continuously exchange model parameters. In practical applications, an accurate global model requires the participation of multiple reliable, high-quality clients, but because the clients are independent, the central server cannot fully control their behavior. The central server has no way of knowing whether the model parameters provided by each client in a given round are correct, so clients may purposefully or unwittingly submit anomalous data and thereby behave as malicious attackers or defective clients. To reduce the negative consequences, it is crucial to detect these abnormalities quickly and to apply incentives to the clients involved. In this paper, we propose a Federated Learning framework for Detecting and Incentivizing Abnormal Clients (FL-DIAC) to accomplish efficient and secure federated learning. For the abnormal client detection problem, we build a detector that uses an auto-encoder to identify anomalies and prevent the involvement of abnormal clients. Before the model parameters are input to the detector, we apply a Fourier transform-based anomaly data detection method for dimensionality reduction in order to reduce the computational complexity. Additionally, we create a credit score-based incentive structure to encourage clients to participate actively in training. Three training models (CNN, MLP, and ResNet-18) and three datasets (MNIST, Fashion MNIST, and CIFAR-10) have been used in experiments. According to theoretical analysis and experimental findings, FL-DIAC is more effective than other federated learning schemes of the same type.
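The Fourier-based dimensionality reduction mentioned in this abstract could look roughly like the sketch below. The abstract does not say which coefficients are kept, so keeping the magnitudes of the first `k` low-frequency DFT coefficients is an assumption, and the function name `dft_reduce` is hypothetical. A plain O(n·k) DFT is used so that the example needs only the standard library.

```python
import cmath
import math


def dft_reduce(params, k):
    """Map a flat parameter vector to k features (assumed scheme, not the paper's).

    Each feature is the magnitude of one of the first k discrete Fourier
    transform coefficients of the vector.
    """
    n = len(params)
    if not 0 < k <= n:
        raise ValueError("k must be between 1 and len(params)")
    features = []
    for freq in range(k):
        # Direct evaluation of the DFT sum for this frequency.
        coeff = sum(
            value * cmath.exp(-2j * math.pi * freq * idx / n)
            for idx, value in enumerate(params)
        )
        features.append(abs(coeff))
    return features


if __name__ == "__main__":
    flat_update = [0.2, -0.1, 0.4, 0.0, 0.3, -0.2, 0.1, 0.05]
    print(dft_reduce(flat_update, 3))
```

The reduced feature vector, rather than the full parameter vector, would then be what an auto-encoder based detector scores.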
As a representative emerging machine learning technique, federated learning (FL) has gained considerable popularity for its special feature of “making data available but not visible”. However, potential problems remain, including privacy breaches, imbalances in payment, and inequitable distribution. These shortcomings make devices reluctant to contribute relevant data to FL, or even lead them to refuse to participate. Therefore, an important but challenging issue in applying FL is to motivate as many participants as possible to provide high-quality data. In this paper, we propose an incentive mechanism for FL based on continuous zero-determinant (CZD) strategies from the perspective of game theory. We first model the interaction between the server and the devices during the FL process as a continuous iterative game. We then apply the CZD strategies, first for two players and then for multiple players, to optimize the social welfare of FL, and we prove that the server can keep social welfare at a high and stable level. Subsequently, we design an incentive mechanism based on the CZD strategies to attract devices to contribute all of their high-accuracy data to FL. Finally, we perform simulations to demonstrate that the proposed CZD-based incentive mechanism can indeed generate high and stable social welfare in FL.
When data privacy is imposed as a necessity, federated learning (FL) emerges as a relevant artificial intelligence field for developing machine learning (ML) models in a distributed and decentralized environment. FL allows ML models to be trained on local devices without any need for centralized data transfer, thereby reducing both the exposure of sensitive data and the possibility of data interception by malicious third parties. This paradigm has gained momentum in the last few years, spurred by the plethora of real-world applications that have leveraged its ability to improve the efficiency of distributed learning and to accommodate numerous participants with their data sources. By virtue of FL, models can be learned from all such distributed data sources while preserving data privacy. The aim of this paper is to provide a practical tutorial on FL, including a short methodology and a systematic analysis of existing software frameworks. Furthermore, our tutorial provides exemplary case studies from three complementary perspectives: i) Foundations of FL, describing the main components of FL, from key elements to FL categories; ii) Implementation guidelines and exemplary case studies, by systematically examining the functionalities provided by existing software frameworks for FL deployment, devising a methodology to design an FL scenario, and providing exemplary case studies with source code for different ML approaches; and iii) Trends, briefly reviewing a non-exhaustive list of research directions that are under active investigation in the current FL landscape. The ultimate purpose of this work is to serve as a reference for researchers, developers, and data scientists willing to explore the capabilities of FL in practical applications.
Explainable Artificial Intelligence (XAI) can enhance decision-making and improve on rule-based techniques by using more advanced Machine Learning (ML) and Deep Learning (DL) based algorithms. In this paper, we chose e-healthcare systems for efficient decision-making and data classification, especially in data security, data handling, diagnostics, laboratories, and decision-making. Federated Machine Learning (FML) is a new and advanced technology that helps to maintain privacy for Personal Health Records (PHR) and handle a large amount of medical data effectively. In this context, XAI, along with FML, increases efficiency and improves the security of e-healthcare systems. The experiments show efficient system performance by implementing a federated averaging algorithm on an open-source Federated Learning (FL) platform. The experimental evaluation reports the accuracy rate obtained with 5 epochs, a batch size of 16, and 5 clients, which shows a higher accuracy rate (19,104). We conclude the paper by discussing the existing gaps and future work in e-healthcare systems.
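The federated averaging step referred to in this abstract can be illustrated with a minimal sketch. It assumes each client returns a flat list of weights together with its local sample count; the function name `federated_average` and the sample values are illustrative and do not come from the paper or its platform.

```python
def federated_average(client_weights, client_sizes):
    """Average client weight vectors, weighting each by its local sample count."""
    if len(client_weights) != len(client_sizes) or not client_weights:
        raise ValueError("need one sample count per client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")
    dim = len(client_weights[0])
    averaged = [0.0] * dim
    for weights, size in zip(client_weights, client_sizes):
        if len(weights) != dim:
            raise ValueError("all clients must share one model shape")
        for i, w in enumerate(weights):
            # Each client contributes in proportion to its share of the data.
            averaged[i] += w * size / total
    return averaged


if __name__ == "__main__":
    # Five hypothetical clients, matching the client count in the abstract.
    weights = [[0.1, 0.2], [0.3, 0.1], [0.2, 0.2], [0.0, 0.4], [0.4, 0.1]]
    sizes = [100, 50, 200, 150, 100]
    print(federated_average(weights, sizes))
```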
In the assessment of car insurance claims, the claim rate presents a highly skewed probability distribution, which is typically modeled using the Tweedie distribution. The traditional approach to obtaining a Tweedie regression model involves training on a centralized dataset. When the data is provided by multiple parties, training a privacy-preserving Tweedie regression model without exchanging raw data becomes a challenge. To address this issue, this study introduces a novel vertical federated learning-based Tweedie regression algorithm for multi-party auto insurance rate setting in data silos. The algorithm keeps sensitive data local and uses privacy-preserving techniques to compute the intersection between the two parties holding the data. After determining which entities are shared, the participants train the model locally on the shared entity data to obtain the intermediate parameters of the local generalized linear model. Homomorphic encryption algorithms are used to exchange and update these intermediate parameters, so that the parties collaboratively complete the joint training of the car insurance rate-setting model. Performance tests on two publicly available datasets show that the proposed federated Tweedie regression algorithm can effectively generate Tweedie regression models that leverage the value of data from both parties without exchanging data. The assessment results of the scheme approach those of the Tweedie regression model learned from centralized data, and outperform the Tweedie regression model learned independently by a single party.
As a mature distributed machine learning paradigm, federated learning enables wireless edge devices to collaboratively train a shared AI model by stochastic gradient descent (SGD). However, devices need to upload high-dimensional stochastic gradients to the edge server during training, which causes a severe communication bottleneck. To address this problem, we compress the communication by sparsifying and quantizing the stochastic gradients of edge devices. We first derive a closed form of the communication compression in terms of sparsification and quantization factors. Then, the convergence rate of this communication-compressed system is analyzed and several insights are obtained. Finally, we formulate and solve the quantization resource allocation problem with the goal of minimizing the convergence upper bound under the constraint of multiple-access channel capacity. Simulations show that the proposed scheme outperforms the benchmarks.
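Sparsifying and then quantizing a gradient, as described in this abstract, might be sketched as follows. The abstract does not specify the operators, so top-k magnitude sparsification and deterministic uniform quantization are assumptions here (many schemes use stochastic rounding instead), and all function names are hypothetical.

```python
def top_k_sparsify(grad, k):
    """Keep the k largest-magnitude entries as an {index: value} mapping."""
    if not 0 < k <= len(grad):
        raise ValueError("k must be between 1 and len(grad)")
    order = sorted(range(len(grad)), key=lambda i: abs(grad[i]), reverse=True)
    return {i: grad[i] for i in order[:k]}


def quantize(value, scale, levels):
    """Round value to the nearest multiple of scale / levels."""
    if scale == 0:
        return 0.0
    step = scale / levels
    return round(value / step) * step


def compress(grad, k, levels):
    """Sparsify first, then quantize the surviving entries."""
    sparse = top_k_sparsify(grad, k)
    # The largest surviving magnitude sets the quantization range.
    scale = max(abs(v) for v in sparse.values())
    return {i: quantize(v, scale, levels) for i, v in sparse.items()}


if __name__ == "__main__":
    gradient = [0.1, -0.9, 0.5, 0.05]
    print(compress(gradient, k=2, levels=2))
```

Only the surviving indices, their quantization levels, and one scale value per gradient would then need to be uploaded.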
As a distributed machine learning method, federated learning (FL) has the advantage of naturally protecting data privacy. It keeps data local and trains local models on local data, which protects the privacy of that data. Federated learning thus addresses both the problem of data islands in artificial intelligence and privacy protection issues. However, existing research shows that attackers may still steal user information by analyzing the parameters exchanged during federated learning training and the aggregated parameters on the server side. To solve this problem, differential privacy (DP) techniques are widely used for privacy protection in federated learning. However, adding Gaussian noise perturbations to the data degrades the model learning performance. To address these issues, this paper proposes a differential privacy federated learning scheme based on adaptive Gaussian noise (DPFL-AGN). To protect data privacy and security during federated learning training, adaptive Gaussian noise is added in the training process to hide the real parameters uploaded by the client. In addition, this paper proposes an adaptive noise reduction method: as the model converges, the Gaussian noise in the later stages of training is reduced adaptively. This paper conducts a series of simulation experiments on the real MNIST and CIFAR-10 datasets, and the results show that the DPFL-AGN algorithm performs better than the other algorithms.
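The idea of adding Gaussian noise that shrinks as training progresses can be sketched as below. The abstract does not give the adaptation rule, so the exponential decay schedule, the norm clipping step, and the names `noise_scale`, `clip`, and `perturb` are all assumptions made for illustration. The sketch does not compute a privacy budget.

```python
import math
import random


def clip(update, clip_norm):
    """Scale the update so that its L2 norm is at most clip_norm."""
    norm = math.sqrt(sum(v * v for v in update))
    factor = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    return [v * factor for v in update]


def noise_scale(round_idx, sigma0, decay):
    """Assumed schedule: the noise multiplier decays geometrically per round."""
    return sigma0 * (decay ** round_idx)


def perturb(update, clip_norm, sigma, rng):
    """Clip the update, then add zero-mean Gaussian noise to every entry."""
    clipped = clip(update, clip_norm)
    return [v + rng.gauss(0.0, sigma * clip_norm) for v in clipped]


if __name__ == "__main__":
    rng = random.Random(0)
    local_update = [0.8, -1.5, 0.3]
    for round_idx in range(3):
        sigma = noise_scale(round_idx, sigma0=1.0, decay=0.5)
        print(round_idx, perturb(local_update, clip_norm=1.0, sigma=sigma, rng=rng))
```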
Scalability and data privacy are vital for training and deploying large-scale deep learning models. Federated learning trains models on private data by aggregating weights from various devices, and it can take advantage of the device-agnostic environment of web browsers. Nevertheless, relying on a central server in browser-based federated systems can limit scalability and interfere with the training process as the number of clients grows. Additionally, information about the training dataset can possibly be extracted from the distributed weights, potentially reducing the privacy of the local data used for training. In this paper, we investigate the challenges of scalability and data privacy in order to increase the efficiency of distributed model training. We propose a web-federated learning exchange (WebFLex) framework, which aims to improve the decentralization of the federated learning process. WebFLex is also designed to secure distributed and scalable federated learning systems that operate in web browsers across heterogeneous devices. Furthermore, WebFLex uses peer-to-peer interactions and secure weight exchanges over browser-to-browser web real-time communication (WebRTC), effectively eliminating the need for a central server. WebFLex has been evaluated in various setups using the MNIST dataset. Experimental results show WebFLex's ability to improve the scalability of federated learning systems, allowing a smooth increase in the number of participating devices without central data aggregation. In addition, WebFLex can maintain a robust federated learning procedure even when faced with device disconnections and network variability. It also improves data privacy by adding artificial noise, which strikes an appropriate balance between accuracy and privacy preservation.
Human mobility prediction is important for many applications. However, training an accurate mobility prediction model requires a large number of human trajectories, which raises serious privacy issues. Federated learning offers a promising solution to this problem: it enables mobile devices to collaboratively learn a shared prediction model while keeping all the training data on the device, decoupling the ability to do machine learning from the need to store the data in the cloud. However, existing federated learning-based methods either do not provide privacy guarantees or are vulnerable to privacy leakage. In this paper, we combine data perturbation and model perturbation mechanisms and propose a privacy-preserving mobility prediction algorithm, in which we add noise to both the transmitted model and the raw data to protect user privacy while preserving mobility prediction performance. Extensive experimental results show that our proposed method significantly outperforms the existing state-of-the-art mobility prediction method in terms of defensive performance against practical attacks while achieving comparable mobility prediction performance, demonstrating its effectiveness.
Federated Learning (FL), as an emergent paradigm in privacy-preserving machine learning, has garnered significant interest from scholars and engineers across both academic and industrial spheres. Despite its innovative approach to model training across distributed networks, FL has its vulnerabilities; the centralized server-client architecture introduces risks of single-point failures. Moreover, the integrity of the global model, a cornerstone of FL, is susceptible to compromise through poisoning attacks by malicious actors. Such attacks and the potential for privacy leakage via inference starkly undermine FL's foundational privacy and security goals. For these reasons, some participants are unwilling to use their private data to train a model, which is a bottleneck in the development and industrialization of federated learning. Blockchain technology, characterized by its decentralized ledger system, offers a compelling solution to these issues. It inherently prevents single-point failures and, through its incentive mechanisms, motivates participants to contribute computing power. Thus, blockchain-based FL (BCFL) emerges as a natural progression to address FL's challenges. This study begins with concise introductions to federated learning and blockchain technologies, followed by a formal analysis of the specific problems that FL encounters. It discusses the challenges of combining the two technologies and presents an overview of the latest cryptographic solutions that prevent privacy leakage during communication and incentives in BCFL. In addition, this research examines the use of BCFL in various fields, such as the Internet of Things and the Internet of Vehicles. Finally, it assesses the effectiveness of these solutions.
Diagnosing multi-stage diseases typically requires doctors to consider multiple data sources, including clinical symptoms, physical signs, biochemical test results, imaging findings, pathological examination data, and even genetic data. When applying machine learning to predict and diagnose multi-stage diseases, several challenges need to be addressed. Firstly, the model needs to handle multimodal data, as the data used by doctors for diagnosis includes image data, natural language data, and structured data. Secondly, the privacy of patients' data needs to be protected, as these data contain the most sensitive and private information. Lastly, for the model to be practical, its computational requirements should not be too high. To address these challenges, this paper proposes a privacy-preserving federated deep learning diagnostic method for multi-stage diseases. This method improves the forward and backward propagation processes of deep neural network modeling algorithms and introduces a homomorphic encryption step to design a federated modeling algorithm that does not need an arbiter. It also uses dedicated integrated circuits to implement the Paillier algorithm in hardware, accelerating homomorphic encryption during modeling. Finally, this paper designs and conducts experiments to evaluate the proposed solution. The experimental results show that in privacy-preserving federated deep learning diagnostic modeling, the proposed method achieves the same modeling performance as ordinary modeling without privacy protection, and is faster than similar algorithms.
With the arrival of 5G, latency-sensitive applications are becoming increasingly diverse. Mobile Edge Computing (MEC) technology offers high bandwidth, low latency, and low energy consumption, and has attracted much attention among researchers. To improve the Quality of Service (QoS), this study focuses on computation offloading in MEC. We consider the QoS from the perspective of computational cost, the curse of dimensionality, user privacy, and catastrophic forgetting of new users. We establish a QoS model based on delay and energy consumption and propose an adaptive task offloading algorithm for MEC based on DDQN and Federated Learning (FL). The proposed algorithm combines the QoS model with a deep reinforcement learning algorithm to obtain an optimal offloading policy from the local link and node state information within the channel coherence time, which addresses the problem of time-varying transmission channels and reduces the computing energy consumption and task processing delay. To solve the problems of privacy and catastrophic forgetting, we use FL to make distributed use of multiple users' data to obtain the decision model, which protects data privacy and improves the generality of the model. During FL iterations, the communication delay of individual devices can be too large, which affects the overall delay cost. Therefore, we adopt a communication delay optimization algorithm based on a unary outlier detection mechanism to reduce the communication delay of FL. The simulation results indicate that, compared with existing schemes, the proposed method significantly reduces the computation cost on a device and improves the QoS when handling complex tasks.
The emergence of on-demand service provisioning by Federated Cloud Providers (FCPs) to Cloud Users (CU) has fuelled significant innovations in cloud provisioning models. Owing to the massive traffic, large numbers of CU resource requests are sent to FCPs, and appropriate service recommendations are sent back by the FCPs. Currently, the Fourth Generation (4G) Long Term Evolution (LTE) network faces bottlenecks that affect end-user throughput and latency. Moreover, the data is exchanged among heterogeneous stakeholders, and thus trust is a prime concern. To address these limitations, the paper proposes a Blockchain (BC)-leveraged rank-based recommender scheme, FedRec, to expedite secure and trusted Cloud Service Provisioning (CSP) to the CU through the FCP against the backdrop of a base 5G communication service. The scheme operates in three phases. In the first phase, a BC-integrated request-response broker model is formulated between the CU, Cloud Brokers (BR), and the FCP, where a CU service request is forwarded through the BR to different FCPs. For service requests, Anything-as-a-Service (XaaS) is supported by the 5G enhanced Mobile Broadband (eMBB) service. In the next phase, a weighted matching recommender model is proposed at the FCP sites, built on a novel Ranking-Based Recommender (RBR) model driven by the CU requests. In the final phase, based on the matching recommendations between the CU and the FCP, Smart Contracts (SC) are executed, and resource provisioning data is stored in the Interplanetary File System (IPFS), which expedites block validation. The proposed scheme FedRec is compared in terms of SC evaluation and formal verification. In simulation, FedRec achieves a reduction of 27.55% in chain storage and a transaction throughput of 43.5074 Mbps at 150 blocks. For the IPFS, we have achieved a bandwidth improvement of 17.91%. In the RBR models, the maximum obtained hit ratio is 0.9314 at 200 million CU requests, showing an improvement of 1.2% in average servicing latency over non-RBR models and a maximization trade-off of QoE index of 2.7688 at the flow request 1.088 and at a granted service price of USD 1.559 million to the FCP for provided services. The obtained results indicate the viability of the proposed scheme against traditional approaches.
Federated learning for edge computing is a promising solution in the data booming era: it leverages the computation ability of each edge device to train local models and shares only the model gradients with the central server. However, the frequently transmitted local gradients could also leak the participants' private data. To protect the privacy of local training data, many cryptography-based Privacy-Preserving Federated Learning (PPFL) schemes have been proposed. However, due to the resource constraints of mobile devices and the complexity of cryptographic operations, traditional PPFL schemes fail to provide efficient data confidentiality and lightweight integrity verification simultaneously. To tackle this problem, we propose a Verifiable Privacy-preserving Federated Learning scheme (VPFL) for edge computing systems to prevent local gradients from leaking during the transmission stage. Firstly, we combine the Distributed Selective Stochastic Gradient Descent (DSSGD) method with the Paillier homomorphic cryptosystem to achieve distributed encryption functionality, so as to reduce the computation cost of the complex cryptosystem. Secondly, we present an online/offline signature method to realize lightweight gradient integrity verification, where the offline part can be securely outsourced to the edge server. Comprehensive security analysis demonstrates that the proposed VPFL can achieve data confidentiality, authentication, and integrity. Finally, we evaluate both the communication overhead and the computation cost of the proposed VPFL scheme. The experimental results show that VPFL has low computation cost and communication overhead while maintaining high training accuracy.
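The additive homomorphism that makes the Paillier cryptosystem useful for aggregating encrypted gradients can be shown with a toy sketch. It uses tiny hard-coded primes and integer plaintexts purely for illustration; a real deployment would need large random primes and a fixed-point encoding of gradients, and nothing here reflects how VPFL itself combines Paillier with DSSGD. It requires Python 3.8+ for `pow(x, -1, n)`.

```python
import math
import random


def keygen(p, q):
    """Build a Paillier key pair from two primes, using the common choice g = n + 1."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)  # with g = n + 1, L(g^lam mod n^2) equals lam mod n
    return (n, n + 1), (lam, mu)


def encrypt(public_key, message, rng):
    """Encrypt an integer 0 <= message < n as g^m * r^n mod n^2."""
    n, g = public_key
    while True:
        r = rng.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    n_sq = n * n
    return (pow(g, message, n_sq) * pow(r, n, n_sq)) % n_sq


def decrypt(public_key, private_key, ciphertext):
    """Recover the plaintext as L(c^lam mod n^2) * mu mod n, with L(x) = (x - 1) // n."""
    n, _ = public_key
    lam, mu = private_key
    x = pow(ciphertext, lam, n * n)
    return ((x - 1) // n * mu) % n


def add_encrypted(public_key, c1, c2):
    """Multiplying ciphertexts adds the underlying plaintexts modulo n."""
    n, _ = public_key
    return (c1 * c2) % (n * n)


if __name__ == "__main__":
    rng = random.Random(42)
    public_key, private_key = keygen(17, 19)  # toy primes, insecure by design
    total = add_encrypted(public_key,
                          encrypt(public_key, 5, rng),
                          encrypt(public_key, 7, rng))
    print(decrypt(public_key, private_key, total))
```

The aggregator in such a setting only ever multiplies ciphertexts, so it can sum client contributions without seeing any individual value.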
In real life, a large amount of data describing the same learning task may be stored in different institutions (called participants), and these data cannot be shared among participants due to privacy protection. The case in which different attributes/features of the same instance are stored in different institutions is called vertically distributed data. The purpose of vertical-federated feature selection (FS) is to jointly reduce the feature dimension of vertically distributed data without sharing local original data, so that the resulting feature subset performs as well as or better than the original feature set. To solve this problem, an embedded vertical-federated FS algorithm based on particle swarm optimisation (PSO-EVFFS) is proposed, incorporating evolutionary FS into the SecureBoost framework for the first time. By optimising both the hyper-parameters of the XGBoost model and the feature subsets, PSO-EVFFS can obtain a feature subset that makes the XGBoost model more accurate. At the same time, since different participants only share insensitive parameters such as the model loss function, PSO-EVFFS can effectively ensure the privacy of participants' data. Moreover, an ensemble ranking strategy of feature importance based on the XGBoost tree model is developed to effectively remove irrelevant features on each participant. Finally, the proposed algorithm is applied to 10 test datasets and compared with three typical vertical-federated learning frameworks and two variants of the proposed algorithm with different initialisation strategies. Experimental results show that the proposed algorithm can significantly improve the classification performance of selected feature subsets while fully protecting the data privacy of all participants.
Nowadays, smart wearable devices, which record human physiological data in real time, are widely used in the Social Internet of Things (IoT). To protect the data privacy of smart devices, researchers are paying more attention to federated learning. Although federated learning mitigates the data leakage problem, a new challenge has emerged: asynchronous federated learning shortens the convergence time, but it suffers from time delay and data heterogeneity, both of which harm accuracy. To overcome these issues, we propose an asynchronous federated learning scheme based on double compensation. The scheme improves the Delay Compensated Asynchronous Stochastic Gradient Descent (DC-ASGD) algorithm, based on the second-order Taylor expansion, as the delay compensation. It adds the FedProx operator to the objective function as the heterogeneity compensation. Besides, the proposed scheme motivates the federated learning process by adjusting the importance of the participants and the central server. We conduct multiple sets of experiments in both conventional and heterogeneous scenarios. The experimental results show that our scheme improves the accuracy by about 5% while keeping the complexity constant. Numerical experiments also show that our scheme converges more smoothly during training and adapts better to heterogeneous environments. The proposed double-compensation-based federated learning scheme is highly accurate, is flexible in terms of participants, and smooths the training process. Hence it is deemed suitable for data privacy protection of smart wearable devices.
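The two compensation terms named in this abstract build on known components, which can be sketched element-wise as below. The first function follows the standard DC-ASGD correction, where a stale gradient g computed at old weights is adjusted by λ·g·g·(w_now − w_old); the second adds the gradient of the FedProx proximal term (μ/2)·‖w − w_global‖². The paper's improved variant is not reproduced here, and the function names and sample values are illustrative.

```python
def delay_compensated_gradient(stale_grad, stale_weights, current_weights, lam):
    """Standard DC-ASGD correction of a gradient computed at stale weights."""
    return [
        g + lam * g * g * (w_now - w_old)
        for g, w_old, w_now in zip(stale_grad, stale_weights, current_weights)
    ]


def fedprox_gradient(local_grad, local_weights, global_weights, mu):
    """Local gradient plus the gradient of (mu / 2) * ||w - w_global||^2."""
    return [
        g + mu * (w - w_global)
        for g, w, w_global in zip(local_grad, local_weights, global_weights)
    ]


if __name__ == "__main__":
    # A client computed its gradient at old weights; the server has moved on since.
    print(delay_compensated_gradient([2.0, -1.0], [1.0, 0.0], [1.5, 0.2], lam=0.5))
    # A client whose local weights have drifted away from the global model.
    print(fedprox_gradient([1.0, 0.5], [2.0, 1.0], [1.0, 1.0], mu=0.1))
```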
With the increasing number of smart devices and the development of machine learning technology, users' personal data is becoming more and more valuable. On the premise of protecting users' private data, federated learning (FL) uses data stored on edge devices to carry out training tasks by contributing model parameters without revealing the original data. However, FL can still leak the user's original data through the exchange of gradient information. Existing privacy protection strategies increase the uplink time because of their encryption measures, which is a huge challenge in terms of communication. When there are a large number of devices, the privacy protection cost of the system is even higher. To address these issues, we propose a privacy-preserving scheme of user-based group collaborative federated learning (GrCol-PPFL). Our scheme divides participants into several groups, and each group communicates through a chained transmission mechanism. All groups work in parallel. The server distributes to each participant a random parameter with the same dimension as the model parameter, which serves as a mask for the model parameter. We use the public Modified National Institute of Standards and Technology (MNIST) dataset to test the model accuracy. The experimental results show that GrCol-PPFL not only ensures the accuracy of the model, but also ensures the security of the user's original data when users collude with each other. Finally, through numerical experiments, we show that by changing the number of groups, we can find the optimal number of groups that minimizes the uplink time.
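One way to picture masking combined with chained transmission is the sketch below. The abstract gives no protocol details, so this is an assumed reading: the server hands each participant a random mask, each participant in a group adds its masked parameters to a running sum that is passed along the chain, and the server removes the sum of the masks at the end. The function names are hypothetical, and the sketch makes no claim about the collusion resistance analysed in the paper.

```python
import random


def make_masks(num_clients, dim, rng):
    """Server side: draw one random mask vector per participant."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(num_clients)]


def chain_aggregate(client_params, masks):
    """Pass a running sum along the chain; each client adds params plus mask."""
    running = [0.0] * len(client_params[0])
    for params, mask in zip(client_params, masks):
        running = [r + p + m for r, p, m in zip(running, params, mask)]
    return running


def unmask_average(masked_sum, masks):
    """Server side: subtract the total mask, then average over participants."""
    mask_sum = [sum(column) for column in zip(*masks)]
    return [(s - m) / len(masks) for s, m in zip(masked_sum, mask_sum)]


if __name__ == "__main__":
    rng = random.Random(3)
    group_params = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
    masks = make_masks(len(group_params), dim=2, rng=rng)
    print(unmask_average(chain_aggregate(group_params, masks), masks))
```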
In this paper, to deal with heterogeneity in federated learning (FL) systems, a knowledge distillation (KD) driven training framework for FL is proposed, in which each user can select its neural network model on demand and distill knowledge from a big teacher model using its own private dataset. To overcome the challenge of training the big teacher model on resource-limited user devices, the digital twin (DT) is exploited so that the teacher model can be trained at the DT located in the server, which has enough computing resources. Then, during model distillation, each user can update the parameters of its model at either the physical entity or the digital agent. The joint problem of model selection, training offloading, and resource allocation for users is formulated as a mixed integer programming (MIP) problem. To solve the problem, Q-learning and optimization are used jointly: Q-learning selects models for users and determines whether to train locally or on the server, and optimization allocates resources to users based on the output of Q-learning. Simulation results show that the proposed DT-assisted KD framework and joint optimization method can significantly improve the average accuracy of users while reducing the total delay.
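The distillation step in which a small user model learns from a big teacher is commonly implemented with a temperature-softened KL divergence, sketched below. The abstract does not state the loss used, so this standard formulation, the temperature value, and the function names are assumptions.

```python
import math


def softmax(logits, temperature):
    """Temperature-scaled softmax, shifted by the maximum for numerical stability."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature):
    """KL(teacher || student) on softened distributions, scaled by temperature squared."""
    teacher = softmax(teacher_logits, temperature)
    student = softmax(student_logits, temperature)
    kl = sum(
        t * math.log(t / s)
        for t, s in zip(teacher, student)
        if t > 0
    )
    return temperature * temperature * kl


if __name__ == "__main__":
    teacher_logits = [4.0, 1.0, 0.5]
    student_logits = [2.5, 1.5, 0.2]
    print(distillation_loss(student_logits, teacher_logits, temperature=2.0))
```

In practice this term is usually combined with an ordinary cross-entropy loss on the user's labelled private data.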
Data sharing technology in the Internet of Vehicles (IoV) has attracted great research interest with the goal of realizing intelligent transportation and traffic management. Meanwhile, major concerns have been raised about the security and privacy of vehicle data. The mobility and real-time characteristics of vehicle data make data sharing more difficult in IoV. The emergence of blockchain and federated learning brings new directions. In this paper, a data-sharing model that combines blockchain and federated learning is proposed to solve the security and privacy problems of data sharing in IoV. First, we use federated learning to share data instead of exposing actual data and propose an adaptive differential privacy scheme to further balance the privacy and availability of data. Then, we integrate the verification scheme into the consensus process, so that the consensus computation can filter out low-quality models. Experimental data show that our data-sharing model better balances data availability and privacy and also offers enhanced security.
Federated Learning (FL) enables collaborative and privacy-preserving training of machine learning models within the Internet of Vehicles (IoV) realm. While FL effectively tackles privacy concerns, it also imposes significant resource requirements. In traditional FL, trained models are transmitted to a central server for global aggregation, typically in the cloud. This approach often leads to network congestion and bandwidth limitations when numerous devices communicate with the same server. The need for Flexible Global Aggregation and Dynamic Client Selection in FL for the IoV arises from the inherent characteristics of IoV environments. These include diverse and distributed data sources, varying data quality, and limited communication resources. By employing dynamic client selection, we can prioritize relevant and high-quality data sources, enhancing model accuracy. To address this issue, we propose an FL framework that selects global aggregation nodes dynamically rather than a single fixed aggregator. Flexible global aggregation ensures efficient utilization of limited network resources while accommodating the dynamic nature of IoV data sources. This approach optimizes both model performance and resource allocation, making FL in IoV more effective and adaptable. The selection of the global aggregation node is based on workload and communication speed considerations. Additionally, our framework overcomes the constraints associated with network, computational, and energy resources in the IoV environment by implementing a client selection algorithm that dynamically adjusts participants according to predefined parameters. Our approach surpasses Federated Averaging (FedAvg) and Hierarchical FL (HFL) regarding energy consumption, delay, and accuracy.
Funding: supported by the Key Research and Development Program of China (No. 2022YFC3005401), the Key Research and Development Program of Yunnan Province, China (Nos. 202203AA080009, 202202AF080003), the Science and Technology Achievement Transformation Program of Jiangsu Province, China (BA2021002), and the Fundamental Research Funds for the Central Universities (Nos. B220203006, B210203024).
Abstract: Data sharing and privacy protection are made possible by federated learning, which allows for continuous model parameter sharing between several clients and a central server. Multiple reliable and high-quality clients must participate in practical applications for the federated learning global model to be accurate, but because the clients are independent, the central server cannot fully control their behavior. The central server has no way of knowing the correctness of the model parameters provided by each client in a given round, so clients may purposefully or unwittingly submit anomalous data, leading to abnormal behavior, such as becoming malicious attackers or defective clients. To reduce their negative consequences, it is crucial to quickly detect these abnormalities and incentivize the clients involved. In this paper, we propose a Federated Learning framework for Detecting and Incentivizing Abnormal Clients (FL-DIAC) to accomplish efficient and secure federated learning. For the anomalous client detection problem, we build a detector that uses an auto-encoder to identify anomalies and prevent the involvement of abnormal clients. Specifically, before the model parameters are input to the detector, we apply a Fourier transform-based method for dimensionality reduction in order to reduce the computational complexity. Additionally, we create a credit score-based incentive structure to encourage clients to participate actively in training. Three training models (CNN, MLP, and ResNet-18) and three datasets (MNIST, Fashion MNIST, and CIFAR-10) have been used in experiments. According to theoretical analysis and experimental findings, FL-DIAC is superior to other federated learning schemes of the same type in terms of effectiveness.
Funding: partially supported by the National Natural Science Foundation of China (62173308), the Natural Science Foundation of Zhejiang Province of China (LR20F030001), and the Jinhua Science and Technology Project (2022-1-042).
Abstract: As a representative emerging machine learning technique, federated learning (FL) has gained considerable popularity for its special feature of "making data available but not visible". However, potential problems remain, including privacy breaches, imbalances in payment, and inequitable distribution. These shortcomings make devices reluctant to contribute relevant data to FL, or even lead them to refuse to participate. Therefore, in the application of FL, an important but also challenging issue is to motivate as many participants as possible to provide high-quality data to FL. In this paper, we propose an incentive mechanism for FL based on the continuous zero-determinant (CZD) strategies from the perspective of game theory. We first model the interaction between the server and the devices during the FL process as a continuous iterative game. We then apply the CZD strategies for two players and then for multiple players to optimize the social welfare of FL, for which we prove that the server can keep social welfare at a high and stable level. Subsequently, we design an incentive mechanism based on the CZD strategies to attract devices to contribute all of their high-accuracy data to FL. Finally, we perform simulations to demonstrate that our proposed CZD-based incentive mechanism can indeed generate high and stable social welfare in FL.
Funding: the R&D&I, Spain grants PID2020-119478GB-I00 and PID2020-115832GB-I00, funded by MCIN/AEI/10.13039/501100011033. N. Rodríguez-Barroso was supported by the grant FPU18/04475 funded by MCIN/AEI/10.13039/501100011033 and by "ESF Investing in your future", Spain. J. Moyano was supported by a postdoctoral Juan de la Cierva Formación grant FJC2020-043823-I funded by MCIN/AEI/10.13039/501100011033 and by European Union NextGenerationEU/PRTR. J. Del Ser acknowledges funding support from the Spanish Centro para el Desarrollo Tecnológico Industrial (CDTI) through the AI4ES project and from the Department of Education of the Basque Government (consolidated research group MATHMODE, IT1456-22).
Abstract: When data privacy is imposed as a necessity, Federated learning (FL) emerges as a relevant artificial intelligence field for developing machine learning (ML) models in a distributed and decentralized environment. FL allows ML models to be trained on local devices without any need for centralized data transfer, thereby reducing both the exposure of sensitive data and the possibility of data interception by malicious third parties. This paradigm has gained momentum in the last few years, spurred by the plethora of real-world applications that have leveraged its ability to improve the efficiency of distributed learning and to accommodate numerous participants with their data sources. By virtue of FL, models can be learned from all such distributed data sources while preserving data privacy. The aim of this paper is to provide a practical tutorial on FL, including a short methodology and a systematic analysis of existing software frameworks. Furthermore, our tutorial provides exemplary cases of study from three complementary perspectives: i) Foundations of FL, describing the main components of FL, from key elements to FL categories; ii) Implementation guidelines and exemplary cases of study, by systematically examining the functionalities provided by existing software frameworks for FL deployment, devising a methodology to design a FL scenario, and providing exemplary cases of study with source code for different ML approaches; and iii) Trends, shortly reviewing a non-exhaustive list of research directions that are under active investigation in the current FL landscape. The ultimate purpose of this work is to establish itself as a referential work for researchers, developers, and data scientists willing to explore the capabilities of FL in practical applications.
Abstract: Explainable Artificial Intelligence (XAI) enhances decision-making and improves on rule-based techniques by using more advanced Machine Learning (ML) and Deep Learning (DL) based algorithms. In this paper, we chose e-healthcare systems for efficient decision-making and data classification, especially in data security, data handling, diagnostics, laboratories, and decision-making. Federated Machine Learning (FML) is a new and advanced technology that helps to maintain privacy for Personal Health Records (PHR) and handle a large amount of medical data effectively. In this context, XAI, along with FML, increases efficiency and improves the security of e-healthcare systems. The experiments show efficient system performance by implementing a federated averaging algorithm on an open-source Federated Learning (FL) platform. The experimental evaluation, run with 5 epochs, a batch size of 16, and 5 clients, shows a higher accuracy rate (19,104). We conclude the paper by discussing the existing gaps and future work in an e-healthcare system.
Funding: This research was funded by the National Natural Science Foundation of China (No. 62272124), the National Key Research and Development Program of China (No. 2022YFB2701401), the Guizhou Province Science and Technology Plan Project (Grant No. Qiankehe Platform Talent [2020]5017), the Research Project of Guizhou University for Talent Introduction (No. [2020]61), the Cultivation Project of Guizhou University (No. [2019]56), and the Open Fund of Key Laboratory of Advanced Manufacturing Technology, Ministry of Education (GZUAMT2021KF[01]).
Abstract: In the assessment of car insurance claims, the claim rate presents a highly skewed probability distribution, which is typically modeled using the Tweedie distribution. The traditional approach to obtaining the Tweedie regression model involves training on a centralized dataset. When the data is provided by multiple parties, training a privacy-preserving Tweedie regression model without exchanging raw data becomes a challenge. To address this issue, this study introduces a novel vertical federated learning-based Tweedie regression algorithm for multi-party auto insurance rate setting in data silos. The algorithm keeps sensitive data local and uses privacy-preserving techniques to compute the intersection between the two parties holding the data. After determining which entities are shared, the participants train the model locally on the shared entity data to obtain the intermediate parameters of the local generalized linear model. Homomorphic encryption algorithms are introduced to exchange and update these intermediate parameters so that the parties collaboratively complete the joint training of the car insurance rate-setting model. Performance tests on two publicly available datasets show that the proposed federated Tweedie regression algorithm can effectively generate Tweedie regression models that leverage the value of data from both parties without exchanging data. The assessment results of the scheme approach those of the Tweedie regression model learned from centralized data and outperform the Tweedie regression model learned independently by a single party.
Funding: supported in part by the National Key Research and Development Program of China under Grant 2020YFB1807700 and in part by the National Science Foundation of China under Grant U200120122.
Abstract: As a mature distributed machine learning paradigm, federated learning enables wireless edge devices to collaboratively train a shared AI model by stochastic gradient descent (SGD). However, devices need to upload high-dimensional stochastic gradients to the edge server during training, which causes a severe communication bottleneck. To address this problem, we compress the communication by sparsifying and quantizing the stochastic gradients of edge devices. We first derive a closed form of the communication compression in terms of sparsification and quantization factors. Then, the convergence rate of this communication-compressed system is analyzed and several insights are obtained. Finally, we formulate and solve the quantization resource allocation problem with the goal of minimizing the convergence upper bound under the constraint of multiple-access channel capacity. Simulations show that the proposed scheme outperforms the benchmarks.
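Illustrative sketch (not from the paper): the abstract above compresses gradients by sparsification followed by quantization. The snippet below shows one common way to combine the two steps, using top-k magnitude sparsification and deterministic uniform quantization. The function names, the choice of top-k, and the deterministic rounding are assumptions made for illustration; the paper's actual sparsification and quantization operators may differ.

```python
def sparsify_top_k(grad, k):
    """Keep the k largest-magnitude entries of grad and zero the rest."""
    if k <= 0:
        return [0.0] * len(grad)
    order = sorted(range(len(grad)), key=lambda i: abs(grad[i]), reverse=True)
    keep = set(order[:k])
    return [g if i in keep else 0.0 for i, g in enumerate(grad)]


def quantize_uniform(grad, bits):
    """Round each entry to a uniform grid with 2**bits - 1 steps per sign.

    The grid is scaled by the largest magnitude in grad (hypothetical choice).
    """
    scale = max((abs(g) for g in grad), default=0.0)
    if scale == 0.0:
        return list(grad)
    levels = 2 ** bits - 1
    return [round(g / scale * levels) / levels * scale for g in grad]


def compress(grad, k, bits):
    """Sparsify first, then quantize the surviving entries."""
    return quantize_uniform(sparsify_top_k(grad, k), bits)


if __name__ == "__main__":
    print(compress([0.1, -2.0, 0.5, 0.9], k=2, bits=2))
```

A smaller `k` or fewer `bits` lowers the uplink payload at the cost of a coarser gradient, which is the trade-off the convergence analysis in the abstract addresses.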
Funding: the Sichuan Provincial Science and Technology Department Project under Grant 2019YFN0104, the Yibin Science and Technology Plan Project under Grant 2021GY008, and the Sichuan University of Science and Engineering Postgraduate Innovation Fund Project under Grant Y2022154.
Abstract: As a distributed machine learning method, federated learning (FL) has the advantage of naturally protecting data privacy. It keeps data local and trains local models on local data to protect the privacy of that data. The federated learning method effectively addresses the data island problem in artificial intelligence as well as privacy protection issues. However, existing research shows that attackers may still steal user information by analyzing the parameters exchanged during federated learning training and the aggregated parameters on the server side. To solve this problem, differential privacy (DP) techniques are widely used for privacy protection in federated learning. However, adding Gaussian noise perturbations to the data degrades the model learning performance. To address these issues, this paper proposes a differential privacy federated learning scheme based on adaptive Gaussian noise (DPFL-AGN). To protect the data privacy and security of the federated learning training process, adaptive Gaussian noise is added during training to hide the real parameters uploaded by the client. In addition, this paper proposes an adaptive noise reduction method: as the model converges, the Gaussian noise in the later stages of training is reduced adaptively. This paper conducts a series of simulation experiments on the real MNIST and CIFAR-10 datasets, and the results show that the DPFL-AGN algorithm performs better than the other algorithms.
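Illustrative sketch (not from the paper): the abstract above hides client updates with Gaussian noise and reduces the noise as training converges. The snippet below shows the general pattern of clipping an update, then adding Gaussian noise whose standard deviation shrinks over rounds. The geometric decay schedule, the parameter names (`sigma0`, `decay`, `clip_norm`), and the helper names are all assumptions; the abstract does not specify how DPFL-AGN adapts its noise.

```python
import math
import random


def clip_l2(update, clip_norm):
    """Scale update so its L2 norm is at most clip_norm."""
    norm = math.sqrt(sum(u * u for u in update))
    factor = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    return [u * factor for u in update]


def noise_std(round_idx, sigma0, decay):
    """Hypothetical schedule: noise multiplier decays geometrically per round."""
    return sigma0 * (decay ** round_idx)


def privatize(update, round_idx, clip_norm=1.0, sigma0=1.0, decay=0.9, rng=None):
    """Clip the update, then add zero-mean Gaussian noise for this round."""
    rng = rng or random.Random()
    std = noise_std(round_idx, sigma0, decay) * clip_norm
    return [u + rng.gauss(0.0, std) for u in clip_l2(update, clip_norm)]


if __name__ == "__main__":
    rng = random.Random(0)
    for t in (0, 10, 50):
        print(t, privatize([3.0, 4.0], t, rng=rng))
```

Lowering the noise in later rounds improves accuracy but also spends more privacy budget per round, so any real schedule has to be paired with a privacy accounting step that this sketch omits.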
Funding: This work has been funded by King Saud University, Riyadh, Saudi Arabia, through Researchers Supporting Project Number (RSPD2024R857).
Abstract: Scalability and data privacy are vital for training and deploying large-scale deep learning models. Federated learning trains models on private data by aggregating weights from various devices and taking advantage of the device-agnostic environment of web browsers. Nevertheless, relying on a central server in browser-based federated systems can limit scalability and interfere with the training process as client numbers grow. Additionally, information about the training dataset can potentially be extracted from the distributed weights, reducing the privacy of the local data used for training. In this research paper, we investigate the challenges of scalability and data privacy in order to increase the efficiency of distributed training. We propose a web-federated learning exchange (WebFLex) framework, which aims to improve the decentralization of the federated learning process. WebFLex is also designed to secure distributed and scalable federated learning systems that operate in web browsers across heterogeneous devices. Furthermore, WebFLex uses peer-to-peer interactions and secure weight exchanges over browser-to-browser web real-time communication (WebRTC), removing the need for a central server. WebFLex has been evaluated in various setups using the MNIST dataset. Experimental results show WebFLex's ability to improve the scalability of federated learning systems, allowing a smooth increase in the number of participating devices without central data aggregation. In addition, WebFLex can maintain a robust federated learning procedure even when faced with device disconnections and network variability. It also improves data privacy by adding artificial noise, which achieves an appropriate balance between accuracy and privacy preservation.
Funding: supported in part by the National Key Research and Development Program of China under 2020AAA0106000, by the National Natural Science Foundation of China under U20B2060 and U21B2036, and by a grant from the Guoqiang Institute, Tsinghua University under 2021GQG1005.
Abstract: Human mobility prediction is important for many applications. However, training an accurate mobility prediction model requires a large number of human trajectories, which raises serious privacy issues. Federated learning offers a promising solution to this problem: it enables mobile devices to collaboratively learn a shared prediction model while keeping all the training data on the device, decoupling the ability to do machine learning from the need to store the data in the cloud. However, existing federated learning-based methods either do not provide privacy guarantees or are vulnerable to privacy leakage. In this paper, we combine data perturbation and model perturbation mechanisms and propose a privacy-preserving mobility prediction algorithm, where we add noise to both the transmitted model and the raw data to protect user privacy while maintaining mobility prediction performance. Extensive experimental results show that our proposed method significantly outperforms the existing state-of-the-art mobility prediction method in terms of defensive performance against practical attacks while having comparable mobility prediction performance, demonstrating its effectiveness.
Funding: supported by the High-performance Reliable Multi-Party Secure Computing Technology and Product Project for Industrial Internet, No. TC220H056.
Abstract: Federated Learning (FL), as an emergent paradigm in privacy-preserving machine learning, has garnered significant interest from scholars and engineers across both academic and industrial spheres. Despite its innovative approach to model training across distributed networks, FL has its vulnerabilities; the centralized server-client architecture introduces risks of single-point failures. Moreover, the integrity of the global model, a cornerstone of FL, is susceptible to compromise through poisoning attacks by malicious actors. Such attacks and the potential for privacy leakage via inference starkly undermine FL's foundational privacy and security goals. For these reasons, some participants are unwilling to use their private data to train a model, which is a bottleneck in the development and industrialization of federated learning. Blockchain technology, characterized by its decentralized ledger system, offers a compelling solution to these issues. It inherently prevents single-point failures and, through its incentive mechanisms, motivates participants to contribute computing power. Thus, blockchain-based FL (BCFL) emerges as a natural progression to address FL's challenges. This study begins with concise introductions to federated learning and blockchain technologies, followed by a formal analysis of the specific problems that FL encounters. It discusses the challenges of combining the two technologies and presents an overview of the latest cryptographic solutions that prevent privacy leakage during communication and incentives in BCFL. In addition, this research examines the use of BCFL in various fields, such as the Internet of Things and the Internet of Vehicles. Finally, it assesses the effectiveness of these solutions.
Funding: funded by the National Natural Science Foundation, China (No. 62172123), the Key Research and Development Program of Heilongjiang (Grant No. 2022ZX01A36), the Special Projects for the Central Government to Guide the Development of Local Science and Technology, China (No. ZY20B11), and the Harbin Manufacturing Technology Innovation Talent Project (No. CXRC20221104236).
Abstract: Diagnosing multi-stage diseases typically requires doctors to consider multiple data sources, including clinical symptoms, physical signs, biochemical test results, imaging findings, pathological examination data, and even genetic data. When applying machine learning modeling to predict and diagnose multi-stage diseases, several challenges need to be addressed. Firstly, the model needs to handle multimodal data, as the data used by doctors for diagnosis includes image data, natural language data, and structured data. Secondly, the privacy of patients' data needs to be protected, as these data contain the most sensitive and private information. Lastly, considering the practicality of the model, the computational requirements should not be too high. To address these challenges, this paper proposes a privacy-preserving federated deep learning diagnostic method for multi-stage diseases. This method improves the forward and backward propagation processes of deep neural network modeling algorithms and introduces a homomorphic encryption step to design a federated modeling algorithm without the need for an arbiter. It also utilizes dedicated integrated circuits to implement the Paillier algorithm in hardware, providing accelerated support for homomorphic encryption in modeling. Finally, this paper designs and conducts experiments to evaluate the proposed solution. The experimental results show that in privacy-preserving federated deep learning diagnostic modeling, the method in this paper achieves the same modeling performance as ordinary modeling without privacy protection and has a higher modeling speed than similar algorithms.
Funding: supported by the National Natural Science Foundation of China (62032013, 62072094), the Liaoning Province Science and Technology Fund Project (2020MS086), the Shenyang Science and Technology Plan Project (20206424), the Fundamental Research Funds for the Central Universities (N2116014, N180101028), and the CERNET Innovation Project (NGII20190504).
Abstract: With the arrival of 5G, latency-sensitive applications are becoming increasingly diverse. Mobile Edge Computing (MEC) technology offers high bandwidth, low latency, and low energy consumption, and has attracted much attention among researchers. To improve the Quality of Service (QoS), this study focuses on computation offloading in MEC. We consider the QoS from the perspectives of computational cost, the curse of dimensionality, user privacy, and catastrophic forgetting for new users. A QoS model is established based on delay and energy consumption, and an adaptive task offloading algorithm for MEC is built on DDQN and Federated Learning (FL). The proposed algorithm combines the QoS model with deep reinforcement learning to obtain an optimal offloading policy from the local link and node state information within the channel coherence time, which addresses the problem of time-varying transmission channels and reduces the computing energy consumption and task processing delay. To solve the problems of privacy and catastrophic forgetting, we use FL to make distributed use of multiple users' data to obtain the decision model, protecting data privacy and improving the generality of the model. During FL iterations, the communication delay of individual devices can be too large, which affects the overall delay cost. Therefore, we adopt a communication delay optimization algorithm based on a unary outlier detection mechanism to reduce the communication delay of FL. The simulation results indicate that, compared with existing schemes, the proposed method significantly reduces the computation cost on a device and improves the QoS when handling complex tasks.
Abstract: The emergence of on-demand service provisioning by Federated Cloud Providers (FCPs) to Cloud Users (CU) has fuelled significant innovations in cloud provisioning models. Owing to the massive traffic, massive CU resource requests are sent to FCPs, and appropriate service recommendations are sent by FCPs. Currently, the Fourth-Generation (4G) Long Term Evolution (LTE) network faces bottlenecks that affect end-user throughput and latency. Moreover, the data is exchanged among heterogeneous stakeholders, and thus trust is a prime concern. To address these limitations, the paper proposes a Blockchain (BC)-leveraged rank-based recommender scheme, FedRec, to expedite secure and trusted Cloud Service Provisioning (CSP) to the CU through the FCP against the backdrop of a base 5G communication service. The scheme operates in three phases. In the first phase, a BC-integrated request-response broker model is formulated between the CU, Cloud Brokers (BR), and the FCP, where a CU service request is forwarded through the BR to different FCPs. For service requests, Anything-as-a-Service (XaaS) is supported by the 5G enhanced Mobile Broadband (eMBB) service. In the next phase, a weighted matching recommender model is proposed at the FCP sites, built on a novel Ranking-Based Recommender (RBR) model driven by the CU requests. In the final phase, based on the matching recommendations between the CU and the FCP, Smart Contracts (SC) are executed, and resource provisioning data is stored in the Interplanetary File System (IPFS), which expedites block validation. The proposed scheme FedRec is compared in terms of SC evaluation and formal verification. In simulation, FedRec achieves a reduction of 27.55% in chain storage and a transaction throughput of 43.5074 Mbps at 150 blocks. For the IPFS, we have achieved a bandwidth improvement of 17.91%. In the RBR models, the maximum obtained hit ratio is 0.9314 at 200 million CU requests, showing an improvement of 1.2% in average servicing latency over non-RBR models and a maximization trade-off of QoE index of 2.7688 at the flow request 1.088 and at a granted service price of USD 1.559 million to the FCP for provided services. The obtained results indicate the viability of the proposed scheme against traditional approaches.
Funding: supported by the National Natural Science Foundation of China (No. 62206238), the Natural Science Foundation of Jiangsu Province (Grant No. BK20220562), and the Natural Science Research Project of Universities in Jiangsu Province (No. 22KJB520010).
Abstract: Federated learning for edge computing is a promising solution in the data booming era, which leverages the computation ability of each edge device to train local models and only shares the model gradients with the central server. However, the frequently transmitted local gradients could also leak the participants' private data. To protect the privacy of local training data, many cryptography-based Privacy-Preserving Federated Learning (PPFL) schemes have been proposed. However, due to the constrained resources of mobile devices and the complexity of cryptographic operations, traditional PPFL schemes fail to provide efficient data confidentiality and lightweight integrity verification simultaneously. To tackle this problem, we propose a Verifiable Privacy-preserving Federated Learning scheme (VPFL) for edge computing systems to prevent local gradients from leaking during the transmission stage. Firstly, we combine the Distributed Selective Stochastic Gradient Descent (DSSGD) method with the Paillier homomorphic cryptosystem to achieve distributed encryption, so as to reduce the computation cost of the complex cryptosystem. Secondly, we present an online/offline signature method to realize lightweight gradient integrity verification, where the offline part can be securely outsourced to the edge server. Comprehensive security analysis demonstrates that the proposed VPFL can achieve data confidentiality, authentication, and integrity. Finally, we evaluate both the communication overhead and the computation cost of the proposed VPFL scheme. The experimental results show that VPFL has low computation costs and communication overheads while maintaining high training accuracy.
Funding: supported by two funding sources: the Scientific Innovation 2030 Major Project for New Generation of AI, Ministry of Science and Technology of the People's Republic of China (2020AAA0107300), and the National Natural Science Foundation of China (62133015).
Abstract: In real life, a large amount of data describing the same learning task may be stored in different institutions (called participants), and these data cannot be shared among participants due to privacy protection. The case in which different attributes/features of the same instance are stored in different institutions is called vertically distributed data. The purpose of vertical-federated feature selection (FS) is to reduce the feature dimension of vertically distributed data jointly without sharing local original data, so that the feature subset obtained performs as well as or better than the original feature set. To solve this problem, an embedded vertical-federated FS algorithm based on particle swarm optimisation (PSO-EVFFS) is proposed in this paper by incorporating evolutionary FS into the SecureBoost framework for the first time. By optimising both the hyper-parameters of the XGBoost model and the feature subsets, PSO-EVFFS can obtain a feature subset that makes the XGBoost model more accurate. At the same time, since different participants only share insensitive parameters such as the model loss function, PSO-EVFFS can effectively ensure the privacy of participants' data. Moreover, an ensemble ranking strategy of feature importance based on the XGBoost tree model is developed to effectively remove irrelevant features on each participant. Finally, the proposed algorithm is applied to 10 test datasets and compared with three typical vertical-federated learning frameworks and two variants of the proposed algorithm with different initialisation strategies. Experimental results show that the proposed algorithm can significantly improve the classification performance of selected feature subsets while fully protecting the data privacy of all participants.
Funding: supported by the National Natural Science Foundation of China, No. 61977006.
Abstract: Nowadays, smart wearable devices are used widely in the Social Internet of Things (IoT), where they record human physiological data in real time. To protect the data privacy of smart devices, researchers are paying more attention to federated learning. Although the data leakage problem is somewhat solved, a new challenge has emerged. Asynchronous federated learning shortens the convergence time, but it suffers from time delay and data heterogeneity, both of which harm accuracy. To overcome these issues, we propose an asynchronous federated learning scheme based on double compensation. The scheme improves the Delay Compensated Asynchronous Stochastic Gradient Descent (DC-ASGD) algorithm, using the second-order Taylor expansion as the delay compensation. It adds the FedProx operator to the objective function as the heterogeneity compensation. Besides, the proposed scheme motivates the federated learning process by adjusting the importance of the participants and the central server. We conduct multiple sets of experiments in both conventional and heterogeneous scenarios. The experimental results show that our scheme improves the accuracy by about 5% while keeping the complexity constant. Numerical experiments also show that our scheme converges more smoothly during training and adapts better to heterogeneous environments. The proposed double-compensation-based federated learning scheme is highly accurate, flexible in terms of participants, and smooths the training process. Hence it is deemed suitable for data privacy protection of smart wearable devices.
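Illustrative sketch (not from the paper): the abstract above combines two corrections, a delay compensation in the style of DC-ASGD and a heterogeneity compensation using the FedProx proximal term. The snippet below shows the standard published forms of each building block: the FedProx gradient adds mu * (w_local - w_global) to the local gradient, and DC-ASGD adds lam * g * g * (w_current - w_stale) element-wise to a stale gradient. The function names and the values of `mu` and `lam` are assumptions, and the paper's improved compensation term is not reproduced here.

```python
def fedprox_grad(local_grad, w_local, w_global, mu):
    """Gradient of the local loss plus the proximal term (mu/2)*||w_local - w_global||^2."""
    return [g + mu * (wl - wg) for g, wl, wg in zip(local_grad, w_local, w_global)]


def dc_asgd_grad(stale_grad, w_current, w_stale, lam):
    """Standard DC-ASGD correction: g + lam * g*g * (w_current - w_stale), element-wise.

    g*g is the diagonal outer-product approximation of the Hessian.
    """
    return [g + lam * g * g * (wc - ws)
            for g, wc, ws in zip(stale_grad, w_current, w_stale)]


def server_step(w_current, stale_grad, w_stale, lr, lam):
    """Apply one asynchronous update using the delay-compensated gradient."""
    comp = dc_asgd_grad(stale_grad, w_current, w_stale, lam)
    return [w - lr * c for w, c in zip(w_current, comp)]


if __name__ == "__main__":
    g = fedprox_grad([1.0, -0.5], [2.0, 0.0], [1.0, 0.5], mu=0.5)
    print(g)
    print(server_step([1.5, 0.2], g, [1.0, 0.5], lr=0.1, lam=0.1))
```

The proximal term pulls each client toward the global model under heterogeneous data, while the delay term corrects for the server weights having moved since the client fetched them.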
Funding: Supported by the Major Science and Technology Project of Hainan Province (Grant No. ZDKJ2020012), the National Natural Science Foundation of China (Grant Nos. 62162024 and 62162022), and the Key Projects in Hainan Province (Grants ZDYF2021GXJS003 and ZDYF2020040).
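As a rough illustration of the double compensation described in the abstract above, the following Python sketch combines the delay-compensated gradient commonly associated with DC-ASGD (the stale gradient plus `lam * g * g * (w_current - w_stale)`, where the elementwise squared gradient stands in for the Hessian) with the FedProx proximal term `mu * (w_local - w_global)`. The abstract does not give the authors' modified formulas, so the function names and the parameters `lam`, `mu`, and `lr` are assumptions made for illustration only.

```python
from typing import List

Vector = List[float]


def delay_compensated_gradient(stale_grad: Vector, w_current: Vector,
                               w_stale: Vector, lam: float) -> Vector:
    """Correct a stale gradient for the drift between w_stale and w_current.

    Uses the standard DC-ASGD-style diagonal approximation (an assumption,
    not necessarily the variant used in the paper):
        g + lam * g * g * (w_current - w_stale), elementwise.
    """
    return [g + lam * g * g * (wc - ws)
            for g, wc, ws in zip(stale_grad, w_current, w_stale)]


def fedprox_gradient(local_grad: Vector, w_local: Vector,
                     w_global: Vector, mu: float) -> Vector:
    """Gradient of the local loss plus the FedProx proximal term.

    The proximal term (mu / 2) * ||w_local - w_global||^2 contributes
    mu * (w_local - w_global) to the gradient.
    """
    return [g + mu * (wl - wg)
            for g, wl, wg in zip(local_grad, w_local, w_global)]


def server_step(w_current: Vector, stale_grad: Vector, w_stale: Vector,
                lr: float, lam: float) -> Vector:
    """Apply one asynchronous server update with delay compensation."""
    g = delay_compensated_gradient(stale_grad, w_current, w_stale, lam)
    return [w - lr * gi for w, gi in zip(w_current, g)]
```

When `w_current` equals `w_stale` (no delay), the compensation term vanishes and the update reduces to plain gradient descent; when `mu` is zero, the client gradient reduces to the ordinary local gradient.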
Abstract: With the increasing number of smart devices and the development of machine learning technology, users' personal data is becoming increasingly valuable. On the premise of protecting users' private data, federated learning (FL) uses data stored on edge devices to carry out training tasks by contributing trained model parameters without revealing the original data. However, FL can still leak users' original data through the exchange of gradient information. Existing privacy protection strategies increase the uplink time because of their encryption measures, which poses a major communication challenge. When there are a large number of devices, the privacy protection cost of the system is higher still. To address these issues, we propose a privacy-preserving scheme of user-based group collaborative federated learning (GrCol-PPFL). Our scheme divides participants into several groups, and each group communicates through a chained transmission mechanism. All groups work in parallel. The server distributes to each participant a random parameter with the same dimension as the model parameter, which serves as a mask for the model parameter. We use the public Modified National Institute of Standards and Technology (MNIST) dataset to test the model accuracy. The experimental results show that GrCol-PPFL ensures not only the accuracy of the model but also the security of users' original data when users collude with each other. Finally, through numerical experiments, we show that by changing the number of groups, we can find the optimal number of groups that reduces the uplink time consumption.
Funding: Supported by the National Key Research and Development Program of China (2020YFB1807700), the National Natural Science Foundation of China (NSFC) under Grant No. 62071356, and the Chongqing Key Laboratory of Mobile Communications Technology under Grant cqupt-mct202202.
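The masking idea in the GrCol-PPFL abstract above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' protocol: it assumes that the server issues one random mask per participant, that each participant in a chain adds its masked parameters to the running sum received from its predecessor, and that the server removes the sum of the masks at the end. The function names `make_masks`, `chain_aggregate`, and `server_unmask` are hypothetical.

```python
import random
from typing import List

Vector = List[float]


def make_masks(num_participants: int, dim: int, seed: int = 0) -> List[Vector]:
    """Server side: draw one random mask per participant (assumed uniform)."""
    rng = random.Random(seed)
    return [[rng.uniform(-1.0, 1.0) for _ in range(dim)]
            for _ in range(num_participants)]


def chain_aggregate(params: List[Vector], masks: List[Vector]) -> Vector:
    """Group side: pass a running sum of masked parameters along the chain.

    Each participant only sees the running sum from its predecessor,
    never another participant's unmasked parameters.
    """
    running = [0.0] * len(params[0])
    for p, m in zip(params, masks):
        running = [r + x + y for r, x, y in zip(running, p, m)]
    return running


def server_unmask(masked_sum: Vector, masks: List[Vector]) -> Vector:
    """Server side: subtract the sum of all issued masks to get the true sum."""
    total_mask = [sum(column) for column in zip(*masks)]
    return [s - t for s, t in zip(masked_sum, total_mask)]
```

In this sketch only the last participant in the chain talks to the server, which is one way a chained scheme could lower the number of uplink transmissions per group; the actual trade-off studied in the paper depends on details the abstract does not give.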
Abstract: In this paper, to deal with heterogeneity in federated learning (FL) systems, a knowledge distillation (KD) driven training framework for FL is proposed, in which each user can select its neural network model on demand and distill knowledge from a big teacher model using its own private dataset. To overcome the challenge of training the big teacher model on resource-limited user devices, the digital twin (DT) is exploited so that the teacher model can be trained at the DT located in the server, which has enough computing resources. Then, during model distillation, each user can update the parameters of its model at either the physical entity or the digital agent. The joint problem of model selection, training offloading, and resource allocation for users is formulated as a mixed integer programming (MIP) problem. To solve the problem, Q-learning and optimization are used jointly: Q-learning selects models for users and determines whether to train locally or on the server, and optimization allocates resources to users based on the output of Q-learning. Simulation results show that the proposed DT-assisted KD framework and joint optimization method can significantly improve the average accuracy of users while reducing the total delay.
Funding: Supported by the Ministry of Education Industry-University Cooperation Collaborative Education Projects of China under Grants 202102119036 and 202102082013.
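To make the distillation step in the abstract above concrete, here is a minimal Python sketch of the widely used temperature-scaled distillation loss, in which a student model is trained to match the softened output distribution of a teacher. The abstract does not state which loss the authors use, so this standard form, the function names, and the default `temperature` are assumptions for illustration.

```python
import math
from typing import List


def log_softmax(logits: List[float], temperature: float) -> List[float]:
    """Numerically stable log-softmax of logits divided by the temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    log_norm = peak + math.log(sum(math.exp(s - peak) for s in scaled))
    return [s - log_norm for s in scaled]


def distillation_loss(student_logits: List[float],
                      teacher_logits: List[float],
                      temperature: float = 2.0) -> float:
    """KL(teacher || student) on temperature-softened distributions.

    The result is scaled by temperature**2, a common convention that keeps
    gradient magnitudes comparable across temperatures.
    """
    log_t = log_softmax(teacher_logits, temperature)
    log_s = log_softmax(student_logits, temperature)
    kl = sum(math.exp(lt) * (lt - ls) for lt, ls in zip(log_t, log_s))
    return temperature ** 2 * kl
```

The loss is zero when the student reproduces the teacher's logits and grows as the two softened distributions diverge. In practice it is usually mixed with an ordinary cross-entropy term on the user's labelled private data.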
Abstract: Data sharing technology in the Internet of Vehicles (IoV) has attracted great research interest, with the goal of realizing intelligent transportation and traffic management. Meanwhile, major concerns have been raised about the security and privacy of vehicle data. The mobility and real-time characteristics of vehicle data make data sharing in the IoV more difficult. The emergence of blockchain and federated learning brings new directions. In this paper, a data-sharing model that combines blockchain and federated learning is proposed to solve the security and privacy problems of data sharing in the IoV. First, we use federated learning to share data instead of exposing actual data, and we propose an adaptive differential privacy scheme to further balance the privacy and availability of data. Then, we integrate the verification scheme into the consensus process so that the consensus computation can filter out low-quality models. Experimental data show that our data-sharing model better balances data availability and privacy and also offers enhanced security.
Funding: Supported by the UAE University UPAR Research Grant Program under Grant 31T122.
Abstract: Federated Learning (FL) enables collaborative and privacy-preserving training of machine learning models within the Internet of Vehicles (IoV) realm. While FL effectively tackles privacy concerns, it also imposes significant resource requirements. In traditional FL, trained models are transmitted to a central server for global aggregation, typically in the cloud. This approach often leads to network congestion and bandwidth limitations when numerous devices communicate with the same server. The need for flexible global aggregation and dynamic client selection in FL for the IoV arises from the inherent characteristics of IoV environments, which include diverse and distributed data sources, varying data quality, and limited communication resources. By employing dynamic client selection, we can prioritize relevant and high-quality data sources, enhancing model accuracy. To address this issue, we propose an FL framework that selects global aggregation nodes dynamically rather than relying on a single fixed aggregator. Flexible global aggregation ensures efficient utilization of limited network resources while accommodating the dynamic nature of IoV data sources. This approach optimizes both model performance and resource allocation, making FL in the IoV more effective and adaptable. The selection of the global aggregation node is based on workload and communication speed considerations. Additionally, our framework overcomes the constraints associated with network, computational, and energy resources in the IoV environment by implementing a client selection algorithm that dynamically adjusts participants according to predefined parameters. Our approach outperforms Federated Averaging (FedAvg) and Hierarchical FL (HFL) in terms of energy consumption, delay, and accuracy.