Avatars, as promising digital representations and service assistants of users in Metaverses, can enable drivers and passengers to immerse themselves in the 3D virtual services and spaces of UAV-assisted vehicular Metaverses. However, avatar tasks include a multitude of human-to-avatar and avatar-to-avatar interactive applications, e.g., augmented reality navigation, which consume intensive computing resources, making it inefficient and impractical for vehicles to process avatar tasks locally. Fortunately, migrating avatar tasks to the nearest roadside units (RSUs) or unmanned aerial vehicles (UAVs) for execution is a promising solution to decrease computation overhead and reduce task processing latency. However, the high mobility of vehicles makes it challenging for them to independently make avatar migration decisions based on current and future vehicle status. To address these challenges, in this paper we propose a novel avatar task migration system based on multi-agent deep reinforcement learning (MADRL) to execute immersive vehicular avatar tasks dynamically. Specifically, we first formulate the problem of avatar task migration from vehicles to RSUs/UAVs as a partially observable Markov decision process that can be solved by MADRL algorithms. We then adopt the multi-agent proximal policy optimization (MAPPO) approach as the MADRL algorithm for the avatar task migration problem. To overcome the slow convergence resulting from the curse of dimensionality and the non-stationarity caused by shared parameters in MAPPO, we further propose a transformer-based MAPPO approach that uses sequential decision-making models for the efficient representation of relationships among agents.
Finally, to motivate terrestrial and non-terrestrial edge servers (e.g., RSUs and UAVs) to share computation resources and to ensure the traceability of the sharing records, we apply smart contracts and blockchain technologies to achieve secure sharing management. Numerical results demonstrate that the proposed approach outperforms the MAPPO approach by around 2% and reduces the latency of avatar task execution in UAV-assisted vehicular Metaverses by approximately 20%.
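The MAPPO algorithm adopted above inherits PPO's clipped surrogate objective. As a hedged illustration, a single-sample scalar sketch of that objective (our own simplification; the authors' implementation operates on batched, parameter-shared, transformer-based policies):

```python
def ppo_clip_loss(ratio, advantage, eps=0.2):
    """Clipped surrogate loss used by PPO/MAPPO, to be minimized.

    ratio     : pi_new(a|s) / pi_old(a|s) for one sampled action
    advantage : estimated advantage A(s, a)
    eps       : clipping parameter (0.2 is a common default, assumed here)
    """
    unclipped = ratio * advantage
    # Clip the probability ratio into [1 - eps, 1 + eps] before weighting
    clipped = max(min(ratio, 1.0 + eps), 1.0 - eps) * advantage
    # Pessimistic bound: take the smaller surrogate, negate for minimization
    return -min(unclipped, clipped)
```

Clipping removes the incentive to move the policy ratio far outside the trust region in a single update, which is what makes PPO-family methods stable with shared parameters.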
This paper is concerned with consensus of a second-order linear time-invariant multi-agent system in the situation where there exists a communication delay among the agents in the network. A proportional-integral consensus protocol is designed by using delayed and memorized state information. Under the proportional-integral consensus protocol, the consensus problem of the multi-agent system is transformed into the problem of asymptotic stability of the corresponding linear time-invariant time-delay system. Note that the location of the eigenvalues of the corresponding characteristic function of the linear time-invariant time-delay system not only determines the stability of the system, but also plays a critical role in its dynamic performance. In this paper, based on recent results on the distribution of roots of quasi-polynomials, several necessary conditions for Hurwitz stability of a class of quasi-polynomials are first derived. Then allowable regions of the consensus protocol parameters are estimated. Some necessary and sufficient conditions for determining effective protocol parameters are provided. The designed protocol can achieve consensus and improve the dynamic performance of the second-order multi-agent system. Moreover, the effects of delays on consensus of systems of harmonic oscillators/double integrators under proportional-integral consensus protocols are investigated. Furthermore, some results on proportional-integral consensus are derived for a class of high-order linear time-invariant multi-agent systems.
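The delayed proportional-integral protocol described above can be sketched in a first-order toy setting (our own illustrative discretization over a 3-node line graph with a step-count delay buffer; the gains, delay, step size, and topology are assumptions for illustration, not values from the paper):

```python
def simulate_pi_consensus(x0, kp=1.0, ki=0.2, tau=2, dt=0.05, steps=4000):
    """Euler simulation of a PI consensus protocol with communication delay.

    Each agent applies u_i = -kp * e_i - ki * integral(e_i), where e_i is the
    local disagreement computed from tau-step-delayed neighbor states on a
    line graph (neighbors are i-1 and i+1).
    """
    n = len(x0)
    hist = [list(x0)]          # state history, doubles as the delay buffer
    integ = [0.0] * n          # integral of the delayed disagreement
    for _ in range(steps):
        xd = hist[max(0, len(hist) - 1 - tau)]   # delayed snapshot
        x = hist[-1]
        nxt = []
        for i in range(n):
            nbrs = [j for j in (i - 1, i + 1) if 0 <= j < n]
            e = sum(xd[i] - xd[j] for j in nbrs)
            integ[i] += e * dt
            u = -kp * e - ki * integ[i]          # PI consensus input
            nxt.append(x[i] + u * dt)
        hist.append(nxt)
    return hist[-1]
```

With a delay small relative to the closed-loop dynamics, the disagreement decays and the agents agree; for larger delays the protocol parameters must lie in the allowable regions the paper characterizes.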
This paper studies the problem of time-varying formation control with finite-time prescribed performance for nonstrict-feedback second-order multi-agent systems with unmeasured states and unknown nonlinearities. To handle the nonlinearities, neural networks are applied to approximate the inherent dynamics of the system. In addition, due to the limitations of actual working conditions, each follower agent can only obtain the locally measurable partial state information of the leader agent. To address this problem, a neural network state observer based on the leader's state information is designed. Then, a finite-time prescribed performance adaptive output feedback control strategy is proposed by restricting the sliding mode surface to a prescribed region, which ensures that the closed-loop system has practical finite-time stability and that the formation errors of the multi-agent system converge to the prescribed performance bound in finite time. Finally, a numerical simulation is provided to demonstrate the practicality and effectiveness of the developed algorithm.
As an important mechanism in multi-agent interaction, communication can make agents form complex team relationships rather than a simple collection of independent agents. However, existing communication schemes can introduce considerable timing redundancy and irrelevant messages, which seriously affects their practical application. To solve this problem, this paper proposes a targeted multi-agent communication algorithm based on state control (SCTC). The SCTC uses a state-control-based gating mechanism to reduce the timing redundancy of communication between agents, and determines the interaction relationships between agents and the importance weight of a communication message through a series connection of hard- and self-attention mechanisms, realizing targeted processing of communication messages. In addition, by minimizing the difference between the fusion message generated from the real communication messages of each agent and the fusion message generated from the buffered messages, the correctness of the agent's final action choice is ensured. Our evaluation on a challenging set of StarCraft II benchmarks indicates that the SCTC can significantly improve learning performance and reduce the communication overhead between agents, thus ensuring better cooperation between agents.
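The serial hard-gate-then-soft-attention fusion that SCTC performs can be sketched on scalar messages (a simplification of ours: the paper learns both the gates and the relevance scores with neural networks, whereas here they are given as inputs):

```python
import math

def gated_message_fusion(messages, scores, gate):
    """Hard gate followed by soft attention over surviving messages.

    messages : scalar message from each sender (toy stand-in for vectors)
    scores   : relevance score of each sender (would be learned in SCTC)
    gate     : 0/1 hard-attention decision per sender (drops irrelevant ones)
    """
    kept = [(m, s) for m, s, g in zip(messages, scores, gate) if g]
    if not kept:
        return 0.0            # nothing communicated this step
    mx = max(s for _, s in kept)
    w = [math.exp(s - mx) for _, s in kept]   # stable softmax weights
    z = sum(w)
    return sum(wi / z * m for (m, _), wi in zip(kept, w))
```

The hard gate removes timing-redundant or irrelevant senders entirely; the softmax then distributes the importance weight over whatever remains.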
Component failures can cause multi-agent system (MAS) performance degradation and even disasters, which motivates the demand for fault diagnosis methods. A distributed sliding mode observer-based fault diagnosis method for MASs is developed in the presence of actuator and sensor faults. Firstly, the actuator and sensor faults are extended to the system state, and the system is transformed into a descriptor system form. Then, a sliding mode-based distributed unknown input observer is proposed to estimate the extended state. Furthermore, adaptive laws are introduced to adjust the observer parameters. Finally, the effectiveness of the proposed method is demonstrated with numerical simulations.
This paper studies a novel distributed optimization problem that aims to minimize the sum of the non-convex objective functionals of a multi-agent network under privacy protection, which means that the local objective of each agent is unknown to the others. The above problem involves complexity simultaneously in the time and space aspects. Existing works on distributed optimization mainly consider privacy protection in the space aspect, where the decision variable is a vector with finite dimensions. In contrast, when the time aspect is considered, as in this paper, the decision variable is a continuous function of time. Hence, the minimization of the overall functional belongs to the calculus of variations. Traditional works usually aim to seek the optimal decision function. Due to privacy protection and non-convexity, the Euler-Lagrange equation of the proposed problem is a complicated partial differential equation. Hence, we seek the optimal decision derivative function rather than the decision function itself. This can be regarded as seeking the control input for an optimal control problem, for which we propose a centralized reinforcement learning (RL) framework. In the space aspect, we further present a distributed reinforcement learning framework to deal with the impact of privacy protection. Finally, rigorous theoretical analysis and simulations validate the effectiveness of our framework.
This article addresses the leader-following output consensus problem of heterogeneous linear multi-agent systems with unknown agent parameters under directed graphs. The dynamics of the followers are allowed to be non-minimum phase with unknown arbitrary individual relative degrees. This is contrary to many existing works on distributed adaptive control schemes, where agent dynamics are required to be minimum phase and often of the same relative degree. A distributed adaptive pole placement control scheme is developed, which consists of a distributed observer and an adaptive pole placement control law. It is shown that under the proposed distributed adaptive control scheme, all signals in the closed-loop system are bounded and the outputs of all the followers track the output of the leader asymptotically. The effectiveness of the proposed scheme is demonstrated by one practical example and one numerical example.
In this paper, a new distributed consensus tracking protocol incorporating local disturbance rejection is devised for a multi-agent system with heterogeneous dynamic uncertainties and disturbances over a directed graph. It is of a two-degree-of-freedom nature. Specifically, a robust distributed controller is designed for consensus tracking, while a local disturbance estimator is designed for each agent without requiring the input channel information of the disturbances. The condition for asymptotic disturbance rejection is derived. Moreover, even when the disturbance model is not exactly known, the developed method still provides good disturbance-rejection performance. Then, a robust stabilization condition with less conservativeness is derived for the whole multi-agent system. Further, a design algorithm is given. Finally, comparisons with the conventional one-degree-of-freedom-based distributed disturbance-rejection method for mismatched disturbances and with the distributed extended-state observer for matched disturbances validate the developed method.
The problem of fixed-time group consensus for second-order multi-agent systems with disturbances is investigated. For a cooperative-competitive network, two different control protocols, fixed-time group consensus and fixed-time event-triggered group consensus, are designed. It is demonstrated that there is no Zeno behavior under the designed event-triggered control. Meanwhile, it is proved that, for an arbitrary initial state of the system, group consensus within the settling time can be obtained under the proposed control protocols by using matrix analysis and graph theory. Finally, a series of numerical examples is presented to illustrate the performance of the proposed control protocols.
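Event-triggered control of the kind designed above recomputes the control input only when a measurement error crosses a threshold. A single-integrator toy sketch (with a static threshold of our choosing, not the paper's fixed-time group-consensus trigger over a cooperative-competitive network):

```python
def event_triggered_run(x0, k=1.0, dt=0.01, steps=1000, c0=0.05):
    """Scalar single-integrator under event-triggered state feedback.

    Between events the input holds the state sampled at the last trigger
    instant; a new event fires only when |x(t) - x(t_k)| exceeds c0, so the
    controller updates far less often than the simulation steps.
    """
    x, xk = x0, x0             # xk: state sampled at the last trigger
    triggers = 0
    for _ in range(steps):
        if abs(x - xk) > c0:   # static trigger condition
            xk = x             # sample the state at the event instant
            triggers += 1
        x += -k * xk * dt      # input uses the held sample, not x itself
    return x, triggers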
This paper studies the connectivity-maintaining consensus of multi-agent systems. Considering the impact of the agents' sensing ranges on connectivity and communication energy consumption, a novel communication management strategy is proposed for multi-agent systems so that the connectivity of the system can be maintained and communication energy can be saved. In this paper, communication management means a strategy for how the sensing ranges of the agents are adjusted in the process of reaching consensus. The proposed communication management is not coupled with the controller but only imposes a constraint on it, so there is more freedom to develop an appropriate control strategy for achieving consensus. For multi-agent systems with this novel communication management, a predictive-control-based strategy is developed for achieving consensus. Simulation results indicate the effectiveness and advantages of our scheme.
A new kind of group coordination control problem, group hybrid coordination control, is investigated in this paper. Group hybrid coordination control means that, in a whole multi-agent system (MAS) consisting of two subgroups with communication between them, the agents in the two subgroups achieve consensus and containment, respectively. For MASs with both time delays and additive noises, two group control protocols are proposed to solve this problem for the containment-oriented case and the consensus-oriented case, respectively. By developing a new analysis idea, some sufficient conditions and necessary conditions related to the communication intensity between the two subgroups are obtained for the following two types of group hybrid coordination behavior: 1) agents in one subgroup and in the other subgroup achieve weak consensus and containment, respectively; 2) agents in one subgroup and in the other subgroup achieve strong consensus and containment, respectively. It is revealed that the decay of the communication impact between the two subgroups is necessary for the consensus-oriented case. Finally, the validity of the group control results is verified by several simulation examples.
The optimization of multi-zone residential heating, ventilation, and air conditioning (HVAC) control is not an easy task due to its complex dynamic thermal model and the uncertainty of occupant-driven cooling loads. Deep reinforcement learning (DRL) methods have recently been proposed to address the HVAC control problem. However, the application of single-agent DRL to multi-zone residential HVAC control may lead to non-convergence or slow convergence. In this paper, we propose MAQMC (Multi-Agent deep Q-network for multi-zone residential HVAC Control) to address this challenge, with the goal of minimizing energy consumption while maintaining occupants' thermal comfort. MAQMC is divided into MAQMC2 (MAQMC with two agents: one agent controls the temperature of each zone, and the other controls the humidity of each zone) and MAQMC3 (MAQMC with three agents: three agents control the temperature and humidity of three zones, respectively). The experimental results show that MAQMC3 can reduce energy consumption by 6.27% and MAQMC2 by 3.73% compared with the fixed-point baseline; compared with the rule-based baseline, MAQMC3 and MAQMC2 can reduce comfort violations by 61.89% and 59.07%, respectively. In addition, experiments with weather data from different regions demonstrate that the well-trained MAQMC RL agents have the robustness and adaptability to handle unknown environments.
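MAQMC's agents are deep Q-networks; the one-step Q-learning update underlying them can be illustrated in tabular form (our simplification for illustration — the paper uses neural function approximation over HVAC states, and the action set here is an assumed binary one):

```python
def q_update(Q, s, a, r, s2, alpha=0.1, gamma=0.95):
    """One tabular Q-learning step: Q(s,a) += alpha * (TD target - Q(s,a)).

    Q  : dict mapping (state, action) -> value
    s2 : successor state; the TD target bootstraps from max_b Q(s2, b)
         over an assumed action set {0, 1}
    """
    old = Q.get((s, a), 0.0)
    target = r + gamma * max(Q.get((s2, b), 0.0) for b in (0, 1))
    Q[(s, a)] = old + alpha * (target - old)
    return Q
```

In MAQMC each agent (temperature or humidity) runs this kind of update on its own action while sharing the environment, which is what makes the multi-zone problem tractable compared with one monolithic agent.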
According to the advances in users' service requirements, physical hardware accessibility, and speed of resource delivery, Cloud Computing (CC) is an essential technology for use in many fields. Moreover, the Internet of Things (IoT) is employed for the greater communication flexibility and richness required to obtain fruitful services. A multi-agent system can be a proper solution for controlling the load balancing of interaction and communication among agents. This paper proposes a multi-agent load balancing framework that consists of two phases to optimize the workload among different servers, combining large-scale CC power with various utilities and a significant number of IoT devices with low resources. Different agents are integrated based on relevant features of behavioral interaction, using classification techniques to balance the workload. A load balancing algorithm is developed to serve users' requests and to improve the solution of workload problems with an efficient distribution. In the preparatory phase, the activity tasks from IoT devices are classified by feature selection methods to optimize the scalability of CC. Then, in the main phase, the server's availability is checked and each classified task is assigned to its suitable server to enhance the performance of the cloud environment. The multi-agent load balancing framework succeeds in coping with the demands of large-scale CC and of numerous, resource-constrained IoT devices.
This paper considers the mean square output containment control problem for heterogeneous multi-agent systems (MASs) with randomly switching topologies and nonuniform distributed delays. By modeling the switching topologies as a continuous-time Markov process and taking the distributed delays into consideration, a novel distributed containment observer is proposed to estimate the convex hull spanned by the leaders' states. A novel distributed output feedback containment controller is then designed without using prior knowledge of the distributed delays. By constructing a novel switching Lyapunov functional, the output containment control problem is then solved in the mean square sense under an easily verifiable sufficient condition. Finally, two numerical examples are given to show the effectiveness of the proposed controller.
The existing containment control has been widely developed for several years, but it ignores the case of large-scale cooperation. The strong coupling of large-scale networks increases the costs of system detection and maintenance. Therefore, this paper is concerned with an extension of the containment control problem: hierarchical containment control. It aims to enable a multitude of followers to achieve a novel form of cooperation in the convex hull shaped by multiple leaders. Firstly, by constructing a three-layer topology, large-scale networks are decoupled. Then, under the condition of a directed spanning group-tree, a class of dynamic hierarchical containment control protocols is designed such that the novel group-consensus behavior in the convex hull can be realized. Moreover, the definitions of the coupling strength coefficients and the group-consensus parameter in the proposed dynamic hierarchical control protocol enhance the adjustability of the system. Compared with the existing containment control strategy, the proposed hierarchical containment control strategy improves the dynamic control performance. Finally, numerical simulations are presented to demonstrate the effectiveness of the proposed hierarchical control protocol.
Mobile-edge computing (MEC) is a promising technology for the fifth-generation (5G) and sixth-generation (6G) architectures, providing resourceful computing capabilities for Internet of Things (IoT) applications such as virtual reality, mobile devices, and smart cities. In general, these IoT applications bring higher energy consumption than traditional applications and are usually energy-constrained. To provide persistent energy, many references have studied the offloading problem to save energy consumption. However, the dynamic environment dramatically increases the difficulty of optimizing the offloading decision. In this paper, we aim to minimize the energy consumption of the entire MEC system under a latency constraint while fully considering the dynamic environment. Under Markov games, we propose a multi-agent deep reinforcement learning approach based on a bi-level actor-critic learning structure to jointly optimize the offloading decision and resource allocation. It solves the combinatorial optimization problem using an asymmetric method and computes the Stackelberg equilibrium, a better convergence point than the Nash equilibrium in terms of Pareto superiority. Our method adapts better to a dynamic environment during data transmission than a single-agent strategy and can effectively tackle the coordination problem in the multi-agent environment. The simulation results show that the proposed method decreases the total computational overhead by 17.8% compared to the actor-critic-based method, and by 31.3%, 36.5%, and 44.7% compared with random offloading, all-local execution, and all-offloading execution, respectively.
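The Stackelberg equilibrium targeted by the bi-level actor-critic structure can be illustrated by enumeration on a toy matrix game (the payoff matrices below are invented for illustration and have nothing to do with the paper's MEC model):

```python
def stackelberg(leader_payoff, follower_payoff):
    """Leader-follower (Stackelberg) solution of a finite matrix game.

    The leader commits to an action first; the follower best-responds; the
    leader picks the commitment whose induced response maximizes its own
    payoff. Payoffs are row-indexed by the leader's action, column-indexed
    by the follower's.
    """
    best = None
    for a in range(len(leader_payoff)):
        # follower best-responds to the leader's committed action a
        b = max(range(len(follower_payoff[a])),
                key=lambda j: follower_payoff[a][j])
        if best is None or leader_payoff[a][b] > leader_payoff[best[0]][best[1]]:
            best = (a, b)
    return best
```

The bi-level learning structure plays the same asymmetric game, except that both levels are approximated by actor-critic networks instead of enumerated.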
To solve the problem of multi-target hunting by an unmanned surface vehicle (USV) fleet, a hunting algorithm based on multi-agent reinforcement learning is proposed. Firstly, the hunting environment and a kinematic model without boundary constraints are built, and the criteria for successful target capture are given. Then, the cooperative hunting problem of a USV fleet is modeled as a decentralized partially observable Markov decision process (Dec-POMDP), and a distributed partially observable multi-target hunting proximal policy optimization (DPOMH-PPO) algorithm applicable to USVs is proposed. In addition, an observation model, a reward function, and an action space applicable to multi-target hunting tasks are designed. To deal with the dynamically changing dimension of the observational features input under partial observability, a feature embedding block is proposed: by combining the two feature compression methods of column-wise max pooling (CMP) and column-wise average pooling (CAP), an observational feature encoding is established. Finally, the centralized training and decentralized execution framework is adopted to complete the training of the hunting strategy; each USV in the fleet shares the same policy and performs actions independently. Simulation experiments have verified the effectiveness of the DPOMH-PPO algorithm in test scenarios with different numbers of USVs. Moreover, the advantages of the proposed model are comprehensively analyzed in terms of algorithm performance, the migration effect across task scenarios, and the self-organization capability after damage, and the potential for deployment and application of DPOMH-PPO in real environments is verified.
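The CMP/CAP feature compression described above can be sketched directly (a minimal version of ours: pooling a variable number of per-target feature rows into two fixed-length vectors, so the policy input dimension no longer depends on how many targets are currently observed):

```python
def column_pool(features):
    """Column-wise max pooling (CMP) and column-wise average pooling (CAP).

    features : list of per-target feature rows, all of the same length but
               variable in number from step to step
    Returns the pair (CMP vector, CAP vector), each of fixed length.
    """
    cols = list(zip(*features))               # transpose rows into columns
    cmp_vec = [max(c) for c in cols]          # strongest signal per feature
    cap_vec = [sum(c) / len(c) for c in cols] # aggregate trend per feature
    return cmp_vec, cap_vec
```

Concatenating the two pooled vectors keeps both the most salient target and the aggregate picture, which is the intuition behind combining the two compression methods.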
We investigate the fixed-time containment control (FCC) problem of multi-agent systems (MASs) under discontinuous communication. A saturation function is used in the controller to achieve containment control in MASs. One difference from using a sign function is that it avoids differentiating discontinuous functions, which further ensures the continuity of the control input. Considering the discontinuous communication, a dynamic variable is constructed that is always non-negative between any two communications of an agent. Based on the designed variable, a dynamic event-triggered algorithm is proposed to achieve FCC, which can effectively reduce controller updating. In addition, we design a further event-triggered algorithm to achieve FCC, called the team-trigger mechanism, which combines the self-triggering technique with the proposed dynamic event-trigger mechanism. It converges faster than the proposed dynamic event-triggering technique and achieves a tradeoff among communication cost, convergence time, and the number of triggers in MASs. Finally, Zeno behavior is excluded, and the validity of the proposed theory is confirmed by simulation.
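A saturation function of the kind used in the controller above can be sketched as follows (the boundary-layer width `delta` is an illustrative assumption, not a value from the paper):

```python
def sat(v, delta=0.1):
    """Saturation function replacing the discontinuous sign function.

    Linear (v / delta) inside the boundary layer |v| <= delta, clipped to
    +/-1 outside, so the resulting control input stays continuous.
    """
    return max(-1.0, min(1.0, v / delta))
```

Because `sat` agrees with `sign` outside the boundary layer, the fixed-time convergence argument is preserved while the chattering and differentiation issues of `sign` are avoided.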
Funding (avatar task migration paper): supported in part by NSFC (62102099, U22A2054, 62101594); the Pearl River Talent Recruitment Program (2021QN02S643); the Guangzhou Basic Research Program (2023A04J1699); the National Research Foundation, Singapore, and the Infocomm Media Development Authority under its Future Communications Research & Development Programme; DSO National Laboratories under the AI Singapore Programme (AISG Award No. AISG2-RP-2020-019); the Energy Research Test-Bed and Industry Partnership Funding Initiative, Energy Grid (EG) 2.0 programme; the DesCartes programme and the Campus for Research Excellence and Technological Enterprise (CREATE) programme; MOE Tier 1 under Grant RG87/22; the Singapore University of Technology and Design (SUTD) (SRG-ISTD-2021-165); the SUTD-ZJU IDEA Grant SUTD-ZJU (VP) 202102; and the Ministry of Education, Singapore, through its SUTD Kickstarter Initiative (SKI 20210204).
Funding (proportional-integral consensus paper): supported in part by the National Natural Science Foundation of China (NSFC) (61703086, 61773106) and the IAPI Fundamental Research Funds (2018ZCX27).
Funding (time-varying formation control paper): supported by the National Natural Science Foundation of China (62203356) and the Fundamental Research Funds for the Central Universities of China (31020210502002).
文摘This paper studies the problem of time-varying formation control with finite-time prescribed performance for nonstrict feedback second-order multi-agent systems with unmeasured states and unknown nonlinearities.To eliminate nonlinearities,neural networks are applied to approximate the inherent dynamics of the system.In addition,due to the limitations of the actual working conditions,each follower agent can only obtain the locally measurable partial state information of the leader agent.To address this problem,a neural network state observer based on the leader state information is designed.Then,a finite-time prescribed performance adaptive output feedback control strategy is proposed by restricting the sliding mode surface to a prescribed region,which ensures that the closed-loop system has practical finite-time stability and that formation errors of the multi-agent systems converge to the prescribed performance bound in finite time.Finally,a numerical simulation is provided to demonstrate the practicality and effectiveness of the developed algorithm.
Abstract: As an important mechanism in multi-agent interaction, communication can make agents form complex team relationships rather than constituting a simple set of multiple independent agents. However, existing communication schemes can introduce substantial timing redundancy and irrelevant messages, which seriously affects their practical application. To solve this problem, this paper proposes a targeted multi-agent communication algorithm based on state control (SCTC). The SCTC uses a gating mechanism based on state control to reduce the timing redundancy of communication between agents, and determines the interaction relationships between agents and the importance weight of a communication message through a series connection of hard- and self-attention mechanisms, realizing targeted communication message processing. In addition, by minimizing the difference between the fusion message generated from the real communication message of each agent and the fusion message generated from the buffered message, the correctness of the agent's final action choice is ensured. Our evaluation on a challenging set of StarCraft II benchmarks indicates that the SCTC can significantly improve learning performance and reduce the communication overhead between agents, thus ensuring better cooperation between agents.
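The two mechanisms named above can be sketched in isolation. This is a minimal illustration, not the SCTC implementation: the threshold-based hard gate and the scaled dot-product weighting are generic stand-ins for the paper's learned state-control gate and hard-/self-attention cascade, and all names are mine.

```python
import numpy as np

def state_gate(prev_state, state, threshold=0.5):
    """Hard gate: an agent communicates only when its local state has changed
    enough since the last transmitted message (reduces timing redundancy)."""
    diff = np.asarray(state, dtype=float) - np.asarray(prev_state, dtype=float)
    return bool(np.linalg.norm(diff) > threshold)

def message_weights(query, messages):
    """Soft attention: importance weight for each incoming message,
    via a scaled dot-product softmax against the receiver's query."""
    q = np.asarray(query, dtype=float)
    m = np.asarray(messages, dtype=float)      # shape (n_msgs, d)
    scores = m @ q / np.sqrt(q.size)           # scaled dot-product scores
    exp = np.exp(scores - scores.max())        # numerically stable softmax
    return exp / exp.sum()
```

In SCTC the gate decides *whether* to send, while attention decides *how much* each received message matters; chaining them yields targeted, low-redundancy communication.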
基金supported by the National Natural Science Foundation of China(62020106003,62003162)111 project(B20007)+1 种基金the Natural Science Foundation of Jiangsu Province of China(BK20200416)the China Postdoctoral Science Foundation(2020TQ0151,2020M681590).
Abstract: Component failures can cause multi-agent system (MAS) performance degradation and even disasters, which motivates the demand for fault diagnosis methods. A distributed sliding mode observer-based fault diagnosis method for MASs is developed in the presence of actuator and sensor faults. Firstly, the actuator and sensor faults are extended to the system state, and the system is transformed into a descriptor system form. Then, a sliding mode-based distributed unknown input observer is proposed to estimate the extended state. Furthermore, adaptive laws are introduced to adjust the observer parameters. Finally, the effectiveness of the proposed method is demonstrated with numerical simulations.
基金supported in part by the National Natural Science Foundation of China(NSFC)(61773260)the Ministry of Science and Technology (2018YFB130590)。
Abstract: This paper studies a novel distributed optimization problem that aims to minimize the sum of the non-convex objective functionals of a multi-agent network under privacy protection, which means that the local objective of each agent is unknown to the others. The above problem involves complexity in both the time and space aspects. Existing works on distributed optimization mainly consider privacy protection in the space aspect, where the decision variable is a vector with finite dimensions. In contrast, when the time aspect is considered, as in this paper, the decision variable is a continuous function of time. Hence, the minimization of the overall functional belongs to the calculus of variations. Traditional works usually aim to seek the optimal decision function. Due to privacy protection and non-convexity, the Euler-Lagrange equation of the proposed problem is a complicated partial differential equation. Hence, we seek the optimal derivative of the decision function rather than the decision function itself. This can be regarded as seeking the control input for an optimal control problem, for which we propose a centralized reinforcement learning (RL) framework. In the space aspect, we further present a distributed reinforcement learning framework to deal with the impact of privacy protection. Finally, rigorous theoretical analysis and simulation validate the effectiveness of our framework.
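For context on the variational step mentioned above: minimizing a functional $J[x]=\int L(t,x,\dot{x})\,\mathrm{d}t$ in the classical calculus of variations leads to the standard Euler-Lagrange stationarity condition

```latex
\frac{\partial L}{\partial x} - \frac{\mathrm{d}}{\mathrm{d}t}\frac{\partial L}{\partial \dot{x}} = 0
```

This is the textbook single-objective form; the paper's point is that the multi-agent, privacy-protected, non-convex version of this equation becomes a complicated PDE, motivating the RL approach over the decision derivative instead.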
基金This work was supported by Research Grants Council of Hong Kong(CityU-11205221).
Abstract: This article addresses the leader-following output consensus problem of heterogeneous linear multi-agent systems with unknown agent parameters under directed graphs. The dynamics of the followers are allowed to be non-minimum phase with unknown, arbitrary individual relative degrees. This is contrary to many existing works on distributed adaptive control schemes, where agent dynamics are required to be minimum phase and often of the same relative degree. A distributed adaptive pole placement control scheme is developed, which consists of a distributed observer and an adaptive pole placement control law. It is shown that under the proposed distributed adaptive control scheme, all signals in the closed-loop system are bounded and the outputs of all the followers track the output of the leader asymptotically. The effectiveness of the proposed scheme is demonstrated by one practical example and one numerical example.
基金supported by the National Natural Science Foundation of China(62003010,61873006,61673053)the Beijing Postdoctoral Research Foundation(Q6041001202001)+1 种基金the Postdoctoral Research Foundation of Chaoyang District(Q1041001202101)the National Key Research and Development Project(2018YFC1602704,2018YFB1702704)。
Abstract: In this paper, a new distributed consensus tracking protocol incorporating local disturbance rejection is devised for a multi-agent system with heterogeneous dynamic uncertainties and disturbances over a directed graph. It is of a two-degree-of-freedom nature. Specifically, a robust distributed controller is designed for consensus tracking, while a local disturbance estimator is designed for each agent without requiring the input-channel information of the disturbances. The condition for asymptotic disturbance rejection is derived. Moreover, even when the disturbance model is not exactly known, the developed method still provides good disturbance-rejection performance. Then, a robust stabilization condition with less conservativeness is derived for the whole multi-agent system, and a design algorithm is given. Finally, comparisons with the conventional one-degree-of-freedom-based distributed disturbance-rejection method for mismatched disturbances and with the distributed extended-state observer for matched disturbances validate the developed method.
基金Project supported by the Graduate Student Research Innovation Project of Chongqing(Grant No.CYS22482)the National Natural Science Foundation of China(Grant No.61773082)+1 种基金the Science and Technology Research Program of Chongqing Municipal Education Commission(Grant No.KJZD-K202000601)the Research Program of Chongqing Talent,China(Grant No.cstc2021ycjhbgzxm0044).
Abstract: The problem of fixed-time group consensus for second-order multi-agent systems with disturbances is investigated. For a cooperative-competitive network, two different control protocols are designed: fixed-time group consensus and fixed-time event-triggered group consensus. It is demonstrated that there is no Zeno behavior under the designed event-triggered control. Meanwhile, it is proved that, for an arbitrary initial state of the system, group consensus within the settling time can be obtained under the proposed control protocols by using matrix analysis and graph theory. Finally, a series of numerical examples is presented to illustrate the performance of the proposed control protocols.
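As a reminder of what "fixed-time" means here (the abstract does not reproduce the bound): under a typical Lyapunov condition of the form $\dot{V}\le -\alpha V^{p}-\beta V^{q}$ with $\alpha,\beta>0$ and $0<p<1<q$, the settling time is bounded uniformly in the initial state:

```latex
T(x_0) \;\le\; T_{\max} \;=\; \frac{1}{\alpha(1-p)} + \frac{1}{\beta(q-1)}
```

This standard estimate is what distinguishes fixed-time from merely finite-time consensus, where the settling time may grow without bound as the initial state grows; the specific bound used in the paper may differ.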
基金supported by the National Key Research and Development Program of China(2018AAA0101701)the National Natural Science Foundation of China(62173224,61833012)。
Abstract: This paper studies the connectivity-maintaining consensus of multi-agent systems. Considering the impact of the agents' sensing ranges on connectivity and communication energy consumption, a novel communication management strategy is proposed for multi-agent systems so that the connectivity of the system can be maintained and communication energy can be saved. In this paper, communication management means a strategy for how the sensing ranges of the agents are adjusted in the process of reaching consensus. The proposed communication management is not coupled with the controller but only imposes a constraint on it, so there is more freedom to develop an appropriate control strategy for achieving consensus. For multi-agent systems with this novel communication management, a predictive control based strategy is developed for achieving consensus. Simulation results indicate the effectiveness and advantages of our scheme.
基金supported by the National Natural Science Foundation of China(62073305)the Fundamental Research Funds for the Central Universities,China University of Geosciences(Wuhan)(CUG170610)。
Abstract: A new kind of group coordination control problem, group hybrid coordination control, is investigated in this paper. Group hybrid coordination control means that, in a whole multi-agent system (MAS) consisting of two subgroups with communication between them, the agents in the two subgroups achieve consensus and containment, respectively. For MASs with both time delays and additive noises, two group control protocols are proposed to solve this problem for the containment-oriented case and the consensus-oriented case, respectively. By developing a new analysis idea, some sufficient conditions and necessary conditions related to the communication intensity between the two subgroups are obtained for the following two types of group hybrid coordination behavior: 1) agents in one subgroup and in the other subgroup achieve weak consensus and containment, respectively; 2) agents in one subgroup and in the other subgroup achieve strong consensus and containment, respectively. It is revealed that the decay of the communication impact between the two subgroups is necessary for the consensus-oriented case. Finally, the validity of the group control results is verified by several simulation examples.
基金supported by Primary Research and Development Plan of China(No.2020YFC2006602)National Natural Science Foundation of China(Nos.62072324,61876217,61876121,61772357)+2 种基金University Natural Science Foundation of Jiangsu Province(No.21KJA520005)Primary Research and Development Plan of Jiangsu Province(No.BE2020026)Natural Science Foundation of Jiangsu Province(No.BK20190942).
Abstract: The optimization of multi-zone residential heating, ventilation, and air conditioning (HVAC) control is not an easy task due to its complex dynamic thermal model and the uncertainty of occupant-driven cooling loads. Deep reinforcement learning (DRL) methods have recently been proposed to address the HVAC control problem. However, the application of single-agent DRL to multi-zone residential HVAC control may lead to non-convergence or slow convergence. In this paper, we propose MAQMC (Multi-Agent deep Q-network for multi-zone residential HVAC Control) to address this challenge, with the goal of minimizing energy consumption while maintaining occupants' thermal comfort. MAQMC comes in two variants: MAQMC2 (two agents: one controls the temperature of each zone, and the other controls the humidity of each zone) and MAQMC3 (three agents: each controls the temperature and humidity of one of the three zones). The experimental results show that MAQMC3 can reduce energy consumption by 6.27% and MAQMC2 by 3.73% compared with the fixed-point baseline; compared with the rule-based baseline, MAQMC3 and MAQMC2 can reduce comfort violations by 61.89% and 59.07%, respectively. In addition, experiments with different regional weather data demonstrate that the well-trained MAQMC RL agents are robust and adaptable to unknown environments.
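MAQMC uses deep Q-networks; as a minimal sketch of the per-agent value update each agent performs, here is a tabular simplification (the class name `QAgent` and the discrete state space are illustrative, not the paper's implementation — MAQMC2 would instantiate two such agents, one for temperature and one for humidity):

```python
import numpy as np

class QAgent:
    """One tabular Q-learning agent with epsilon-greedy exploration."""
    def __init__(self, n_states, n_actions, alpha=0.1, gamma=0.95, seed=0):
        self.q = np.zeros((n_states, n_actions))  # state-action value table
        self.alpha, self.gamma = alpha, gamma
        self.rng = np.random.default_rng(seed)

    def act(self, state, eps=0.1):
        # epsilon-greedy: explore with probability eps, else exploit
        if self.rng.random() < eps:
            return int(self.rng.integers(self.q.shape[1]))
        return int(self.q[state].argmax())

    def update(self, s, a, r, s_next):
        # standard one-step Q-learning target: r + gamma * max_a' Q(s', a')
        target = r + self.gamma * self.q[s_next].max()
        self.q[s, a] += self.alpha * (target - self.q[s, a])
```

In the multi-agent setting, each agent sees the zone state and updates only its own value estimates; the coordination benefit reported in the paper comes from splitting the joint temperature/humidity action space across agents.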
Abstract: Given the advances in users' service requirements, physical hardware accessibility, and speed of resource delivery, Cloud Computing (CC) is an essential technology in many fields. Moreover, the Internet of Things (IoT) is employed for the greater communication flexibility and richness required to obtain fruitful services. A multi-agent system can be a proper solution for controlling the load balancing of interaction and communication among agents. This paper proposes a multi-agent load balancing framework consisting of two phases to optimize the workload among different servers, combining large-scale CC power with various utilities and a significant number of IoT devices with low resources. Different agents are integrated based on relevant features of behavioral interaction, using classification techniques to balance the workload. A load balancing algorithm is developed to serve users' requests with an efficient workload distribution. In the preparatory phase, the activity tasks from IoT devices are classified by feature selection methods to optimize the scalability of CC. In the main phase, each server's availability is checked and the classified task is assigned to a suitable server to enhance cloud environment performance. The multi-agent load balancing framework thus copes with both the large-scale requirements of CC and the low resources and large numbers of IoT devices.
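The abstract does not specify the assignment rule used in the main phase; a common baseline it could resemble is greedy least-loaded placement, sketched below (all names are mine, and this is an illustrative stand-in, not the paper's algorithm):

```python
def assign_tasks(tasks, servers):
    """Greedy least-loaded placement: largest tasks first, each assigned
    to the server with the smallest current load."""
    load = {s: 0.0 for s in servers}   # current load per server
    placement = {}
    # sorting by descending cost tends to balance better than arrival order
    for task, cost in sorted(tasks.items(), key=lambda kv: -kv[1]):
        target = min(load, key=load.get)   # least-loaded server
        placement[task] = target
        load[target] += cost
    return placement, load
```

A real framework would, as the paper describes, first classify tasks (preparatory phase) and check server availability before this kind of assignment step.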
Abstract: This paper considers the mean-square output containment control problem for heterogeneous multi-agent systems (MASs) with randomly switching topologies and nonuniform distributed delays. By modeling the switching topologies as a continuous-time Markov process and taking the distributed delays into consideration, a novel distributed containment observer is proposed to estimate the convex hull spanned by the leaders' states. A novel distributed output feedback containment controller is then designed without using prior knowledge of the distributed delays. By constructing a novel switching Lyapunov functional, the output containment control problem is then solved in the mean-square sense under an easily verifiable sufficient condition. Finally, two numerical examples are given to show the effectiveness of the proposed controller.
基金supported in part by the National Natural Science Foundation of China(U22A20221,62073064)in part by the Fundamental Research Funds for the Central Universities in China(N2204007)。
Abstract: Existing containment control has been widely developed for several years, but ignores the case of large-scale cooperation. The strong coupling of large-scale networks increases the costs of system detection and maintenance. Therefore, this paper is concerned with an extended containment control issue: hierarchical containment control. It aims to enable a multitude of followers to achieve a novel form of cooperation in the convex hull shaped by multiple leaders. Firstly, by constructing a three-layer topology, large-scale networks are decoupled. Then, under the condition of a directed spanning group-tree, a class of dynamic hierarchical containment control protocols is designed such that the novel group-consensus behavior in the convex hull can be realized. Moreover, the definitions of the coupling strength coefficients and the group-consensus parameter in the proposed dynamic hierarchical control protocol enhance the adjustability of the system. Compared with the existing containment control strategy, the proposed hierarchical containment control strategy improves dynamic control performance. Finally, numerical simulations are presented to demonstrate the effectiveness of the proposed hierarchical control protocol.
基金supported by the National Natural Science Foundation of China(62162050)the Fundamental Research Funds for the Central Universities(No.N2217002)the Natural Science Foundation of Liaoning ProvincialDepartment of Science and Technology(No.2022-KF-11-04).
Abstract: Mobile-edge computing (MEC) is a promising technology for the fifth-generation (5G) and sixth-generation (6G) architectures, providing resourceful computing capabilities for Internet of Things (IoT) applications such as virtual reality, mobile devices, and smart cities. In general, these IoT applications bring higher energy consumption than traditional applications and are usually energy-constrained. To provide persistent energy, many references have studied the offloading problem to save energy consumption. However, the dynamic environment dramatically increases the difficulty of optimizing the offloading decision. In this paper, we aim to minimize the energy consumption of the entire MEC system under a latency constraint while fully considering the dynamic environment. Under Markov games, we propose a multi-agent deep reinforcement learning approach based on a bi-level actor-critic learning structure to jointly optimize the offloading decision and resource allocation. It solves the combinatorial optimization problem using an asymmetric method and computes the Stackelberg equilibrium, a better convergence point than the Nash equilibrium in terms of Pareto superiority. Our method adapts better to a dynamic environment during data transmission than a single-agent strategy and effectively tackles the coordination problem in the multi-agent environment. The simulation results show that the proposed method decreases the total computational overhead by 17.8% compared to the actor-critic-based method, and by 31.3%, 36.5%, and 44.7% compared with random offloading, all-local execution, and all-offloading execution, respectively.
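To illustrate the equilibrium concept (not the paper's learning algorithm): in a finite two-player game, a Stackelberg leader chooses the action that maximizes its payoff given that the follower best-responds. A brute-force sketch over payoff matrices (function names are mine):

```python
def follower_best_response(leader_action, follower_payoff):
    """Index of the follower action maximizing its payoff,
    given the leader's committed action."""
    row = follower_payoff[leader_action]
    return max(range(len(row)), key=row.__getitem__)

def stackelberg(leader_payoff, follower_payoff):
    """Leader anticipates the follower's best response and
    optimizes its own payoff over that anticipation."""
    best_a = max(
        range(len(leader_payoff)),
        key=lambda a: leader_payoff[a][follower_best_response(a, follower_payoff)],
    )
    return best_a, follower_best_response(best_a, follower_payoff)
```

The bi-level actor-critic structure in the paper plays an analogous role: the upper level commits to a decision, and the lower level optimizes conditioned on it, which is why the method converges to a Stackelberg rather than a Nash point.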
基金financial support from National Natural Science Foundation of China(Grant No.61601491)Natural Science Foundation of Hubei Province,China(Grant No.2018CFC865)Military Research Project of China(-Grant No.YJ2020B117)。
Abstract: To solve the problem of multi-target hunting by an unmanned surface vehicle (USV) fleet, a hunting algorithm based on multi-agent reinforcement learning is proposed. Firstly, the hunting environment and a kinematic model without boundary constraints are built, and the criteria for successful target capture are given. Then, the cooperative hunting problem of a USV fleet is modeled as a decentralized partially observable Markov decision process (Dec-POMDP), and a distributed partially observable multi-target hunting Proximal Policy Optimization (DPOMH-PPO) algorithm applicable to USVs is proposed. In addition, an observation model, a reward function, and an action space applicable to multi-target hunting tasks are designed. To deal with the dynamically changing dimension of the observational features input by partially observable systems, a feature embedding block is proposed. By combining two feature compression methods, column-wise max pooling (CMP) and column-wise average pooling (CAP), an observational feature encoding is established. Finally, the centralized training and decentralized execution framework is adopted to complete the training of the hunting strategy. Each USV in the fleet shares the same policy and performs actions independently. Simulation experiments have verified the effectiveness of the DPOMH-PPO algorithm in test scenarios with different numbers of USVs. Moreover, the advantages of the proposed model are comprehensively analyzed in terms of algorithm performance, migration effect across task scenarios, and self-organization capability after being damaged, verifying the potential deployment and application of DPOMH-PPO in real environments.
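The CMP/CAP combination described above can be sketched directly; the function name is mine, and the paper applies this inside a learned feature embedding block rather than as a standalone step:

```python
import numpy as np

def cmp_cap_embed(target_features):
    """Compress a variable number of per-target feature rows, shape
    (n_targets, d), into a fixed-length vector of size 2*d by concatenating
    column-wise max pooling (CMP) and column-wise average pooling (CAP)."""
    m = np.asarray(target_features, dtype=float)
    return np.concatenate([m.max(axis=0), m.mean(axis=0)])
```

Because both poolings operate column-wise, the output dimension depends only on the per-target feature size `d`, not on how many targets are currently observed, which is what lets a fixed-size policy network consume a partially observable, variable-size observation.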
基金supported by the National Natural Science Foundation of China (Grant Nos.62173121,62002095,61961019,and 61803139)the Youth Key Project of Natural Science Foundation of Jiangxi Province of China (Grant No.20202ACBL212003)。
Abstract: We investigate the fixed-time containment control (FCC) problem of multi-agent systems (MASs) under discontinuous communication. A saturation function is used in the controller to achieve containment control in MASs. One difference from using a sign function is that it avoids the differential calculation process for discontinuous functions, which further ensures the continuity of the control input. Considering the discontinuous communication, a dynamic variable is constructed that is always non-negative between any two communications of an agent. Based on the designed variable, a dynamic event-triggered algorithm is proposed to achieve FCC, which can effectively reduce controller updating. In addition, we further design a new event-triggered algorithm to achieve FCC, called the team-trigger mechanism, which combines the self-triggering technique with the proposed dynamic event-trigger mechanism. It converges faster than the proposed dynamic event-triggering technique and achieves a trade-off among communication cost, convergence time, and number of triggers in MASs. Finally, Zeno behavior is excluded and the validity of the proposed theory is confirmed by simulation.
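A minimal sketch of the two ingredients named above. All gains, the eta update rule, and the class/function names are illustrative choices of mine; the abstract does not give the paper's exact trigger dynamics:

```python
import numpy as np

def sat(x, delta=0.1):
    """Continuous saturation used in place of sign(x): linear on
    [-delta, delta], clipped to +/-1 outside, so the control input
    remains continuous at the origin."""
    return np.clip(np.asarray(x, dtype=float) / delta, -1.0, 1.0)

class DynamicTrigger:
    """Internal dynamic variable eta stays non-negative between
    communications; an event fires when the squared measurement
    error exceeds theta * eta."""
    def __init__(self, eta0=1.0, theta=0.5, decay=0.9, refill=0.1):
        self.eta, self.theta = eta0, theta
        self.decay, self.refill = decay, refill

    def step(self, error_norm):
        fire = error_norm ** 2 > self.theta * self.eta
        # eta decays over time and is replenished while no event fires,
        # so small errors are tolerated longer as eta builds up
        self.eta = self.decay * self.eta + (0.0 if fire else self.refill)
        return fire
```

The dynamic variable makes the threshold state-dependent rather than static, which is what reduces controller updates relative to a plain static event trigger while still excluding Zeno behavior in the paper's analysis.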