期刊文献+
共找到210篇文章
< 1 2 11 >
每页显示 20 50 100
UAV-Assisted Dynamic Avatar Task Migration for Vehicular Metaverse Services: A Multi-Agent Deep Reinforcement Learning Approach
1
作者 Jiawen Kang Junlong Chen +6 位作者 Minrui Xu Zehui Xiong Yutao Jiao Luchao Han Dusit Niyato Yongju Tong Shengli Xie 《IEEE/CAA Journal of Automatica Sinica》 SCIE EI CSCD 2024年第2期430-445,共16页
Avatars, as promising digital representations and service assistants of users in Metaverses, can enable drivers and passengers to immerse themselves in 3D virtual services and spaces of UAV-assisted vehicular Metavers... Avatars, as promising digital representations and service assistants of users in Metaverses, can enable drivers and passengers to immerse themselves in 3D virtual services and spaces of UAV-assisted vehicular Metaverses. However, avatar tasks include a multitude of human-to-avatar and avatar-to-avatar interactive applications, e.g., augmented reality navigation,which consumes intensive computing resources. It is inefficient and impractical for vehicles to process avatar tasks locally. Fortunately, migrating avatar tasks to the nearest roadside units(RSU)or unmanned aerial vehicles(UAV) for execution is a promising solution to decrease computation overhead and reduce task processing latency, while the high mobility of vehicles brings challenges for vehicles to independently perform avatar migration decisions depending on current and future vehicle status. To address these challenges, in this paper, we propose a novel avatar task migration system based on multi-agent deep reinforcement learning(MADRL) to execute immersive vehicular avatar tasks dynamically. Specifically, we first formulate the problem of avatar task migration from vehicles to RSUs/UAVs as a partially observable Markov decision process that can be solved by MADRL algorithms. We then design the multi-agent proximal policy optimization(MAPPO) approach as the MADRL algorithm for the avatar task migration problem. To overcome slow convergence resulting from the curse of dimensionality and non-stationary issues caused by shared parameters in MAPPO, we further propose a transformer-based MAPPO approach via sequential decision-making models for the efficient representation of relationships among agents. Finally, to motivate terrestrial or non-terrestrial edge servers(e.g., RSUs or UAVs) to share computation resources and ensure traceability of the sharing records, we apply smart contracts and blockchain technologies to achieve secure sharing management. Numerical results demonstrate that the proposed approach outperforms the MAPPO approach by around 2% and effectively reduces approximately 20% of the latency of avatar task execution in UAV-assisted vehicular Metaverses. 展开更多
关键词 AVATAR blockchain metaverses multi-agent deep reinforcement learning transformer UAVS
下载PDF
Constrained Multi-Objective Optimization With Deep Reinforcement Learning Assisted Operator Selection
2
作者 Fei Ming Wenyin Gong +1 位作者 Ling Wang Yaochu Jin 《IEEE/CAA Journal of Automatica Sinica》 SCIE EI CSCD 2024年第4期919-931,共13页
Solving constrained multi-objective optimization problems with evolutionary algorithms has attracted considerable attention.Various constrained multi-objective optimization evolutionary algorithms(CMOEAs)have been dev... Solving constrained multi-objective optimization problems with evolutionary algorithms has attracted considerable attention.Various constrained multi-objective optimization evolutionary algorithms(CMOEAs)have been developed with the use of different algorithmic strategies,evolutionary operators,and constraint-handling techniques.The performance of CMOEAs may be heavily dependent on the operators used,however,it is usually difficult to select suitable operators for the problem at hand.Hence,improving operator selection is promising and necessary for CMOEAs.This work proposes an online operator selection framework assisted by Deep Reinforcement Learning.The dynamics of the population,including convergence,diversity,and feasibility,are regarded as the state;the candidate operators are considered as actions;and the improvement of the population state is treated as the reward.By using a Q-network to learn a policy to estimate the Q-values of all actions,the proposed approach can adaptively select an operator that maximizes the improvement of the population according to the current state and thereby improve the algorithmic performance.The framework is embedded into four popular CMOEAs and assessed on 42 benchmark problems.The experimental results reveal that the proposed Deep Reinforcement Learning-assisted operator selection significantly improves the performance of these CMOEAs and the resulting algorithm obtains better versatility compared to nine state-of-the-art CMOEAs. 展开更多
关键词 Constrained multi-objective optimization deep Qlearning deep reinforcement learning(DRL) evolutionary algorithms evolutionary operator selection
下载PDF
Energy-Efficient Traffic Offloading for RSMA-Based Hybrid Satellite Terrestrial Networks with Deep Reinforcement Learning
3
作者 Qingmiao Zhang Lidong Zhu +1 位作者 Yanyan Chen Shan Jiang 《China Communications》 SCIE CSCD 2024年第2期49-58,共10页
As the demands of massive connections and vast coverage rapidly grow in the next wireless communication networks, rate splitting multiple access(RSMA) is considered to be the new promising access scheme since it can p... As the demands of massive connections and vast coverage rapidly grow in the next wireless communication networks, rate splitting multiple access(RSMA) is considered to be the new promising access scheme since it can provide higher efficiency with limited spectrum resources. In this paper, combining spectrum splitting with rate splitting, we propose to allocate resources with traffic offloading in hybrid satellite terrestrial networks. A novel deep reinforcement learning method is adopted to solve this challenging non-convex problem. However, the neverending learning process could prohibit its practical implementation. Therefore, we introduce the switch mechanism to avoid unnecessary learning. Additionally, the QoS constraint in the scheme can rule out unsuccessful transmission. The simulation results validates the energy efficiency performance and the convergence speed of the proposed algorithm. 展开更多
关键词 deep reinforcement learning energy efficiency hybrid satellite terrestrial networks rate splitting multiple access traffic offloading
下载PDF
Deep Reinforcement Learning-Based Task Offloading and Service Migrating Policies in Service Caching-Assisted Mobile Edge Computing
4
作者 Ke Hongchang Wang Hui +1 位作者 Sun Hongbin Halvin Yang 《China Communications》 SCIE CSCD 2024年第4期88-103,共16页
Emerging mobile edge computing(MEC)is considered a feasible solution for offloading the computation-intensive request tasks generated from mobile wireless equipment(MWE)with limited computational resources and energy.... Emerging mobile edge computing(MEC)is considered a feasible solution for offloading the computation-intensive request tasks generated from mobile wireless equipment(MWE)with limited computational resources and energy.Due to the homogeneity of request tasks from one MWE during a longterm time period,it is vital to predeploy the particular service cachings required by the request tasks at the MEC server.In this paper,we model a service caching-assisted MEC framework that takes into account the constraint on the number of service cachings hosted by each edge server and the migration of request tasks from the current edge server to another edge server with service caching required by tasks.Furthermore,we propose a multiagent deep reinforcement learning-based computation offloading and task migrating decision-making scheme(MBOMS)to minimize the long-term average weighted cost.The proposed MBOMS can learn the near-optimal offloading and migrating decision-making policy by centralized training and decentralized execution.Systematic and comprehensive simulation results reveal that our proposed MBOMS can converge well after training and outperforms the other five baseline algorithms. 展开更多
关键词 deep reinforcement learning mobile edge computing service caching service migrating
下载PDF
Policy Network-Based Dual-Agent Deep Reinforcement Learning for Multi-Resource Task Offloading in Multi-Access Edge Cloud Networks
5
作者 Feng Chuan Zhang Xu +2 位作者 Han Pengchao Ma Tianchun Gong Xiaoxue 《China Communications》 SCIE CSCD 2024年第4期53-73,共21页
The Multi-access Edge Cloud(MEC) networks extend cloud computing services and capabilities to the edge of the networks. By bringing computation and storage capabilities closer to end-users and connected devices, MEC n... The Multi-access Edge Cloud(MEC) networks extend cloud computing services and capabilities to the edge of the networks. By bringing computation and storage capabilities closer to end-users and connected devices, MEC networks can support a wide range of applications. MEC networks can also leverage various types of resources, including computation resources, network resources, radio resources,and location-based resources, to provide multidimensional resources for intelligent applications in 5/6G.However, tasks generated by users often consist of multiple subtasks that require different types of resources. It is a challenging problem to offload multiresource task requests to the edge cloud aiming at maximizing benefits due to the heterogeneity of resources provided by devices. To address this issue,we mathematically model the task requests with multiple subtasks. Then, the problem of task offloading of multi-resource task requests is proved to be NP-hard. Furthermore, we propose a novel Dual-Agent Deep Reinforcement Learning algorithm with Node First and Link features(NF_L_DA_DRL) based on the policy network, to optimize the benefits generated by offloading multi-resource task requests in MEC networks. Finally, simulation results show that the proposed algorithm can effectively improve the benefit of task offloading with higher resource utilization compared with baseline algorithms. 展开更多
关键词 benefit maximization deep reinforcement learning multi-access edge cloud task offloading
下载PDF
QoS Routing Optimization Based on Deep Reinforcement Learning in SDN
6
作者 Yu Song Xusheng Qian +2 位作者 Nan Zhang WeiWang Ao Xiong 《Computers, Materials & Continua》 SCIE EI 2024年第5期3007-3021,共15页
To enhance the efficiency and expediency of issuing e-licenses within the power sector, we must confront thechallenge of managing the surging demand for data traffic. Within this realm, the network imposes stringentQu... To enhance the efficiency and expediency of issuing e-licenses within the power sector, we must confront thechallenge of managing the surging demand for data traffic. Within this realm, the network imposes stringentQuality of Service (QoS) requirements, revealing the inadequacies of traditional routing allocation mechanismsin accommodating such extensive data flows. In response to the imperative of handling a substantial influx of datarequests promptly and alleviating the constraints of existing technologies and network congestion, we present anarchitecture forQoS routing optimizationwith in SoftwareDefinedNetwork (SDN), leveraging deep reinforcementlearning. This innovative approach entails the separation of SDN control and transmission functionalities, centralizingcontrol over data forwardingwhile integrating deep reinforcement learning for informed routing decisions. Byfactoring in considerations such as delay, bandwidth, jitter rate, and packet loss rate, we design a reward function toguide theDeepDeterministic PolicyGradient (DDPG) algorithmin learning the optimal routing strategy to furnishsuperior QoS provision. In our empirical investigations, we juxtapose the performance of Deep ReinforcementLearning (DRL) against that of Shortest Path (SP) algorithms in terms of data packet transmission delay. Theexperimental simulation results show that our proposed algorithm has significant efficacy in reducing networkdelay and improving the overall transmission efficiency, which is superior to the traditional methods. 展开更多
关键词 deep reinforcement learning SDN route optimization QOS
下载PDF
Deep reinforcement learning using least-squares truncated temporal-difference
7
作者 Junkai Ren Yixing Lan +3 位作者 Xin Xu Yichuan Zhang Qiang Fang Yujun Zeng 《CAAI Transactions on Intelligence Technology》 SCIE EI 2024年第2期425-439,共15页
Policy evaluation(PE)is a critical sub-problem in reinforcement learning,which estimates the value function for a given policy and can be used for policy improvement.However,there still exist some limitations in curre... Policy evaluation(PE)is a critical sub-problem in reinforcement learning,which estimates the value function for a given policy and can be used for policy improvement.However,there still exist some limitations in current PE methods,such as low sample efficiency and local convergence,especially on complex tasks.In this study,a novel PE algorithm called Least-Squares Truncated Temporal-Difference learning(LST2D)is proposed.In LST2D,an adaptive truncation mechanism is designed,which effectively takes advantage of the fast convergence property of Least-Squares Temporal Difference learning and the asymptotic convergence property of Temporal Difference learning(TD).Then,two feature pre-training methods are utilised to improve the approximation ability of LST2D.Furthermore,an Actor-Critic algorithm based on LST2D and pre-trained feature representations(ACLPF)is proposed,where LST2D is integrated into the critic network to improve learning-prediction efficiency.Comprehensive simulation studies were conducted on four robotic tasks,and the corresponding results illustrate the effectiveness of LST2D.The proposed ACLPF algorithm outperformed DQN,ACER and PPO in terms of sample efficiency and stability,which demonstrated that LST2D can be applied to online learning control problems by incorporating it into the actor-critic architecture. 展开更多
关键词 deep reinforcement learning policy evaluation temporal difference value function approximation
下载PDF
Enhancing Image Description Generation through Deep Reinforcement Learning:Fusing Multiple Visual Features and Reward Mechanisms
8
作者 Yan Li Qiyuan Wang Kaidi Jia 《Computers, Materials & Continua》 SCIE EI 2024年第2期2469-2489,共21页
Image description task is the intersection of computer vision and natural language processing,and it has important prospects,including helping computers understand images and obtaining information for the visually imp... Image description task is the intersection of computer vision and natural language processing,and it has important prospects,including helping computers understand images and obtaining information for the visually impaired.This study presents an innovative approach employing deep reinforcement learning to enhance the accuracy of natural language descriptions of images.Our method focuses on refining the reward function in deep reinforcement learning,facilitating the generation of precise descriptions by aligning visual and textual features more closely.Our approach comprises three key architectures.Firstly,it utilizes Residual Network 101(ResNet-101)and Faster Region-based Convolutional Neural Network(Faster R-CNN)to extract average and local image features,respectively,followed by the implementation of a dual attention mechanism for intricate feature fusion.Secondly,the Transformer model is engaged to derive contextual semantic features from textual data.Finally,the generation of descriptive text is executed through a two-layer long short-term memory network(LSTM),directed by the value and reward functions.Compared with the image description method that relies on deep learning,the score of Bilingual Evaluation Understudy(BLEU-1)is 0.762,which is 1.6%higher,and the score of BLEU-4 is 0.299.Consensus-based Image Description Evaluation(CIDEr)scored 0.998,Recall-Oriented Understudy for Gisting Evaluation(ROUGE)scored 0.552,the latter improved by 0.36%.These results not only attest to the viability of our approach but also highlight its superiority in the realm of image description.Future research can explore the integration of our method with other artificial intelligence(AI)domains,such as emotional AI,to create more nuanced and context-aware systems. 展开更多
关键词 Image description deep reinforcement learning attention mechanism
下载PDF
A Deep Reinforcement Learning-Based Technique for Optimal Power Allocation in Multiple Access Communications
9
作者 Sepehr Soltani Ehsan Ghafourian +2 位作者 Reza Salehi DiegoMartín Milad Vahidi 《Intelligent Automation & Soft Computing》 2024年第1期93-108,共16页
Formany years,researchers have explored power allocation(PA)algorithms driven bymodels in wireless networks where multiple-user communications with interference are present.Nowadays,data-driven machine learning method... Formany years,researchers have explored power allocation(PA)algorithms driven bymodels in wireless networks where multiple-user communications with interference are present.Nowadays,data-driven machine learning methods have become quite popular in analyzing wireless communication systems,which among them deep reinforcement learning(DRL)has a significant role in solving optimization issues under certain constraints.To this purpose,in this paper,we investigate the PA problem in a k-user multiple access channels(MAC),where k transmitters(e.g.,mobile users)aim to send an independent message to a common receiver(e.g.,base station)through wireless channels.To this end,we first train the deep Q network(DQN)with a deep Q learning(DQL)algorithm over the simulation environment,utilizing offline learning.Then,the DQN will be used with the real data in the online training method for the PA issue by maximizing the sumrate subjected to the source power.Finally,the simulation results indicate that our proposedDQNmethod provides better performance in terms of the sumrate compared with the available DQL training approaches such as fractional programming(FP)and weighted minimum mean squared error(WMMSE).Additionally,by considering different user densities,we show that our proposed DQN outperforms benchmark algorithms,thereby,a good generalization ability is verified over wireless multi-user communication systems. 展开更多
关键词 deep reinforcement learning deep Q learning multiple access channel power allocation
下载PDF
Deep Reinforcement Learning-Based Collaborative Routing Algorithm for Clustered MANETs 被引量:1
10
作者 Zexu Li Yong Li Wenbo Wang 《China Communications》 SCIE CSCD 2023年第3期185-200,共16页
Flexible adaptation to differentiated quality of service(QoS)is quite important for future 6G network with a variety of services.Mobile ad hoc networks(MANETs)are able to provide flexible communication services to use... Flexible adaptation to differentiated quality of service(QoS)is quite important for future 6G network with a variety of services.Mobile ad hoc networks(MANETs)are able to provide flexible communication services to users through self-configuration and rapid deployment.However,the dynamic wireless environment,the limited resources,and complex QoS requirements have presented great challenges for network routing problems.Motivated by the development of artificial intelligence,a deep reinforcement learning-based collaborative routing(DRLCR)algorithm is proposed.Both routing policy and subchannel allocation are considered jointly,aiming at minimizing the end-to-end(E2E)delay and improving the network capacity.After sufficient training by the cluster head node,the Q-network can be synchronized to each member node to select the next hop based on local observation.Moreover,we improve the performance of training by considering historical observations,which can improve the adaptability of routing policies to dynamic environments.Simulation results show that the proposed DRLCR algorithm outperforms other algorithms in terms of resource utilization and E2E delay by optimizing network load to avoid congestion.In addition,the effectiveness of the routing policy in a dynamic environment is verified. 展开更多
关键词 artificial intelligence deep reinforcement learning collaborative routing MANETS 6G
下载PDF
Task assignment in ground-to-air confrontation based on multiagent deep reinforcement learning 被引量:1
11
作者 Jia-yi Liu Gang Wang +2 位作者 Qiang Fu Shao-hua Yue Si-yuan Wang 《Defence Technology(防务技术)》 SCIE EI CAS CSCD 2023年第1期210-219,共10页
The scale of ground-to-air confrontation task assignments is large and needs to deal with many concurrent task assignments and random events.Aiming at the problems where existing task assignment methods are applied to... The scale of ground-to-air confrontation task assignments is large and needs to deal with many concurrent task assignments and random events.Aiming at the problems where existing task assignment methods are applied to ground-to-air confrontation,there is low efficiency in dealing with complex tasks,and there are interactive conflicts in multiagent systems.This study proposes a multiagent architecture based on a one-general agent with multiple narrow agents(OGMN)to reduce task assignment conflicts.Considering the slow speed of traditional dynamic task assignment algorithms,this paper proposes the proximal policy optimization for task assignment of general and narrow agents(PPOTAGNA)algorithm.The algorithm based on the idea of the optimal assignment strategy algorithm and combined with the training framework of deep reinforcement learning(DRL)adds a multihead attention mechanism and a stage reward mechanism to the bilateral band clipping PPO algorithm to solve the problem of low training efficiency.Finally,simulation experiments are carried out in the digital battlefield.The multiagent architecture based on OGMN combined with the PPO-TAGNA algorithm can obtain higher rewards faster and has a higher win ratio.By analyzing agent behavior,the efficiency,superiority and rationality of resource utilization of this method are verified. 展开更多
关键词 Ground-to-air confrontation Task assignment General and narrow agents deep reinforcement learning Proximal policy optimization(PPO)
下载PDF
Implicit Continuous User Authentication for Mobile Devices based on Deep Reinforcement Learning 被引量:1
12
作者 Christy James Jose M.S.Rajasree 《Computer Systems Science & Engineering》 SCIE EI 2023年第2期1357-1372,共16页
The predominant method for smart phone accessing is confined to methods directing the authentication by means of Point-of-Entry that heavily depend on physiological biometrics like,fingerprint or face.Implicit continuou... The predominant method for smart phone accessing is confined to methods directing the authentication by means of Point-of-Entry that heavily depend on physiological biometrics like,fingerprint or face.Implicit continuous authentication initiating to be loftier to conventional authentication mechanisms by continuously confirming users’identities on continuing basis and mark the instant at which an illegitimate hacker grasps dominance of the session.However,divergent issues remain unaddressed.This research aims to investigate the power of Deep Reinforcement Learning technique to implicit continuous authentication for mobile devices using a method called,Gaussian Weighted Cauchy Kriging-based Continuous Czekanowski’s(GWCK-CC).First,a Gaussian Weighted Non-local Mean Filter Preprocessing model is applied for reducing the noise pre-sent in the raw input face images.Cauchy Kriging Regression function is employed to reduce the dimensionality.Finally,Continuous Czekanowski’s Clas-sification is utilized for proficient classification between the genuine user and attacker.By this way,the proposed GWCK-CC method achieves accurate authen-tication with minimum error rate and time.Experimental assessment of the pro-posed GWCK-CC method and existing methods are carried out with different factors by using UMDAA-02 Face Dataset.The results confirm that the proposed GWCK-CC method enhances authentication accuracy,by 9%,reduces the authen-tication time,and error rate by 44%,and 43%as compared to the existing methods. 展开更多
关键词 deep reinforcement learning gaussian weighted non-local meanfilter cauchy kriging regression continuous czekanowski’s implicit continuous authentication mobile devices
下载PDF
Multi-Agent Deep Reinforcement Learning for Cross-Layer Scheduling in Mobile Ad-Hoc Networks
13
作者 Xinxing Zheng Yu Zhao +1 位作者 Joohyun Lee Wei Chen 《China Communications》 SCIE CSCD 2023年第8期78-88,共11页
Due to the fading characteristics of wireless channels and the burstiness of data traffic,how to deal with congestion in Ad-hoc networks with effective algorithms is still open and challenging.In this paper,we focus o... Due to the fading characteristics of wireless channels and the burstiness of data traffic,how to deal with congestion in Ad-hoc networks with effective algorithms is still open and challenging.In this paper,we focus on enabling congestion control to minimize network transmission delays through flexible power control.To effectively solve the congestion problem,we propose a distributed cross-layer scheduling algorithm,which is empowered by graph-based multi-agent deep reinforcement learning.The transmit power is adaptively adjusted in real-time by our algorithm based only on local information(i.e.,channel state information and queue length)and local communication(i.e.,information exchanged with neighbors).Moreover,the training complexity of the algorithm is low due to the regional cooperation based on the graph attention network.In the evaluation,we show that our algorithm can reduce the transmission delay of data flow under severe signal interference and drastically changing channel states,and demonstrate the adaptability and stability in different topologies.The method is general and can be extended to various types of topologies. 展开更多
关键词 Ad-hoc network cross-layer scheduling multi agent deep reinforcement learning interference elimination power control queue scheduling actorcritic methods markov decision process
下载PDF
Joint Flexible Duplexing and Power Allocation with Deep Reinforcement Learning in Cell-Free Massive MIMO System
14
作者 Danhao Deng Chaowei Wang +2 位作者 Zhi Zhang Lihua Li Weidong Wang 《China Communications》 SCIE CSCD 2023年第4期73-85,共13页
Network-assisted full duplex(NAFD)cellfree(CF)massive MIMO has drawn increasing attention in 6G evolvement.In this paper,we build an NAFD CF system in which the users and access points(APs)can flexibly select their du... Network-assisted full duplex(NAFD)cellfree(CF)massive MIMO has drawn increasing attention in 6G evolvement.In this paper,we build an NAFD CF system in which the users and access points(APs)can flexibly select their duplex modes to increase the link spectral efficiency.Then we formulate a joint flexible duplexing and power allocation problem to balance the user fairness and system spectral efficiency.We further transform the problem into a probability optimization to accommodate the shortterm communications.In contrast with the instant performance optimization,the probability optimization belongs to a sequential decision making problem,and thus we reformulate it as a Markov Decision Process(MDP).We utilizes deep reinforcement learning(DRL)algorithm to search the solution from a large state-action space,and propose an asynchronous advantage actor-critic(A3C)-based scheme to reduce the chance of converging to the suboptimal policy.Simulation results demonstrate that the A3C-based scheme is superior to the baseline schemes in term of the complexity,accumulated log spectral efficiency,and stability. 展开更多
关键词 cell-free massive MIMO flexible duplexing sum fair spectral efficiency deep reinforcement learning asynchronous advantage actor-critic
下载PDF
Deep Reinforcement Learning Based Power Minimization for RIS-Assisted MISO-OFDM Systems
15
作者 Peng Chen Wenting Huang +1 位作者 Xiao Li Shi Jin 《China Communications》 SCIE CSCD 2023年第4期259-269,共11页
In this paper,we investigate the downlink orthogonal frequency division multiplexing(OFDM)transmission system assisted by reconfigurable intelligent surfaces(RISs).Considering multiple antennas at the base station(BS)... In this paper,we investigate the downlink orthogonal frequency division multiplexing(OFDM)transmission system assisted by reconfigurable intelligent surfaces(RISs).Considering multiple antennas at the base station(BS)and multiple single-antenna users,the joint optimization of precoder at the BS and the phase shift design at the RIS is studied to minimize the transmit power under the constraint of the certain quality-of-service.A deep reinforcement learning(DRL)based algorithm is proposed,in which maximum ratio transmission(MRT)precoding is utilized at the BS and the twin delayed deep deterministic policy gradient(TD3)method is utilized for RIS phase shift optimization.Numerical results demonstrate that the proposed DRL based algorithm can achieve a transmit power almost the same with the lower bound achieved by manifold optimization(MO)algorithm while has much less computation delay. 展开更多
关键词 deep reinforcement learning OFDM PRECODING reconfigurable intelligent surface
下载PDF
Deep reinforcement learning based multi-level dynamic reconfiguration for urban distribution network:a cloud-edge collaboration architecture
16
作者 Siyuan Jiang Hongjun Gao +2 位作者 Xiaohui Wang Junyong Liu Kunyu Zuo 《Global Energy Interconnection》 EI CAS CSCD 2023年第1期1-14,共14页
With the construction of the power Internet of Things(IoT),communication between smart devices in urban distribution networks has been gradually moving towards high speed,high compatibility,and low latency,which provi... With the construction of the power Internet of Things(IoT),communication between smart devices in urban distribution networks has been gradually moving towards high speed,high compatibility,and low latency,which provides reliable support for reconfiguration optimization in urban distribution networks.Thus,this study proposed a deep reinforcement learning based multi-level dynamic reconfiguration method for urban distribution networks in a cloud-edge collaboration architecture to obtain a real-time optimal multi-level dynamic reconfiguration solution.First,the multi-level dynamic reconfiguration method was discussed,which included feeder-,transformer-,and substation-levels.Subsequently,the multi-agent system was combined with the cloud-edge collaboration architecture to build a deep reinforcement learning model for multi-level dynamic reconfiguration in an urban distribution network.The cloud-edge collaboration architecture can effectively support the multi-agent system to conduct“centralized training and decentralized execution”operation modes and improve the learning efficiency of the model.Thereafter,for a multi-agent system,this study adopted a combination of offline and online learning to endow the model with the ability to realize automatic optimization and updation of the strategy.In the offline learning phase,a Q-learning-based multi-agent conservative Q-learning(MACQL)algorithm was proposed to stabilize the learning results and reduce the risk of the next online learning phase.In the online learning phase,a multi-agent deep deterministic policy gradient(MADDPG)algorithm based on policy gradients was proposed to explore the action space and update the experience pool.Finally,the effectiveness of the proposed method was verified through a simulation analysis of a real-world 445-node system. 展开更多
关键词 Cloud-edge collaboration architecture Multi-agent deep reinforcement learning Multi-level dynamic reconfiguration Offline learning Online learning
下载PDF
Multi-Agent Deep Reinforcement Learning for Efficient Computation Offloading in Mobile Edge Computing
17
作者 Tianzhe Jiao Xiaoyue Feng +2 位作者 Chaopeng Guo Dongqi Wang Jie Song 《Computers, Materials & Continua》 SCIE EI 2023年第9期3585-3603,共19页
Mobile-edge computing(MEC)is a promising technology for the fifth-generation(5G)and sixth-generation(6G)architectures,which provides resourceful computing capabilities for Internet of Things(IoT)devices,such as virtua... Mobile-edge computing(MEC)is a promising technology for the fifth-generation(5G)and sixth-generation(6G)architectures,which provides resourceful computing capabilities for Internet of Things(IoT)devices,such as virtual reality,mobile devices,and smart cities.In general,these IoT applications always bring higher energy consumption than traditional applications,which are usually energy-constrained.To provide persistent energy,many references have studied the offloading problem to save energy consumption.However,the dynamic environment dramatically increases the optimization difficulty of the offloading decision.In this paper,we aim to minimize the energy consumption of the entireMECsystemunder the latency constraint by fully considering the dynamic environment.UnderMarkov games,we propose amulti-agent deep reinforcement learning approach based on the bi-level actorcritic learning structure to jointly optimize the offloading decision and resource allocation,which can solve the combinatorial optimization problem using an asymmetric method and compute the Stackelberg equilibrium as a better convergence point than Nash equilibrium in terms of Pareto superiority.Our method can better adapt to a dynamic environment during the data transmission than the single-agent strategy and can effectively tackle the coordination problem in the multi-agent environment.The simulation results show that the proposed method could decrease the total computational overhead by 17.8%compared to the actor-critic-based method and reduce the total computational overhead by 31.3%,36.5%,and 44.7%compared with randomoffloading,all local execution,and all offloading execution,respectively. 展开更多
关键词 Computation offloading multi-agent deep reinforcement learning mobile-edge computing latency energy efficiency
下载PDF
Deep Reinforcement Learning for IRS-Assisted UAV Covert Communications
18
作者 Songjiao Bi Langtao Hu +3 位作者 Quanjin Liu Jianlan Wu Rui Yang Lei Wu 《China Communications》 SCIE CSCD 2023年第12期131-141,共11页
Covert communications can hide the existence of a transmission from the transmitter to receiver.This paper considers an intelligent reflecting surface(IRS)assisted unmanned aerial vehicle(UAV)covert communication syst... Covert communications can hide the existence of a transmission from the transmitter to receiver.This paper considers an intelligent reflecting surface(IRS)assisted unmanned aerial vehicle(UAV)covert communication system.It was inspired by the high-dimensional data processing and decisionmaking capabilities of the deep reinforcement learning(DRL)algorithm.In order to improve the covert communication performance,an UAV 3D trajectory and IRS phase optimization algorithm based on double deep Q network(TAP-DDQN)is proposed.The simulations show that TAP-DDQN can significantly improve the covert performance of the IRS-assisted UAV covert communication system,compared with benchmark solutions. 展开更多
关键词 covert communication deep reinforcement learning intelligent reflective surface UAV
下载PDF
Dynamic Task Offloading for Digital Twin-Empowered Mobile Edge Computing via Deep Reinforcement Learning
19
作者 Ying Chen Wei Gu +2 位作者 Jiajie Xu Yongchao Zhang Geyong Min 《China Communications》 SCIE CSCD 2023年第11期164-175,共12页
Limited by battery and computing re-sources,the computing-intensive tasks generated by Internet of Things(IoT)devices cannot be processed all by themselves.Mobile edge computing(MEC)is a suitable solution for this pro... Limited by battery and computing re-sources,the computing-intensive tasks generated by Internet of Things(IoT)devices cannot be processed all by themselves.Mobile edge computing(MEC)is a suitable solution for this problem,and the gener-ated tasks can be offloaded from IoT devices to MEC.In this paper,we study the problem of dynamic task offloading for digital twin-empowered MEC.Digital twin techniques are applied to provide information of environment and share the training data of agent de-ployed on IoT devices.We formulate the task offload-ing problem with the goal of maximizing the energy efficiency and the workload balance among the ESs.Then,we reformulate the problem as an MDP problem and design DRL-based energy efficient task offloading(DEETO)algorithm to solve it.Comparative experi-ments are carried out which show the superiority of our DEETO algorithm in improving energy efficiency and balancing the workload. 展开更多
关键词 deep reinforcement learning digital twin Internet of Things mobile edge computing
下载PDF
Discrete Phase Shifts Control and Beam Selection in RIS-Aided MISO System via Deep Reinforcement Learning
20
作者 Dongting Lin Yuan Liu 《China Communications》 SCIE CSCD 2023年第8期198-208,共11页
Reconfigurable intelligent surface(RIS)for wireless networks have drawn lots of attention in both academic and industry communities.RIS can dynamically control the phases of the reflection elements to send the signal ... Reconfigurable intelligent surface(RIS)for wireless networks have drawn lots of attention in both academic and industry communities.RIS can dynamically control the phases of the reflection elements to send the signal in the desired direction,thus it provides supplementary links for wireless networks.Most of prior works on RIS-aided wireless communication systems consider continuous phase shifts,but phase shifts of RIS are discrete in practical hardware.Thus we focus on the actual discrete phase shifts on RIS in this paper.Using the advanced deep reinforcement learning(DRL),we jointly optimize the transmit beamforming matrix from the discrete Fourier transform(DFT)codebook at the base station(BS)and the discrete phase shifts at the RIS to maximize the received signal-to-interference plus noise ratio(SINR).Unlike the traditional schemes usually using alternate optimization methods to solve the transmit beamforming and phase shifts,the DRL algorithm proposed in the paper can jointly design the transmit beamforming and phase shifts as the output of the DRL neural network.Numerical results indicate that the DRL proposed can dispose the complicated optimization problem with low computational complexity. 展开更多
关键词 reconfigurable intelligent surface discrete phase shifts transmit beamforming deep reinforcement learning
下载PDF
上一页 1 2 11 下一页 到第
使用帮助 返回顶部