Reinforcement learning (RL) has roots in dynamic programming and is called adaptive/approximate dynamic programming (ADP) within the control community. This paper reviews recent developments in ADP along with RL and its applications to various advanced control fields. First, the background of the development of ADP is described, emphasizing the significance of regulation and tracking control problems. Some effective offline and online algorithms for ADP/adaptive critic control are presented, where the main results for discrete-time systems and continuous-time systems are surveyed, respectively. Then, the research progress on adaptive critic control based on the event-triggered framework and under uncertain environments is discussed, where event-based design, robust stabilization, and game design are reviewed. Moreover, the extensions of ADP for addressing control problems under complex environments have attracted enormous attention. The ADP architecture is revisited from the perspective of data-driven and RL frameworks, showing how they significantly promote ADP formulation. Finally, several typical control applications of RL and ADP are summarized, particularly in the fields of wastewater treatment processes and power systems, followed by some general prospects for future research. Overall, this comprehensive survey of ADP and RL for advanced control applications demonstrates their remarkable potential in the artificial intelligence era. In addition, they play a vital role in promoting environmental protection and industrial intelligence.
The flying foxes optimization (FFO) algorithm, a newly introduced metaheuristic algorithm, is inspired by the survival tactics of flying foxes in heat wave environments. FFO preferentially selects the best-performing individuals. This tendency causes newly generated solutions to remain closely tied to the candidate optimum in the search area. To address this issue, the paper introduces an opposition-based learning search mechanism for the FFO algorithm (IFFO). First, niching techniques are introduced to improve the survival list method, which not only focuses on the adaptability of individuals but also considers the population's crowding degree to enhance the global search capability. Second, an opposition-based learning initialization strategy is used to perturb the initial population and elevate its quality. Finally, to verify the superiority of the improved search mechanism, IFFO, FFO, and cutting-edge metaheuristic algorithms are compared and analyzed using a set of test functions. The results show that, compared with the other algorithms, IFFO is characterized by rapid convergence, precise results, and robust stability.
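The opposition-based initialization mentioned in this abstract can be illustrated with a minimal sketch. It assumes the common textbook form of opposition-based learning (the opposite of x in [lower, upper] is lower + upper - x) and a toy sphere objective; the function names `sphere` and `obl_initialize` are hypothetical, and the IFFO paper's actual procedure may differ.

```python
import random


def sphere(x):
    """Toy objective to minimize (an assumption, not taken from the paper)."""
    return sum(v * v for v in x)


def obl_initialize(pop_size, dim, lower, upper, fitness, seed=0):
    """Pick the pop_size fittest points from a random population and its opposites."""
    rng = random.Random(seed)
    population = [[rng.uniform(lower, upper) for _ in range(dim)]
                  for _ in range(pop_size)]
    # The opposite of x in [lower, upper] is lower + upper - x, per coordinate.
    opposites = [[lower + upper - v for v in x] for x in population]
    # Keep the best pop_size candidates from the union (minimization).
    return sorted(population + opposites, key=fitness)[:pop_size]


if __name__ == "__main__":
    pop = obl_initialize(pop_size=5, dim=3, lower=-10.0, upper=10.0, fitness=sphere)
    for individual in pop:
        print(round(sphere(individual), 3))
```

Evaluating both a point and its opposite doubles the initial fitness evaluations, which is the usual price paid for a better-spread starting population.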
Unsupervised learning methods such as graph contrastive learning have been used for dynamic graph representation learning to eliminate the dependence on labels. However, existing studies neglect positional information when learning discrete snapshots, resulting in insufficient learning of the network topology. At the same time, due to the lack of appropriate data augmentation methods, it is difficult to capture the evolving patterns of the network effectively. To address these problems, a position-aware and subgraph-enhanced dynamic graph contrastive learning method is proposed for discrete-time dynamic graphs. First, a global snapshot is built from the historical snapshots to express the stable pattern of the dynamic graph, and random walks are used to obtain the position representation by learning the positional information of the nodes. Second, a new data augmentation method is designed from the perspectives of short-term changes and long-term stable structures of dynamic graphs. Specifically, subgraph sampling based on snapshots and global snapshots is used to obtain two structural augmentation views, and node structures and evolving patterns are learned by combining a graph neural network, a gated recurrent unit, and an attention mechanism. Finally, the quality of the node representation is improved by combining contrastive learning between the different structural augmentation views and between the two representations of structure and position. Experimental results on four real datasets show that the proposed method outperforms existing unsupervised methods and is more competitive than the supervised learning method under a semi-supervised setting.
As the simplest hydrogen-bonded alcohol, liquid methanol has attracted intensive experimental and theoretical interest. However, theoretical investigations of this system have primarily relied on empirical intermolecular force fields or ab initio molecular dynamics with semilocal density functionals. Inspired by recent studies on bulk water using increasingly accurate machine learning force fields, we report a new machine learning force field for liquid methanol based on the hybrid functional revPBE0 plus dispersion correction. Molecular dynamics simulations with this machine learning force field are orders of magnitude faster than ab initio molecular dynamics simulations, yielding radial distribution functions, self-diffusion coefficients, and hydrogen bond network properties with very small statistical errors. The resulting structural and dynamical properties compare well with the experimental data, demonstrating the superior accuracy of this machine learning force field. This work represents a successful step toward a first-principles description of this benchmark system and showcases the general applicability of machine learning force fields in studying liquid systems.
We present a large deviation theory that characterizes the exponential estimate for rare events in stochastic dynamical systems in the limit of weak noise. We aim to consider a next-to-leading-order approximation for more accurate calculation of the mean exit time by computing large deviation prefactors with the aid of machine learning. More specifically, we design a neural network framework to compute the quasipotential, most probable paths, and prefactors based on the orthogonal decomposition of a vector field. We corroborate the higher effectiveness and accuracy of our algorithm with two toy models. Numerical experiments demonstrate its powerful functionality in exploring the internal mechanism of rare events triggered by weak random fluctuations.
Discrete dislocation dynamics (DDD) simulations reveal the evolution of dislocation structures and the interaction of dislocations. This study investigated the compression behavior of single-crystal copper micropillars using few-shot machine learning with data provided by DDD simulations. Two types of features are considered: external features comprising specimen size and loading orientation, and internal features involving dislocation source length, Schmid factor, the orientation of the most easily activated dislocations, and their distance from the free boundary. The yield stress and stress-strain curves of single-crystal copper micropillars are predicted well by incorporating both external and internal features of the sample as separate or combined inputs. It is found that the accuracy of machine learning predictions for single-crystal micropillar compression can be improved by incorporating easily activated dislocation features together with external features. However, the effect of easily activated dislocations on yielding is less important than the effects of specimen size and Schmid factor, which includes orientation information, but becomes more evident in small-sized micropillars. Overall, incorporating internal features, especially the information on the most easily activated dislocations, improves predictive capabilities across diverse sample sizes and orientations.
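The Schmid factor used as an internal feature above is the standard quantity m = cos(φ)·cos(λ), where φ is the angle between the loading axis and the slip-plane normal and λ is the angle between the loading axis and the slip direction. The sketch below computes it for one slip system; the function name `schmid_factor` and the chosen slip system are illustrative assumptions, not taken from the paper.

```python
import math


def schmid_factor(load_axis, plane_normal, slip_direction):
    """Return |cos(phi) * cos(lambda)| for one slip system."""
    def cos_angle(a, b):
        dot = sum(p * q for p, q in zip(a, b))
        norm_a = math.sqrt(sum(p * p for p in a))
        norm_b = math.sqrt(sum(q * q for q in b))
        return dot / (norm_a * norm_b)

    return abs(cos_angle(load_axis, plane_normal)
               * cos_angle(load_axis, slip_direction))


# FCC example: loading along [001], slip plane (111), slip direction [-101].
m = schmid_factor((0, 0, 1), (1, 1, 1), (-1, 0, 1))
print(round(m, 4))  # 0.4082, i.e. 1/sqrt(6)
```

Because the factor depends only on the loading orientation relative to the slip system, it carries orientation information, which is consistent with how the abstract describes it.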
With the maturity and development of 5G, Mobile Edge CrowdSensing (MECS), as an intelligent data collection paradigm, offers broad prospects for various applications in the IoT. However, sensing users, as data uploaders, lack a balance between data benefits and privacy threats, leading either to conservative data uploads and low revenue or to excessive uploads and privacy breaches. To solve this problem, a Dynamic Privacy Measurement and Protection (DPMP) framework is proposed based on differential privacy and reinforcement learning. First, a DPM model is designed to quantify the amount of data privacy, together with a method for calculating the personalized privacy threshold of each user. Furthermore, a Dynamic Private sensing data Selection (DPS) algorithm is proposed to help sensing users maximize data benefits within their privacy thresholds. Finally, theoretical analysis and extensive experimental results show that the DPMP framework is effective and efficient in achieving a balance between data benefits and sensing user privacy protection. In particular, the proposed DPMP framework achieves 63% higher training efficiency and 23% higher data benefits than the Monte Carlo algorithm.
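The abstract does not say which differential privacy mechanism DPMP relies on. As background only, the sketch below shows the textbook Laplace mechanism, where noise with scale sensitivity/ε is added to a numeric value; the names `laplace_scale` and `privatize` are hypothetical and this is not the paper's DPM model.

```python
import random


def laplace_scale(sensitivity, epsilon):
    """Noise scale b = sensitivity / epsilon for the Laplace mechanism."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    return sensitivity / epsilon


def privatize(value, sensitivity, epsilon, rng):
    """Add Laplace(0, b) noise to a numeric value."""
    scale = laplace_scale(sensitivity, epsilon)
    # The difference of two i.i.d. exponentials with mean b is Laplace(0, b).
    noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    return value + noise


if __name__ == "__main__":
    rng = random.Random(42)
    # A smaller epsilon (stronger privacy) gives a larger noise scale.
    print(laplace_scale(1.0, 0.1), laplace_scale(1.0, 1.0))
    print(privatize(20.0, sensitivity=1.0, epsilon=0.5, rng=rng))
```

The trade-off the abstract describes shows up directly in `epsilon`: lowering it protects the user more but makes the uploaded value less useful.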
Conventional wing aerodynamic optimization processes can be time-consuming and imprecise due to the complexity of versatile flight missions. Plenty of existing literature has considered two-dimensional infinite airfoil optimization, while three-dimensional finite wing optimization has received limited study because of its high computational cost. Here we create an adaptive optimization methodology built upon digitized wing shape deformation and deep learning algorithms, which enables the rapid formulation of finite wing designs for specific aerodynamic performance demands under different cruise conditions. This methodology unfolds in three stages: radial basis function interpolated wing generation, collection of inputs from computational fluid dynamics simulations, and a deep neural network that constructs the surrogate model for the optimal wing configuration. It has been demonstrated that the proposed methodology can significantly reduce the computational cost of numerical simulations. It also has the potential to optimize various aerial vehicles operating under different mission environments, loading conditions, and safety requirements.
The pipeline isolation plugging robot (PIPR) is an important tool in pipeline maintenance operations. During the plugging process, the flow field induces violent vibration, which can cause serious damage to the pipeline and the PIPR. In this paper, we propose a dynamic regulating strategy to reduce the plugging-induced vibration by regulating the spoiler angle and plugging velocity. First, dynamic plugging simulations and experiments are performed to study the flow field changes during dynamic plugging, and the pressure difference is proposed to evaluate the degree of flow field vibration. Second, mathematical models relating the pressure difference to plugging states and spoiler angles are established based on the extreme learning machine (ELM) optimized by an improved sparrow search algorithm (ISSA). Finally, a modified Q-learning algorithm based on simulated annealing is applied to determine the optimal strategy for the spoiler angle and plugging velocity in real time. The results show that the proposed method can reduce the plugging-induced vibration by 19.9% and 32.7% on average, compared with single-regulating methods. This study can effectively ensure the stability of the plugging process.
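The abstract does not describe how simulated annealing modifies Q-learning. One common combination, shown here purely as an assumption, replaces epsilon-greedy exploration with a Metropolis acceptance test whose temperature is lowered over time; the function names `sa_select_action` and `q_update` are hypothetical.

```python
import math
import random


def sa_select_action(q_row, temperature, rng):
    """Metropolis-style choice between a random action and the greedy action."""
    greedy = max(range(len(q_row)), key=lambda a: q_row[a])
    candidate = rng.randrange(len(q_row))
    # A worse random candidate is accepted with probability exp(dQ / T), dQ <= 0.
    accept_prob = math.exp((q_row[candidate] - q_row[greedy]) / temperature)
    return candidate if rng.random() < accept_prob else greedy


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Standard tabular Q-learning update, applied in place."""
    target = reward + gamma * max(q[next_state])
    q[state][action] += alpha * (target - q[state][action])


if __name__ == "__main__":
    rng = random.Random(0)
    q = [[0.0, 0.0], [1.0, 2.0]]
    action = sa_select_action(q[0], temperature=1.0, rng=rng)
    q_update(q, state=0, action=action, reward=1.0, next_state=1)
    print(q[0])
```

At high temperature the agent explores almost uniformly; as the temperature is cooled toward zero the selection becomes effectively greedy.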
Organizations are adopting the Bring Your Own Device (BYOD) concept to enhance productivity and reduce expenses. However, this trend introduces security challenges, such as unauthorized access. Traditional access control systems, such as Attribute-Based Access Control (ABAC) and Role-Based Access Control (RBAC), are limited in their ability to enforce access decisions due to the variability and dynamism of attributes related to users and resources. This paper proposes an adaptable and dynamic method for enforcing access decisions, based on multilayer hybrid deep learning techniques, particularly the Tabular Deep Neural Network (Tabular DNN) method. This technique transforms all input attributes in an access request into a binary classification (allow or deny) using multiple layers, ensuring accurate and efficient access decision-making. The proposed solution was evaluated using the Kaggle Amazon access control policy dataset and demonstrated its effectiveness by achieving a 94% accuracy rate. Additionally, the proposed solution enhances the implementation of access decisions based on a variety of resource and user attributes while ensuring privacy through indirect communication with the Policy Administration Point (PAP). This solution significantly improves the flexibility of access control systems, making them more dynamic and adaptable to the evolving needs of modern organizations. Furthermore, it offers a scalable approach to managing the complexities associated with the BYOD environment, providing a robust framework for secure and efficient access management.
Traditional optimal scheduling methods depend on accurate physical models and parameter settings, which makes them difficult to adapt to the uncertainty of source and load, and they cannot make dynamic decisions continuously. This paper proposes a dynamic economic scheduling method for distribution networks based on deep reinforcement learning. First, the economic scheduling model of the new energy distribution network is established considering the action characteristics of micro-gas turbines, a dynamic scheduling model based on deep reinforcement learning is constructed for a distribution network system with a high proportion of new energy, and the Markov decision process of the model is defined. Second, to account for the changing characteristics of source-load uncertainty, agents are trained interactively with the distribution network in a data-driven manner. Then, through the proximal policy optimization algorithm, agents adaptively learn the scheduling strategy and realize dynamic scheduling decisions for the new energy distribution network system. Finally, the feasibility and superiority of the proposed method are verified on an improved IEEE 33-node simulation system.
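For readers unfamiliar with proximal policy optimization, the sketch below shows its standard clipped surrogate objective, which averages min(r·A, clip(r, 1-ε, 1+ε)·A) over samples, where r is the new-to-old policy probability ratio and A the advantage estimate. The function name `ppo_clip_objective` and the sample numbers are illustrative assumptions; the scheduling paper's network and reward design are not reproduced here.

```python
def ppo_clip_objective(ratios, advantages, clip_eps=0.2):
    """Mean clipped surrogate objective over a batch (to be maximized)."""
    total = 0.0
    for ratio, advantage in zip(ratios, advantages):
        clipped_ratio = min(max(ratio, 1.0 - clip_eps), 1.0 + clip_eps)
        # Taking the minimum removes any incentive to push the ratio
        # outside the interval [1 - eps, 1 + eps].
        total += min(ratio * advantage, clipped_ratio * advantage)
    return total / len(ratios)


if __name__ == "__main__":
    # Ratio 1.5 with a positive advantage is capped at 1.2;
    # ratio 0.5 with a negative advantage is evaluated at the clipped 0.8.
    print(ppo_clip_objective([1.5, 0.5], [1.0, -1.0]))
```

The clipping keeps each policy update close to the data-collecting policy, which is one reason the method is popular for problems with noisy, uncertain inputs such as source and load forecasts.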
With the construction of the power Internet of Things (IoT), communication between smart devices in urban distribution networks has been gradually moving toward high speed, high compatibility, and low latency, which provides reliable support for reconfiguration optimization in urban distribution networks. Thus, this study proposed a deep reinforcement learning based multi-level dynamic reconfiguration method for urban distribution networks in a cloud-edge collaboration architecture to obtain a real-time optimal multi-level dynamic reconfiguration solution. First, the multi-level dynamic reconfiguration method was discussed, which included the feeder, transformer, and substation levels. Subsequently, the multi-agent system was combined with the cloud-edge collaboration architecture to build a deep reinforcement learning model for multi-level dynamic reconfiguration in an urban distribution network. The cloud-edge collaboration architecture can effectively support the multi-agent system in its "centralized training and decentralized execution" operation mode and improve the learning efficiency of the model. Thereafter, for the multi-agent system, this study adopted a combination of offline and online learning to endow the model with the ability to automatically optimize and update its strategy. In the offline learning phase, a Q-learning-based multi-agent conservative Q-learning (MACQL) algorithm was proposed to stabilize the learning results and reduce the risk of the subsequent online learning phase. In the online learning phase, a multi-agent deep deterministic policy gradient (MADDPG) algorithm based on policy gradients was proposed to explore the action space and update the experience pool. Finally, the effectiveness of the proposed method was verified through a simulation analysis of a real-world 445-node system.
Machine learning (ML) methods with good applicability to complex and highly nonlinear sequences have attracted much attention in recent years for predicting the complicated mechanical properties of various materials. With the aim of further increasing prediction accuracy and efficiency, this paper proposes a long short-term memory (LSTM) network model to predict the dynamic compressive performance of concrete-like materials at high strain rates. As widely known ML methods, back-propagation (BP) neural networks with and without optimization by a genetic algorithm (GA) are also established for comparison of time cost and prediction error. Dynamic explicit analysis is performed in the finite element (FE) software ABAQUS to simulate various waveforms in split Hopkinson pressure bar (SHPB) experiments by applying different stress waves in the incident bar. The FE simulation accuracy is validated against SHPB experimental results in terms of the dynamic increase factor. To cover more extensive loading scenarios, 60 sets of FE simulations are conducted to generate three kinds of waveforms in the incident and transmission bars of SHPB experiments. By training the three networks, nonlinear mapping relations can be reasonably established between the incident, reflected, and transmitted waves. Statistical measures are used to quantify the network prediction accuracy, confirming that the stress-strain curves of concrete-like materials at high strain rates predicted by the networks agree sufficiently with those from FE simulations. It is found that, compared with the BP network, the GA-BP network can effectively stabilize the network structure, indicating that GA optimization improves the prediction accuracy of the SHPB dynamic responses by performing crossover and mutation operations on the weights and thresholds of the original BP network. By addressing the long-term dependency problem, the proposed LSTM network achieves better results than the BP and GA-BP networks, with a smaller mean square error (MSE) and a higher correlation coefficient. More importantly, the proposed LSTM algorithm, after training on a limited number of FE simulations, could replace the time-consuming and laborious FE pre- and post-processing and modelling.
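The two statistical measures named in this abstract, mean square error and correlation coefficient, can be computed as in the sketch below. It assumes the Pearson correlation coefficient is meant; the function names and the sample values are illustrative, not data from the paper.

```python
import math


def mse(predicted, target):
    """Mean square error between two equal-length sequences."""
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(target)


def pearson_r(predicted, target):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(target)
    mean_p = sum(predicted) / n
    mean_t = sum(target) / n
    cov = sum((p - mean_p) * (t - mean_t) for p, t in zip(predicted, target))
    dev_p = math.sqrt(sum((p - mean_p) ** 2 for p in predicted))
    dev_t = math.sqrt(sum((t - mean_t) ** 2 for t in target))
    return cov / (dev_p * dev_t)


if __name__ == "__main__":
    predicted_stress = [1.0, 2.0, 3.0]   # hypothetical network output
    simulated_stress = [2.0, 4.0, 6.0]   # hypothetical FE reference
    print(mse(predicted_stress, simulated_stress))
    print(pearson_r(predicted_stress, simulated_stress))
```

The example also shows why both measures are reported together: the two curves are perfectly correlated yet have a large MSE, so a high correlation alone does not imply an accurate prediction.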
Social infrastructure such as dams is likely to be exposed to a high risk of terrorist and military attacks, leading to increasing attention to its vulnerability and the catastrophic consequences of such events. This paper develops advanced deep learning approaches for structural dynamic response prediction and dam health diagnosis. First, improved long short-term memory (LSTM) networks are proposed for data-driven structural dynamic response analysis, using data generated by a single degree of freedom (SDOF) model and the finite numerical simulation, because abundant practical structural response data for concrete gravity dams under blast events are unavailable. Three kinds of LSTM-based models are discussed for various cases of noise-contaminated signals, and the results show that LSTM-based models have the potential for quick structural response estimation under blast loads. Furthermore, damage indicators (i.e., peak vibration velocity and dominant frequency) are extracted from the predicted velocity histories, and their relationship with the dam damage status from the numerical simulation is established. This study provides a deep learning based structural health monitoring (SHM) framework for the quick assessment of dams subjected to underwater explosions using blast-induced monitoring data.
GaP has been shown to be a promising photoelectrocatalyst for selective CO₂ reduction to methanol. Due to the relevance of the interface structure to important processes such as electron/proton transfer, a detailed understanding of the GaP(110)-water interfacial structure is of great importance. Ab initio molecular dynamics (AIMD) can be used to obtain microscopic information on the interfacial structure. However, the GaP(110)-water interface cannot converge to an equilibrated structure on the time scale of AIMD simulations. In this work, we perform machine learning accelerated molecular dynamics (MLMD) to overcome the insufficient sampling of AIMD. With the help of MLMD, we unravel the microscopic structure of the GaP(110)-water interface and obtain a deeper understanding of the mechanisms of proton transfer at the interface, which will pave the way for gaining valuable insights into photoelectrocatalytic mechanisms and improving the performance of photoelectrochemical cells.
As a new bionic algorithm, Spider Monkey Optimization (SMO) has been widely used in various complex optimization problems in recent years. However, the ability of SMO to explore new regions of the search space is limited, and the diversity of its population is low. Thus, this paper focuses on how to reconstruct SMO to improve its performance, and a novel spider monkey optimization algorithm with opposition-based learning and orthogonal experimental design (SMO³) is developed. A position updating method based on the historical optimal domain and particle swarm for the Local Leader Phase (LLP) and Global Leader Phase (GLP) is presented to improve the diversity of the SMO population. Moreover, an opposition-based learning strategy based on the self-extremum is proposed to avoid premature convergence and getting stuck at locally optimal values. In addition, a local worst individual elimination method based on orthogonal experimental design is used to help the SMO algorithm eliminate poor individuals in time. Furthermore, an extended SMO³ named CSMO³ is investigated to deal with constrained optimization problems. The proposed algorithm is applied to both unconstrained and constrained functions, which include the CEC2006 benchmark set and three engineering problems. Experimental results show that the proposed algorithm performs better than three well-known SMO algorithms and other evolutionary algorithms on unconstrained and constrained problems.
Solving fractional-order systems has long been a complex problem. Traditional methods such as the predictor-corrector method involve solution steps that are complicated and cumbersome to derive, which reduces solution efficiency. The development of machine learning and nonlinear dynamics has provided new ideas for solving such complex problems. Therefore, this study considers how to improve on the accuracy and efficiency of traditional methods. To this end, we propose an efficient and accurate nonlinear auto-regressive neural network prediction model for fractional-order dynamic systems (FODS-NAR). First, we demonstrate by example that the FODS-NAR algorithm can predict the solution of a stochastic fractional-order system. Second, we compare the FODS-NAR algorithm with well-known reservoir computing (RC) algorithms. We find that FODS-NAR gives more accurate predictions than the traditional RC algorithm with the same system parameters, and the residuals of the FODS-NAR algorithm are closer to 0. Consequently, we conclude that the FODS-NAR algorithm offers higher accuracy, with prediction results closer to the state of fractional-order stochastic systems. In addition, we analyze the effects of the number of neurons and the order of delays in the FODS-NAR algorithm on the prediction results and derive a range of their optimal values.
Dynamic area coverage with small unmanned aerial vehicle (UAV) systems is a major research topic due to limited payloads and the difficulty of the decentralized decision-making process. The collaborative behavior of a group of UAVs in an unknown environment is another hard problem to be solved. In this paper, we propose a method for decentralized execution of multiple UAVs for dynamic area coverage problems. The proposed decentralized decision-making dynamic area coverage (DDMDAC) method utilizes reinforcement learning (RL), where each UAV is represented by an intelligent agent that learns policies to create collaborative behaviors in a partially observable environment. Intelligent agents extend their global observations by connecting with other agents and gathering information about the environment. The connectivity provides a consensus for the decision-making process, while each agent makes its own decisions. At each step, agents acquire the states of all reachable agents, determine the optimum location for maximal area coverage, and receive a reward based on the coverage rate of the target area. The method was tested in a multi-agent actor-critic simulation platform. In the study, each UAV is assumed to have a limited communication distance, as in real applications. The results show that UAVs with limited communication distance can act jointly in the target area and can successfully cover the area without guidance from a central command unit.
Dynamic path planning is crucial for mobile robots to navigate successfully in unstructured environments. To achieve a globally optimal path and real-time dynamic obstacle avoidance during movement, a dynamic path planning algorithm incorporating an improved IB-RRT* and deep reinforcement learning (DRL) is proposed. First, an improved IB-RRT* algorithm is proposed for global path planning by combining double elliptic subset sampling and probabilistic central circle target bias. Then, to tackle the slow response to dynamic obstacles and inadequate obstacle avoidance of traditional local path planning algorithms, deep reinforcement learning is utilized to predict the movement trend of dynamic obstacles, leading to a dynamic fusion path planning. Finally, the simulation and experimental results demonstrate that the proposed improved IB-RRT* algorithm has higher convergence speed and search efficiency than the traditional Bi-RRT*, Informed-RRT*, and IB-RRT* algorithms. Furthermore, the proposed fusion algorithm can effectively perform real-time obstacle avoidance and navigation tasks for mobile robots in unstructured environments.
Digital educational resources are now numerous and varied. To address the shortcomings of existing dynamic resource selection methods, a dynamic resource selection method based on machine learning is proposed. First, according to the knowledge structure and concepts of mathematical resources, combined with the basic components of dynamic mathematical resources, a knowledge structure graph of mathematical resources is constructed. Then, according to the characteristics of mathematical resources, the interaction between users and resources is simulated, the graph of the main body of the resources is identified, and a candidate collection of mathematical knowledge is selected. Finally, according to the degree of matching between the mathematical literature and the candidate collection, machine learning is used to screen the mathematical resources.
Funding for the ADP and RL survey: supported in part by the National Natural Science Foundation of China (62222301, 62073085, 62073158, 61890930-5, 62021003), the National Key Research and Development Program of China (2021ZD0112302, 2021ZD0112301, 2018YFC1900800-5), and the Beijing Natural Science Foundation (JQ19013).
Funding: Supported by the Ningxia Natural Science Foundation Project (2023AAC03361).
Abstract: The flying foxes optimization (FFO) algorithm is a recently introduced metaheuristic inspired by the survival tactics of flying foxes in heat-wave environments. FFO preferentially selects the best-performing individuals, which tends to keep newly generated solutions closely tied to the current candidate optimum in the search area. To address this issue, the paper introduces an improved FFO algorithm (IFFO) with an opposition-based learning search mechanism. First, niching techniques are introduced to improve the survival list method, which considers not only the adaptability of individuals but also the crowding degree of the population, enhancing the global search capability. Second, an opposition-based learning initialization strategy is used to perturb the initial population and raise its quality. Finally, to verify the superiority of the improved search mechanism, IFFO, FFO, and cutting-edge metaheuristic algorithms are compared on a set of test functions. The results show that, compared with the other algorithms, IFFO offers rapid convergence, precise results, and robust stability.
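The abstract does not spell out how opposition-based learning perturbs the initial population. As a rough illustration only, the sketch below uses the textbook form of opposition-based initialization: each random point x in [lower, upper] gets an opposite point lower + upper - x, and the best half of the merged set is kept. The function names, the minimization convention, and the sphere test function are assumptions for this example, not details from the paper.

```python
import random


def opposition_init(objective, lower, upper, pop_size, rng):
    """Opposition-based initialization (generic form, minimization assumed)."""
    # Random initial population inside the box [lower, upper].
    population = [
        [rng.uniform(lo, hi) for lo, hi in zip(lower, upper)]
        for _ in range(pop_size)
    ]
    # Opposite point of x in each dimension is lo + hi - x.
    opposites = [
        [lo + hi - x for x, lo, hi in zip(ind, lower, upper)]
        for ind in population
    ]
    # Keep the pop_size best individuals out of both sets.
    merged = sorted(population + opposites, key=objective)
    return merged[:pop_size]


def sphere(x):
    """Hypothetical test function: sum of squares, minimum at the origin."""
    return sum(v * v for v in x)


population = opposition_init(sphere, [-5.0, -5.0], [5.0, 5.0], 10, random.Random(1))
```

Because the opposite of a point inside the box also lies inside the box, no bound repair is needed in this form.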
Abstract: Unsupervised learning methods such as graph contrastive learning have been used for dynamic graph representation learning to eliminate the dependence on labels. However, existing studies neglect positional information when learning from discrete snapshots, resulting in insufficient learning of the network topology. At the same time, the lack of appropriate data augmentation methods makes it difficult to capture the evolving patterns of the network effectively. To address these problems, a position-aware and subgraph-enhanced dynamic graph contrastive learning method is proposed for discrete-time dynamic graphs. First, a global snapshot is built from the historical snapshots to express the stable pattern of the dynamic graph, and random walks are used to obtain a position representation by learning the positional information of the nodes. Second, a new data augmentation method is designed from the perspectives of the short-term changes and long-term stable structures of dynamic graphs. Specifically, subgraph sampling based on snapshots and global snapshots is used to obtain two structural augmentation views, and node structures and evolving patterns are learned by combining a graph neural network, a gated recurrent unit, and an attention mechanism. Finally, the quality of the node representation is improved by combining contrastive learning between the different structural augmentation views with contrastive learning between the structure and position representations. Experimental results on four real datasets show that the proposed method outperforms existing unsupervised methods and is more competitive than the supervised learning method under a semi-supervised setting.
Funding: Supported by the CAS Project for Young Scientists in Basic Research (YSBR-005) and the National Natural Science Foundation of China (22325304, 22221003, and 22033007). The authors acknowledge the Supercomputing Center of USTC, Hefei Advanced Computing Center, and Beijing PARATERA Tech Co., Ltd. for providing high-performance computing services.
Abstract: As the simplest hydrogen-bonded alcohol, liquid methanol has attracted intensive experimental and theoretical interest. However, theoretical investigations of this system have primarily relied on empirical intermolecular force fields or ab initio molecular dynamics with semilocal density functionals. Inspired by recent studies of bulk water using increasingly accurate machine learning force fields, we report a new machine learning force field for liquid methanol based on the hybrid functional revPBE0 plus dispersion correction. Molecular dynamics simulations with this machine learning force field are orders of magnitude faster than ab initio molecular dynamics simulations, yielding the radial distribution functions, self-diffusion coefficients, and hydrogen bond network properties with very small statistical errors. The resulting structural and dynamical properties compare well with experimental data, demonstrating the superior accuracy of this machine learning force field. This work represents a successful step toward a first-principles description of this benchmark system and showcases the general applicability of machine learning force fields in studying liquid systems.
Funding: Project supported by the Natural Science Foundation of Jiangsu Province (Grant No. BK20220917) and the National Natural Science Foundation of China (Grant Nos. 12001213 and 12302035).
Abstract: We present a large deviation theory that characterizes the exponential estimate for rare events in stochastic dynamical systems in the limit of weak noise. We aim to obtain a next-to-leading-order approximation for more accurate calculation of the mean exit time by computing large deviation prefactors with the aid of machine learning. More specifically, we design a neural network framework to compute the quasipotential, most probable paths, and prefactors based on the orthogonal decomposition of a vector field. We corroborate the effectiveness and accuracy of our algorithm with two toy models. Numerical experiments demonstrate its capability in exploring the internal mechanism of rare events triggered by weak random fluctuations.
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 12192214 and 12222209).
Abstract: Discrete dislocation dynamics (DDD) simulations reveal the evolution of dislocation structures and the interaction of dislocations. This study investigated the compression behavior of single-crystal copper micropillars using few-shot machine learning with data provided by DDD simulations. Two types of features are considered: external features, comprising specimen size and loading orientation, and internal features, involving dislocation source length, Schmid factor, the orientation of the most easily activated dislocations, and their distance from the free boundary. The yield stress and stress-strain curves of single-crystal copper micropillars are predicted well by incorporating both external and internal features of the sample as separate or combined inputs. The prediction accuracy of machine learning for single-crystal micropillar compression can be improved by combining the features of easily activated dislocations with external features. However, the effect of easily activated dislocations on yielding is less important than the effects of specimen size and Schmid factor, which includes orientation information, although it becomes more evident in small micropillars. Overall, incorporating internal features, especially information on the most easily activated dislocations, improves predictive capability across diverse sample sizes and orientations.
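For readers unfamiliar with the Schmid factor used as an internal feature above, it is the standard geometric factor m = cos(phi) * cos(lambda), where phi is the angle between the loading axis and the slip-plane normal and lambda is the angle between the loading axis and the slip direction. The sketch below computes it for a single slip system. The function name and the example FCC slip system are illustrative choices, not taken from the paper.

```python
import math


def schmid_factor(load_dir, plane_normal, slip_dir):
    """Return |cos(phi) * cos(lambda)| for one slip system."""

    def cos_angle(u, v):
        # Cosine of the angle between two 3-vectors.
        dot = sum(a * b for a, b in zip(u, v))
        norm_u = math.sqrt(sum(a * a for a in u))
        norm_v = math.sqrt(sum(b * b for b in v))
        return dot / (norm_u * norm_v)

    cos_phi = cos_angle(load_dir, plane_normal)   # load axis vs. plane normal
    cos_lambda = cos_angle(load_dir, slip_dir)    # load axis vs. slip direction
    return abs(cos_phi * cos_lambda)


# Example: [001] loading on the (111)[-101] slip system of an FCC crystal.
m = schmid_factor((0, 0, 1), (1, 1, 1), (-1, 0, 1))
```

In practice the factor would be evaluated for every slip system of the crystal and the largest value taken as the feature for a given loading orientation.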
Funding: Supported in part by the National Natural Science Foundation of China under Grant U1905211, Grant 61872088, Grant 62072109, Grant 61872090, and Grant U1804263; in part by the Guangxi Key Laboratory of Trusted Software under Grant KX202042; in part by the Science and Technology Major Support Program of Guizhou Province under Grant 20183001; in part by the Science and Technology Program of Guizhou Province under Grant 20191098; in part by the Project of High-level Innovative Talents of Guizhou Province under Grant 20206008; and in part by the Open Research Fund of Key Laboratory of Cryptography of Zhejiang Province under Grant ZCL21015.
Abstract: With the maturing of 5G, Mobile Edge CrowdSensing (MECS), as an intelligent data collection paradigm, offers broad prospects for various applications in the IoT. However, sensing users acting as data uploaders struggle to balance data benefits against privacy threats, leading either to conservative uploads and low revenue or to excessive uploads and privacy breaches. To solve this problem, a Dynamic Privacy Measurement and Protection (DPMP) framework is proposed based on differential privacy and reinforcement learning. First, a DPM model is designed to quantify the amount of data privacy, together with a method for calculating a personalized privacy threshold for each user. Furthermore, a Dynamic Private sensing data Selection (DPS) algorithm is proposed to help sensing users maximize data benefits within their privacy thresholds. Finally, theoretical analysis and extensive experimental results show that the DPMP framework is effective and efficient in balancing data benefits and sensing-user privacy protection. In particular, the proposed DPMP framework achieves 63% higher training efficiency and 23% higher data benefits than the Monte Carlo algorithm.
Funding: Supported by CITRIS and the Banatao Institute, the Air Force Office of Scientific Research (Grant No. FA9550-22-1-0420), and the National Science Foundation (Grant No. ACI-1548562).
Abstract: Conventional wing aerodynamic optimization processes can be time-consuming and imprecise due to the complexity of versatile flight missions. Much of the existing literature has considered two-dimensional infinite airfoil optimization, while three-dimensional finite wing optimization has received limited study because of its high computational cost. Here we create an adaptive optimization methodology built upon digitized wing shape deformation and deep learning algorithms, which enables the rapid formulation of finite wing designs for specific aerodynamic performance demands under different cruise conditions. The methodology unfolds in three stages: wing generation by radial basis function interpolation, collection of inputs from computational fluid dynamics simulations, and a deep neural network that constructs the surrogate model for the optimal wing configuration. It has been demonstrated that the proposed methodology can significantly reduce the computational cost of numerical simulations. It also has the potential to optimize various aerial vehicles operating under different mission environments, loading conditions, and safety requirements.
Funding: This work was financially supported by the National Natural Science Foundation of China (Grant No. 51575528) and the Science Foundation of China University of Petroleum, Beijing (No. 2462022QEDX011).
Abstract: The pipeline isolation plugging robot (PIPR) is an important tool in pipeline maintenance operations. During the plugging process, the flow field induces violent vibration, which can cause serious damage to the pipeline and the PIPR. In this paper, we propose a dynamic regulating strategy that reduces the plugging-induced vibration by regulating the spoiler angle and plugging velocity. First, dynamic plugging simulations and experiments are performed to study the flow field changes during dynamic plugging, and the pressure difference is proposed as a measure of the degree of flow field vibration. Second, mathematical models relating the pressure difference to plugging states and spoiler angles are established based on an extreme learning machine (ELM) optimized by an improved sparrow search algorithm (ISSA). Finally, a modified Q-learning algorithm based on simulated annealing is applied to determine the optimal strategy for the spoiler angle and plugging velocity in real time. The results show that the proposed method can reduce the plugging-induced vibration by 19.9% and 32.7% on average, compared with single-regulating methods. This study can effectively ensure the stability of the plugging process.
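The abstract does not describe how simulated annealing modifies Q-learning. One common combination, shown here purely as an assumption and not as the paper's algorithm, replaces epsilon-greedy exploration with a Metropolis acceptance test: a random action is accepted over the greedy one with probability exp((Q_random - Q_greedy) / T), and the temperature T is lowered over time so that the policy becomes greedy. All names and parameter values below are hypothetical.

```python
import math
import random


def metropolis_action(q_row, temperature, rng):
    """Pick an action using a simulated-annealing (Metropolis) acceptance test."""
    greedy = max(range(len(q_row)), key=lambda a: q_row[a])
    candidate = rng.randrange(len(q_row))
    if q_row[candidate] >= q_row[greedy]:
        return candidate
    # A worse action is accepted with a probability that shrinks as T cools.
    accept_prob = math.exp((q_row[candidate] - q_row[greedy]) / temperature)
    return candidate if rng.random() < accept_prob else greedy


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Standard tabular Q-learning update, applied in place."""
    target = reward + gamma * max(q[next_state])
    q[state][action] += alpha * (target - q[state][action])


# Tiny illustration with two states and two actions.
q_table = [[0.0, 0.0], [1.0, 2.0]]
q_update(q_table, state=0, action=1, reward=1.0, next_state=1, alpha=0.5, gamma=0.9)
chosen = metropolis_action([0.0, 1.0], temperature=1e-9, rng=random.Random(0))
```

In a full training loop the temperature would typically be multiplied by a cooling factor slightly below 1 after each episode.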
Funding: Partly supported by the University of Malaya Impact Oriented Interdisciplinary Research Grant under Grant IIRG008(A,B,C)-19IISS.
Abstract: Organizations are adopting the Bring Your Own Device (BYOD) concept to enhance productivity and reduce expenses. However, this trend introduces security challenges, such as unauthorized access. Traditional access control systems, such as Attribute-Based Access Control (ABAC) and Role-Based Access Control (RBAC), are limited in their ability to enforce access decisions due to the variability and dynamism of attributes related to users and resources. This paper proposes an adaptable and dynamic method for enforcing access decisions, based on multilayer hybrid deep learning techniques, particularly the Tabular Deep Neural Network (Tabular DNN) method. This technique transforms all input attributes in an access request into a binary classification (allow or deny) using multiple layers, ensuring accurate and efficient access decision-making. The proposed solution was evaluated on the Kaggle Amazon access control policy dataset and achieved a 94% accuracy rate. Additionally, the proposed solution enhances the implementation of access decisions based on a variety of resource and user attributes while ensuring privacy through indirect communication with the Policy Administration Point (PAP). This solution significantly improves the flexibility of access control systems, making them more dynamic and adaptable to the evolving needs of modern organizations. Furthermore, it offers a scalable approach to managing the complexities of the BYOD environment, providing a robust framework for secure and efficient access management.
Funding: Supported by the State Grid Liaoning Electric Power Supply Co., Ltd. (Research on Scheduling Decision Technology Based on Interactive Reinforcement Learning for Adapting High Proportion of New Energy, No. 2023YF-49).
Abstract: Traditional optimal scheduling methods depend on accurate physical models and parameter settings, which are difficult to adapt to the uncertainty of source and load, and they cannot make dynamic decisions continuously. This paper proposes a dynamic economic scheduling method for distribution networks based on deep reinforcement learning. First, the economic scheduling model of the new energy distribution network is established considering the action characteristics of micro-gas turbines, a dynamic scheduling model based on deep reinforcement learning is constructed for the new energy distribution network system with a high proportion of new energy, and the Markov decision process of the model is defined. Second, to handle the changing characteristics of source-load uncertainty, agents are trained interactively with the distribution network in a data-driven manner. Then, through the proximal policy optimization algorithm, agents adaptively learn the scheduling strategy and realize dynamic scheduling decisions for the new energy distribution network system. Finally, the feasibility and superiority of the proposed method are verified on an improved IEEE 33-node simulation system.
Funding: Supported by the National Natural Science Foundation of China under Grant 52077146.
Abstract: With the construction of the power Internet of Things (IoT), communication between smart devices in urban distribution networks has been gradually moving toward high speed, high compatibility, and low latency, which provides reliable support for reconfiguration optimization in urban distribution networks. Thus, this study proposed a deep reinforcement learning based multi-level dynamic reconfiguration method for urban distribution networks in a cloud-edge collaboration architecture to obtain a real-time optimal multi-level dynamic reconfiguration solution. First, the multi-level dynamic reconfiguration method was discussed, which included the feeder, transformer, and substation levels. Subsequently, the multi-agent system was combined with the cloud-edge collaboration architecture to build a deep reinforcement learning model for multi-level dynamic reconfiguration in an urban distribution network. The cloud-edge collaboration architecture can effectively support the multi-agent system in operating with "centralized training and decentralized execution" and improve the learning efficiency of the model. Thereafter, for the multi-agent system, this study adopted a combination of offline and online learning to give the model the ability to optimize and update its strategy automatically. In the offline learning phase, a Q-learning-based multi-agent conservative Q-learning (MACQL) algorithm was proposed to stabilize the learning results and reduce the risk of the subsequent online learning phase. In the online learning phase, a multi-agent deep deterministic policy gradient (MADDPG) algorithm based on policy gradients was proposed to explore the action space and update the experience pool. Finally, the effectiveness of the proposed method was verified through a simulation analysis of a real-world 445-node system.
Funding: Supported by the National Natural Science Foundation of China (No. 52175148), the Natural Science Foundation of Shaanxi Province (No. 2021KW-25), the Open Cooperation Innovation Fund of Xi’an Modern Chemistry Research Institute (No. SYJJ20210409), and the Fundamental Research Funds for the Central Universities (No. 3102018ZY015).
Abstract: Machine learning (ML) methods with good applicability to complex and highly nonlinear sequences have attracted much attention in recent years for predicting the complicated mechanical properties of various materials. To further increase prediction accuracy and efficiency, this paper proposes a long short-term memory (LSTM) network model to predict the dynamic compressive performance of concrete-like materials at high strain rates. As one of the widely known ML methods, back-propagation (BP) neural networks with and without optimization by a genetic algorithm (GA) are also established for comparison of time cost and prediction error. Dynamic explicit analysis is performed in the finite element (FE) software ABAQUS to simulate various waveforms in split Hopkinson pressure bar (SHPB) experiments by applying different stress waves in the incident bar. The FE simulation accuracy is validated against SHPB experimental results in terms of the dynamic increase factor. To cover a wider range of loading scenarios, 60 sets of FE simulations are conducted to generate three kinds of waveforms in the incident and transmission bars of SHPB experiments. By training the three networks, the nonlinear mapping relations between incident, reflected, and transmitted waves can be reasonably established. Statistical measures are used to quantify the network prediction accuracy, confirming that the stress-strain curves of concrete-like materials at high strain rates predicted by the networks agree sufficiently with those from FE simulations. Compared with the BP network, the GA-BP network can effectively stabilize the network structure, indicating that GA optimization improves the prediction accuracy of the SHPB dynamic responses by performing crossover and mutation operations on the weights and thresholds of the original BP network. By eliminating long-time dependencies, the proposed LSTM network achieves better results than the BP and GA-BP networks, with a smaller mean square error (MSE) and a higher correlation coefficient. More importantly, after training with a limited number of FE simulations, the proposed LSTM algorithm could replace the time-consuming and laborious FE pre-processing, post-processing, and modelling.
Funding: Supported by grants from the National Natural Science Foundation of China (Grant Nos. 52109163 and 51979188).
Abstract: Social infrastructure such as dams is likely to be exposed to a high risk of terrorist and military attacks, leading to increasing attention to its vulnerability and the catastrophic consequences of such events. This paper develops advanced deep learning approaches for structural dynamic response prediction and dam health diagnosis. First, improved long short-term memory (LSTM) networks are proposed for data-driven structural dynamic response analysis, using data generated by a single-degree-of-freedom (SDOF) model and by finite numerical simulation, because abundant practical structural response data for concrete gravity dams under blast events are unavailable. Three kinds of LSTM-based models are discussed for various cases of noise-contaminated signals, and the results show that LSTM-based models have the potential for quick structural response estimation under blast loads. Furthermore, damage indicators (i.e., peak vibration velocity and domain frequency) are extracted from the predicted velocity histories, and their relationship with the dam damage status from the numerical simulation is established. This study provides a deep learning based structural health monitoring (SHM) framework for quick assessment of dams that have experienced underwater explosions, using blast-induced monitoring data.
Funding: Supported by the National Natural Science Foundation of China (22225302, 21991151, 21991150, 22021001, 92161113, 91945301), the Fundamental Research Funds for the Central Universities (20720220009), the China Postdoctoral Science Foundation (2020M682079), and the Guangdong Basic and Applied Basic Research Foundation (2020A1515110539).
Abstract: GaP has been shown to be a promising photoelectrocatalyst for selective CO2 reduction to methanol. Because the interface structure governs important processes such as electron and proton transfer, a detailed understanding of the GaP(110)-water interfacial structure is of great importance. Ab initio molecular dynamics (AIMD) can be used to obtain microscopic information on the interfacial structure. However, the GaP(110)-water interface cannot converge to an equilibrated structure on the time scale of an AIMD simulation. In this work, we perform machine learning accelerated molecular dynamics (MLMD) to overcome the insufficient sampling of AIMD. With the help of MLMD, we unravel the microscopic structure of the GaP(110)-water interface and obtain a deeper understanding of the mechanisms of proton transfer at this interface, which will pave the way for gaining valuable insights into photoelectrocatalytic mechanisms and improving the performance of photoelectrochemical cells.
Funding: Supported by the First Batch of Teaching Reform Projects of Zhejiang Higher Education "14th Five-Year Plan" (jg20220434), the Special Scientific Research Project for Space Debris and Near-Earth Asteroid Defense (KJSP2020020202), the Natural Science Foundation of Zhejiang Province (LGG19F030010), and the National Natural Science Foundation of China (61703183).
Abstract: As a new bionic algorithm, Spider Monkey Optimization (SMO) has been widely applied to various complex optimization problems in recent years. However, SMO has limited ability to explore new regions of the search space, and its population diversity is low. This paper therefore focuses on how to reconstruct SMO to improve its performance, and develops a novel spider monkey optimization algorithm with opposition-based learning and orthogonal experimental design (SMO^3). A position updating method based on the historical optimal domain and particle swarm for the Local Leader Phase (LLP) and Global Leader Phase (GLP) is presented to improve the population diversity of SMO. Moreover, an opposition-based learning strategy based on self-extremum is proposed to avoid premature convergence and getting stuck at locally optimal values. In addition, a local worst individual elimination method based on orthogonal experimental design helps the SMO algorithm eliminate poor individuals in time. Furthermore, an extended SMO^3 named CSMO^3 is investigated to deal with constrained optimization problems. The proposed algorithm is applied to both unconstrained and constrained functions, including the CEC2006 benchmark set and three engineering problems. Experimental results show that the proposed algorithm performs better than three well-known SMO algorithms and other evolutionary algorithms on unconstrained and constrained problems.
Funding: Supported by the National Natural Science Foundation of China (NNSFC) (Grant No. 11902234), the Natural Science Basic Research Program of Shaanxi (Program No. 2020JQ-853), the Shaanxi Provincial Department of Education Youth Innovation Team Scientific Research Project (Program No. 22JP025), and the Young Talents Development Support Program of Xi’an University of Finance and Economics.
Abstract: Solving fractional-order systems has long been a difficult problem. Traditional methods such as the predictor-corrector method involve complicated and cumbersome derivations, which lowers solution efficiency. The development of machine learning and nonlinear dynamics offers new ideas for such complex problems. This study therefore considers how to improve on the accuracy and efficiency of traditional methods, and proposes an efficient and accurate nonlinear auto-regressive neural network prediction model for fractional-order dynamic systems (FODS-NAR). First, an example demonstrates that the FODS-NAR algorithm can predict the solution of a stochastic fractional-order system. Second, FODS-NAR is compared with the well-known reservoir computing (RC) algorithm. With the same system parameters, FODS-NAR gives more accurate predictions than the traditional RC algorithm, and its residuals are closer to 0. Consequently, FODS-NAR is concluded to be the more accurate method, with predictions closer to the state of fractional-order stochastic systems. In addition, the effects of the number of neurons and the order of delays in FODS-NAR on the prediction results are analyzed, and a range of optimal values is derived.
Abstract: Dynamic area coverage with small unmanned aerial vehicle (UAV) systems is a major research topic because of limited payloads and the difficulty of decentralized decision-making. Collaborative behavior of a group of UAVs in an unknown environment is another hard problem. In this paper, we propose a method for decentralized execution of multiple UAVs for dynamic area coverage problems. The proposed decentralized decision-making dynamic area coverage (DDMDAC) method uses reinforcement learning (RL), where each UAV is represented by an intelligent agent that learns policies to create collaborative behaviors in a partially observable environment. Agents extend their observations toward a global view by connecting with other agents and gathering information about the environment. This connectivity provides a consensus for the decision-making process, while each agent still makes its own decisions. At each step, agents acquire the states of all reachable agents, determine the optimum location for maximal area coverage, and receive a reward based on the covered rate of the target area. The method was tested on a multi-agent actor-critic simulation platform. As in real applications, each UAV is assumed to have a limited communication distance. The results show that UAVs with limited communication distance can act jointly in the target area and successfully cover it without guidance from a central command unit.
Funding: Supported by the National Natural Science Foundation of China (No. 61973275).
Abstract: Dynamic path planning is crucial for mobile robots to navigate successfully in unstructured environments. To achieve a globally optimal path and real-time dynamic obstacle avoidance during movement, a dynamic path planning algorithm incorporating an improved IB-RRT* and deep reinforcement learning (DRL) is proposed. First, an improved IB-RRT* algorithm is proposed for global path planning by combining double elliptic subset sampling and probabilistic central circle target bias. Then, to tackle the slow response to dynamic obstacles and the inadequate obstacle avoidance of traditional local path planning algorithms, deep reinforcement learning is used to predict the movement trend of dynamic obstacles, leading to a dynamic fusion path planning method. Finally, simulation and experimental results demonstrate that the improved IB-RRT* algorithm has higher convergence speed and search efficiency than the traditional Bi-RRT*, Informed-RRT*, and IB-RRT* algorithms. Furthermore, the proposed fusion algorithm can effectively perform real-time obstacle avoidance and navigation tasks for mobile robots in unstructured environments.
Abstract: Digital educational resources are currently numerous and varied. To solve the problems of existing dynamic resource selection methods, a dynamic resource selection method based on machine learning is proposed. First, according to the knowledge structure and concepts of mathematical resources, combined with the basic components of dynamic mathematical resources, a knowledge structure graph of mathematical resources is constructed. Then, according to the characteristics of mathematical resources, the interaction between users and resources is simulated, the graph of the main body of the resources is identified, and the candidate collection of mathematical knowledge is selected. Finally, according to the degree of matching between the mathematical literature and the candidate collection, machine learning is used to screen the mathematical resources.