Design of experience-replay module with high performance
Abstract: In Deep Q Network (DQN) applications, an experience-replay procedure implemented directly on Python data structures often becomes a performance bottleneck. To address this, a design scheme for a general-purpose, high-performance experience-replay module is proposed. The module has two software layers. The lower layer, the "kernel", is written in C++ and implements the fundamental experience-replay functions with high execution efficiency. The upper layer, the "wrapper", is written in Python; it encapsulates the module's functions in an object-oriented style and provides the call interface, which makes the module easy to use. For the critical operations in experience replay, the software structure and algorithms were studied and designed with care. The measures include implementing the priority replay mechanism as an auxiliary component that is logically separated from the module's main logic, moving the check of whether a sample can be drawn forward from the "get_batch" operation to the "record" operation, and using efficient strategies and algorithms for eliminating samples. These measures make the module general and extensible. Experimental results show that the module substantially improves the overall execution efficiency of the experience-replay process, and that both critical operations, "record" (sample recording) and "get_batch" (sample extraction), execute efficiently. Compared with the straightforward implementation based on Python data structures, the proposed module performs "get_batch" about 100 times faster. As a result, the experience-replay process is no longer a performance bottleneck in the system, which meets the needs of various DQN-related applications.
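To make the two operations named in the abstract concrete, here is a minimal pure-Python sketch of a fixed-capacity replay buffer. It is not the authors' module: the paper's kernel is written in C++, and the class name `ReplayBuffer`, the `Transition` fields, the `seed` parameter, and the particular validity check are all assumptions introduced for illustration. Only the operation names `record` and `get_batch` come from the abstract. The sketch shows two of the listed ideas in their simplest form: validating a sample when it is recorded rather than when it is drawn, and eliminating the oldest sample in constant time by overwriting a ring-buffer slot. Priority replay is left out.

```python
import random
from collections import namedtuple

# Hypothetical sample layout; the paper does not specify its fields here.
Transition = namedtuple("Transition", "state action reward next_state done")


class ReplayBuffer:
    """Fixed-capacity ring buffer with uniform sampling (illustrative only)."""

    def __init__(self, capacity, seed=None):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self._slots = [None] * capacity  # preallocated storage
        self._next = 0                   # slot the next record() writes to
        self._size = 0                   # number of valid samples stored
        self._rng = random.Random(seed)

    def __len__(self):
        return self._size

    def record(self, state, action, reward, next_state, done):
        # Assumed stand-in for the paper's check: validate at record time
        # so that get_batch() never has to inspect samples again.
        if next_state is None and not done:
            raise ValueError("a non-terminal transition needs a next_state")
        # Overwriting the slot at _next evicts the oldest sample in O(1).
        self._slots[self._next] = Transition(
            state, action, reward, next_state, done
        )
        self._next = (self._next + 1) % self.capacity
        self._size = min(self._size + 1, self.capacity)

    def get_batch(self, batch_size):
        if batch_size > self._size:
            raise ValueError("not enough samples recorded")
        # Valid samples always occupy slots [0, _size), so sampling indices
        # from that range draws without replacement from stored samples.
        indices = self._rng.sample(range(self._size), batch_size)
        return [self._slots[i] for i in indices]
```

For example, after `buf = ReplayBuffer(3)` and five calls to `buf.record(...)`, the buffer holds only the three most recent transitions, and `buf.get_batch(2)` returns two of them chosen uniformly at random. A production version along the lines the abstract describes would keep this interface in a Python wrapper and move the storage and sampling into compiled code.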
Authors: CHEN Bo (陈勃); WANG Jinyan (王锦艳) (College of Mathematics and Computer Science, Fuzhou University, Fuzhou, Fujian 350108, China)
Source: Journal of Computer Applications (《计算机应用》), indexed in CSCD and the Peking University Core Journals list; 2019, No. 11, pp. 3242-3249 (8 pages)
Funding: Natural Science Foundation of Fujian Province (2016J01294)
Keywords: reinforcement learning; deep learning; Deep Q Network (DQN); experience-replay; software design