
TD-BP强化学习算法在五子棋博弈系统中的应用 (cited by: 3)

Applications of TD-BP Algorithm in Renju Game System
Abstract: The accuracy of position evaluation is an important factor in the playing strength of a board-game program. To address the shortcomings of static evaluation functions, the paper proposes the TD-BP reinforcement learning algorithm, which combines temporal-difference (TD) learning with a BP neural network. Together with the minimax search commonly used in game playing and a PVS search enhanced by the history heuristic, it yields a self-learning Gomoku (Renju) program with strong adaptability. Experimental results show that a program using this algorithm reaches a good level of play after a relatively short period of training.
Source: Journal of Shenyang Ligong University (《沈阳理工大学学报》), CAS, 2010, No. 4, pp. 30-32, 37 (4 pages)
Keywords: TD algorithm; BP neural network; evaluation function; PVS algorithm
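The abstract names the TD-BP idea but gives no implementation details. As a rough illustration only, the following minimal sketch shows how a TD(0) update can be driven through a small backpropagation network that serves as a learned evaluation function. It is not the authors' implementation: the class `TinyValueNet`, the function `train_on_game`, the feature vectors, the network size, and the learning rate are all hypothetical choices made for this example.

```python
import math
import random


class TinyValueNet:
    """Hypothetical evaluator: V(x) = sigmoid(w2 . tanh(W1 x)), trained by backprop."""

    def __init__(self, n_in, n_hidden, seed=0):
        rng = random.Random(seed)
        self.W1 = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
                   for _ in range(n_hidden)]
        self.w2 = [rng.uniform(-0.1, 0.1) for _ in range(n_hidden)]

    def forward(self, x):
        # Hidden activations, then a sigmoid output in (0, 1).
        h = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in self.W1]
        z = sum(w * hi for w, hi in zip(self.w2, h))
        return 1.0 / (1.0 + math.exp(-z)), h

    def value(self, x):
        return self.forward(x)[0]

    def td_update(self, x, target, alpha=0.1):
        """One gradient step moving V(x) toward `target`; returns the TD error."""
        v, h = self.forward(x)
        delta = target - v                 # TD error
        dz = v * (1.0 - v)                 # derivative of the sigmoid output
        for j, hj in enumerate(h):
            dh = dz * self.w2[j] * (1.0 - hj * hj)   # backprop through tanh
            for i, xi in enumerate(x):
                self.W1[j][i] += alpha * delta * dh * xi
            self.w2[j] += alpha * delta * dz * hj
        return delta


def train_on_game(net, states, outcome, gamma=1.0, alpha=0.1):
    """TD(0) over one finished game: each position is pulled toward the value
    of its successor, and the last position toward the game outcome."""
    for t in range(len(states) - 1):
        net.td_update(states[t], gamma * net.value(states[t + 1]), alpha)
    net.td_update(states[-1], outcome, alpha)


if __name__ == "__main__":
    net = TinyValueNet(n_in=3, n_hidden=4, seed=1)
    game = [[1.0, 0.0, 1.0], [0.0, 1.0, 1.0]]   # made-up position features
    for _ in range(300):
        train_on_game(net, game, outcome=1.0, alpha=0.5)
    print(net.value(game[0]), net.value(game[1]))
```

In a full program of the kind the abstract describes, such a learned value would presumably replace the static evaluation at the leaves of the minimax or PVS search; that wiring is omitted here.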

References (6)

  • 1. Sutton R. TD-Gammon [EB/OL]. http://www-anw.cs.umass.edu/index.shtml.
  • 2. Tesauro G. Practical issues in temporal difference learning [J]. Machine Learning, 1992, 8(3-4): 257-277.
  • 3. Mannen H, Wiering M. Learning to Play Chess Using TD(λ)-Learning With Database Games [C]. Proceedings of the Thirteenth Belgian-Dutch Conference on Machine Learning, Benelearn, 2004.
  • 4. Sutton R S. Learning to predict by the methods of temporal differences [J]. Machine Learning, 1988, 3(1): 9-44.
  • 5. Schaeffer J. The history heuristic and alpha-beta search enhancements in practice [J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 1989, 11(11): 1203-1212.
  • 6. Zhang Mingliang, Li Fanzhang. A new game-tree search method [J]. Journal of Shandong University (Engineering Science), 2009, 39(6): 1-7. (In Chinese.)

