Abstract
The accuracy of position evaluation is an important factor in the playing strength of a board-game program. To address the shortcomings of static evaluation functions, a TD-BP reinforcement learning algorithm is proposed, which combines temporal-difference (TD) learning with a back-propagation (BP) neural network. Together with the minimax search commonly used in game playing and a PVS search enhanced by the history heuristic, it is used to implement a self-learning gomoku (Renju) program with strong adaptability. Experimental results show that a program using this algorithm reaches a good playing level after a relatively short period of training.
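As a rough illustration of the search component named in the abstract, the following is a minimal sketch of PVS (principal variation search) with a simple history table for move ordering. It is not the paper's implementation: the toy game tree (nested lists with integer leaf scores), the function names `negamax` and `pvs`, and the use of a child index as the "move" key in the history table are all assumptions made for this example. In a real gomoku program the leaf score would come from the learned evaluation function and the history key would be a board coordinate.

```python
import math

def negamax(node, color):
    """Plain negamax, used here only as a reference result.
    Leaves are integers scored from the first player's point of view."""
    if not isinstance(node, list):
        return color * node
    return max(-negamax(child, -color) for child in node)

def pvs(node, alpha, beta, color, history):
    """Principal variation search on a toy tree (assumed representation).
    `history` maps a move key (here: the child index, a stand-in for a
    board coordinate) to a counter of how often that move caused a cutoff."""
    if not isinstance(node, list):
        return color * node
    # History heuristic: try moves that produced cutoffs before first.
    order = sorted(range(len(node)), key=lambda i: -history.get(i, 0))
    first = True
    for i in order:
        child = node[i]
        if first:
            # Search the presumed best move with the full window.
            score = -pvs(child, -beta, -alpha, -color, history)
            first = False
        else:
            # Null-window probe (integer scores assumed).
            score = -pvs(child, -alpha - 1, -alpha, -color, history)
            if alpha < score < beta:
                # The probe failed high, so re-search with the full window.
                score = -pvs(child, -beta, -alpha, -color, history)
        alpha = max(alpha, score)
        if alpha >= beta:
            # Beta cutoff: reward this move in the history table.
            history[i] = history.get(i, 0) + 1
            break
    return alpha

tree = [[3, 5], [6, 9], [1, 2]]
best = pvs(tree, -math.inf, math.inf, 1, {})
```

With a full window at the root, `pvs` should return the same value as `negamax` on the same tree; the history table only changes the order in which moves are tried, not the result.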
Source
《沈阳理工大学学报》
CAS
2010, No. 4, pp. 30-32, 37 (4 pages in total)
Journal of Shenyang Ligong University