Publication list: 釜谷博行 (author)

[Open Access] Hierarchical Reinforcement Learning in Partially Observable Markov Environments (部分観測マルコフ環境における階層型強化学習)

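Authors: 釜谷博行, 阿部健一
Publisher: The Institute of Electrical Engineers of Japan (一般社団法人電気学会)
Journal: IEEJ Transactions on Electronics, Information and Systems (電気学会論文誌C, 電子・情報・システム部門誌) (ISSN: 0385-4221)
Volume, issue, pages, publication date: vol.122, no.7, pp.1186-1193, 2002-07-01 (Released: 2008-12-19)
Number of references: 22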

The most widely used reinforcement learning (RL) algorithms are limited to Markovian environments. To handle larger-scale partially observable Markov decision processes (POMDPs), we propose a new on-line hierarchical RL algorithm called Switching Q-learning (SQ-learning). The basic idea of SQ-learning is that non-Markovian tasks can be decomposed automatically into subtasks solvable by multiple policies, without any additional information pointing to good subgoals. To perform this decomposition, SQ-learning employs ordered sequences of Q-modules, in which each module discovers a local control policy using Sarsa(λ). In addition, a learning automaton with a hierarchical structure finds appropriate subgoal sequences using the linear reward-inaction (L_R-I) algorithm. The results of extensive simulations demonstrate the effectiveness of SQ-learning.
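The two update rules named in the abstract can be illustrated with a minimal Python sketch. This shows only the textbook forms of the linear reward-inaction update and of tabular Sarsa(λ) with accumulating traces. It is not the authors' SQ-learning implementation, and it omits the module switching and the hierarchical automaton structure. The function names, the dictionary-based tables, the binary reward signal, and all step-size values are assumptions made for illustration.

```python
from collections import defaultdict


def lri_update(probs, chosen, rewarded, step=0.1):
    """Textbook linear reward-inaction update of an action-probability vector.

    On a reward, probability mass moves toward the chosen action.
    On a penalty, the vector is left unchanged (the "inaction" part).
    """
    if not rewarded:
        return list(probs)
    return [
        p + step * (1.0 - p) if i == chosen else (1.0 - step) * p
        for i, p in enumerate(probs)
    ]


def sarsa_lambda_step(q, traces, s, a, r, s_next, a_next,
                      alpha=0.1, gamma=0.9, lam=0.8):
    """One tabular Sarsa(lambda) update with accumulating eligibility traces.

    q and traces are dicts keyed by (state, action); both are updated in place.
    Returns the temporal-difference error for this transition.
    """
    delta = r + gamma * q[(s_next, a_next)] - q[(s, a)]
    traces[(s, a)] += 1.0
    for key in list(traces):
        q[key] += alpha * delta * traces[key]
        traces[key] *= gamma * lam  # decay every trace after the update
    return delta


# Hypothetical usage: two candidate subgoals and a single observed transition.
probs = lri_update([0.5, 0.5], chosen=0, rewarded=True)
q = defaultdict(float)
traces = defaultdict(float)
td_error = sarsa_lambda_step(q, traces, s=0, a=0, r=1.0, s_next=1, a_next=0)
```

In this sketch, `lri_update` keeps the probabilities summing to one because the increase of the chosen entry equals the total decrease of the others.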
