
Robot Q-learning coverage algorithm in unknown environments
Chang Baoxian¹, Ding Jie², Zhu Junwu², Zhang Yonglong²,³
1. College of Sciences, Nanjing University of Technology, Nanjing 211816, China;
2. College of Information Engineering, Yangzhou University, Yangzhou 225009, China;
3. College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 210016, China
Keywords: unknown environments; Q-learning; coverage algorithm; robots; area coverage; grid model
Abstract: A Q-learning coverage algorithm (QLCA) is presented to improve the area coverage rate of robots in unknown environments. A grid model is constructed for the environment, and the positions of the robots and obstacles are deployed randomly on the grid map. The robots' subsequent action choices and path planning are directed by the Q-table obtained through the robots' self-learning under QLCA, which reduces the number of robot moves. The effects of parameters such as the number of robots and the environment on the algorithm are analyzed. Simulation results show that, compared with the random selection coverage algorithm (RSCA), QLCA clearly reduces both the number of steps needed to complete coverage and the coverage redundancy.
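To make the abstract concrete, below is a minimal tabular Q-learning sketch for single-robot grid coverage in Python. It is an illustration only, not the paper's implementation: the abstract does not give QLCA's state encoding, reward design, multi-robot coordination, or parameter values, so the grid size, obstacle set, reward scheme, and the constants ALPHA, GAMMA, and EPSILON here are all assumptions.

import random

# Hypothetical grid world: free cells, a few obstacles, one robot starting at (0, 0).
ROWS, COLS = 10, 10
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]   # up, down, left, right
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1          # assumed learning rate, discount, exploration rate
obstacles = {(2, 3), (5, 5), (7, 1)}           # placed randomly in the paper's setup
covered = set()
Q = {}                                         # Q-table: (state, action) -> value

def q(state, action):
    return Q.get((state, action), 0.0)

def step(state, action):
    # Apply an action; return the next state and an assumed coverage-oriented reward.
    nxt = (state[0] + action[0], state[1] + action[1])
    if not (0 <= nxt[0] < ROWS and 0 <= nxt[1] < COLS) or nxt in obstacles:
        return state, -1.0                     # hit a wall or obstacle: penalty, stay put
    reward = 1.0 if nxt not in covered else -0.1   # favor uncovered cells, discourage revisits
    covered.add(nxt)
    return nxt, reward

state = (0, 0)
covered.add(state)
for _ in range(5000):
    if random.random() < EPSILON:              # epsilon-greedy action selection
        action = random.choice(ACTIONS)
    else:
        action = max(ACTIONS, key=lambda a: q(state, a))
    nxt, reward = step(state, action)
    best_next = max(q(nxt, a) for a in ACTIONS)
    # Standard tabular Q-learning update: Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a)).
    Q[(state, action)] = q(state, action) + ALPHA * (reward + GAMMA * best_next - q(state, action))
    state = nxt

print(f"coverage rate: {len(covered) / (ROWS * COLS - len(obstacles)):.2%}")

Rewarding first visits and mildly penalizing revisits is one plausible way a learned Q-table can steer a robot toward uncovered cells and thereby cut redundant moves, which is the effect the abstract attributes to QLCA.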


References:
[1] Cai Zixing, Cui Yian. Survey of multi-robot coverage[J]. Control and Decision, 2008, 23(5): 481-486.
[2] Watkins C J C H. Learning from delayed rewards[D]. Cambridge, UK: King's College, University of Cambridge, 1989: 1-55.
[3] Cheng Ke. Multi-robot coalition formation for distributed area coverage[D]. Omaha, USA: Computer Science Department, University of Nebraska, 2011: 3-55.
[4] Minsky M L. Theory of neural-analog reinforcement systems and its application to the brain-model problem[M]. Princeton, USA: Princeton University, 1954: 5-23.
[5] Liang Quan. Reinforcement learning based mobile robot path planning in unknown environment[J]. Journal of Mechanical and Electrical Engineering, 2012, 29(4): 477-481.
[6] Fazli P, Davoodi A, Mackworth A K. Multi-robot repeated area coverage: Performance optimization under various visual ranges[A]. Ninth Conference on Computer and Robot Vision (CRV 2012)[C]. Toronto, Canada: IEEE, 2012: 298-305.
[7] Hazon N, Mieli F, Kaminka G A. Towards robust on-line multi-robot coverage[A]. Proceedings of 2006 IEEE International Conference on Robotics and Automation[C]. Singapore City, Singapore: IEEE, 2006: 1710-1715.
[8] Jeon H S, Ko M C, Oh R, et al. A practical robot coverage algorithm for unknown environments[M]. Berlin, Germany: Springer-Verlag Berlin Heidelberg, 2010: 129-140.
[9] Zhang Jiawang, Han Guangsheng, Zhang Wei. Application of Q-learning algorithm in dribbling ball training of RoboCup[J]. System Simulation Technology, 2005(2): 84-87.
[11] Matignon L, Laurent G J, Le Fort-Piat N. Independent reinforcement learners in cooperative Markov games: A survey regarding coordination problems[J]. The Knowledge Engineering Review, 2012, 27(1): 1-31.
[12] Cheng K, Dasgupta P. Multi-agent coalition formation for distributed area coverage[M]. Berlin, Germany: Springer-Verlag Berlin Heidelberg, 2011: 4-13.
[13] Cheng K, Dasgupta P. Weighted voting game based multi-robot team formation for distributed area coverage[A]. Proceedings of the 3rd International Symposium on Practical Cognitive Agents and Robots[C]. New York, USA: ACM, 2010: 9-15.
[14] Dasgupta P, Cheng K, Banerjee B. Adaptive multi-robot team reconfiguration using a policy-reuse reinforcement learning approach[M]. Berlin, Germany: Springer-Verlag Berlin Heidelberg, 2012: 330-345.
[17] Wu Jun, Xu Xin, Zhang Pengcheng. A novel multi-agent reinforcement learning approach for job scheduling in grid computing[J]. Future Generation Computer Systems, 2011(27): 430-439.


Last Update: 2013-12-31