Solving hard AI planning instances using curriculum-driven deep reinforcement learning

ABSTRACT
Despite significant progress in general AI planning, certain domains remain out of reach of current AI planning systems. Sokoban is a PSPACE-complete planning task and represents one of the hardest domains for current AI planners. Even specialized, domain-specific search methods fail quickly on hard instances because of the exponential growth of the search space. Our approach, based on deep reinforcement learning augmented with a curriculum-driven method, is the first to solve hard Sokoban instances within one day of training, while other modern solvers cannot solve these instances within any reasonable time limit. In contrast to prior efforts, which rely on carefully handcrafted pruning techniques, our approach automatically uncovers domain structure. Our results show that deep RL provides a promising framework for solving previously unsolved AI planning problems, provided a suitable training curriculum can be devised.
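The curriculum idea can be made concrete as a staged training loop: the agent first trains on easy subcases of a hard instance (e.g., with only a few boxes) and advances to harder subcases once its success rate clears a threshold. The following is a minimal Python sketch of such a schedule, not the paper's actual implementation; the names `Agent`, `make_subcase`, and `episode_solved` are illustrative stubs.

```python
import random

# Minimal sketch of a curriculum-driven training loop, in the spirit of
# the approach described in the abstract.  All names below (Agent,
# make_subcase, episode_solved) are illustrative stubs, not the
# authors' implementation.

class Agent:
    """Placeholder learner; a real agent would hold a deep RL policy."""

    def act(self, state):
        # Random policy stand-in for a learned Sokoban policy.
        return random.choice(["up", "down", "left", "right"])

    def update(self, transition):
        pass  # a real agent would take a gradient step here


def make_subcase(num_boxes):
    """Stand-in for deriving an easier subcase of one hard instance,
    e.g. by keeping only `num_boxes` of the original boxes."""
    return {"num_boxes": num_boxes}


def episode_solved(instance, agent, max_steps=50):
    """Stand-in rollout; a real check would execute the agent's actions
    in a Sokoban simulator and test whether all boxes reach targets."""
    for _ in range(max_steps):
        agent.act(instance)
    return random.random() < 0.3  # placeholder outcome


def curriculum_train(max_boxes, episodes_per_stage=1000, threshold=0.9):
    agent = Agent()
    # Easy-to-hard schedule: increase the number of boxes stage by stage.
    for num_boxes in range(1, max_boxes + 1):
        successes = 0
        for episode in range(1, episodes_per_stage + 1):
            instance = make_subcase(num_boxes)
            if episode_solved(instance, agent):
                successes += 1
            # Advance to the next stage once the success rate is high
            # enough over a reasonable sample of episodes.
            if episode >= 100 and successes / episode >= threshold:
                break
        print(f"stage with {num_boxes} boxes: "
              f"success rate {successes / episode:.2f}")
    return agent


if __name__ == "__main__":
    curriculum_train(max_boxes=4)
```

The key design choice is the advancement criterion: moving on only after a sustained success rate keeps the policy from being thrown into subcases it cannot yet solve, which is what makes the curriculum effective on instances that defeat direct search.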