


David Silver
Person information
- affiliation: DeepMind, London, UK
- affiliation (former): University College London, UK
2020 – today
- 2023
- [j29]Daniel J. Mankowitz, Andrea Michi, Anton Zhernov, Marco Gelmi, Marco Selvi, Cosmin Paduraru, Edouard Leurent, Shariq Iqbal, Jean-Baptiste Lespiau, Alex Ahern, Thomas Köppe, Kevin Millikin, Stephen Gaffney, Sophie Elster, Jackson Broshear, Chris Gamble, Kieran Milan, Robert Tung, Minjae Hwang, A. Taylan Cemgil, Mohammadamin Barekatain, Yujia Li, Amol Mandhane, Thomas Hubert, Julian Schrittwieser, Demis Hassabis, Pushmeet Kohli, Martin A. Riedmiller, Oriol Vinyals, David Silver:
Faster sorting algorithms discovered using deep reinforcement learning. Nat. 618(7964): 257-263 (2023) - [i59]Rohan Anil, Sebastian Borgeaud, Yonghui Wu, Jean-Baptiste Alayrac, Jiahui Yu, Radu Soricut, Johan Schalkwyk, Andrew M. Dai, Anja Hauth, Katie Millican, David Silver, Slav Petrov, Melvin Johnson, Ioannis Antonoglou, Julian Schrittwieser, Amelia Glaese, Jilin Chen, Emily Pitler, Timothy P. Lillicrap, Angeliki Lazaridou, Orhan Firat, James Molloy, Michael Isard, Paul Ronald Barham, Tom Hennigan, Benjamin Lee, Fabio Viola, Malcolm Reynolds, Yuanzhong Xu, Ryan Doherty, Eli Collins, Clemens Meyer, Eliza Rutherford, Erica Moreira, Kareem Ayoub, Megha Goel, George Tucker, Enrique Piqueras, Maxim Krikun, Iain Barr, Nikolay Savinov, Ivo Danihelka, Becca Roelofs, Anaïs White, Anders Andreassen, Tamara von Glehn, Lakshman Yagati, Mehran Kazemi, Lucas Gonzalez, Misha Khalman, Jakub Sygnowski, et al.:
Gemini: A Family of Highly Capable Multimodal Models. CoRR abs/2312.11805 (2023) - 2022
- [j28]Alhussein Fawzi, Matej Balog, Aja Huang, Thomas Hubert, Bernardino Romera-Paredes, Mohammadamin Barekatain, Alexander Novikov, Francisco J. R. Ruiz, Julian Schrittwieser, Grzegorz Swirszcz, David Silver, Demis Hassabis, Pushmeet Kohli:
Discovering faster matrix multiplication algorithms with reinforcement learning. Nat. 610(7930): 47-53 (2022) - [j27]Yutaka Matsuo, Yann LeCun, Maneesh Sahani, Doina Precup, David Silver, Masashi Sugiyama, Eiji Uchibe, Jun Morimoto:
Deep learning, reinforcement learning, and world models. Neural Networks 152: 267-275 (2022) - [c88]Ioannis Antonoglou, Julian Schrittwieser, Sherjil Ozair, Thomas K. Hubert, David Silver:
Planning in Stochastic Environments with a Learned Model. ICLR 2022 - [c87]Ivo Danihelka, Arthur Guez, Julian Schrittwieser, David Silver:
Policy improvement by planning with Gumbel. ICLR 2022 - [c86]Sebastian Flennerhag, Yannick Schroecker, Tom Zahavy, Hado van Hasselt, David Silver, Satinder Singh:
Bootstrapped Meta-Learning. ICLR 2022 - [c85]David Silver, Anirudh Goyal, Ivo Danihelka, Matteo Hessel, Hado van Hasselt:
Learning by Directional Gradient Descent. ICLR 2022 - [d1]Julien Pérolat, Bart De Vylder, Daniel Hennes, Eugene Tarassov, Florian Strub, Vincent de Boer, Paul Muller, Jerome T. Connor, Neil Burch, Thomas Anthony, Stephen McAleer, Romuald Elie, Sarah H. Cen, Zhe Wang, Audrunas Gruslys, Aleksandra Malysheva, Mina Khan, Sherjil Ozair, Finbarr Timbers, Toby Pohlen, Tom Eccles, Mark Rowland, Marc Lanctot, Jean-Baptiste Lespiau, Bilal Piot, Shayegan Omidshafiei, Edward Lockhart, Laurent Sifre, Nathalie Beauguerlange, Rémi Munos, David Silver, Satinder Singh, Demis Hassabis, Karl Tuyls:
Figure Data for the paper "Mastering the Game of Stratego with Model-Free Multiagent Reinforcement Learning". Zenodo, 2022 - [i58]Julien Pérolat, Bart De Vylder, Daniel Hennes, Eugene Tarassov, Florian Strub, Vincent de Boer, Paul Muller, Jerome T. Connor, Neil Burch, Thomas W. Anthony, Stephen McAleer, Romuald Elie, Sarah H. Cen, Zhe Wang, Audrunas Gruslys, Aleksandra Malysheva, Mina Khan, Sherjil Ozair, Finbarr Timbers, Toby Pohlen, Tom Eccles, Mark Rowland, Marc Lanctot, Jean-Baptiste Lespiau, Bilal Piot, Shayegan Omidshafiei, Edward Lockhart, Laurent Sifre, Nathalie Beauguerlange, Rémi Munos, David Silver, Satinder Singh, Demis Hassabis, Karl Tuyls:
Mastering the Game of Stratego with Model-Free Multiagent Reinforcement Learning. CoRR abs/2206.15378 (2022) - 2021
- [j26]David Silver, Satinder Singh, Doina Precup, Richard S. Sutton:
Reward is enough. Artif. Intell. 299: 103535 (2021) - [c84]Will Dabney, André Barreto, Mark Rowland, Robert Dadashi, John Quan, Marc G. Bellemare, David Silver:
The Value-Improvement Path: Towards Better Representations for Reinforcement Learning. AAAI 2021: 7160-7168 - [c83]Hado van Hasselt, Sephora Madjiheurem, Matteo Hessel, David Silver, André Barreto, Diana Borsa:
Expected Eligibility Traces. AAAI 2021: 9997-10005 - [c82]Matteo Hessel, Ivo Danihelka, Fabio Viola, Arthur Guez, Simon Schmitt, Laurent Sifre, Theophane Weber, David Silver, Hado van Hasselt:
Muesli: Combining Improvements in Policy Optimization. ICML 2021: 4214-4226 - [c81]Thomas Hubert, Julian Schrittwieser, Ioannis Antonoglou, Mohammadamin Barekatain, Simon Schmitt, David Silver:
Learning and Planning in Complex Action Spaces. ICML 2021: 4476-4486 - [c80]Gregory Farquhar, Kate Baumli, Zita Marinho, Angelos Filos, Matteo Hessel, Hado Philip van Hasselt, David Silver:
Self-Consistent Models and Values. NeurIPS 2021: 1111-1125 - [c79]Christopher Grimm, André Barreto, Gregory Farquhar, David Silver, Satinder Singh:
Proper Value Equivalence. NeurIPS 2021: 7773-7786 - [c78]Julian Schrittwieser, Thomas Hubert, Amol Mandhane, Mohammadamin Barekatain, Ioannis Antonoglou, David Silver:
Online and Offline Reinforcement Learning by Planning with a Learned Model. NeurIPS 2021: 27580-27591 - [c77]Vivek Veeriah, Tom Zahavy, Matteo Hessel, Zhongwen Xu, Junhyuk Oh, Iurii Kemaev, Hado van Hasselt, David Silver, Satinder Singh:
Discovery of Options via Meta-Learned Subgoals. NeurIPS 2021: 29861-29873 - [i57]Vivek Veeriah, Tom Zahavy, Matteo Hessel, Zhongwen Xu, Junhyuk Oh, Iurii Kemaev, Hado van Hasselt, David Silver, Satinder Singh:
Discovery of Options via Meta-Learned Subgoals. CoRR abs/2102.06741 (2021) - [i56]Matteo Hessel, Ivo Danihelka, Fabio Viola, Arthur Guez, Simon Schmitt, Laurent Sifre, Theophane Weber, David Silver, Hado van Hasselt:
Muesli: Combining Improvements in Policy Optimization. CoRR abs/2104.06159 (2021) - [i55]Julian Schrittwieser, Thomas Hubert, Amol Mandhane, Mohammadamin Barekatain, Ioannis Antonoglou, David Silver:
Online and Offline Reinforcement Learning by Planning with a Learned Model. CoRR abs/2104.06294 (2021) - [i54]Thomas Hubert, Julian Schrittwieser, Ioannis Antonoglou, Mohammadamin Barekatain, Simon Schmitt, David Silver:
Learning and Planning in Complex Action Spaces. CoRR abs/2104.06303 (2021) - [i53]Christopher Grimm, André Barreto, Gregory Farquhar, David Silver, Satinder Singh:
Proper Value Equivalence. CoRR abs/2106.10316 (2021) - [i52]André Barreto, Diana Borsa, Shaobo Hou, Gheorghe Comanici, Eser Aygün, Philippe Hamel, Daniel Toyama, Jonathan J. Hunt, Shibl Mourad, David Silver, Doina Precup:
The Option Keyboard: Combining Skills in Reinforcement Learning. CoRR abs/2106.13105 (2021) - [i51]Sebastian Flennerhag, Yannick Schroecker, Tom Zahavy, Hado van Hasselt, David Silver, Satinder Singh:
Bootstrapped Meta-Learning. CoRR abs/2109.04504 (2021) - [i50]Gregory Farquhar, Kate Baumli, Zita Marinho, Angelos Filos, Matteo Hessel, Hado van Hasselt, David Silver:
Self-Consistent Models and Values. CoRR abs/2110.12840 (2021) - 2020
- [j25]Andrew W. Senior, Richard Evans, John Jumper, James Kirkpatrick, Laurent Sifre, Tim Green, Chongli Qin, Augustin Zídek, Alexander W. R. Nelson, Alex Bridgland, Hugo Penedones, Stig Petersen, Karen Simonyan, Steve Crossan, Pushmeet Kohli, David T. Jones, David Silver, Koray Kavukcuoglu, Demis Hassabis:
Improved protein structure prediction using potentials from deep learning. Nat. 577(7792): 706-710 (2020) - [j24]Julian Schrittwieser, Ioannis Antonoglou, Thomas Hubert, Karen Simonyan, Laurent Sifre, Simon Schmitt, Arthur Guez, Edward Lockhart, Demis Hassabis, Thore Graepel, Timothy P. Lillicrap, David Silver:
Mastering Atari, Go, chess and shogi by planning with a learned model. Nat. 588(7839): 604-609 (2020) - [j23]André Barreto, Shaobo Hou, Diana Borsa, David Silver, Doina Precup:
Fast reinforcement learning with generalized policy updates. Proc. Natl. Acad. Sci. USA 117(48): 30079-30087 (2020) - [c76]Ian Osband, Yotam Doron, Matteo Hessel, John Aslanides, Eren Sezener, Andre Saraiva, Katrina McKinney, Tor Lattimore, Csaba Szepesvári, Satinder Singh, Benjamin Van Roy, Richard S. Sutton, David Silver, Hado van Hasselt:
Behaviour Suite for Reinforcement Learning. ICLR 2020 - [c75]Zeyu Zheng, Junhyuk Oh, Matteo Hessel, Zhongwen Xu, Manuel Kroiss, Hado van Hasselt, David Silver, Satinder Singh:
What Can Learned Intrinsic Rewards Capture? ICML 2020: 11436-11446 - [c74]Christopher Grimm, André Barreto, Satinder Singh, David Silver:
The Value Equivalence Principle for Model-Based Reinforcement Learning. NeurIPS 2020 - [c73]Arthur Guez, Fabio Viola, Theophane Weber, Lars Buesing, Steven Kapturowski, Doina Precup, David Silver, Nicolas Heess:
Value-driven Hindsight Modelling. NeurIPS 2020 - [c72]Junhyuk Oh, Matteo Hessel, Wojciech M. Czarnecki, Zhongwen Xu, Hado van Hasselt, Satinder Singh, David Silver:
Discovering Reinforcement Learning Algorithms. NeurIPS 2020 - [c71]Zhongwen Xu, Hado Philip van Hasselt, Matteo Hessel, Junhyuk Oh, Satinder Singh, David Silver:
Meta-Gradient Reinforcement Learning with an Objective Discovered Online. NeurIPS 2020 - [c70]Tom Zahavy, Zhongwen Xu, Vivek Veeriah, Matteo Hessel, Junhyuk Oh, Hado van Hasselt, David Silver, Satinder Singh:
A Self-Tuning Actor-Critic Algorithm. NeurIPS 2020 - [i49]Arthur Guez, Fabio Viola, Théophane Weber, Lars Buesing, Steven Kapturowski, Doina Precup, David Silver, Nicolas Heess:
Value-driven Hindsight Modelling. CoRR abs/2002.08329 (2020) - [i48]Tom Zahavy, Zhongwen Xu, Vivek Veeriah, Matteo Hessel, Junhyuk Oh, Hado van Hasselt, David Silver, Satinder Singh:
Self-Tuning Deep Reinforcement Learning. CoRR abs/2002.12928 (2020) - [i47]Will Dabney, André Barreto, Mark Rowland, Robert Dadashi, John Quan, Marc G. Bellemare, David Silver:
The Value-Improvement Path: Towards Better Representations for Reinforcement Learning. CoRR abs/2006.02243 (2020) - [i46]Hado van Hasselt, Sephora Madjiheurem, Matteo Hessel, David Silver, André Barreto, Diana Borsa:
Expected Eligibility Traces. CoRR abs/2007.01839 (2020) - [i45]Zhongwen Xu, Hado van Hasselt, Matteo Hessel, Junhyuk Oh, Satinder Singh, David Silver:
Meta-Gradient Reinforcement Learning with an Objective Discovered Online. CoRR abs/2007.08433 (2020) - [i44]Junhyuk Oh, Matteo Hessel, Wojciech M. Czarnecki, Zhongwen Xu, Hado van Hasselt, Satinder Singh, David Silver:
Discovering Reinforcement Learning Algorithms. CoRR abs/2007.08794 (2020) - [i43]Christopher Grimm, André Barreto, Satinder Singh, David Silver:
The Value Equivalence Principle for Model-Based Reinforcement Learning. CoRR abs/2011.03506 (2020)
2010 – 2019
- 2019
- [j22]Oriol Vinyals, Igor Babuschkin, Wojciech M. Czarnecki, Michaël Mathieu, Andrew Dudzik, Junyoung Chung, David H. Choi, Richard Powell, Timo Ewalds, Petko Georgiev, Junhyuk Oh, Dan Horgan, Manuel Kroiss, Ivo Danihelka, Aja Huang, Laurent Sifre, Trevor Cai, John P. Agapiou, Max Jaderberg, Alexander Sasha Vezhnevets, Rémi Leblond, Tobias Pohlen, Valentin Dalibard, David Budden, Yury Sulsky, James Molloy, Tom Le Paine, Çaglar Gülçehre, Ziyu Wang, Tobias Pfaff, Yuhuai Wu, Roman Ring, Dani Yogatama, Dario Wünsch, Katrina McKinney, Oliver Smith, Tom Schaul, Timothy P. Lillicrap, Koray Kavukcuoglu, Demis Hassabis, Chris Apps, David Silver:
Grandmaster level in StarCraft II using multi-agent reinforcement learning. Nat. 575(7782): 350-354 (2019) - [c69]Théophane Weber, Nicolas Heess, Lars Buesing, David Silver:
Credit Assignment Techniques in Stochastic Computation Graphs. AISTATS 2019: 2650-2660 - [c68]Diana Borsa, André Barreto, John Quan, Daniel J. Mankowitz, Hado van Hasselt, Rémi Munos, David Silver, Tom Schaul:
Universal Successor Features Approximators. ICLR (Poster) 2019 - [c67]Arthur Guez, Mehdi Mirza, Karol Gregor, Rishabh Kabra, Sébastien Racanière, Theophane Weber, David Raposo, Adam Santoro, Laurent Orseau, Tom Eccles, Greg Wayne, David Silver, Timothy P. Lillicrap:
An Investigation of Model-Free Planning. ICML 2019: 2464-2473 - [c66]Vivek Veeriah, Matteo Hessel, Zhongwen Xu, Janarthanan Rajendran, Richard L. Lewis, Junhyuk Oh, Hado van Hasselt, David Silver, Satinder Singh:
Discovery of Useful Questions as Auxiliary Tasks. NeurIPS 2019: 9306-9317 - [c65]André Barreto, Diana Borsa, Shaobo Hou, Gheorghe Comanici, Eser Aygün, Philippe Hamel, Daniel Toyama, Jonathan J. Hunt, Shibl Mourad, David Silver, Doina Precup:
The Option Keyboard: Combining Skills in Reinforcement Learning. NeurIPS 2019: 13031-13041 - [i42]Théophane Weber, Nicolas Heess, Lars Buesing, David Silver:
Credit Assignment Techniques in Stochastic Computation Graphs. CoRR abs/1901.01761 (2019) - [i41]Arthur Guez, Mehdi Mirza, Karol Gregor, Rishabh Kabra, Sébastien Racanière, Théophane Weber, David Raposo, Adam Santoro, Laurent Orseau, Tom Eccles, Greg Wayne, David Silver, Timothy P. Lillicrap:
An investigation of model-free planning. CoRR abs/1901.03559 (2019) - [i40]André Barreto, Diana Borsa, John Quan, Tom Schaul, David Silver, Matteo Hessel, Daniel J. Mankowitz, Augustin Zídek, Rémi Munos:
Transfer in Deep Reinforcement Learning Using Successor Features and Generalised Policy Improvement. CoRR abs/1901.10964 (2019) - [i39]Matteo Hessel, Hado van Hasselt, Joseph Modayil, David Silver:
On Inductive Biases in Deep Reinforcement Learning. CoRR abs/1907.02908 (2019) - [i38]Ian Osband, Yotam Doron, Matteo Hessel, John Aslanides, Eren Sezener, Andre Saraiva, Katrina McKinney, Tor Lattimore, Csaba Szepesvári, Satinder Singh, Benjamin Van Roy, Richard S. Sutton, David Silver, Hado van Hasselt:
Behaviour Suite for Reinforcement Learning. CoRR abs/1908.03568 (2019) - [i37]Vivek Veeriah, Matteo Hessel, Zhongwen Xu, Richard L. Lewis, Janarthanan Rajendran, Junhyuk Oh, Hado van Hasselt, David Silver, Satinder Singh:
Discovery of Useful Questions as Auxiliary Tasks. CoRR abs/1909.04607 (2019) - [i36]Julian Schrittwieser, Ioannis Antonoglou, Thomas Hubert, Karen Simonyan, Laurent Sifre, Simon Schmitt, Arthur Guez, Edward Lockhart, Demis Hassabis, Thore Graepel, Timothy P. Lillicrap, David Silver:
Mastering Atari, Go, Chess and Shogi by Planning with a Learned Model. CoRR abs/1911.08265 (2019) - [i35]Zeyu Zheng, Junhyuk Oh, Matteo Hessel, Zhongwen Xu, Manuel Kroiss, Hado van Hasselt, David Silver, Satinder Singh:
What Can Learned Intrinsic Rewards Capture? CoRR abs/1912.05500 (2019) - 2018
- [j21]Ron Sun, David Silver, Gerald Tesauro, Guang-Bin Huang:
Introduction to the special issue on deep reinforcement learning: An editorial. Neural Networks 107: 1-2 (2018) - [c64]Matteo Hessel, Joseph Modayil, Hado van Hasselt, Tom Schaul, Georg Ostrovski, Will Dabney, Dan Horgan, Bilal Piot, Mohammad Gheshlaghi Azar, David Silver:
Rainbow: Combining Improvements in Deep Reinforcement Learning. AAAI 2018: 3215-3222 - [c63]Dan Horgan, John Quan, David Budden, Gabriel Barth-Maron, Matteo Hessel, Hado van Hasselt, David Silver:
Distributed Prioritized Experience Replay. ICLR (Poster) 2018 - [c62]André Barreto, Diana Borsa, John Quan, Tom Schaul, David Silver, Matteo Hessel, Daniel J. Mankowitz, Augustin Zídek, Rémi Munos:
Transfer in Deep Reinforcement Learning Using Successor Features and Generalised Policy Improvement. ICML 2018: 510-519 - [c61]Will Dabney, Georg Ostrovski, David Silver, Rémi Munos:
Implicit Quantile Networks for Distributional Reinforcement Learning. ICML 2018: 1104-1113 - [c60]Arthur Guez, Theophane Weber, Ioannis Antonoglou, Karen Simonyan, Oriol Vinyals, Daan Wierstra, Rémi Munos, David Silver:
Learning to Search with MCTSnets. ICML 2018: 1817-1826 - [c59]Zhongwen Xu, Hado van Hasselt, David Silver:
Meta-Gradient Reinforcement Learning. NeurIPS 2018: 2402-2413 - [i34]Arthur Guez, Théophane Weber, Ioannis Antonoglou, Karen Simonyan, Oriol Vinyals, Daan Wierstra, Rémi Munos, David Silver:
Learning to Search with MCTSnets. CoRR abs/1802.04697 (2018) - [i33]Daniel J. Mankowitz, Augustin Zídek, André Barreto, Dan Horgan, Matteo Hessel, John Quan, Junhyuk Oh, Hado van Hasselt, David Silver, Tom Schaul:
Unicorn: Continual Learning with a Universal, Off-policy Agent. CoRR abs/1802.08294 (2018) - [i32]Dan Horgan, John Quan, David Budden, Gabriel Barth-Maron, Matteo Hessel, Hado van Hasselt, David Silver:
Distributed Prioritized Experience Replay. CoRR abs/1803.00933 (2018) - [i31]Greg Wayne, Chia-Chun Hung, David Amos, Mehdi Mirza, Arun Ahuja, Agnieszka Grabska-Barwinska, Jack W. Rae, Piotr Mirowski, Joel Z. Leibo, Adam Santoro, Mevlana Gemici, Malcolm Reynolds, Tim Harley, Josh Abramson, Shakir Mohamed, Danilo Jimenez Rezende, David Saxton, Adam Cain, Chloe Hillier, David Silver, Koray Kavukcuoglu, Matthew M. Botvinick, Demis Hassabis, Timothy P. Lillicrap:
Unsupervised Predictive Memory in a Goal-Directed Agent. CoRR abs/1803.10760 (2018) - [i30]Zhongwen Xu, Hado van Hasselt, David Silver:
Meta-Gradient Reinforcement Learning. CoRR abs/1805.09801 (2018) - [i29]Will Dabney, Georg Ostrovski, David Silver, Rémi Munos:
Implicit Quantile Networks for Distributional Reinforcement Learning. CoRR abs/1806.06923 (2018) - [i28]Max Jaderberg, Wojciech M. Czarnecki, Iain Dunning, Luke Marris, Guy Lever, Antonio García Castañeda, Charles Beattie, Neil C. Rabinowitz, Ari S. Morcos, Avraham Ruderman, Nicolas Sonnerat, Tim Green, Louise Deason, Joel Z. Leibo, David Silver, Demis Hassabis, Koray Kavukcuoglu, Thore Graepel:
Human-level performance in first-person multiplayer games with population-based deep reinforcement learning. CoRR abs/1807.01281 (2018) - [i27]Yutian Chen, Aja Huang, Ziyu Wang, Ioannis Antonoglou, Julian Schrittwieser, David Silver, Nando de Freitas:
Bayesian Optimization in AlphaGo. CoRR abs/1812.06855 (2018) - [i26]Diana Borsa, André Barreto, John Quan, Daniel J. Mankowitz, Rémi Munos, Hado van Hasselt, David Silver, Tom Schaul:
Universal Successor Features Approximators. CoRR abs/1812.07626 (2018) - 2017
- [j20]David Silver:
Technical perspective: Solving imperfect information games. Commun. ACM 60(11): 80 (2017) - [j19]David Silver, Julian Schrittwieser, Karen Simonyan, Ioannis Antonoglou, Aja Huang, Arthur Guez, Thomas Hubert, Lucas Baker, Matthew Lai, Adrian Bolton, Yutian Chen, Timothy P. Lillicrap, Fan Hui, Laurent Sifre, George van den Driessche, Thore Graepel, Demis Hassabis:
Mastering the game of Go without human knowledge. Nat. 550(7676): 354-359 (2017) - [c58]Max Jaderberg, Volodymyr Mnih, Wojciech Marian Czarnecki, Tom Schaul, Joel Z. Leibo, David Silver, Koray Kavukcuoglu:
Reinforcement Learning with Unsupervised Auxiliary Tasks. ICLR 2017 - [c57]Max Jaderberg, Wojciech Marian Czarnecki, Simon Osindero, Oriol Vinyals, Alex Graves, David Silver, Koray Kavukcuoglu:
Decoupled Neural Interfaces using Synthetic Gradients. ICML 2017: 1627-1635 - [c56]David Silver, Hado van Hasselt, Matteo Hessel, Tom Schaul, Arthur Guez, Tim Harley, Gabriel Dulac-Arnold, David P. Reichert, Neil C. Rabinowitz, André Barreto, Thomas Degris:
The Predictron: End-To-End Learning and Planning. ICML 2017: 3191-3199 - [c55]Alexander Sasha Vezhnevets, Simon Osindero, Tom Schaul, Nicolas Heess, Max Jaderberg, David Silver, Koray Kavukcuoglu:
FeUdal Networks for Hierarchical Reinforcement Learning. ICML 2017: 3540-3549 - [c54]Zhongwen Xu, Joseph Modayil, Hado van Hasselt, André Barreto, David Silver, Tom Schaul:
Natural Value Approximators: Learning when to Trust Past Estimates. NIPS 2017: 2120-2128 - [c53]André Barreto, Will Dabney, Rémi Munos, Jonathan J. Hunt, Tom Schaul, David Silver, Hado van Hasselt:
Successor Features for Transfer in Reinforcement Learning. NIPS 2017: 4055-4065 - [c52]Marc Lanctot, Vinícius Flores Zambaldi, Audrunas Gruslys, Angeliki Lazaridou, Karl Tuyls, Julien Pérolat, David Silver, Thore Graepel:
A Unified Game-Theoretic Approach to Multiagent Reinforcement Learning. NIPS 2017: 4190-4203 - [c51]Sébastien Racanière, Theophane Weber, David P. Reichert, Lars Buesing, Arthur Guez, Danilo Jimenez Rezende, Adrià Puigdomènech Badia, Oriol Vinyals, Nicolas Heess, Yujia Li, Razvan Pascanu, Peter W. Battaglia, Demis Hassabis, David Silver, Daan Wierstra:
Imagination-Augmented Agents for Deep Reinforcement Learning. NIPS 2017: 5690-5701 - [i25]Alexander Sasha Vezhnevets, Simon Osindero, Tom Schaul, Nicolas Heess, Max Jaderberg, David Silver, Koray Kavukcuoglu:
FeUdal Networks for Hierarchical Reinforcement Learning. CoRR abs/1703.01161 (2017) - [i24]Nicolas Heess, Dhruva TB, Srinivasan Sriram, Jay Lemmon, Josh Merel, Greg Wayne, Yuval Tassa, Tom Erez, Ziyu Wang, S. M. Ali Eslami, Martin A. Riedmiller, David Silver:
Emergence of Locomotion Behaviours in Rich Environments. CoRR abs/1707.02286 (2017) - [i23]Theophane Weber, Sébastien Racanière, David P. Reichert, Lars Buesing, Arthur Guez, Danilo Jimenez Rezende, Adrià Puigdomènech Badia, Oriol Vinyals, Nicolas Heess, Yujia Li, Razvan Pascanu, Peter W. Battaglia, David Silver, Daan Wierstra:
Imagination-Augmented Agents for Deep Reinforcement Learning. CoRR abs/1707.06203 (2017) - [i22]Oriol Vinyals, Timo Ewalds, Sergey Bartunov, Petko Georgiev, Alexander Sasha Vezhnevets, Michelle Yeo, Alireza Makhzani, Heinrich Küttler, John P. Agapiou, Julian Schrittwieser, John Quan, Stephen Gaffney, Stig Petersen, Karen Simonyan, Tom Schaul, Hado van Hasselt, David Silver, Timothy P. Lillicrap, Kevin Calderone, Paul Keet, Anthony Brunasso, David Lawrence, Anders Ekermo, Jacob Repp, Rodney Tsing:
StarCraft II: A New Challenge for Reinforcement Learning. CoRR abs/1708.04782 (2017) - [i21]Matteo Hessel, Joseph Modayil, Hado van Hasselt, Tom Schaul, Georg Ostrovski, Will Dabney, Daniel Horgan, Bilal Piot, Mohammad Gheshlaghi Azar, David Silver:
Rainbow: Combining Improvements in Deep Reinforcement Learning. CoRR abs/1710.02298 (2017) - [i20]Marc Lanctot, Vinícius Flores Zambaldi, Audrunas Gruslys, Angeliki Lazaridou, Karl Tuyls, Julien Pérolat, David Silver, Thore Graepel:
A Unified Game-Theoretic Approach to Multiagent Reinforcement Learning. CoRR abs/1711.00832 (2017) - [i19]David Silver, Thomas Hubert, Julian Schrittwieser, Ioannis Antonoglou, Matthew Lai, Arthur Guez, Marc Lanctot, Laurent Sifre, Dharshan Kumaran, Thore Graepel, Timothy P. Lillicrap, Karen Simonyan, Demis Hassabis:
Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm. CoRR abs/1712.01815 (2017) - 2016
- [j18]David Silver, Aja Huang, Chris J. Maddison, Arthur Guez, Laurent Sifre, George van den Driessche, Julian Schrittwieser, Ioannis Antonoglou, Vedavyas Panneershelvam, Marc Lanctot, Sander Dieleman, Dominik Grewe, John Nham, Nal Kalchbrenner, Ilya Sutskever, Timothy P. Lillicrap, Madeleine Leach, Koray Kavukcuoglu, Thore Graepel, Demis Hassabis:
Mastering the game of Go with deep neural networks and tree search. Nat. 529(7587): 484-489 (2016) - [c50]Hado van Hasselt, Arthur Guez, David Silver:
Deep Reinforcement Learning with Double Q-Learning. AAAI 2016: 2094-2100 - [c49]Volodymyr Mnih, Adrià Puigdomènech Badia, Mehdi Mirza, Alex Graves, Timothy P. Lillicrap, Tim Harley, David Silver, Koray Kavukcuoglu:
Asynchronous Methods for Deep Reinforcement Learning. ICML 2016: 1928-1937 - [c48]Hado van Hasselt, Arthur Guez, Matteo Hessel, Volodymyr Mnih, David Silver:
Learning values across many orders of magnitude. NIPS 2016: 4287-4295 - [c47]Timothy P. Lillicrap, Jonathan J. Hunt, Alexander Pritzel, Nicolas Heess, Tom Erez, Yuval Tassa, David Silver, Daan Wierstra:
Continuous control with deep reinforcement learning. ICLR (Poster) 2016 - [c46]Tom Schaul, John Quan, Ioannis Antonoglou, David Silver:
Prioritized Experience Replay. ICLR (Poster) 2016 - [i18]Volodymyr Mnih, Adrià Puigdomènech Badia, Mehdi Mirza, Alex Graves, Timothy P. Lillicrap, Tim Harley, David Silver, Koray Kavukcuoglu:
Asynchronous Methods for Deep Reinforcement Learning. CoRR abs/1602.01783 (2016) - [i17]Hado van Hasselt, Arthur Guez, Matteo Hessel, David Silver:
Learning functions across many orders of magnitudes. CoRR abs/1602.07714 (2016) - [i16]Johannes Heinrich, David Silver:
Deep Reinforcement Learning from Self-Play in Imperfect-Information Games. CoRR abs/1603.01121 (2016) - [i15]André Barreto, Rémi Munos, Tom Schaul, David Silver:
Successor Features for Transfer in Reinforcement Learning. CoRR abs/1606.05312 (2016) - [i14]Nicolas Heess, Gregory Wayne, Yuval Tassa, Timothy P. Lillicrap, Martin A. Riedmiller, David Silver:
Learning and Transfer of Modulated Locomotor Controllers. CoRR abs/1610.05182 (2016) - [i13]Max Jaderberg, Volodymyr Mnih, Wojciech Marian Czarnecki, Tom Schaul, Joel Z. Leibo, David Silver, Koray Kavukcuoglu:
Reinforcement Learning with Unsupervised Auxiliary Tasks. CoRR abs/1611.05397 (2016) - [i12]David Silver, Hado van Hasselt, Matteo Hessel, Tom Schaul, Arthur Guez, Tim Harley, Gabriel Dulac-Arnold, David P. Reichert, Neil C. Rabinowitz, André Barreto, Thomas Degris:
The Predictron: End-To-End Learning and Planning. CoRR abs/1612.08810 (2016) - 2015
- [j17]Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Andrei A. Rusu, Joel Veness, Marc G. Bellemare, Alex Graves, Martin A. Riedmiller, Andreas Fidjeland, Georg Ostrovski, Stig Petersen, Charles Beattie, Amir Sadik, Ioannis Antonoglou, Helen King, Dharshan Kumaran, Daan Wierstra, Shane Legg, Demis Hassabis:
Human-level control through deep reinforcement learning. Nat. 518(7540): 529-533 (2015) - [c45]John Vines, Peter C. Wright, David Silver, Maggie Winchcombe, Patrick Olivier:
Authenticity, Relatability and Collaborative Approaches to Sharing Knowledge about Assistive Living Technology. CSCW 2015: 82-94 - [c44]Johannes Heinrich, Marc Lanctot, David Silver:
Fictitious Self-Play in Extensive-Form Games. ICML 2015: 805-813 - [c43]Tom Schaul, Daniel Horgan, Karol Gregor, David Silver:
Universal Value Function Approximators. ICML 2015: 1312-1320 - [c42]Johannes Heinrich, David Silver:
Smooth UCT Search in Computer Poker. IJCAI 2015: 554-560 - [c41]David M. Bradley, Jonathan K. Chang, David Silver, Matthew Powers, Herman Herman, Peter Rander, Anthony Stentz:
Scene understanding for a high-mobility walking robot. IROS 2015: 1144-1151 - [c40]Nicolas Heess, Gregory Wayne, David Silver, Timothy P. Lillicrap, Tom Erez, Yuval Tassa:
Learning Continuous Control Policies by Stochastic Value Gradients. NIPS 2015: 2944-2952 - [c39]Chris J. Maddison, Aja Huang, Ilya Sutskever, David Silver:
Move Evaluation in Go Using Deep Convolutional Neural Networks. ICLR (Poster) 2015 - [i11]Kamil Ciosek, David Silver:
Value Iteration with Options and State Aggregation. CoRR abs/1501.03959 (2015) - [i10]Arun Nair, Praveen Srinivasan, Sam Blackwell, Cagdas Alcicek, Rory Fearon, Alessandro De Maria, Vedavyas Panneershelvam, Mustafa Suleyman, Charles Beattie, Stig Petersen, Shane Legg, Volodymyr Mnih, Koray Kavukcuoglu, David Silver:
Massively Parallel Methods for Deep Reinforcement Learning. CoRR abs/1507.04296 (2015) - [i9]Hado van Hasselt, Arthur Guez, David Silver:
Deep Reinforcement Learning with Double Q-learning. CoRR abs/1509.06461 (2015) - [i8]Nicolas Heess, Greg Wayne, David Silver, Timothy P. Lillicrap, Yuval Tassa, Tom Erez:
Learning Continuous Control Policies by Stochastic Value Gradients. CoRR abs/1510.09142 (2015) - [i7]Nicolas Heess, Jonathan J. Hunt, Timothy P. Lillicrap, David Silver:
Memory-based control with recurrent neural networks. CoRR abs/1512.04455 (2015) - 2014
- [c38]David Silver, Guy Lever, Nicolas Heess, Thomas Degris, Daan Wierstra, Martin A. Riedmiller:
Deterministic Policy Gradient Algorithms. ICML 2014: 387-395 - [c37]Arthur Guez, Nicolas Heess, David Silver, Peter Dayan:
Bayes-Adaptive Simulation-based Search with Value Function Approximation. NIPS 2014: 451-459 - [c36]David Silver, Suman Jana, Dan Boneh, Eric Yawei Chen, Collin Jackson:
Password Managers: Attacks and Defenses. USENIX Security Symposium 2014: 449-464 - [c35]Tom Schaul, Ioannis Antonoglou, David Silver:
Unit Tests for Stochastic Optimization. ICLR 2014 - [i6]S. R. K. Branavan, David Silver, Regina Barzilay:
Learning to Win by Reading Manuals in a Monte-Carlo Framework. CoRR abs/1401.5390 (2014) - [i5]Arthur Guez, David Silver, Peter Dayan:
Better Optimism By Bayes: Adaptive Planning with Rich Models. CoRR abs/1402.1958 (2014) - 2013
- [j16]Arthur Guez, David Silver, Peter Dayan:
Scalable and Efficient Bayes-Adaptive Reinforcement Learning Based on Monte-Carlo Tree Search. J. Artif. Intell. Res. 48: 841-883 (2013) - [c34]David Silver, Richard S. Sutton, Martin Müller:
Temporal-Difference Search in Computer Go. ICAPS 2013 - [c33]David Silver, Leonard Newnham, David Barker, Suzanne Weller, Jason McFall:
Concurrent Reinforcement Learning from Customer Interactions. ICML (3) 2013: 924-932 - [i4]Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Alex Graves, Ioannis Antonoglou, Daan Wierstra, Martin A. Riedmiller:
Playing Atari with Deep Reinforcement Learning. CoRR abs/1312.5602 (2013) - 2012
- [j15]Sylvain Gelly, Levente Kocsis, Marc Schoenauer, Michèle Sebag, David Silver, Csaba Szepesvári, Olivier Teytaud:
The grand challenge of computer Go: Monte Carlo tree search and extensions. Commun. ACM 55(3): 106-113 (2012) - [j14]David Silver:
Digital natives on a media fast. Inf. Serv. Use 32(3-4): 137-139 (2012) - [j13]S. R. K. Branavan, David Silver, Regina Barzilay:
Learning to Win by Reading Manuals in a Monte-Carlo Framework. J. Artif. Intell. Res. 43: 661-704 (2012) - [j12]David Silver, Richard S. Sutton, Martin Müller:
Temporal-difference search in computer Go. Mach. Learn. 87(2): 183-219 (2012) - [c32]Nicolas Heess, David Silver, Yee Whye Teh:
Actor-Critic Reinforcement Learning with Energy-Based Policies. EWRL 2012: 43-58 - [c31]David Silver:
Gradient Temporal Difference Networks. EWRL 2012: 117-130 - [c30]David Silver, Kamil Ciosek:
Compositional Planning Using Optimal Option Models. ICML 2012 - [c29]David Silver, J. Andrew Bagnell, Anthony Stentz:
Active learning from demonstration for robust autonomous navigation. ICRA 2012: 200-207 - [c28]David Silver, J. Andrew Bagnell, Anthony Stentz:
Learning Autonomous Driving Styles and Maneuvers from Expert Demonstration. ISER 2012: 371-386 - [c27]Arthur Guez, David Silver, Peter Dayan:
Efficient Bayes-Adaptive Reinforcement Learning using Sample-Based Search. NIPS 2012: 1034-1042 - [i3]Arthur Guez, David Silver, Peter Dayan:
Efficient Bayes-Adaptive Reinforcement Learning using Sample-Based Search. CoRR abs/1205.3109 (2012) - 2011
- [j11]Sylvain Gelly, David Silver:
Monte-Carlo tree search and rapid action value estimation in computer Go. Artif. Intell. 175(11): 1856-1875 (2011) - [j10]Joel Veness, Kee Siong Ng, Marcus Hutter, William T. B. Uther, David Silver:
A Monte-Carlo AIXI Approximation. J. Artif. Intell. Res. 40: 95-142 (2011) - [c26]S. R. K. Branavan, David Silver, Regina Barzilay:
Learning to Win by Reading Manuals in a Monte-Carlo Framework. ACL 2011: 268-277 - [c25]S. R. K. Branavan, David Silver, Regina Barzilay:
Non-Linear Monte-Carlo Search in Civilization II. IJCAI 2011: 2404-2410 - [c24]David Silver, Anthony Stentz:
Monte Carlo Localization and registration to prior data for outdoor navigation. IROS 2011: 510-517 - 2010
- [b1]David Silver:
Learning Preference Models for Autonomous Mobile Robots in Complex Domains. Carnegie Mellon University, USA, 2010 - [j9]David Silver, J. Andrew Bagnell, Anthony Stentz:
Learning from Demonstration for Autonomous Navigation in Complex Unstructured Terrain. Int. J. Robotics Res. 29(12): 1565-1592 (2010) - [j8]J. Andrew Bagnell, David M. Bradley, David Silver, Boris Sofman, Anthony Stentz:
Learning for Autonomous Navigation. IEEE Robotics Autom. Mag. 17(2): 74-84 (2010) - [c23]Joel Veness, Kee Siong Ng, Marcus Hutter, David Silver:
Reinforcement Learning via AIXI Approximation. AAAI 2010: 605-611 - [c22]David Silver, Joel Veness:
Monte-Carlo Planning in Large POMDPs. NIPS 2010: 2164-2172 - [i2]Joel Veness, Kee Siong Ng, Marcus Hutter, David Silver:
Reinforcement Learning via AIXI Approximation. CoRR abs/1007.2049 (2010)
2000 – 2009
- 2009
- [j7]Nathan D. Ratliff, David Silver, J. Andrew Bagnell:
Learning to search: Functional gradient techniques for imitation learning. Auton. Robots 27(1): 25-53 (2009) - [c21]David Silver, J. Andrew Bagnell, Anthony Stentz:
Applied Imitation Learning for Autonomous Navigation in Complex Natural Terrain. FSR 2009: 249-259 - [c20]David Silver, Gerald Tesauro:
Monte-Carlo simulation balancing. ICML 2009: 945-952 - [c19]Richard S. Sutton, Hamid Reza Maei, Doina Precup, Shalabh Bhatnagar, David Silver, Csaba Szepesvári, Eric Wiewiora:
Fast gradient-descent methods for temporal-difference learning with linear function approximation. ICML 2009: 993-1000 - [c18]David Silver, J. Andrew Bagnell, Anthony Stentz:
Perceptual Interpretation for Autonomous Navigation through Dynamic Imitation Learning. ISRR 2009: 433-449 - [c17]Hamid Reza Maei, Csaba Szepesvári, Shalabh Bhatnagar, Doina Precup, David Silver, Richard S. Sutton:
Convergent Temporal-Difference Learning with Arbitrary Smooth Function Approximation. NIPS 2009: 1204-1212 - [c16]Joel Veness, David Silver, William T. B. Uther, Alan Blair:
Bootstrapping from Game Tree Search. NIPS 2009: 1937-1945 - [i1]Joel Veness, Kee Siong Ng, Marcus Hutter, David Silver:
A Monte Carlo AIXI Approximation. CoRR abs/0909.0801 (2009) - 2008
- [j6]David Silver:
History, Hype, and Hope: An Afterward. First Monday 13(3) (2008) - [c15]Sylvain Gelly, David Silver:
Achieving Master Level Play in 9 x 9 Computer Go. AAAI 2008: 1537-1540 - [c14]David Silver, Richard S. Sutton, Martin Müller:
Sample-based learning and search with permanent and transient memories. ICML 2008: 968-975 - [c13]David Silver, James A. Bagnell, Anthony Stentz:
High Performance Outdoor Navigation from Overhead Data using Imitation Learning. Robotics: Science and Systems 2008 - 2007
- [c12]Sylvain Gelly, David Silver:
Combining online and offline knowledge in UCT. ICML 2007: 273-280 - [c11]Richard S. Sutton, Anna Koop, David Silver:
On the role of tracking in stationary environments. ICML 2007: 871-878 - [c10]David Silver, Richard S. Sutton, Martin Müller:
Reinforcement Learning of Local Shape in the Game of Go. IJCAI 2007: 1053-1058 - 2006
- [j5]Aaron Morris, Dave Ferguson, Zachary Omohundro, David M. Bradley, David Silver, Christopher R. Baker, Scott Thayer, Chuck Whittaker, William Whittaker:
Recent developments in subterranean robotics. J. Field Robotics 23(1): 35-57 (2006) - [j4]David Silver, Dave Ferguson, Aaron Morris, Scott Thayer:
Topological exploration of subterranean environments. J. Field Robotics 23(6-7): 395-415 (2006) - [c9]David Silver, Boris Sofman, Nicolas Vandapel, J. Andrew Bagnell, Anthony Stentz:
Experimental Analysis of Overhead Data Processing To Support Long Range Navigation. IROS 2006: 2443-2450 - 2005
- [j3]Brad Lisien, Deryck Morales, David Silver, George Kantor, Ioannis M. Rekleitis, Howie Choset:
The hierarchical atlas. IEEE Trans. Robotics 21(3): 473-481 (2005) - [c8]David Silver:
Cooperative Pathfinding. AIIDE 2005: 117-122 - [c7]David Silver, Joseph Carsten, Scott Thayer:
Topological Global Localization for Subterranean Voids. FSR 2005: 117-128 - [c6]Aaron Morris, David Silver, David I. Ferguson, Scott Thayer:
Towards Topological Exploration of Abandoned Mines. ICRA 2005: 2117-2123 - 2004
- [j2]David Silver:
Internet/Cyberculture/Digital Culture/New Media/Fill-in-the-Blank Studies. New Media Soc. 6(1): 55-64 (2004) - [c5]David Silver, Deryck Morales, Ioannis M. Rekleitis, Brad Lisien, Howie Choset:
Arc Carving: Obtaining Accurate, Low Latency Maps from Ultrasonic Range Sensors. ICRA 2004: 1554-1561 - [c4]David Silver, Dave Ferguson, Aaron Morris, Scott Thayer:
Feature extraction for topological mine maps. IROS 2004: 773-779 - [c3]David Silver, David M. Bradley, Scott Thayer:
Scan matching for flooded subterranean voids. RAM 2004: 422-427 - [c2]David M. Bradley, David Silver, Scott Thayer:
A regional point descriptor for global topological localization in flooded subterranean environments. RAM 2004: 440-445 - 2003
- [c1]Brad Lisien, Deryck Morales, David Silver, George Kantor, Ioannis M. Rekleitis, Howie Choset:
Hierarchical simultaneous localization and mapping. IROS 2003: 448-453 - 2000
- [j1]David Silver:
Book Review: Life Online: Researching Real Experience in Virtual Space. New Media Soc. 2(2): 251-255 (2000)
last updated on 2025-03-04 21:15 CET by the dblp team
all metadata released as open data under CC0 1.0 license