


Rémi Munos
2020 – today
- 2023
- [i92]Mark Rowland, Rémi Munos, Mohammad Gheshlaghi Azar, Yunhao Tang, Georg Ostrovski, Anna Harutyunyan, Karl Tuyls, Marc G. Bellemare, Will Dabney:
An Analysis of Quantile Temporal-Difference Learning. CoRR abs/2301.04462 (2023) - [i91]Daniil Tiapkin, Denis Belomestny, Daniele Calandriello, Eric Moulines, Rémi Munos, Alexey Naumov, Pierre Perrault, Yunhao Tang, Michal Valko, Pierre Ménard:
Fast Rates for Maximum Entropy Exploration. CoRR abs/2303.08059 (2023) - [i90]Yash Chandak, Shantanu Thakoor, Zhaohan Daniel Guo, Yunhao Tang, Rémi Munos, Will Dabney, Diana L. Borsa:
Representations and Exploration for Deep Reinforcement Learning using Singular Value Decomposition. CoRR abs/2305.00654 (2023) - [i89]Toshinori Kitamura, Tadashi Kozuno, Yunhao Tang, Nino Vieillard, Michal Valko, Wenhao Yang, Jincheng Mei, Pierre Ménard, Mohammad Gheshlaghi Azar, Rémi Munos, Olivier Pietquin, Matthieu Geist, Csaba Szepesvári, Wataru Kumagai, Yutaka Matsuo:
Regularization and Variance-Weighted Regression Achieves Minimax Optimality in Linear MDPs: Theory and Practice. CoRR abs/2305.13185 (2023) - 2022
- [c143]Yunhao Tang, Mark Rowland, Rémi Munos, Michal Valko:
Marginalized Operators for Off-policy Reinforcement Learning. AISTATS 2022: 655-679 - [c142]Matthieu Geist, Julien Pérolat, Mathieu Laurière, Romuald Elie, Sarah Perrin, Olivier Bachem, Rémi Munos, Olivier Pietquin:
Concave Utility Reinforcement Learning: The Mean-field Game Viewpoint. AAMAS 2022: 489-497 - [c141]Shantanu Thakoor, Corentin Tallec, Mohammad Gheshlaghi Azar, Mehdi Azabou, Eva L. Dyer, Rémi Munos, Petar Velickovic, Michal Valko:
Large-Scale Representation Learning on Graphs via Bootstrapping. ICLR 2022 - [c140]Shantanu Thakoor, Mark Rowland, Diana Borsa, Will Dabney, Rémi Munos, André Barreto:
Generalised Policy Improvement with Geometric Policy Composition. ICML 2022: 21272-21307 - [c139]Zhaohan Guo, Shantanu Thakoor, Miruna Pislar, Bernardo Ávila Pires, Florent Altché, Corentin Tallec, Alaa Saade, Daniele Calandriello, Jean-Bastien Grill, Yunhao Tang, Michal Valko, Rémi Munos, Mohammad Gheshlaghi Azar, Bilal Piot:
BYOL-Explore: Exploration by Bootstrapped Prediction. NeurIPS 2022 - [c138]Yunhao Tang, Rémi Munos, Mark Rowland, Bernardo Ávila Pires, Will Dabney, Marc G. Bellemare:
The Nature of Temporal Difference Errors in Multi-step Distributional Reinforcement Learning. NeurIPS 2022 - [c137]Daniil Tiapkin, Denis Belomestny, Daniele Calandriello, Eric Moulines, Rémi Munos, Alexey Naumov, Mark Rowland, Michal Valko, Pierre Ménard:
Optimistic Posterior Sampling for Reinforcement Learning with Few Samples and Tight Guarantees. NeurIPS 2022 - [i88]Yunhao Tang, Mark Rowland, Rémi Munos, Michal Valko:
Marginalized Operators for Off-policy Reinforcement Learning. CoRR abs/2203.16177 (2022) - [i87]Tadashi Kozuno, Wenhao Yang, Nino Vieillard, Toshinori Kitamura, Yunhao Tang, Jincheng Mei, Pierre Ménard, Mohammad Gheshlaghi Azar, Michal Valko, Rémi Munos, Olivier Pietquin, Matthieu Geist, Csaba Szepesvári:
KL-Entropy-Regularized RL with a Generative Model is Minimax Optimal. CoRR abs/2205.14211 (2022) - [i86]Zhaohan Daniel Guo, Shantanu Thakoor, Miruna Pislar, Bernardo Ávila Pires, Florent Altché, Corentin Tallec, Alaa Saade, Daniele Calandriello, Jean-Bastien Grill, Yunhao Tang, Michal Valko, Rémi Munos, Mohammad Gheshlaghi Azar, Bilal Piot:
BYOL-Explore: Exploration by Bootstrapped Prediction. CoRR abs/2206.08332 (2022) - [i85]Shantanu Thakoor, Mark Rowland, Diana Borsa, Will Dabney, Rémi Munos, André Barreto:
Generalised Policy Improvement with Geometric Policy Composition. CoRR abs/2206.08736 (2022) - [i84]Julien Pérolat, Bart De Vylder, Daniel Hennes, Eugene Tarassov, Florian Strub, Vincent de Boer, Paul Muller, Jerome T. Connor, Neil Burch, Thomas W. Anthony, Stephen McAleer, Romuald Elie, Sarah H. Cen, Zhe Wang, Audrunas Gruslys, Aleksandra Malysheva, Mina Khan, Sherjil Ozair, Finbarr Timbers, Toby Pohlen, Tom Eccles, Mark Rowland, Marc Lanctot, Jean-Baptiste Lespiau, Bilal Piot, Shayegan Omidshafiei, Edward Lockhart, Laurent Sifre, Nathalie Beauguerlange, Rémi Munos, David Silver, Satinder Singh, Demis Hassabis, Karl Tuyls:
Mastering the Game of Stratego with Model-Free Multiagent Reinforcement Learning. CoRR abs/2206.15378 (2022) - [i83]Yunhao Tang, Mark Rowland, Rémi Munos, Bernardo Ávila Pires, Will Dabney, Marc G. Bellemare:
The Nature of Temporal Difference Errors in Multi-step Distributional Reinforcement Learning. CoRR abs/2207.07570 (2022) - [i82]Daniil Tiapkin, Denis Belomestny, Daniele Calandriello, Eric Moulines, Rémi Munos, Alexey Naumov, Mark Rowland, Michal Valko, Pierre Ménard:
Optimistic Posterior Sampling for Reinforcement Learning with Few Samples and Tight Guarantees. CoRR abs/2209.14414 (2022) - [i81]Daniel Jarrett, Corentin Tallec, Florent Altché, Thomas Mesnard, Rémi Munos, Michal Valko:
Curiosity in hindsight. CoRR abs/2211.10515 (2022) - [i80]Yunhao Tang, Zhaohan Daniel Guo, Pierre Harvey Richemond, Bernardo Ávila Pires, Yash Chandak, Rémi Munos, Mark Rowland, Mohammad Gheshlaghi Azar, Charline Le Lan, Clare Lyle, András György, Shantanu Thakoor, Will Dabney, Bilal Piot, Daniele Calandriello, Michal Valko:
Understanding Self-Predictive Learning for Reinforcement Learning. CoRR abs/2212.03319 (2022) - [i79]Côme Fiegel, Pierre Ménard, Tadashi Kozuno, Rémi Munos, Vianney Perchet, Michal Valko:
Adapting to game trees in zero-sum imperfect information games. CoRR abs/2212.12567 (2022) - 2021
- [j27]Karl Tuyls, Shayegan Omidshafiei, Paul Muller, Zhe Wang, Jerome T. Connor, Daniel Hennes, Ian Graham, William Spearman, Tim Waskett, Dafydd Steele, Pauline Luc, Adrià Recasens, Alexandre Galashov, Gregory Thornton, Romuald Elie, Pablo Sprechmann, Pol Moreno, Kris Cao, Marta Garnelo, Praneet Dutta, Michal Valko, Nicolas Heess, Alex Bridgland, Julien Pérolat, Bart De Vylder, S. M. Ali Eslami, Mark Rowland, Andrew Jaegle, Rémi Munos, Trevor Back, Razia Ahamed, Simon Bouton, Nathalie Beauguerlange, Jackson Broshear, Thore Graepel, Demis Hassabis:
Game Plan: What AI can do for Football, and What Football can do for AI. J. Artif. Intell. Res. 71: 41-88 (2021) - [j26]Prashanth L. A., Nathaniel Korda, Rémi Munos:
Concentration bounds for temporal difference learning with linear function approximation: the case of batch data and uniform sampling. Mach. Learn. 110(3): 559-618 (2021) - [c136]Tadashi Kozuno, Yunhao Tang, Mark Rowland, Rémi Munos, Steven Kapturowski, Will Dabney, Michal Valko, David Abel:
Revisiting Peng's Q(λ) for Modern Reinforcement Learning. ICML 2021: 5794-5804 - [c135]Thomas Mesnard, Theophane Weber, Fabio Viola, Shantanu Thakoor, Alaa Saade, Anna Harutyunyan, Will Dabney, Thomas S. Stepleton, Nicolas Heess, Arthur Guez, Eric Moulines, Marcus Hutter, Lars Buesing, Rémi Munos:
Counterfactual Credit Assignment in Model-Free Reinforcement Learning. ICML 2021: 7654-7664 - [c134]Julien Pérolat, Rémi Munos, Jean-Baptiste Lespiau, Shayegan Omidshafiei, Mark Rowland, Pedro A. Ortega, Neil Burch, Thomas W. Anthony, David Balduzzi, Bart De Vylder, Georgios Piliouras, Marc Lanctot, Karl Tuyls:
From Poincaré Recurrence to Convergence in Imperfect Information Games: Finding Equilibrium via Regularization. ICML 2021: 8525-8535 - [c133]Yunhao Tang, Mark Rowland, Rémi Munos, Michal Valko:
Taylor Expansion of Discount Factors. ICML 2021: 10130-10140 - [c132]Yunhao Tang, Tadashi Kozuno, Mark Rowland, Rémi Munos, Michal Valko:
Unifying Gradient Estimators for Meta-Reinforcement Learning via Off-Policy Evaluation. NeurIPS 2021: 5303-5315 - [c131]Tadashi Kozuno, Pierre Ménard, Rémi Munos, Michal Valko:
Learning in two-player zero-sum partially observable Markov games with perfect recall. NeurIPS 2021: 11987-11998 - [i78]Zhaohan Daniel Guo, Mohammad Gheshlaghi Azar, Alaa Saade, Shantanu Thakoor, Bilal Piot, Bernardo Ávila Pires, Michal Valko, Thomas Mesnard, Tor Lattimore, Rémi Munos:
Geometric Entropic Exploration. CoRR abs/2101.02055 (2021) - [i77]Shantanu Thakoor, Corentin Tallec, Mohammad Gheshlaghi Azar, Rémi Munos, Petar Velickovic, Michal Valko:
Bootstrapped Representation Learning on Graphs. CoRR abs/2102.06514 (2021) - [i76]Tadashi Kozuno, Yunhao Tang, Mark Rowland, Rémi Munos, Steven Kapturowski, Will Dabney, Michal Valko, David Abel:
Revisiting Peng's Q(λ) for Modern Reinforcement Learning. CoRR abs/2103.00107 (2021) - [i75]Matthieu Geist, Julien Pérolat, Mathieu Laurière, Romuald Elie, Sarah Perrin, Olivier Bachem, Rémi Munos, Olivier Pietquin:
Concave Utility Reinforcement Learning: the Mean-field Game viewpoint. CoRR abs/2106.03787 (2021) - [i74]Yunhao Tang, Mark Rowland, Rémi Munos, Michal Valko:
Taylor Expansion of Discount Factors. CoRR abs/2106.06170 (2021) - [i73]Tadashi Kozuno, Pierre Ménard, Rémi Munos, Michal Valko:
Model-Free Learning for Two-Player Zero-Sum Partially Observable Markov Games with Perfect Recall. CoRR abs/2106.06279 (2021) - [i72]Yunhao Tang, Tadashi Kozuno, Mark Rowland, Rémi Munos, Michal Valko:
Unifying Gradient Estimators for Meta-Reinforcement Learning via Off-Policy Evaluation. CoRR abs/2106.13125 (2021) - 2020
- [j25]Will Dabney, Zeb Kurth-Nelson, Naoshige Uchida, Clara Kwon Starkweather, Demis Hassabis, Rémi Munos, Matthew M. Botvinick:
A distributional code for value in dopamine-based reinforcement learning. Nat. 577(7792): 671-675 (2020) - [c130]Mark Rowland, Will Dabney, Rémi Munos:
Adaptive Trade-Offs in Off-Policy Learning. AISTATS 2020: 34-44 - [c129]Mark Rowland, Anna Harutyunyan, Hado van Hasselt, Diana Borsa, Tom Schaul, Rémi Munos, Will Dabney:
Conditional Importance Sampling for Off-Policy Learning. AISTATS 2020: 45-55 - [c128]Daniel Hennes, Dustin Morrill, Shayegan Omidshafiei, Rémi Munos, Julien Pérolat, Marc Lanctot, Audrunas Gruslys, Jean-Baptiste Lespiau, Paavo Parmas, Edgar A. Duéñez-Guzmán, Karl Tuyls:
Neural Replicator Dynamics: Multiagent Learning via Hedging Policy Gradients. AAMAS 2020: 492-501 - [c127]Paul Muller, Shayegan Omidshafiei, Mark Rowland, Karl Tuyls, Julien Pérolat, Siqi Liu, Daniel Hennes, Luke Marris, Marc Lanctot, Edward Hughes, Zhe Wang, Guy Lever, Nicolas Heess, Thore Graepel, Rémi Munos:
A Generalized Training Approach for Multiagent Learning. ICLR 2020 - [c126]Jean-Bastien Grill, Florent Altché, Yunhao Tang, Thomas Hubert, Michal Valko, Ioannis Antonoglou, Rémi Munos:
Monte-Carlo Tree Search as Regularized Policy Optimization. ICML 2020: 3769-3778 - [c125]Zhaohan Daniel Guo, Bernardo Ávila Pires, Bilal Piot, Jean-Bastien Grill, Florent Altché, Rémi Munos, Mohammad Gheshlaghi Azar:
Bootstrap Latent-Predictive Representations for Multitask Reinforcement Learning. ICML 2020: 3875-3886 - [c124]Rémi Munos, Julien Pérolat, Jean-Baptiste Lespiau, Mark Rowland, Bart De Vylder, Marc Lanctot, Finbarr Timbers, Daniel Hennes, Shayegan Omidshafiei, Audrunas Gruslys, Mohammad Gheshlaghi Azar, Edward Lockhart, Karl Tuyls:
Fast computation of Nash Equilibria in Imperfect Information Games. ICML 2020: 7119-7129 - [c123]Yunhao Tang, Michal Valko, Rémi Munos:
Taylor Expansion Policy Optimization. ICML 2020: 9397-9406 - [c122]Jean-Bastien Grill, Florian Strub, Florent Altché, Corentin Tallec, Pierre H. Richemond, Elena Buchatskaya, Carl Doersch, Bernardo Ávila Pires, Zhaohan Guo, Mohammad Gheshlaghi Azar, Bilal Piot, Koray Kavukcuoglu, Rémi Munos, Michal Valko:
Bootstrap Your Own Latent - A New Approach to Self-Supervised Learning. NeurIPS 2020 - [c121]Nino Vieillard, Tadashi Kozuno, Bruno Scherrer, Olivier Pietquin, Rémi Munos, Matthieu Geist:
Leverage the Average: an Analysis of KL Regularization in Reinforcement Learning. NeurIPS 2020 - [i71]Julien Pérolat, Rémi Munos, Jean-Baptiste Lespiau, Shayegan Omidshafiei, Mark Rowland, Pedro A. Ortega, Neil Burch, Thomas W. Anthony, David Balduzzi, Bart De Vylder, Georgios Piliouras, Marc Lanctot, Karl Tuyls:
From Poincaré Recurrence to Convergence in Imperfect Information Games: Finding Equilibrium via Regularization. CoRR abs/2002.08456 (2020) - [i70]Yunhao Tang, Michal Valko, Rémi Munos:
Taylor Expansion Policy Optimization. CoRR abs/2003.06259 (2020) - [i69]Nino Vieillard, Tadashi Kozuno, Bruno Scherrer, Olivier Pietquin, Rémi Munos, Matthieu Geist:
Leverage the Average: an Analysis of Regularization in RL. CoRR abs/2003.14089 (2020) - [i68]Zhaohan Daniel Guo, Bernardo Ávila Pires, Bilal Piot, Jean-Bastien Grill, Florent Altché, Rémi Munos, Mohammad Gheshlaghi Azar:
Bootstrap Latent-Predictive Representations for Multitask Reinforcement Learning. CoRR abs/2004.14646 (2020) - [i67]Shayegan Omidshafiei, Karl Tuyls, Wojciech M. Czarnecki, Francisco C. Santos, Mark Rowland, Jerome T. Connor, Daniel Hennes, Paul Muller, Julien Pérolat, Bart De Vylder, Audrunas Gruslys, Rémi Munos:
Navigating the Landscape of Games. CoRR abs/2005.01642 (2020) - [i66]Jean-Bastien Grill, Florian Strub, Florent Altché, Corentin Tallec, Pierre H. Richemond, Elena Buchatskaya, Carl Doersch, Bernardo Ávila Pires, Zhaohan Daniel Guo, Mohammad Gheshlaghi Azar, Bilal Piot, Koray Kavukcuoglu, Rémi Munos, Michal Valko:
Bootstrap Your Own Latent: A New Approach to Self-Supervised Learning. CoRR abs/2006.07733 (2020) - [i65]Jean-Bastien Grill, Florent Altché, Yunhao Tang, Thomas Hubert, Michal Valko, Ioannis Antonoglou, Rémi Munos:
Monte-Carlo Tree Search as Regularized Policy Optimization. CoRR abs/2007.12509 (2020) - [i64]Audrunas Gruslys, Marc Lanctot, Rémi Munos, Finbarr Timbers, Martin Schmid, Julien Pérolat, Dustin Morrill, Vinícius Flores Zambaldi, Jean-Baptiste Lespiau, John Schultz, Mohammad Gheshlaghi Azar, Michael Bowling, Karl Tuyls:
The Advantage Regret-Matching Actor-Critic. CoRR abs/2008.12234 (2020) - [i63]Karl Tuyls, Shayegan Omidshafiei, Paul Muller, Zhe Wang, Jerome T. Connor, Daniel Hennes, Ian Graham, William Spearman, Tim Waskett, Dafydd Steele, Pauline Luc, Adrià Recasens, Alexandre Galashov, Gregory Thornton, Romuald Elie, Pablo Sprechmann, Pol Moreno, Kris Cao, Marta Garnelo, Praneet Dutta, Michal Valko, Nicolas Heess, Alex Bridgland, Julien Pérolat, Bart De Vylder, S. M. Ali Eslami, Mark Rowland, Andrew Jaegle, Rémi Munos, Trevor Back, Razia Ahamed, Simon Bouton, Nathalie Beauguerlange, Jackson Broshear, Thore Graepel, Demis Hassabis:
Game Plan: What AI can do for Football, and What Football can do for AI. CoRR abs/2011.09192 (2020) - [i62]Thomas Mesnard, Théophane Weber, Fabio Viola, Shantanu Thakoor, Alaa Saade, Anna Harutyunyan, Will Dabney, Tom Stepleton, Nicolas Heess, Arthur Guez, Marcus Hutter, Lars Buesing, Rémi Munos:
Counterfactual Credit Assignment in Model-Free Reinforcement Learning. CoRR abs/2011.09464 (2020)
2010 – 2019
- 2019
- [c120]Anna Harutyunyan, Will Dabney, Diana Borsa, Nicolas Heess, Rémi Munos, Doina Precup:
The Termination Critic. AISTATS 2019: 2231-2240 - [c119]Diana Borsa, Nicolas Heess, Bilal Piot, Siqi Liu, Leonard Hasenclever, Rémi Munos, Olivier Pietquin:
Observational Learning by Reinforcement Learning. AAMAS 2019: 1117-1124 - [c118]Diana Borsa, André Barreto, John Quan, Daniel J. Mankowitz, Hado van Hasselt, Rémi Munos, David Silver, Tom Schaul:
Universal Successor Features Approximators. ICLR (Poster) 2019 - [c117]Steven Kapturowski, Georg Ostrovski, John Quan, Rémi Munos, Will Dabney:
Recurrent Experience Replay in Distributed Reinforcement Learning. ICLR (Poster) 2019 - [c116]Mark Rowland, Robert Dadashi, Saurabh Kumar, Rémi Munos, Marc G. Bellemare, Will Dabney:
Statistics and Samples in Distributional Reinforcement Learning. ICML 2019: 5528-5536 - [c115]Mark Rowland, Shayegan Omidshafiei, Karl Tuyls, Julien Pérolat, Michal Valko, Georgios Piliouras, Rémi Munos:
Multiagent Evaluation under Incomplete Information. NeurIPS 2019: 12270-12282 - [c114]Jean-Bastien Grill, Omar Darwiche Domingues, Pierre Ménard, Rémi Munos, Michal Valko:
Planning in entropy-regularized Markov decision processes and games. NeurIPS 2019: 12383-12392 - [c113]Anna Harutyunyan, Will Dabney, Thomas Mesnard, Mohammad Gheshlaghi Azar, Bilal Piot, Nicolas Heess, Hado van Hasselt, Gregory Wayne, Satinder Singh, Doina Precup, Rémi Munos:
Hindsight Credit Assignment. NeurIPS 2019: 12467-12476 - [i61]Jean-Bastien Grill, Michal Valko, Rémi Munos:
Optimistic optimization of a Brownian. CoRR abs/1901.04884 (2019) - [i60]André Barreto, Diana Borsa, John Quan, Tom Schaul, David Silver, Matteo Hessel, Daniel J. Mankowitz, Augustin Zídek, Rémi Munos:
Transfer in Deep Reinforcement Learning Using Successor Features and Generalised Policy Improvement. CoRR abs/1901.10964 (2019) - [i59]Mohammad Gheshlaghi Azar, Bilal Piot, Bernardo A. Pires, Jean-Bastien Grill, Florent Altché, Rémi Munos:
World Discovery Models. CoRR abs/1902.07685 (2019) - [i58]Mark Rowland, Robert Dadashi, Saurabh Kumar, Rémi Munos, Marc G. Bellemare, Will Dabney:
Statistics and Samples in Distributional Reinforcement Learning. CoRR abs/1902.08102 (2019) - [i57]Anna Harutyunyan, Will Dabney, Diana Borsa, Nicolas Heess, Rémi Munos, Doina Precup:
The Termination Critic. CoRR abs/1902.09996 (2019) - [i56]Shayegan Omidshafiei, Christos H. Papadimitriou, Georgios Piliouras, Karl Tuyls, Mark Rowland, Jean-Baptiste Lespiau, Wojciech M. Czarnecki, Marc Lanctot, Julien Pérolat, Rémi Munos:
α-Rank: Multi-Agent Evaluation by Evolution. CoRR abs/1903.01373 (2019) - [i55]Shayegan Omidshafiei, Daniel Hennes, Dustin Morrill, Rémi Munos, Julien Pérolat, Marc Lanctot, Audrunas Gruslys, Jean-Baptiste Lespiau, Karl Tuyls:
Neural Replicator Dynamics. CoRR abs/1906.00190 (2019) - [i54]Mark Rowland, Shayegan Omidshafiei, Karl Tuyls, Julien Pérolat, Michal Valko, Georgios Piliouras, Rémi Munos:
Multiagent Evaluation under Incomplete Information. CoRR abs/1909.09849 (2019) - [i53]Paul Muller, Shayegan Omidshafiei, Mark Rowland, Karl Tuyls, Julien Pérolat, Siqi Liu, Daniel Hennes, Luke Marris, Marc Lanctot, Edward Hughes, Zhe Wang, Guy Lever, Nicolas Heess, Thore Graepel, Rémi Munos:
A Generalized Training Approach for Multiagent Learning. CoRR abs/1909.12823 (2019) - [i52]Mark Rowland, Will Dabney, Rémi Munos:
Adaptive Trade-Offs in Off-Policy Learning. CoRR abs/1910.07478 (2019) - [i51]Mark Rowland, Anna Harutyunyan, Hado van Hasselt, Diana Borsa, Tom Schaul, Rémi Munos, Will Dabney:
Conditional Importance Sampling for Off-Policy Learning. CoRR abs/1910.07479 (2019) - [i50]Anna Harutyunyan, Will Dabney, Thomas Mesnard, Mohammad Gheshlaghi Azar, Bilal Piot, Nicolas Heess, Hado van Hasselt, Greg Wayne, Satinder Singh, Doina Precup, Rémi Munos:
Hindsight Credit Assignment. CoRR abs/1912.02503 (2019) - 2018
- [j24]Lucian Busoniu, Elod Páll, Rémi Munos:
Continuous-action planning for discounted infinite-horizon nonlinear optimal control with Lipschitz values. Autom. 92: 100-108 (2018) - [j23]Koppány Máthé, Lucian Busoniu, Rémi Munos, Bart De Schutter:
Optimistic planning with an adaptive number of action switches for near-optimal nonlinear control. Eng. Appl. Artif. Intell. 67: 355-367 (2018) - [c112]Will Dabney, Mark Rowland, Marc G. Bellemare, Rémi Munos:
Distributional Reinforcement Learning With Quantile Regression. AAAI 2018: 2892-2901 - [c111]Mark Rowland, Marc G. Bellemare, Will Dabney, Rémi Munos, Yee Whye Teh:
An Analysis of Categorical Distributional Reinforcement Learning. AISTATS 2018: 29-37 - [c110]Abbas Abdolmaleki, Jost Tobias Springenberg, Yuval Tassa, Rémi Munos, Nicolas Heess, Martin A. Riedmiller:
Maximum a Posteriori Policy Optimisation. ICLR (Poster) 2018 - [c109]Meire Fortunato, Mohammad Gheshlaghi Azar, Bilal Piot, Jacob Menick, Matteo Hessel, Ian Osband, Alex Graves, Volodymyr Mnih, Rémi Munos, Demis Hassabis, Olivier Pietquin, Charles Blundell, Shane Legg:
Noisy Networks For Exploration. ICLR (Poster) 2018 - [c108]Audrunas Gruslys, Will Dabney, Mohammad Gheshlaghi Azar, Bilal Piot, Marc G. Bellemare, Rémi Munos:
The Reactor: A fast and sample-efficient Actor-Critic agent for Reinforcement Learning. ICLR (Poster) 2018 - [c107]André Barreto, Diana Borsa, John Quan, Tom Schaul, David Silver, Matteo Hessel, Daniel J. Mankowitz, Augustin Zídek, Rémi Munos:
Transfer in Deep Reinforcement Learning Using Successor Features and Generalised Policy Improvement. ICML 2018: 510-519 - [c106]Will Dabney, Georg Ostrovski, David Silver, Rémi Munos:
Implicit Quantile Networks for Distributional Reinforcement Learning. ICML 2018: 1104-1113 - [c105]Lasse Espeholt, Hubert Soyer, Rémi Munos, Karen Simonyan, Volodymyr Mnih, Tom Ward, Yotam Doron, Vlad Firoiu, Tim Harley, Iain Dunning, Shane Legg, Koray Kavukcuoglu:
IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures. ICML 2018: 1406-1415 - [c104]Arthur Guez, Theophane Weber, Ioannis Antonoglou, Karen Simonyan, Oriol Vinyals, Daan Wierstra, Rémi Munos, David Silver:
Learning to Search with MCTSnets. ICML 2018: 1817-1826 - [c103]Brendan O'Donoghue, Ian Osband, Rémi Munos, Volodymyr Mnih:
The Uncertainty Bellman Equation and Exploration. ICML 2018: 3836-3845 - [c102]Georg Ostrovski, Will Dabney, Rémi Munos:
Autoregressive Quantile Networks for Generative Modeling. ICML 2018: 3933-3942 - [c101]Jean-Bastien Grill, Michal Valko, Rémi Munos:
Optimistic optimization of a Brownian. NeurIPS 2018: 3009-3018 - [c100]Sriram Srinivasan, Marc Lanctot, Vinícius Flores Zambaldi, Julien Pérolat, Karl Tuyls, Rémi Munos, Michael Bowling:
Actor-Critic Policy Optimization in Partially Observable Multiagent Environments. NeurIPS 2018: 3426-3439 - [i49]Lasse Espeholt, Hubert Soyer, Rémi Munos, Karen Simonyan, Volodymyr Mnih, Tom Ward, Yotam Doron, Vlad Firoiu, Tim Harley, Iain Dunning, Shane Legg, Koray Kavukcuoglu:
IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures. CoRR abs/1802.01561 (2018) - [i48]Arthur Guez, Théophane Weber, Ioannis Antonoglou, Karen Simonyan, Oriol Vinyals, Daan Wierstra, Rémi Munos, David Silver:
Learning to Search with MCTSnets. CoRR abs/1802.04697 (2018) - [i47]Chiyuan Zhang, Oriol Vinyals, Rémi Munos, Samy Bengio:
A Study on Overfitting in Deep Reinforcement Learning. CoRR abs/1804.06893 (2018) - [i46]Thomas S. Stepleton, Razvan Pascanu, Will Dabney, Siddhant M. Jayakumar, Hubert Soyer, Rémi Munos:
Low-pass Recurrent Neural Networks - A memory architecture for longer-term correlation discovery. CoRR abs/1805.04955 (2018) - [i45]Tobias Pohlen, Bilal Piot, Todd Hester, Mohammad Gheshlaghi Azar, Dan Horgan, David Budden, Gabriel Barth-Maron, Hado van Hasselt, John Quan, Mel Vecerík, Matteo Hessel, Rémi Munos, Olivier Pietquin:
Observe and Look Further: Achieving Consistent Performance on Atari. CoRR abs/1805.11593 (2018) - [i44]Georg Ostrovski, Will Dabney, Rémi Munos:
Autoregressive Quantile Networks for Generative Modeling. CoRR abs/1806.05575 (2018) - [i43]Abbas Abdolmaleki, Jost Tobias Springenberg, Yuval Tassa, Rémi Munos, Nicolas Heess, Martin A. Riedmiller:
Maximum a Posteriori Policy Optimisation. CoRR abs/1806.06920 (2018) - [i42]Will Dabney, Georg Ostrovski, David Silver, Rémi Munos:
Implicit Quantile Networks for Distributional Reinforcement Learning. CoRR abs/1806.06923 (2018) - [i41]Sriram Srinivasan, Marc Lanctot, Vinícius Flores Zambaldi, Julien Pérolat, Karl Tuyls, Rémi Munos, Michael Bowling:
Actor-Critic Policy Optimization in Partially Observable Multiagent Environments. CoRR abs/1810.09026 (2018) - [i40]Zhaohan Daniel Guo, Mohammad Gheshlaghi Azar, Bilal Piot, Bernardo A. Pires, Toby Pohlen, Rémi Munos:
Neural Predictive Belief Representations. CoRR abs/1811.06407 (2018) - [i39]Diana Borsa, André Barreto, John Quan, Daniel J. Mankowitz, Rémi Munos, Hado van Hasselt, David Silver, Tom Schaul:
Universal Successor Features Approximators. CoRR abs/1812.07626 (2018) - 2017
- [c99]Jane Wang, Zeb Kurth-Nelson, Hubert Soyer, Joel Z. Leibo, Dhruva Tirumala, Rémi Munos, Charles Blundell, Dharshan Kumaran, Matt M. Botvinick:
Learning to reinforcement learn. CogSci 2017 - [c98]Ziyu Wang, Victor Bapst, Nicolas Heess, Volodymyr Mnih, Rémi Munos, Koray Kavukcuoglu, Nando de Freitas:
Sample Efficient Actor-Critic with Experience Replay. ICLR (Poster) 2017 - [c97]Brendan O'Donoghue, Rémi Munos, Koray Kavukcuoglu, Volodymyr Mnih:
Combining policy gradient and Q-learning. ICLR (Poster) 2017 - [c96]Mohammad Gheshlaghi Azar, Ian Osband, Rémi Munos:
Minimax Regret Bounds for Reinforcement Learning. ICML 2017: 263-272 - [c95]Marc G. Bellemare, Will Dabney, Rémi Munos:
A Distributional Perspective on Reinforcement Learning. ICML 2017: 449-458 - [c94]Alex Graves, Marc G. Bellemare, Jacob Menick, Rémi Munos, Koray Kavukcuoglu:
Automated Curriculum Learning for Neural Networks. ICML 2017: 1311-1320 - [c93]Georg Ostrovski, Marc G. Bellemare, Aäron van den Oord, Rémi Munos:
Count-Based Exploration with Neural Density Models. ICML 2017: 2721-2730 - [c92]André Barreto, Will Dabney, Rémi Munos, Jonathan J. Hunt, Tom Schaul, David Silver, Hado van Hasselt:
Successor Features for Transfer in Reinforcement Learning. NIPS 2017: 4055-4065 - [i38]Georg Ostrovski, Marc G. Bellemare, Aäron van den Oord, Rémi Munos:
Count-Based Exploration with Neural Density Models. CoRR abs/1703.01310 (2017) - [i37]Mohammad Gheshlaghi Azar, Ian Osband, Rémi Munos:
Minimax Regret Bounds for Reinforcement Learning. CoRR abs/1703.05449 (2017) - [i36]Alex Graves, Marc G. Bellemare, Jacob Menick, Rémi Munos, Koray Kavukcuoglu:
Automated Curriculum Learning for Neural Networks. CoRR abs/1704.03003 (2017) - [i35]Audrunas Gruslys, Mohammad Gheshlaghi Azar, Marc G. Bellemare, Rémi Munos:
The Reactor: A Sample-Efficient Actor-Critic Architecture. CoRR abs/1704.04651 (2017) - [i34]Marc G. Bellemare, Ivo Danihelka, Will Dabney, Shakir Mohamed, Balaji Lakshminarayanan, Stephan Hoyer, Rémi Munos:
The Cramer Distance as a Solution to Biased Wasserstein Gradients. CoRR abs/1705.10743 (2017) - [i33]Diana Borsa, Bilal Piot, Rémi Munos, Olivier Pietquin:
Observational Learning by Reinforcement Learning. CoRR abs/1706.06617 (2017) - [i32]