Rémi Munos

2020 – today

2025
[c164]
  - https://dblp.org/rec/conf/icml/FarebrotherPTML25
Jesse Farebrother, Matteo Pirotta, Andrea Tirinzoni, Rémi Munos, Alessandro Lazaric, Ahmed Touati:
Temporal Difference Flows. ICML 2025
[c163]
  - https://dblp.org/rec/conf/icml/TangZSM25
Yunhao Tang, Kunhao Zheng, Gabriel Synnaeve, Rémi Munos:
Optimizing Language Models for Inference Time Objectives using Reinforcement Learning. ICML 2025
[i120]
  - https://dblp.org/rec/journals/corr/abs-2501-13028
Bernardo Ávila Pires, Mark Rowland, Diana Borsa, Zhaohan Daniel Guo, Khimya Khetarpal, André Barreto, David Abel, Rémi Munos, Will Dabney:
Optimizing Return Distributions with Distributional Dynamic Programming. CoRR abs/2501.13028 (2025)
[i119]
  - https://dblp.org/rec/journals/corr/abs-2503-05453
Taco Cohen, David W. Zhang, Kunhao Zheng, Yunhao Tang, Rémi Munos, Gabriel Synnaeve:
Soft Policy Optimization: Online Off-Policy RL for Sequence Models. CoRR abs/2503.05453 (2025)
[i118]
  - https://dblp.org/rec/journals/corr/abs-2503-09817
Jesse Farebrother, Matteo Pirotta, Andrea Tirinzoni, Rémi Munos, Alessandro Lazaric, Ahmed Touati:
Temporal Difference Flows. CoRR abs/2503.09817 (2025)
[i117]
  - https://dblp.org/rec/journals/corr/abs-2503-19595
Yunhao Tang, Kunhao Zheng, Gabriel Synnaeve, Rémi Munos:
Optimizing Language Models for Inference Time Objectives using Reinforcement Learning. CoRR abs/2503.19595 (2025)
[i116]
  - https://dblp.org/rec/journals/corr/abs-2503-19612
Yunhao Tang, Taco Cohen, David W. Zhang, Michal Valko, Rémi Munos:
RL-finetuning LLMs from on- and off-policy data with a single algorithm. CoRR abs/2503.19612 (2025)
[i115]
  - https://dblp.org/rec/journals/corr/abs-2503-19618
Yunhao Tang, Sid Wang, Rémi Munos:
Learning to chain-of-thought with Jensen's evidence lower bound. CoRR abs/2503.19618 (2025)
[i114]
  - https://dblp.org/rec/journals/corr/abs-2506-09477
Yunhao Tang, Rémi Munos:
On a few pitfalls in KL divergence gradient estimation for RL. CoRR abs/2506.09477 (2025)
[i113]
  - https://dblp.org/rec/journals/corr/abs-2506-20520
Charles Arnal, Gaëtan Narozniak, Vivien Cabannes, Yunhao Tang, Julia Kempe, Rémi Munos:
Asymmetric REINFORCE for off-Policy Reinforcement Learning: Balancing positive and negative rewards. CoRR abs/2506.20520 (2025)
[i112]
  - https://dblp.org/rec/journals/corr/abs-2509-06941
Yuda Song, Julia Kempe, Rémi Munos:
Outcome-based Exploration for LLM Reasoning. CoRR abs/2509.06941 (2025)
[i111]
  - https://dblp.org/rec/journals/corr/abs-2509-12635
Yu Wang, Sheng Shen, Rémi Munos, Hongyuan Zhan, Yuandong Tian:
Positional Encoding via Token-Aware Phase Attention. CoRR abs/2509.12635 (2025)
[i110]
  - https://dblp.org/rec/journals/corr/abs-2511-21638
Daniel R. Jiang, Jalaj Bhandari, Yukai Yang, Rémi Munos, Tyler Lu:
Aligning LLMs Toward Multi-Turn Conversational Outcomes Using Iterative PPO. CoRR abs/2511.21638 (2025)
[i109]
  - https://dblp.org/rec/journals/corr/abs-2512-20806
Anselm Paulus, Ilia Kulikov, Brandon Amos, Rémi Munos, Ivan Evtimov, Kamalika Chaudhuri, Arman Zharmagambetov:
Safety Alignment of LMs via Non-cooperative Games. CoRR abs/2512.20806 (2025)
2024
[j28]
  - https://dblp.org/rec/journals/jmlr/RowlandMATOHTBD24
Mark Rowland, Rémi Munos, Mohammad Gheshlaghi Azar, Yunhao Tang, Georg Ostrovski, Anna Harutyunyan, Karl Tuyls, Marc G. Bellemare, Will Dabney:
An Analysis of Quantile Temporal-Difference Learning. J. Mach. Learn. Res. 25: 163:1-163:47 (2024)
[c162]
  - https://dblp.org/rec/conf/aistats/AzarGPMRVC24
Mohammad Gheshlaghi Azar, Zhaohan Daniel Guo, Bilal Piot, Rémi Munos, Mark Rowland, Michal Valko, Daniele Calandriello:
A General Theoretical Paradigm to Understand Learning from Human Preferences. AISTATS 2024: 4447-4455
[c161]
  - https://dblp.org/rec/conf/icml/CalandrielloGMR24
Daniele Calandriello, Zhaohan Daniel Guo, Rémi Munos, Mark Rowland, Yunhao Tang, Bernardo Ávila Pires, Pierre Harvey Richemond, Charline Le Lan, Michal Valko, Tianqi Liu, Rishabh Joshi, Zeyu Zheng, Bilal Piot:
Human Alignment of Large Language Models through Online Preference Optimisation. ICML 2024: 5409-5435
[c160]
  - https://dblp.org/rec/conf/icml/MunosVCARGTGMFM24
Rémi Munos, Michal Valko, Daniele Calandriello, Mohammad Gheshlaghi Azar, Mark Rowland, Daniel Guo, Yunhao Tang, Matthieu Geist, Thomas Mesnard, Côme Fiegel, Andrea Michi, Marco Selvi, Sertan Girgin, Nikola Momchev, Olivier Bachem, Daniel J. Mankowitz, Doina Precup, Bilal Piot:
Nash Learning from Human Feedback. ICML 2024: 36743-36768
[c159]
  - https://dblp.org/rec/conf/icml/TangGZCMRRVPP24
Yunhao Tang, Zhaohan Daniel Guo, Zeyu Zheng, Daniele Calandriello, Rémi Munos, Mark Rowland, Pierre Harvey Richemond, Michal Valko, Bernardo Ávila Pires, Bilal Piot:
Generalized Preference Optimization: A Unified Approach to Offline Alignment. ICML 2024: 47725-47742
[c158]
  - https://dblp.org/rec/conf/nips/0001LMLTD24
Mark Rowland, Li Kevin Wenliang, Rémi Munos, Clare Lyle, Yunhao Tang, Will Dabney:
Near-Minimax-Optimal Distributional Reinforcement Learning with a Generative Model. NeurIPS 2024
[c157]
  - https://dblp.org/rec/conf/nips/FiegelMKMPV24
Côme Fiegel, Pierre Ménard, Tadashi Kozuno, Rémi Munos, Vianney Perchet, Michal Valko:
Local and Adaptive Mirror Descents in Extensive-Form Games. NeurIPS 2024
[c156]
  - https://dblp.org/rec/conf/nips/Shani0CLCZNKPSH24
Lior Shani, Aviv Rosenberg, Asaf Cassel, Oran Lang, Daniele Calandriello, Avital Zipori, Hila Noga, Orgad Keller, Bilal Piot, Idan Szpektor, Avinatan Hassidim, Yossi Matias, Rémi Munos:
Multi-turn Reinforcement Learning with Preference Human Feedback. NeurIPS 2024
[i108]
  - https://dblp.org/rec/journals/corr/abs-2402-05749
Yunhao Tang, Zhaohan Daniel Guo, Zeyu Zheng, Daniele Calandriello, Rémi Munos, Mark Rowland, Pierre Harvey Richemond, Michal Valko, Bernardo Ávila Pires, Bilal Piot:
Generalized Preference Optimization: A Unified Approach to Offline Alignment. CoRR abs/2402.05749 (2024)
[i107]
  - https://dblp.org/rec/journals/corr/abs-2402-05766
Yunhao Tang, Mark Rowland, Rémi Munos, Bernardo Ávila Pires, Will Dabney:
Off-policy Distributional Q(λ): Distributional RL without Importance Sampling. CoRR abs/2402.05766 (2024)
[i106]
  - https://dblp.org/rec/journals/corr/abs-2402-07598
Mark Rowland, Li Kevin Wenliang, Rémi Munos, Clare Lyle, Yunhao Tang, Will Dabney:
Near-Minimax-Optimal Distributional Reinforcement Learning with a Generative Model. CoRR abs/2402.07598 (2024)
[i105]
  - https://dblp.org/rec/journals/corr/abs-2403-08635
Daniele Calandriello, Daniel Guo, Rémi Munos, Mark Rowland, Yunhao Tang, Bernardo Ávila Pires, Pierre Harvey Richemond, Charline Le Lan, Michal Valko, Tianqi Liu, Rishabh Joshi, Zeyu Zheng, Bilal Piot:
Human Alignment of Large Language Models through Online Preference Optimisation. CoRR abs/2403.08635 (2024)
[i104]
  - https://dblp.org/rec/journals/corr/abs-2405-04407
Laurent Orseau, Rémi Munos:
Super-Exponential Regret for UCT, AlphaGo and Variants. CoRR abs/2405.04407 (2024)
[i103]
  - https://dblp.org/rec/journals/corr/abs-2405-08448
Yunhao Tang, Zhaohan Daniel Guo, Zeyu Zheng, Daniele Calandriello, Yuan Cao, Eugene Tarassov, Rémi Munos, Bernardo Ávila Pires, Michal Valko, Yong Cheng, Will Dabney:
Understanding the performance gap between online and offline alignment algorithms. CoRR abs/2405.08448 (2024)
[i102]
  - https://dblp.org/rec/journals/corr/abs-2405-14655
Lior Shani, Aviv Rosenberg, Asaf B. Cassel, Oran Lang, Daniele Calandriello, Avital Zipori, Hila Noga, Orgad Keller, Bilal Piot, Idan Szpektor, Avinatan Hassidim, Yossi Matias, Rémi Munos:
Multi-turn Reinforcement Learning from Preference Human Feedback. CoRR abs/2405.14655 (2024)
[i101]
  - https://dblp.org/rec/journals/corr/abs-2405-19107
Pierre Harvey Richemond, Yunhao Tang, Daniel Guo, Daniele Calandriello, Mohammad Gheshlaghi Azar, Rafael Rafailov, Bernardo Ávila Pires, Eugene Tarassov, Lucas Spangher, Will Ellsworth, Aliaksei Severyn, Jonathan Mallinson, Lior Shani, Gil Shamir, Rishabh Joshi, Tianqi Liu, Rémi Munos, Bilal Piot:
Offline Regularised Reinforcement Learning for Large Language Models Alignment. CoRR abs/2405.19107 (2024)
2023
[c155]
  - https://dblp.org/rec/conf/icml/ChandakTGTMDB23
Yash Chandak, Shantanu Thakoor, Zhaohan Daniel Guo, Yunhao Tang, Rémi Munos, Will Dabney, Diana L. Borsa:
Representations and Exploration for Deep Reinforcement Learning using Singular Value Decomposition. ICML 2023: 4009-4034
[c154]
  - https://dblp.org/rec/conf/icml/FiegelMKMPV23
Côme Fiegel, Pierre Ménard, Tadashi Kozuno, Rémi Munos, Vianney Perchet, Michal Valko:
Adapting to game trees in zero-sum imperfect information games. ICML 2023: 10093-10135
[c153]
  - https://dblp.org/rec/conf/icml/JarrettTAMMV23
Daniel Jarrett, Corentin Tallec, Florent Altché, Thomas Mesnard, Rémi Munos, Michal Valko:
Curiosity in Hindsight: Intrinsic Exploration in Stochastic Environments. ICML 2023: 14780-14816
[c152]
  - https://dblp.org/rec/conf/icml/KitamuraKTVVYMM23
Toshinori Kitamura, Tadashi Kozuno, Yunhao Tang, Nino Vieillard, Michal Valko, Wenhao Yang, Jincheng Mei, Pierre Ménard, Mohammad Gheshlaghi Azar, Rémi Munos, Olivier Pietquin, Matthieu Geist, Csaba Szepesvári, Wataru Kumagai, Yutaka Matsuo:
Regularization and Variance-Weighted Regression Achieves Minimax Optimality in Linear MDPs: Theory and Practice. ICML 2023: 17135-17175
[c151]
  - https://dblp.org/rec/conf/icml/MesnardCSTRWLGV23
Thomas Mesnard, Wenqi Chen, Alaa Saade, Yunhao Tang, Mark Rowland, Theophane Weber, Clare Lyle, Audrunas Gruslys, Michal Valko, Will Dabney, Georg Ostrovski, Eric Moulines, Rémi Munos:
Quantile Credit Assignment. ICML 2023: 24517-24531
[c150]
  - https://dblp.org/rec/conf/icml/RowlandTLMBD23
Mark Rowland, Yunhao Tang, Clare Lyle, Rémi Munos, Marc G. Bellemare, Will Dabney:
The Statistical Benefits of Quantile Temporal-Difference Learning for Value Estimation. ICML 2023: 29210-29231
[c149]
  - https://dblp.org/rec/conf/icml/TangGRPCMRALL0T23
Yunhao Tang, Zhaohan Daniel Guo, Pierre Harvey Richemond, Bernardo Ávila Pires, Yash Chandak, Rémi Munos, Mark Rowland, Mohammad Gheshlaghi Azar, Charline Le Lan, Clare Lyle, András György, Shantanu Thakoor, Will Dabney, Bilal Piot, Daniele Calandriello, Michal Valko:
Understanding Self-Predictive Learning for Reinforcement Learning. ICML 2023: 33632-33656
[c148]
  - https://dblp.org/rec/conf/icml/TangKRHMPV23
Yunhao Tang, Tadashi Kozuno, Mark Rowland, Anna Harutyunyan, Rémi Munos, Bernardo Ávila Pires, Michal Valko:
DoMo-AC: Doubly Multi-step Off-policy Actor-Critic Algorithm. ICML 2023: 33657-33673
[c147]
  - https://dblp.org/rec/conf/icml/TangM23
Yunhao Tang, Rémi Munos:
Towards a better understanding of representation dynamics under TD-learning. ICML 2023: 33720-33738
[c146]
  - https://dblp.org/rec/conf/icml/TangMRV23
Yunhao Tang, Rémi Munos, Mark Rowland, Michal Valko:
VA-learning as a more efficient alternative to Q-learning. ICML 2023: 33739-33757
[c145]
  - https://dblp.org/rec/conf/icml/TiapkinBCMMNPTV23
Daniil Tiapkin, Denis Belomestny, Daniele Calandriello, Eric Moulines, Rémi Munos, Alexey Naumov, Pierre Perrault, Yunhao Tang, Michal Valko, Pierre Ménard:
Fast Rates for Maximum Entropy Exploration. ICML 2023: 34161-34221
[c144]
  - https://dblp.org/rec/conf/nips/TiapkinBCMMNPVM23
Daniil Tiapkin, Denis Belomestny, Daniele Calandriello, Eric Moulines, Rémi Munos, Alexey Naumov, Pierre Perrault, Michal Valko, Pierre Ménard:
Model-free Posterior Sampling via Learning Rate Randomization. NeurIPS 2023
[i100]
  - https://dblp.org/rec/journals/corr/abs-2301-04462
Mark Rowland, Rémi Munos, Mohammad Gheshlaghi Azar, Yunhao Tang, Georg Ostrovski, Anna Harutyunyan, Karl Tuyls, Marc G. Bellemare, Will Dabney:
An Analysis of Quantile Temporal-Difference Learning. CoRR abs/2301.04462 (2023)
[i99]
  - https://dblp.org/rec/journals/corr/abs-2303-08059
Daniil Tiapkin, Denis Belomestny, Daniele Calandriello, Eric Moulines, Rémi Munos, Alexey Naumov, Pierre Perrault, Yunhao Tang, Michal Valko, Pierre Ménard:
Fast Rates for Maximum Entropy Exploration. CoRR abs/2303.08059 (2023)
[i98]
  - https://dblp.org/rec/journals/corr/abs-2305-00654
Yash Chandak, Shantanu Thakoor, Zhaohan Daniel Guo, Yunhao Tang, Rémi Munos, Will Dabney, Diana L. Borsa:
Representations and Exploration for Deep Reinforcement Learning using Singular Value Decomposition. CoRR abs/2305.00654 (2023)
[i97]
  - https://dblp.org/rec/journals/corr/abs-2305-13185
Toshinori Kitamura, Tadashi Kozuno, Yunhao Tang, Nino Vieillard, Michal Valko, Wenhao Yang, Jincheng Mei, Pierre Ménard, Mohammad Gheshlaghi Azar, Rémi Munos, Olivier Pietquin, Matthieu Geist, Csaba Szepesvári, Wataru Kumagai, Yutaka Matsuo:
Regularization and Variance-Weighted Regression Achieves Minimax Optimality in Linear MDPs: Theory and Practice. CoRR abs/2305.13185 (2023)
[i96]
  - https://dblp.org/rec/journals/corr/abs-2305-18161
Yunhao Tang, Rémi Munos, Mark Rowland, Michal Valko:
VA-learning as a more efficient alternative to Q-learning. CoRR abs/2305.18161 (2023)
[i95]
  - https://dblp.org/rec/journals/corr/abs-2305-18388
Mark Rowland, Yunhao Tang, Clare Lyle, Rémi Munos, Marc G. Bellemare, Will Dabney:
The Statistical Benefits of Quantile Temporal-Difference Learning for Value Estimation. CoRR abs/2305.18388 (2023)
[i94]
  - https://dblp.org/rec/journals/corr/abs-2305-18491
Yunhao Tang, Rémi Munos:
Towards a Better Understanding of Representation Dynamics under TD-learning. CoRR abs/2305.18491 (2023)
[i93]
  - https://dblp.org/rec/journals/corr/abs-2305-18501
Yunhao Tang, Tadashi Kozuno, Mark Rowland, Anna Harutyunyan, Rémi Munos, Bernardo Ávila Pires, Michal Valko:
DoMo-AC: Doubly Multi-step Off-policy Actor-Critic Algorithm. CoRR abs/2305.18501 (2023)
[i92]
  - https://dblp.org/rec/journals/corr/abs-2309-00656
Côme Fiegel, Pierre Ménard, Tadashi Kozuno, Rémi Munos, Vianney Perchet, Michal Valko:
Local and adaptive mirror descents in extensive-form games. CoRR abs/2309.00656 (2023)
[i91]
  - https://dblp.org/rec/journals/corr/abs-2310-12036
Mohammad Gheshlaghi Azar, Mark Rowland, Bilal Piot, Daniel Guo, Daniele Calandriello, Michal Valko, Rémi Munos:
A General Theoretical Paradigm to Understand Learning from Human Preferences. CoRR abs/2310.12036 (2023)
[i90]
  - https://dblp.org/rec/journals/corr/abs-2310-18186
Daniil Tiapkin, Denis Belomestny, Daniele Calandriello, Eric Moulines, Rémi Munos, Alexey Naumov, Pierre Perrault, Michal Valko, Pierre Ménard:
Model-free Posterior Sampling via Learning Rate Randomization. CoRR abs/2310.18186 (2023)
[i89]
  - https://dblp.org/rec/journals/corr/abs-2312-00886
Rémi Munos, Michal Valko, Daniele Calandriello, Mohammad Gheshlaghi Azar, Mark Rowland, Zhaohan Daniel Guo, Yunhao Tang, Matthieu Geist, Thomas Mesnard, Andrea Michi, Marco Selvi, Sertan Girgin, Nikola Momchev, Olivier Bachem, Daniel J. Mankowitz, Doina Precup, Bilal Piot:
Nash Learning from Human Feedback. CoRR abs/2312.00886 (2023)
2022
[c143]
  - https://dblp.org/rec/conf/aistats/TangRMV22
Yunhao Tang, Mark Rowland, Rémi Munos, Michal Valko:
Marginalized Operators for Off-policy Reinforcement Learning. AISTATS 2022: 655-679
[c142]
  - https://dblp.org/rec/conf/atal/GeistPLEPBMP22
Matthieu Geist, Julien Pérolat, Mathieu Laurière, Romuald Elie, Sarah Perrin, Olivier Bachem, Rémi Munos, Olivier Pietquin:
Concave Utility Reinforcement Learning: The Mean-field Game Viewpoint. AAMAS 2022: 489-497
[c141]
  - https://dblp.org/rec/conf/iclr/ThakoorTAADMVV22
Shantanu Thakoor, Corentin Tallec, Mohammad Gheshlaghi Azar, Mehdi Azabou, Eva L. Dyer, Rémi Munos, Petar Velickovic, Michal Valko:
Large-Scale Representation Learning on Graphs via Bootstrapping. ICLR 2022
[c140]
  - https://dblp.org/rec/conf/icml/ThakoorRBDM022
Shantanu Thakoor, Mark Rowland, Diana Borsa, Will Dabney, Rémi Munos, André Barreto:
Generalised Policy Improvement with Geometric Policy Composition. ICML 2022: 21272-21307
[c139]
  - https://dblp.org/rec/conf/nips/GuoTPPATSCGTVMA22
Zhaohan Guo, Shantanu Thakoor, Miruna Pislar, Bernardo Ávila Pires, Florent Altché, Corentin Tallec, Alaa Saade, Daniele Calandriello, Jean-Bastien Grill, Yunhao Tang, Michal Valko, Rémi Munos, Mohammad Gheshlaghi Azar, Bilal Piot:
BYOL-Explore: Exploration by Bootstrapped Prediction. NeurIPS 2022
[c138]
  - https://dblp.org/rec/conf/nips/TangMRPDB22
Yunhao Tang, Rémi Munos, Mark Rowland, Bernardo Ávila Pires, Will Dabney, Marc G. Bellemare:
The Nature of Temporal Difference Errors in Multi-step Distributional Reinforcement Learning. NeurIPS 2022
[c137]
  - https://dblp.org/rec/conf/nips/TiapkinBCMMNRVM22
Daniil Tiapkin, Denis Belomestny, Daniele Calandriello, Eric Moulines, Rémi Munos, Alexey Naumov, Mark Rowland, Michal Valko, Pierre Ménard:
Optimistic Posterior Sampling for Reinforcement Learning with Few Samples and Tight Guarantees. NeurIPS 2022
[d1]
  - https://dblp.org/rec/data/10/PerolatVHTSBMCBAMECWGMK22
Julien Pérolat, Bart De Vylder, Daniel Hennes, Eugene Tarassov, Florian Strub, Vincent de Boer, Paul Muller, Jerome T. Connor, Neil Burch, Thomas Anthony, Stephen McAleer, Romuald Elie, Sarah H. Cen, Zhe Wang, Audrunas Gruslys, Aleksandra Malysheva, Mina Khan, Sherjil Ozair, Finbarr Timbers, Toby Pohlen, Tom Eccles, Mark Rowland, Marc Lanctot, Jean-Baptiste Lespiau, Bilal Piot, Shayegan Omidshafiei, Edward Lockhart, Laurent Sifre, Nathalie Beauguerlange, Rémi Munos, David Silver, Satinder Singh, Demis Hassabis, Karl Tuyls:
Figure Data for the paper "Mastering the Game of Stratego with Model-Free Multiagent Reinforcement Learning". Zenodo, 2022
[i88]
  - https://dblp.org/rec/journals/corr/abs-2203-16177
Yunhao Tang, Mark Rowland, Rémi Munos, Michal Valko:
Marginalized Operators for Off-policy Reinforcement Learning. CoRR abs/2203.16177 (2022)
[i87]
  - https://dblp.org/rec/journals/corr/abs-2205-14211
Tadashi Kozuno, Wenhao Yang, Nino Vieillard, Toshinori Kitamura, Yunhao Tang, Jincheng Mei, Pierre Ménard, Mohammad Gheshlaghi Azar, Michal Valko, Rémi Munos, Olivier Pietquin, Matthieu Geist, Csaba Szepesvári:
KL-Entropy-Regularized RL with a Generative Model is Minimax Optimal. CoRR abs/2205.14211 (2022)
[i86]
  - https://dblp.org/rec/journals/corr/abs-2206-08332
Zhaohan Daniel Guo, Shantanu Thakoor, Miruna Pislar, Bernardo Ávila Pires, Florent Altché, Corentin Tallec, Alaa Saade, Daniele Calandriello, Jean-Bastien Grill, Yunhao Tang, Michal Valko, Rémi Munos, Mohammad Gheshlaghi Azar, Bilal Piot:
BYOL-Explore: Exploration by Bootstrapped Prediction. CoRR abs/2206.08332 (2022)
[i85]
  - https://dblp.org/rec/journals/corr/abs-2206-08736
Shantanu Thakoor, Mark Rowland, Diana Borsa, Will Dabney, Rémi Munos, André Barreto:
Generalised Policy Improvement with Geometric Policy Composition. CoRR abs/2206.08736 (2022)
[i84]
  - https://dblp.org/rec/journals/corr/abs-2206-15378
Julien Pérolat, Bart De Vylder, Daniel Hennes, Eugene Tarassov, Florian Strub, Vincent de Boer, Paul Muller, Jerome T. Connor, Neil Burch, Thomas W. Anthony, Stephen McAleer, Romuald Elie, Sarah H. Cen, Zhe Wang, Audrunas Gruslys, Aleksandra Malysheva, Mina Khan, Sherjil Ozair, Finbarr Timbers, Toby Pohlen, Tom Eccles, Mark Rowland, Marc Lanctot, Jean-Baptiste Lespiau, Bilal Piot, Shayegan Omidshafiei, Edward Lockhart, Laurent Sifre, Nathalie Beauguerlange, Rémi Munos, David Silver, Satinder Singh, Demis Hassabis, Karl Tuyls:
Mastering the Game of Stratego with Model-Free Multiagent Reinforcement Learning. CoRR abs/2206.15378 (2022)
[i83]
- dblp key: journals/corr/abs-2207-07570
- persistent URL: https://dblp.org/rec/journals/corr/abs-2207-07570
Yunhao Tang, Mark Rowland, Rémi Munos, Bernardo Ávila Pires, Will Dabney, Marc G. Bellemare:
The Nature of Temporal Difference Errors in Multi-step Distributional Reinforcement Learning. CoRR abs/2207.07570 (2022)
[i82]
- dblp key: journals/corr/abs-2209-14414
- persistent URL: https://dblp.org/rec/journals/corr/abs-2209-14414
Daniil Tiapkin, Denis Belomestny, Daniele Calandriello, Eric Moulines, Rémi Munos, Alexey Naumov, Mark Rowland, Michal Valko, Pierre Ménard:
Optimistic Posterior Sampling for Reinforcement Learning with Few Samples and Tight Guarantees. CoRR abs/2209.14414 (2022)
[i81]
- dblp key: journals/corr/abs-2211-10515
- persistent URL: https://dblp.org/rec/journals/corr/abs-2211-10515
Daniel Jarrett, Corentin Tallec, Florent Altché, Thomas Mesnard, Rémi Munos, Michal Valko:
Curiosity in hindsight. CoRR abs/2211.10515 (2022)
[i80]
- dblp key: journals/corr/abs-2212-03319
- persistent URL: https://dblp.org/rec/journals/corr/abs-2212-03319
Yunhao Tang, Zhaohan Daniel Guo, Pierre Harvey Richemond, Bernardo Ávila Pires, Yash Chandak, Rémi Munos, Mark Rowland, Mohammad Gheshlaghi Azar, Charline Le Lan, Clare Lyle, András György, Shantanu Thakoor, Will Dabney, Bilal Piot, Daniele Calandriello, Michal Valko:
Understanding Self-Predictive Learning for Reinforcement Learning. CoRR abs/2212.03319 (2022)
[i79]
- dblp key: journals/corr/abs-2212-12567
- persistent URL: https://dblp.org/rec/journals/corr/abs-2212-12567
Côme Fiegel, Pierre Ménard, Tadashi Kozuno, Rémi Munos, Vianney Perchet, Michal Valko:
Adapting to game trees in zero-sum imperfect information games. CoRR abs/2212.12567 (2022)
2021
[j27]
- dblp key: journals/jair/TuylsOMWCHGSWSL21
- persistent URL: https://dblp.org/rec/journals/jair/TuylsOMWCHGSWSL21
Karl Tuyls, Shayegan Omidshafiei, Paul Muller, Zhe Wang, Jerome T. Connor, Daniel Hennes, Ian Graham, William Spearman, Tim Waskett, Dafydd Steele, Pauline Luc, Adrià Recasens, Alexandre Galashov, Gregory Thornton, Romuald Elie, Pablo Sprechmann, Pol Moreno, Kris Cao, Marta Garnelo, Praneet Dutta, Michal Valko, Nicolas Heess, Alex Bridgland, Julien Pérolat, Bart De Vylder, S. M. Ali Eslami, Mark Rowland, Andrew Jaegle, Rémi Munos, Trevor Back, Razia Ahamed, Simon Bouton, Nathalie Beauguerlange, Jackson Broshear, Thore Graepel, Demis Hassabis:
Game Plan: What AI can do for Football, and What Football can do for AI. J. Artif. Intell. Res. 71: 41-88 (2021)
[j26]
- dblp key: journals/ml/AKM21
- persistent URL: https://dblp.org/rec/journals/ml/AKM21
Prashanth L. A., Nathaniel Korda, Rémi Munos:
Concentration bounds for temporal difference learning with linear function approximation: the case of batch data and uniform sampling. Mach. Learn. 110(3): 559-618 (2021)
[c136]
- dblp key: conf/icml/KozunoTRMKDVA21
- persistent URL: https://dblp.org/rec/conf/icml/KozunoTRMKDVA21
Tadashi Kozuno, Yunhao Tang, Mark Rowland, Rémi Munos, Steven Kapturowski, Will Dabney, Michal Valko, David Abel:
Revisiting Peng's Q(λ) for Modern Reinforcement Learning. ICML 2021: 5794-5804
[c135]
- dblp key: conf/icml/MesnardWVTSHDSH21
- persistent URL: https://dblp.org/rec/conf/icml/MesnardWVTSHDSH21
Thomas Mesnard, Theophane Weber, Fabio Viola, Shantanu Thakoor, Alaa Saade, Anna Harutyunyan, Will Dabney, Thomas S. Stepleton, Nicolas Heess, Arthur Guez, Eric Moulines, Marcus Hutter, Lars Buesing, Rémi Munos:
Counterfactual Credit Assignment in Model-Free Reinforcement Learning. ICML 2021: 7654-7664
[c134]
- dblp key: conf/icml/PerolatMLOROBAB21
- persistent URL: https://dblp.org/rec/conf/icml/PerolatMLOROBAB21
Julien Pérolat, Rémi Munos, Jean-Baptiste Lespiau, Shayegan Omidshafiei, Mark Rowland, Pedro A. Ortega, Neil Burch, Thomas W. Anthony, David Balduzzi, Bart De Vylder, Georgios Piliouras, Marc Lanctot, Karl Tuyls:
From Poincaré Recurrence to Convergence in Imperfect Information Games: Finding Equilibrium via Regularization. ICML 2021: 8525-8535
[c133]
- dblp key: conf/icml/TangRMV21
- persistent URL: https://dblp.org/rec/conf/icml/TangRMV21
Yunhao Tang, Mark Rowland, Rémi Munos, Michal Valko:
Taylor Expansion of Discount Factors. ICML 2021: 10130-10140
[c132]
- dblp key: conf/nips/TangKRMV21
- persistent URL: https://dblp.org/rec/conf/nips/TangKRMV21
Yunhao Tang, Tadashi Kozuno, Mark Rowland, Rémi Munos, Michal Valko:
Unifying Gradient Estimators for Meta-Reinforcement Learning via Off-Policy Evaluation. NeurIPS 2021: 5303-5315
[c131]
- dblp key: conf/nips/KozunoMMV21
- persistent URL: https://dblp.org/rec/conf/nips/KozunoMMV21
Tadashi Kozuno, Pierre Ménard, Rémi Munos, Michal Valko:
Learning in two-player zero-sum partially observable Markov games with perfect recall. NeurIPS 2021: 11987-11998
[i78]
- dblp key: journals/corr/abs-2101-02055
- persistent URL: https://dblp.org/rec/journals/corr/abs-2101-02055
Zhaohan Daniel Guo, Mohammad Gheshlaghi Azar, Alaa Saade, Shantanu Thakoor, Bilal Piot, Bernardo Ávila Pires, Michal Valko, Thomas Mesnard, Tor Lattimore, Rémi Munos:
Geometric Entropic Exploration. CoRR abs/2101.02055 (2021)
[i77]
- dblp key: journals/corr/abs-2102-06514
- persistent URL: https://dblp.org/rec/journals/corr/abs-2102-06514
Shantanu Thakoor, Corentin Tallec, Mohammad Gheshlaghi Azar, Rémi Munos, Petar Velickovic, Michal Valko:
Bootstrapped Representation Learning on Graphs. CoRR abs/2102.06514 (2021)
[i76]
- dblp key: journals/corr/abs-2103-00107
- persistent URL: https://dblp.org/rec/journals/corr/abs-2103-00107
Tadashi Kozuno, Yunhao Tang, Mark Rowland, Rémi Munos, Steven Kapturowski, Will Dabney, Michal Valko, David Abel:
Revisiting Peng's Q(λ) for Modern Reinforcement Learning. CoRR abs/2103.00107 (2021)
[i75]
- dblp key: journals/corr/abs-2106-03787
- persistent URL: https://dblp.org/rec/journals/corr/abs-2106-03787
Matthieu Geist, Julien Pérolat, Mathieu Laurière, Romuald Elie, Sarah Perrin, Olivier Bachem, Rémi Munos, Olivier Pietquin:
Concave Utility Reinforcement Learning: the Mean-field Game viewpoint. CoRR abs/2106.03787 (2021)
[i74]
- dblp key: journals/corr/abs-2106-06170
- persistent URL: https://dblp.org/rec/journals/corr/abs-2106-06170
Yunhao Tang, Mark Rowland, Rémi Munos, Michal Valko:
Taylor Expansion of Discount Factors. CoRR abs/2106.06170 (2021)
[i73]
- dblp key: journals/corr/abs-2106-06279
- persistent URL: https://dblp.org/rec/journals/corr/abs-2106-06279
Tadashi Kozuno, Pierre Ménard, Rémi Munos, Michal Valko:
Model-Free Learning for Two-Player Zero-Sum Partially Observable Markov Games with Perfect Recall. CoRR abs/2106.06279 (2021)
[i72]
- dblp key: journals/corr/abs-2106-13125
- persistent URL: https://dblp.org/rec/journals/corr/abs-2106-13125
Yunhao Tang, Tadashi Kozuno, Mark Rowland, Rémi Munos, Michal Valko:
Unifying Gradient Estimators for Meta-Reinforcement Learning via Off-Policy Evaluation. CoRR abs/2106.13125 (2021)
2020
[j25]
- dblp key: journals/nature/DabneyKUSHMB20
- persistent URL: https://dblp.org/rec/journals/nature/DabneyKUSHMB20
Will Dabney, Zeb Kurth-Nelson, Naoshige Uchida, Clara Kwon Starkweather, Demis Hassabis, Rémi Munos, Matthew M. Botvinick:
A distributional code for value in dopamine-based reinforcement learning. Nat. 577(7792): 671-675 (2020)
[c130]
- dblp key: conf/aistats/RowlandDM20
- persistent URL: https://dblp.org/rec/conf/aistats/RowlandDM20
Mark Rowland, Will Dabney, Rémi Munos:
Adaptive Trade-Offs in Off-Policy Learning. AISTATS 2020: 34-44
[c129]
- dblp key: conf/aistats/RowlandHHBSMD20
- persistent URL: https://dblp.org/rec/conf/aistats/RowlandHHBSMD20
Mark Rowland, Anna Harutyunyan, Hado van Hasselt, Diana Borsa, Tom Schaul, Rémi Munos, Will Dabney:
Conditional Importance Sampling for Off-Policy Learning. AISTATS 2020: 45-55
[c128]
- dblp key: conf/atal/HennesMOMPLGLPD20
- persistent URL: https://dblp.org/rec/conf/atal/HennesMOMPLGLPD20
Daniel Hennes, Dustin Morrill, Shayegan Omidshafiei, Rémi Munos, Julien Pérolat, Marc Lanctot, Audrunas Gruslys, Jean-Baptiste Lespiau, Paavo Parmas, Edgar A. Duéñez-Guzmán, Karl Tuyls:
Neural Replicator Dynamics: Multiagent Learning via Hedging Policy Gradients. AAMAS 2020: 492-501
[c127]
- dblp key: conf/iclr/MullerORTPLHMLH20
- persistent URL: https://dblp.org/rec/conf/iclr/MullerORTPLHMLH20
Paul Muller, Shayegan Omidshafiei, Mark Rowland, Karl Tuyls, Julien Pérolat, Siqi Liu, Daniel Hennes, Luke Marris, Marc Lanctot, Edward Hughes, Zhe Wang, Guy Lever, Nicolas Heess, Thore Graepel, Rémi Munos:
A Generalized Training Approach for Multiagent Learning. ICLR 2020
[c126]
- dblp key: conf/icml/GrillATHVAM20
- persistent URL: https://dblp.org/rec/conf/icml/GrillATHVAM20
Jean-Bastien Grill, Florent Altché, Yunhao Tang, Thomas Hubert, Michal Valko, Ioannis Antonoglou, Rémi Munos:
Monte-Carlo Tree Search as Regularized Policy Optimization. ICML 2020: 3769-3778
[c125]
- dblp key: conf/icml/GuoPPGAMA20
- persistent URL: https://dblp.org/rec/conf/icml/GuoPPGAMA20
Zhaohan Daniel Guo, Bernardo Ávila Pires, Bilal Piot, Jean-Bastien Grill, Florent Altché, Rémi Munos, Mohammad Gheshlaghi Azar:
Bootstrap Latent-Predictive Representations for Multitask Reinforcement Learning. ICML 2020: 3875-3886
[c124]
- dblp key: conf/icml/MunosPLRVLTHOGA20
- persistent URL: https://dblp.org/rec/conf/icml/MunosPLRVLTHOGA20
Rémi Munos, Julien Pérolat, Jean-Baptiste Lespiau, Mark Rowland, Bart De Vylder, Marc Lanctot, Finbarr Timbers, Daniel Hennes, Shayegan Omidshafiei, Audrunas Gruslys, Mohammad Gheshlaghi Azar, Edward Lockhart, Karl Tuyls:
Fast computation of Nash Equilibria in Imperfect Information Games. ICML 2020: 7119-7129
[c123]
- dblp key: conf/icml/TangVM20
- persistent URL: https://dblp.org/rec/conf/icml/TangVM20
Yunhao Tang, Michal Valko, Rémi Munos:
Taylor Expansion Policy Optimization. ICML 2020: 9397-9406
[c122]
- dblp key: conf/nips/GrillSATRBDPGAP20
- persistent URL: https://dblp.org/rec/conf/nips/GrillSATRBDPGAP20
Jean-Bastien Grill, Florian Strub, Florent Altché, Corentin Tallec, Pierre H. Richemond, Elena Buchatskaya, Carl Doersch, Bernardo Ávila Pires, Zhaohan Guo, Mohammad Gheshlaghi Azar, Bilal Piot, Koray Kavukcuoglu, Rémi Munos, Michal Valko:
Bootstrap Your Own Latent - A New Approach to Self-Supervised Learning. NeurIPS 2020
[c121]
- dblp key: conf/nips/VieillardKSPMG20
- persistent URL: https://dblp.org/rec/conf/nips/VieillardKSPMG20
Nino Vieillard, Tadashi Kozuno, Bruno Scherrer, Olivier Pietquin, Rémi Munos, Matthieu Geist:
Leverage the Average: an Analysis of KL Regularization in Reinforcement Learning. NeurIPS 2020
[i71]
- dblp key: journals/corr/abs-2002-08456
- persistent URL: https://dblp.org/rec/journals/corr/abs-2002-08456
Julien Pérolat, Rémi Munos, Jean-Baptiste Lespiau, Shayegan Omidshafiei, Mark Rowland, Pedro A. Ortega, Neil Burch, Thomas W. Anthony, David Balduzzi, Bart De Vylder, Georgios Piliouras, Marc Lanctot, Karl Tuyls:
From Poincaré Recurrence to Convergence in Imperfect Information Games: Finding Equilibrium via Regularization. CoRR abs/2002.08456 (2020)
[i70]
- dblp key: journals/corr/abs-2003-06259
- persistent URL: https://dblp.org/rec/journals/corr/abs-2003-06259
Yunhao Tang, Michal Valko, Rémi Munos:
Taylor Expansion Policy Optimization. CoRR abs/2003.06259 (2020)
[i69]
- dblp key: journals/corr/abs-2003-14089
- persistent URL: https://dblp.org/rec/journals/corr/abs-2003-14089
Nino Vieillard, Tadashi Kozuno, Bruno Scherrer, Olivier Pietquin, Rémi Munos, Matthieu Geist:
Leverage the Average: an Analysis of Regularization in RL. CoRR abs/2003.14089 (2020)
[i68]
- dblp key: journals/corr/abs-2004-14646
- persistent URL: https://dblp.org/rec/journals/corr/abs-2004-14646
Zhaohan Daniel Guo, Bernardo Ávila Pires, Bilal Piot, Jean-Bastien Grill, Florent Altché, Rémi Munos, Mohammad Gheshlaghi Azar:
Bootstrap Latent-Predictive Representations for Multitask Reinforcement Learning. CoRR abs/2004.14646 (2020)
[i67]
- dblp key: journals/corr/abs-2005-01642
- persistent URL: https://dblp.org/rec/journals/corr/abs-2005-01642
Shayegan Omidshafiei, Karl Tuyls, Wojciech M. Czarnecki, Francisco C. Santos, Mark Rowland, Jerome T. Connor, Daniel Hennes, Paul Muller, Julien Pérolat, Bart De Vylder, Audrunas Gruslys, Rémi Munos:
Navigating the Landscape of Games. CoRR abs/2005.01642 (2020)
[i66]
- dblp key: journals/corr/abs-2006-07733
- persistent URL: https://dblp.org/rec/journals/corr/abs-2006-07733
Jean-Bastien Grill, Florian Strub, Florent Altché, Corentin Tallec, Pierre H. Richemond, Elena Buchatskaya, Carl Doersch, Bernardo Ávila Pires, Zhaohan Daniel Guo, Mohammad Gheshlaghi Azar, Bilal Piot, Koray Kavukcuoglu, Rémi Munos, Michal Valko:
Bootstrap Your Own Latent: A New Approach to Self-Supervised Learning. CoRR abs/2006.07733 (2020)
[i65]
- dblp key: journals/corr/abs-2007-12509
- persistent URL: https://dblp.org/rec/journals/corr/abs-2007-12509
Jean-Bastien Grill, Florent Altché, Yunhao Tang, Thomas Hubert, Michal Valko, Ioannis Antonoglou, Rémi Munos:
Monte-Carlo Tree Search as Regularized Policy Optimization. CoRR abs/2007.12509 (2020)
[i64]
- dblp key: journals/corr/abs-2008-12234
- persistent URL: https://dblp.org/rec/journals/corr/abs-2008-12234
Audrunas Gruslys, Marc Lanctot, Rémi Munos, Finbarr Timbers, Martin Schmid, Julien Pérolat, Dustin Morrill, Vinícius Flores Zambaldi, Jean-Baptiste Lespiau, John Schultz, Mohammad Gheshlaghi Azar, Michael Bowling, Karl Tuyls:
The Advantage Regret-Matching Actor-Critic. CoRR abs/2008.12234 (2020)
[i63]
- dblp key: journals/corr/abs-2011-09192
- persistent URL: https://dblp.org/rec/journals/corr/abs-2011-09192
Karl Tuyls, Shayegan Omidshafiei, Paul Muller, Zhe Wang, Jerome T. Connor, Daniel Hennes, Ian Graham, William Spearman, Tim Waskett, Dafydd Steele, Pauline Luc, Adrià Recasens, Alexandre Galashov, Gregory Thornton, Romuald Elie, Pablo Sprechmann, Pol Moreno, Kris Cao, Marta Garnelo, Praneet Dutta, Michal Valko, Nicolas Heess, Alex Bridgland, Julien Pérolat, Bart De Vylder, S. M. Ali Eslami, Mark Rowland, Andrew Jaegle, Rémi Munos, Trevor Back, Razia Ahamed, Simon Bouton, Nathalie Beauguerlange, Jackson Broshear, Thore Graepel, Demis Hassabis:
Game Plan: What AI can do for Football, and What Football can do for AI. CoRR abs/2011.09192 (2020)
[i62]
- dblp key: journals/corr/abs-2011-09464
- persistent URL: https://dblp.org/rec/journals/corr/abs-2011-09464
Thomas Mesnard, Théophane Weber, Fabio Viola, Shantanu Thakoor, Alaa Saade, Anna Harutyunyan, Will Dabney, Tom Stepleton, Nicolas Heess, Arthur Guez, Marcus Hutter, Lars Buesing, Rémi Munos:
Counterfactual Credit Assignment in Model-Free Reinforcement Learning. CoRR abs/2011.09464 (2020)

2010 – 2019

2019
[c120]
- dblp key: conf/aistats/HarutyunyanDBHM19
- persistent URL: https://dblp.org/rec/conf/aistats/HarutyunyanDBHM19
Anna Harutyunyan, Will Dabney, Diana Borsa, Nicolas Heess, Rémi Munos, Doina Precup:
The Termination Critic. AISTATS 2019: 2231-2240
[c119]
- dblp key: conf/atal/BorsaHPLHMP19
- persistent URL: https://dblp.org/rec/conf/atal/BorsaHPLHMP19
Diana Borsa, Nicolas Heess, Bilal Piot, Siqi Liu, Leonard Hasenclever, Rémi Munos, Olivier Pietquin:
Observational Learning by Reinforcement Learning. AAMAS 2019: 1117-1124
[c118]
- dblp key: conf/iclr/BorsaBQMHMSS19
- persistent URL: https://dblp.org/rec/conf/iclr/BorsaBQMHMSS19
Diana Borsa, André Barreto, John Quan, Daniel J. Mankowitz, Hado van Hasselt, Rémi Munos, David Silver, Tom Schaul:
Universal Successor Features Approximators. ICLR (Poster) 2019
[c117]
- dblp key: conf/iclr/KapturowskiOQMD19
- persistent URL: https://dblp.org/rec/conf/iclr/KapturowskiOQMD19
Steven Kapturowski, Georg Ostrovski, John Quan, Rémi Munos, Will Dabney:
Recurrent Experience Replay in Distributed Reinforcement Learning. ICLR (Poster) 2019
[c116]
- dblp key: conf/icml/RowlandDKMBD19
- persistent URL: https://dblp.org/rec/conf/icml/RowlandDKMBD19
Mark Rowland, Robert Dadashi, Saurabh Kumar, Rémi Munos, Marc G. Bellemare, Will Dabney:
Statistics and Samples in Distributional Reinforcement Learning. ICML 2019: 5528-5536
[c115]
- dblp key: conf/nips/RowlandOTPVPM19
- persistent URL: https://dblp.org/rec/conf/nips/RowlandOTPVPM19
Mark Rowland, Shayegan Omidshafiei, Karl Tuyls, Julien Pérolat, Michal Valko, Georgios Piliouras, Rémi Munos:
Multiagent Evaluation under Incomplete Information. NeurIPS 2019: 12270-12282
[c114]
- dblp key: conf/nips/GrillDMMV19
- persistent URL: https://dblp.org/rec/conf/nips/GrillDMMV19
Jean-Bastien Grill, Omar Darwiche Domingues, Pierre Ménard, Rémi Munos, Michal Valko:
Planning in entropy-regularized Markov decision processes and games. NeurIPS 2019: 12383-12392
[c113]
- dblp key: conf/nips/HarutyunyanDMAP19
- persistent URL: https://dblp.org/rec/conf/nips/HarutyunyanDMAP19
Anna Harutyunyan, Will Dabney, Thomas Mesnard, Mohammad Gheshlaghi Azar, Bilal Piot, Nicolas Heess, Hado van Hasselt, Gregory Wayne, Satinder Singh, Doina Precup, Rémi Munos:
Hindsight Credit Assignment. NeurIPS 2019: 12467-12476
[i61]
- dblp key: journals/corr/abs-1901-04884
- persistent URL: https://dblp.org/rec/journals/corr/abs-1901-04884
Jean-Bastien Grill, Michal Valko, Rémi Munos:
Optimistic optimization of a Brownian. CoRR abs/1901.04884 (2019)
[i60]
- dblp key: journals/corr/abs-1901-10964
- persistent URL: https://dblp.org/rec/journals/corr/abs-1901-10964
André Barreto, Diana Borsa, John Quan, Tom Schaul, David Silver, Matteo Hessel, Daniel J. Mankowitz, Augustin Zídek, Rémi Munos:
Transfer in Deep Reinforcement Learning Using Successor Features and Generalised Policy Improvement. CoRR abs/1901.10964 (2019)
[i59]
- dblp key: journals/corr/abs-1902-07685
- persistent URL: https://dblp.org/rec/journals/corr/abs-1902-07685
Mohammad Gheshlaghi Azar, Bilal Piot, Bernardo A. Pires, Jean-Bastien Grill, Florent Altché, Rémi Munos:
World Discovery Models. CoRR abs/1902.07685 (2019)
[i58]
- dblp key: journals/corr/abs-1902-08102
- persistent URL: https://dblp.org/rec/journals/corr/abs-1902-08102
Mark Rowland, Robert Dadashi, Saurabh Kumar, Rémi Munos, Marc G. Bellemare, Will Dabney:
Statistics and Samples in Distributional Reinforcement Learning. CoRR abs/1902.08102 (2019)
[i57]
- dblp key: journals/corr/abs-1902-09996
- persistent URL: https://dblp.org/rec/journals/corr/abs-1902-09996
Anna Harutyunyan, Will Dabney, Diana Borsa, Nicolas Heess, Rémi Munos, Doina Precup:
The Termination Critic. CoRR abs/1902.09996 (2019)
[i56]
- dblp key: journals/corr/abs-1903-01373
- persistent URL: https://dblp.org/rec/journals/corr/abs-1903-01373
Shayegan Omidshafiei, Christos H. Papadimitriou, Georgios Piliouras, Karl Tuyls, Mark Rowland, Jean-Baptiste Lespiau, Wojciech M. Czarnecki, Marc Lanctot, Julien Pérolat, Rémi Munos:
α-Rank: Multi-Agent Evaluation by Evolution. CoRR abs/1903.01373 (2019)
[i55]
- dblp key: journals/corr/abs-1906-00190
- persistent URL: https://dblp.org/rec/journals/corr/abs-1906-00190
Shayegan Omidshafiei, Daniel Hennes, Dustin Morrill, Rémi Munos, Julien Pérolat, Marc Lanctot, Audrunas Gruslys, Jean-Baptiste Lespiau, Karl Tuyls:
Neural Replicator Dynamics. CoRR abs/1906.00190 (2019)
[i54]
- dblp key: journals/corr/abs-1909-09849
- persistent URL: https://dblp.org/rec/journals/corr/abs-1909-09849
Mark Rowland, Shayegan Omidshafiei, Karl Tuyls, Julien Pérolat, Michal Valko, Georgios Piliouras, Rémi Munos:
Multiagent Evaluation under Incomplete Information. CoRR abs/1909.09849 (2019)
[i53]
- dblp key: journals/corr/abs-1909-12823
- persistent URL: https://dblp.org/rec/journals/corr/abs-1909-12823
Paul Muller, Shayegan Omidshafiei, Mark Rowland, Karl Tuyls, Julien Pérolat, Siqi Liu, Daniel Hennes, Luke Marris, Marc Lanctot, Edward Hughes, Zhe Wang, Guy Lever, Nicolas Heess, Thore Graepel, Rémi Munos:
A Generalized Training Approach for Multiagent Learning. CoRR abs/1909.12823 (2019)
[i52]
- dblp key: journals/corr/abs-1910-07478
- persistent URL: https://dblp.org/rec/journals/corr/abs-1910-07478
Mark Rowland, Will Dabney, Rémi Munos:
Adaptive Trade-Offs in Off-Policy Learning. CoRR abs/1910.07478 (2019)
[i51]
- dblp key: journals/corr/abs-1910-07479
- persistent URL: https://dblp.org/rec/journals/corr/abs-1910-07479
Mark Rowland, Anna Harutyunyan, Hado van Hasselt, Diana Borsa, Tom Schaul, Rémi Munos, Will Dabney:
Conditional Importance Sampling for Off-Policy Learning. CoRR abs/1910.07479 (2019)
[i50]
- dblp key: journals/corr/abs-1912-02503
- persistent URL: https://dblp.org/rec/journals/corr/abs-1912-02503
Anna Harutyunyan, Will Dabney, Thomas Mesnard, Mohammad Gheshlaghi Azar, Bilal Piot, Nicolas Heess, Hado van Hasselt, Greg Wayne, Satinder Singh, Doina Precup, Rémi Munos:
Hindsight Credit Assignment. CoRR abs/1912.02503 (2019)
2018
[j24]
- dblp key: journals/automatica/BusoniuPM18
- persistent URL: https://dblp.org/rec/journals/automatica/BusoniuPM18
Lucian Busoniu, Elod Páll, Rémi Munos:
Continuous-action planning for discounted infinite-horizon nonlinear optimal control with Lipschitz values. Autom. 92: 100-108 (2018)
[j23]
- dblp key: journals/eaai/MatheBMS18
- persistent URL: https://dblp.org/rec/journals/eaai/MatheBMS18
Koppány Máthé, Lucian Busoniu, Rémi Munos, Bart De Schutter:
Optimistic planning with an adaptive number of action switches for near-optimal nonlinear control. Eng. Appl. Artif. Intell. 67: 355-367 (2018)
[c112]
- dblp key: conf/aaai/DabneyRBM18
- persistent URL: https://dblp.org/rec/conf/aaai/DabneyRBM18
Will Dabney, Mark Rowland, Marc G. Bellemare, Rémi Munos:
Distributional Reinforcement Learning With Quantile Regression. AAAI 2018: 2892-2901
[c111]
- dblp key: conf/aistats/RowlandBDMT18
- persistent URL: https://dblp.org/rec/conf/aistats/RowlandBDMT18
Mark Rowland, Marc G. Bellemare, Will Dabney, Rémi Munos, Yee Whye Teh:
An Analysis of Categorical Distributional Reinforcement Learning. AISTATS 2018: 29-37
[c110]
- dblp key: conf/iclr/AbdolmalekiSTMH18
- persistent URL: https://dblp.org/rec/conf/iclr/AbdolmalekiSTMH18
Abbas Abdolmaleki, Jost Tobias Springenberg, Yuval Tassa, Rémi Munos, Nicolas Heess, Martin A. Riedmiller:
Maximum a Posteriori Policy Optimisation. ICLR (Poster) 2018
[c109]
- dblp key: conf/iclr/FortunatoAPMHOG18
- persistent URL: https://dblp.org/rec/conf/iclr/FortunatoAPMHOG18
Meire Fortunato, Mohammad Gheshlaghi Azar, Bilal Piot, Jacob Menick, Matteo Hessel, Ian Osband, Alex Graves, Volodymyr Mnih, Rémi Munos, Demis Hassabis, Olivier Pietquin, Charles Blundell, Shane Legg:
Noisy Networks For Exploration. ICLR (Poster) 2018
[c108]
- dblp key: conf/iclr/GruslysDAPBM18
- persistent URL: https://dblp.org/rec/conf/iclr/GruslysDAPBM18
Audrunas Gruslys, Will Dabney, Mohammad Gheshlaghi Azar, Bilal Piot, Marc G. Bellemare, Rémi Munos:
The Reactor: A fast and sample-efficient Actor-Critic agent for Reinforcement Learning. ICLR (Poster) 2018
[c107]
- dblp key: conf/icml/BarretoBQSSHMZM18
- persistent URL: https://dblp.org/rec/conf/icml/BarretoBQSSHMZM18
André Barreto, Diana Borsa, John Quan, Tom Schaul, David Silver, Matteo Hessel, Daniel J. Mankowitz, Augustin Zídek, Rémi Munos:
Transfer in Deep Reinforcement Learning Using Successor Features and Generalised Policy Improvement. ICML 2018: 510-519
[c106]
  dblp key:
  - conf/icml/DabneyOSM18
  persistent URL:
  - https://dblp.org/rec/conf/icml/DabneyOSM18
Will Dabney, Georg Ostrovski, David Silver, Rémi Munos:
Implicit Quantile Networks for Distributional Reinforcement Learning. ICML 2018: 1104-1113
[c105]
  dblp key:
  - conf/icml/EspeholtSMSMWDF18
  persistent URL:
  - https://dblp.org/rec/conf/icml/EspeholtSMSMWDF18
Lasse Espeholt, Hubert Soyer, Rémi Munos, Karen Simonyan, Volodymyr Mnih, Tom Ward, Yotam Doron, Vlad Firoiu, Tim Harley, Iain Dunning, Shane Legg, Koray Kavukcuoglu:
IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures. ICML 2018: 1406-1415
[c104]
  dblp key:
  - conf/icml/GuezWASVWMS18
  persistent URL:
  - https://dblp.org/rec/conf/icml/GuezWASVWMS18
Arthur Guez, Théophane Weber, Ioannis Antonoglou, Karen Simonyan, Oriol Vinyals, Daan Wierstra, Rémi Munos, David Silver:
Learning to Search with MCTSnets. ICML 2018: 1817-1826
[c103]
  dblp key:
  - conf/icml/ODonoghueOMM18
  persistent URL:
  - https://dblp.org/rec/conf/icml/ODonoghueOMM18
Brendan O'Donoghue, Ian Osband, Rémi Munos, Volodymyr Mnih:
The Uncertainty Bellman Equation and Exploration. ICML 2018: 3836-3845
[c102]
  dblp key:
  - conf/icml/OstrovskiDM18
  persistent URL:
  - https://dblp.org/rec/conf/icml/OstrovskiDM18
Georg Ostrovski, Will Dabney, Rémi Munos:
Autoregressive Quantile Networks for Generative Modeling. ICML 2018: 3933-3942
[c101]
  dblp key:
  - conf/nips/GrillVM18
  persistent URL:
  - https://dblp.org/rec/conf/nips/GrillVM18
Jean-Bastien Grill, Michal Valko, Rémi Munos:
Optimistic optimization of a Brownian. NeurIPS 2018: 3009-3018
[c100]
  dblp key:
  - conf/nips/SrinivasanLZPTM18
  persistent URL:
  - https://dblp.org/rec/conf/nips/SrinivasanLZPTM18
Sriram Srinivasan, Marc Lanctot, Vinícius Flores Zambaldi, Julien Pérolat, Karl Tuyls, Rémi Munos, Michael Bowling:
Actor-Critic Policy Optimization in Partially Observable Multiagent Environments. NeurIPS 2018: 3426-3439
[i49]
  dblp key:
  - journals/corr/abs-1802-01561
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-1802-01561
Lasse Espeholt, Hubert Soyer, Rémi Munos, Karen Simonyan, Volodymyr Mnih, Tom Ward, Yotam Doron, Vlad Firoiu, Tim Harley, Iain Dunning, Shane Legg, Koray Kavukcuoglu:
IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures. CoRR abs/1802.01561 (2018)
[i48]
  dblp key:
  - journals/corr/abs-1802-04697
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-1802-04697
Arthur Guez, Théophane Weber, Ioannis Antonoglou, Karen Simonyan, Oriol Vinyals, Daan Wierstra, Rémi Munos, David Silver:
Learning to Search with MCTSnets. CoRR abs/1802.04697 (2018)
[i47]
  dblp key:
  - journals/corr/abs-1804-06893
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-1804-06893
Chiyuan Zhang, Oriol Vinyals, Rémi Munos, Samy Bengio:
A Study on Overfitting in Deep Reinforcement Learning. CoRR abs/1804.06893 (2018)
[i46]
  dblp key:
  - journals/corr/abs-1805-04955
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-1805-04955
Thomas S. Stepleton, Razvan Pascanu, Will Dabney, Siddhant M. Jayakumar, Hubert Soyer, Rémi Munos:
Low-pass Recurrent Neural Networks - A memory architecture for longer-term correlation discovery. CoRR abs/1805.04955 (2018)
[i45]
  dblp key:
  - journals/corr/abs-1805-11593
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-1805-11593
Tobias Pohlen, Bilal Piot, Todd Hester, Mohammad Gheshlaghi Azar, Dan Horgan, David Budden, Gabriel Barth-Maron, Hado van Hasselt, John Quan, Mel Vecerík, Matteo Hessel, Rémi Munos, Olivier Pietquin:
Observe and Look Further: Achieving Consistent Performance on Atari. CoRR abs/1805.11593 (2018)
[i44]
  dblp key:
  - journals/corr/abs-1806-05575
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-1806-05575
Georg Ostrovski, Will Dabney, Rémi Munos:
Autoregressive Quantile Networks for Generative Modeling. CoRR abs/1806.05575 (2018)
[i43]
  dblp key:
  - journals/corr/abs-1806-06920
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-1806-06920
Abbas Abdolmaleki, Jost Tobias Springenberg, Yuval Tassa, Rémi Munos, Nicolas Heess, Martin A. Riedmiller:
Maximum a Posteriori Policy Optimisation. CoRR abs/1806.06920 (2018)
[i42]
  dblp key:
  - journals/corr/abs-1806-06923
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-1806-06923
Will Dabney, Georg Ostrovski, David Silver, Rémi Munos:
Implicit Quantile Networks for Distributional Reinforcement Learning. CoRR abs/1806.06923 (2018)
[i41]
  dblp key:
  - journals/corr/abs-1810-09026
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-1810-09026
Sriram Srinivasan, Marc Lanctot, Vinícius Flores Zambaldi, Julien Pérolat, Karl Tuyls, Rémi Munos, Michael Bowling:
Actor-Critic Policy Optimization in Partially Observable Multiagent Environments. CoRR abs/1810.09026 (2018)
[i40]
  dblp key:
  - journals/corr/abs-1811-06407
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-1811-06407
Zhaohan Daniel Guo, Mohammad Gheshlaghi Azar, Bilal Piot, Bernardo A. Pires, Toby Pohlen, Rémi Munos:
Neural Predictive Belief Representations. CoRR abs/1811.06407 (2018)
[i39]
  dblp key:
  - journals/corr/abs-1812-07626
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-1812-07626
Diana Borsa, André Barreto, John Quan, Daniel J. Mankowitz, Rémi Munos, Hado van Hasselt, David Silver, Tom Schaul:
Universal Successor Features Approximators. CoRR abs/1812.07626 (2018)
2017
[c99]
  dblp key:
  - conf/cogsci/WangKSLTMBKB17
  persistent URL:
  - https://dblp.org/rec/conf/cogsci/WangKSLTMBKB17
Jane Wang, Zeb Kurth-Nelson, Hubert Soyer, Joel Z. Leibo, Dhruva Tirumala, Rémi Munos, Charles Blundell, Dharshan Kumaran, Matt M. Botvinick:
Learning to reinforcement learn. CogSci 2017
[c98]
  dblp key:
  - conf/iclr/0001BHMMKF17
  persistent URL:
  - https://dblp.org/rec/conf/iclr/0001BHMMKF17
Ziyu Wang, Victor Bapst, Nicolas Heess, Volodymyr Mnih, Rémi Munos, Koray Kavukcuoglu, Nando de Freitas:
Sample Efficient Actor-Critic with Experience Replay. ICLR (Poster) 2017
[c97]
  dblp key:
  - conf/iclr/ODonoghueMKM17
  persistent URL:
  - https://dblp.org/rec/conf/iclr/ODonoghueMKM17
Brendan O'Donoghue, Rémi Munos, Koray Kavukcuoglu, Volodymyr Mnih:
Combining policy gradient and Q-learning. ICLR (Poster) 2017
[c96]
  dblp key:
  - conf/icml/AzarOM17
  persistent URL:
  - https://dblp.org/rec/conf/icml/AzarOM17
Mohammad Gheshlaghi Azar, Ian Osband, Rémi Munos:
Minimax Regret Bounds for Reinforcement Learning. ICML 2017: 263-272
[c95]
  dblp key:
  - conf/icml/BellemareDM17
  persistent URL:
  - https://dblp.org/rec/conf/icml/BellemareDM17
Marc G. Bellemare, Will Dabney, Rémi Munos:
A Distributional Perspective on Reinforcement Learning. ICML 2017: 449-458
[c94]
  dblp key:
  - conf/icml/GravesBMMK17
  persistent URL:
  - https://dblp.org/rec/conf/icml/GravesBMMK17
Alex Graves, Marc G. Bellemare, Jacob Menick, Rémi Munos, Koray Kavukcuoglu:
Automated Curriculum Learning for Neural Networks. ICML 2017: 1311-1320
[c93]
  dblp key:
  - conf/icml/OstrovskiBOM17
  persistent URL:
  - https://dblp.org/rec/conf/icml/OstrovskiBOM17
Georg Ostrovski, Marc G. Bellemare, Aäron van den Oord, Rémi Munos:
Count-Based Exploration with Neural Density Models. ICML 2017: 2721-2730
[c92]
  dblp key:
  - conf/nips/BarretoDMHSSH17
  persistent URL:
  - https://dblp.org/rec/conf/nips/BarretoDMHSSH17
André Barreto, Will Dabney, Rémi Munos, Jonathan J. Hunt, Tom Schaul, David Silver, Hado van Hasselt:
Successor Features for Transfer in Reinforcement Learning. NIPS 2017: 4055-4065
[i38]
  dblp key:
  - journals/corr/OstrovskiBOM17
  persistent URL:
  - https://dblp.org/rec/journals/corr/OstrovskiBOM17
Georg Ostrovski, Marc G. Bellemare, Aäron van den Oord, Rémi Munos:
Count-Based Exploration with Neural Density Models. CoRR abs/1703.01310 (2017)
[i37]
  dblp key:
  - journals/corr/AzarOM17
  persistent URL:
  - https://dblp.org/rec/journals/corr/AzarOM17
Mohammad Gheshlaghi Azar, Ian Osband, Rémi Munos:
Minimax Regret Bounds for Reinforcement Learning. CoRR abs/1703.05449 (2017)
[i36]
  dblp key:
  - journals/corr/GravesBMMK17
  persistent URL:
  - https://dblp.org/rec/journals/corr/GravesBMMK17
Alex Graves, Marc G. Bellemare, Jacob Menick, Rémi Munos, Koray Kavukcuoglu:
Automated Curriculum Learning for Neural Networks. CoRR abs/1704.03003 (2017)
[i35]
  dblp key:
  - journals/corr/GruslysABM17
  persistent URL:
  - https://dblp.org/rec/journals/corr/GruslysABM17
Audrunas Gruslys, Mohammad Gheshlaghi Azar, Marc G. Bellemare, Rémi Munos:
The Reactor: A Sample-Efficient Actor-Critic Architecture. CoRR abs/1704.04651 (2017)
[i34]
  dblp key:
  - journals/corr/BellemareDDMLHM17
  persistent URL:
  - https://dblp.org/rec/journals/corr/BellemareDDMLHM17
Marc G. Bellemare, Ivo Danihelka, Will Dabney, Shakir Mohamed, Balaji Lakshminarayanan, Stephan Hoyer, Rémi Munos:
The Cramer Distance as a Solution to Biased Wasserstein Gradients. CoRR abs/1705.10743 (2017)
[i33]
  dblp key:
  - journals/corr/BorsaPMP17
  persistent URL:
  - https://dblp.org/rec/journals/corr/BorsaPMP17
Diana Borsa, Bilal Piot, Rémi Munos, Olivier Pietquin:
Observational Learning by Reinforcement Learning. CoRR abs/1706.06617 (2017)
[i32]
  dblp key:
  - journals/corr/FortunatoAPMOGM17
  persistent URL:
  - https://dblp.org/rec/journals/corr/FortunatoAPMOGM17
Meire Fortunato, Mohammad Gheshlaghi Azar, Bilal Piot, Jacob Menick, Ian Osband, Alex Graves, Vlad Mnih, Rémi Munos, Demis Hassabis, Olivier Pietquin, Charles Blundell, Shane Legg:
Noisy Networks for Exploration. CoRR abs/1706.10295 (2017)
[i31]
  dblp key:
  - journals/corr/BellemareDM17
  persistent URL:
  - https://dblp.org/rec/journals/corr/BellemareDM17
Marc G. Bellemare, Will Dabney, Rémi Munos:
A Distributional Perspective on Reinforcement Learning. CoRR abs/1707.06887 (2017)
[i30]
  dblp key:
  - journals/corr/abs-1709-05380
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-1709-05380
Brendan O'Donoghue, Ian Osband, Rémi Munos, Volodymyr Mnih:
The Uncertainty Bellman Equation and Exploration. CoRR abs/1709.05380 (2017)
[i29]
  dblp key:
  - journals/corr/abs-1710-10044
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-1710-10044
Will Dabney, Mark Rowland, Marc G. Bellemare, Rémi Munos:
Distributional Reinforcement Learning with Quantile Regression. CoRR abs/1710.10044 (2017)
2016
[j22]
  dblp key:
  - journals/jmlr/LazaricGM16
  persistent URL:
  - https://dblp.org/rec/journals/jmlr/LazaricGM16
Alessandro Lazaric, Mohammad Ghavamzadeh, Rémi Munos:
Analysis of Classification-based Policy Iteration Algorithms. J. Mach. Learn. Res. 17: 19:1-19:30 (2016)
[j21]
  dblp key:
  - journals/tcs/JainMSZ16
  persistent URL:
  - https://dblp.org/rec/journals/tcs/JainMSZ16
Sanjay Jain, Rémi Munos, Frank Stephan, Thomas Zeugmann:
Guest Editors' foreword. Theor. Comput. Sci. 620: 1-3 (2016)
[c91]
  dblp key:
  - conf/aaai/BellemareOGTM16
  persistent URL:
  - https://dblp.org/rec/conf/aaai/BellemareOGTM16
Marc G. Bellemare, Georg Ostrovski, Arthur Guez, Philip S. Thomas, Rémi Munos:
Increasing the Action Gap: New Operators for Reinforcement Learning. AAAI 2016: 1476-1483
[c90]
  dblp key:
  - conf/aaai/HallakTMM16
  persistent URL:
  - https://dblp.org/rec/conf/aaai/HallakTMM16
Assaf Hallak, Aviv Tamar, Rémi Munos, Shie Mannor:
Generalized Emphatic Temporal Difference Learning: Bias-Variance Analysis. AAAI 2016: 1631-1637
[c89]
  dblp key:
  - conf/alt/HarutyunyanBSM16
  persistent URL:
  - https://dblp.org/rec/conf/alt/HarutyunyanBSM16
Anna Harutyunyan, Marc G. Bellemare, Tom Stepleton, Rémi Munos:
Q(λ) with Off-Policy Corrections. ALT 2016: 305-320
[c88]
  dblp key:
  - conf/amcc/BusoniuPM16
  persistent URL:
  - https://dblp.org/rec/conf/amcc/BusoniuPM16
Lucian Busoniu, Elod Páll, Rémi Munos:
Discounted near-optimal control of general continuous-action nonlinear systems using optimistic planning. ACC 2016: 203-208
[c87]
  dblp key:
  - conf/nips/MunosSHB16
  persistent URL:
  - https://dblp.org/rec/conf/nips/MunosSHB16
Rémi Munos, Tom Stepleton, Anna Harutyunyan, Marc G. Bellemare:
Safe and Efficient Off-Policy Reinforcement Learning. NIPS 2016: 1046-1054
[c86]
  dblp key:
  - conf/nips/BellemareSOSSM16
  persistent URL:
  - https://dblp.org/rec/conf/nips/BellemareSOSSM16
Marc G. Bellemare, Sriram Srinivasan, Georg Ostrovski, Tom Schaul, David Saxton, Rémi Munos:
Unifying Count-Based Exploration and Intrinsic Motivation. NIPS 2016: 1471-1479
[c85]
  dblp key:
  - conf/nips/GruslysMDLG16
  persistent URL:
  - https://dblp.org/rec/conf/nips/GruslysMDLG16
Audrunas Gruslys, Rémi Munos, Ivo Danihelka, Marc Lanctot, Alex Graves:
Memory-Efficient Backpropagation Through Time. NIPS 2016: 4125-4133
[c84]
  dblp key:
  - conf/nips/GrillVM16
  persistent URL:
  - https://dblp.org/rec/conf/nips/GrillVM16
Jean-Bastien Grill, Michal Valko, Rémi Munos:
Blazing the trails before beating the path: Sample-efficient Monte-Carlo planning. NIPS 2016: 4673-4681
[i28]
  dblp key:
  - journals/corr/HarutyunyanBSM16
  persistent URL:
  - https://dblp.org/rec/journals/corr/HarutyunyanBSM16
Anna Harutyunyan, Marc G. Bellemare, Tom Stepleton, Rémi Munos:
Q(λ) with Off-Policy Corrections. CoRR abs/1602.04951 (2016)
[i27]
  dblp key:
  - journals/corr/BellemareSOSSM16
  persistent URL:
  - https://dblp.org/rec/journals/corr/BellemareSOSSM16
Marc G. Bellemare, Sriram Srinivasan, Georg Ostrovski, Tom Schaul, David Saxton, Rémi Munos:
Unifying Count-Based Exploration and Intrinsic Motivation. CoRR abs/1606.01868 (2016)
[i26]
  dblp key:
  - journals/corr/MunosSHB16
  persistent URL:
  - https://dblp.org/rec/journals/corr/MunosSHB16
Rémi Munos, Tom Stepleton, Anna Harutyunyan, Marc G. Bellemare:
Safe and Efficient Off-Policy Reinforcement Learning. CoRR abs/1606.02647 (2016)
[i25]
  dblp key:
  - journals/corr/GruslysMDLG16
  persistent URL:
  - https://dblp.org/rec/journals/corr/GruslysMDLG16
Audrunas Gruslys, Rémi Munos, Ivo Danihelka, Marc Lanctot, Alex Graves:
Memory-Efficient Backpropagation Through Time. CoRR abs/1606.03401 (2016)
[i24]
  dblp key:
  - journals/corr/BarretoMSS16
  persistent URL:
  - https://dblp.org/rec/journals/corr/BarretoMSS16
André Barreto, Rémi Munos, Tom Schaul, David Silver:
Successor Features for Transfer in Reinforcement Learning. CoRR abs/1606.05312 (2016)
[i23]
  dblp key:
  - journals/corr/WangBHMMKF16
  persistent URL:
  - https://dblp.org/rec/journals/corr/WangBHMMKF16
Ziyu Wang, Victor Bapst, Nicolas Heess, Volodymyr Mnih, Rémi Munos, Koray Kavukcuoglu, Nando de Freitas:
Sample Efficient Actor-Critic with Experience Replay. CoRR abs/1611.01224 (2016)
[i22]
  dblp key:
  - journals/corr/ODonoghueMKM16
  persistent URL:
  - https://dblp.org/rec/journals/corr/ODonoghueMKM16
Brendan O'Donoghue, Rémi Munos, Koray Kavukcuoglu, Volodymyr Mnih:
PGQ: Combining policy gradient and Q-learning. CoRR abs/1611.01626 (2016)
[i21]
  dblp key:
  - journals/corr/WangKTSLMBKB16
  persistent URL:
  - https://dblp.org/rec/journals/corr/WangKTSLMBKB16
Jane X. Wang, Zeb Kurth-Nelson, Dhruva Tirumala, Hubert Soyer, Joel Z. Leibo, Rémi Munos, Charles Blundell, Dharshan Kumaran, Matthew M. Botvinick:
Learning to reinforcement learn. CoRR abs/1611.05763 (2016)
2015
[j20]
  dblp key:
  - journals/jmlr/CarpentierMA15
  persistent URL:
  - https://dblp.org/rec/journals/jmlr/CarpentierMA15
Alexandra Carpentier, Rémi Munos, András Antos:
Adaptive strategy for stratified Monte Carlo sampling. J. Mach. Learn. Res. 16: 2231-2271 (2015)
[c83]
  dblp key:
  - conf/aaai/KordaAM15
  persistent URL:
  - https://dblp.org/rec/conf/aaai/KordaAM15
Nathaniel Korda, Prashanth L. A., Rémi Munos:
Fast Gradient Descent for Drifting Least Squares Regression, with Application to Bandits. AAAI 2015: 2708-2714
[c82]
  dblp key:
  - conf/aistats/LiMS15
  persistent URL:
  - https://dblp.org/rec/conf/aistats/LiMS15
Lihong Li, Rémi Munos, Csaba Szepesvári:
Toward Minimax Off-policy Value Estimation. AISTATS 2015
[c81]
  dblp key:
  - conf/icml/HanawalSVM15
  persistent URL:
  - https://dblp.org/rec/conf/icml/HanawalSVM15
Manjesh Kumar Hanawal, Venkatesh Saligrama, Michal Valko, Rémi Munos:
Cheap Bandits. ICML 2015: 2133-2142
[c80]
  dblp key:
  - conf/nips/grillVM15
  persistent URL:
  - https://dblp.org/rec/conf/nips/grillVM15
Jean-Bastien Grill, Michal Valko, Rémi Munos:
Black-box optimization of noisy functions with unknown smoothness. NIPS 2015: 667-675
[i20]
  dblp key:
  - journals/corr/HanawalSVM15
  persistent URL:
  - https://dblp.org/rec/journals/corr/HanawalSVM15
Manjesh Kumar Hanawal, Venkatesh Saligrama, Michal Valko, Rémi Munos:
Cheap Bandits. CoRR abs/1506.04782 (2015)
[i19]
  dblp key:
  - journals/corr/CarpentierLGMAA15
  persistent URL:
  - https://dblp.org/rec/journals/corr/CarpentierLGMAA15
Alexandra Carpentier, Alessandro Lazaric, Mohammad Ghavamzadeh, Rémi Munos, Peter Auer, András Antos:
Upper-Confidence-Bound Algorithms for Active Learning in Multi-Armed Bandits. CoRR abs/1507.04523 (2015)
[i18]
  dblp key:
  - journals/corr/HallakTMM15
  persistent URL:
  - https://dblp.org/rec/journals/corr/HallakTMM15
Assaf Hallak, Aviv Tamar, Rémi Munos, Shie Mannor:
Generalized Emphatic Temporal Difference Learning: Bias-Variance Analysis. CoRR abs/1509.05172 (2015)
[i17]
  dblp key:
  - journals/corr/BellemareOGTM15
  persistent URL:
  - https://dblp.org/rec/journals/corr/BellemareOGTM15
Marc G. Bellemare, Georg Ostrovski, Arthur Guez, Philip S. Thomas, Rémi Munos:
Increasing the Action Gap: New Operators for Reinforcement Learning. CoRR abs/1512.04860 (2015)
2014
[j19]
  dblp key:
  - journals/ftml/Munos14
  persistent URL:
  - https://dblp.org/rec/journals/ftml/Munos14
Rémi Munos:
From Bandits to Monte-Carlo Tree Search: The Optimistic Principle Applied to Optimization and Planning. Found. Trends Mach. Learn. 7(1): 1-129 (2014)
[j18]
  dblp key:
  - journals/tcs/OrtnerRAM14
  persistent URL:
  - https://dblp.org/rec/journals/tcs/OrtnerRAM14
Ronald Ortner, Daniil Ryabko, Peter Auer, Rémi Munos:
Regret bounds for restless Markov bandits. Theor. Comput. Sci. 558: 62-76 (2014)
[j17]
  dblp key:
  - journals/tcs/CarpentierM14
  persistent URL:
  - https://dblp.org/rec/journals/tcs/CarpentierM14
Alexandra Carpentier, Rémi Munos:
Minimax number of strata for online stratified sampling: The case of noisy samples. Theor. Comput. Sci. 558: 77-106 (2014)
[c79]
  dblp key:
  - conf/aaai/KocakVMA14
  persistent URL:
  - https://dblp.org/rec/conf/aaai/KocakVMA14
Tomás Kocák, Michal Valko, Rémi Munos, Shipra Agrawal:
Spectral Thompson Sampling. AAAI 2014: 1911-1917
[c78]
  dblp key:
  - conf/adprl/BusoniuMP14
  persistent URL:
  - https://dblp.org/rec/conf/adprl/BusoniuMP14
Lucian Busoniu, Rémi Munos, Elod Páll:
An analysis of optimistic, best-first search for minimax sequential decision making. ADPRL 2014: 1-8
[c77]
  dblp key:
  - conf/cdc/MatheBMS14
  persistent URL:
  - https://dblp.org/rec/conf/cdc/MatheBMS14
Koppány Máthé, Lucian Busoniu, Rémi Munos, Bart De Schutter:
Optimistic planning with a limited number of action switches for near-optimal nonlinear control. CDC 2014: 3518-3523
[c76]
  dblp key:
  - conf/cec/PreuxMV14
  persistent URL:
  - https://dblp.org/rec/conf/cec/PreuxMV14
Philippe Preux, Rémi Munos, Michal Valko:
Bandits attack function optimization. IEEE Congress on Evolutionary Computation 2014: 2245-2252
[c75]
  dblp key:
  - conf/icml/ZoghiWMR14
  persistent URL:
  - https://dblp.org/rec/conf/icml/ZoghiWMR14
Masrour Zoghi, Shimon Whiteson, Rémi Munos, Maarten de Rijke:
Relative Upper Confidence Bound for the K-Armed Dueling Bandit Problem. ICML 2014: 10-18
[c74]
  dblp key:
  - conf/icml/ValkoMKK14
  persistent URL:
  - https://dblp.org/rec/conf/icml/ValkoMKK14
Michal Valko, Rémi Munos, Branislav Kveton, Tomás Kocák:
Spectral Bandits for Smooth Graph Functions. ICML 2014: 46-54
[c73]
  dblp key:
  - conf/nips/SabatoM14
  persistent URL:
  - https://dblp.org/rec/conf/nips/SabatoM14
Sivan Sabato, Rémi Munos:
Active Regression by Stratification. NIPS 2014: 469-477
[c72]
  dblp key:
  - conf/nips/LattimoreM14
  persistent URL:
  - https://dblp.org/rec/conf/nips/LattimoreM14
Tor Lattimore, Rémi Munos:
Bounded Regret for Finite-Armed Structured Bandits. NIPS 2014: 550-558
[c71]
  dblp key:
  - conf/nips/KocakNVM14
  persistent URL:
  - https://dblp.org/rec/conf/nips/KocakNVM14
Tomás Kocák, Gergely Neu, Michal Valko, Rémi Munos:
Efficient learning by implicit exploration in bandit problems with side observations. NIPS 2014: 613-621
[c70]
  dblp key:
  - conf/nips/SoareLM14
  persistent URL:
  - https://dblp.org/rec/conf/nips/SoareLM14
Marta Soare, Alessandro Lazaric, Rémi Munos:
Best-Arm Identification in Linear Bandits. NIPS 2014: 828-836
[c69]
  dblp key:
  - conf/nips/SzorenyiKM14
  persistent URL:
  - https://dblp.org/rec/conf/nips/SzorenyiKM14
Balázs Szörényi, Gunnar Kedenburg, Rémi Munos:
Optimistic Planning in Markov Decision Processes Using a Generative Model. NIPS 2014: 1035-1043
[c68]
  dblp key:
  - conf/pkdd/AKM14
  persistent URL:
  - https://dblp.org/rec/conf/pkdd/AKM14
Prashanth L. A., Nathaniel Korda, Rémi Munos:
Fast LSTD Using Stochastic Approximation: Finite Time Analysis and Application to Traffic Control. ECML/PKDD (2) 2014: 66-81
[c67]
  dblp key:
  - conf/wsdm/ZoghiWRM14
  persistent URL:
  - https://dblp.org/rec/conf/wsdm/ZoghiWRM14
Masrour Zoghi, Shimon Whiteson, Maarten de Rijke, Rémi Munos:
Relative confidence sampling for efficient on-line ranker evaluation. WSDM 2014: 73-82
[i16]
  dblp key:
  - journals/corr/CoquelinM14
  persistent URL:
  - https://dblp.org/rec/journals/corr/CoquelinM14
Pierre-Arnaud Coquelin, Rémi Munos:
Bandit Algorithms for Tree Search. CoRR abs/1408.2028 (2014)
[i15]
  dblp key:
  - journals/corr/LiMS14
  persistent URL:
  - https://dblp.org/rec/journals/corr/LiMS14
Lihong Li, Rémi Munos, Csaba Szepesvári:
On Minimax Optimal Offline Policy Evaluation. CoRR abs/1409.3653 (2014)
[i14]
  dblp key:
  - journals/corr/SoareLM14
  persistent URL:
  - https://dblp.org/rec/journals/corr/SoareLM14
Marta Soare, Alessandro Lazaric, Rémi Munos:
Best-Arm Identification in Linear Bandits. CoRR abs/1409.6110 (2014)
[i13]
  dblp key:
  - journals/corr/SabatoM14
  persistent URL:
  - https://dblp.org/rec/journals/corr/SabatoM14
Sivan Sabato, Rémi Munos:
Active Regression by Stratification. CoRR abs/1410.5920 (2014)
[i12]
  dblp key:
  - journals/corr/LattimoreM14
  persistent URL:
  - https://dblp.org/rec/journals/corr/LattimoreM14
Tor Lattimore, Rémi Munos:
Bounded Regret for Finite-Armed Structured Bandits. CoRR abs/1411.2919 (2014)
2013
[j16]
  dblp key:
  - journals/ml/AzarMK13
  persistent URL:
  - https://dblp.org/rec/journals/ml/AzarMK13
Mohammad Gheshlaghi Azar, Rémi Munos, Hilbert J. Kappen:
Minimax PAC bounds on the sample complexity of reinforcement learning with a generative model. Mach. Learn. 91(3): 325-349 (2013)
[c66]
- view
  authority control:
- export record
  dblp key:
  - conf/adprl/BusoniuDMB13
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/adprl/BusoniuDMB13
Lucian Busoniu, Alexander Daniels, Rémi Munos, Robert Babuska:
Optimistic planning for continuous-action deterministic systems. ADPRL 2013: 69-76
[c65]
- view
  authority control:
- export record
  dblp key:
  - conf/adprl/FonteneauBM13
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/adprl/FonteneauBM13
Raphaël Fonteneau, Lucian Busoniu, Rémi Munos:
Optimistic planning for belief-augmented Markov Decision Processes. ADPRL 2013: 77-84
[c64]
- view
  authority control:
- export record
  dblp key:
  - conf/alt/JainMSZ13
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/alt/JainMSZ13
Sanjay Jain, Rémi Munos, Frank Stephan, Thomas Zeugmann:
Editors' Introduction. ALT 2013: 1-12
[c63]
- view
  - electronic edition @ mlr.press (open access)
  - details & citations
- export record
  dblp key:
  - conf/icml/ValkoCM13
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/icml/ValkoCM13
Michal Valko, Alexandra Carpentier, Rémi Munos:
Stochastic Simultaneous Optimistic Optimization. ICML (2) 2013: 19-27
[c62]
- view
  - electronic edition @ mlr.press (open access)
  - details & citations
- export record
  dblp key:
  - conf/icml/CarpentierM13
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/icml/CarpentierM13
Alexandra Carpentier, Rémi Munos:
Toward Optimal Stratification for Stratified Monte-Carlo Integration. ICML (2) 2013: 28-36
[c61]
- view
- export record
  dblp key:
  - conf/nips/KordaKM13
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/nips/KordaKM13
Nathaniel Korda, Emilie Kaufmann, Rémi Munos:
Thompson Sampling for 1-Dimensional Exponential Family Bandits. NIPS 2013: 1448-1456
[c60]
- view
- export record
  dblp key:
  - conf/nips/KedenburgFM13
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/nips/KedenburgFM13
Gunnar Kedenburg, Raphaël Fonteneau, Rémi Munos:
Aggregating Optimistic Planning Trees for Solving Markov Decision Processes. NIPS 2013: 2382-2390
[c59]
- view
- export record
  dblp key:
  - conf/uai/ValkoKMFC13
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/uai/ValkoKMFC13
Michal Valko, Nathaniel Korda, Rémi Munos, Ilias N. Flaounas, Nello Cristianini:
Finite-Time Analysis of Kernelised Contextual Bandits. UAI 2013
[e2]
- view
  authority control:
- export record
  dblp key:
  - conf/alt/2013
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/alt/2013
Sanjay Jain, Rémi Munos, Frank Stephan, Thomas Zeugmann:
Algorithmic Learning Theory - 24th International Conference, ALT 2013, Singapore, October 6-9, 2013. Proceedings. Lecture Notes in Computer Science 8139, Springer 2013, ISBN 978-3-642-40934-9
[i11]
- view
  - electronic edition @ arxiv.org (open access)
  - details & citations
- export record
  dblp key:
  - journals/corr/abs-1301-1936
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-1301-1936
Amir Sani, Alessandro Lazaric, Rémi Munos:
Risk-Aversion in Multi-armed Bandits. CoRR abs/1301.1936 (2013)
[i10]
- view
  - electronic edition @ arxiv.org (open access)
  - details & citations
- export record
  dblp key:
  - journals/corr/abs-1302-2552
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-1302-2552
Odalric-Ambrym Maillard, Rémi Munos, Daniil Ryabko:
Selecting the State-Representation in Reinforcement Learning. CoRR abs/1302.2552 (2013)
[i9]
- view
  - electronic edition @ arxiv.org (open access)
  - details & citations
- export record
  dblp key:
  - journals/corr/PrashanthKM13
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/PrashanthKM13
Prashanth L. A., Nathaniel Korda, Rémi Munos:
Analysis of stochastic approximation for efficient least squares regression and LSTD. CoRR abs/1306.2557 (2013)
[i8]
- view
  - electronic edition @ arxiv.org (open access)
  - details & citations
- export record
  dblp key:
  - journals/corr/KordaPM13
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/KordaPM13
Nathaniel Korda, Prashanth L. A., Rémi Munos:
Online gradient descent for least squares regression: Non-asymptotic bounds and application to bandits. CoRR abs/1307.3176 (2013)
[i7]
- view
  - electronic edition @ arxiv.org (open access)
  - details & citations
- export record
  dblp key:
  - journals/corr/ValkoKMFC13
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/ValkoKMFC13
Michal Valko, Nathaniel Korda, Rémi Munos, Ilias N. Flaounas, Nello Cristianini:
Finite-Time Analysis of Kernelised Contextual Bandits. CoRR abs/1309.6869 (2013)
[i6]
- view
  - electronic edition @ arxiv.org (open access)
  - details & citations
- export record
  dblp key:
  - journals/corr/ZoghiWMR13
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/ZoghiWMR13
Masrour Zoghi, Shimon Whiteson, Rémi Munos, Maarten de Rijke:
Relative Upper Confidence Bound for the K-Armed Dueling Bandit Problem. CoRR abs/1312.3393 (2013)
2012
[j15]
- view
  - electronic edition via DOI (open access)
  - details & citations
  authority control:
- export record
  dblp key:
  - journals/jcss/LazaricM12
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/jcss/LazaricM12
Alessandro Lazaric, Rémi Munos:
Learning with stochastic inputs and adversarial outputs. J. Comput. Syst. Sci. 78(5): 1516-1537 (2012)
[j14]
- view
  authority control:
- export record
  dblp key:
  - journals/jmlr/MaillardM12
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/jmlr/MaillardM12
Odalric-Ambrym Maillard, Rémi Munos:
Linear regression with random projections. J. Mach. Learn. Res. 13: 2735-2772 (2012)
[j13]
- view
  authority control:
- export record
  dblp key:
  - journals/jmlr/LazaricGM12
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/jmlr/LazaricGM12
Alessandro Lazaric, Mohammad Ghavamzadeh, Rémi Munos:
Finite-sample analysis of least-squares policy iteration. J. Mach. Learn. Res. 13: 3041-3074 (2012)
[c58]
- view
  authority control:
- export record
  dblp key:
  - conf/alt/KaufmannKM12
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/alt/KaufmannKM12
Emilie Kaufmann, Nathaniel Korda, Rémi Munos:
Thompson Sampling: An Asymptotically Optimal Finite-Time Analysis. ALT 2012: 199-213
[c57]
- view
  authority control:
- export record
  dblp key:
  - conf/alt/OrtnerRAM12
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/alt/OrtnerRAM12
Ronald Ortner, Daniil Ryabko, Peter Auer, Rémi Munos:
Regret Bounds for Restless Markov Bandits. ALT 2012: 214-228
[c56]
- view
  authority control:
- export record
  dblp key:
  - conf/alt/CarpentierM12
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/alt/CarpentierM12
Alexandra Carpentier, Rémi Munos:
Minimax Number of Strata for Online Stratified Sampling Given Noisy Samples. ALT 2012: 229-244
[c55]
- view
  - electronic edition @ icml.cc (open access)
  - details & citations
- export record
  dblp key:
  - conf/icml/AzarMK12
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/icml/AzarMK12
Mohammad Gheshlaghi Azar, Rémi Munos, Bert Kappen:
On the Sample Complexity of Reinforcement Learning with a Generative Model. ICML 2012
[c54]
- view
- export record
  dblp key:
  - conf/nips/CarpentierM12
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/nips/CarpentierM12
Alexandra Carpentier, Rémi Munos:
Adaptive Stratified Sampling for Monte-Carlo integration of Differentiable functions. NIPS 2012: 251-259
[c53]
- view
- export record
  dblp key:
  - conf/nips/FruitetCMC12
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/nips/FruitetCMC12
Joan Fruitet, Alexandra Carpentier, Rémi Munos, Maureen Clerc:
Bandit Algorithms boost Brain Computer Interfaces for motor-task selection of a brain-controlled button. NIPS 2012: 458-466
[c52]
- view
- export record
  dblp key:
  - conf/nips/SaniLM12
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/nips/SaniLM12
Amir Sani, Alessandro Lazaric, Rémi Munos:
Risk-Aversion in Multi-armed Bandits. NIPS 2012: 3284-3292
[c51]
- view
  - electronic edition @ mlr.press (open access)
  - details & citations
- export record
  dblp key:
  - journals/jmlr/BusoniuM12
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/jmlr/BusoniuM12
Lucian Busoniu, Rémi Munos:
Optimistic planning for Markov decision processes. AISTATS 2012: 182-189
[c50]
- view
  - electronic edition @ mlr.press (open access)
  - details & citations
- export record
  dblp key:
  - journals/jmlr/CarpentierM12
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/jmlr/CarpentierM12
Alexandra Carpentier, Rémi Munos:
Bandit Theory meets Compressed Sensing for high dimensional Stochastic Linear Bandit. AISTATS 2012: 190-198
[p1]
- view
  authority control:
- export record
  dblp key:
  - books/sp/12/BusoniuLGMBS12
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/books/sp/12/BusoniuLGMBS12
Lucian Busoniu, Alessandro Lazaric, Mohammad Ghavamzadeh, Rémi Munos, Robert Babuska, Bart De Schutter:
Least-Squares Methods for Policy Iteration. Reinforcement Learning 2012: 75-109
[i5]
- view
  - electronic edition @ arxiv.org (open access)
  - details & citations
- export record
  dblp key:
  - journals/corr/abs-1205-4217
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-1205-4217
Emilie Kaufmann, Nathaniel Korda, Rémi Munos:
Thompson Sampling: An Optimal Finite Time Analysis. CoRR abs/1205.4217 (2012)
[i4]
- view
  - electronic edition @ arxiv.org (open access)
  - details & citations
- export record
  dblp key:
  - journals/corr/abs-1209-2693
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-1209-2693
Ronald Ortner, Daniil Ryabko, Peter Auer, Rémi Munos:
Regret Bounds for Restless Markov Bandits. CoRR abs/1209.2693 (2012)
2011
[j12]
- view
  authority control:
- export record
  dblp key:
  - journals/jmlr/BubeckMSS11
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/jmlr/BubeckMSS11
Sébastien Bubeck, Rémi Munos, Gilles Stoltz, Csaba Szepesvári:
X-Armed Bandits. J. Mach. Learn. Res. 12: 1655-1695 (2011)
[j11]
- view
  - electronic edition via DOI (open access)
  - details & citations
  authority control:
- export record
  dblp key:
  - journals/tcs/BubeckMS11
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/tcs/BubeckMS11
Sébastien Bubeck, Rémi Munos, Gilles Stoltz:
Pure exploration in finitely-armed and continuous-armed bandits. Theor. Comput. Sci. 412(19): 1832-1852 (2011)
[c49]
- view
  authority control:
- export record
  dblp key:
  - conf/adprl/BusoniuMSB11
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/adprl/BusoniuMSB11
Lucian Busoniu, Rémi Munos, Bart De Schutter, Robert Babuska:
Optimistic planning for sparsely stochastic systems. ADPRL 2011: 48-55
[c48]
- view
  authority control:
- export record
  dblp key:
  - conf/alt/CarpentierLGMA11
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/alt/CarpentierLGMA11
Alexandra Carpentier, Alessandro Lazaric, Mohammad Ghavamzadeh, Rémi Munos, Peter Auer:
Upper-Confidence-Bound Algorithms for Active Learning in Multi-armed Bandits. ALT 2011: 189-203
[c47]
- view
  authority control:
- export record
  dblp key:
  - conf/ewrl/HoffmanLGM11
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/ewrl/HoffmanLGM11
Matthew W. Hoffman, Alessandro Lazaric, Mohammad Ghavamzadeh, Rémi Munos:
Regularized Least Squares Temporal Difference Learning with Nested ℓ2 and ℓ1 Penalization. EWRL 2011: 102-114
[c46]
- view
  - electronic edition @ icml.cc (open access)
  - details & citations
- export record
  dblp key:
  - conf/icml/GhavamzadehLMH11
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/icml/GhavamzadehLMH11
Mohammad Ghavamzadeh, Alessandro Lazaric, Rémi Munos, Matthew W. Hoffman:
Finite-Sample Analysis of Lasso-TD. ICML 2011: 1177-1184
[c45]
- view
- export record
  dblp key:
  - conf/nips/Munos11
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/nips/Munos11
Rémi Munos:
Optimistic Optimization of a Deterministic Function without the Knowledge of its Smoothness. NIPS 2011: 783-791
[c44]
- view
- export record
  dblp key:
  - conf/nips/CarpentierM11
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/nips/CarpentierM11
Alexandra Carpentier, Rémi Munos:
Finite Time Analysis of Stratified Sampling for Monte Carlo. NIPS 2011: 1278-1286
[c43]
- view
- export record
  dblp key:
  - conf/nips/CarpentierMM11
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/nips/CarpentierMM11
Alexandra Carpentier, Odalric-Ambrym Maillard, Rémi Munos:
Sparse Recovery with Brownian Sensing. NIPS 2011: 1782-1790
[c42]
- view
- export record
  dblp key:
  - conf/nips/AzarMGK11
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/nips/AzarMGK11
Mohammad Gheshlaghi Azar, Rémi Munos, Mohammad Ghavamzadeh, Hilbert J. Kappen:
Speedy Q-Learning. NIPS 2011: 2411-2419
[c41]
- view
- export record
  dblp key:
  - conf/nips/MaillardMR11
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/nips/MaillardMR11
Odalric-Ambrym Maillard, Rémi Munos, Daniil Ryabko:
Selecting the State-Representation in Reinforcement Learning. NIPS 2011: 2627-2635
[c40]
- view
  - electronic edition @ mlr.press (open access)
  - details & citations
- export record
  dblp key:
  - journals/jmlr/MaillardMS11
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/jmlr/MaillardMS11
Odalric-Ambrym Maillard, Rémi Munos, Gilles Stoltz:
A Finite-Time Analysis of Multi-armed Bandits Problems with Kullback-Leibler Divergences. COLT 2011: 497-514
[c39]
- view
  - electronic edition @ mlr.press (open access)
  - details & citations
- export record
  dblp key:
  - journals/jmlr/MaillardM11
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/jmlr/MaillardM11
Odalric-Ambrym Maillard, Rémi Munos:
Adaptive Bandits: Towards the best history-dependent strategy. AISTATS 2011: 570-578
2010
[c38]
- view
- export record
  dblp key:
  - conf/colt/AudibertBM10
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/colt/AudibertBM10
Jean-Yves Audibert, Sébastien Bubeck, Rémi Munos:
Best Arm Identification in Multi-Armed Bandits. COLT 2010: 41-53
[c37]
- view
- export record
  dblp key:
  - conf/colt/BubeckM10
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/colt/BubeckM10
Sébastien Bubeck, Rémi Munos:
Open Loop Optimistic Planning. COLT 2010: 477-489
[c36]
- view
  - electronic edition @ icml.cc (open access)
  - details & citations
- export record
  dblp key:
  - conf/icml/LazaricGM10
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/icml/LazaricGM10
Alessandro Lazaric, Mohammad Ghavamzadeh, Rémi Munos:
Analysis of a Classification-based Policy Iteration Algorithm. ICML 2010: 607-614
[c35]
- view
  - electronic edition @ icml.cc (open access)
  - details & citations
- export record
  dblp key:
  - conf/icml/LazaricGM10a
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/icml/LazaricGM10a
Alessandro Lazaric, Mohammad Ghavamzadeh, Rémi Munos:
Finite-Sample Analysis of LSTD. ICML 2010: 615-622
[c34]
- view
- export record
  dblp key:
  - conf/nips/FarahmandMS10
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/nips/FarahmandMS10
Amir Massoud Farahmand, Rémi Munos, Csaba Szepesvári:
Error Propagation for Approximate Policy and Value Iteration. NIPS 2010: 568-576
[c33]
- view
- export record
  dblp key:
  - conf/nips/GhavamzadehLMM10
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/nips/GhavamzadehLMM10
Mohammad Ghavamzadeh, Alessandro Lazaric, Odalric-Ambrym Maillard, Rémi Munos:
LSTD with Random Projections. NIPS 2010: 721-729
[c32]
- view
- export record
  dblp key:
  - conf/nips/MaillardM10
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/nips/MaillardM10
Odalric-Ambrym Maillard, Rémi Munos:
Scrambled Objects for Least-Squares Regression. NIPS 2010: 1549-1557
[c31]
- view
  - electronic edition via DOI (open access)
  - details & citations
  authority control:
- export record
  dblp key:
  - conf/pkdd/MaillardM10
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/pkdd/MaillardM10
Odalric-Ambrym Maillard, Rémi Munos:
Online Learning in Adversarial Lipschitz Environments. ECML/PKDD (2) 2010: 305-320
[c30]
- view
  - electronic edition @ mlr.press (open access)
  - details & citations
- export record
  dblp key:
  - journals/jmlr/MaillardMLG10
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/jmlr/MaillardMLG10
Odalric-Ambrym Maillard, Rémi Munos, Alessandro Lazaric, Mohammad Ghavamzadeh:
Finite-sample Analysis of Bellman Residual Minimization. ACML 2010: 299-314
[i3]
- view
  - electronic edition @ arxiv.org (open access)
  - details & citations
- export record
  dblp key:
  - journals/corr/abs-1001-4475
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-1001-4475
Sébastien Bubeck, Rémi Munos, Gilles Stoltz, Csaba Szepesvári:
X-Armed Bandits. CoRR abs/1001.4475 (2010)

2000 – 2009

2009
[j10]
- view
  - electronic edition via DOI (open access)
  - details & citations
  authority control:
- export record
  dblp key:
  - journals/tcs/AudibertMS09
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/tcs/AudibertMS09
Jean-Yves Audibert, Rémi Munos, Csaba Szepesvári:
Exploration-exploitation tradeoff using variance estimates in multi-armed bandits. Theor. Comput. Sci. 410(19): 1876-1902 (2009)
[c29]
- view
  authority control:
- export record
  dblp key:
  - conf/alt/BubeckMS09
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/alt/BubeckMS09
Sébastien Bubeck, Rémi Munos, Gilles Stoltz:
Pure Exploration in Multi-armed Bandits Problems. ALT 2009: 23-37
[c28]
- view
  - electronic edition @ mcgill.ca (open access)
  - details & citations
- export record
  dblp key:
  - conf/colt/LazaricM09
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/colt/LazaricM09
Alessandro Lazaric, Rémi Munos:
Hybrid Stochastic-Adversarial On-line Learning. COLT 2009
[c27]
- view
  authority control:
- export record
  dblp key:
  - conf/icml/AudibertALMRS09
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/icml/AudibertALMRS09
Jean-Yves Audibert, Peter Auer, Alessandro Lazaric, Rémi Munos, Daniil Ryabko, Csaba Szepesvári:
Workshop summary: On-line learning with limited feedback. ICML 2009: 8
[c26]
- view
- export record
  dblp key:
  - conf/nips/CoquelinDM09
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/nips/CoquelinDM09
Pierre-Arnaud Coquelin, Romain Deguest, Rémi Munos:
Sensitivity analysis in HMMs with application to likelihood maximization. NIPS 2009: 387-395
[c25]
- view
- export record
  dblp key:
  - conf/nips/MaillardM09
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/nips/MaillardM09
Odalric-Ambrym Maillard, Rémi Munos:
Compressed Least-Squares Regression. NIPS 2009: 1213-1221
2008
[j9]
- view
  authority control:
- export record
  dblp key:
  - journals/jmlr/MunosS08
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/jmlr/MunosS08
Rémi Munos, Csaba Szepesvári:
Finite-Time Bounds for Fitted Value Iteration. J. Mach. Learn. Res. 9: 815-857 (2008)
[j8]
- view
  authority control:
- export record
  dblp key:
  - journals/ml/AntosSM08
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/ml/AntosSM08
András Antos, Csaba Szepesvári, Rémi Munos:
Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path. Mach. Learn. 71(1): 89-129 (2008)
[c24]
- view
  authority control:
- export record
  dblp key:
  - conf/ecai/MaitrepierreMM08
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/ecai/MaitrepierreMM08
Raphaël Maîtrepierre, Jérémie Mary, Rémi Munos:
Adaptive play in Texas Hold'em Poker. ECAI 2008: 458-462
[c23]
- view
  authority control:
- export record
  dblp key:
  - conf/ewrl/HrenM08
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/ewrl/HrenM08
Jean-François Hren, Rémi Munos:
Optimistic Planning of Deterministic Systems. EWRL 2008: 151-164
[c22]
- view
- export record
  dblp key:
  - conf/nips/BubeckMSS08
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/nips/BubeckMSS08
Sébastien Bubeck, Rémi Munos, Gilles Stoltz, Csaba Szepesvári:
Online Optimization in X-Armed Bandits. NIPS 2008: 201-208
[c21]
- view
- export record
  dblp key:
  - conf/nips/CoquelinDM08
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/nips/CoquelinDM08
Pierre-Arnaud Coquelin, Romain Deguest, Rémi Munos:
Particle Filter-based Policy Gradient in POMDPs. NIPS 2008: 337-344
[c20]
- view
- export record
  dblp key:
  - conf/nips/WangAM08
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/nips/WangAM08
Yizao Wang, Jean-Yves Audibert, Rémi Munos:
Algorithms for Infinitely Many-Armed Bandits. NIPS 2008: 1729-1736
[e1]
- view
  authority control:
- export record
  dblp key:
  - conf/ewrl/2008
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/ewrl/2008
Sertan Girgin, Manuel Loth, Rémi Munos, Philippe Preux, Daniil Ryabko:
Recent Advances in Reinforcement Learning, 8th European Workshop, EWRL 2008, Villeneuve d'Ascq, France, June 30 - July 3, 2008, Revised and Selected Papers. Lecture Notes in Computer Science 5323, Springer 2008, ISBN 978-3-540-89721-7
[i2]
- view
  - electronic edition @ arxiv.org (open access)
  - details & citations
- export record
  dblp key:
  - journals/corr/abs-0802-2655
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-0802-2655
Sébastien Bubeck, Rémi Munos, Gilles Stoltz:
Pure Exploration for Multi-Armed Bandit Problems. CoRR abs/0802.2655 (2008)
2007
[j7]
- view
  authority control:
- export record
  dblp key:
  - journals/ria/Munos07
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/ria/Munos07
Rémi Munos:
Analyse en norme Lp de l'algorithme d'itérations sur les valeurs avec approximations. Rev. d'Intelligence Artif. 21(1): 53-74 (2007)
[j6]
- view
  authority control:
- export record
  dblp key:
  - journals/siamco/Munos07
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/siamco/Munos07
Rémi Munos:
Performance Bounds in L_p-norm for Approximate Value Iteration. SIAM J. Control. Optim. 46(2): 541-561 (2007)
[c19]
- view
  authority control:
- export record
  dblp key:
  - conf/alt/AudibertMS07
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/alt/AudibertMS07
Jean-Yves Audibert, Rémi Munos, Csaba Szepesvári:
Tuning Bandit Algorithms in Stochastic Environments. ALT 2007: 150-165
[c18]
- view
- export record
  dblp key:
  - conf/nips/AntosMS07
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/nips/AntosMS07
András Antos, Rémi Munos, Csaba Szepesvári:
Fitted Q-iteration in continuous action-space MDPs. NIPS 2007: 9-16
[c17]
- view
  authority control:
- export record
  dblp key:
  - conf/uai/CoquelinM07
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/uai/CoquelinM07
Pierre-Arnaud Coquelin, Rémi Munos:
Bandit Algorithms for Tree Search. UAI 2007: 67-74
[i1]
- view
  - electronic edition @ arxiv.org (open access)
  - details & citations
- export record
  dblp key:
  - journals/corr/abs-cs-0703062
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-cs-0703062
Pierre-Arnaud Coquelin, Rémi Munos:
Bandit Algorithms for Tree Search. CoRR abs/cs/0703062 (2007)
2006
[j5]
- view
  - electronic edition @ jmlr.org (open access)
  - details & citations
- export record
  dblp key:
  - journals/jmlr/Munos06
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/jmlr/Munos06
Rémi Munos:
Geometric Variance Reduction in Markov Chains: Application to Value Function and Gradient Estimation. J. Mach. Learn. Res. 7: 413-427 (2006)
[j4]
- view
  - electronic edition @ jmlr.org (open access)
  - details & citations
- export record
  dblp key:
  - journals/jmlr/Munos06a
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/jmlr/Munos06a
Rémi Munos:
Policy Gradient in Continuous Time. J. Mach. Learn. Res. 7: 771-791 (2006)
[c16]
- view
  authority control:
- export record
  dblp key:
  - conf/colt/AntosSM06
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/colt/AntosSM06
András Antos, Csaba Szepesvári, Rémi Munos:
Learning Near-Optimal Policies with Bellman-Residual Minimization Based Fitted Policy Iteration and a Single Sample Path. COLT 2006: 574-588
2005
[j3]
- view
  authority control:
- export record
  dblp key:
  - journals/siamco/GobetM05
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/siamco/GobetM05
Emmanuel Gobet, Rémi Munos:
Sensitivity Analysis Using Itô-Malliavin Calculus and Martingales, and Application to Stochastic Optimal Control. SIAM J. Control. Optim. 43(5): 1676-1713 (2005)
[c15]
- view
- export record
  dblp key:
  - conf/aaai/Munos05
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/aaai/Munos05
Rémi Munos:
Error Bounds for Approximate Value Iteration. AAAI 2005: 1006-1011
[c14]
- view
- export record
  dblp key:
  - conf/aaai/Munos05a
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/aaai/Munos05a
Rémi Munos:
Geometric Variance Reduction in Markov Chains. Application to Value Function and Gradient Estimation. AAAI 2005: 1012-1017
[c13]
- no documents available
  - details & citations
- export record
  dblp key:
  - conf/cfap/Munos05
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/cfap/Munos05
Rémi Munos:
Policy gradient in continuous time. CAP 2005: 201-216
[c12]
- view
  authority control:
- export record
  dblp key:
  - conf/icml/SzepesvariM05
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/icml/SzepesvariM05
Csaba Szepesvári, Rémi Munos:
Finite time bounds for sampling based fitted value iteration. ICML 2005: 880-887
2003
[c11]
- view
  - electronic edition @ aaai.org
  - details & citations
- export record
  dblp key:
  - conf/icml/Munos03
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/icml/Munos03
Rémi Munos:
Error Bounds for Approximate Policy Iteration. ICML 2003: 560-567
2002
[j2]
- view
  authority control:
- export record
  dblp key:
  - journals/ml/MunosM02
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/ml/MunosM02
Rémi Munos, Andrew W. Moore:
Variable Resolution Discretization in Optimal Control. Mach. Learn. 49(2-3): 291-323 (2002)
2001
[c10]
- view
- export record
  dblp key:
  - conf/nips/Munos01
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/nips/Munos01
Rémi Munos:
Efficient Resources Allocation for Markov Decision Processes. NIPS 2001: 1571-1578
2000
[j1]
- view
  authority control:
- export record
  dblp key:
  - journals/ml/Munos00
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/ml/Munos00
Rémi Munos:
A Study of Reinforcement Learning in the Continuous Case by the Means of Viscosity Solutions. Mach. Learn. 40(3): 265-299 (2000)
[c9]
- no documents available
  - details & citations
- export record
  dblp key:
  - conf/icml/MunosM00
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/icml/MunosM00
Rémi Munos, Andrew W. Moore:
Rates of Convergence for Variable Resolution Schemes in Optimal Control. ICML 2000: 647-654

1990 – 1999

1999
[c8]
- view
  - electronic edition @ ijcai.org (open access)
  - details & citations
- export record
  dblp key:
  - conf/ijcai/MunosM99
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/ijcai/MunosM99
Rémi Munos, Andrew W. Moore:
Variable Resolution Discretization for High-Accuracy Solutions of Optimal Control Problems. IJCAI 1999: 1348-1355
[c7]
- view
  authority control:
- export record
  dblp key:
  - conf/ijcnn/MunosB099
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/ijcnn/MunosB099
Rémi Munos, Leemon C. Baird III, Andrew W. Moore:
Gradient descent approaches to neural-net-based solutions of the Hamilton-Jacobi-Bellman equation. IJCNN 1999: 2152-2157
1998
[c6]
- view
  authority control:
- export record
  dblp key:
  - conf/ecml/Munos98
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/ecml/Munos98
Rémi Munos:
A General Convergence Method for Reinforcement Learning in the Continuous Case. ECML 1998: 394-405
[c5]
- view
- export record
  dblp key:
  - conf/nips/MunosM98
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/nips/MunosM98
Rémi Munos, Andrew W. Moore:
Barycentric Interpolators for Continuous Space and Time Reinforcement Learning. NIPS 1998: 1024-1030
1997
[c4]
- view
  authority control:
- export record
  dblp key:
  - conf/ecml/Munos97
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/ecml/Munos97
Rémi Munos:
Finite-Element Methods with Local Triangulation Refinement for Continuous Reinforcement Learning Problems. ECML 1997: 170-182
[c3]
- view
  - electronic edition @ ijcai.org (open access)
  - details & citations
- export record
  dblp key:
  - conf/ijcai/Munos97
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/ijcai/Munos97
Rémi Munos:
A Convergent Reinforcement Learning Algorithm in the Continuous Case Based on a Finite Difference Method. IJCAI (2) 1997: 826-831
[c2]
- view
  - electronic edition @ nips.cc (open access)
  - details & citations
- export record
  dblp key:
  - conf/nips/MunosB97
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/nips/MunosB97
Rémi Munos, Paul Bourgine:
Reinforcement Learning for Continuous Stochastic Control Problems. NIPS 1997: 1029-1035
1996
[c1]
- no documents available
  - details & citations
- export record
  dblp key:
  - conf/icml/Munos96
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/icml/Munos96
Rémi Munos:
A Convergent Reinforcement Learning Algorithm in the Continuous Case: The Finite-Element Reinforcement Learning. ICML 1996: 337-345
