


Michal Valko
Person information

- affiliation: DeepMind
2020 – today
- 2023
- [c101] Mehdi Azabou, Venkataramana Ganesh, Shantanu Thakoor, Chi-Heng Lin, Lakshmi Sathidevi, Ran Liu, Michal Valko, Petar Velickovic, Eva L. Dyer: Half-Hop: A graph upsampling approach for slowing down message passing. ICML 2023: 1341-1360
- [c100] Côme Fiegel, Pierre Ménard, Tadashi Kozuno, Rémi Munos, Vianney Perchet, Michal Valko: Adapting to game trees in zero-sum imperfect information games. ICML 2023: 10093-10135
- [c99] Daniel Jarrett, Corentin Tallec, Florent Altché, Thomas Mesnard, Rémi Munos, Michal Valko: Curiosity in Hindsight: Intrinsic Exploration in Stochastic Environments. ICML 2023: 14780-14816
- [c98] Toshinori Kitamura, Tadashi Kozuno, Yunhao Tang, Nino Vieillard, Michal Valko, Wenhao Yang, Jincheng Mei, Pierre Ménard, Mohammad Gheshlaghi Azar, Rémi Munos, Olivier Pietquin, Matthieu Geist, Csaba Szepesvári, Wataru Kumagai, Yutaka Matsuo: Regularization and Variance-Weighted Regression Achieves Minimax Optimality in Linear MDPs: Theory and Practice. ICML 2023: 17135-17175
- [c97] Thomas Mesnard, Wenqi Chen, Alaa Saade, Yunhao Tang, Mark Rowland, Theophane Weber, Clare Lyle, Audrunas Gruslys, Michal Valko, Will Dabney, Georg Ostrovski, Eric Moulines, Rémi Munos: Quantile Credit Assignment. ICML 2023: 24517-24531
- [c96] Yunhao Tang, Zhaohan Daniel Guo, Pierre Harvey Richemond, Bernardo Ávila Pires, Yash Chandak, Rémi Munos, Mark Rowland, Mohammad Gheshlaghi Azar, Charline Le Lan, Clare Lyle, András György, Shantanu Thakoor, Will Dabney, Bilal Piot, Daniele Calandriello, Michal Valko: Understanding Self-Predictive Learning for Reinforcement Learning. ICML 2023: 33632-33656
- [c95] Yunhao Tang, Tadashi Kozuno, Mark Rowland, Anna Harutyunyan, Rémi Munos, Bernardo Ávila Pires, Michal Valko: DoMo-AC: Doubly Multi-step Off-policy Actor-Critic Algorithm. ICML 2023: 33657-33673
- [c94] Yunhao Tang, Rémi Munos, Mark Rowland, Michal Valko: VA-learning as a more efficient alternative to Q-learning. ICML 2023: 33739-33757
- [c93] Daniil Tiapkin, Denis Belomestny, Daniele Calandriello, Eric Moulines, Rémi Munos, Alexey Naumov, Pierre Perrault, Yunhao Tang, Michal Valko, Pierre Ménard: Fast Rates for Maximum Entropy Exploration. ICML 2023: 34161-34221
- [i73] Daniil Tiapkin, Denis Belomestny, Daniele Calandriello, Eric Moulines, Rémi Munos, Alexey Naumov, Pierre Perrault, Yunhao Tang, Michal Valko, Pierre Ménard: Fast Rates for Maximum Entropy Exploration. CoRR abs/2303.08059 (2023)
- [i72] Alaa Saade, Steven Kapturowski, Daniele Calandriello, Charles Blundell, Pablo Sprechmann, Leopoldo Sarra, Oliver Groth, Michal Valko, Bilal Piot: Unlocking the Power of Representations in Long-term Novelty-based Exploration. CoRR abs/2305.01521 (2023)
- [i71] Toshinori Kitamura, Tadashi Kozuno, Yunhao Tang, Nino Vieillard, Michal Valko, Wenhao Yang, Jincheng Mei, Pierre Ménard, Mohammad Gheshlaghi Azar, Rémi Munos, Olivier Pietquin, Matthieu Geist, Csaba Szepesvári, Wataru Kumagai, Yutaka Matsuo: Regularization and Variance-Weighted Regression Achieves Minimax Optimality in Linear MDPs: Theory and Practice. CoRR abs/2305.13185 (2023)
- [i70] Yunhao Tang, Rémi Munos, Mark Rowland, Michal Valko: VA-learning as a more efficient alternative to Q-learning. CoRR abs/2305.18161 (2023)
- [i69] Yunhao Tang, Tadashi Kozuno, Mark Rowland, Anna Harutyunyan, Rémi Munos, Bernardo Ávila Pires, Michal Valko: DoMo-AC: Doubly Multi-step Off-policy Actor-Critic Algorithm. CoRR abs/2305.18501 (2023)
- [i68] Mehdi Azabou, Venkataramana Ganesh, Shantanu Thakoor, Chi-Heng Lin, Lakshmi Sathidevi, Ran Liu, Michal Valko, Petar Velickovic, Eva L. Dyer: Half-Hop: A graph upsampling approach for slowing down message passing. CoRR abs/2308.09198 (2023)
- [i67] Côme Fiegel, Pierre Ménard, Tadashi Kozuno, Rémi Munos, Vianney Perchet, Michal Valko: Local and adaptive mirror descents in extensive-form games. CoRR abs/2309.00656 (2023)
- 2022
- [c92] Yunhao Tang, Mark Rowland, Rémi Munos, Michal Valko: Marginalized Operators for Off-policy Reinforcement Learning. AISTATS 2022: 655-679
- [c91] Jean Tarbouriech, Omar Darwiche Domingues, Pierre Ménard, Matteo Pirotta, Michal Valko, Alessandro Lazaric: Adaptive Multi-Goal Exploration. AISTATS 2022: 7349-7383
- [c90] Shantanu Thakoor, Corentin Tallec, Mohammad Gheshlaghi Azar, Mehdi Azabou, Eva L. Dyer, Rémi Munos, Petar Velickovic, Michal Valko: Large-Scale Representation Learning on Graphs via Bootstrapping. ICLR 2022
- [c89] Daniele Calandriello, Luigi Carratino, Alessandro Lazaric, Michal Valko, Lorenzo Rosasco: Scaling Gaussian Process Optimization by Evaluating a Few Unique Candidates Multiple Times. ICML 2022: 2523-2541
- [c88] Anirudh Goyal, Abram L. Friesen, Andrea Banino, Theophane Weber, Nan Rosemary Ke, Adrià Puigdomènech Badia, Arthur Guez, Mehdi Mirza, Peter C. Humphreys, Ksenia Konyushkova, Michal Valko, Simon Osindero, Timothy P. Lillicrap, Nicolas Heess, Charles Blundell: Retrieval-Augmented Reinforcement Learning. ICML 2022: 7740-7765
- [c87] Daniil Tiapkin, Denis Belomestny, Eric Moulines, Alexey Naumov, Sergey Samsonov, Yunhao Tang, Michal Valko, Pierre Ménard: From Dirichlet to Rubin: Optimistic Exploration in RL without Bonuses. ICML 2022: 21380-21431
- [c86] Zhaohan Guo, Shantanu Thakoor, Miruna Pislar, Bernardo Ávila Pires, Florent Altché, Corentin Tallec, Alaa Saade, Daniele Calandriello, Jean-Bastien Grill, Yunhao Tang, Michal Valko, Rémi Munos, Mohammad Gheshlaghi Azar, Bilal Piot: BYOL-Explore: Exploration by Bootstrapped Prediction. NeurIPS 2022
- [c85] Daniil Tiapkin, Denis Belomestny, Daniele Calandriello, Eric Moulines, Rémi Munos, Alexey Naumov, Mark Rowland, Michal Valko, Pierre Ménard: Optimistic Posterior Sampling for Reinforcement Learning with Few Samples and Tight Guarantees. NeurIPS 2022
- [i66] Daniele Calandriello, Luigi Carratino, Alessandro Lazaric, Michal Valko, Lorenzo Rosasco: Scaling Gaussian Process Optimization by Evaluating a Few Unique Candidates Multiple Times. CoRR abs/2201.12909 (2022)
- [i65] Anirudh Goyal, Abram L. Friesen, Andrea Banino, Theophane Weber, Nan Rosemary Ke, Adrià Puigdomènech Badia, Arthur Guez, Mehdi Mirza, Ksenia Konyushkova, Michal Valko, Simon Osindero, Timothy P. Lillicrap, Nicolas Heess, Charles Blundell: Retrieval-Augmented Reinforcement Learning. CoRR abs/2202.08417 (2022)
- [i64] Yunhao Tang, Mark Rowland, Rémi Munos, Michal Valko: Marginalized Operators for Off-policy Reinforcement Learning. CoRR abs/2203.16177 (2022)
- [i63] Daniil Tiapkin, Denis Belomestny, Eric Moulines, Alexey Naumov, Sergey Samsonov, Yunhao Tang, Michal Valko, Pierre Ménard: From Dirichlet to Rubin: Optimistic Exploration in RL without Bonuses. CoRR abs/2205.07704 (2022)
- [i62] Tadashi Kozuno, Wenhao Yang, Nino Vieillard, Toshinori Kitamura, Yunhao Tang, Jincheng Mei, Pierre Ménard, Mohammad Gheshlaghi Azar, Michal Valko, Rémi Munos, Olivier Pietquin, Matthieu Geist, Csaba Szepesvári: KL-Entropy-Regularized RL with a Generative Model is Minimax Optimal. CoRR abs/2205.14211 (2022)
- [i61] Zhaohan Daniel Guo, Shantanu Thakoor, Miruna Pislar, Bernardo Ávila Pires, Florent Altché, Corentin Tallec, Alaa Saade, Daniele Calandriello, Jean-Bastien Grill, Yunhao Tang, Michal Valko, Rémi Munos, Mohammad Gheshlaghi Azar, Bilal Piot: BYOL-Explore: Exploration by Bootstrapped Prediction. CoRR abs/2206.08332 (2022)
- [i60] Daniil Tiapkin, Denis Belomestny, Daniele Calandriello, Eric Moulines, Rémi Munos, Alexey Naumov, Mark Rowland, Michal Valko, Pierre Ménard: Optimistic Posterior Sampling for Reinforcement Learning with Few Samples and Tight Guarantees. CoRR abs/2209.14414 (2022)
- [i59] Daniel Jarrett, Corentin Tallec, Florent Altché, Thomas Mesnard, Rémi Munos, Michal Valko: Curiosity in hindsight. CoRR abs/2211.10515 (2022)
- [i58] Yunhao Tang, Zhaohan Daniel Guo, Pierre Harvey Richemond, Bernardo Ávila Pires, Yash Chandak, Rémi Munos, Mark Rowland, Mohammad Gheshlaghi Azar, Charline Le Lan, Clare Lyle, András György, Shantanu Thakoor, Will Dabney, Bilal Piot, Daniele Calandriello, Michal Valko: Understanding Self-Predictive Learning for Reinforcement Learning. CoRR abs/2212.03319 (2022)
- [i57] Côme Fiegel, Pierre Ménard, Tadashi Kozuno, Rémi Munos, Vianney Perchet, Michal Valko: Adapting to game trees in zero-sum imperfect information games. CoRR abs/2212.12567 (2022)
- 2021
- [j5] Karl Tuyls, Shayegan Omidshafiei, Paul Muller, Zhe Wang, Jerome T. Connor, Daniel Hennes, Ian Graham, William Spearman, Tim Waskett, Dafydd Steele, Pauline Luc, Adrià Recasens, Alexandre Galashov, Gregory Thornton, Romuald Elie, Pablo Sprechmann, Pol Moreno, Kris Cao, Marta Garnelo, Praneet Dutta, Michal Valko, Nicolas Heess, Alex Bridgland, Julien Pérolat, Bart De Vylder, S. M. Ali Eslami, Mark Rowland, Andrew Jaegle, Rémi Munos, Trevor Back, Razia Ahamed, Simon Bouton, Nathalie Beauguerlange, Jackson Broshear, Thore Graepel, Demis Hassabis: Game Plan: What AI can do for Football, and What Football can do for AI. J. Artif. Intell. Res. 71: 41-88 (2021)
- [j4] Guillaume Gautier, Rémi Bardenet, Michal Valko: Fast sampling from β-ensembles. Stat. Comput. 31(1): 7 (2021)
- [c84] Omar Darwiche Domingues, Pierre Ménard, Matteo Pirotta, Emilie Kaufmann, Michal Valko: A Kernel-Based Approach to Non-Stationary Reinforcement Learning in Metric Spaces. AISTATS 2021: 3538-3546
- [c83] Omar Darwiche Domingues, Pierre Ménard, Emilie Kaufmann, Michal Valko: Episodic Reinforcement Learning in Finite MDPs: Minimax Lower Bounds Revisited. ALT 2021: 578-598
- [c82] Emilie Kaufmann, Pierre Ménard, Omar Darwiche Domingues, Anders Jonsson, Edouard Leurent, Michal Valko: Adaptive Reward-Free Exploration. ALT 2021: 865-891
- [c81] Jean Tarbouriech, Matteo Pirotta, Michal Valko, Alessandro Lazaric: Sample Complexity Bounds for Stochastic Shortest Path with a Generative Model. ALT 2021: 1157-1178
- [c80] Adrià Recasens, Pauline Luc, Jean-Baptiste Alayrac, Luyu Wang, Florian Strub, Corentin Tallec, Mateusz Malinowski, Viorica Patraucean, Florent Altché, Michal Valko, Jean-Bastien Grill, Aäron van den Oord, Andrew Zisserman: Broaden Your Views for Self-Supervised Video Learning. ICCV 2021: 1235-1245
- [c79] Omar Darwiche Domingues, Pierre Ménard, Matteo Pirotta, Emilie Kaufmann, Michal Valko: Kernel-Based Reinforcement Learning: A Finite-Time Analysis. ICML 2021: 2783-2792
- [c78] Xavier Fontaine, Pierre Perrault, Michal Valko, Vianney Perchet: Online A-Optimal Design and Active Linear Regression. ICML 2021: 3374-3383
- [c77] Tadashi Kozuno, Yunhao Tang, Mark Rowland, Rémi Munos, Steven Kapturowski, Will Dabney, Michal Valko, David Abel: Revisiting Peng's Q(λ) for Modern Reinforcement Learning. ICML 2021: 5794-5804
- [c76] Pierre Ménard, Omar Darwiche Domingues, Anders Jonsson, Emilie Kaufmann, Edouard Leurent, Michal Valko: Fast active learning for pure exploration in reinforcement learning. ICML 2021: 7599-7608
- [c75] Pierre Ménard, Omar Darwiche Domingues, Xuedong Shang, Michal Valko: UCB Momentum Q-learning: Correcting the bias without forgetting. ICML 2021: 7609-7618
- [c74] Yunhao Tang, Mark Rowland, Rémi Munos, Michal Valko: Taylor Expansion of Discount Factors. ICML 2021: 10130-10140
- [c73] Yunhao Tang, Tadashi Kozuno, Mark Rowland, Rémi Munos, Michal Valko: Unifying Gradient Estimators for Meta-Reinforcement Learning via Off-Policy Evaluation. NeurIPS 2021: 5303-5315
- [c72] Jean Tarbouriech, Runlong Zhou, Simon S. Du, Matteo Pirotta, Michal Valko, Alessandro Lazaric: Stochastic Shortest Path: Minimax, Parameter-Free and Towards Horizon-Free Regret. NeurIPS 2021: 6843-6855
- [c71] Jean Tarbouriech, Matteo Pirotta, Michal Valko, Alessandro Lazaric: A Provably Efficient Sample Collection Strategy for Reinforcement Learning. NeurIPS 2021: 7611-7624
- [c70] Ran Liu, Mehdi Azabou, Max Dabagia, Chi-Heng Lin, Mohammad Gheshlaghi Azar, Keith B. Hengen, Michal Valko, Eva L. Dyer: Drop, Swap, and Generate: A Self-Supervised Approach for Generating Neural Activity. NeurIPS 2021: 10587-10599
- [c69] Tadashi Kozuno, Pierre Ménard, Rémi Munos, Michal Valko: Learning in two-player zero-sum partially observable Markov games with perfect recall. NeurIPS 2021: 11987-11998
- [i56] Pierre Perrault, Jennifer Healey, Zheng Wen, Michal Valko: On the Approximation Relationship between Optimizing Ratio of Submodular (RS) and Difference of Submodular (DS) Functions. CoRR abs/2101.01631 (2021)
- [i55] Zhaohan Daniel Guo, Mohammad Gheshlaghi Azar, Alaa Saade, Shantanu Thakoor, Bilal Piot, Bernardo Ávila Pires, Michal Valko, Thomas Mesnard, Tor Lattimore, Rémi Munos: Geometric Entropic Exploration. CoRR abs/2101.02055 (2021)
- [i54] Shantanu Thakoor, Corentin Tallec, Mohammad Gheshlaghi Azar, Rémi Munos, Petar Velickovic, Michal Valko: Bootstrapped Representation Learning on Graphs. CoRR abs/2102.06514 (2021)
- [i53] Mehdi Azabou, Mohammad Gheshlaghi Azar, Ran Liu, Chi-Heng Lin, Erik C. Johnson, Kiran Bhaskaran-Nair, Max Dabagia, Keith B. Hengen, William R. Gray Roncal, Michal Valko, Eva L. Dyer: Mine Your Own vieW: Self-Supervised Learning Through Across-Sample Prediction. CoRR abs/2102.10106 (2021)
- [i52] Tadashi Kozuno, Yunhao Tang, Mark Rowland, Rémi Munos, Steven Kapturowski, Will Dabney, Michal Valko, David Abel: Revisiting Peng's Q(λ) for Modern Reinforcement Learning. CoRR abs/2103.00107 (2021)
- [i51] Pierre Ménard, Omar Darwiche Domingues, Xuedong Shang, Michal Valko: UCB Momentum Q-learning: Correcting the bias without forgetting. CoRR abs/2103.01312 (2021)
- [i50] Adrià Recasens, Pauline Luc, Jean-Baptiste Alayrac, Luyu Wang, Florian Strub, Corentin Tallec, Mateusz Malinowski, Viorica Patraucean, Florent Altché, Michal Valko, Jean-Bastien Grill, Aäron van den Oord, Andrew Zisserman: Broaden Your Views for Self-Supervised Video Learning. CoRR abs/2103.16559 (2021)
- [i49] Jean Tarbouriech, Runlong Zhou, Simon S. Du, Matteo Pirotta, Michal Valko, Alessandro Lazaric: Stochastic Shortest Path: Minimax, Parameter-Free and Towards Horizon-Free Regret. CoRR abs/2104.11186 (2021)
- [i48] Yunhao Tang, Mark Rowland, Rémi Munos, Michal Valko: Taylor Expansion of Discount Factors. CoRR abs/2106.06170 (2021)
- [i47] Tadashi Kozuno, Pierre Ménard, Rémi Munos, Michal Valko: Model-Free Learning for Two-Player Zero-Sum Partially Observable Markov Games with Perfect Recall. CoRR abs/2106.06279 (2021)
- [i46] Yunhao Tang, Tadashi Kozuno, Mark Rowland, Rémi Munos, Michal Valko: Unifying Gradient Estimators for Meta-Reinforcement Learning via Off-Policy Evaluation. CoRR abs/2106.13125 (2021)
- [i45] Ran Liu, Mehdi Azabou, Max Dabagia, Chi-Heng Lin, Mohammad Gheshlaghi Azar, Keith B. Hengen, Michal Valko, Eva L. Dyer: Drop, Swap, and Generate: A Self-Supervised Approach for Generating Neural Activity. CoRR abs/2111.02338 (2021)
- [i44] Jean Tarbouriech, Omar Darwiche Domingues, Pierre Ménard, Matteo Pirotta, Michal Valko, Alessandro Lazaric: Adaptive Multi-Goal Exploration. CoRR abs/2111.12045 (2021)
- 2020
- [c68] Xuedong Shang, Rianne de Heide, Pierre Ménard, Emilie Kaufmann, Michal Valko: Fixed-confidence guarantees for Bayesian best-arm identification. AISTATS 2020: 1823-1832
- [c67] Haitham Ammar, Victor Gabillon, Rasul Tutunov, Michal Valko: Derivative-Free & Order-Robust Optimisation. AISTATS 2020: 2293-2303
- [c66] Côme Fiegel, Victor Gabillon, Michal Valko: Adaptive multi-fidelity optimization with fast learning rates. AISTATS 2020: 3493-3502
- [c65] Julien Seznec, Pierre Ménard, Alessandro Lazaric, Michal Valko: A single algorithm for both restless and rested rotting bandits. AISTATS 2020: 3784-3794
- [c64] Pierre Perrault, Michal Valko, Vianney Perchet: Covariance-adapting algorithm for semi-bandits with application to sparse outcomes. COLT 2020: 3152-3184
- [c63] Daniele Calandriello, Luigi Carratino, Alessandro Lazaric, Michal Valko, Lorenzo Rosasco: Near-linear time Gaussian process optimization with adaptive batching and resparsification. ICML 2020: 1295-1305
- [c62] Rémy Degenne, Pierre Ménard, Xuedong Shang, Michal Valko: Gamification of Pure Exploration for Linear Bandits. ICML 2020: 2432-2442
- [c61] Anne Gael Manegueu, Claire Vernade, Alexandra Carpentier, Michal Valko: Stochastic bandits with arm-dependent delays. ICML 2020: 3348-3356
- [c60] Jean-Bastien Grill, Florent Altché, Yunhao Tang, Thomas Hubert, Michal Valko, Ioannis Antonoglou, Rémi Munos: Monte-Carlo Tree Search as Regularized Policy Optimization. ICML 2020: 3769-3778
- [c59] Pierre Perrault, Jennifer Healey, Zheng Wen, Michal Valko: Budgeted Online Influence Maximization. ICML 2020: 7620-7631
- [c58] Aadirupa Saha, Pierre Gaillard, Michal Valko: Improved Sleeping Bandits with Stochastic Action Sets and Adversarial Rewards. ICML 2020: 8357-8366
- [c57] Yunhao Tang, Michal Valko, Rémi Munos: Taylor Expansion Policy Optimization. ICML 2020: 9397-9406
- [c56] Jean Tarbouriech, Evrard Garcelon, Michal Valko, Matteo Pirotta, Alessandro Lazaric: No-Regret Exploration in Goal-Oriented Reinforcement Learning. ICML 2020: 9428-9437
- [c55] Daniele Calandriello, Michal Derezinski, Michal Valko: Sampling from a k-DPP without looking at all items. NeurIPS 2020
- [c54] Jean-Bastien Grill, Florian Strub, Florent Altché, Corentin Tallec, Pierre H. Richemond, Elena Buchatskaya, Carl Doersch, Bernardo Ávila Pires, Zhaohan Guo, Mohammad Gheshlaghi Azar, Bilal Piot, Koray Kavukcuoglu, Rémi Munos, Michal Valko: Bootstrap Your Own Latent - A New Approach to Self-Supervised Learning. NeurIPS 2020
- [c53] Anders Jonsson, Emilie Kaufmann, Pierre Ménard, Omar Darwiche Domingues, Edouard Leurent, Michal Valko: Planning in Markov Decision Processes with Gap-Dependent Sample Complexity. NeurIPS 2020
- [c52] Pierre Perrault, Etienne Boursier, Michal Valko, Vianney Perchet: Statistical Efficiency of Thompson Sampling for Combinatorial Semi-Bandits. NeurIPS 2020
- [c51] Jean Tarbouriech, Matteo Pirotta, Michal Valko, Alessandro Lazaric: Improved Sample Complexity for Incremental Autonomous Exploration in MDPs. NeurIPS 2020
- [i43] Daniele Calandriello, Luigi Carratino, Alessandro Lazaric, Michal Valko, Lorenzo Rosasco: Near-linear Time Gaussian Process Optimization with Adaptive Batching and Resparsification. CoRR abs/2002.09954 (2020)
- [i42] Yunhao Tang, Michal Valko, Rémi Munos: Taylor Expansion Policy Optimization. CoRR abs/2003.06259 (2020)
- [i41] Omar Darwiche Domingues, Pierre Ménard, Matteo Pirotta, Emilie Kaufmann, Michal Valko: Regret Bounds for Kernel-Based Reinforcement Learning. CoRR abs/2004.05599 (2020)
- [i40] Aadirupa Saha, Pierre Gaillard, Michal Valko: Improved Sleeping Bandits with Stochastic Actions Sets and Adversarial Rewards. CoRR abs/2004.06248 (2020)
- [i39] Anders Jonsson, Emilie Kaufmann, Pierre Ménard, Omar Darwiche Domingues, Edouard Leurent, Michal Valko: Planning in Markov Decision Processes with Gap-Dependent Sample Complexity. CoRR abs/2006.05879 (2020)
- [i38] Emilie Kaufmann, Pierre Ménard, Omar Darwiche Domingues, Anders Jonsson, Edouard Leurent, Michal Valko: Adaptive Reward-Free Exploration. CoRR abs/2006.06294 (2020)
- [i37] Pierre Perrault, Etienne Boursier, Vianney Perchet, Michal Valko: Statistical Efficiency of Thompson Sampling for Combinatorial Semi-Bandits. CoRR abs/2006.06613 (2020)
- [i36] Jean-Bastien Grill, Florian Strub, Florent Altché, Corentin Tallec, Pierre H. Richemond, Elena Buchatskaya, Carl Doersch, Bernardo Ávila Pires, Zhaohan Daniel Guo, Mohammad Gheshlaghi Azar, Bilal Piot, Koray Kavukcuoglu, Rémi Munos, Michal Valko: Bootstrap Your Own Latent: A New Approach to Self-Supervised Learning. CoRR abs/2006.07733 (2020)
- [i35] Anne Gael Manegueu, Claire Vernade, Alexandra Carpentier, Michal Valko: Stochastic bandits with arm-dependent delays. CoRR abs/2006.10459 (2020)
- [i34] Daniele Calandriello, Michal Derezinski, Michal Valko: Sampling from a k-DPP without looking at all items. CoRR abs/2006.16947 (2020)
- [i33] Rémy Degenne, Pierre Ménard, Xuedong Shang, Michal Valko: Gamification of Pure Exploration for Linear Bandits. CoRR abs/2007.00953 (2020)
- [i32] Omar Darwiche Domingues, Pierre Ménard, Matteo Pirotta, Emilie Kaufmann, Michal Valko: A Kernel-Based Approach to Non-Stationary Reinforcement Learning in Metric Spaces. CoRR abs/2007.05078 (2020)
- [i31] Jean Tarbouriech, Matteo Pirotta, Michal Valko, Alessandro Lazaric: A Provably Efficient Sample Collection Strategy for Reinforcement Learning. CoRR abs/2007.06437 (2020)
- [i30] Jean-Bastien Grill, Florent Altché, Yunhao Tang, Thomas Hubert, Michal Valko, Ioannis Antonoglou, Rémi Munos: Monte-Carlo Tree Search as Regularized Policy Optimization. CoRR abs/2007.12509 (2020)
- [i29] Pierre Ménard, Omar Darwiche Domingues, Anders Jonsson, Emilie Kaufmann, Edouard Leurent, Michal Valko: Fast active learning for pure exploration in reinforcement learning. CoRR abs/2007.13442 (2020)
- [i28] Omar Darwiche Domingues, Pierre Ménard, Emilie Kaufmann, Michal Valko: Episodic Reinforcement Learning in Finite MDPs: Minimax Lower Bounds Revisited. CoRR abs/2010.03531 (2020)
- [i27] Pierre H. Richemond, Jean-Bastien Grill, Florent Altché, Corentin Tallec, Florian Strub, Andrew Brock, Samuel L. Smith, Soham De, Razvan Pascanu, Bilal Piot, Michal Valko: BYOL works even without batch statistics. CoRR abs/2010.10241 (2020)
- [i26] Karl Tuyls, Shayegan Omidshafiei, Paul Muller, Zhe Wang, Jerome T. Connor, Daniel Hennes, Ian Graham, William Spearman, Tim Waskett, Dafydd Steele, Pauline Luc, Adrià Recasens, Alexandre Galashov, Gregory Thornton, Romuald Elie, Pablo Sprechmann, Pol Moreno, Kris Cao, Marta Garnelo, Praneet Dutta, Michal Valko, Nicolas Heess, Alex Bridgland, Julien Pérolat, Bart De Vylder, S. M. Ali Eslami, Mark Rowland, Andrew Jaegle, Rémi Munos, Trevor Back, Razia Ahamed, Simon Bouton, Nathalie Beauguerlange, Jackson Broshear, Thore Graepel, Demis Hassabis: Game Plan: What AI can do for Football, and What Football can do for AI. CoRR abs/2011.09192 (2020)
- [i25] Jean Tarbouriech, Matteo Pirotta, Michal Valko, Alessandro Lazaric: Improved Sample Complexity for Incremental Autonomous Exploration in MDPs. CoRR abs/2012.14755 (2020)
2010 – 2019
- 2019
- [j3] Guillaume Gautier, Guillermo Polito, Rémi Bardenet, Michal Valko: DPPy: DPP Sampling with Python. J. Mach. Learn. Res. 20: 180:1-180:7 (2019)
- [c50] Pierre Perrault, Vianney Perchet, Michal Valko: Finding the bandit in a graph: Sequential search-and-stop. AISTATS 2019: 1668-1677
- [c49] Andrea Locatelli, Alexandra Carpentier, Michal Valko: Active multiple matrix completion with adaptive confidence sets. AISTATS 2019: 1783-1791
- [c48] Julien Seznec, Andrea Locatelli, Alexandra Carpentier, Alessandro Lazaric, Michal Valko: Rotting bandits are no harder than stochastic ones. AISTATS 2019: 2564-2572
- [c47] Peter L. Bartlett, Victor Gabillon, Michal Valko: A simple parameter-free and adaptive approach to optimization under a minimal local smoothness assumption. ALT 2019: 184-206
- [c46] Xuedong Shang, Emilie Kaufmann, Michal Valko: General parallel optimization a without metric. ALT 2019: 762-787
- [c45] Daniele Calandriello, Luigi Carratino, Alessandro Lazaric, Michal Valko, Lorenzo Rosasco: Gaussian Process Optimization with Adaptive Sketching: Scalable and No Regret. COLT 2019: 533-557
- [c44] Peter L. Bartlett, Victor Gabillon, Jennifer Healey, Michal Valko: Scale-free adaptive planning for deterministic dynamics & discounted rewards. ICML 2019: 495-504
- [c43] Pierre Perrault, Vianney Perchet, Michal Valko: Exploiting structure of uncertainty for efficient matroid semi-bandits. ICML 2019: 5123-5132
- [c42] Guillaume Gautier, Rémi Bardenet, Michal Valko: On two ways to use determinantal point processes for Monte Carlo integration. NeurIPS 2019: 7768-7777
- [c41] Michal Derezinski, Daniele Calandriello, Michal Valko: Exact sampling of determinantal point processes with sublinear time preprocessing. NeurIPS 2019: 11542-11554
- [c40] Mark Rowland, Shayegan Omidshafiei, Karl Tuyls, Julien Pérolat, Michal Valko, Georgios Piliouras, Rémi Munos: Multiagent Evaluation under Incomplete Information. NeurIPS 2019: 12270-12282
- [c39] Jean-Bastien Grill, Omar Darwiche Domingues, Pierre Ménard, Rémi Munos, Michal Valko: Planning in entropy-regularized Markov decision processes and games. NeurIPS 2019: 12383-12392
- [i24] Jean-Bastien Grill, Michal Valko, Rémi Munos: Optimistic optimization of a Brownian. CoRR abs/1901.04884 (2019)
- [i23] Pierre Perrault, Vianney Perchet, Michal Valko: Exploiting Structure of Uncertainty for Efficient Combinatorial Semi-Bandits. CoRR abs/1902.03794 (2019)
- [i22] Daniele Calandriello, Luigi Carratino, Alessandro Lazaric, Michal Valko, Lorenzo Rosasco: Gaussian Process Optimization with Adaptive Sketching: Scalable and No Regret. CoRR abs/1903.05594 (2019)
- [i21] Michal Derezinski, Daniele Calandriello, Michal Valko: Exact sampling of determinantal point processes with sublinear time preprocessing. CoRR abs/1905.13476 (2019)
- [i20] Mark Rowland, Shayegan Omidshafiei, Karl Tuyls, Julien Pérolat, Michal Valko, Georgios Piliouras, Rémi Munos: Multiagent Evaluation under Incomplete Information. CoRR abs/1909.09849 (2019)
- [i19]