Csaba Szepesvári
Person information
- affiliation: University of Alberta
2020 – today
- 2024
- [c214] David Janz, Shuai Liu, Alex Ayoub, Csaba Szepesvári: Exploration via linearly perturbed loss minimisation. AISTATS 2024: 721-729
- [c213] Jihao Andreas Lin, Shreyas Padhy, Javier Antorán, Austin Tripp, Alexander Terenin, Csaba Szepesvári, José Miguel Hernández-Lobato, David Janz: Stochastic Gradient Descent for Gaussian Processes Done Right. ICLR 2024
- [c212] Alex Ayoub, Kaiwen Wang, Vincent Liu, Samuel Robertson, James McInerney, Dawen Liang, Nathan Kallus, Csaba Szepesvári: Switching the Loss Reduces the Cost in Batch Reinforcement Learning. ICML 2024
- [i141] Jincheng Mei, Zixin Zhong, Bo Dai, Alekh Agarwal, Csaba Szepesvári, Dale Schuurmans: Stochastic Gradient Succeeds for Bandits. CoRR abs/2402.17235 (2024)
- [i140] Alex Ayoub, Kaiwen Wang, Vincent Liu, Samuel Robertson, James McInerney, Dawen Liang, Nathan Kallus, Csaba Szepesvári: Switching the Loss Reduces the Cost in Batch Reinforcement Learning. CoRR abs/2403.05385 (2024)
- [i139] Johannes Kirschner, Seyed Alireza Bakhtiari, Kushagra Chandak, Volodymyr Tkachuk, Csaba Szepesvári: Regret Minimization via Saddle Point Optimization. CoRR abs/2403.10379 (2024)
- [i138] Yasin Abbasi-Yadkori, Ilja Kuzborskij, David Stutz, András György, Adam Fisch, Arnaud Doucet, Iuliya Beloshapka, Wei-Hung Weng, Yao-Yuan Yang, Csaba Szepesvári, Ali Taylan Cemgil, Nenad Tomasev: Mitigating LLM Hallucinations via Conformal Abstention. CoRR abs/2405.01563 (2024)
- [i137] Volodymyr Tkachuk, Gellért Weisz, Csaba Szepesvári: Trajectory Data Suffices for Statistically Efficient Learning in Offline RL with Linear qπ-Realizability and Concentrability. CoRR abs/2405.16809 (2024)
- [i136] Yasin Abbasi-Yadkori, Ilja Kuzborskij, András György, Csaba Szepesvári: To Believe or Not to Believe Your LLM. CoRR abs/2406.02543 (2024)
- [i135] Tian Tian, Lin F. Yang, Csaba Szepesvári: Confident Natural Policy Gradient for Local Planning in qπ-realizable Constrained MDPs. CoRR abs/2406.18529 (2024)
- [i134] Shuai Liu, Alex Ayoub, Flore Sentenac, Xiaoqi Tan, Csaba Szepesvári: Almost Free: Self-concordance in Natural Exponential Families and an Application to Bandits. CoRR abs/2410.01112 (2024)
- 2023
- [c211] Volodymyr Tkachuk, Seyed Alireza Bakhtiari, Johannes Kirschner, Matej Jusup, Ilija Bogunovic, Csaba Szepesvári: Efficient Planning in Combinatorial Action Spaces with Applications to Cooperative Multi-Agent Reinforcement Learning. AISTATS 2023: 6342-6370
- [c210] Sihan Liu, Gaurav Mahajan, Daniel Kane, Shachar Lovett, Gellért Weisz, Csaba Szepesvári: Exponential Hardness of Reinforcement Learning with Linear Function Approximation. COLT 2023: 1588-1617
- [c209] Sirui Zheng, Lingxiao Wang, Shuang Qiu, Zuyue Fu, Zhuoran Yang, Csaba Szepesvári, Zhaoran Wang: Optimistic Exploration with Learned Features Provably Solves Markov Decision Processes with Neural Dynamics. ICLR 2023
- [c208] Philip Amortila, Nan Jiang, Csaba Szepesvári: The Optimal Approximation Factors in Misspecified Off-Policy Value Function Estimation. ICML 2023: 768-790
- [c207] Toshinori Kitamura, Tadashi Kozuno, Yunhao Tang, Nino Vieillard, Michal Valko, Wenhao Yang, Jincheng Mei, Pierre Ménard, Mohammad Gheshlaghi Azar, Rémi Munos, Olivier Pietquin, Matthieu Geist, Csaba Szepesvári, Wataru Kumagai, Yutaka Matsuo: Regularization and Variance-Weighted Regression Achieves Minimax Optimality in Linear MDPs: Theory and Practice. ICML 2023: 17135-17175
- [c206] Jincheng Mei, Zixin Zhong, Bo Dai, Alekh Agarwal, Csaba Szepesvári, Dale Schuurmans: Stochastic Gradient Succeeds for Bandits. ICML 2023: 24325-24360
- [c205] Yao Zhao, Connor Stephens, Csaba Szepesvári, Kwang-Sung Jun: Revisiting Simple Regret: Fast Rates for Returning a Good Arm. ICML 2023: 42110-42158
- [c204] Johannes Kirschner, Seyed Alireza Bakhtiari, Kushagra Chandak, Volodymyr Tkachuk, Csaba Szepesvári: Regret Minimization via Saddle Point Optimization. NeurIPS 2023
- [c203] Chung-Wei Lee, Qinghua Liu, Yasin Abbasi-Yadkori, Chi Jin, Tor Lattimore, Csaba Szepesvári: Context-lumpable stochastic bandits. NeurIPS 2023
- [c202] Qinghua Liu, Gellért Weisz, András György, Chi Jin, Csaba Szepesvári: Optimistic Natural Policy Gradient: a Simple Efficient Policy Optimization Framework for Online RL. NeurIPS 2023
- [c201] Jincheng Mei, Bo Dai, Alekh Agarwal, Mohammad Ghavamzadeh, Csaba Szepesvári, Dale Schuurmans: Ordering-based Conditions for Global Convergence of Policy Gradient Methods. NeurIPS 2023
- [c200] Gellért Weisz, András György, Csaba Szepesvári: Online RL in Linearly qπ-Realizable MDPs Is as Easy as in Linear MDPs If You Learn What to Ignore. NeurIPS 2023
- [c199] Qinghua Liu, Praneeth Netrapalli, Csaba Szepesvári, Chi Jin: Optimistic MLE: A Generic Model-Based Algorithm for Partially Observable Sequential Decision Making. STOC 2023: 363-376
- [i133] Jincheng Mei, Wesley Chung, Valentin Thomas, Bo Dai, Csaba Szepesvári, Dale Schuurmans: The Role of Baselines in Policy Gradient Optimization. CoRR abs/2301.06276 (2023)
- [i132] Dong Yin, Sridhar Thiagarajan, Nevena Lazic, Nived Rajaraman, Botao Hao, Csaba Szepesvári: Sample Efficient Deep Reinforcement Learning via Local Planning. CoRR abs/2301.12579 (2023)
- [i131] Volodymyr Tkachuk, Seyed Alireza Bakhtiari, Johannes Kirschner, Matej Jusup, Ilija Bogunovic, Csaba Szepesvári: Efficient Planning in Combinatorial Action Spaces with Applications to Cooperative Multi-Agent Reinforcement Learning. CoRR abs/2302.04376 (2023)
- [i130] Daniel Kane, Sihan Liu, Shachar Lovett, Gaurav Mahajan, Csaba Szepesvári, Gellért Weisz: Exponential Hardness of Reinforcement Learning with Linear Function Approximation. CoRR abs/2302.12940 (2023)
- [i129] Qinghua Liu, Gellért Weisz, András György, Chi Jin, Csaba Szepesvári: Optimistic Natural Policy Gradient: a Simple Efficient Policy Optimization Framework for Online RL. CoRR abs/2305.11032 (2023)
- [i128] Toshinori Kitamura, Tadashi Kozuno, Yunhao Tang, Nino Vieillard, Michal Valko, Wenhao Yang, Jincheng Mei, Pierre Ménard, Mohammad Gheshlaghi Azar, Rémi Munos, Olivier Pietquin, Matthieu Geist, Csaba Szepesvári, Wataru Kumagai, Yutaka Matsuo: Regularization and Variance-Weighted Regression Achieves Minimax Optimality in Linear MDPs: Theory and Practice. CoRR abs/2305.13185 (2023)
- [i127] Chung-Wei Lee, Qinghua Liu, Yasin Abbasi-Yadkori, Chi Jin, Tor Lattimore, Csaba Szepesvári: Context-lumpable stochastic bandits. CoRR abs/2306.13053 (2023)
- [i126] Philip Amortila, Nan Jiang, Csaba Szepesvári: The Optimal Approximation Factors in Misspecified Off-Policy Value Function Estimation. CoRR abs/2307.13332 (2023)
- [i125] Gellért Weisz, András György, Csaba Szepesvári: Online RL in Linearly qπ-Realizable MDPs Is as Easy as in Linear MDPs If You Learn What to Ignore. CoRR abs/2310.07811 (2023)
- [i124] Jihao Andreas Lin, Shreyas Padhy, Javier Antorán, Austin Tripp, Alexander Terenin, Csaba Szepesvári, José Miguel Hernández-Lobato, David Janz: Stochastic Gradient Descent for Gaussian Processes Done Right. CoRR abs/2310.20581 (2023)
- [i123] David Janz, Shuai Liu, Alex Ayoub, Csaba Szepesvári: Exploration via linearly perturbed loss minimisation. CoRR abs/2311.07565 (2023)
- [i122] David Janz, Alexander E. Litvak, Csaba Szepesvári: Ensemble sampling for linear bandits: small ensembles suffice. CoRR abs/2311.08376 (2023)
- 2022
- [c198] Botao Hao, Nevena Lazic, Dong Yin, Yasin Abbasi-Yadkori, Csaba Szepesvári: Confident Least Square Value Iteration with Local Access to a Simulator. AISTATS 2022: 2420-2435
- [c197] Anant Raj, Pooria Joulani, András György, Csaba Szepesvári: Faster Rates, Adaptive Algorithms, and Finite-Time Bounds for Linear Composition Optimization and Gradient TD Learning. AISTATS 2022: 7176-7186
- [c196] Chenjun Xiao, Ilbin Lee, Bo Dai, Dale Schuurmans, Csaba Szepesvári: The Curse of Passive Data Collection in Batch Reinforcement Learning. AISTATS 2022: 8413-8438
- [c195] Gellért Weisz, Csaba Szepesvári, András György: TensorPlan and the Few Actions Lower Bound for Planning in MDPs under Linear Realizability of Optimal Value Functions. ALT 2022: 1097-1137
- [c194] Dong Yin, Botao Hao, Yasin Abbasi-Yadkori, Nevena Lazic, Csaba Szepesvári: Efficient local planning with linear function approximation. ALT 2022: 1165-1192
- [c193] Qinghua Liu, Alan Chung, Csaba Szepesvári, Chi Jin: When Is Partially Observable Reinforcement Learning Not Scary? COLT 2022: 5175-5220
- [c192] Qinghua Liu, Csaba Szepesvári, Chi Jin: Sample-Efficient Reinforcement Learning of Partially Observable Markov Games. NeurIPS 2022
- [c191] Jincheng Mei, Wesley Chung, Valentin Thomas, Bo Dai, Csaba Szepesvári, Dale Schuurmans: The Role of Baselines in Policy Gradient Optimization. NeurIPS 2022
- [c190] Sharan Vaswani, Lin Yang, Csaba Szepesvári: Near-Optimal Sample Complexity Bounds for Constrained MDPs. NeurIPS 2022
- [c189] Gellért Weisz, András György, Tadashi Kozuno, Csaba Szepesvári: Confident Approximate Policy Iteration for Efficient Local Planning in $q^\pi$-realizable MDPs. NeurIPS 2022
- [c188] Hui Yuan, Chengzhuo Ni, Huazheng Wang, Xuezhou Zhang, Le Cong, Csaba Szepesvári, Mengdi Wang: Bandit Theory and Thompson Sampling-Guided Directed Evolution for Sequence Optimization. NeurIPS 2022
- [c187] Tongzheng Ren, Tianjun Zhang, Csaba Szepesvári, Bo Dai: A free lunch from the noise: Provable and practical exploration for representation learning. UAI 2022: 1686-1696
- [e4] Kamalika Chaudhuri, Stefanie Jegelka, Le Song, Csaba Szepesvári, Gang Niu, Sivan Sabato: International Conference on Machine Learning, ICML 2022, 17-23 July 2022, Baltimore, Maryland, USA. Proceedings of Machine Learning Research 162, PMLR 2022 [contents]
- [i121] Arushi Jain, Sharan Vaswani, Reza Babanezhad, Csaba Szepesvári, Doina Precup: Towards Painless Policy Optimization for Constrained MDPs. CoRR abs/2204.05176 (2022)
- [i120] Qinghua Liu, Alan Chung, Csaba Szepesvári, Chi Jin: When Is Partially Observable Reinforcement Learning Not Scary? CoRR abs/2204.08967 (2022)
- [i119] Tadashi Kozuno, Wenhao Yang, Nino Vieillard, Toshinori Kitamura, Yunhao Tang, Jincheng Mei, Pierre Ménard, Mohammad Gheshlaghi Azar, Michal Valko, Rémi Munos, Olivier Pietquin, Matthieu Geist, Csaba Szepesvári: KL-Entropy-Regularized RL with a Generative Model is Minimax Optimal. CoRR abs/2205.14211 (2022)
- [i118] Qinghua Liu, Csaba Szepesvári, Chi Jin: Sample-Efficient Reinforcement Learning of Partially Observable Markov Games. CoRR abs/2206.01315 (2022)
- [i117] Hui Yuan, Chengzhuo Ni, Huazheng Wang, Xuezhou Zhang, Le Cong, Csaba Szepesvári, Mengdi Wang: Bandit Theory and Thompson Sampling-Guided Directed Evolution for Sequence Optimization. CoRR abs/2206.02092 (2022)
- [i116] Sharan Vaswani, Lin F. Yang, Csaba Szepesvári: Near-Optimal Sample Complexity Bounds for Constrained MDPs. CoRR abs/2206.06270 (2022)
- [i115] Qinghua Liu, Praneeth Netrapalli, Csaba Szepesvári, Chi Jin: Optimistic MLE - A Generic Model-based Algorithm for Partially Observable Sequential Decision Making. CoRR abs/2209.14997 (2022)
- [i114] Gellért Weisz, András György, Tadashi Kozuno, Csaba Szepesvári: Confident Approximate Policy Iteration for Efficient Local Planning in qπ-realizable MDPs. CoRR abs/2210.15755 (2022)
- [i113] Yao Zhao, Connor Stephens, Csaba Szepesvári, Kwang-Sung Jun: Revisiting Simple Regret Minimization in Multi-Armed Bandits. CoRR abs/2210.16913 (2022)
- [i112] Ilja Kuzborskij, Csaba Szepesvári: Learning Lipschitz Functions by GD-trained Shallow Overparameterized ReLU Neural Networks. CoRR abs/2212.13848 (2022)
- 2021
- [j45] María Pérez-Ortiz, Omar Rivasplata, John Shawe-Taylor, Csaba Szepesvári: Tighter Risk Certificates for Neural Networks. J. Mach. Learn. Res. 22: 227:1-227:40 (2021)
- [j44] Yuxi Li, Alborz Geramifard, Lihong Li, Csaba Szepesvári, Tao Wang: Guest editorial: special issue on reinforcement learning for real life. Mach. Learn. 110(9): 2291-2293 (2021)
- [c186] Botao Hao, Tor Lattimore, Csaba Szepesvári, Mengdi Wang: Online Sparse Reinforcement Learning. AISTATS 2021: 316-324
- [c185] Botao Hao, Nevena Lazic, Yasin Abbasi-Yadkori, Pooria Joulani, Csaba Szepesvári: Adaptive Approximate Policy Iteration. AISTATS 2021: 523-531
- [c184] Ilja Kuzborskij, Claire Vernade, András György, Csaba Szepesvári: Confident Off-Policy Evaluation and Selection through Self-Normalized Importance Weighting. AISTATS 2021: 640-648
- [c183] Gellért Weisz, Philip Amortila, Csaba Szepesvári: Exponential Lower Bounds for Planning in MDPs With Linearly-Realizable Optimal Action-Value Functions. ALT 2021: 1237-1264
- [c182] Johannes Kirschner, Tor Lattimore, Claire Vernade, Csaba Szepesvári: Asymptotically Optimal Information-Directed Sampling. COLT 2021: 2777-2821
- [c181] Ilja Kuzborskij, Csaba Szepesvári: Nonparametric Regression with Shallow Overparameterized Neural Networks Trained by GD with Early Stopping. COLT 2021: 2853-2890
- [c180] Gellért Weisz, Philip Amortila, Barnabás Janzer, Yasin Abbasi-Yadkori, Nan Jiang, Csaba Szepesvári: On Query-efficient Planning in MDPs under Linear Realizability of the Optimal State-value Function. COLT 2021: 4355-4385
- [c179] Dongruo Zhou, Quanquan Gu, Csaba Szepesvári: Nearly Minimax Optimal Reinforcement Learning for Linear Mixture Markov Decision Processes. COLT 2021: 4532-4576
- [c178] Botao Hao, Yaqi Duan, Tor Lattimore, Csaba Szepesvári, Mengdi Wang: Sparse Feature Selection Makes Batch Reinforcement Learning More Sample Efficient. ICML 2021: 4063-4073
- [c177] Botao Hao, Xiang Ji, Yaqi Duan, Hao Lu, Csaba Szepesvári, Mengdi Wang: Bootstrapping Fitted Q-Evaluation for Off-Policy Inference. ICML 2021: 4074-4084
- [c176] Mikhail Konobeev, Ilja Kuzborskij, Csaba Szepesvári: A Distribution-dependent Analysis of Meta Learning. ICML 2021: 5697-5706
- [c175] Branislav Kveton, Mikhail Konobeev, Manzil Zaheer, Chih-Wei Hsu, Martin Mladenov, Craig Boutilier, Csaba Szepesvári: Meta-Thompson Sampling. ICML 2021: 5884-5893
- [c174] Nevena Lazic, Dong Yin, Yasin Abbasi-Yadkori, Csaba Szepesvári: Improved Regret Bound and Experience Replay in Regularized Policy Iteration. ICML 2021: 6032-6042
- [c173] Jincheng Mei, Yue Gao, Bo Dai, Csaba Szepesvári, Dale Schuurmans: Leveraging Non-uniformity in First-order Non-convex Optimization. ICML 2021: 7555-7564
- [c172] Chenjun Xiao, Yifan Wu, Jincheng Mei, Bo Dai, Tor Lattimore, Lihong Li, Csaba Szepesvári, Dale Schuurmans: On the Optimality of Batch Policy Optimization Algorithms. ICML 2021: 11362-11371
- [c171] Junyu Zhang, Chengzhuo Ni, Zheng Yu, Csaba Szepesvári, Mengdi Wang: On the Convergence and Sample Efficiency of Variance-Reduced Policy Gradient Method. NeurIPS 2021: 2228-2240
- [c170] Jincheng Mei, Bo Dai, Chenjun Xiao, Csaba Szepesvári, Dale Schuurmans: Understanding the Effect of Stochasticity in Policy Optimization. NeurIPS 2021: 19339-19351
- [c169] Soumya Basu, Branislav Kveton, Manzil Zaheer, Csaba Szepesvári: No Regrets for Learning the Prior in Bandits. NeurIPS 2021: 28029-28041
- [c168] Ilja Kuzborskij, Csaba Szepesvári, Omar Rivasplata, Amal Rannen-Triki, Razvan Pascanu: On the Role of Optimization in Double Descent: A Least Squares Study. NeurIPS 2021: 29567-29577
- [i111] Gellért Weisz, Philip Amortila, Barnabás Janzer, Yasin Abbasi-Yadkori, Nan Jiang, Csaba Szepesvári: On Query-efficient Planning in MDPs under Linear Realizability of the Optimal State-value Function. CoRR abs/2102.02049 (2021)
- [i110] Botao Hao, Xiang Ji, Yaqi Duan, Hao Lu, Csaba Szepesvári, Mengdi Wang: Bootstrapping Statistical Inference for Off-Policy Evaluation. CoRR abs/2102.03607 (2021)
- [i109] Branislav Kveton, Mikhail Konobeev, Manzil Zaheer, Chih-Wei Hsu, Martin Mladenov, Craig Boutilier, Csaba Szepesvári: Meta-Thompson Sampling. CoRR abs/2102.06129 (2021)
- [i108] Nevena Lazic, Botao Hao, Yasin Abbasi-Yadkori, Dale Schuurmans, Csaba Szepesvári: Optimization Issues in KL-Constrained Approximate Policy Iteration. CoRR abs/2102.06234 (2021)
- [i107] Junyu Zhang, Chengzhuo Ni, Zheng Yu, Csaba Szepesvári, Mengdi Wang: On the Convergence and Sample Efficiency of Variance-Reduced Policy Gradient Method. CoRR abs/2102.08607 (2021)
- [i106] Nevena Lazic, Dong Yin, Yasin Abbasi-Yadkori, Csaba Szepesvári: Improved Regret Bound and Experience Replay in Regularized Policy Iteration. CoRR abs/2102.12611 (2021)
- [i105] Chenjun Xiao, Yifan Wu, Tor Lattimore, Bo Dai, Jincheng Mei, Lihong Li, Csaba Szepesvári, Dale Schuurmans: On the Optimality of Batch Policy Optimization Algorithms. CoRR abs/2104.02293 (2021)
- [i104] Jincheng Mei, Yue Gao, Bo Dai, Csaba Szepesvári, Dale Schuurmans: Leveraging Non-uniformity in First-order Non-convex Optimization. CoRR abs/2105.06072 (2021)
- [i103] Abbas Abdolmaleki, Sandy H. Huang, Giulia Vezzani, Bobak Shahriari, Jost Tobias Springenberg, Shruti Mishra, Dhruva TB, Arunkumar Byravan, Konstantinos Bousmalis, András György, Csaba Szepesvári, Raia Hadsell, Nicolas Heess, Martin A. Riedmiller: On Multi-objective Policy Optimization as a Tool for Reinforcement Learning. CoRR abs/2106.08199 (2021)
- [i102] Chenjun Xiao, Ilbin Lee, Bo Dai, Dale Schuurmans, Csaba Szepesvári: On the Sample Complexity of Batch Reinforcement Learning with Policy-Induced Data. CoRR abs/2106.09973 (2021)
- [i101] Soumya Basu, Branislav Kveton, Manzil Zaheer, Csaba Szepesvári: No Regrets for Learning the Prior in Bandits. CoRR abs/2107.06196 (2021)
- [i100] Ilja Kuzborskij, Csaba Szepesvári, Omar Rivasplata, Amal Rannen-Triki, Razvan Pascanu: On the Role of Optimization in Double Descent: A Least Squares Study. CoRR abs/2107.12685 (2021)
- [i99] Dong Yin, Botao Hao, Yasin Abbasi-Yadkori, Nevena Lazic, Csaba Szepesvári: Efficient Local Planning with Linear Function Approximation. CoRR abs/2108.05533 (2021)
- [i98] Gellért Weisz, Csaba Szepesvári, András György: TensorPlan and the Few Actions Lower Bound for Planning in MDPs under Linear Realizability of Optimal Value Functions. CoRR abs/2110.02195 (2021)
- [i97] Han Zhong, Zhuoran Yang, Zhaoran Wang, Csaba Szepesvári: Optimistic Policy Optimization is Provably Efficient in Non-stationary MDPs. CoRR abs/2110.08984 (2021)
- [i96] Jincheng Mei, Bo Dai, Chenjun Xiao, Csaba Szepesvári, Dale Schuurmans: Understanding the Effect of Stochasticity in Policy Optimization. CoRR abs/2110.15572 (2021)
- [i95] Tongzheng Ren, Tianjun Zhang, Csaba Szepesvári, Bo Dai: A Free Lunch from the Noise: Provable and Practical Exploration for Representation Learning. CoRR abs/2111.11485 (2021)
- 2020
- [j43] Karl Tuyls, Julien Pérolat, Marc Lanctot, Edward Hughes, Richard Everett, Joel Z. Leibo, Csaba Szepesvári, Thore Graepel: Bounds and dynamics for empirical game theoretic analysis. Auton. Agents Multi Agent Syst. 34(1): 7 (2020)
- [j42] Yao Ma, Alex Olshevsky, Csaba Szepesvári, Venkatesh Saligrama: Gradient Descent for Sparse Rank-One Matrix Completion for Crowd-Sourced Aggregation of Sparsely Interacting Workers. J. Mach. Learn. Res. 21: 133:1-133:36 (2020)
- [j41] Pooria Joulani, András György, Csaba Szepesvári: A modular analysis of adaptive (non-)convex optimization: Optimism, composite objectives, variance reduction, and variational bounds. Theor. Comput. Sci. 808: 108-138 (2020)
- [c167] Branislav Kveton, Manzil Zaheer, Csaba Szepesvári, Lihong Li, Mohammad Ghavamzadeh, Craig Boutilier: Randomized Exploration in Generalized Linear Bandits. AISTATS 2020: 2066-2076
- [c166] Botao Hao, Tor Lattimore, Csaba Szepesvári: Adaptive Exploration in Linear Contextual Bandit. AISTATS 2020: 3536-3545
- [c165] Tor Lattimore, Csaba Szepesvári: Exploration by Optimisation in Partial Monitoring. COLT 2020: 2488-2515
- [c164] Ian Osband, Yotam Doron, Matteo Hessel, John Aslanides, Eren Sezener, Andre Saraiva, Katrina McKinney, Tor Lattimore, Csaba Szepesvári, Satinder Singh, Benjamin Van Roy, Richard S. Sutton, David Silver, Hado van Hasselt: Behaviour Suite for Reinforcement Learning. ICLR 2020
- [c163] Alex Ayoub, Zeyu Jia, Csaba Szepesvári, Mengdi Wang, Lin Yang: Model-Based Reinforcement Learning with Value-Targeted Regression. ICML 2020: 463-474
- [c162] Pooria Joulani, Anant Raj, András György, Csaba Szepesvári: A simpler approach to accelerated optimization: iterative averaging meets optimism. ICML 2020: 4984-4993
- [c161] Tor Lattimore, Csaba Szepesvári, Gellért Weisz: Learning with Good Feature Representations in Bandits and in RL with a Generative Model. ICML 2020: 5662-5670
- [c160] Jincheng Mei, Chenjun Xiao, Csaba Szepesvári, Dale Schuurmans: On the Global Convergence Rates of Softmax Policy Gradient Methods. ICML 2020: 6820-6829
- [c159] Zeyu Jia, Lin Yang, Csaba Szepesvári, Mengdi Wang: Model-Based Reinforcement Learning with Value-Targeted Regression. L4DC 2020: 666-686
- [c158] Craig Boutilier, Chih-Wei Hsu, Branislav Kveton, Martin Mladenov, Csaba Szepesvári, Manzil Zaheer: Differentiable Meta-Learning of Bandit Policies. NeurIPS 2020
- [c157] Bo Dai, Ofir Nachum, Yinlam Chow, Lihong Li, Csaba Szepesvári, Dale Schuurmans: CoinDICE: Off-Policy Confidence Interval Estimation. NeurIPS 2020
- [c156] Jincheng Mei, Chenjun Xiao, Bo Dai, Lihong Li, Csaba Szepesvári, Dale Schuurmans: Escaping the Gravitational Pull of Softmax. NeurIPS 2020
- [c155] Aldo Pacchiano, My Phan, Yasin Abbasi-Yadkori, Anup Rao, Julian Zimmert, Tor Lattimore, Csaba Szepesvári: Model Selection in Contextual Stochastic Bandit Problems. NeurIPS 2020
- [c154] Omar Rivasplata, Ilja Kuzborskij, Csaba Szepesvári, John Shawe-Taylor: PAC-Bayes Analysis Beyond the Usual Bounds. NeurIPS 2020
- [c153] Roshan Shariff, Csaba Szepesvári: Efficient Planning in Large MDPs with Weak Linear Function Approximation. NeurIPS 2020
- [c152] Arun Verma, Manjesh Kumar Hanawal, Csaba Szepesvári, Venkatesh Saligrama: Online Algorithm for Unsupervised Sequential Selection with Contextual Information. NeurIPS 2020
- [c151] Gellért Weisz, András György, Wei-I Lin, Devon R. Graham, Kevin Leyton-Brown, Csaba Szepesvári, Brendan Lucier: ImpatientCapsAndRuns: Approximately Optimal Algorithm Configuration from an Infinite Pool. NeurIPS 2020
- [c150] Junyu Zhang, Alec Koppel, Amrit Singh Bedi, Csaba Szepesvári, Mengdi Wang: Variational Policy Gradient Method for Reinforcement Learning with General Utilities. NeurIPS 2020
- [i94] Botao Hao, Nevena Lazic, Yasin Abbasi-Yadkori, Pooria Joulani, Csaba Szepesvári: Provably Efficient Adaptive Approximate Policy Iteration. CoRR abs/2002.03069 (2020)
- [i93] Craig Boutilier, Chih-Wei Hsu, Branislav Kveton, Martin Mladenov, Csaba Szepesvári, Manzil Zaheer: Differentiable Bandit Exploration. CoRR abs/2002.06772 (2020)
- [i92] Aldo Pacchiano, My Phan, Yasin Abbasi-Yadkori, Anup Rao, Julian Zimmert, Tor Lattimore, Csaba Szepesvári: Model Selection in Contextual Stochastic Bandit Problems. CoRR abs/2003.01704 (2020)
- [i91]