Shimon Whiteson
2020 – today
- 2024
- [j31] Jacob Beck, Matthew Thomas Jackson, Risto Vuorio, Zheng Xiong, Shimon Whiteson: SplAgger: Split Aggregation for Meta-Reinforcement Learning. RLJ 1: 450-469 (2024)
- [j30] Matthew Thomas Jackson, Michael T. Matthews, Cong Lu, Benjamin Ellis, Shimon Whiteson, Jakob Nicolaus Foerster: Policy-Guided Diffusion. RLJ 4: 1855-1872 (2024)
- [c156] Alexander Rutherford, Benjamin Ellis, Matteo Gallici, Jonathan Cook, Andrei Lupu, Garðar Ingvarsson, Timon Willi, Akbir Khan, Christian Schröder de Witt, Alexandra Souly, Saptarashmi Bandyopadhyay, Mikayel Samvelyan, Minqi Jiang, Robert T. Lange, Shimon Whiteson, Bruno Lacerda, Nick Hawes, Tim Rocktäschel, Chris Lu, Jakob N. Foerster: JaxMARL: Multi-Agent RL Environments and Algorithms in JAX. AAMAS 2024: 2444-2446
- [c155] Matthew Thomas Jackson, Chris Lu, Louis Kirsch, Robert Tjarko Lange, Shimon Whiteson, Jakob Nicolaus Foerster: Discovering Temporally-Aware Reinforcement Learning Algorithms. ICLR 2024
- [c154] Mattie Fellows, Brandon Kaplowitz, Christian Schröder de Witt, Shimon Whiteson: Bayesian Exploration Networks. ICML 2024
- [c153] Zheng Xiong, Risto Vuorio, Jacob Beck, Matthieu Zimmer, Kun Shao, Shimon Whiteson: Distilling Morphology-Conditioned Hypernetworks for Efficient Universal Morphology Control. ICML 2024
- [c152] Reza Mahjourian, Rongbing Mu, Valerii Likhosherstov, Paul Mougin, Xiukun Huang, João V. Messias, Shimon Whiteson: UniGen: Unified Modeling of Initial Agent States and Trajectories for Generating Autonomous Driving Scenarios. ICRA 2024: 16367-16373
- [i115] Matthew Thomas Jackson, Chris Lu, Louis Kirsch, Robert T. Lange, Shimon Whiteson, Jakob Nicolaus Foerster: Discovering Temporally-Aware Reinforcement Learning Algorithms. CoRR abs/2402.05828 (2024)
- [i114] Zheng Xiong, Risto Vuorio, Jacob Beck, Matthieu Zimmer, Kun Shao, Shimon Whiteson: Distilling Morphology-Conditioned Hypernetworks for Efficient Universal Morphology Control. CoRR abs/2402.06570 (2024)
- [i113] Jacob Beck, Matthew Thomas Jackson, Risto Vuorio, Zheng Xiong, Shimon Whiteson: SplAgger: Split Aggregation for Meta-Reinforcement Learning. CoRR abs/2403.03020 (2024)
- [i112] Matthew Thomas Jackson, Michael T. Matthews, Cong Lu, Benjamin Ellis, Shimon Whiteson, Jakob N. Foerster: Policy-Guided Diffusion. CoRR abs/2404.06356 (2024)
- [i111] Reza Mahjourian, Rongbing Mu, Valerii Likhosherstov, Paul Mougin, Xiukun Huang, João V. Messias, Shimon Whiteson: UniGen: Unified Modeling of Initial Agent States and Trajectories for Generating Autonomous Driving Scenarios. CoRR abs/2405.03807 (2024)
- [i110] Risto Vuorio, Mattie Fellows, Cong Lu, Clémence Grislain, Shimon Whiteson: A Bayesian Solution To The Imitation Gap. CoRR abs/2407.00495 (2024)
- [i109] Alexander David Goldie, Chris Lu, Matthew Thomas Jackson, Shimon Whiteson, Jakob Nicolaus Foerster: Can Learned Optimization Make Reinforcement Learning Less Difficult? CoRR abs/2407.07082 (2024)
- 2023
- [c151] Mingfei Sun, Sam Devlin, Jacob Beck, Katja Hofmann, Shimon Whiteson: Trust Region Bounds for Decentralized PPO Under Non-stationarity. AAMAS 2023: 5-13
- [c150] Yat Long Lo, Christian Schröder de Witt, Samuel Sokota, Jakob Nicolaus Foerster, Shimon Whiteson: Cheap Talk Discovery and Utilization in Multi-Agent Reinforcement Learning. ICLR 2023
- [c149] Mattie Fellows, Matthew J. A. Smith, Shimon Whiteson: Why Target Networks Stabilise Temporal Difference Methods. ICML 2023: 9886-9909
- [c148] Zheng Xiong, Jacob Beck, Shimon Whiteson: Universal Morphology Control via Contextual Modulation. ICML 2023: 38286-38300
- [c147] Maximilian Igl, Punit Shah, Paul Mougin, Sirish Srinivasan, Tarun Gupta, Brandyn White, Kyriacos Shiarlis, Shimon Whiteson: Hierarchical Imitation Learning for Stochastic Environments. IROS 2023: 1697-1704
- [c146] Yiren Lu, Justin Fu, George Tucker, Xinlei Pan, Eli Bronstein, Rebecca Roelofs, Benjamin Sapp, Brandyn White, Aleksandra Faust, Shimon Whiteson, Dragomir Anguelov, Sergey Levine: Imitation Is Not Enough: Robustifying Imitation with Reinforcement Learning for Challenging Driving Scenarios. IROS 2023: 7553-7560
- [c145] Jacob Beck, Risto Vuorio, Zheng Xiong, Shimon Whiteson: Recurrent Hypernetworks are Surprisingly Strong in Meta-RL. NeurIPS 2023
- [c144] Benjamin Ellis, Jonathan Cook, Skander Moalla, Mikayel Samvelyan, Mingfei Sun, Anuj Mahajan, Jakob N. Foerster, Shimon Whiteson: SMACv2: An Improved Benchmark for Cooperative Multi-Agent Reinforcement Learning. NeurIPS 2023
- [c143] Matthew Thomas Jackson, Minqi Jiang, Jack Parker-Holder, Risto Vuorio, Chris Lu, Gregory Farquhar, Shimon Whiteson, Jakob N. Foerster: Discovering General Reinforcement Learning Algorithms with Adversarial Environment Design. NeurIPS 2023
- [c142] Nico Montali, John Lambert, Paul Mougin, Alex Kuefler, Nicholas Rhinehart, Michelle Li, Cole Gulino, Tristan Emrich, Zoey Yang, Shimon Whiteson, Brandyn White, Dragomir Anguelov: The Waymo Open Sim Agents Challenge. NeurIPS 2023
- [i108] Jacob Beck, Risto Vuorio, Evan Zheran Liu, Zheng Xiong, Luisa M. Zintgraf, Chelsea Finn, Shimon Whiteson: A Survey of Meta-Reinforcement Learning. CoRR abs/2301.08028 (2023)
- [i107] Mingfei Sun, Benjamin Ellis, Anuj Mahajan, Sam Devlin, Katja Hofmann, Shimon Whiteson: Trust-Region-Free Policy Optimization for Stochastic Policies. CoRR abs/2302.07985 (2023)
- [i106] Zheng Xiong, Jacob Beck, Shimon Whiteson: Universal Morphology Control via Contextual Modulation. CoRR abs/2302.11070 (2023)
- [i105] Yat Long Lo, Christian Schröder de Witt, Samuel Sokota, Jakob Nicolaus Foerster, Shimon Whiteson: Cheap Talk Discovery and Utilization in Multi-Agent Reinforcement Learning. CoRR abs/2303.10733 (2023)
- [i104] Nico Montali, John Lambert, Paul Mougin, Alex Kuefler, Nick Rhinehart, Michelle Li, Cole Gulino, Tristan Emrich, Zoey Yang, Shimon Whiteson, Brandyn White, Dragomir Anguelov: The Waymo Open Sim Agents Challenge. CoRR abs/2305.12032 (2023)
- [i103] Mattie Fellows, Brandon Kaplowitz, Christian Schröder de Witt, Shimon Whiteson: Bayesian Exploration Networks. CoRR abs/2308.13049 (2023)
- [i102] Maximilian Igl, Punit Shah, Paul Mougin, Sirish Srinivasan, Tarun Gupta, Brandyn White, Kyriacos Shiarlis, Shimon Whiteson: Hierarchical Imitation Learning for Stochastic Environments. CoRR abs/2309.14003 (2023)
- [i101] Jacob Beck, Risto Vuorio, Zheng Xiong, Shimon Whiteson: Recurrent Hypernetworks are Surprisingly Strong in Meta-RL. CoRR abs/2309.14970 (2023)
- [i100] Matthew Thomas Jackson, Minqi Jiang, Jack Parker-Holder, Risto Vuorio, Chris Lu, Gregory Farquhar, Shimon Whiteson, Jakob Nicolaus Foerster: Discovering General Reinforcement Learning Algorithms with Adversarial Environment Design. CoRR abs/2310.02782 (2023)
- [i99] Alexander Rutherford, Benjamin Ellis, Matteo Gallici, Jonathan Cook, Andrei Lupu, Garðar Ingvarsson, Timon Willi, Akbir Khan, Christian Schröder de Witt, Alexandra Souly, Saptarashmi Bandyopadhyay, Mikayel Samvelyan, Minqi Jiang, Robert Tjarko Lange, Shimon Whiteson, Bruno Lacerda, Nick Hawes, Tim Rocktäschel, Chris Lu, Jakob Nicolaus Foerster: JaxMARL: Multi-Agent RL Environments in JAX. CoRR abs/2311.10090 (2023)
- 2022
- [j29] Shangtong Zhang, Shimon Whiteson: Truncated Emphatic Temporal Difference Methods for Prediction and Control. J. Mach. Learn. Res. 23: 153:1-153:59 (2022)
- [c141] Mingfei Sun, Sam Devlin, Katja Hofmann, Shimon Whiteson: Deterministic and Discriminative Imitation (D2-Imitation): Revisiting Adversarial Imitation for Sample Efficiency. AAAI 2022: 8378-8385
- [c140] Shangtong Zhang, Romain Laroche, Harm van Seijen, Shimon Whiteson, Remi Tachet des Combes: A Deeper Look at Discounting Mismatch in Actor-Critic Algorithms. AAMAS 2022: 1491-1499
- [c139] Matthew J. A. Smith, Jelena Luketina, Kristian Hartikainen, Maximilian Igl, Shimon Whiteson: Learning Skills Diverse in Value-Relevant Features. CoLLAs 2022: 1174-1194
- [c138] Eli Bronstein, Sirish Srinivasan, Supratik Paul, Aman Sinha, Matthew O'Kelly, Payam Nikdel, Shimon Whiteson: Embedding Synthetic Off-Policy Experience for Autonomous Driving via Zero-Shot Curricula. CoRL 2022: 188-198
- [c137] Angad Singh, Omar Makhlouf, Maximilian Igl, João V. Messias, Arnaud Doucet, Shimon Whiteson: Particle-Based Score Estimation for State Space Model Learning in Autonomous Driving. CoRL 2022: 1168-1177
- [c136] Jacob Beck, Matthew Thomas Jackson, Risto Vuorio, Shimon Whiteson: Hypernetworks in Meta-Reinforcement Learning. CoRL 2022: 1478-1487
- [c135] Darius Muglich, Luisa M. Zintgraf, Christian A. Schröder de Witt, Shimon Whiteson, Jakob N. Foerster: Generalized Beliefs for Cooperative AI. ICML 2022: 16062-16082
- [c134] Samuel Sokota, Christian A. Schröder de Witt, Maximilian Igl, Luisa M. Zintgraf, Philip H. S. Torr, Martin Strohmeier, J. Zico Kolter, Shimon Whiteson, Jakob N. Foerster: Communicating via Markov Decision Processes. ICML 2022: 20314-20328
- [c133] Maximilian Igl, Daewoo Kim, Alex Kuefler, Paul Mougin, Punit Shah, Kyriacos Shiarlis, Dragomir Anguelov, Mark Palatucci, Brandyn White, Shimon Whiteson: Symphony: Learning Realistic and Diverse Agents for Autonomous Driving Simulation. ICRA 2022: 2445-2451
- [c132] Eli Bronstein, Mark Palatucci, Dominik Notz, Brandyn White, Alex Kuefler, Yiren Lu, Supratik Paul, Payam Nikdel, Paul Mougin, Hongge Chen, Justin Fu, Austin Abrams, Punit Shah, Evan Racah, Benjamin Frenkel, Shimon Whiteson, Dragomir Anguelov: Hierarchical Model-Based Imitation Learning for Planning in Autonomous Driving. IROS 2022: 8652-8659
- [c131] Vitaly Kurin, Alessandro De Palma, Ilya Kostrikov, Shimon Whiteson, Pawan Kumar Mudigonda: In Defense of the Unitary Scalarization for Deep Multi-Task Learning. NeurIPS 2022
- [c130] Darius Muglich, Christian Schröder de Witt, Elise van der Pol, Shimon Whiteson, Jakob N. Foerster: Equivariant Networks for Zero-Shot Coordination. NeurIPS 2022
- [i98] Vitaly Kurin, Alessandro De Palma, Ilya Kostrikov, Shimon Whiteson, M. Pawan Kumar: In Defense of the Unitary Scalarization for Deep Multi-Task Learning. CoRR abs/2201.04122 (2022)
- [i97] Mingfei Sun, Vitaly Kurin, Guoqing Liu, Sam Devlin, Tao Qin, Katja Hofmann, Shimon Whiteson: You May Not Need Ratio Clipping in PPO. CoRR abs/2202.00079 (2022)
- [i96] Mingfei Sun, Sam Devlin, Katja Hofmann, Shimon Whiteson: Monotonic Improvement Guarantees under Non-stationarity for Decentralized PPO. CoRR abs/2202.00082 (2022)
- [i95] Anuj Mahajan, Mikayel Samvelyan, Tarun Gupta, Benjamin Ellis, Mingfei Sun, Tim Rocktäschel, Shimon Whiteson: Generalization in Cooperative Multi-Agent Systems. CoRR abs/2202.00104 (2022)
- [i94] Maximilian Igl, Daewoo Kim, Alex Kuefler, Paul Mougin, Punit Shah, Kyriacos Shiarlis, Dragomir Anguelov, Mark Palatucci, Brandyn White, Shimon Whiteson: Symphony: Learning Realistic and Diverse Agents for Autonomous Driving Simulation. CoRR abs/2205.03195 (2022)
- [i93] Darius Muglich, Luisa M. Zintgraf, Christian Schröder de Witt, Shimon Whiteson, Jakob N. Foerster: Generalized Beliefs for Cooperative AI. CoRR abs/2206.12765 (2022)
- [i92] Risto Vuorio, Jacob Beck, Shimon Whiteson, Jakob N. Foerster, Gregory Farquhar: An Investigation of the Bias-Variance Tradeoff in Meta-Gradients. CoRR abs/2209.11303 (2022)
- [i91] Eli Bronstein, Mark Palatucci, Dominik Notz, Brandyn White, Alex Kuefler, Yiren Lu, Supratik Paul, Payam Nikdel, Paul Mougin, Hongge Chen, Justin Fu, Austin Abrams, Punit Shah, Evan Racah, Benjamin Frenkel, Shimon Whiteson, Dragomir Anguelov: Hierarchical Model-Based Imitation Learning for Planning in Autonomous Driving. CoRR abs/2210.09539 (2022)
- [i90] Jacob Beck, Matthew Thomas Jackson, Risto Vuorio, Shimon Whiteson: Hypernetworks in Meta-Reinforcement Learning. CoRR abs/2210.11348 (2022)
- [i89] Darius Muglich, Christian Schröder de Witt, Elise van der Pol, Shimon Whiteson, Jakob N. Foerster: Equivariant Networks for Zero-Shot Coordination. CoRR abs/2210.12124 (2022)
- [i88] Eli Bronstein, Sirish Srinivasan, Supratik Paul, Aman Sinha, Matthew O'Kelly, Payam Nikdel, Shimon Whiteson: Embedding Synthetic Off-Policy Experience for Autonomous Driving via Zero-Shot Curricula. CoRR abs/2212.01375 (2022)
- [i87] Angad Singh, Omar Makhlouf, Maximilian Igl, João V. Messias, Arnaud Doucet, Shimon Whiteson: Particle-Based Score Estimation for State Space Model Learning in Autonomous Driving. CoRR abs/2212.06968 (2022)
- [i86] Benjamin Ellis, Skander Moalla, Mikayel Samvelyan, Mingfei Sun, Anuj Mahajan, Jakob N. Foerster, Shimon Whiteson: SMACv2: An Improved Benchmark for Cooperative Multi-Agent Reinforcement Learning. CoRR abs/2212.07489 (2022)
- [i85] Yiren Lu, Justin Fu, George Tucker, Xinlei Pan, Eli Bronstein, Becca Roelofs, Benjamin Sapp, Brandyn White, Aleksandra Faust, Shimon Whiteson, Dragomir Anguelov, Sergey Levine: Imitation Is Not Enough: Robustifying Imitation with Reinforcement Learning for Challenging Driving Scenarios. CoRR abs/2212.11419 (2022)
- 2021
- [j28] Jacopo Castellini, Frans A. Oliehoek, Rahul Savani, Shimon Whiteson: Analysing factorizations of action-value networks for cooperative multi-agent reinforcement learning. Auton. Agents Multi Agent Syst. 35(2): 25 (2021)
- [j27] Luisa M. Zintgraf, Sebastian Schulze, Cong Lu, Leo Feng, Maximilian Igl, Kyriacos Shiarlis, Yarin Gal, Katja Hofmann, Shimon Whiteson: VariBAD: Variational Bayes-Adaptive Deep RL via Meta-Learning. J. Mach. Learn. Res. 22: 289:1-289:39 (2021)
- [j26] Dmitrii Beloborodov, Alexander E. Ulanov, Jakob N. Foerster, Shimon Whiteson, A. I. Lvovsky: Reinforcement learning enhanced quantum-inspired algorithm for combinatorial optimization. Mach. Learn. Sci. Technol. 2(2): 25009 (2021)
- [c129] Shangtong Zhang, Bo Liu, Shimon Whiteson: Mean-Variance Policy Iteration for Risk-Averse Reinforcement Learning. AAAI 2021: 10905-10913
- [c128] Luisa M. Zintgraf, Sam Devlin, Kamil Ciosek, Shimon Whiteson, Katja Hofmann: Deep Interactive Bayesian Reinforcement Learning via Meta-Learning. AAMAS 2021: 1712-1714
- [c127] Guangliang Li, Hamdi Dibeklioglu, Shimon Whiteson, Hayley Hung: Facial Feedback for Reinforcement Learning: A Case Study and Offline Analysis Using the TAMER Framework. AAMAS 2021: 1735-1737
- [c126] Tonghan Wang, Tarun Gupta, Anuj Mahajan, Bei Peng, Shimon Whiteson, Chongjie Zhang: RODE: Learning Roles to Decompose Multi-Agent Tasks. ICLR 2021
- [c125] Maximilian Igl, Gregory Farquhar, Jelena Luketina, Wendelin Boehmer, Shimon Whiteson: Transient Non-stationarity and Generalisation in Deep Reinforcement Learning. ICLR 2021
- [c124] Vitaly Kurin, Maximilian Igl, Tim Rocktäschel, Wendelin Boehmer, Shimon Whiteson: My Body is a Cage: the Role of Morphology in Graph-Based Incompatible Control. ICLR 2021
- [c123] Tarun Gupta, Anuj Mahajan, Bei Peng, Wendelin Boehmer, Shimon Whiteson: UneVEn: Universal Value Exploration for Multi-Agent Reinforcement Learning. ICML 2021: 3930-3941
- [c122] Shariq Iqbal, Christian A. Schröder de Witt, Bei Peng, Wendelin Boehmer, Shimon Whiteson, Fei Sha: Randomized Entity-wise Factorization for Multi-Agent Reinforcement Learning. ICML 2021: 4596-4606
- [c121] Anuj Mahajan, Mikayel Samvelyan, Lei Mao, Viktor Makoviychuk, Animesh Garg, Jean Kossaifi, Shimon Whiteson, Yuke Zhu, Animashree Anandkumar: Tesseract: Tensorised Actors for Multi-Agent Reinforcement Learning. ICML 2021: 7301-7312
- [c120] Shangtong Zhang, Yi Wan, Richard S. Sutton, Shimon Whiteson: Average-Reward Off-Policy Policy Evaluation with Function Approximation. ICML 2021: 12578-12588
- [c119] Shangtong Zhang, Hengshuai Yao, Shimon Whiteson: Breaking the Deadly Triad with a Target Network. ICML 2021: 12621-12631
- [c118] Luisa M. Zintgraf, Leo Feng, Cong Lu, Maximilian Igl, Kristian Hartikainen, Katja Hofmann, Shimon Whiteson: Exploration in Approximate Hyper-State Space for Meta Reinforcement Learning. ICML 2021: 12991-13001
- [c117] Shangtong Zhang, Wendelin Boehmer, Shimon Whiteson: Deep Residual Reinforcement Learning (Extended Abstract). IJCAI 2021: 4869-4873
- [c116] Ling Pan, Tabish Rashid, Bei Peng, Longbo Huang, Shimon Whiteson: Regularized Softmax Deep Multi-Agent Q-Learning. NeurIPS 2021: 1365-1377
- [c115] Bei Peng, Tabish Rashid, Christian Schröder de Witt, Pierre-Alexandre Kamienny, Philip H. S. Torr, Wendelin Boehmer, Shimon Whiteson: FACMAC: Factored Multi-Agent Centralised Policy Gradients. NeurIPS 2021: 12208-12221
- [c114] Mattie Fellows, Kristian Hartikainen, Shimon Whiteson: Bayesian Bellman Operators. NeurIPS 2021: 13641-13656
- [c113] Charlie Blake, Vitaly Kurin, Maximilian Igl, Shimon Whiteson: Snowflake: Scaling GNNs to high-dimensional continuous control via parameter freezing. NeurIPS 2021: 23983-23992
- [i84] Shangtong Zhang, Yi Wan, Richard S. Sutton, Shimon Whiteson: Average-Reward Off-Policy Policy Evaluation with Function Approximation. CoRR abs/2101.02808 (2021)
- [i83] Luisa M. Zintgraf, Sam Devlin, Kamil Ciosek, Shimon Whiteson, Katja Hofmann: Deep Interactive Bayesian Reinforcement Learning via Meta-Learning. CoRR abs/2101.03864 (2021)
- [i82] Shangtong Zhang, Hengshuai Yao, Shimon Whiteson: Breaking the Deadly Triad with a Target Network. CoRR abs/2101.08862 (2021)
- [i81] Charlie Blake, Vitaly Kurin, Maximilian Igl, Shimon Whiteson: Snowflake: Scaling GNNs to High-Dimensional Continuous Control via Parameter Freezing. CoRR abs/2103.01009 (2021)
- [i80] Ling Pan, Tabish Rashid, Bei Peng, Longbo Huang, Shimon Whiteson: Softmax with Regularization: Better Value Estimation in Multi-Agent Reinforcement Learning. CoRR abs/2103.11883 (2021)
- [i79] Bozhidar Vasilev, Tarun Gupta, Bei Peng, Shimon Whiteson: Semi-On-Policy Training for Sample Efficient Multi-Agent Policy Gradients. CoRR abs/2104.13446 (2021)
- [i78] Anuj Mahajan, Mikayel Samvelyan, Lei Mao, Viktor Makoviychuk, Animesh Garg, Jean Kossaifi, Shimon Whiteson, Yuke Zhu, Animashree Anandkumar: Tesseract: Tensorised Actors for Multi-Agent Reinforcement Learning. CoRR abs/2106.00136 (2021)
- [i77] Mingfei Sun, Anuj Mahajan, Katja Hofmann, Shimon Whiteson: SoftDICE for Imitation Learning: Rethinking Off-policy Distribution Matching. CoRR abs/2106.03155 (2021)
- [i76] Matthew Fellows, Kristian Hartikainen, Shimon Whiteson: Bayesian Bellman Operators. CoRR abs/2106.05012 (2021)
- [i75] Samuel Sokota, Christian Schröder de Witt, Maximilian Igl, Luisa M. Zintgraf, Philip H. S. Torr, Shimon Whiteson, Jakob N. Foerster: Implicit Communication as Minimum Entropy Coupling. CoRR abs/2107.08295 (2021)
- [i74] Shangtong Zhang, Shimon Whiteson: Truncated Emphatic Temporal Difference Methods for Prediction and Control. CoRR abs/2108.05338 (2021)
- [i73] Pascal Van Der Vaart, Anuj Mahajan, Shimon Whiteson: Model based Multi-agent Reinforcement Learning with Tensor Decompositions. CoRR abs/2110.14524 (2021)
- [i72] Anuj Mahajan, Mikayel Samvelyan, Lei Mao, Viktor Makoviychuk, Animesh Garg, Jean Kossaifi, Shimon Whiteson, Yuke Zhu, Animashree Anandkumar: Reinforcement Learning in Factored Action Spaces using Tensor Decompositions. CoRR abs/2110.14538 (2021)
- [i71] Zheng Xiong, Luisa M. Zintgraf, Jacob Beck, Risto Vuorio, Shimon Whiteson: On the Practical Consistency of Meta-Reinforcement Learning Algorithms. CoRR abs/2112.00478 (2021)
- [i70] Mingfei Sun, Sam Devlin, Katja Hofmann, Shimon Whiteson: Deterministic and Discriminative Imitation (D2-Imitation): Revisiting Adversarial Imitation for Sample Efficiency. CoRR abs/2112.06054 (2021)
- 2020
- [j25] Guangliang Li, Hamdi Dibeklioglu, Shimon Whiteson, Hayley Hung: Facial feedback for reinforcement learning: a case study and offline analysis using the TAMER framework. Auton. Agents Multi Agent Syst. 34(1): 22 (2020)
- [j24] Kamil Ciosek, Shimon Whiteson: Expected Policy Gradients for Reinforcement Learning. J. Mach. Learn. Res. 21: 52:1-52:51 (2020)
- [j23] Supratik Paul, Konstantinos I. Chatzilygeroudis, Kamil Ciosek, Jean-Baptiste Mouret, Michael A. Osborne, Shimon Whiteson: Robust Reinforcement Learning with Bayesian Optimisation and Quadrature. J. Mach. Learn. Res. 21: 151:1-151:31 (2020)
- [j22] Tabish Rashid, Mikayel Samvelyan, Christian Schröder de Witt, Gregory Farquhar, Jakob N. Foerster, Shimon Whiteson: Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning. J. Mach. Learn. Res. 21: 178:1-178:51 (2020)
- [c112] Yash Satsangi, Sungsu Lim, Shimon Whiteson, Frans A. Oliehoek, Martha White: Maximizing Information Gain in Partially Observable Environments via Prediction Rewards. AAMAS 2020: 1215-1223
- [c111] Shangtong Zhang, Wendelin Boehmer, Shimon Whiteson: Deep Residual Reinforcement Learning. AAMAS 2020: 1611-1619
- [c110] Tabish Rashid, Bei Peng, Wendelin Boehmer, Shimon Whiteson: Optimistic Exploration even with a Pessimistic Initialisation. ICLR 2020
- [c109] Luisa M. Zintgraf, Kyriacos Shiarlis, Maximilian Igl, Sebastian Schulze, Yarin Gal, Katja Hofmann, Shimon Whiteson: VariBAD: A Very Good Method for Bayes-Adaptive Deep RL via Meta-Learning. ICLR 2020
- [c108] Wendelin Boehmer, Vitaly Kurin, Shimon Whiteson: Deep Coordination Graphs. ICML 2020: 980-991
- [c107] Gregory Farquhar, Laura Gustafson, Zeming Lin, Shimon Whiteson, Nicolas Usunier, Gabriel Synnaeve: Growing Action Spaces. ICML 2020: 3040-3051
- [c106] Shangtong Zhang, Bo Liu, Shimon Whiteson: GradientDICE: Rethinking Generalized Offline Estimation of Stationary Values. ICML 2020: 11194-11203
- [c105] Shangtong Zhang, Bo Liu, Hengshuai Yao, Shimon Whiteson: Provably Convergent Two-Timescale Off-Policy Actor-Critic with Function Approximation. ICML 2020: 11204-11213
- [c104] Vitaly Kurin, Saad Godil, Shimon Whiteson, Bryan Catanzaro: Can Q-Learning with Graph Networks Learn a Generalizable Branching Heuristic for a SAT Solver? NeurIPS 2020
- [c103] Tabish Rashid, Gregory Farquhar, Bei Peng, Shimon Whiteson: Weighted QMIX: Expanding Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning. NeurIPS 2020
- [c102] Shangtong Zhang, Vivek Veeriah, Shimon Whiteson: Learning Retrospective Knowledge with Reverse Reinforcement Learning. NeurIPS 2020
- [c101] Maximilian Igl, Andrew Gambardella, Jinke He, Nantas Nardelli, N. Siddharth, Wendelin Boehmer, Shimon Whiteson: Multitask Soft Option Learning. UAI 2020: 969-978
- [i69] Guangliang Li, Hamdi Dibeklioglu, Shimon Whiteson, Hayley Hung: Facial Feedback for Reinforcement Learning: A Case Study and Offline Analysis Using the TAMER Framework. CoRR abs/2001.08703 (2020)
- [i68] Shangtong Zhang, Bo Liu, Shimon Whiteson: GradientDICE: Rethinking Generalized Offline Estimation of Stationary Values. CoRR abs/2001.11113 (2020)
- [i67] Dmitrii Beloborodov, Alexander E. Ulanov, Jakob N. Foerster, Shimon Whiteson, A. I. Lvovsky: Reinforcement Learning Enhanced Quantum-inspired Algorithm for Combinatorial Optimization. CoRR abs/2002.04676 (2020)
- [i66] Tabish Rashid, Bei Peng, Wendelin Böhmer, Shimon Whiteson: Optimistic Exploration even with a Pessimistic Initialisation. CoRR abs/2002.12174 (2020)
- [i65] Christian Schröder de Witt, Bei Peng, Pierre-Alexandre Kamienny, Philip H. S. Torr, Wendelin Böhmer, Shimon Whiteson: Deep Multi-Agent Reinforcement Learning for Decentralized Continuous Cooperative Control. CoRR abs/2003.06709 (2020)
- [i64] Tabish Rashid, Mikayel Samvelyan, Christian Schröder de Witt, Gregory Farquhar, Jakob N. Foerster, Shimon Whiteson: Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning. CoRR abs/2003.08839 (2020)
- [i63] Shangtong Zhang, Bo Liu, Shimon Whiteson: Per-Step Reward: A New Perspective for Risk-Averse Reinforcement Learning. CoRR abs/2004.10888 (2020)
- [i62] Yash Satsangi, Sungsu Lim, Shimon Whiteson, Frans A. Oliehoek, Martha White: Maximizing Information Gain in Partially Observable Environments via Prediction Reward. CoRR abs/2005.04912 (2020)
- [i61]