Roy Fox
2020 – today
- 2024
- [c28]Stephen Marcus McAleer, JB Lanier, Kevin A. Wang, Pierre Baldi, Tuomas Sandholm, Roy Fox:
Toward Optimal Policy Population Growth in Two-Player Zero-Sum Games. ICLR 2024
- [c27]Kolby Nottingham, Bodhisattwa Prasad Majumder, Bhavana Dalvi Mishra, Sameer Singh, Peter Clark, Roy Fox:
Skill Set Optimization: Reinforcing Language Model Behavior via Transferable Skills. ICML 2024
- [c26]Kolby Nottingham, Yasaman Razeghi, Kyungmin Kim, JB Lanier, Pierre Baldi, Roy Fox, Sameer Singh:
Selective Perception: Learning Concise State Descriptions for Language Model Actors. NAACL (Short Papers) 2024: 327-341
- [c25]Davide Corsi, Guy Amir, Andoni Rodríguez, Guy Katz, César Sánchez, Roy Fox:
Verification-Guided Shielding for Deep Reinforcement Learning. RLC 2024: 1759-1780
- [c24]Armin Karamzade, Kyungmin Kim, Montek Kalsi, Roy Fox:
Reinforcement Learning from Delayed Observations via World Models. RLC 2024: 2123-2139
- [i36]Kolby Nottingham, Bodhisattwa Prasad Majumder, Bhavana Dalvi Mishra, Sameer Singh, Peter Clark, Roy Fox:
Skill Set Optimization: Reinforcing Language Model Behavior via Transferable Skills. CoRR abs/2402.03244 (2024)
- [i35]Dmitrii Krylov, Armin Karamzade, Roy Fox:
Moonwalk: Inverse-Forward Differentiation. CoRR abs/2402.14212 (2024)
- [i34]Armin Karamzade, Kyungmin Kim, Montek Kalsi, Roy Fox:
Reinforcement Learning from Delayed Observations via World Models. CoRR abs/2403.12309 (2024)
- [i33]Davide Corsi, Guy Amir, Andoni Rodríguez, César Sánchez, Guy Katz, Roy Fox:
Verification-Guided Shielding for Deep Reinforcement Learning. CoRR abs/2406.06507 (2024)
- [i32]Kyungmin Kim, Davide Corsi, Andoni Rodríguez, JB Lanier, Benjami Parellada, Pierre Baldi, César Sánchez, Roy Fox:
Realizable Continuous-Space Shields for Safe Reinforcement Learning. CoRR abs/2410.02038 (2024)
- 2023
- [c23]Dmitrii Krylov, Pooya Khajeh, Junhan Ouyang, Thomas Reeves, Tongkai Liu, Hiba Ajmal, Hamidreza Aghasi, Roy Fox:
Learning to Design Analog Circuits to Meet Threshold Specifications. ICML 2023: 17858-17873
- [c22]Kolby Nottingham, Prithviraj Ammanabrolu, Alane Suhr, Yejin Choi, Hannaneh Hajishirzi, Sameer Singh, Roy Fox:
Do Embodied Agents Dream of Pixelated Sheep: Embodied Decision Making using Language Guided World Modelling. ICML 2023: 26311-26325
- [i31]Kolby Nottingham, Prithviraj Ammanabrolu, Alane Suhr, Yejin Choi, Hannaneh Hajishirzi, Sameer Singh, Roy Fox:
Do Embodied Agents Dream of Pixelated Sheep?: Embodied Decision Making using Language Guided World Modelling. CoRR abs/2301.12050 (2023)
- [i30]Kolby Nottingham, Yasaman Razeghi, Kyungmin Kim, JB Lanier, Pierre Baldi, Roy Fox, Sameer Singh:
Selective Perception: Optimizing State Descriptions with Reinforcement Learning for Language Model Actors. CoRR abs/2307.11922 (2023)
- [i29]Dmitrii Krylov, Pooya Khajeh, Junhan Ouyang, Thomas Reeves, Tongkai Liu, Hiba Ajmal, Hamidreza Aghasi, Roy Fox:
Learning to Design Analog Circuits to Meet Threshold Specifications. CoRR abs/2307.13861 (2023)
- 2022
- [c21]Roy Fox, Stephen M. McAleer, Will Overman, Ioannis Panageas:
Independent Natural Policy Gradient always converges in Markov Potential Games. AISTATS 2022: 4414-4425
- [c20]Litian Liang, Yaosheng Xu, Stephen McAleer, Dailin Hu, Alexander Ihler, Pieter Abbeel, Roy Fox:
Reducing Variance in Temporal-Difference Value Estimation via Ensemble of Deep Networks. ICML 2022: 13285-13301
- [i28]Stephen McAleer, Kevin Wang, John B. Lanier, Marc Lanctot, Pierre Baldi, Tuomas Sandholm, Roy Fox:
Anytime PSRO for Two-Player Zero-Sum Games. CoRR abs/2201.07700 (2022)
- [i27]Kolby Nottingham, Alekhya Pyla, Sameer Singh, Roy Fox:
Learning to Query Internet Text for Informing Reinforcement Learning Agents. CoRR abs/2205.13079 (2022)
- [i26]Stephen McAleer, John B. Lanier, Kevin A. Wang, Pierre Baldi, Roy Fox, Tuomas Sandholm:
Self-Play PSRO: Toward Optimal Populations in Two-Player Zero-Sum Games. CoRR abs/2207.06541 (2022)
- [i25]John B. Lanier, Stephen McAleer, Pierre Baldi, Roy Fox:
Feasible Adversarial Robust Reinforcement Learning for Underspecified Environments. CoRR abs/2207.09597 (2022)
- [i24]Litian Liang, Yaosheng Xu, Stephen McAleer, Dailin Hu, Alexander Ihler, Pieter Abbeel, Roy Fox:
Reducing Variance in Temporal-Difference Value Estimation via Ensemble of Deep Networks. CoRR abs/2209.07670 (2022)
- 2021
- [c19]Stephen McAleer, John B. Lanier, Kevin A. Wang, Pierre Baldi, Roy Fox:
XDO: A Double Oracle Algorithm for Extensive-Form Games. NeurIPS 2021: 23128-23139
- [i23]Forest Agostinelli, Alexander Shmakov, Stephen McAleer, Roy Fox, Pierre Baldi:
A* Search Without Expansions: Learning Heuristic Functions with Deep Q-Networks. CoRR abs/2102.04518 (2021)
- [i22]Stephen McAleer, John B. Lanier, Pierre Baldi, Roy Fox:
XDO: A Double Oracle Algorithm for Extensive-Form Games. CoRR abs/2103.06426 (2021)
- [i21]Stephen McAleer, John B. Lanier, Michael Dennis, Pierre Baldi, Roy Fox:
Improving Social Welfare While Preserving Autonomy via a Pareto Mediator. CoRR abs/2106.03927 (2021)
- [i20]Kolby Nottingham, Litian Liang, Daeyun Shin, Charless C. Fowlkes, Roy Fox, Sameer Singh:
Modular Framework for Visuomotor Language Grounding. CoRR abs/2109.02161 (2021)
- [i19]Roy Fox, Stephen McAleer, Will Overman, Ioannis Panageas:
Independent Natural Policy Gradient Always Converges in Markov Potential Games. CoRR abs/2110.10614 (2021)
- [i18]Litian Liang, Yaosheng Xu, Stephen McAleer, Dailin Hu, Alexander Ihler, Pieter Abbeel, Roy Fox:
Temporal-Difference Value Estimation via Uncertainty-Guided Soft Updates. CoRR abs/2110.14818 (2021)
- [i17]Dailin Hu, Pieter Abbeel, Roy Fox:
Count-Based Temperature Scheduling for Maximum Entropy Reinforcement Learning. CoRR abs/2111.14204 (2021)
- [i16]Yaosheng Xu, Dailin Hu, Litian Liang, Stephen McAleer, Pieter Abbeel, Roy Fox:
Target Entropy Annealing for Discrete Soft Actor-Critic. CoRR abs/2112.02852 (2021)
- 2020
- [c18]Stephen McAleer, John B. Lanier, Roy Fox, Pierre Baldi:
Pipeline PSRO: A Scalable Approach for Finding Approximate Nash Equilibria in Large Games. NeurIPS 2020
- [i15]Stephen McAleer, John B. Lanier, Roy Fox, Pierre Baldi:
Pipeline PSRO: A Scalable Approach for Finding Approximate Nash Equilibria in Large Games. CoRR abs/2006.08555 (2020)
2010 – 2019
- 2019
- [j1]Rohan Bavishi, Caroline Lemieux, Roy Fox, Koushik Sen, Ion Stoica:
AutoPandas: neural-backed generators for program synthesis. Proc. ACM Program. Lang. 3(OOPSLA): 168:1-168:27 (2019)
- [c17]Roy Fox, Ron Berenstein, Ion Stoica, Ken Goldberg:
Multi-Task Hierarchical Imitation Learning for Home Automation. CASE 2019: 1-8
- [i14]Roy Fox, Richard Shin, William Paul, Yitian Zou, Dawn Song, Ken Goldberg, Pieter Abbeel, Ion Stoica:
Hierarchical Variational Imitation Learning of Control Programs. CoRR abs/1912.12612 (2019)
- 2018
- [c16]Jonathan Lee, Michael Laskey, Roy Fox, Ken Goldberg:
Constraint Estimation and Derivative-Free Recovery for Robot Learning from Demonstrations. CASE 2018: 270-277
- [c15]Roy Fox, Richard Shin, Sanjay Krishnan, Ken Goldberg, Dawn Song, Ion Stoica:
Parametrized Hierarchical Procedures for Neural Programming. ICLR (Poster) 2018
- [c14]Eric Liang, Richard Liaw, Robert Nishihara, Philipp Moritz, Roy Fox, Ken Goldberg, Joseph Gonzalez, Michael I. Jordan, Ion Stoica:
RLlib: Abstractions for Distributed Reinforcement Learning. ICML 2018: 3059-3068
- [c13]Ron Berenstein, Roy Fox, Stephen McKinley, Stefano Carpin, Ken Goldberg:
Robustly Adjusting Indoor Drip Irrigation Emitters with the Toyota HSR Robot. ICRA 2018: 2236-2243
- [c12]Daniel Seita, Sanjay Krishnan, Roy Fox, Stephen McKinley, John F. Canny, Ken Goldberg:
Fast and Reliable Autonomous Surgical Debridement with Cable-Driven Robots Using a Two-Phase Calibration Procedure. ICRA 2018: 6651-6658
- [c11]Ajay Kumar Tanwani, Jonathan Lee, Brijen Thananjeyan, Michael Laskey, Sanjay Krishnan, Roy Fox, Ken Goldberg, Sylvain Calinon:
Generalizing Robot Imitation Learning with Invariant Hidden Semi-Markov Models. WAFR 2018: 196-211
- [i13]Jonathan Lee, Michael Laskey, Roy Fox, Ken Goldberg:
Derivative-Free Failure Avoidance Control for Manipulation using Learned Support Constraints. CoRR abs/1801.10321 (2018)
- [i12]Ajay Kumar Tanwani, Jonathan Lee, Brijen Thananjeyan, Michael Laskey, Sanjay Krishnan, Roy Fox, Ken Goldberg, Sylvain Calinon:
Generalizing Robot Imitation Learning with Invariant Hidden Semi-Markov Models. CoRR abs/1811.07489 (2018)
- 2017
- [c10]Carolyn Chen, Sanjay Krishnan, Michael Laskey, Roy Fox, Ken Goldberg:
An algorithm and user study for teaching bilateral manipulation via iterated best response demonstrations. CASE 2017: 151-158
- [c9]Caleb Chuck, Michael Laskey, Sanjay Krishnan, Ruta Joshi, Roy Fox, Ken Goldberg:
Statistical data cleaning for deep learning of automation tasks from demonstrations. CASE 2017: 1142-1149
- [c8]Michael Laskey, Jonathan Lee, Roy Fox, Anca D. Dragan, Ken Goldberg:
DART: Noise Injection for Robust Imitation Learning. CoRL 2017: 143-156
- [c7]Sanjay Krishnan, Roy Fox, Ion Stoica, Ken Goldberg:
DDCO: Discovery of Deep Continuous Options for Robot Learning from Demonstrations. CoRL 2017: 418-437
- [i11]Roy Fox, Sanjay Krishnan, Ion Stoica, Ken Goldberg:
Multi-Level Discovery of Deep Options. CoRR abs/1703.08294 (2017)
- [i10]Michael Laskey, Jonathan Lee, Wesley Yu-Shu Hsieh, Richard Liaw, Jeffrey Mahler, Roy Fox, Ken Goldberg:
Iterative Noise Injection for Scalable Imitation Learning. CoRR abs/1703.09327 (2017)
- [i9]Daniel Seita, Sanjay Krishnan, Roy Fox, Stephen McKinley, John F. Canny, Ken Goldberg:
Fast and Reliable Autonomous Surgical Debridement with Cable-Driven Robots Using a Two-Phase Calibration Procedure. CoRR abs/1709.06668 (2017)
- [i8]Sanjay Krishnan, Roy Fox, Ion Stoica, Ken Goldberg:
DDCO: Discovery of Deep Continuous Options for Robot Learning from Demonstrations. CoRR abs/1710.05421 (2017)
- [i7]Eric Liang, Richard Liaw, Robert Nishihara, Philipp Moritz, Roy Fox, Joseph Gonzalez, Ken Goldberg, Ion Stoica:
Ray RLLib: A Composable and Scalable Reinforcement Learning Library. CoRR abs/1712.09381 (2017)
- 2016
- [b1]Roy Fox:
Information-Theoretic Methods for Planning and Learning in Partially Observable Markov Decision Processes (שער נוסף בעברית: שיטות תורת-האינפורמציה לתכנון ולמידה בתהליכי החלטה מרקוב נצפים חלקית.). Hebrew University of Jerusalem, Israel, 2016
- [c6]Roy Fox, Naftali Tishby:
Minimum-information LQG control Part II: Retentive controllers. CDC 2016: 5603-5609
- [c5]Roy Fox, Naftali Tishby:
Minimum-information LQG control part I: Memoryless controllers. CDC 2016: 5610-5616
- [c4]Roy Fox, Ari Pakman, Naftali Tishby:
Taming the Noise in Reinforcement Learning via Soft Updates. UAI 2016
- [i6]Roy Fox, Naftali Tishby:
Minimum-Information LQG Control - Part I: Memoryless Controllers. CoRR abs/1606.01946 (2016)
- [i5]Roy Fox, Naftali Tishby:
Minimum-Information LQG Control - Part II: Retentive Controllers. CoRR abs/1606.01947 (2016)
- [i4]Roy Fox, Michal Moshkovitz, Naftali Tishby:
Principled Option Learning in Markov Decision Processes. CoRR abs/1609.05524 (2016)
- [i3]Roy Fox:
Information-Theoretic Methods for Planning and Learning in Partially Observable Markov Decision Processes. CoRR abs/1609.07672 (2016)
- 2015
- [i2]Roy Fox, Ari Pakman, Naftali Tishby:
G-Learning: Taming the Noise in Reinforcement Learning via Soft Updates. CoRR abs/1512.08562 (2015)
- [i1]Roy Fox, Naftali Tishby:
Optimal Selective Attention in Reactive Agents. CoRR abs/1512.08575 (2015)
- 2013
- [c3]Josh Merel, Roy Fox, Tony Jebara, Liam Paninski:
A multi-agent control framework for co-adaptation in brain-computer interfaces. NIPS 2013: 2841-2849
- 2012
- [c2]Roy Fox, Naftali Tishby:
Bounded Planning in Passive POMDPs. ICML 2012
2000 – 2009
- 2007
- [c1]Roy Fox, Moshe Tennenholtz:
A Reinforcement Learning Algorithm with Polynomial Interaction Complexity for Only-Costly-Observable MDPs. AAAI 2007: 553-558
last updated on 2024-11-13 23:52 CET by the dblp team
all metadata released as open data under CC0 1.0 license