Simon S. Du
Simon Shaolei Du – 杜少雷
Person information

- unicode name: 杜少雷
- affiliation: University of Washington, USA
- affiliation (former): Carnegie Mellon University, Machine Learning Department
2020 – today
2023

- [j5] Wenqing Zheng, Hao (Frank) Yang, Jiarui Cai, Peihao Wang, Xuan Jiang, Simon Shaolei Du, Yinhai Wang, Zhangyang Wang: Integrating the traffic science with representation learning for city-wide network congestion prediction. Inf. Fusion 99: 101837 (2023)
- [j4] Shusheng Xu, Yancheng Liang, Yunfei Li, Simon Shaolei Du, Yi Wu: Beyond Information Gain: An Empirical Benchmark for Low-Switching-Cost Reinforcement Learning. Trans. Mach. Learn. Res. 2023 (2023)
- [c86] Yulai Zhao, Jianshu Chen, Simon S. Du: Blessing of Class Diversity in Pre-training. AISTATS 2023: 283-305
- [c85] Weihang Xu, Simon S. Du: Over-Parameterization Exponentially Slows Down Gradient Descent for Learning a Single Neuron. COLT 2023: 1155-1198
- [c84] Qiwen Cui, Kaiqing Zhang, Simon S. Du: Breaking the Curse of Multiagents in a Large State Space: RL in Markov Games with Independent Linear Function Approximation. COLT 2023: 2651-2652
- [c83] Yan Dai, Ruosong Wang, Simon Shaolei Du: Variance-Aware Sparse Linear Bandits. ICLR 2023
- [c82] Shicong Cen, Yuejie Chi, Simon Shaolei Du, Lin Xiao: Faster Last-iterate Convergence of Policy Optimization in Zero-Sum Markov Games. ICLR 2023
- [c81] Haozhe Jiang, Qiwen Cui, Zhihan Xiong, Maryam Fazel, Simon Shaolei Du: Offline Congestion Games: How Feedback Type Affects Data Coverage Requirement. ICLR 2023
- [c80] Rui Yuan, Simon Shaolei Du, Robert M. Gower, Alessandro Lazaric, Lin Xiao: Linear Convergence of Natural Policy Gradient Methods with Log-Linear Policies. ICLR 2023
- [c79] Jikai Jin, Zhiyuan Li, Kaifeng Lyu, Simon Shaolei Du, Jason D. Lee: Understanding Incremental Learning of Gradient Descent: A Fine-grained Analysis of Matrix Sensing. ICML 2023: 15200-15238
- [c78] Yiping Wang, Yifang Chen, Kevin Jamieson, Simon Shaolei Du: Improved Active Multi-Task Representation Learning via Lasso. ICML 2023: 35548-35578
- [c77] Haotian Ye, Xiaoyu Chen, Liwei Wang, Simon Shaolei Du: On the Power of Pre-training for Generalization in RL: Provable Benefits and Hardness. ICML 2023: 39770-39800
- [c76] Runlong Zhou, Ruosong Wang, Simon Shaolei Du: Horizon-Free and Variance-Dependent Reinforcement Learning for Latent Markov Decision Processes. ICML 2023: 42698-42723
- [c75] Runlong Zhou, Zihan Zhang, Simon Shaolei Du: Sharp Variance-Dependent Bounds in Reinforcement Learning: Best of Both Worlds in Stochastic and Deterministic Environments. ICML 2023: 42878-42914
- [i109] Jikai Jin, Zhiyuan Li, Kaifeng Lyu, Simon S. Du, Jason D. Lee: Understanding Incremental Learning of Gradient Descent: A Fine-grained Analysis of Matrix Sensing. CoRR abs/2301.11500 (2023)
- [i108] Runlong Zhou, Zihan Zhang, Simon S. Du: Sharp Variance-Dependent Bounds in Reinforcement Learning: Best of Both Worlds in Stochastic and Deterministic Environments. CoRR abs/2301.13446 (2023)
- [i107] Yunchang Yang, Han Zhong, Tianhao Wu, Bin Liu, Liwei Wang, Simon S. Du: A Reduction-based Framework for Sequential Decision Making with Delayed Feedback. CoRR abs/2302.01477 (2023)
- [i106] Qiwen Cui, Kaiqing Zhang, Simon S. Du: Breaking the Curse of Multiagents in a Large State Space: RL in Markov Games with Independent Linear Function Approximation. CoRR abs/2302.03673 (2023)
- [i105] Weihang Xu, Simon S. Du: Over-Parameterization Exponentially Slows Down Gradient Descent for Learning a Single Neuron. CoRR abs/2302.10034 (2023)
- [i104] Yuandong Tian, Yiping Wang, Beidi Chen, Simon S. Du: Scan and Snap: Understanding Training Dynamics and Token Composition in 1-layer Transformer. CoRR abs/2305.16380 (2023)
- [i103] Yiping Wang, Yifang Chen, Kevin G. Jamieson, Simon S. Du: Improved Active Multi-Task Representation Learning via Lasso. CoRR abs/2306.02556 (2023)
- [i102] Haozhe Jiang, Qiwen Cui, Zhihan Xiong, Maryam Fazel, Simon S. Du: A Black-box Approach for Non-stationary Multi-agent Reinforcement Learning. CoRR abs/2306.07465 (2023)
- [i101] Yifang Chen, Yingbing Huang, Simon S. Du, Kevin G. Jamieson, Guanya Shi: Active Representation Learning for General Task Space with Applications in Robotics. CoRR abs/2306.08942 (2023)
- [i100] Jifan Zhang, Yifang Chen, Gregory Canal, Stephen Mussmann, Yinglun Zhu, Simon Shaolei Du, Kevin G. Jamieson, Robert D. Nowak: LabelBench: A Comprehensive Framework for Benchmarking Label-Efficient Learning. CoRR abs/2306.09910 (2023)
- [i99] Zihan Zhang, Yuxin Chen, Jason D. Lee, Simon S. Du: Settling the Sample Complexity of Online Reinforcement Learning. CoRR abs/2307.13586 (2023)
- [i98] Yuandong Tian, Yiping Wang, Zhenyu Zhang, Beidi Chen, Simon S. Du: JoMA: Demystifying Multilayer Transformers via JOint Dynamics of MLP and Attention. CoRR abs/2310.00535 (2023)
- [i97] Nuoya Xiong, Lijun Ding, Simon S. Du: How Over-Parameterization Slows Down Gradient Descent in Matrix Sensing: The Curses of Symmetry and Initialization. CoRR abs/2310.01769 (2023)
- [i96] Zhaoyi Zhou, Chuning Zhu, Runlong Zhou, Qiwen Cui, Abhishek Gupta, Simon Shaolei Du: Free from Bellman Completeness: Trajectory Stitching via Model-based Return-conditioned Supervised Learning. CoRR abs/2310.19308 (2023)
- [i95] Ruizhe Shi, Yuyao Liu, Yanjie Ze, Simon S. Du, Huazhe Xu: Unleashing the Power of Pre-trained Language Models for Offline Reinforcement Learning. CoRR abs/2310.20587 (2023)

2022
- [j3] Bin Shi, Simon S. Du, Michael I. Jordan, Weijie J. Su: Understanding the acceleration phenomenon via high-resolution differential equations. Math. Program. 195(1): 79-148 (2022)
- [c74] Xiaoxia Wu, Yuege Xie, Simon Shaolei Du, Rachel Ward: AdaLoss: A Computationally-Efficient and Provably Convergent Adaptive Gradient Method. AAAI 2022: 8691-8699
- [c73] Zehao Dou, Zhuoran Yang, Zhaoran Wang, Simon S. Du: Gap-Dependent Bounds for Two-Player Markov Games. AISTATS 2022: 432-455
- [c72] Yulai Zhao, Yuandong Tian, Jason D. Lee, Simon S. Du: Provably Efficient Policy Optimization for Two-Player Zero-Sum Markov Games. AISTATS 2022: 2736-2761
- [c71] Zihan Zhang, Xiangyang Ji, Simon S. Du: Horizon-Free Reinforcement Learning in Polynomial Time: the Power of Stationary Policies. COLT 2022: 3858-3904
- [c70] Zhili Feng, Shaobo Han, Simon Shaolei Du: Provable Adaptation across Multiway Domains via Representation Learning. ICLR 2022
- [c69] Yunchang Yang, Tianhao Wu, Han Zhong, Evrard Garcelon, Matteo Pirotta, Alessandro Lazaric, Liwei Wang, Simon Shaolei Du: A Reduction-Based Framework for Conservative Bandits and Reinforcement Learning. ICLR 2022
- [c68] Haoyuan Cai, Tengyu Ma, Simon S. Du: Near-Optimal Algorithms for Autonomous Exploration and Multi-Goal Stochastic Shortest Path. ICML 2022: 2434-2456
- [c67] Yifang Chen, Kevin G. Jamieson, Simon S. Du: Active Multi-Task Representation Learning. ICML 2022: 3271-3298
- [c66] Andrew J. Wagenmaker, Yifang Chen, Max Simchowitz, Simon S. Du, Kevin G. Jamieson: First-Order Regret in Reinforcement Learning with Linear Function Approximation: A Robust Estimation Approach. ICML 2022: 22384-22429
- [c65] Andrew J. Wagenmaker, Yifang Chen, Max Simchowitz, Simon S. Du, Kevin G. Jamieson: Reward-Free RL is No Harder Than Reward-Aware RL in Linear Markov Decision Processes. ICML 2022: 22430-22456
- [c64] Tongzhou Wang, Simon S. Du, Antonio Torralba, Phillip Isola, Amy Zhang, Yuandong Tian: Denoised MDPs: Learning World Models Better Than the World Itself. ICML 2022: 22591-22612
- [c63] Tianhao Wu, Yunchang Yang, Han Zhong, Liwei Wang, Simon S. Du, Jiantao Jiao: Nearly Optimal Policy Optimization with Stable at Any Time Guarantee. ICML 2022: 24243-24265
- [c62] Qiwen Cui, Simon S. Du: Provably Efficient Offline Multi-agent Reinforcement Learning via Strategy-wise Bonus. NeurIPS 2022
- [c61] Qiwen Cui, Simon S. Du: When are Offline Two-Player Zero-Sum Markov Games Solvable? NeurIPS 2022
- [c60] Qiwen Cui, Zhihan Xiong, Maryam Fazel, Simon S. Du: Learning in Congestion Games with Bandit Feedback. NeurIPS 2022
- [c59] Rui Lu, Andrew Zhao, Simon S. Du, Gao Huang: Provable General Function Class Representation Learning in Multitask Bandits and MDP. NeurIPS 2022
- [c58] Xinqi Wang, Qiwen Cui, Simon S. Du: On Gap-dependent Bounds for Offline Reinforcement Learning. NeurIPS 2022
- [c57] Zhihan Xiong, Ruoqi Shen, Qiwen Cui, Maryam Fazel, Simon S. Du: Near-Optimal Randomized Exploration for Tabular Markov Decision Processes. NeurIPS 2022
- [i94] Qiwen Cui, Simon S. Du: When is Offline Two-Player Zero-Sum Markov Game Solvable? CoRR abs/2201.03522 (2022)
- [i93] Andrew Wagenmaker, Yifang Chen, Max Simchowitz, Simon S. Du, Kevin Jamieson: Reward-Free RL is No Harder Than Reward-Aware RL in Linear Markov Decision Processes. CoRR abs/2201.11206 (2022)
- [i92] Yifang Chen, Simon S. Du, Kevin Jamieson: Active Multi-Task Representation Learning. CoRR abs/2202.00911 (2022)
- [i91] Meixin Zhu, Simon S. Du, Xuesong Wang, Hao (Frank) Yang, Ziyuan Pu, Yinhai Wang: TransFollower: Long-Sequence Car-Following Trajectory Prediction through Transformer. CoRR abs/2202.03183 (2022)
- [i90] Runlong Zhou, Yuandong Tian, Yi Wu, Simon S. Du: Understanding Curriculum Learning in Policy Optimization for Solving Combinatorial Optimization Problems. CoRR abs/2202.05423 (2022)
- [i89] Zihan Zhang, Xiangyang Ji, Simon S. Du: Horizon-Free Reinforcement Learning in Polynomial Time: the Power of Stationary Policies. CoRR abs/2203.12922 (2022)
- [i88] Jiaqi Yang, Qi Lei, Jason D. Lee, Simon S. Du: Nearly Minimax Algorithms for Linear Bandits with Shared Representation. CoRR abs/2203.15664 (2022)
- [i87] Haoyuan Cai, Tengyu Ma, Simon S. Du: Near-Optimal Algorithms for Autonomous Exploration and Multi-Goal Stochastic Shortest Path. CoRR abs/2205.10729 (2022)
- [i86] Yan Dai, Ruosong Wang, Simon S. Du: Variance-Aware Sparse Linear Bandits. CoRR abs/2205.13450 (2022)
- [i85] Rui Lu, Andrew Zhao, Simon S. Du, Gao Huang: Provable General Function Class Representation Learning in Multitask Bandits and MDPs. CoRR abs/2205.15701 (2022)
- [i84] Qiwen Cui, Simon S. Du: Provably Efficient Offline Multi-agent Reinforcement Learning via Strategy-wise Bonus. CoRR abs/2206.00159 (2022)
- [i83] Xinqi Wang, Qiwen Cui, Simon S. Du: On Gap-dependent Bounds for Offline Reinforcement Learning. CoRR abs/2206.00177 (2022)
- [i82] Qiwen Cui, Zhihan Xiong, Maryam Fazel, Simon S. Du: Learning in Congestion Games with Bandit Feedback. CoRR abs/2206.01880 (2022)
- [i81] Simon S. Du, Gauthier Gidel, Michael I. Jordan, Chris Junchi Li: Optimal Extragradient-Based Bilinearly-Coupled Saddle-Point Optimization. CoRR abs/2206.08573 (2022)
- [i80] Tongzhou Wang, Simon S. Du, Antonio Torralba, Phillip Isola, Amy Zhang, Yuandong Tian: Denoised MDPs: Learning World Models Better Than the World Itself. CoRR abs/2206.15477 (2022)
- [i79] Yulai Zhao, Jianshu Chen, Simon S. Du: Blessing of Class Diversity in Pre-training. CoRR abs/2209.03447 (2022)
- [i78] Shicong Cen, Yuejie Chi, Simon S. Du, Lin Xiao: Faster Last-iterate Convergence of Policy Optimization in Zero-Sum Markov Games. CoRR abs/2210.01050 (2022)
- [i77] Rui Yuan, Simon S. Du, Robert M. Gower, Alessandro Lazaric, Lin Xiao: Linear Convergence of Natural Policy Gradient Methods with Log-Linear Policies. CoRR abs/2210.01400 (2022)
- [i76] Haotian Ye, Xiaoyu Chen, Liwei Wang, Simon S. Du: On the Power of Pre-training for Generalization in RL: Provable Benefits and Hardness. CoRR abs/2210.10464 (2022)
- [i75] Runlong Zhou, Ruosong Wang, Simon S. Du: Horizon-Free Reinforcement Learning for Latent Markov Decision Processes. CoRR abs/2210.11604 (2022)
- [i74] Haozhe Jiang, Qiwen Cui, Zhihan Xiong, Maryam Fazel, Simon S. Du: Offline congestion games: How feedback type affects data coverage requirement. CoRR abs/2210.13396 (2022)

2021
- [j2] Yining Wang, Yi Wu, Simon S. Du: Near-Linear Time Local Polynomial Nonparametric Estimation with Box Kernels. INFORMS J. Comput. 33(4): 1339-1353 (2021)
- [c56] Kunhe Yang, Lin F. Yang, Simon S. Du: Q-learning with Logarithmic Regret. AISTATS 2021: 1576-1584
- [c55] Haike Xu, Tengyu Ma, Simon S. Du: Fine-Grained Gap-Dependent Bounds for Tabular MDPs via Adaptive Multi-Step Bootstrap. COLT 2021: 4438-4472
- [c54] Zihan Zhang, Xiangyang Ji, Simon S. Du: Is Reinforcement Learning More Difficult Than Bandits? A Near-optimal Algorithm Escaping the Curse of Horizon. COLT 2021: 4528-4531
- [c53] Yining Wang, Ruosong Wang, Simon Shaolei Du, Akshay Krishnamurthy: Optimism in Reinforcement Learning with Generalized Linear Function Approximation. ICLR 2021
- [c52] Simon Shaolei Du, Wei Hu, Sham M. Kakade, Jason D. Lee, Qi Lei: Few-Shot Learning via Learning the Representation, Provably. ICLR 2021
- [c51] Zhenggang Tang, Chao Yu, Boyuan Chen, Huazhe Xu, Xiaolong Wang, Fei Fang, Simon Shaolei Du, Yu Wang, Yi Wu: Discovering Diverse Multi-Agent Strategic Behavior via Reward Randomization. ICLR 2021
- [c50] Keyulu Xu, Mozhi Zhang, Jingling Li, Simon Shaolei Du, Ken-ichi Kawarabayashi, Stefanie Jegelka: How Neural Networks Extrapolate: From Feedforward to Graph Neural Networks. ICLR 2021
- [c49] Jiaqi Yang, Wei Hu, Jason D. Lee, Simon Shaolei Du: Impact of Representation Learning in Linear Bandits. ICLR 2021
- [c48] Yifang Chen, Simon S. Du, Kevin Jamieson: Improved Corruption Robust Algorithms for Episodic Reinforcement Learning. ICML 2021: 1561-1570
- [c47] Simon S. Du, Sham M. Kakade, Jason D. Lee, Shachar Lovett, Gaurav Mahajan, Wen Sun, Ruosong Wang: Bilinear Classes: A Structural Framework for Provable Generalization in RL. ICML 2021: 2826-2836
- [c46] Tianhao Wu, Yunchang Yang, Simon S. Du, Liwei Wang: On Reinforcement Learning with Adversarial Corruption and Its Application to Block MDP. ICML 2021: 11296-11306
- [c45] Zihan Zhang, Simon S. Du, Xiangyang Ji: Near Optimal Reward-Free Reinforcement Learning. ICML 2021: 12402-12412
- [c44] Tian Ye, Simon S. Du: Global Convergence of Gradient Descent for Asymmetric Low-Rank Matrix Factorization. NeurIPS 2021: 1429-1439
- [c43] Zihan Zhang, Jiaqi Yang, Xiangyang Ji, Simon S. Du: Improved Variance-Aware Confidence Sets for Linear Bandits and Linear Mixture MDP. NeurIPS 2021: 4342-4355
- [c42] Jean Tarbouriech, Runlong Zhou, Simon S. Du, Matteo Pirotta, Michal Valko, Alessandro Lazaric: Stochastic Shortest Path: Minimax, Parameter-Free and Towards Horizon-Free Regret. NeurIPS 2021: 6843-6855
- [c41] Tongzheng Ren, Jialian Li, Bo Dai, Simon S. Du, Sujay Sanghavi: Nearly Horizon-Free Offline Reinforcement Learning. NeurIPS 2021: 15621-15634
- [c40] Yifang Chen, Simon S. Du, Kevin G. Jamieson: Corruption Robust Active Learning. NeurIPS 2021: 29643-29654
- [c39] Simon S. Du, Wei Hu, Zhiyuan Li, Ruoqi Shen, Zhao Song, Jiajun Wu: When is particle filtering efficient for planning in partially observed linear dynamical systems? UAI 2021: 728-737
- [i73] Minbo Gao, Tianle Xie, Simon S. Du, Lin F. Yang: A Provably Efficient Algorithm for Linear Markov Decision Process with Low Switching Cost. CoRR abs/2101.00494 (2021)
- [i72] Zihan Zhang, Jiaqi Yang, Xiangyang Ji, Simon S. Du: Variance-Aware Confidence Set: Variance-Dependent Bound for Linear Bandits and Horizon-Free Bound for Linear Mixture MDP. CoRR abs/2101.12745 (2021)
- [i71] Haike Xu, Tengyu Ma, Simon S. Du: Fine-Grained Gap-Dependent Bounds for Tabular MDPs via Adaptive Multi-Step Bootstrap. CoRR abs/2102.04692 (2021)
- [i70] Yifang Chen, Simon S. Du, Kevin Jamieson: Improved Corruption Robust Algorithms for Episodic Reinforcement Learning. CoRR abs/2102.06875 (2021)
- [i69] Yulai Zhao, Yuandong Tian, Jason D. Lee, Simon S. Du: Provably Efficient Policy Gradient Methods for Two-Player Zero-Sum Markov Games. CoRR abs/2102.08903 (2021)
- [i68] Zhihan Xiong, Ruoqi Shen, Simon S. Du: Randomized Exploration is Near-Optimal for Tabular MDP. CoRR abs/2102.09703 (2021)
- [i67] Zhenggang Tang, Chao Yu, Boyuan Chen, Huazhe Xu, Xiaolong Wang, Fei Fang, Simon S. Du, Yu Wang, Yi Wu: Discovering Diverse Multi-Agent Strategic Behavior via Reward Randomization. CoRR abs/2103.04564 (2021)
- [i66] Simon S. Du, Sham M. Kakade, Jason D. Lee, Shachar Lovett, Gaurav Mahajan, Wen Sun, Ruosong Wang: Bilinear Classes: A Structural Framework for Provable Generalization in RL. CoRR abs/2103.10897 (2021)
- [i65] Tongzheng Ren, Jialian Li, Bo Dai, Simon S. Du, Sujay Sanghavi: Nearly Horizon-Free Offline Reinforcement Learning. CoRR abs/2103.14077 (2021)
- [i64] Jean Tarbouriech, Runlong Zhou, Simon S. Du, Matteo Pirotta, Michal Valko, Alessandro Lazaric: Stochastic Shortest Path: Minimax, Parameter-Free and Towards Horizon-Free Regret. CoRR abs/2104.11186 (2021)
- [i63] Zhili Feng, Shaobo Han, Simon S. Du: Provable Adaptation across Multiway Domains via Representation Learning. CoRR abs/2106.06657 (2021)
- [i62] Rui Lu, Gao Huang, Simon S. Du: On the Power of Multitask Representation Learning in Linear MDP. CoRR abs/2106.08053 (2021)
- [i61] Yifang Chen, Simon S. Du, Kevin Jamieson: Corruption Robust Active Learning. CoRR abs/2106.11220 (2021)
- [i60] Yunchang Yang, Tianhao Wu, Han Zhong, Evrard Garcelon, Matteo Pirotta, Alessandro Lazaric, Liwei Wang, Simon S. Du: A Unified Framework for Conservative Exploration. CoRR abs/2106.11692 (2021)
- [i59] Tian Ye, Simon S. Du: Global Convergence of Gradient Descent for Asymmetric Low-Rank Matrix Factorization. CoRR abs/2106.14289 (2021)
- [i58] Zehao Dou, Zhuoran Yang, Zhaoran Wang, Simon S. Du: Gap-Dependent Bounds for Two-Player Markov Games. CoRR abs/2107.00685 (2021)
- [i57] Xiaoxia Wu, Yuege Xie, Simon S. Du, Rachel Ward: AdaLoss: A computationally-efficient and provably convergent adaptive gradient method. CoRR abs/2109.08282 (2021)
- [i56] Xiang Wang, Xinlei Chen, Simon S. Du, Yuandong Tian: Towards Demystifying Representation Learning with Non-contrastive Self-supervision. CoRR abs/2110.04947 (2021)
- [i55] Andrew Wagenmaker, Yifang Chen, Max Simchowitz, Simon S. Du, Kevin Jamieson: First-Order Regret in Reinforcement Learning with Linear Function Approximation: A Robust Estimation Approach. CoRR abs/2112.03432 (2021)
- [i54] Shusheng Xu, Yancheng Liang, Yunfei Li, Simon Shaolei Du, Yi Wu: A Benchmark for Low-Switching-Cost Reinforcement Learning. CoRR abs/2112.06424 (2021)
- [i53] Tianhao Wu, Yunchang Yang, Han Zhong, Liwei Wang, Simon S. Du, Jiantao Jiao: Nearly Optimal Policy Optimization with Stable at Any Time Guarantee. CoRR abs/2112.10935 (2021)

2020
- [j1] Xi Chen, Simon S. Du, Xin T. Tong: On Stationary-Point Hitting Time and Ergodicity of Stochastic Gradient Langevin Dynamics. J. Mach. Learn. Res. 21: 68:1-68:41 (2020)
- [c38] Sanjeev Arora, Simon S. Du, Zhiyuan Li, Ruslan Salakhutdinov, Ruosong Wang, Dingli Yu: Harnessing the Power of Infinitely Wide Deep Nets on Small-data Tasks. ICLR 2020
- [c37] Simon S. Du, Sham M. Kakade, Ruosong Wang, Lin F. Yang: Is a Good Representation Sufficient for Sample Efficient Reinforcement Learning? ICLR 2020
- [c36] Keyulu Xu, Jingling Li, Mozhi Zhang, Simon S. Du, Ken-ichi Kawarabayashi, Stefanie Jegelka: What Can Neural Networks Reason About? ICLR 2020
- [c35] Sanjeev Arora, Simon S. Du, Sham M. Kakade, Yuping Luo, Nikunj Saunshi: Provable Representation Learning for Imitation Learning via Bi-level Optimization. ICML 2020: 367-376
- [c34] Yunbo Wang, Bo Liu, Jiajun Wu, Yuke Zhu, Simon S. Du, Li Fei-Fei, Joshua B. Tenenbaum: DualSMC: Tunneling Differentiable Filtering and Planning under Continuous POMDPs. IJCAI 2020: 4190-4198
- [c33] Simon S. Du, Jason D. Lee, Gaurav Mahajan, Ruosong Wang: Agnostic $Q$-learning with Function Approximation in Deterministic Systems: Near-Optimal Bounds on Approximation Error and Sample Complexity. NeurIPS 2020
- [c32] Fei Feng, Ruosong Wang, Wotao Yin, Simon S. Du, Lin F. Yang: Provably Efficient Exploration for Reinforcement Learning Using Unsupervised Learning. NeurIPS 2020
- [c31] Ruosong Wang, Simon S. Du, Lin F. Yang, Sham M. Kakade: Is Long Horizon RL More Difficult Than Short Horizon RL? NeurIPS 2020
- [c30] Ruosong Wang, Simon S. Du, Lin F. Yang, Ruslan Salakhutdinov: On Reward-Free Reinforcement Learning with Linear Function Approximation. NeurIPS 2020
- [c29] Ruosong Wang, Peilin Zhong, Simon S. Du, Ruslan Salakhutdinov, Lin F. Yang: Planning with General Objective Functions: Going Beyond Total Rewards. NeurIPS 2020
- [c28] Yi Zhang, Orestis Plevrakis, Simon S. Du, Xingguo Li, Zhao Song, Sanjeev Arora: Over-parameterized Adversarial Training: An Analysis Overcoming the Curse of Dimensionality. NeurIPS 2020
- [i52] Yi Zhang, Orestis Plevrakis, Simon S. Du, Xingguo Li, Zhao Song, Sanjeev Arora: Over-parameterized Adversarial Training: An Analysis Overcoming the Curse of Dimensionality. CoRR abs/2002.06668 (2020)
- [i51] Simon S. Du, Jason D. Lee, Gaurav Mahajan, Ruosong Wang: Agnostic Q-learning with Function Approximation in Deterministic Systems: Tight Bounds on Approximation Error and Sample Complexity. CoRR abs/2002.07125 (2020)
- [i50]