Simon Shaolei Du – 杜少雷
Person information
- unicode name: 杜少雷
- affiliation: University of Washington, USA
- affiliation (former): Carnegie Mellon University, Machine Learning Department
2020 – today
- 2024
- [c103] Runlong Zhou, Simon S. Du, Beibin Li: Reflect-RL: Two-Player Online RL Fine-Tuning for LMs. ACL (1) 2024: 995-1015
- [c102] Gantavya Bhatt, Yifang Chen, Arnav Mohanty Das, Jifan Zhang, Sang T. Truong, Stephen Mussmann, Yinglun Zhu, Jeff A. Bilmes, Simon S. Du, Kevin G. Jamieson, Jordan T. Ash, Robert D. Nowak: An Experimental Design Framework for Label-Efficient Supervised Finetuning of Large Language Models. ACL (Findings) 2024: 6549-6560
- [c101] Yan Dai, Qiwen Cui, Simon S. Du: Refined Sample Complexity for Markov Games with Independent Linear Function Approximation (Extended Abstract). COLT 2024: 1260-1261
- [c100] Zihan Zhang, Yuxin Chen, Jason D. Lee, Simon S. Du: Settling the sample complexity of online reinforcement learning. COLT 2024: 5213-5219
- [c99] Zihan Zhang, Wenhao Zhan, Yuxin Chen, Simon S. Du, Jason D. Lee: Optimal Multi-Distribution Learning. COLT 2024: 5220-5223
- [c98] Haozhe Jiang, Qiwen Cui, Zhihan Xiong, Maryam Fazel, Simon Shaolei Du: A Black-box Approach for Non-stationary Multi-agent Reinforcement Learning. ICLR 2024
- [c97] Kaifeng Lyu, Jikai Jin, Zhiyuan Li, Simon Shaolei Du, Jason D. Lee, Wei Hu: Dichotomy of Early and Late Phase Implicit Biases Can Provably Induce Grokking. ICLR 2024
- [c96] Ruizhe Shi, Yuyao Liu, Yanjie Ze, Simon Shaolei Du, Huazhe Xu: Unleashing the Power of Pre-trained Language Models for Offline Reinforcement Learning. ICLR 2024
- [c95] Yuandong Tian, Yiping Wang, Zhenyu Zhang, Beidi Chen, Simon Shaolei Du: JoMA: Demystifying Multilayer Transformers via Joint Dynamics of MLP and Attention. ICLR 2024
- [c94] Nuoya Xiong, Lijun Ding, Simon Shaolei Du: How Over-Parameterization Slows Down Gradient Descent in Matrix Sensing: The Curses of Symmetry and Initialization. ICLR 2024
- [c93] Zihan Zhang, Jason D. Lee, Yuxin Chen, Simon Shaolei Du: Horizon-Free Regret for Linear Markov Decision Processes. ICLR 2024
- [c92] Zhaoyi Zhou, Chuning Zhu, Runlong Zhou, Qiwen Cui, Abhishek Gupta, Simon Shaolei Du: Free from Bellman Completeness: Trajectory Stitching via Model-based Return-conditioned Supervised Learning. ICLR 2024
- [c91] Chenhao Lu, Ruizhe Shi, Yuyao Liu, Kaizhe Hu, Simon Shaolei Du, Huazhe Xu: Rethinking Transformers in Solving POMDPs. ICML 2024
- [i133] Gantavya Bhatt, Yifang Chen, Arnav Mohanty Das, Jifan Zhang, Sang T. Truong, Stephen Mussmann, Yinglun Zhu, Jeffrey A. Bilmes, Simon S. Du, Kevin G. Jamieson, Jordan T. Ash, Robert D. Nowak: An Experimental Design Framework for Label-Efficient Supervised Finetuning of Large Language Models. CoRR abs/2401.06692 (2024)
- [i132] Yiping Wang, Yifang Chen, Wendan Yan, Kevin G. Jamieson, Simon Shaolei Du: Variance Alignment Score: A Simple But Tough-to-Beat Data Selection Method for Multimodal Contrastive Learning. CoRR abs/2402.02055 (2024)
- [i131] Yan Dai, Qiwen Cui, Simon S. Du: Refined Sample Complexity for Markov Games with Independent Linear Function Approximation. CoRR abs/2402.07082 (2024)
- [i130] Qiwen Cui, Maryam Fazel, Simon S. Du: Learning Optimal Tax Design in Nonatomic Congestion Games. CoRR abs/2402.07437 (2024)
- [i129] Avinandan Bose, Simon Shaolei Du, Maryam Fazel: Offline Multi-task Transfer RL with Representational Penalization. CoRR abs/2402.12570 (2024)
- [i128] Runlong Zhou, Simon S. Du, Beibin Li: Reflect-RL: Two-Player Online RL Fine-Tuning for LMs. CoRR abs/2402.12621 (2024)
- [i127] Chuning Zhu, Xinqi Wang, Tyler Han, Simon S. Du, Abhishek Gupta: Transferable Reinforcement Learning via Generalized Occupancy Models. CoRR abs/2403.06328 (2024)
- [i126] Zihan Zhang, Jason D. Lee, Yuxin Chen, Simon S. Du: Horizon-Free Regret for Linear Markov Decision Processes. CoRR abs/2403.10738 (2024)
- [i125] Chenhao Lu, Ruizhe Shi, Yuyao Liu, Kaizhe Hu, Simon S. Du, Huazhe Xu: Rethinking Transformers in Solving POMDPs. CoRR abs/2405.17358 (2024)
- [i124] Yiping Wang, Yifang Chen, Wendan Yan, Alex Fang, Wenjing Zhou, Kevin Jamieson, Simon Shaolei Du: CLIPLoss and Norm-Based Data Selection Methods for Multimodal Contrastive Learning. CoRR abs/2405.19547 (2024)
- [i123] Ruizhe Shi, Yifang Chen, Yushi Hu, Alisa Liu, Hannaneh Hajishirzi, Noah A. Smith, Simon S. Du: Decoding-Time Language Model Alignment with Multiple Objectives. CoRR abs/2406.18853 (2024)
- [i122] Weihang Xu, Maryam Fazel, Simon S. Du: Toward Global Convergence of Gradient EM for Over-Parameterized Gaussian Mixture Models. CoRR abs/2407.00490 (2024)
- [i121] Yifang Chen, Shuohang Wang, Ziyi Yang, Hiteshi Sharma, Nikos Karampatziakis, Donghan Yu, Kevin G. Jamieson, Simon Shaolei Du, Yelong Shen: Cost-Effective Proxy Reward Model Construction with On-Policy and Active Learning. CoRR abs/2407.02119 (2024)
- [i120] Divyansh Pareek, Simon S. Du, Sewoong Oh: Understanding the Gains from Repeated Self-Distillation. CoRR abs/2407.04600 (2024)
- [i119] Natalia Zhang, Xinqi Wang, Qiwen Cui, Runlong Zhou, Sham M. Kakade, Simon S. Du: Multi-Agent Reinforcement Learning from Human Feedback: Data Coverage and Algorithmic Techniques. CoRR abs/2409.00717 (2024)
- [i118] Ruizhe Shi, Runlong Zhou, Simon S. Du: The Crucial Role of Samplers in Online Direct Preference Optimization. CoRR abs/2409.19605 (2024)
- [i117] Xiyu Zhai, Runlong Zhou, Liao Zhang, Simon Shaolei Du: Transformers are Efficient Compilers, Provably. CoRR abs/2410.14706 (2024)
- [i116] Siting Li, Pang Wei Koh, Simon Shaolei Du: On Erroneous Agreements of CLIP Image Embeddings. CoRR abs/2411.05195 (2024)
- [i115] Yancheng Liang, Daphne Chen, Abhishek Gupta, Simon S. Du, Natasha Jaques: Learning to Cooperate with Humans using Generative Agents. CoRR abs/2411.13934 (2024)
- [i114] Zihan Zhang, Jason D. Lee, Simon S. Du, Yuxin Chen: Anytime Acceleration of Gradient Descent. CoRR abs/2411.17668 (2024)
- [i113] Avinandan Bose, Zhihan Xiong, Aadirupa Saha, Simon Shaolei Du, Maryam Fazel: Hybrid Preference Optimization for Alignment: Provably Faster Convergence Rates by Combining Offline Preferences with Online Exploration. CoRR abs/2412.10616 (2024)
- [i112] Yiping Wang, Xuehai He, Kuan Wang, Luyao Ma, Jianwei Yang, Shuohang Wang, Simon Shaolei Du, Yelong Shen: Is Your World Simulator a Good Story Presenter? A Consecutive Events-Based Benchmark for Future Long Video Generation. CoRR abs/2412.16211 (2024)
- 2023
- [j6] Wenqing Zheng, Hao (Frank) Yang, Jiarui Cai, Peihao Wang, Xuan Jiang, Simon Shaolei Du, Yinhai Wang, Zhangyang Wang: Integrating the traffic science with representation learning for city-wide network congestion prediction. Inf. Fusion 99: 101837 (2023)
- [j5] Shusheng Xu, Yancheng Liang, Yunfei Li, Simon Shaolei Du, Yi Wu: Beyond Information Gain: An Empirical Benchmark for Low-Switching-Cost Reinforcement Learning. Trans. Mach. Learn. Res. 2023 (2023)
- [j4] Runlong Zhou, Zelin He, Yuandong Tian, Yi Wu, Simon Shaolei Du: Understanding Curriculum Learning in Policy Optimization for Online Combinatorial Optimization. Trans. Mach. Learn. Res. 2023 (2023)
- [c90] Yulai Zhao, Jianshu Chen, Simon S. Du: Blessing of Class Diversity in Pre-training. AISTATS 2023: 283-305
- [c89] Weihang Xu, Simon S. Du: Over-Parameterization Exponentially Slows Down Gradient Descent for Learning a Single Neuron. COLT 2023: 1155-1198
- [c88] Qiwen Cui, Kaiqing Zhang, Simon S. Du: Breaking the Curse of Multiagents in a Large State Space: RL in Markov Games with Independent Linear Function Approximation. COLT 2023: 2651-2652
- [c87] Yan Dai, Ruosong Wang, Simon Shaolei Du: Variance-Aware Sparse Linear Bandits. ICLR 2023
- [c86] Shicong Cen, Yuejie Chi, Simon Shaolei Du, Lin Xiao: Faster Last-iterate Convergence of Policy Optimization in Zero-Sum Markov Games. ICLR 2023
- [c85] Haozhe Jiang, Qiwen Cui, Zhihan Xiong, Maryam Fazel, Simon Shaolei Du: Offline Congestion Games: How Feedback Type Affects Data Coverage Requirement. ICLR 2023
- [c84] Rui Yuan, Simon Shaolei Du, Robert M. Gower, Alessandro Lazaric, Lin Xiao: Linear Convergence of Natural Policy Gradient Methods with Log-Linear Policies. ICLR 2023
- [c83] Jikai Jin, Zhiyuan Li, Kaifeng Lyu, Simon Shaolei Du, Jason D. Lee: Understanding Incremental Learning of Gradient Descent: A Fine-grained Analysis of Matrix Sensing. ICML 2023: 15200-15238
- [c82] Yiping Wang, Yifang Chen, Kevin Jamieson, Simon Shaolei Du: Improved Active Multi-Task Representation Learning via Lasso. ICML 2023: 35548-35578
- [c81] Haotian Ye, Xiaoyu Chen, Liwei Wang, Simon Shaolei Du: On the Power of Pre-training for Generalization in RL: Provable Benefits and Hardness. ICML 2023: 39770-39800
- [c80] Runlong Zhou, Ruosong Wang, Simon Shaolei Du: Horizon-Free and Variance-Dependent Reinforcement Learning for Latent Markov Decision Processes. ICML 2023: 42698-42723
- [c79] Runlong Zhou, Zihan Zhang, Simon Shaolei Du: Sharp Variance-Dependent Bounds in Reinforcement Learning: Best of Both Worlds in Stochastic and Deterministic Environments. ICML 2023: 42878-42914
- [c78] Yifang Chen, Yingbing Huang, Simon S. Du, Kevin G. Jamieson, Guanya Shi: Active representation learning for general task space with applications in robotics. NeurIPS 2023
- [c77] Yuandong Tian, Yiping Wang, Beidi Chen, Simon S. Du: Scan and Snap: Understanding Training Dynamics and Token Composition in 1-layer Transformer. NeurIPS 2023
- [c76] Yunchang Yang, Han Zhong, Tianhao Wu, Bin Liu, Liwei Wang, Simon S. Du: A Reduction-based Framework for Sequential Decision Making with Delayed Feedback. NeurIPS 2023
- [c75] Angela Yuan, Chris Junchi Li, Gauthier Gidel, Michael I. Jordan, Quanquan Gu, Simon S. Du: Optimal Extragradient-Based Algorithms for Stochastic Variational Inequalities with Separable Structure. NeurIPS 2023
- [i111] Jikai Jin, Zhiyuan Li, Kaifeng Lyu, Simon S. Du, Jason D. Lee: Understanding Incremental Learning of Gradient Descent: A Fine-grained Analysis of Matrix Sensing. CoRR abs/2301.11500 (2023)
- [i110] Runlong Zhou, Zihan Zhang, Simon S. Du: Sharp Variance-Dependent Bounds in Reinforcement Learning: Best of Both Worlds in Stochastic and Deterministic Environments. CoRR abs/2301.13446 (2023)
- [i109] Yunchang Yang, Han Zhong, Tianhao Wu, Bin Liu, Liwei Wang, Simon S. Du: A Reduction-based Framework for Sequential Decision Making with Delayed Feedback. CoRR abs/2302.01477 (2023)
- [i108] Qiwen Cui, Kaiqing Zhang, Simon S. Du: Breaking the Curse of Multiagents in a Large State Space: RL in Markov Games with Independent Linear Function Approximation. CoRR abs/2302.03673 (2023)
- [i107] Weihang Xu, Simon S. Du: Over-Parameterization Exponentially Slows Down Gradient Descent for Learning a Single Neuron. CoRR abs/2302.10034 (2023)
- [i106] Yuandong Tian, Yiping Wang, Beidi Chen, Simon S. Du: Scan and Snap: Understanding Training Dynamics and Token Composition in 1-layer Transformer. CoRR abs/2305.16380 (2023)
- [i105] Yiping Wang, Yifang Chen, Kevin G. Jamieson, Simon S. Du: Improved Active Multi-Task Representation Learning via Lasso. CoRR abs/2306.02556 (2023)
- [i104] Haozhe Jiang, Qiwen Cui, Zhihan Xiong, Maryam Fazel, Simon S. Du: A Black-box Approach for Non-stationary Multi-agent Reinforcement Learning. CoRR abs/2306.07465 (2023)
- [i103] Yifang Chen, Yingbing Huang, Simon S. Du, Kevin G. Jamieson, Guanya Shi: Active Representation Learning for General Task Space with Applications in Robotics. CoRR abs/2306.08942 (2023)
- [i102] Jifan Zhang, Yifang Chen, Gregory Canal, Stephen Mussmann, Yinglun Zhu, Simon Shaolei Du, Kevin G. Jamieson, Robert D. Nowak: LabelBench: A Comprehensive Framework for Benchmarking Label-Efficient Learning. CoRR abs/2306.09910 (2023)
- [i101] Zihan Zhang, Yuxin Chen, Jason D. Lee, Simon S. Du: Settling the Sample Complexity of Online Reinforcement Learning. CoRR abs/2307.13586 (2023)
- [i100] Yuandong Tian, Yiping Wang, Zhenyu Zhang, Beidi Chen, Simon S. Du: JoMA: Demystifying Multilayer Transformers via JOint Dynamics of MLP and Attention. CoRR abs/2310.00535 (2023)
- [i99] Nuoya Xiong, Lijun Ding, Simon S. Du: How Over-Parameterization Slows Down Gradient Descent in Matrix Sensing: The Curses of Symmetry and Initialization. CoRR abs/2310.01769 (2023)
- [i98] Zhaoyi Zhou, Chuning Zhu, Runlong Zhou, Qiwen Cui, Abhishek Gupta, Simon Shaolei Du: Free from Bellman Completeness: Trajectory Stitching via Model-based Return-conditioned Supervised Learning. CoRR abs/2310.19308 (2023)
- [i97] Ruizhe Shi, Yuyao Liu, Yanjie Ze, Simon S. Du, Huazhe Xu: Unleashing the Power of Pre-trained Language Models for Offline Reinforcement Learning. CoRR abs/2310.20587 (2023)
- [i96] Kaifeng Lyu, Jikai Jin, Zhiyuan Li, Simon S. Du, Jason D. Lee, Wei Hu: Dichotomy of Early and Late Phase Implicit Biases Can Provably Induce Grokking. CoRR abs/2311.18817 (2023)
- [i95] Zihan Zhang, Wenhao Zhan, Yuxin Chen, Simon S. Du, Jason D. Lee: Optimal Multi-Distribution Learning. CoRR abs/2312.05134 (2023)
- 2022
- [j3] Bin Shi, Simon S. Du, Michael I. Jordan, Weijie J. Su: Understanding the acceleration phenomenon via high-resolution differential equations. Math. Program. 195(1): 79-148 (2022)
- [c74] Xiaoxia Wu, Yuege Xie, Simon Shaolei Du, Rachel A. Ward: AdaLoss: A Computationally-Efficient and Provably Convergent Adaptive Gradient Method. AAAI 2022: 8691-8699
- [c73] Zehao Dou, Zhuoran Yang, Zhaoran Wang, Simon S. Du: Gap-Dependent Bounds for Two-Player Markov Games. AISTATS 2022: 432-455
- [c72] Yulai Zhao, Yuandong Tian, Jason D. Lee, Simon S. Du: Provably Efficient Policy Optimization for Two-Player Zero-Sum Markov Games. AISTATS 2022: 2736-2761
- [c71] Zihan Zhang, Xiangyang Ji, Simon S. Du: Horizon-Free Reinforcement Learning in Polynomial Time: the Power of Stationary Policies. COLT 2022: 3858-3904
- [c70] Zhili Feng, Shaobo Han, Simon Shaolei Du: Provable Adaptation across Multiway Domains via Representation Learning. ICLR 2022
- [c69] Yunchang Yang, Tianhao Wu, Han Zhong, Evrard Garcelon, Matteo Pirotta, Alessandro Lazaric, Liwei Wang, Simon Shaolei Du: A Reduction-Based Framework for Conservative Bandits and Reinforcement Learning. ICLR 2022
- [c68] Haoyuan Cai, Tengyu Ma, Simon S. Du: Near-Optimal Algorithms for Autonomous Exploration and Multi-Goal Stochastic Shortest Path. ICML 2022: 2434-2456
- [c67] Yifang Chen, Kevin G. Jamieson, Simon S. Du: Active Multi-Task Representation Learning. ICML 2022: 3271-3298
- [c66] Andrew J. Wagenmaker, Yifang Chen, Max Simchowitz, Simon S. Du, Kevin G. Jamieson: First-Order Regret in Reinforcement Learning with Linear Function Approximation: A Robust Estimation Approach. ICML 2022: 22384-22429
- [c65] Andrew J. Wagenmaker, Yifang Chen, Max Simchowitz, Simon S. Du, Kevin G. Jamieson: Reward-Free RL is No Harder Than Reward-Aware RL in Linear Markov Decision Processes. ICML 2022: 22430-22456
- [c64] Tongzhou Wang, Simon S. Du, Antonio Torralba, Phillip Isola, Amy Zhang, Yuandong Tian: Denoised MDPs: Learning World Models Better Than the World Itself. ICML 2022: 22591-22612
- [c63] Tianhao Wu, Yunchang Yang, Han Zhong, Liwei Wang, Simon S. Du, Jiantao Jiao: Nearly Optimal Policy Optimization with Stable at Any Time Guarantee. ICML 2022: 24243-24265
- [c62] Qiwen Cui, Simon S. Du: Provably Efficient Offline Multi-agent Reinforcement Learning via Strategy-wise Bonus. NeurIPS 2022
- [c61] Qiwen Cui, Simon S. Du: When are Offline Two-Player Zero-Sum Markov Games Solvable? NeurIPS 2022
- [c60] Qiwen Cui, Zhihan Xiong, Maryam Fazel, Simon S. Du: Learning in Congestion Games with Bandit Feedback. NeurIPS 2022
- [c59] Rui Lu, Andrew Zhao, Simon S. Du, Gao Huang: Provable General Function Class Representation Learning in Multitask Bandits and MDP. NeurIPS 2022
- [c58] Xinqi Wang, Qiwen Cui, Simon S. Du: On Gap-dependent Bounds for Offline Reinforcement Learning. NeurIPS 2022
- [c57] Zhihan Xiong, Ruoqi Shen, Qiwen Cui, Maryam Fazel, Simon S. Du: Near-Optimal Randomized Exploration for Tabular Markov Decision Processes. NeurIPS 2022
- [i94] Qiwen Cui, Simon S. Du: When is Offline Two-Player Zero-Sum Markov Game Solvable? CoRR abs/2201.03522 (2022)
- [i93] Andrew Wagenmaker, Yifang Chen, Max Simchowitz, Simon S. Du, Kevin Jamieson: Reward-Free RL is No Harder Than Reward-Aware RL in Linear Markov Decision Processes. CoRR abs/2201.11206 (2022)
- [i92] Yifang Chen, Simon S. Du, Kevin Jamieson: Active Multi-Task Representation Learning. CoRR abs/2202.00911 (2022)
- [i91] Meixin Zhu, Simon S. Du, Xuesong Wang, Hao (Frank) Yang, Ziyuan Pu, Yinhai Wang: TransFollower: Long-Sequence Car-Following Trajectory Prediction through Transformer. CoRR abs/2202.03183 (2022)
- [i90] Runlong Zhou, Yuandong Tian, Yi Wu, Simon S. Du: Understanding Curriculum Learning in Policy Optimization for Solving Combinatorial Optimization Problems. CoRR abs/2202.05423 (2022)
- [i89] Zihan Zhang, Xiangyang Ji, Simon S. Du: Horizon-Free Reinforcement Learning in Polynomial Time: the Power of Stationary Policies. CoRR abs/2203.12922 (2022)
- [i88] Jiaqi Yang, Qi Lei, Jason D. Lee, Simon S. Du: Nearly Minimax Algorithms for Linear Bandits with Shared Representation. CoRR abs/2203.15664 (2022)
- [i87] Haoyuan Cai, Tengyu Ma, Simon S. Du: Near-Optimal Algorithms for Autonomous Exploration and Multi-Goal Stochastic Shortest Path. CoRR abs/2205.10729 (2022)
- [i86] Yan Dai, Ruosong Wang, Simon S. Du: Variance-Aware Sparse Linear Bandits. CoRR abs/2205.13450 (2022)
- [i85] Rui Lu, Andrew Zhao, Simon S. Du, Gao Huang: Provable General Function Class Representation Learning in Multitask Bandits and MDPs. CoRR abs/2205.15701 (2022)
- [i84] Qiwen Cui, Simon S. Du: Provably Efficient Offline Multi-agent Reinforcement Learning via Strategy-wise Bonus. CoRR abs/2206.00159 (2022)
- [i83] Xinqi Wang, Qiwen Cui, Simon S. Du: On Gap-dependent Bounds for Offline Reinforcement Learning. CoRR abs/2206.00177 (2022)
- [i82] Qiwen Cui, Zhihan Xiong, Maryam Fazel, Simon S. Du: Learning in Congestion Games with Bandit Feedback. CoRR abs/2206.01880 (2022)
- [i81] Simon S. Du, Gauthier Gidel, Michael I. Jordan, Chris Junchi Li: Optimal Extragradient-Based Bilinearly-Coupled Saddle-Point Optimization. CoRR abs/2206.08573 (2022)
- [i80] Tongzhou Wang, Simon S. Du, Antonio Torralba, Phillip Isola, Amy Zhang, Yuandong Tian: Denoised MDPs: Learning World Models Better Than the World Itself. CoRR abs/2206.15477 (2022)
- [i79] Yulai Zhao, Jianshu Chen, Simon S. Du: Blessing of Class Diversity in Pre-training. CoRR abs/2209.03447 (2022)
- [i78] Shicong Cen, Yuejie Chi, Simon S. Du, Lin Xiao: Faster Last-iterate Convergence of Policy Optimization in Zero-Sum Markov Games. CoRR abs/2210.01050 (2022)
- [i77] Rui Yuan, Simon S. Du, Robert M. Gower, Alessandro Lazaric, Lin Xiao: Linear Convergence of Natural Policy Gradient Methods with Log-Linear Policies. CoRR abs/2210.01400 (2022)
- [i76] Haotian Ye, Xiaoyu Chen, Liwei Wang, Simon S. Du: On the Power of Pre-training for Generalization in RL: Provable Benefits and Hardness. CoRR abs/2210.10464 (2022)
- [i75] Runlong Zhou, Ruosong Wang, Simon S. Du: Horizon-Free Reinforcement Learning for Latent Markov Decision Processes. CoRR abs/2210.11604 (2022)
- [i74] Haozhe Jiang, Qiwen Cui, Zhihan Xiong, Maryam Fazel, Simon S. Du: Offline congestion games: How feedback type affects data coverage requirement. CoRR abs/2210.13396 (2022)
- 2021
- [j2] Yining Wang, Yi Wu, Simon S. Du: Near-Linear Time Local Polynomial Nonparametric Estimation with Box Kernels. INFORMS J. Comput. 33(4): 1339-1353 (2021)
- [c56] Kunhe Yang, Lin F. Yang, Simon S. Du: Q-learning with Logarithmic Regret. AISTATS 2021: 1576-1584
- [c55] Haike Xu, Tengyu Ma, Simon S. Du: Fine-Grained Gap-Dependent Bounds for Tabular MDPs via Adaptive Multi-Step Bootstrap. COLT 2021: 4438-4472
- [c54] Zihan Zhang, Xiangyang Ji, Simon S. Du: Is Reinforcement Learning More Difficult Than Bandits? A Near-optimal Algorithm Escaping the Curse of Horizon. COLT 2021: 4528-4531
- [c53] Yining Wang, Ruosong Wang, Simon Shaolei Du, Akshay Krishnamurthy: Optimism in Reinforcement Learning with Generalized Linear Function Approximation. ICLR 2021
- [c52] Simon Shaolei Du, Wei Hu, Sham M. Kakade, Jason D. Lee, Qi Lei: Few-Shot Learning via Learning the Representation, Provably. ICLR 2021
- [c51] Zhenggang Tang, Chao Yu, Boyuan Chen, Huazhe Xu, Xiaolong Wang, Fei Fang, Simon Shaolei Du, Yu Wang, Yi Wu: Discovering Diverse Multi-Agent Strategic Behavior via Reward Randomization. ICLR 2021
- [c50] Keyulu Xu, Mozhi Zhang, Jingling Li, Simon Shaolei Du, Ken-ichi Kawarabayashi, Stefanie Jegelka: How Neural Networks Extrapolate: From Feedforward to Graph Neural Networks. ICLR 2021
- [c49] Jiaqi Yang, Wei Hu, Jason D. Lee, Simon Shaolei Du: Impact of Representation Learning in Linear Bandits. ICLR 2021
- [c48] Yifang Chen, Simon S. Du, Kevin Jamieson: Improved Corruption Robust Algorithms for Episodic Reinforcement Learning. ICML 2021: 1561-1570
- [c47] Simon S. Du, Sham M. Kakade, Jason D. Lee, Shachar Lovett, Gaurav Mahajan, Wen Sun, Ruosong Wang: Bilinear Classes: A Structural Framework for Provable Generalization in RL. ICML 2021: 2826-2836
- [c46] Tianhao Wu, Yunchang Yang, Simon S. Du, Liwei Wang: On Reinforcement Learning with Adversarial Corruption and Its Application to Block MDP. ICML