Kelvin Xu
2020 – today
- 2024
- [j2]Avi Singh, John D. Co-Reyes, Rishabh Agarwal, Ankesh Anand, Piyush Patil, Xavier Garcia, Peter J. Liu, James Harrison, Jaehoon Lee, Kelvin Xu, Aaron T. Parisi, Abhishek Kumar, Alexander A. Alemi, Alex Rizkowsky, Azade Nova, Ben Adlam, Bernd Bohnet, Gamaleldin Fathy Elsayed, Hanie Sedghi, Igor Mordatch, Isabelle Simpson, Izzeddin Gur, Jasper Snoek, Jeffrey Pennington, Jiri Hron, Kathleen Kenealy, Kevin Swersky, Kshiteej Mahajan, Laura Culp, Lechao Xiao, Maxwell L. Bileschi, Noah Constant, Roman Novak, Rosanne Liu, Tris Warkentin, Yundi Qian, Yamini Bansal, Ethan Dyer, Behnam Neyshabur, Jascha Sohl-Dickstein, Noah Fiedel:
Beyond Human Data: Scaling Self-Training for Problem-Solving with Language Models. Trans. Mach. Learn. Res. 2024 (2024)
- [c16]Shuo Zhang, Ziqi Kong, Kelvin Xu, Guangxiao Shi, Zixiao Kong, Xia Li, Jinjin Zan:
ContMulti-objective Optimization Model for Momentum Change Based on Genetic Algorithm. ICIC (1) 2024: 134-145
- [c15]Mitchell Wortsman, Peter J. Liu, Lechao Xiao, Katie E. Everett, Alexander A. Alemi, Ben Adlam, John D. Co-Reyes, Izzeddin Gur, Abhishek Kumar, Roman Novak, Jeffrey Pennington, Jascha Sohl-Dickstein, Kelvin Xu, Jaehoon Lee, Justin Gilmer, Simon Kornblith:
Small-scale proxies for large-scale Transformer training instabilities. ICLR 2024
- [i21]Charlie Snell, Jaehoon Lee, Kelvin Xu, Aviral Kumar:
Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters. CoRR abs/2408.03314 (2024)
- [i20]Jiri Hron, Laura Culp, Gamaleldin F. Elsayed, Rosanne Liu, Ben Adlam, Maxwell L. Bileschi, Bernd Bohnet, JD Co-Reyes, Noah Fiedel, C. Daniel Freeman, Izzeddin Gur, Kathleen Kenealy, Jaehoon Lee, Peter J. Liu, Gaurav Mishra, Igor Mordatch, Azade Nova, Roman Novak, Aaron Parisi, Jeffrey Pennington, Alex Rizkowsky, Isabelle Simpson, Hanie Sedghi, Jascha Sohl-Dickstein, Kevin Swersky, Sharad Vikram, Tris Warkentin, Lechao Xiao, Kelvin Xu, Jasper Snoek, Simon Kornblith:
Training Language Models on the Knowledge Graph: Insights on Hallucinations and Their Detectability. CoRR abs/2408.07852 (2024)
- 2023
- [c14]Kelvin Xu, Zheyuan Hu, Ria Doshi, Aaron Rovinsky, Vikash Kumar, Abhishek Gupta, Sergey Levine:
Dexterous Manipulation from Images: Autonomous Real-World RL via Substep Guidance. ICRA 2023: 5938-5945
- [i19]Mitchell Wortsman, Peter J. Liu, Lechao Xiao, Katie Everett, Alex Alemi, Ben Adlam, John D. Co-Reyes, Izzeddin Gur, Abhishek Kumar, Roman Novak, Jeffrey Pennington, Jascha Sohl-Dickstein, Kelvin Xu, Jaehoon Lee, Justin Gilmer, Simon Kornblith:
Small-scale proxies for large-scale Transformer training instabilities. CoRR abs/2309.14322 (2023)
- [i18]C. Daniel Freeman, Laura Culp, Aaron Parisi, Maxwell L. Bileschi, Gamaleldin F. Elsayed, Alex Rizkowsky, Isabelle Simpson, Alex Alemi, Azade Nova, Ben Adlam, Bernd Bohnet, Gaurav Mishra, Hanie Sedghi, Igor Mordatch, Izzeddin Gur, Jaehoon Lee, John D. Co-Reyes, Jeffrey Pennington, Kelvin Xu, Kevin Swersky, Kshiteej Mahajan, Lechao Xiao, Rosanne Liu, Simon Kornblith, Noah Constant, Peter J. Liu, Roman Novak, Yundi Qian, Noah Fiedel, Jascha Sohl-Dickstein:
Frontier Language Models are not Robust to Adversarial Arithmetic, or "What do I need to say so you agree 2+2=5?" CoRR abs/2311.07587 (2023)
- [i17]Marwa Abdulhai, Isadora White, Charlie Snell, Charles Sun, Joey Hong, Yuexiang Zhai, Kelvin Xu, Sergey Levine:
LMRL Gym: Benchmarks for Multi-Turn Reinforcement Learning with Language Models. CoRR abs/2311.18232 (2023)
- [i16]Avi Singh, John D. Co-Reyes, Rishabh Agarwal, Ankesh Anand, Piyush Patil, Xavier Garcia, Peter J. Liu, James Harrison, Jaehoon Lee, Kelvin Xu, Aaron Parisi, Abhishek Kumar, Alex Alemi, Alex Rizkowsky, Azade Nova, Ben Adlam, Bernd Bohnet, Gamaleldin F. Elsayed, Hanie Sedghi, Igor Mordatch, Isabelle Simpson, Izzeddin Gur, Jasper Snoek, Jeffrey Pennington, Jiri Hron, Kathleen Kenealy, Kevin Swersky, Kshiteej Mahajan, Laura Culp, Lechao Xiao, Maxwell L. Bileschi, Noah Constant, Roman Novak, Rosanne Liu, Tris Warkentin, Yundi Qian, Yamini Bansal, Ethan Dyer, Behnam Neyshabur, Jascha Sohl-Dickstein, Noah Fiedel:
Beyond Human Data: Scaling Self-Training for Problem-Solving with Language Models. CoRR abs/2312.06585 (2023)
- 2022
- [b1]Kelvin Xu:
Towards Adaptive, Continual Embodied Agents. University of California, Berkeley, USA, 2022
- [c13]Archit Sharma, Kelvin Xu, Nikhil Sardana, Abhishek Gupta, Karol Hausman, Sergey Levine, Chelsea Finn:
Autonomous Reinforcement Learning: Formalism and Benchmarking. ICLR 2022
- [i15]Kelvin Xu, Zheyuan Hu, Ria Doshi, Aaron Rovinsky, Vikash Kumar, Abhishek Gupta, Sergey Levine:
Dexterous Manipulation from Images: Autonomous Real-World RL via Substep Guidance. CoRR abs/2212.09902 (2022)
- 2021
- [c12]Abhishek Gupta, Justin Yu, Tony Z. Zhao, Vikash Kumar, Aaron Rovinsky, Kelvin Xu, Thomas Devlin, Sergey Levine:
Reset-Free Reinforcement Learning via Multi-Task Learning: Learning Dexterous Manipulation Behaviors without Human Intervention. ICRA 2021: 6664-6671
- [i14]Abhishek Gupta, Justin Yu, Tony Z. Zhao, Vikash Kumar, Aaron Rovinsky, Kelvin Xu, Thomas Devlin, Sergey Levine:
Reset-Free Reinforcement Learning via Multi-Task Learning: Learning Dexterous Manipulation Behaviors without Human Intervention. CoRR abs/2104.11203 (2021)
- [i13]Archit Sharma, Kelvin Xu, Nikhil Sardana, Abhishek Gupta, Karol Hausman, Sergey Levine, Chelsea Finn:
Autonomous Reinforcement Learning: Formalism and Benchmarking. CoRR abs/2112.09605 (2021)
- 2020
- [c11]Eleni Triantafillou, Tyler Zhu, Vincent Dumoulin, Pascal Lamblin, Utku Evci, Kelvin Xu, Ross Goroshin, Carles Gelada, Kevin Swersky, Pierre-Antoine Manzagol, Hugo Larochelle:
Meta-Dataset: A Dataset of Datasets for Learning to Learn from Few Examples. ICLR 2020
- [c10]Kelvin Xu, Siddharth Verma, Chelsea Finn, Sergey Levine:
Continual Learning of Control Primitives: Skill Discovery via Reset-Games. NeurIPS 2020
- [i12]Kelvin Xu, Siddharth Verma, Chelsea Finn, Sergey Levine:
Continual Learning of Control Primitives: Skill Discovery via Reset-Games. CoRR abs/2011.05286 (2020)
2010 – 2019
- 2019
- [c9]Kelvin Xu, Ellis Ratner, Anca D. Dragan, Sergey Levine, Chelsea Finn:
Learning a Prior over Intent via Meta-Inverse Reinforcement Learning. ICML 2019: 6952-6962
- [c8]Yangfan Sun, Renlong Hang, Zhu Li, Mouqing Jin, Kelvin Xu:
Privacy-Preserving Fall Detection with Deep Learning on mmWave Radar Signal. VCIP 2019: 1-4
- [i11]Eleni Triantafillou, Tyler Zhu, Vincent Dumoulin, Pascal Lamblin, Kelvin Xu, Ross Goroshin, Carles Gelada, Kevin Swersky, Pierre-Antoine Manzagol, Hugo Larochelle:
Meta-Dataset: A Dataset of Datasets for Learning to Learn from Few Examples. CoRR abs/1903.03096 (2019)
- 2018
- [c7]Ofir Nachum, Mohammad Norouzi, Kelvin Xu, Dale Schuurmans:
Trust-PCL: An Off-Policy Trust Region Method for Continuous Control. ICLR (Poster) 2018
- [c6]Chelsea Finn, Kelvin Xu, Sergey Levine:
Probabilistic Model-Agnostic Meta-Learning. NeurIPS 2018: 9537-9548
- [i10]Kelvin Xu, Ellis Ratner, Anca D. Dragan, Sergey Levine, Chelsea Finn:
Learning a Prior over Intent via Meta-Inverse Reinforcement Learning. CoRR abs/1805.12573 (2018)
- [i9]Chelsea Finn, Kelvin Xu, Sergey Levine:
Probabilistic Model-Agnostic Meta-Learning. CoRR abs/1806.02817 (2018)
- 2017
- [j1]Çaglar Gülçehre, Orhan Firat, Kelvin Xu, Kyunghyun Cho, Yoshua Bengio:
On integrating a language model into neural machine translation. Comput. Speech Lang. 45: 137-148 (2017)
- [c5]Dzmitry Bahdanau, Philemon Brakel, Kelvin Xu, Anirudh Goyal, Ryan Lowe, Joelle Pineau, Aaron C. Courville, Yoshua Bengio:
An Actor-Critic Algorithm for Sequence Prediction. ICLR (Poster) 2017
- [c4]Pierre Sermanet, Kelvin Xu, Sergey Levine:
Unsupervised Perceptual Rewards for Imitation Learning. ICLR (Workshop) 2017
- [c3]Ofir Nachum, Mohammad Norouzi, Kelvin Xu, Dale Schuurmans:
Bridging the Gap Between Value and Policy Based Reinforcement Learning. NIPS 2017: 2775-2785
- [c2]Pierre Sermanet, Kelvin Xu, Sergey Levine:
Unsupervised Perceptual Rewards for Imitation Learning. Robotics: Science and Systems 2017
- [i8]Ofir Nachum, Mohammad Norouzi, Kelvin Xu, Dale Schuurmans:
Bridging the Gap Between Value and Policy Based Reinforcement Learning. CoRR abs/1702.08892 (2017)
- [i7]Ofir Nachum, Mohammad Norouzi, Kelvin Xu, Dale Schuurmans:
Trust-PCL: An Off-Policy Trust Region Method for Continuous Control. CoRR abs/1707.01891 (2017)
- 2016
- [i6]Rami Al-Rfou, Guillaume Alain, Amjad Almahairi, Christof Angermüller, Dzmitry Bahdanau, Nicolas Ballas, Frédéric Bastien, Justin Bayer, Anatoly Belikov, Alexander Belopolsky, Yoshua Bengio, Arnaud Bergeron, James Bergstra, Valentin Bisson, Josh Bleecher Snyder, Nicolas Bouchard, Nicolas Boulanger-Lewandowski, Xavier Bouthillier, Alexandre de Brébisson, Olivier Breuleux, Pierre Luc Carrier, Kyunghyun Cho, Jan Chorowski, Paul F. Christiano, Tim Cooijmans, Marc-Alexandre Côté, Myriam Côté, Aaron C. Courville, Yann N. Dauphin, Olivier Delalleau, Julien Demouth, Guillaume Desjardins, Sander Dieleman, Laurent Dinh, Melanie Ducoffe, Vincent Dumoulin, Samira Ebrahimi Kahou, Dumitru Erhan, Ziye Fan, Orhan Firat, Mathieu Germain, Xavier Glorot, Ian J. Goodfellow, Matthew Graham, Çaglar Gülçehre, Philippe Hamel, Iban Harlouchet, Jean-Philippe Heng, Balázs Hidasi, Sina Honari, Arjun Jain, Sébastien Jean, Kai Jia, Mikhail Korobov, Vivek Kulkarni, Alex Lamb, Pascal Lamblin, Eric Larsen, César Laurent, Sean Lee, Simon Lefrançois, Simon Lemieux, Nicholas Léonard, Zhouhan Lin, Jesse A. Livezey, Cory Lorenz, Jeremiah Lowin, Qianli Ma, Pierre-Antoine Manzagol, Olivier Mastropietro, Robert McGibbon, Roland Memisevic, Bart van Merriënboer, Vincent Michalski, Mehdi Mirza, Alberto Orlandi, Christopher Joseph Pal, Razvan Pascanu, Mohammad Pezeshki, Colin Raffel, Daniel Renshaw, Matthew Rocklin, Adriana Romero, Markus Roth, Peter Sadowski, John Salvatier, François Savard, Jan Schlüter, John Schulman, Gabriel Schwartz, Iulian Vlad Serban, Dmitriy Serdyuk, Samira Shabanian, Étienne Simon, Sigurd Spieckermann, S. Ramana Subramanyam, Jakub Sygnowski, Jérémie Tanguay, Gijs van Tulder, Joseph P. Turian, Sebastian Urban, Pascal Vincent, Francesco Visin, Harm de Vries, David Warde-Farley, Dustin J. Webb, Matthew Willson, Kelvin Xu, Lijun Xue, Li Yao, Saizheng Zhang, Ying Zhang:
Theano: A Python framework for fast computation of mathematical expressions. CoRR abs/1605.02688 (2016)
- [i5]Dzmitry Bahdanau, Philemon Brakel, Kelvin Xu, Anirudh Goyal, Ryan Lowe, Joelle Pineau, Aaron C. Courville, Yoshua Bengio:
An Actor-Critic Algorithm for Sequence Prediction. CoRR abs/1607.07086 (2016)
- [i4]Pierre Sermanet, Kelvin Xu, Sergey Levine:
Unsupervised Perceptual Rewards for Imitation Learning. CoRR abs/1612.06699 (2016)
- 2015
- [c1]Kelvin Xu, Jimmy Ba, Ryan Kiros, Kyunghyun Cho, Aaron C. Courville, Ruslan Salakhutdinov, Richard S. Zemel, Yoshua Bengio:
Show, Attend and Tell: Neural Image Caption Generation with Visual Attention. ICML 2015: 2048-2057
- [i3]Kelvin Xu, Jimmy Ba, Ryan Kiros, Kyunghyun Cho, Aaron C. Courville, Ruslan Salakhutdinov, Richard S. Zemel, Yoshua Bengio:
Show, Attend and Tell: Neural Image Caption Generation with Visual Attention. CoRR abs/1502.03044 (2015)
- [i2]Çaglar Gülçehre, Orhan Firat, Kelvin Xu, Kyunghyun Cho, Loïc Barrault, Huei-Chi Lin, Fethi Bougares, Holger Schwenk, Yoshua Bengio:
On Using Monolingual Corpora in Neural Machine Translation. CoRR abs/1503.03535 (2015)
- [i1]Marcin Moczulski, Kelvin Xu, Aaron C. Courville, KyungHyun Cho:
A Controller Recognizer Framework: How necessary is recognition for control? CoRR abs/1511.06428 (2015)
last updated on 2024-10-07 21:23 CEST by the dblp team
all metadata released as open data under CC0 1.0 license