Yi Tay
- 2024
- [i89]Aitor Ormazabal, Che Zheng, Cyprien de Masson d'Autume, Dani Yogatama, Deyu Fu, Donovan Ong, Eric Chen, Eugenie Lamprecht, Hai Pham, Isaac Ong, Kaloyan Aleksiev, Lei Li, Matthew Henderson, Max Bain, Mikel Artetxe, Nishant Relan, Piotr Padlewski, Qi Liu, Ren Chen, Samuel Phua, Yazheng Yang, Yi Tay, Yuqi Wang, Zhongkai Zhu, Zhihui Xie:
Reka Core, Flash, and Edge: A Series of Powerful Multimodal Language Models. CoRR abs/2404.12387 (2024)
- [i88]Piotr Padlewski, Max Bain, Matthew Henderson, Zhongkai Zhu, Nishant Relan, Hai Pham, Donovan Ong, Kaloyan Aleksiev, Aitor Ormazabal, Samuel Phua, Ethan Yeo, Eugenie Lamprecht, Qi Liu, Yuqi Wang, Eric Chen, Deyu Fu, Lei Li, Che Zheng, Cyprien de Masson d'Autume, Dani Yogatama, Mikel Artetxe, Yi Tay:
Vibe-Eval: A hard evaluation suite for measuring progress of multimodal language models. CoRR abs/2405.02287 (2024)
- 2023
- [j8]Yi Tay, Mostafa Dehghani, Dara Bahri, Donald Metzler:
Efficient Transformers: A Survey. ACM Comput. Surv. 55(6): 109:1-109:28 (2023)
- [j7]Aakanksha Chowdhery, Sharan Narang, Jacob Devlin, Maarten Bosma, Gaurav Mishra, Adam Roberts, Paul Barham, Hyung Won Chung, Charles Sutton, Sebastian Gehrmann, Parker Schuh, Kensen Shi, Sasha Tsvyashchenko, Joshua Maynez, Abhishek Rao, Parker Barnes, Yi Tay, Noam Shazeer, Vinodkumar Prabhakaran, Emily Reif, Nan Du, Ben Hutchinson, Reiner Pope, James Bradbury, Jacob Austin, Michael Isard, Guy Gur-Ari, Pengcheng Yin, Toju Duke, Anselm Levskaya, Sanjay Ghemawat, Sunipa Dev, Henryk Michalewski, Xavier Garcia, Vedant Misra, Kevin Robinson, Liam Fedus, Denny Zhou, Daphne Ippolito, David Luan, Hyeontaek Lim, Barret Zoph, Alexander Spiridonov, Ryan Sepassi, David Dohan, Shivani Agrawal, Mark Omernick, Andrew M. Dai, Thanumalayan Sankaranarayana Pillai, Marie Pellat, Aitor Lewkowycz, Erica Moreira, Rewon Child, Oleksandr Polozov, Katherine Lee, Zongwei Zhou, Xuezhi Wang, Brennan Saeta, Mark Diaz, Orhan Firat, Michele Catasta, Jason Wei, Kathy Meier-Hellstern, Douglas Eck, Jeff Dean, Slav Petrov, Noah Fiedel:
PaLM: Scaling Language Modeling with Pathways. J. Mach. Learn. Res. 24: 240:1-240:113 (2023)
- [j6]Valerii Likhosherstov, Anurag Arnab, Krzysztof Marcin Choromanski, Mario Lucic, Yi Tay, Mostafa Dehghani:
PolyViT: Co-training Vision Transformers on Images, Videos and Audio. Trans. Mach. Learn. Res. 2023 (2023)
- [c89]Mirac Suzgun, Nathan Scales, Nathanael Schärli, Sebastian Gehrmann, Yi Tay, Hyung Won Chung, Aakanksha Chowdhery, Quoc V. Le, Ed H. Chi, Denny Zhou, Jason Wei:
Challenging BIG-Bench Tasks and Whether Chain-of-Thought Can Solve Them. ACL (Findings) 2023: 13003-13051
- [c88]Jerry W. Wei, Le Hou, Andrew K. Lampinen, Xiangning Chen, Da Huang, Yi Tay, Xinyun Chen, Yifeng Lu, Denny Zhou, Tengyu Ma, Quoc V. Le:
Symbol tuning improves in-context learning in language models. EMNLP 2023: 968-979
- [c87]Yi Tay, Jason Wei, Hyung Won Chung, Vinh Q. Tran, David R. So, Siamak Shakeri, Xavier Garcia, Huaixiu Steven Zheng, Jinfeng Rao, Aakanksha Chowdhery, Denny Zhou, Donald Metzler, Slav Petrov, Neil Houlsby, Quoc V. Le, Mostafa Dehghani:
Transcending Scaling Laws with 0.1% Extra Compute. EMNLP 2023: 1471-1486
- [c86]Joshua Ainslie, Tao Lei, Michiel de Jong, Santiago Ontañón, Siddhartha Brahma, Yury Zemlyanskiy, David C. Uthus, Mandy Guo, James Lee-Thorp, Yi Tay, Yun-Hsuan Sung, Sumit Sanghai:
CoLT5: Faster Long-Range Transformers with Conditional Computation. EMNLP 2023: 5085-5100
- [c85]Sanket Vaibhav Mehta, Jai Gupta, Yi Tay, Mostafa Dehghani, Vinh Q. Tran, Jinfeng Rao, Marc Najork, Emma Strubell, Donald Metzler:
DSI++: Updating Transformer Memory with New Documents. EMNLP 2023: 8198-8213
- [c84]Yi Tay, Mostafa Dehghani, Samira Abnar, Hyung Won Chung, William Fedus, Jinfeng Rao, Sharan Narang, Vinh Q. Tran, Dani Yogatama, Donald Metzler:
Scaling Laws vs Model Architectures: How does Inductive Bias Influence Scaling? EMNLP (Findings) 2023: 12342-12364
- [c83]Jason Wei, Najoung Kim, Yi Tay, Quoc V. Le:
Inverse Scaling Can Become U-Shaped. EMNLP 2023: 15580-15591
- [c82]Hyung Won Chung, Xavier Garcia, Adam Roberts, Yi Tay, Orhan Firat, Sharan Narang, Noah Constant:
UniMax: Fairer and More Effective Language Sampling for Large-Scale Multilingual Pretraining. ICLR 2023
- [c81]Aran Komatsuzaki, Joan Puigcerver, James Lee-Thorp, Carlos Riquelme Ruiz, Basil Mustafa, Joshua Ainslie, Yi Tay, Mostafa Dehghani, Neil Houlsby:
Sparse Upcycling: Training Mixture-of-Experts from Dense Checkpoints. ICLR 2023
- [c80]Freda Shi, Mirac Suzgun, Markus Freitag, Xuezhi Wang, Suraj Srivats, Soroush Vosoughi, Hyung Won Chung, Yi Tay, Sebastian Ruder, Denny Zhou, Dipanjan Das, Jason Wei:
Language models are multilingual chain-of-thought reasoners. ICLR 2023
- [c79]Zhiqing Sun, Xuezhi Wang, Yi Tay, Yiming Yang, Denny Zhou:
Recitation-Augmented Language Models. ICLR 2023
- [c78]Yi Tay, Mostafa Dehghani, Vinh Q. Tran, Xavier Garcia, Jason Wei, Xuezhi Wang, Hyung Won Chung, Dara Bahri, Tal Schuster, Huaixiu Steven Zheng, Denny Zhou, Neil Houlsby, Donald Metzler:
UL2: Unifying Language Learning Paradigms. ICLR 2023
- [c77]Mostafa Dehghani, Josip Djolonga, Basil Mustafa, Piotr Padlewski, Jonathan Heek, Justin Gilmer, Andreas Peter Steiner, Mathilde Caron, Robert Geirhos, Ibrahim Alabdulmohsin, Rodolphe Jenatton, Lucas Beyer, Michael Tschannen, Anurag Arnab, Xiao Wang, Carlos Riquelme Ruiz, Matthias Minderer, Joan Puigcerver, Utku Evci, Manoj Kumar, Sjoerd van Steenkiste, Gamaleldin Fathy Elsayed, Aravindh Mahendran, Fisher Yu, Avital Oliver, Fantine Huot, Jasmijn Bastings, Mark Collier, Alexey A. Gritsenko, Vighnesh Birodkar, Cristina Nader Vasconcelos, Yi Tay, Thomas Mensink, Alexander Kolesnikov, Filip Pavetic, Dustin Tran, Thomas Kipf, Mario Lucic, Xiaohua Zhai, Daniel Keysers, Jeremiah J. Harmsen, Neil Houlsby:
Scaling Vision Transformers to 22 Billion Parameters. ICML 2023: 7480-7512
- [c76]Shayne Longpre, Le Hou, Tu Vu, Albert Webson, Hyung Won Chung, Yi Tay, Denny Zhou, Quoc V. Le, Barret Zoph, Jason Wei, Adam Roberts:
The Flan Collection: Designing Data and Methods for Effective Instruction Tuning. ICML 2023: 22631-22648
- [c75]Shashank Rajput, Nikhil Mehta, Anima Singh, Raghunandan Hulikal Keshavan, Trung Vu, Lukasz Heldt, Lichan Hong, Yi Tay, Vinh Q. Tran, Jonah Samost, Maciej Kula, Ed H. Chi, Mahesh Sathiamoorthy:
Recommender Systems with Generative Retrieval. NeurIPS 2023
- [c74]Dara Bahri, Che Zheng, Yi Tay, Donald Metzler, Andrew Tomkins:
Surprise: Result List Truncation via Extreme Value Theory. SIGIR 2023: 2404-2408
- [i87]Shayne Longpre, Le Hou, Tu Vu, Albert Webson, Hyung Won Chung, Yi Tay, Denny Zhou, Quoc V. Le, Barret Zoph, Jason Wei, Adam Roberts:
The Flan Collection: Designing Data and Methods for Effective Instruction Tuning. CoRR abs/2301.13688 (2023)
- [i86]Mostafa Dehghani, Josip Djolonga, Basil Mustafa, Piotr Padlewski, Jonathan Heek, Justin Gilmer, Andreas Steiner, Mathilde Caron, Robert Geirhos, Ibrahim Alabdulmohsin, Rodolphe Jenatton, Lucas Beyer, Michael Tschannen, Anurag Arnab, Xiao Wang, Carlos Riquelme, Matthias Minderer, Joan Puigcerver, Utku Evci, Manoj Kumar, Sjoerd van Steenkiste, Gamaleldin F. Elsayed, Aravindh Mahendran, Fisher Yu, Avital Oliver, Fantine Huot, Jasmijn Bastings, Mark Patrick Collier, Alexey A. Gritsenko, Vighnesh Birodkar, Cristina Nader Vasconcelos, Yi Tay, Thomas Mensink, Alexander Kolesnikov, Filip Pavetic, Dustin Tran, Thomas Kipf, Mario Lucic, Xiaohua Zhai, Daniel Keysers, Jeremiah Harmsen, Neil Houlsby:
Scaling Vision Transformers to 22 Billion Parameters. CoRR abs/2302.05442 (2023)
- [i85]Jerry W. Wei, Jason Wei, Yi Tay, Dustin Tran, Albert Webson, Yifeng Lu, Xinyun Chen, Hanxiao Liu, Da Huang, Denny Zhou, Tengyu Ma:
Larger language models do in-context learning differently. CoRR abs/2303.03846 (2023)
- [i84]Joshua Ainslie, Tao Lei, Michiel de Jong, Santiago Ontañón, Siddhartha Brahma, Yury Zemlyanskiy, David C. Uthus, Mandy Guo, James Lee-Thorp, Yi Tay, Yun-Hsuan Sung, Sumit Sanghai:
CoLT5: Faster Long-Range Transformers with Conditional Computation. CoRR abs/2303.09752 (2023)
- [i83]Hyung Won Chung, Noah Constant, Xavier Garcia, Adam Roberts, Yi Tay, Sharan Narang, Orhan Firat:
UniMax: Fairer and more Effective Language Sampling for Large-Scale Multilingual Pretraining. CoRR abs/2304.09151 (2023)
- [i82]Shashank Rajput, Nikhil Mehta, Anima Singh, Raghunandan H. Keshavan, Trung Vu, Lukasz Heldt, Lichan Hong, Yi Tay, Vinh Q. Tran, Jonah Samost, Maciej Kula, Ed H. Chi, Maheswaran Sathiamoorthy:
Recommender Systems with Generative Retrieval. CoRR abs/2305.05065 (2023)
- [i81]Jerry W. Wei, Le Hou, Andrew K. Lampinen, Xiangning Chen, Da Huang, Yi Tay, Xinyun Chen, Yifeng Lu, Denny Zhou, Tengyu Ma, Quoc V. Le:
Symbol tuning improves in-context learning in language models. CoRR abs/2305.08298 (2023)
- [i80]Rohan Anil, Andrew M. Dai, Orhan Firat, Melvin Johnson, Dmitry Lepikhin, Alexandre Passos, Siamak Shakeri, Emanuel Taropa, Paige Bailey, Zhifeng Chen, Eric Chu, Jonathan H. Clark, Laurent El Shafey, Yanping Huang, Kathy Meier-Hellstern, Gaurav Mishra, Erica Moreira, Mark Omernick, Kevin Robinson, Sebastian Ruder, Yi Tay, Kefan Xiao, Yuanzhong Xu, Yujing Zhang, Gustavo Hernández Ábrego, Junwhan Ahn, Jacob Austin, Paul Barham, Jan A. Botha, James Bradbury, Siddhartha Brahma, Kevin Brooks, Michele Catasta, Yong Cheng, Colin Cherry, Christopher A. Choquette-Choo, Aakanksha Chowdhery, Clément Crepy, Shachi Dave, Mostafa Dehghani, Sunipa Dev, Jacob Devlin, Mark Díaz, Nan Du, Ethan Dyer, Vladimir Feinberg, Fangxiaoyu Feng, Vlad Fienber, Markus Freitag, Xavier Garcia, Sebastian Gehrmann, Lucas Gonzalez, et al.:
PaLM 2 Technical Report. CoRR abs/2305.10403 (2023)
- [i79]Xi Chen, Josip Djolonga, Piotr Padlewski, Basil Mustafa, Soravit Changpinyo, Jialin Wu, Carlos Riquelme Ruiz, Sebastian Goodman, Xiao Wang, Yi Tay, Siamak Shakeri, Mostafa Dehghani, Daniel Salz, Mario Lucic, Michael Tschannen, Arsha Nagrani, Hexiang Hu, Mandar Joshi, Bo Pang, Ceslee Montgomery, Paulina Pietrzyk, Marvin Ritter, A. J. Piergiovanni, Matthias Minderer, Filip Pavetic, Austin Waters, Gang Li, Ibrahim Alabdulmohsin, Lucas Beyer, Julien Amelot, Kenton Lee, Andreas Peter Steiner, Yang Li, Daniel Keysers, Anurag Arnab, Yuanzhong Xu, Keran Rong, Alexander Kolesnikov, Mojtaba Seyedhosseini, Anelia Angelova, Xiaohua Zhai, Neil Houlsby, Radu Soricut:
PaLI-X: On Scaling up a Multilingual Vision and Language Model. CoRR abs/2305.18565 (2023)
- 2022
- [j5]Jason Wei, Yi Tay, Rishi Bommasani, Colin Raffel, Barret Zoph, Sebastian Borgeaud, Dani Yogatama, Maarten Bosma, Denny Zhou, Donald Metzler, Ed H. Chi, Tatsunori Hashimoto, Oriol Vinyals, Percy Liang, Jeff Dean, William Fedus:
Emergent Abilities of Large Language Models. Trans. Mach. Learn. Res. 2022 (2022)
- [c73]Kai Hui, Honglei Zhuang, Tao Chen, Zhen Qin, Jing Lu, Dara Bahri, Ji Ma, Jai Prakash Gupta, Cícero Nogueira dos Santos, Yi Tay, Donald Metzler:
ED2LM: Encoder-Decoder to Language Model for Faster Document Re-ranking Inference. ACL (Findings) 2022: 3747-3758
- [c72]Sanket Vaibhav Mehta, Jinfeng Rao, Yi Tay, Mihir Kale, Ankur Parikh, Emma Strubell:
Improving Compositional Generalization with Self-Training for Data-to-Text Generation. ACL (1) 2022: 4205-4219
- [c71]Dara Bahri, Hossein Mobahi, Yi Tay:
Sharpness-Aware Minimization Improves Language Model Generalization. ACL (1) 2022: 7360-7371
- [c70]Mostafa Dehghani, Alexey A. Gritsenko, Anurag Arnab, Matthias Minderer, Yi Tay:
SCENIC: A JAX Library for Computer Vision Research and Beyond. CVPR 2022: 21361-21366
- [c69]Jai Gupta, Yi Tay, Chaitanya Kamath, Vinh Tran, Donald Metzler, Shailesh Bavadekar, Mimi Sun, Evgeniy Gabrilovich:
Dense Feature Memory Augmented Transformers for COVID-19 Vaccination Search Classification. EMNLP (Industry Track) 2022: 521-530
- [c68]Mostafa Dehghani, Yi Tay, Anurag Arnab, Lucas Beyer, Ashish Vaswani:
The Efficiency Misnomer. ICLR 2022
- [c67]Vamsi Aribandi, Yi Tay, Tal Schuster, Jinfeng Rao, Huaixiu Steven Zheng, Sanket Vaibhav Mehta, Honglei Zhuang, Vinh Q. Tran, Dara Bahri, Jianmo Ni, Jai Prakash Gupta, Kai Hui, Sebastian Ruder, Donald Metzler:
ExT5: Towards Extreme Multi-Task Scaling for Transfer Learning. ICLR 2022
- [c66]Dara Bahri, Heinrich Jiang, Yi Tay, Donald Metzler:
Scarf: Self-Supervised Contrastive Learning using Random Feature Corruption. ICLR 2022
- [c65]Yi Tay, Mostafa Dehghani, Jinfeng Rao, William Fedus, Samira Abnar, Hyung Won Chung, Sharan Narang, Dani Yogatama, Ashish Vaswani, Donald Metzler:
Scale Efficiently: Insights from Pretraining and Finetuning Transformers. ICLR 2022
- [c64]Yi Tay, Vinh Q. Tran, Sebastian Ruder, Jai Prakash Gupta, Hyung Won Chung, Dara Bahri, Zhen Qin, Simon Baumgartner, Cong Yu, Donald Metzler:
Charformer: Fast Character Transformers via Gradient-based Subword Tokenization. ICLR 2022
- [c63]Yun He, Huaixiu Steven Zheng, Yi Tay, Jai Prakash Gupta, Yu Du, Vamsi Aribandi, Zhe Zhao, YaGuang Li, Zhao Chen, Donald Metzler, Heng-Tze Cheng, Ed H. Chi:
HyperPrompt: Prompt-based Task-Conditioning of Transformers. ICML 2022: 8678-8690
- [c62]Alyssa Lees, Vinh Q. Tran, Yi Tay, Jeffrey Sorensen, Jai Prakash Gupta, Donald Metzler, Lucy Vasserman:
A New Generation of Perspective API: Efficient Multilingual Character-level Transformers. KDD 2022: 3197-3207
- [c61]Tal Schuster, Adam Fisch, Jai Gupta, Mostafa Dehghani, Dara Bahri, Vinh Tran, Yi Tay, Donald Metzler:
Confident Adaptive Language Modeling. NeurIPS 2022
- [c60]Yi Tay, Vinh Tran, Mostafa Dehghani, Jianmo Ni, Dara Bahri, Harsh Mehta, Zhen Qin, Kai Hui, Zhe Zhao, Jai Prakash Gupta, Tal Schuster, William W. Cohen, Donald Metzler:
Transformer Memory as a Differentiable Search Index. NeurIPS 2022
- [r1]Shuai Zhang, Yi Tay, Lina Yao, Aixin Sun, Ce Zhang:
Deep Learning for Recommender Systems. Recommender Systems Handbook 2022: 173-210
- [i78]Yi Tay, Vinh Q. Tran, Mostafa Dehghani, Jianmo Ni, Dara Bahri, Harsh Mehta, Zhen Qin, Kai Hui, Zhe Zhao, Jai Prakash Gupta, Tal Schuster, William W. Cohen, Donald Metzler:
Transformer Memory as a Differentiable Search Index. CoRR abs/2202.06991 (2022)
- [i77]Alyssa Lees, Vinh Q. Tran, Yi Tay, Jeffrey Sorensen, Jai Prakash Gupta, Donald Metzler, Lucy Vasserman:
A New Generation of Perspective API: Efficient Multilingual Character-level Transformers. CoRR abs/2202.11176 (2022)
- [i76]Yun He, Huaixiu Steven Zheng, Yi Tay, Jai Prakash Gupta, Yu Du, Vamsi Aribandi, Zhe Zhao, YaGuang Li, Zhao Chen, Donald Metzler, Heng-Tze Cheng, Ed H. Chi:
HyperPrompt: Prompt-based Task-Conditioning of Transformers. CoRR abs/2203.00759 (2022)
- [i75]Aakanksha Chowdhery, Sharan Narang, Jacob Devlin, Maarten Bosma, Gaurav Mishra, Adam Roberts, Paul Barham, Hyung Won Chung, Charles Sutton, Sebastian Gehrmann, Parker Schuh, Kensen Shi, Sasha Tsvyashchenko, Joshua Maynez, Abhishek Rao, Parker Barnes, Yi Tay, Noam Shazeer, Vinodkumar Prabhakaran, Emily Reif, Nan Du, Ben Hutchinson, Reiner Pope, James Bradbury, Jacob Austin, Michael Isard, Guy Gur-Ari, Pengcheng Yin, Toju Duke, Anselm Levskaya, Sanjay Ghemawat, Sunipa Dev, Henryk Michalewski, Xavier Garcia, Vedant Misra, Kevin Robinson, Liam Fedus, Denny Zhou, Daphne Ippolito, David Luan, Hyeontaek Lim, Barret Zoph, Alexander Spiridonov, Ryan Sepassi, David Dohan, Shivani Agrawal, Mark Omernick, Andrew M. Dai, Thanumalayan Sankaranarayana Pillai, Marie Pellat, Aitor Lewkowycz, Erica Moreira, Rewon Child, Oleksandr Polozov, Katherine Lee, Zongwei Zhou, Xuezhi Wang, Brennan Saeta, Mark Diaz, Orhan Firat, Michele Catasta, Jason Wei, Kathy Meier-Hellstern, Douglas Eck, Jeff Dean, Slav Petrov, Noah Fiedel:
PaLM: Scaling Language Modeling with Pathways. CoRR abs/2204.02311 (2022)
- [i74]Kai Hui, Honglei Zhuang, Tao Chen, Zhen Qin, Jing Lu, Dara Bahri, Ji Ma, Jai Prakash Gupta, Cícero Nogueira dos Santos, Yi Tay, Don Metzler:
ED2LM: Encoder-Decoder to Language Model for Faster Document Re-ranking Inference. CoRR abs/2204.11458 (2022)
- [i73]Yi Tay, Mostafa Dehghani, Vinh Q. Tran, Xavier Garcia, Dara Bahri, Tal Schuster, Huaixiu Steven Zheng, Neil Houlsby, Donald Metzler:
Unifying Language Learning Paradigms. CoRR abs/2205.05131 (2022)
- [i72]Jason Wei, Yi Tay, Rishi Bommasani, Colin Raffel, Barret Zoph, Sebastian Borgeaud, Dani Yogatama, Maarten Bosma, Denny Zhou, Donald Metzler, Ed H. Chi, Tatsunori Hashimoto, Oriol Vinyals, Percy Liang, Jeff Dean, William Fedus:
Emergent Abilities of Large Language Models. CoRR abs/2206.07682 (2022)
- [i71]Tal Schuster, Adam Fisch, Jai Prakash Gupta, Mostafa Dehghani, Dara Bahri, Vinh Q. Tran, Yi Tay, Donald Metzler:
Confident Adaptive Language Modeling. CoRR abs/2207.07061 (2022)
- [i70]Yi Tay, Mostafa Dehghani, Samira Abnar, Hyung Won Chung, William Fedus, Jinfeng Rao, Sharan Narang, Vinh Q. Tran, Dani Yogatama, Donald Metzler:
Scaling Laws vs Model Architectures: How does Inductive Bias Influence Scaling? CoRR abs/2207.10551 (2022)
- [i69]Zhiqing Sun, Xuezhi Wang, Yi Tay, Yiming Yang, Denny Zhou:
Recitation-Augmented Language Models. CoRR abs/2210.01296 (2022)
- [i68]Freda Shi, Mirac Suzgun, Markus Freitag, Xuezhi Wang, Suraj Srivats, Soroush Vosoughi, Hyung Won Chung, Yi Tay, Sebastian Ruder, Denny Zhou, Dipanjan Das, Jason Wei:
Language Models are Multilingual Chain-of-Thought Reasoners. CoRR abs/2210.03057 (2022)
- [i67]Mirac Suzgun, Nathan Scales, Nathanael Schärli, Sebastian Gehrmann, Yi Tay, Hyung Won Chung, Aakanksha Chowdhery, Quoc V. Le, Ed H. Chi, Denny Zhou, Jason Wei:
Challenging BIG-Bench Tasks and Whether Chain-of-Thought Can Solve Them. CoRR abs/2210.09261 (2022)
- [i66]Yi Tay, Jason Wei, Hyung Won Chung, Vinh Q. Tran, David R. So, Siamak Shakeri, Xavier Garcia, Huaixiu Steven Zheng, Jinfeng Rao, Aakanksha Chowdhery, Denny Zhou, Donald Metzler, Slav Petrov, Neil Houlsby, Quoc V. Le, Mostafa Dehghani:
Transcending Scaling Laws with 0.1% Extra Compute. CoRR abs/2210.11399 (2022)
- [i65]Hyung Won Chung, Le Hou, Shayne Longpre, Barret Zoph, Yi Tay, William Fedus, Eric Li, Xuezhi Wang, Mostafa Dehghani, Siddhartha Brahma, Albert Webson, Shixiang Shane Gu, Zhuyun Dai, Mirac Suzgun, Xinyun Chen, Aakanksha Chowdhery, Sharan Narang, Gaurav Mishra, Adams Yu, Vincent Y. Zhao, Yanping Huang, Andrew M. Dai, Hongkun Yu, Slav Petrov, Ed H. Chi, Jeff Dean, Jacob Devlin, Adam Roberts, Denny Zhou, Quoc V. Le, Jason Wei:
Scaling Instruction-Finetuned Language Models. CoRR abs/2210.11416 (2022)
- [i64]Jason Wei, Yi Tay, Quoc V. Le:
Inverse scaling can become U-shaped. CoRR abs/2211.02011 (2022)
- [i63]Aran Komatsuzaki, Joan Puigcerver, James Lee-Thorp, Carlos Riquelme Ruiz, Basil Mustafa, Joshua Ainslie, Yi Tay, Mostafa Dehghani, Neil Houlsby:
Sparse Upcycling: Training Mixture-of-Experts from Dense Checkpoints. CoRR abs/2212.05055 (2022)
- [i62]Sanket Vaibhav Mehta, Jai Prakash Gupta, Yi Tay, Mostafa Dehghani, Vinh Q. Tran, Jinfeng Rao, Marc Najork, Emma Strubell, Donald Metzler:
DSI++: Updating Transformer Memory with New Documents. CoRR abs/2212.09744 (2022)
- [i61]Jai Gupta, Yi Tay, Chaitanya Kamath, Vinh Q. Tran, Donald Metzler, Shailesh Bavadekar, Mimi Sun, Evgeniy Gabrilovich:
Dense Feature Memory Augmented Transformers for COVID-19 Vaccination Search Classification. CoRR abs/2212.13898 (2022)
- 2021
- [j4]Donald Metzler, Yi Tay, Dara Bahri, Marc Najork:
Rethinking search: making domain experts out of dilettantes. SIGIR Forum 55(1): 13:1-13:27 (2021)
- [c59]Aston Zhang, Alvin Chan, Yi Tay, Jie Fu, Shuohang Wang, Shuai Zhang, Huajie Shao, Shuochao Yao, Roy Ka-Wei Lee:
On Orthogonality Constraints for Transformers. ACL/IJCNLP (2) 2021: 375-382
- [c58]Vamsi Aribandi, Yi Tay, Donald Metzler:
How Reliable are Model Diagnostics? ACL/IJCNLP (Findings) 2021: 1778-1785
- [c57]Yi Tay, Mostafa Dehghani, Jai Prakash Gupta, Vamsi Aribandi, Dara Bahri, Zhen Qin, Donald Metzler:
Are Pretrained Convolutions Better than Pretrained Transformers? ACL/IJCNLP (1) 2021: 4349-4359
- [c56]Yikang Shen, Yi Tay, Che Zheng, Dara Bahri, Donald Metzler, Aaron C. Courville:
StructFormer: Joint Unsupervised Induction of Dependency and Constituency Structure from Masked Language Modeling. ACL/IJCNLP (1) 2021: 7196-7209
- [c55]Sharan Narang, Hyung Won Chung, Yi Tay, Liam Fedus, Thibault Févry, Michael Matena, Karishma Malkan, Noah Fiedel, Noam Shazeer, Zhenzhong Lan, Yanqi Zhou, Wei Li, Nan Ding, Jake Marcus, Adam Roberts, Colin Raffel:
Do Transformer Modifications Transfer Across Implementations and Applications? EMNLP (1) 2021: 5758-5773
- [c54]Zhen Qin, Le Yan, Honglei Zhuang, Yi Tay, Rama Kumar Pasumarthi, Xuanhui Wang, Michael Bendersky, Marc Najork:
Are Neural Rankers still Outperformed by Gradient Boosted Decision Trees? ICLR 2021
- [c53]Yi Tay, Mostafa Dehghani, Samira Abnar, Yikang Shen, Dara Bahri, Philip Pham, Jinfeng Rao, Liu Yang, Sebastian Ruder, Donald Metzler:
Long Range Arena: A Benchmark for Efficient Transformers. ICLR 2021
- [c52]Yi Tay, Zhe Zhao, Dara Bahri, Donald Metzler, Da-Cheng Juan:
HyperGrid Transformers: Towards A Single Model for Multiple Tasks. ICLR 2021
- [c51]Aston Zhang, Yi Tay, Shuai Zhang, Alvin Chan, Anh Tuan Luu, Siu Cheung Hui, Jie Fu:
Beyond Fully-Connected Layers with Quaternions: Parameterization of Hypercomplex Multiplications with 1/n Parameters. ICLR 2021
- [c50]Yi Tay, Dara Bahri, Donald Metzler, Da-Cheng Juan, Zhe Zhao, Che Zheng:
Synthesizer: Rethinking Self-Attention for Transformer Models. ICML 2021: 10183-10192
- [c49]Yi Tay, Mostafa Dehghani, Vamsi Aribandi, Jai Prakash Gupta, Philip Pham, Zhen Qin, Dara Bahri, Da-Cheng Juan, Donald Metzler:
OmniNet: Omnidirectional Representations from Transformers. ICML 2021: 10193-10202
- [c48]Shuai Zhang, Xi Rao, Yi Tay, Ce Zhang:
Knowledge Router: Learning Disentangled Representations for Knowledge Graphs. NAACL-HLT 2021: 1-10
- [c47]Aston Zhang, Yi Tay, Yikang Shen, Alvin Chan, Shuai Zhang:
Self-Instantiated Recurrent Units with Dynamic Soft Recursion. NeurIPS 2021: 6503-6514
- [c46]Dara Bahri, Yi Tay, Che Zheng, Cliff Brunk, Donald Metzler, Andrew Tomkins:
Generative Models are Unsupervised Predictors of Page Quality: A Colossal-Scale Study. WSDM 2021: 301-309
- [i60]Dara Bahri, Heinrich Jiang, Yi Tay, Donald Metzler:
Label Smoothed Embedding Hypothesis for Out-of-Distribution Detection. CoRR abs/2102.05131 (2021)
- [i59]Aston Zhang, Yi Tay, Shuai Zhang, Alvin Chan, Anh Tuan Luu, Siu Cheung Hui, Jie Fu:
Beyond Fully-Connected Layers with Quaternions: Parameterization of Hypercomplex Multiplications with 1/n Parameters. CoRR abs/2102.08597 (2021)
- [i58]Shuai Zhang, Yi Tay, Wenqi Jiang, Da-Cheng Juan, Ce Zhang:
Switch Spaces: Learning Product Spaces with Sparse Gating. CoRR abs/2102.08688 (2021)
- [i57]Sharan Narang, Hyung Won Chung, Yi Tay, William Fedus, Thibault Févry, Michael Matena, Karishma Malkan, Noah Fiedel, Noam Shazeer, Zhenzhong Lan, Yanqi Zhou, Wei Li, Nan Ding, Jake Marcus, Adam Roberts, Colin Raffel:
Do Transformer Modifications Transfer Across Implementations and Applications? CoRR abs/2102.11972 (2021)
- [i56]Yi Tay, Mostafa Dehghani, Vamsi Aribandi, Jai Prakash Gupta, Philip Pham, Zhen Qin, Dara Bahri, Da-Cheng Juan, Donald Metzler:
OmniNet: Omnidirectional Representations from Transformers. CoRR abs/2103.01075 (2021)
- [i55]Donald Metzler, Yi Tay, Dara Bahri, Marc Najork:
Rethinking Search: Making Experts out of Dilettantes. CoRR abs/2105.02274 (2021)
- [i54]Yi Tay, Mostafa Dehghani, Jai Prakash Gupta, Dara Bahri, Vamsi Aribandi, Zhen Qin, Donald Metzler:
Are Pre-trained Convolutions Better than Pre-trained Transformers? CoRR abs/2105.03322 (2021)
- [i53]Vamsi Aribandi, Yi Tay, Donald Metzler:
How Reliable are Model Diagnostics? CoRR abs/2105.05641 (2021)
- [i52]Yi Tay, Vinh Q. Tran, Sebastian Ruder, Jai Prakash Gupta, Hyung Won Chung, Dara Bahri, Zhen Qin, Simon Baumgartner, Cong Yu, Donald Metzler:
Charformer: Fast Character Transformers via Gradient-based Subword Tokenization. CoRR abs/2106.12672 (2021)
- [i51]Dara Bahri, Heinrich Jiang, Yi Tay, Donald Metzler:
SCARF: Self-Supervised Contrastive Learning using Random Feature Corruption. CoRR abs/2106.15147 (2021)
- [i50]Mostafa Dehghani, Yi Tay, Alexey A. Gritsenko, Zhe Zhao, Neil Houlsby, Fernando Diaz, Donald Metzler, Oriol Vinyals:
The Benchmark Lottery. CoRR abs/2107.07002 (2021)
- [i49]Yi Tay, Mostafa Dehghani, Jinfeng Rao, William Fedus, Samira Abnar, Hyung Won Chung, Sharan Narang, Dani Yogatama, Ashish Vaswani, Donald Metzler:
Scale Efficiently: Insights from Pre-training and Fine-tuning Transformers. CoRR abs/2109.10686 (2021)
- [i48]Zhen Qin, Le Yan, Yi Tay, Honglei Zhuang, Xuanhui Wang, Michael Bendersky, Marc Najork:
Born Again Neural Rankers. CoRR abs/2109.15285 (2021)
- [i47]Sanket Vaibhav Mehta, Jinfeng Rao, Yi Tay, Mihir Kale, Ankur Parikh, Hongtao Zhong, Emma Strubell:
Improving Compositional Generalization with Self-Training for Data-to-Text Generation. CoRR abs/2110.08467 (2021)
- [i46]Dara Bahri, Hossein Mobahi, Yi Tay:
Sharpness-Aware Minimization Improves Language Model Generalization. CoRR abs/2110.08529 (2021)
- [i45]Mostafa Dehghani, Alexey A. Gritsenko, Anurag Arnab, Matthias Minderer, Yi Tay:
SCENIC: A JAX Library for Computer Vision Research and Beyond. CoRR abs/2110.11403 (2021)
- [i44]Mostafa Dehghani, Anurag Arnab, Lucas Beyer, Ashish Vaswani, Yi Tay:
The Efficiency Misnomer. CoRR abs/2110.12894 (2021)
- [i43]Vamsi Aribandi, Yi Tay, Tal Schuster, Jinfeng Rao, Huaixiu Steven Zheng, Sanket Vaibhav Mehta, Honglei Zhuang, Vinh Q. Tran, Dara Bahri, Jianmo Ni, Jai Prakash Gupta, Kai Hui, Sebastian Ruder, Donald Metzler:
ExT5: Towards Extreme Multi-Task Scaling for Transfer Learning. CoRR abs/2111.10952 (2021)
- [i42]Valerii Likhosherstov, Anurag Arnab, Krzysztof Choromanski, Mario Lucic, Yi Tay, Adrian Weller, Mostafa Dehghani:
PolyViT: Co-training Vision Transformers on Images, Videos and Audio. CoRR abs/2111.12993 (2021)
- 2020
- [j3]Anran Wang, Anh Tuan Luu, Chuan-Sheng Foo, Hongyuan Zhu, Yi Tay, Vijay Chandrasekhar:
Holistic Multi-Modal Memory Network for Movie Question Answering. IEEE Trans. Image Process. 29: 489-499 (2020)
- [c45]Shuohang Wang, Yunshi Lan, Yi Tay, Jing Jiang, Jingjing Liu:
Multi-Level Head-Wise Match and Aggregation in Transformer for Textual Sequence Matching. AAAI 2020: 9209-9216
- [c44]Yi Tay, Dara Bahri, Che Zheng, Clifford Brunk, Donald Metzler, Andrew Tomkins:
Reverse Engineering Configurations of Neural Text Generation Models. ACL 2020: 275-279
- [c43]Xingdi Yuan, Jie Fu, Marc-Alexandre Côté, Yi Tay, Chris Pal, Adam Trischler:
Interactive Machine Comprehension with Information Seeking Agents. ACL 2020: 2325-2338
- [c42]Yi Tay, Donovan Ong, Jie Fu, Alvin Chan, Nancy Chen, Anh Tuan Luu, Chris Pal:
Would you Rather? A New Benchmark for Learning Machine Alignment with Cultural Values and Social Preferences. ACL 2020: 5369-5373
- [c41]Alvin Chan, Yi Tay, Yew-Soon Ong:
What It Thinks Is Important Is Important: Robustness Transfers Through Input Gradients. CVPR 2020: 329-338
- [c40]Alvin Chan, Yi Tay, Yew-Soon Ong, Aston Zhang:
Poison Attacks against Text Datasets with Conditional Adversarially Regularized Autoencoder. EMNLP (Findings) 2020: 4175-4189
- [c39]Alvin Chan, Yi Tay, Yew-Soon Ong, Jie Fu:
Jacobian Adversarially Regularized Networks for Robustness. ICLR 2020
- [c38]