


Kailash Gopalakrishnan
2020 – today
- 2023
- [c30] Ankur Agrawal, Monodeep Kar, Kyu-Hyoun Kim, Sergey V. Rylov, Jinwook Jung, Seiji Munetoh, Kohji Hosokawa, Xin Zhang, Bahman Hekmatshoartabari, Fabio Carta, Martin Cochet, Robert Casatuta, Mingu Kang, Sunil Shukla, Kailash Gopalakrishnan, Leland Chang:
A Switched-Capacitor Integer Compute Unit with Decoupled Storage and Arithmetic for Cloud AI Inference in 5nm CMOS. VLSI Technology and Circuits 2023: 1-2
- 2022
- [j6] Sae Kyu Lee, Ankur Agrawal, Joel Silberman, Matthew M. Ziegler, Mingu Kang, Swagath Venkataramani, Nianzheng Cao, Bruce M. Fleischer, Michael Guillorn, Matthew Cohen, Silvia M. Mueller, Jinwook Oh, Martin Lutz, Jinwook Jung, Siyu Koswatta, Ching Zhou, Vidhi Zalani, Monodeep Kar, James Bonanno, Robert Casatuta, Chia-Yu Chen, Jungwook Choi, Howard Haynie, Alyssa Herbert, Radhika Jain, Kyu-Hyoun Kim, Yulong Li, Zhibin Ren, Scot Rider, Marcel Schaal, Kerstin Schelm, Michael Scheuermann, Xiao Sun, Hung Tran, Naigang Wang, Wei Wang, Xin Zhang, Vinay Shah, Brian W. Curran, Vijayalakshmi Srinivasan, Pong-Fei Lu, Sunil Shukla, Kailash Gopalakrishnan, Leland Chang:
A 7-nm Four-Core Mixed-Precision AI Chip With 26.2-TFLOPS Hybrid-FP8 Training, 104.9-TOPS INT4 Inference, and Workload-Aware Throttling. IEEE J. Solid State Circuits 57(1): 182-197 (2022)
- [j5] Subhankar Pal, Swagath Venkataramani, Viji Srinivasan, Kailash Gopalakrishnan:
OnSRAM: Efficient Inter-Node On-Chip Scratchpad Management in Deep Learning Accelerators. ACM Trans. Embed. Comput. Syst. 21(6): 86:1-86:29 (2022)
- [c29] Andrea Fasoli, Chia-Yu Chen, Mauricio J. Serrano, Swagath Venkataramani, George Saon, Xiaodong Cui, Brian Kingsbury, Kailash Gopalakrishnan:
Accelerating Inference and Language Model Fusion of Recurrent Neural Network Transducers via End-to-End 4-bit Quantization. INTERSPEECH 2022: 2038-2042
- [i12] Andrea Fasoli, Chia-Yu Chen, Mauricio J. Serrano, Swagath Venkataramani, George Saon, Xiaodong Cui, Brian Kingsbury, Kailash Gopalakrishnan:
Accelerating Inference and Language Model Fusion of Recurrent Neural Network Transducers via End-to-End 4-bit Quantization. CoRR abs/2206.07882 (2022)
- 2021
- [c28] Andrea Fasoli, Chia-Yu Chen, Mauricio J. Serrano, Xiao Sun, Naigang Wang, Swagath Venkataramani, George Saon, Xiaodong Cui, Brian Kingsbury, Wei Zhang, Zoltán Tüske, Kailash Gopalakrishnan:
4-Bit Quantization of LSTM-Based Speech Recognition Models. Interspeech 2021: 2586-2590
- [c27] Swagath Venkataramani, Vijayalakshmi Srinivasan, Wei Wang, Sanchari Sen, Jintao Zhang, Ankur Agrawal, Monodeep Kar, Shubham Jain, Alberto Mannari, Hoang Tran, Yulong Li, Eri Ogawa, Kazuaki Ishizaki, Hiroshi Inoue, Marcel Schaal, Mauricio J. Serrano, Jungwook Choi, Xiao Sun, Naigang Wang, Chia-Yu Chen, Allison Allain, James Bonanno, Nianzheng Cao, Robert Casatuta, Matthew Cohen, Bruce M. Fleischer, Michael Guillorn, Howard Haynie, Jinwook Jung, Mingu Kang, Kyu-Hyoun Kim, Siyu Koswatta, Sae Kyu Lee, Martin Lutz, Silvia M. Mueller, Jinwook Oh, Ashish Ranjan, Zhibin Ren, Scot Rider, Kerstin Schelm, Michael Scheuermann, Joel Silberman, Jie Yang, Vidhi Zalani, Xin Zhang, Ching Zhou, Matthew M. Ziegler, Vinay Shah, Moriyoshi Ohara, Pong-Fei Lu, Brian W. Curran, Sunil Shukla, Leland Chang, Kailash Gopalakrishnan:
RaPiD: AI Accelerator for Ultra-low Precision Training and Inference. ISCA 2021: 153-166
- [c26] Subhankar Pal, Swagath Venkataramani, Viji Srinivasan, Kailash Gopalakrishnan:
Efficient Management of Scratch-Pad Memories in Deep Learning Accelerators. ISPASS 2021: 240-242
- [c25] Ankur Agrawal, Sae Kyu Lee, Joel Silberman, Matthew M. Ziegler, Mingu Kang, Swagath Venkataramani, Nianzheng Cao, Bruce M. Fleischer, Michael Guillorn, Matt Cohen, Silvia M. Mueller, Jinwook Oh, Martin Lutz, Jinwook Jung, Siyu Koswatta, Ching Zhou, Vidhi Zalani, James Bonanno, Robert Casatuta, Chia-Yu Chen, Jungwook Choi, Howard Haynie, Alyssa Herbert, Radhika Jain, Monodeep Kar, Kyu-Hyoun Kim, Yulong Li, Zhibin Ren, Scot Rider, Marcel Schaal, Kerstin Schelm, Michael Scheuermann, Xiao Sun, Hung Tran, Naigang Wang, Wei Wang, Xin Zhang, Vinay Shah, Brian W. Curran, Vijayalakshmi Srinivasan, Pong-Fei Lu, Sunil Shukla, Leland Chang, Kailash Gopalakrishnan:
A 7nm 4-Core AI Chip with 25.6TFLOPS Hybrid FP8 Training, 102.4TOPS INT4 Inference and Workload-Aware Throttling. ISSCC 2021: 144-146
- [i11] Ximeng Sun, Rameswar Panda, Chun-Fu Chen, Naigang Wang, Bowen Pan, Kailash Gopalakrishnan, Aude Oliva, Rogério Feris, Kate Saenko:
All at Once Network Quantization via Collaborative Knowledge Transfer. CoRR abs/2103.01435 (2021)
- [i10] Chia-Yu Chen, Jiamin Ni, Songtao Lu, Xiaodong Cui, Pin-Yu Chen, Xiao Sun, Naigang Wang, Swagath Venkataramani, Vijayalakshmi Srinivasan, Wei Zhang, Kailash Gopalakrishnan:
ScaleCom: Scalable Sparsified Gradient Compression for Communication-Efficient Distributed Training. CoRR abs/2104.11125 (2021)
- [i9] Andrea Fasoli, Chia-Yu Chen, Mauricio J. Serrano, Xiao Sun, Naigang Wang, Swagath Venkataramani, George Saon, Xiaodong Cui, Brian Kingsbury, Wei Zhang, Zoltán Tüske, Kailash Gopalakrishnan:
4-bit Quantization of LSTM-based Speech Recognition Models. CoRR abs/2108.12074 (2021)
- 2020
- [j4] Swagath Venkataramani, Xiao Sun, Naigang Wang, Chia-Yu Chen, Jungwook Choi, Mingu Kang, Ankur Agarwal, Jinwook Oh, Shubham Jain, Tina Babinsky, Nianzheng Cao, Thomas W. Fox, Bruce M. Fleischer, George Gristede, Michael Guillorn, Howard Haynie, Hiroshi Inoue, Kazuaki Ishizaki, Michael J. Klaiber, Shih-Hsien Lo, Gary W. Maier, Silvia M. Mueller, Michael Scheuermann, Eri Ogawa, Marcel Schaal, Mauricio J. Serrano, Joel Silberman, Christos Vezyrtzis, Wei Wang, Fanchieh Yee, Jintao Zhang, Matthew M. Ziegler, Ching Zhou, Moriyoshi Ohara, Pong-Fei Lu, Brian W. Curran, Sunil Shukla, Vijayalakshmi Srinivasan, Leland Chang, Kailash Gopalakrishnan:
Efficient AI System Design With Cross-Layer Approximate Computing. Proc. IEEE 108(12): 2232-2250 (2020)
- [c24] Chia-Yu Chen, Jiamin Ni, Songtao Lu, Xiaodong Cui, Pin-Yu Chen, Xiao Sun, Naigang Wang, Swagath Venkataramani, Vijayalakshmi Srinivasan, Wei Zhang, Kailash Gopalakrishnan:
ScaleCom: Scalable Sparsified Gradient Compression for Communication-Efficient Distributed Training. NeurIPS 2020
- [c23] Yonggan Fu, Haoran You, Yang Zhao, Yue Wang, Chaojian Li, Kailash Gopalakrishnan, Zhangyang Wang, Yingyan Lin:
FracTrain: Fractionally Squeezing Bit Savings Both Temporally and Spatially for Efficient DNN Training. NeurIPS 2020
- [c22] Xiao Sun, Naigang Wang, Chia-Yu Chen, Jiamin Ni, Ankur Agrawal, Xiaodong Cui, Swagath Venkataramani, Kaoutar El Maghraoui, Vijayalakshmi Srinivasan, Kailash Gopalakrishnan:
Ultra-Low Precision 4-bit Training of Deep Neural Networks. NeurIPS 2020
- [c21] Jinwook Oh, Sae Kyu Lee, Mingu Kang, Matthew M. Ziegler, Joel Silberman, Ankur Agrawal, Swagath Venkataramani, Bruce M. Fleischer, Michael Guillorn, Jungwook Choi, Wei Wang, Silvia M. Mueller, Shimon Ben-Yehuda, James Bonanno, Nianzheng Cao, Robert Casatuta, Chia-Yu Chen, Matt Cohen, Ophir Erez, Thomas W. Fox, George Gristede, Howard Haynie, Vicktoria Ivanov, Siyu Koswatta, Shih-Hsien Lo, Martin Lutz, Gary W. Maier, Alex Mesh, Yevgeny Nustov, Scot Rider, Marcel Schaal, Michael Scheuermann, Xiao Sun, Naigang Wang, Fanchieh Yee, Ching Zhou, Vinay Shah, Brian W. Curran, Vijayalakshmi Srinivasan, Pong-Fei Lu, Sunil Shukla, Kailash Gopalakrishnan, Leland Chang:
A 3.0 TFLOPS 0.62V Scalable Processor Core for High Compute Utilization AI Training and Inference. VLSI Circuits 2020: 1-2
- [i8] Yonggan Fu, Haoran You, Yang Zhao, Yue Wang, Chaojian Li, Kailash Gopalakrishnan, Zhangyang Wang, Yingyan Lin:
FracTrain: Fractionally Squeezing Bit Savings Both Temporally and Spatially for Efficient DNN Training. CoRR abs/2012.13113 (2020)
2010 – 2019
- 2019
- [j3] Swagath Venkataramani, Jungwook Choi, Vijayalakshmi Srinivasan, Wei Wang, Jintao Zhang, Marcel Schaal, Mauricio J. Serrano, Kazuaki Ishizaki, Hiroshi Inoue, Eri Ogawa, Moriyoshi Ohara, Leland Chang, Kailash Gopalakrishnan:
DeepTools: Compiler and Execution Runtime Extensions for RaPiD AI Accelerator. IEEE Micro 39(5): 102-111 (2019)
- [c20] Ankur Agrawal, Bruce M. Fleischer, Silvia M. Mueller, Xiao Sun, Naigang Wang, Jungwook Choi, Kailash Gopalakrishnan:
DLFloat: A 16-b Floating Point Format Designed for Deep Learning Training and Inference. ARITH 2019: 92-95
- [c19] Eri Ogawa, Kazuaki Ishizaki, Hiroshi Inoue, Swagath Venkataramani, Jungwook Choi, Wei Wang, Vijayalakshmi Srinivasan, Moriyoshi Ohara, Kailash Gopalakrishnan:
A Compiler for Deep Neural Network Accelerators to Generate Optimized Code for a Wide Range of Data Parameters from a Hand-crafted Computation Kernel. COOL CHIPS 2019: 1-3
- [c18] Shubham Jain, Swagath Venkataramani, Vijayalakshmi Srinivasan, Jungwook Choi, Kailash Gopalakrishnan, Leland Chang:
BiScaled-DNN: Quantizing Long-tailed Datastructures with Two Scale Factors for Deep Neural Networks. DAC 2019: 201
- [c17] Swagath Venkataramani, Vijayalakshmi Srinivasan, Jungwook Choi, Philip Heidelberger, Leland Chang, Kailash Gopalakrishnan:
Memory and Interconnect Optimizations for Peta-Scale Deep Learning Systems. HiPC 2019: 225-234
- [c16] Charbel Sakr, Naigang Wang, Chia-Yu Chen, Jungwook Choi, Ankur Agrawal, Naresh R. Shanbhag, Kailash Gopalakrishnan:
Accumulation Bit-Width Scaling For Ultra-Low Precision Training Of Deep Networks. ICLR (Poster) 2019
- [c15] Swagath Venkataramani, Jungwook Choi, Vijayalakshmi Srinivasan, Kailash Gopalakrishnan, Leland Chang:
Performance-driven Programming of Multi-TFLOP Deep Learning Accelerators. IISWC 2019: 257-262
- [c14] Jungwook Choi, Swagath Venkataramani, Vijayalakshmi Srinivasan, Kailash Gopalakrishnan, Zhuo Wang, Pierce Chuang:
Accurate and Efficient 2-bit Quantized Neural Networks. SysML 2019
- [c13] Xiao Sun, Jungwook Choi, Chia-Yu Chen, Naigang Wang, Swagath Venkataramani, Vijayalakshmi Srinivasan, Xiaodong Cui, Wei Zhang, Kailash Gopalakrishnan:
Hybrid 8-bit Floating Point (HFP8) Training and Inference for Deep Neural Networks. NeurIPS 2019: 4901-4910
- [i7] Charbel Sakr, Naigang Wang, Chia-Yu Chen, Jungwook Choi, Ankur Agrawal, Naresh R. Shanbhag, Kailash Gopalakrishnan:
Accumulation Bit-Width Scaling For Ultra-Low Precision Training Of Deep Networks. CoRR abs/1901.06588 (2019)
- 2018
- [c12] Chia-Yu Chen, Jungwook Choi, Daniel Brand, Ankur Agrawal, Wei Zhang, Kailash Gopalakrishnan:
AdaComp : Adaptive Residual Gradient Compression for Data-Parallel Distributed Training. AAAI 2018: 2827-2835
- [c11] Chia-Yu Chen, Jungwook Choi, Kailash Gopalakrishnan, Viji Srinivasan, Swagath Venkataramani:
Exploiting approximate computing for deep learning acceleration. DATE 2018: 821-826
- [c10] Charbel Sakr, Jungwook Choi, Zhuo Wang, Kailash Gopalakrishnan, Naresh R. Shanbhag:
True Gradient-Based Training of Deep Binary Activated Neural Networks Via Continuous Binarization. ICASSP 2018: 2346-2350
- [c9] Swagath Venkataramani, Vijayalakshmi Srinivasan, Jungwook Choi, Kailash Gopalakrishnan, Leland Chang:
Taming the beast: Programming Peta-FLOP class Deep Learning Systems. ISLPED 2018: 18:1
- [c8] Vijayalakshmi Srinivasan, Bruce M. Fleischer, Sunil Shukla, Matthew M. Ziegler, Joel Silberman, Jinwook Oh, Jungwook Choi, Silvia M. Mueller, Ankur Agrawal, Tina Babinsky, Nianzheng Cao, Chia-Yu Chen, Pierce Chuang, Thomas W. Fox, George Gristede, Michael Guillorn, Howard Haynie, Michael J. Klaiber, Dongsoo Lee, Shih-Hsien Lo, Gary W. Maier, Michael Scheuermann, Swagath Venkataramani, Christos Vezyrtzis, Naigang Wang, Fanchieh Yee, Ching Zhou, Pong-Fei Lu, Brian W. Curran, Leland Chang, Kailash Gopalakrishnan:
Across the Stack Opportunities for Deep Learning Acceleration. ISLPED 2018: 35:1-35:2
- [c7] Naigang Wang, Jungwook Choi, Daniel Brand, Chia-Yu Chen, Kailash Gopalakrishnan:
Training Deep Neural Networks with 8-bit Floating Point Numbers. NeurIPS 2018: 7686-7695
- [c6] Bruce M. Fleischer, Sunil Shukla, Matthew M. Ziegler, Joel Silberman, Jinwook Oh, Vijayalakshmi Srinivasan, Jungwook Choi, Silvia M. Mueller, Ankur Agrawal, Tina Babinsky, Nianzheng Cao, Chia-Yu Chen, Pierce Chuang, Thomas W. Fox, George Gristede, Michael Guillorn, Howard Haynie, Michael J. Klaiber, Dongsoo Lee, Shih-Hsien Lo, Gary W. Maier, Michael Scheuermann, Swagath Venkataramani, Christos Vezyrtzis, Naigang Wang, Fanchieh Yee, Ching Zhou, Pong-Fei Lu, Brian W. Curran, Leland Chang, Kailash Gopalakrishnan:
A Scalable Multi-TeraOPS Deep Learning Processor Core for AI Training and Inference. VLSI Circuits 2018: 35-36
- [i6] Jungwook Choi, Zhuo Wang, Swagath Venkataramani, Pierce I-Jen Chuang, Vijayalakshmi Srinivasan, Kailash Gopalakrishnan:
PACT: Parameterized Clipping Activation for Quantized Neural Networks. CoRR abs/1805.06085 (2018)
- [i5] Jungwook Choi, Pierce I-Jen Chuang, Zhuo Wang, Swagath Venkataramani, Vijayalakshmi Srinivasan, Kailash Gopalakrishnan:
Bridging the Accuracy Gap for 2-bit Quantized Neural Networks (QNN). CoRR abs/1807.06964 (2018)
- [i4] Naigang Wang, Jungwook Choi, Daniel Brand, Chia-Yu Chen, Kailash Gopalakrishnan:
Training Deep Neural Networks with 8-bit Floating Point Numbers. CoRR abs/1812.08011 (2018)
- 2017
- [c5] Swagath Venkataramani, Jungwook Choi, Vijayalakshmi Srinivasan, Kailash Gopalakrishnan, Leland Chang:
POSTER: Design Space Exploration for Performance Optimization of Deep Neural Networks on Shared Memory Accelerators. PACT 2017: 146-147
- [c4] Ankur Agrawal, Chia-Yu Chen, Jungwook Choi, Kailash Gopalakrishnan, Jinwook Oh, Sunil Shukla, Viji Srinivasan, Swagath Venkataramani, Wei Zhang:
Accelerator Design for Deep Learning Training: Extended Abstract: Invited. DAC 2017: 57:1-57:2
- [i3] Chia-Yu Chen, Jungwook Choi, Daniel Brand, Ankur Agrawal, Wei Zhang, Kailash Gopalakrishnan:
AdaComp : Adaptive Residual Gradient Compression for Data-Parallel Distributed Training. CoRR abs/1712.02679 (2017)
- 2016
- [c3] Ankur Agrawal, Jungwook Choi, Kailash Gopalakrishnan, Suyog Gupta, Ravi Nair, Jinwook Oh, Daniel A. Prener, Sunil Shukla, Vijayalakshmi Srinivasan, Zehra Sura:
Approximate computing: Challenges and opportunities. ICRC 2016: 1-8
- [c2] Jinwook Oh, Jungwook Choi, Guilherme C. Januario, Kailash Gopalakrishnan:
Energy-Efficient Simultaneous Localization and Mapping via Compounded Approximate Computing. SiPS 2016: 51-56
- 2015
- [c1] Suyog Gupta, Ankur Agrawal, Kailash Gopalakrishnan, Pritish Narayanan:
Deep Learning with Limited Numerical Precision. ICML 2015: 1737-1746
- [i2] Suyog Gupta, Ankur Agrawal, Kailash Gopalakrishnan, Pritish Narayanan:
Deep Learning with Limited Numerical Precision. CoRR abs/1502.02551 (2015)
- 2014
- [i1] Suyog Gupta, Vikas Sindhwani, Kailash Gopalakrishnan:
Learning Machines Implemented on Non-Deterministic Hardware. CoRR abs/1409.2620 (2014)
- 2013
- [j2] Bryan L. Jackson, Bipin Rajendran, Gregory S. Corrado, Matthew J. Breitwisch, Geoffrey W. Burr, Roger Cheek, Kailash Gopalakrishnan, Simone Raoux, Charles T. Rettner, Alvaro Padilla, Alejandro G. Schrott, Rohit S. Shenoy, Bülent N. Kurdi, Chung Hon Lam, Dharmendra S. Modha:
Nanoscale electronic synapses using phase change devices. ACM J. Emerg. Technol. Comput. Syst. 9(2): 12:1-12:20 (2013)
2000 – 2009
- 2008
- [j1] Geoffrey W. Burr, Bülent N. Kurdi, J. Campbell Scott, Chung Hon Lam, Kailash Gopalakrishnan, Rohit S. Shenoy:
Overview of candidate device technologies for storage-class memory. IBM J. Res. Dev. 52(4-5): 449-464 (2008)
last updated on 2025-01-20 22:54 CET by the dblp team
all metadata released as open data under CC0 1.0 license