
Sashank J. Reddi
Person information
- affiliation: Carnegie Mellon University, Machine Learning Department
2020 – today
- 2020
- [c31]Aditya Krishna Menon, Ankit Singh Rawat, Sashank J. Reddi, Sanjiv Kumar:
Can gradient clipping mitigate label noise? ICLR 2020
- [c30]Yangjun Ruan, Yuanhao Xiong, Sashank J. Reddi, Sanjiv Kumar, Cho-Jui Hsieh:
Learning to Learn by Zeroth-Order Oracle. ICLR 2020
- [c29]Yang You, Jing Li, Sashank J. Reddi, Jonathan Hseu, Sanjiv Kumar, Srinadh Bhojanapalli, Xiaodan Song, James Demmel, Kurt Keutzer, Cho-Jui Hsieh:
Large Batch Optimization for Deep Learning: Training BERT in 76 minutes. ICLR 2020
- [c28]Chulhee Yun, Srinadh Bhojanapalli, Ankit Singh Rawat, Sashank J. Reddi, Sanjiv Kumar:
Are Transformers universal approximators of sequence-to-sequence functions? ICLR 2020
- [c27]Srinadh Bhojanapalli, Chulhee Yun, Ankit Singh Rawat, Sashank J. Reddi, Sanjiv Kumar:
Low-Rank Bottleneck in Multi-head Attention Models. ICML 2020: 864-873
- [c26]Sai Praneeth Karimireddy, Satyen Kale, Mehryar Mohri, Sashank J. Reddi, Sebastian U. Stich, Ananda Theertha Suresh:
SCAFFOLD: Stochastic Controlled Averaging for Federated Learning. ICML 2020: 5132-5143
- [c25]Chulhee Yun, Yin-Wen Chang, Srinadh Bhojanapalli, Ankit Singh Rawat, Sashank J. Reddi, Sanjiv Kumar:
O(n) Connections are Expressive Enough: Universal Approximability of Sparse Transformers. NeurIPS 2020
- [c24]Jingzhao Zhang, Sai Praneeth Karimireddy, Andreas Veit, Seungyeon Kim, Sashank J. Reddi, Sanjiv Kumar, Suvrit Sra:
Why are Adaptive Methods Good for Attention Models? NeurIPS 2020
- [i29]Srinadh Bhojanapalli, Chulhee Yun, Ankit Singh Rawat, Sashank J. Reddi, Sanjiv Kumar:
Low-Rank Bottleneck in Multi-head Attention Models. CoRR abs/2002.07028 (2020)
- [i28]Ilqar Ramazanli, Han Nguyen, Hai Pham, Sashank J. Reddi, Barnabás Póczos:
Adaptive Sampling Distributed Stochastic Variance Reduced Gradient for Heterogeneous Distributed Datasets. CoRR abs/2002.08528 (2020)
- [i27]Sashank J. Reddi, Zachary Charles, Manzil Zaheer, Zachary Garrett, Keith Rush, Jakub Konecný, Sanjiv Kumar, H. Brendan McMahan:
Adaptive Federated Optimization. CoRR abs/2003.00295 (2020)
- [i26]Ankit Singh Rawat, Aditya Krishna Menon, Andreas Veit, Felix X. Yu, Sashank J. Reddi, Sanjiv Kumar:
Doubly-stochastic mining for heterogeneous retrieval. CoRR abs/2004.10915 (2020)
- [i25]Aditya Krishna Menon, Ankit Singh Rawat, Sashank J. Reddi, Seungyeon Kim, Sanjiv Kumar:
Why distillation helps: a statistical perspective. CoRR abs/2005.10419 (2020)
- [i24]Chulhee Yun, Yin-Wen Chang, Srinadh Bhojanapalli, Ankit Singh Rawat, Sashank J. Reddi, Sanjiv Kumar:
$O(n)$ Connections are Expressive Enough: Universal Approximability of Sparse Transformers. CoRR abs/2006.04862 (2020)
- [i23]Sai Praneeth Karimireddy, Martin Jaggi, Satyen Kale, Mehryar Mohri, Sashank J. Reddi, Sebastian U. Stich, Ananda Theertha Suresh:
Mime: Mimicking Centralized Stochastic Algorithms in Federated Learning. CoRR abs/2008.03606 (2020)
- [i22]Honglin Yuan, Manzil Zaheer, Sashank J. Reddi:
Federated Composite Optimization. CoRR abs/2011.08474 (2020)
2010 – 2019
- 2019
- [c23]Sashank J. Reddi, Satyen Kale, Felix X. Yu, Daniel Niels Holtmann-Rice, Jiecao Chen, Sanjiv Kumar:
Stochastic Negative Mining for Learning with Large Output Spaces. AISTATS 2019: 1940-1949
- [c22]Matthew Staib, Sashank J. Reddi, Satyen Kale, Sanjiv Kumar, Suvrit Sra:
Escaping Saddle Points with Adaptive Gradient Methods. ICML 2019: 5956-5965
- [c21]Chuan Guo, Ali Mousavi, Xiang Wu, Daniel Niels Holtmann-Rice, Satyen Kale, Sashank J. Reddi, Sanjiv Kumar:
Breaking the Glass Ceiling for Embedding-Based Classifiers for Large Output Spaces. NeurIPS 2019: 4944-4954
- [i21]Matthew Staib, Sashank J. Reddi, Satyen Kale, Sanjiv Kumar, Suvrit Sra:
Escaping Saddle Points with Adaptive Gradient Methods. CoRR abs/1901.09149 (2019)
- [i20]Sashank J. Reddi, Satyen Kale, Sanjiv Kumar:
On the Convergence of Adam and Beyond. CoRR abs/1904.09237 (2019)
- [i19]Venkatadheeraj Pichapati, Ananda Theertha Suresh, Felix X. Yu, Sashank J. Reddi, Sanjiv Kumar:
AdaCliP: Adaptive Clipping for Private SGD. CoRR abs/1908.07643 (2019)
- [i18]Sai Praneeth Karimireddy, Satyen Kale, Mehryar Mohri, Sashank J. Reddi, Sebastian U. Stich, Ananda Theertha Suresh:
SCAFFOLD: Stochastic Controlled Averaging for On-Device Federated Learning. CoRR abs/1910.06378 (2019)
- [i17]Yangjun Ruan, Yuanhao Xiong, Sashank J. Reddi, Sanjiv Kumar, Cho-Jui Hsieh:
Learning to Learn by Zeroth-Order Oracle. CoRR abs/1910.09464 (2019)
- [i16]Jingzhao Zhang, Sai Praneeth Karimireddy, Andreas Veit, Seungyeon Kim, Sashank J. Reddi, Sanjiv Kumar, Suvrit Sra:
Why ADAM Beats SGD for Attention Models. CoRR abs/1912.03194 (2019)
- [i15]Chulhee Yun, Srinadh Bhojanapalli, Ankit Singh Rawat, Sashank J. Reddi, Sanjiv Kumar:
Are Transformers universal approximators of sequence-to-sequence functions? CoRR abs/1912.10077 (2019)
- 2018
- [c20]Sashank J. Reddi, Manzil Zaheer, Suvrit Sra, Barnabás Póczos, Francis R. Bach, Ruslan Salakhutdinov, Alexander J. Smola:
A Generic Approach for Escaping Saddle points. AISTATS 2018: 1233-1242
- [c19]Sashank J. Reddi, Satyen Kale, Sanjiv Kumar:
On the Convergence of Adam and Beyond. ICLR 2018
- [c18]Manzil Zaheer, Sashank J. Reddi, Devendra Singh Sachan, Satyen Kale, Sanjiv Kumar:
Adaptive Methods for Nonconvex Optimization. NeurIPS 2018: 9815-9825
- [i14]Sashank J. Reddi, Satyen Kale, Felix X. Yu, Daniel N. Holtmann-Rice, Jiecao Chen, Sanjiv Kumar:
Stochastic Negative Mining for Learning with Large Output Spaces. CoRR abs/1810.07076 (2018)
- 2017
- [i13]Sashank J. Reddi, Manzil Zaheer, Suvrit Sra, Barnabás Póczos, Francis R. Bach, Ruslan Salakhutdinov, Alexander J. Smola:
A Generic Approach for Escaping Saddle points. CoRR abs/1709.01434 (2017)
- 2016
- [c17]Sashank J. Reddi, Suvrit Sra, Barnabás Póczos, Alexander J. Smola:
Stochastic Frank-Wolfe methods for nonconvex optimization. Allerton 2016: 1244-1251
- [c16]Sashank J. Reddi, Suvrit Sra, Barnabás Póczos, Alexander J. Smola:
Fast incremental method for smooth nonconvex optimization. CDC 2016: 1971-1977
- [c15]Sashank J. Reddi, Ahmed Hefny, Suvrit Sra, Barnabás Póczos, Alexander J. Smola:
Stochastic Variance Reduction for Nonconvex Optimization. ICML 2016: 314-323
- [c14]Sashank J. Reddi, Suvrit Sra, Barnabás Póczos, Alexander J. Smola:
Proximal Stochastic Methods for Nonsmooth Nonconvex Finite-Sum Optimization. NIPS 2016: 1145-1153
- [c13]Kumar Avinava Dubey, Sashank J. Reddi, Sinead A. Williamson, Barnabás Póczos, Alexander J. Smola, Eric P. Xing:
Variance Reduction in Stochastic Gradient Langevin Dynamics. NIPS 2016: 1154-1162
- [c12]Hongyi Zhang, Sashank J. Reddi, Suvrit Sra:
Riemannian SVRG: Fast Stochastic Optimization on Riemannian Manifolds. NIPS 2016: 4592-4600
- [i12]Sashank J. Reddi, Suvrit Sra, Barnabás Póczos, Alexander J. Smola:
Fast Incremental Method for Nonconvex Optimization. CoRR abs/1603.06159 (2016)
- [i11]Sashank J. Reddi, Ahmed Hefny, Suvrit Sra, Barnabás Póczos, Alexander J. Smola:
Stochastic Variance Reduction for Nonconvex Optimization. CoRR abs/1603.06160 (2016)
- [i10]Sashank J. Reddi, Suvrit Sra, Barnabás Póczos, Alexander J. Smola:
Fast Stochastic Methods for Nonsmooth Nonconvex Optimization. CoRR abs/1605.06900 (2016)
- [i9]Hongyi Zhang, Sashank J. Reddi, Suvrit Sra:
Fast stochastic optimization on Riemannian manifolds. CoRR abs/1605.07147 (2016)
- [i8]Sashank J. Reddi, Suvrit Sra, Barnabás Póczos, Alexander J. Smola:
Stochastic Frank-Wolfe Methods for Nonconvex Optimization. CoRR abs/1607.08254 (2016)
- [i7]Sashank J. Reddi, Jakub Konecný, Peter Richtárik, Barnabás Póczos, Alexander J. Smola:
AIDE: Fast and Communication Efficient Distributed Optimization. CoRR abs/1608.06879 (2016)
- 2015
- [c11]Sashank Jakkam Reddi, Barnabás Póczos, Alexander J. Smola:
Doubly Robust Covariate Shift Correction. AAAI 2015: 2949-2955
- [c10]Aaditya Ramdas, Sashank Jakkam Reddi, Barnabás Póczos, Aarti Singh, Larry A. Wasserman:
On the Decreasing Power of Kernel and Distance Based Nonparametric Hypothesis Tests in High Dimensions. AAAI 2015: 3571-3577
- [c9]Sashank J. Reddi, Aaditya Ramdas, Barnabás Póczos, Aarti Singh, Larry A. Wasserman:
On the High Dimensional Power of a Linear-Time Two Sample Test under Mean-shift Alternatives. AISTATS 2015
- [c8]Sashank J. Reddi, Ahmed Hefny, Suvrit Sra, Barnabás Póczos, Alexander J. Smola:
On Variance Reduction in Stochastic Gradient Descent and its Asynchronous Variants. NIPS 2015: 2647-2655
- [c7]Sashank J. Reddi, Barnabás Póczos, Alexander J. Smola:
Communication Efficient Coresets for Empirical Loss Minimization. UAI 2015: 752-761
- [c6]Sashank J. Reddi, Ahmed Hefny, Carlton Downey, Avinava Dubey, Suvrit Sra:
Large-scale randomized-coordinate descent methods with non-separable linear constraints. UAI 2015: 762-771
- [i6]Sashank J. Reddi, Ahmed Hefny, Suvrit Sra, Barnabás Póczos, Alexander J. Smola:
On Variance Reduction in Stochastic Gradient Descent and its Asynchronous Variants. CoRR abs/1506.06840 (2015)
- [i5]Aaditya Ramdas, Sashank J. Reddi, Barnabás Póczos, Aarti Singh, Larry A. Wasserman:
Adaptivity and Computation-Statistics Tradeoffs for Kernel and Distance based High Dimensional Two Sample Testing. CoRR abs/1508.00655 (2015)
- 2014
- [c5]Sashank J. Reddi, Barnabás Póczos:
k-NN Regression on Functional Data with Incomplete Observations. UAI 2014: 692-701
- [i4]Sashank J. Reddi, Aaditya Ramdas, Barnabás Póczos, Aarti Singh, Larry A. Wasserman:
Kernel MMD, the Median Heuristic and Distance Correlation in High Dimensions. CoRR abs/1406.2083 (2014)
- [i3]Aaditya Ramdas, Sashank J. Reddi, Barnabás Póczos, Aarti Singh, Larry A. Wasserman:
On the High-dimensional Power of Linear-time Kernel Two-Sample Testing under Mean-difference Alternatives. CoRR abs/1411.6314 (2014)
- 2013
- [c4]Sashank J. Reddi, Barnabás Póczos:
Scale Invariant Conditional Dependence Measures. ICML (3) 2013: 1355-1363
- 2012
- [c3]Sashank Jakkam Reddi, Emma Brunskill:
Incentive Decision Processes. UAI 2012: 418-427
- [c2]Ariel D. Procaccia, Sashank Jakkam Reddi, Nisarg Shah:
A Maximum Likelihood Approach For Selecting Sets of Alternatives. UAI 2012: 695-704
- [i2]Sashank Jakkam Reddi, Emma Brunskill:
Incentive Decision Processes. CoRR abs/1210.4877 (2012)
- [i1]Ariel D. Procaccia, Sashank Jakkam Reddi, Nisarg Shah:
A Maximum Likelihood Approach For Selecting Sets of Alternatives. CoRR abs/1210.4882 (2012)
- 2010
- [c1]Sashank Jakkam Reddi, Sunita Sarawagi, Sundar Vishwanathan:
MAP estimation in Binary MRFs via Bipartite Multi-cuts. NIPS 2010: 955-963
last updated on 2021-01-22 23:24 CET by the dblp team
all metadata released as open data under CC0 1.0 license