P. Sadayappan
Ponnuswamy Sadayappan
Person information
- affiliation: University of Utah, Salt Lake City, UT, USA
- affiliation (former): Ohio State University, Columbus, USA
2020 – today
- 2024
- [j66] Dhabaleswar K. Panda, Vipin Chaudhary, Eric Fosler-Lussier, Raghu Machiraju, Amit Majumdar, Beth Plale, Rajiv Ramnath, Ponnuswamy Sadayappan, Neelima Savardekar, Karen Tomko:
Creating intelligent cyberinfrastructure for democratizing AI. AI Mag. 45(1): 22-28 (2024)
- [c281] Chendi Li, Yufan Xu, Sina Mahdipour Saravani, Ponnuswamy Sadayappan:
Accelerated Auto-Tuning of GPU Kernels for Tensor Computations. ICS 2024: 549-561
- [e8] Gabriel Rodríguez, P. Sadayappan, Aravind Sukumaran-Rajam:
Proceedings of the 33rd ACM SIGPLAN International Conference on Compiler Construction, CC 2024, Edinburgh, United Kingdom, March 2-3, 2024. ACM 2024
- [i16] Tripti Agarwal, Harvey Dam, Dorra Ben Khalifa, Matthieu Martel, P. Sadayappan, Ganesh Gopalakrishnan:
What Operations can be Performed Directly on Compressed Arrays, and with What Error? CoRR abs/2406.11209 (2024)
- [i15] Ashim Gupta, Sina Mahdipour Saravani, P. Sadayappan, Vivek Srikumar:
An Empirical Investigation of Matrix Factorization Methods for Pre-trained Transformers. CoRR abs/2406.11307 (2024)
- 2023
- [j65] Eric Heisler, Aadesh Deshmukh, Sandip Mazumder, Ponnuswamy Sadayappan, Hari Sundar:
Multi-discretization domain specific language and code generation for differential equations. J. Comput. Sci. 68: 101981 (2023)
- [j64] Nicolas Tollenaere, Guillaume Iooss, Stéphane Pouget, Hugo Brunie, Christophe Guillon, Albert Cohen, P. Sadayappan, Fabrice Rastello:
Autotuning Convolutions Is Easier Than You Think. ACM Trans. Archit. Code Optim. 20(2): 20:1-20:24 (2023)
- [c280] Jon Roose, Miheer Vaidya, Ponnuswamy Sadayappan, Sivasankaran Rajamanickam:
TenSQL: An SQL Database Built on GraphBLAS. HPEC 2023: 1-8
- [c279] Han D. Tran, Siddharth Saurav, P. Sadayappan, Sandip Mazumder, Hari Sundar:
Scalable parallelization for the solution of phonon Boltzmann Transport Equation. ICS 2023: 215-226
- [c278] Süreyya Emre Kurt, Jinghua Yan, Aravind Sukumaran-Rajam, Prashant Pandey, P. Sadayappan:
Communication Optimization for Distributed Execution of Graph Neural Networks. IPDPS 2023: 512-523
- [c277] M. Emin Ozturk, Omid Asudeh, Gerald Sabin, P. Sadayappan, Aravind Sukumaran-Rajam:
A Performance Portability Study Using Tensor Contraction Benchmarks. IPDPS Workshops 2023: 591-600
- [c276] Lizhi Xiang, Miao Yin, Chengming Zhang, Aravind Sukumaran-Rajam, P. Sadayappan, Bo Yuan, Dingwen Tao:
TDC: Towards Extremely Efficient CNNs on GPUs via Hardware-Aware Tucker Decomposition. PPoPP 2023: 260-273
- [c275] Martin Kong, Raneem Abu Yosef, Atanas Rountev, P. Sadayappan:
Automatic Generation of Distributed-Memory Mappings for Tensor Computations. SC 2023: 64:1-64:13
- [c274] Tripti Agarwal, Harvey Dam, Ponnuswamy Sadayappan, Ganesh Gopalakrishnan, Dorra Ben Khalifa, Matthieu Martel:
What Operations can be Performed Directly on Compressed Arrays, and with What Error? SC Workshops 2023: 252-262
- [i14] Eric Heisler, Siddharth Saurav, Aadesh Deshmukh, Sandip Mazumder, Ponnuswamy Sadayappan, Hari Sundar:
Automating GPU Scalability for Complex Scientific Models: Phonon Boltzman Transport Equation. CoRR abs/2305.19400 (2023)
- 2022
- [c273] Yufan Xu, Qiwei Yuan, Erik Curtis Barton, Rui Li, P. Sadayappan, Aravind Sukumaran-Rajam:
Effective Performance Modeling and Domain-Specific Compiler Optimization of CNNs for GPUs. PACT 2022: 252-264
- [c272] Lizhi Xiang, P. Sadayappan, Aravind Sukumaran-Rajam:
High-Performance Architecture Aware Sparse Convolutional Neural Networks for GPUs. PACT 2022: 265-278
- [c271] Yufan Xu, Saurabh Raje, Atanas Rountev, Gerald Sabin, Aravind Sukumaran-Rajam, P. Sadayappan:
Training of deep learning pipelines on memory-constrained GPUs via segmented fused-tiled execution. CC 2022: 104-116
- [c270] Miheer Vaidya, Aravind Sukumaran-Rajam, Atanas Rountev, P. Sadayappan:
Comprehensive Accelerator-Dataflow Co-design Optimization for Convolutional Neural Networks. CGO 2022: 325-335
- [c269] Süreyya Emre Kurt, Saurabh Raje, Aravind Sukumaran-Rajam, P. Sadayappan:
Sparsity-Aware Tensor Decomposition. IPDPS 2022: 952-962
- [c268] Philip Munksgaard, Troels Henriksen, Ponnuswamy Sadayappan, Cosmin E. Oancea:
Memory Optimizations in an Array Language. SC 2022: 31:1-31:15
- [d4] Philip Munksgaard, Troels Henriksen, Ponnuswamy Sadayappan, Cosmin E. Oancea:
futhark-mem-sc22. Version v0.1.8. Zenodo, 2022
- [d3] Philip Munksgaard, Troels Henriksen, Ponnuswamy Sadayappan, Cosmin E. Oancea:
futhark-mem-sc22. Version v1.0.0. Zenodo, 2022
- [d2] Philip Munksgaard, Troels Henriksen, Ponnuswamy Sadayappan, Cosmin E. Oancea:
futhark-mem-sc22. Version v1.1.0. Zenodo, 2022
- [d1] Philip Munksgaard, Troels Henriksen, Ponnuswamy Sadayappan, Cosmin E. Oancea:
futhark-mem-sc22. Version v1.1.1. Zenodo, 2022
- [i13] Lizhi Xiang, Miao Yin, Chengming Zhang, Aravind Sukumaran-Rajam, P. Sadayappan, Bo Yuan, Dingwen Tao:
TDC: Towards Extremely Efficient CNNs on GPUs via Hardware-Aware Tucker Decomposition. CoRR abs/2211.03715 (2022)
- [i12] Paolo Bientinesi, David A. Ham, Furong Huang, Paul H. J. Kelly, P. Sadayappan, Edward Stow:
Tensor Computations: Applications and Optimization (Dagstuhl Seminar 22101). Dagstuhl Reports 12(3): 1-14 (2022)
- 2021
- [c267] Rui Li, Yufan Xu, Aravind Sukumaran-Rajam, Atanas Rountev, P. Sadayappan:
Analytical characterization and design space exploration for optimization of CNNs. ASPLOS 2021: 928-942
- [c266] Auguste Olivry, Guillaume Iooss, Nicolas Tollenaere, Atanas Rountev, P. Sadayappan, Fabrice Rastello:
IOOpt: automatic derivation of I/O complexity bounds for affine programs. PLDI 2021: 1187-1202
- [c265] Rui Li, Yufan Xu, Aravind Sukumaran-Rajam, Atanas Rountev, P. Sadayappan:
Efficient Distributed Algorithms for Convolutional Neural Networks. SPAA 2021: 439-442
- [i11] Rui Li, Yufan Xu, Aravind Sukumaran-Rajam, Atanas Rountev, P. Sadayappan:
Analytical Characterization and Design Space Exploration for Optimization of CNNs. CoRR abs/2101.09808 (2021)
- [i10] Rui Li, Yufan Xu, Aravind Sukumaran-Rajam, Atanas Rountev, P. Sadayappan:
Efficient distributed algorithms for Convolutional Neural Networks. CoRR abs/2105.13480 (2021)
- 2020
- [c264] Gordon Euhyun Moon, J. Austin Ellis, Aravind Sukumaran-Rajam, Srinivasan Parthasarathy, P. Sadayappan:
ALO-NMF: Accelerated Locality-Optimized Non-negative Matrix Factorization. KDD 2020: 1758-1767
- [c263] Auguste Olivry, Julien Langou, Louis-Noël Pouchet, P. Sadayappan, Fabrice Rastello:
Automated derivation of parametric data movement lower bounds for affine programs. PLDI 2020: 808-822
- [c262] Jinsung Kim, Ajay Panyala, Bo Peng, Karol Kowalski, P. Sadayappan, Sriram Krishnamoorthy:
Scalable heterogeneous execution of a coupled-cluster model with perturbative triples. SC 2020: 79
- [c261] Süreyya Emre Kurt, Aravind Sukumaran-Rajam, Fabrice Rastello, P. Sadayappan:
Efficient tiled sparse matrix multiplication through matrix signatures. SC 2020: 87
- [c260] Troels Henriksen, Sune Hellfritzsch, Ponnuswamy Sadayappan, Cosmin E. Oancea:
Compiling generalized histograms for GPU. SC 2020: 97
- [e7] Ponnuswamy Sadayappan, Bradford L. Chamberlain, Guido Juckeland, Hatem Ltaief:
High Performance Computing - 35th International Conference, ISC High Performance 2020, Frankfurt/Main, Germany, June 22-25, 2020, Proceedings. Lecture Notes in Computer Science 12151, Springer 2020, ISBN 978-3-030-50742-8
2010 – 2019
- 2019
- [c259] Jiankai Sun, Bortik Bandyopadhyay, Armin Bashizade, Jiongqian Liang, P. Sadayappan, Srinivasan Parthasarathy:
ATP: Directed Graph Embedding with Asymmetric Transitivity Preservation. AAAI 2019: 265-272
- [c258] Jinsung Kim, Aravind Sukumaran-Rajam, Vineeth Thumma, Sriram Krishnamoorthy, Ajay Panyala, Louis-Noël Pouchet, Atanas Rountev, P. Sadayappan:
A Code Generator for High-Performance Tensor Contractions on GPUs. CGO 2019: 85-95
- [c257] Israt Nisa, Jiajia Li, Aravind Sukumaran-Rajam, Richard W. Vuduc, P. Sadayappan:
Load-Balanced Sparse MTTKRP on GPUs. IPDPS 2019: 123-133
- [c256] Prashant Singh Rawat, Miheer Vaidya, Aravind Sukumaran-Rajam, Atanas Rountev, Louis-Noël Pouchet, P. Sadayappan:
On Optimizing Complex Stencils on GPUs. IPDPS 2019: 641-652
- [c255] Changwan Hong, Aravind Sukumaran-Rajam, Israt Nisa, Kunal Singh, P. Sadayappan:
Adaptive sparse tiling for sparse matrix multiplication. PPoPP 2019: 300-314
- [c254] Gordon Euhyun Moon, Denis Newman-Griffis, Jinsung Kim, Aravind Sukumaran-Rajam, Eric Fosler-Lussier, P. Sadayappan:
Parallel Data-Local Training for Optimizing Word2Vec Embeddings for Word and Graph Embeddings. MLHPC@SC 2019: 44-55
- [c253] Israt Nisa, Jiajia Li, Aravind Sukumaran-Rajam, Prashant Singh Rawat, Sriram Krishnamoorthy, P. Sadayappan:
An efficient mixed-mode representation of sparse tensors. SC 2019: 49:1-49:25
- [c252] Rui Li, Aravind Sukumaran-Rajam, Richard Veras, Tze Meng Low, Fabrice Rastello, Atanas Rountev, P. Sadayappan:
Analytical cache modeling and tilesize optimization for tensor contractions. SC 2019: 74:1-74:13
- [e6] Michèle Weiland, Guido Juckeland, Carsten Trinitis, Ponnuswamy Sadayappan:
High Performance Computing - 34th International Conference, ISC High Performance 2019, Frankfurt/Main, Germany, June 16-20, 2019, Proceedings. Lecture Notes in Computer Science 11501, Springer 2019, ISBN 978-3-030-20655-0
- [i9] Israt Nisa, Jiajia Li, Aravind Sukumaran-Rajam, Richard W. Vuduc, P. Sadayappan:
Load-Balanced Sparse MTTKRP on GPUs. CoRR abs/1904.03329 (2019)
- [i8] Gordon Euhyun Moon, Aravind Sukumaran-Rajam, Srinivasan Parthasarathy, P. Sadayappan:
PL-NMF: Parallel Locality-Optimized Non-negative Matrix Factorization. CoRR abs/1904.07935 (2019)
- [i7] Auguste Olivry, Julien Langou, Louis-Noël Pouchet, P. Sadayappan, Fabrice Rastello:
Automated Derivation of Parametric Data Movement Lower Bounds for Affine Programs. CoRR abs/1911.06664 (2019)
- 2018
- [j63] Wenlei Bao, Sriram Krishnamoorthy, Louis-Noël Pouchet, P. Sadayappan:
Analytical modeling of cache behavior for affine programs. Proc. ACM Program. Lang. 2(POPL): 32:1-32:26 (2018)
- [j62] Prashant Singh Rawat, Miheer Vaidya, Aravind Sukumaran-Rajam, Mahesh Ravishankar, Vinod Grover, Atanas Rountev, Louis-Noël Pouchet, P. Sadayappan:
Domain-Specific Optimization and Generation of High-Performance GPU Code for Stencil Computations. Proc. IEEE 106(11): 1902-1920 (2018)
- [c251] Israt Nisa, Aravind Sukumaran-Rajam, Süreyya Emre Kurt, Changwan Hong, P. Sadayappan:
Sampled Dense Matrix Multiplication for High-Performance Machine Learning. HiPC 2018: 32-41
- [c250] Changwan Hong, Aravind Sukumaran-Rajam, Bortik Bandyopadhyay, Jinsung Kim, Süreyya Emre Kurt, Israt Nisa, Shivani Sabhlok, Ümit V. Çatalyürek, Srinivasan Parthasarathy, P. Sadayappan:
Efficient sparse-matrix multi-vector product on GPUs. HPDC 2018: 66-79
- [c249] Gordon Euhyun Moon, Israt Nisa, Aravind Sukumaran-Rajam, Bortik Bandyopadhyay, Srinivasan Parthasarathy, P. Sadayappan:
Parallel Latent Dirichlet Allocation on GPUs. ICCS (2) 2018: 259-272
- [c248] Jinsung Kim, Aravind Sukumaran-Rajam, Changwan Hong, Ajay Panyala, Rohit Kumar Srivastava, Sriram Krishnamoorthy, P. Sadayappan:
Optimizing Tensor Contractions in CCSD(T) for Efficient Execution on GPUs. ICS 2018: 96-106
- [c247] Jyothi Vedurada, Arjun Suresh, Aravind Sukumaran-Rajam, Jinsung Kim, Changwan Hong, Ajay Panyala, Sriram Krishnamoorthy, V. Krishna Nandivada, Rohit Kumar Srivastava, P. Sadayappan:
TTLG - An Efficient Tensor Transposition Library for GPUs. IPDPS 2018: 578-588
- [c246] Israt Nisa, Charles Siegel, Aravind Sukumaran-Rajam, Abhinav Vishnu, P. Sadayappan:
Effective Machine Learning Based Format Selection and Performance Modeling for SpMV on GPUs. IPDPS Workshops 2018: 1056-1065
- [c245] Changwan Hong, Aravind Sukumaran-Rajam, Jinsung Kim, Prashant Singh Rawat, Sriram Krishnamoorthy, Louis-Noël Pouchet, Fabrice Rastello, P. Sadayappan:
GPU code optimization using abstract kernel emulation and sensitivity analysis. PLDI 2018: 736-751
- [c244] Prashant Singh Rawat, Fabrice Rastello, Aravind Sukumaran-Rajam, Louis-Noël Pouchet, Atanas Rountev, P. Sadayappan:
Register optimizations for stencils on GPUs. PPoPP 2018: 168-182
- [c243] Changwan Hong, Aravind Sukumaran-Rajam, Jinsung Kim, Prashant Singh Rawat, Sriram Krishnamoorthy, Louis-Noël Pouchet, Fabrice Rastello, P. Sadayappan:
Performance modeling for GPUs using abstract kernel emulation. PPoPP 2018: 397-398
- [c242] Prashant Singh Rawat, Aravind Sukumaran-Rajam, Atanas Rountev, Fabrice Rastello, Louis-Noël Pouchet, P. Sadayappan:
Associative instruction reordering to alleviate register pressure. SC 2018: 46:1-46:13
- [i6] Jiankai Sun, Bortik Bandyopadhyay, Armin Bashizade, Jiongqian Liang, P. Sadayappan, Srinivasan Parthasarathy:
ATP: Directed Graph Embedding with Asymmetric Transitivity Preservation. CoRR abs/1811.00839 (2018)
- 2017
- [c241] Changwan Hong, Aravind Sukumaran-Rajam, Jinsung Kim, P. Sadayappan:
MultiGraph: Efficient Graph Processing on GPUs. PACT 2017: 27-40
- [c240] Prashant Singh Rawat, Aravind Sukumaran-Rajam, Atanas Rountev, Fabrice Rastello, Louis-Noël Pouchet, P. Sadayappan:
POSTER: Statement Reordering to Alleviate Register Pressure for Stencils on GPUs. PACT 2017: 158-159
- [c239] Gordon Euhyun Moon, Aravind Sukumaran-Rajam, P. Sadayappan:
Parallel LDA with Over-Decomposition. HiPC Workshops 2017: 25-31
- [c238] Süreyya Emre Kurt, Vineeth Thumma, Changwan Hong, Aravind Sukumaran-Rajam, P. Sadayappan:
Characterization of Data Movement Requirements for Sparse Matrix Computations on GPUs. HiPC 2017: 283-293
- [c237] Rakshith Kunchum, Ankur Chaudhry, Aravind Sukumaran-Rajam, Qingpeng Niu, Israt Nisa, P. Sadayappan:
On improving performance of sparse matrix-matrix multiplication on GPUs. ICS 2017: 14:1-14:11
- [c236] Wenlei Bao, Prashant Singh Rawat, Martin Kong, Sriram Krishnamoorthy, Louis-Noël Pouchet, P. Sadayappan:
Efficient Cache Simulation for Affine Computations. LCPC 2017: 65-85
- [c235] Israt Nisa, Aravind Sukumaran-Rajam, Rakshith Kunchum, P. Sadayappan:
Parallel CCD++ on GPU for Matrix Factorization. GPGPU@PPoPP 2017: 73-83
- [c234] Samyam Rajbhandari, Fabrice Rastello, Karol Kowalski, Sriram Krishnamoorthy, P. Sadayappan:
Optimizing the Four-Index Integral Transform Using Data Movement Lower Bounds Analysis. PPoPP 2017: 327-340
- 2016
- [j61] Humayun Arafat, James Dinan, Sriram Krishnamoorthy, Pavan Balaji, P. Sadayappan:
Work stealing for GPU-accelerated parallel programs in a global address space framework. Concurr. Comput. Pract. Exp. 28(13): 3637-3654 (2016)
- [j60] Qingpeng Niu, James Dinan, Sravya Tirukkovalur, Anouar Benali, Jeongnim Kim, Lubos Mitas, Lucas K. Wagner, P. Sadayappan:
Global-view coefficients: a data management solution for parallel quantum Monte Carlo applications. Concurr. Comput. Pract. Exp. 28(13): 3655-3671 (2016)
- [j59] Wenlei Bao, Changwan Hong, Sudheer Chunduri, Sriram Krishnamoorthy, Louis-Noël Pouchet, Fabrice Rastello, P. Sadayappan:
Static and Dynamic Frequency Scaling on Multicore CPUs. ACM Trans. Archit. Code Optim. 13(4): 51:1-51:26 (2016)
- [c233] Prashant Singh Rawat, Changwan Hong, Mahesh Ravishankar, Vinod Grover, Louis-Noël Pouchet, Atanas Rountev, P. Sadayappan:
Resource Conscious Reuse-Driven Tiling for GPUs. PACT 2016: 99-111
- [c232] Lukasz Domagala, Duco van Amstel, Fabrice Rastello, P. Sadayappan:
Register allocation and promotion through combined instruction scheduling and loop unrolling. CC 2016: 143-151
- [c231] Samyam Rajbhandari, Jinsung Kim, Sriram Krishnamoorthy, Louis-Noël Pouchet, Fabrice Rastello, Robert J. Harrison, P. Sadayappan:
On fusing recursive traversals of K-d trees. CC 2016: 152-162
- [c230] Sanket Tavarageri, Wooil Kim, Josep Torrellas, P. Sadayappan:
Compiler Support for Software Cache Coherence. HiPC 2016: 341-350
- [c229] Wooil Kim, Sanket Tavarageri, P. Sadayappan, Josep Torrellas:
Architecting and Programming a Hardware-Incoherent Multiprocessor Cache Hierarchy. IPDPS 2016: 555-565
- [c228] Rajkumar Kettimuthu, Gagan Agrawal, P. Sadayappan, Ian T. Foster:
Differentiated Scheduling of Response-Critical and Best-Effort Wide-Area Data Transfers. IPDPS 2016: 1113-1122
- [c227] Changwan Hong, Wenlei Bao, Albert Cohen, Sriram Krishnamoorthy, Louis-Noël Pouchet, Fabrice Rastello, J. Ramanujam, P. Sadayappan:
Effective padding of multidimensional arrays to avoid cache conflict misses. PLDI 2016: 129-144
- [c226] Wenlei Bao, Sriram Krishnamoorthy, Louis-Noël Pouchet, Fabrice Rastello, P. Sadayappan:
PolyCheck: dynamic verification of iteration space transformations on affine programs. POPL 2016: 539-554
- [c225] Prashant Singh Rawat, Changwan Hong, Mahesh Ravishankar, Vinod Grover, Louis-Noël Pouchet, P. Sadayappan:
Effective resource management for enhancing performance of 2D and 3D stencils on GPUs. GPGPU@PPoPP 2016: 92-102
- [c224] Martin Kong, Louis-Noël Pouchet, P. Sadayappan, Vivek Sarkar:
PIPES: a language and compiler for task-based programming on distributed-memory clusters. SC 2016: 456-467
- [c223] Samyam Rajbhandari, Jinsung Kim, Sriram Krishnamoorthy, Louis-Noël Pouchet, Fabrice Rastello, Robert J. Harrison, P. Sadayappan:
A domain-specific compiler for a parallel multiresolution adaptive numerical simulation environment. SC 2016: 468-479
- [c222] Timothy Carpenter, Fabrice Rastello, P. Sadayappan, Anastasios Sidiropoulos:
Brief Announcement: Approximating the I/O Complexity of One-Shot Red-Blue Pebbling. SPAA 2016: 161-163
- 2015
- [j58] Arash Ashari, Naser Sedaghati, John Eisenlohr, P. Sadayappan:
A model-driven blocking strategy for load balanced sparse matrix-vector multiplication on GPUs. J. Parallel Distributed Comput. 76: 3-15 (2015)
- [j57] Keshav Pingali, J. Ramanujam, P. Sadayappan:
Introduction to the Special Issue on PPoPP'12. ACM Trans. Parallel Comput. 1(2): 9:1-9:2 (2015)
- [c221] Naznin Fauzia, Louis-Noël Pouchet, P. Sadayappan:
Characterizing and enhancing global memory data coalescing on GPUs. CGO 2015: 12-22
- [c220] Naser Sedaghati, Te Mu, Louis-Noël Pouchet, Srinivasan Parthasarathy, P. Sadayappan:
Automatic Selection of Sparse Matrix Representation on GPUs. ICS 2015: 99-108
- [c219] Tobias Grosser, Jagannathan Ramanujam, Louis-Noël Pouchet, P. Sadayappan, Sebastian Pop:
Optimistic Delinearization of Parametrically Sized Arrays. ICS 2015: 351-360
- [c218] Ponnuswamy Sadayappan, Ray-Bing Chen:
iWAPT Invited Talks. IPDPS Workshops 2015: 1202-1203
- [c217] Martin Kong, Louis-Noël Pouchet, Ponnuswamy Sadayappan:
A Roofline-Based Performance Estimator for Distributed Matrix-Multiply on Intel CnC. IPDPS Workshops 2015: 1241-1250
- [c216] Venmugil Elango, Fabrice Rastello, Louis-Noël Pouchet, J. Ramanujam, P. Sadayappan:
On Characterizing the Data Access Complexity of Programs. POPL 2015: 567-580
- [c215] Mahesh Ravishankar, Roshan Dathathri, Venmugil Elango, Louis-Noël Pouchet, J. Ramanujam, Atanas Rountev, P. Sadayappan:
Distributed memory code generation for mixed Irregular/Regular computations. PPoPP 2015: 65-75
- [c214] Arash Ashari, Shirish Tatikonda, Matthias Boehm, Berthold Reinwald, Keith Campbell, John Keenleyside, P. Sadayappan:
On optimizing machine learning workloads via kernel fusion. PPoPP 2015: 173-182
- [c213] Prashant Singh Rawat, Martin Kong, Thomas Henretty, Justin Holewinski, Kevin Stock, Louis-Noël Pouchet, J. Ramanujam, Atanas Rountev, P. Sadayappan:
SDSLc: a multi-target domain-specific compiler for stencil computations. WOLFHPC@SC 2015: 6:1-6:10
- [c212] Rajkumar Kettimuthu, Gayane Vardoyan, Gagan Agrawal, P. Sadayappan, Ian T. Foster:
An elegant sufficiency: load-aware differentiated scheduling of data transfers. SC 2015: 46:1-46:12
- 2014
- [j56] Sriram Krishnamoorthy, J. Ramanujam, P. Sadayappan:
Introduction to the JPDC Special Issue on Domain-Specific Languages and High-Level Frameworks for High-Performance Computing. J. Parallel Distributed Comput. 74(12): 3175 (2014)
- [j55] Tobias Grosser, Sven Verdoolaege, Albert Cohen, P. Sadayappan:
The Relation Between Diamond Tiling and Hexagonal Tiling. Parallel Process. Lett. 24(3) (2014)
- [j54] Martin Kong, Antoniu Pop, Louis-Noël Pouchet, R. Govindarajan, Albert Cohen, P. Sadayappan:
Compiler/Runtime Framework for Dynamic Dataflow Parallelization of Tiled Programs. ACM Trans. Archit. Code Optim. 11(4): 61:1-61:30 (2014)
- [j53] Venmugil Elango, Naser Sedaghati, Fabrice Rastello, Louis-Noël Pouchet, J. Ramanujam, Radu Teodorescu, P. Sadayappan:
On Using the Roofline Model with Lower Bounds on Data Movement. ACM Trans. Archit. Code Optim. 11(4): 67:1-67:23 (2014)
- [j52] Mahesh Ravishankar, John Eisenlohr, Louis-Noël Pouchet, J. Ramanujam, Atanas Rountev, P. Sadayappan:
Automatic parallelization of a class of irregular loops for distributed memory systems. ACM Trans. Parallel Comput. 1(1): 7:1-7:37 (2014)
- [c211] S. M. Faisal, Srinivasan Parthasarathy, P. Sadayappan:
Global graphs: A middleware for large scale graph processing. IEEE BigData 2014: 33-40
- [c210] Rajkumar Kettimuthu, Gayane Vardoyan, Gagan Agrawal, P. Sadayappan:
Modeling and Optimizing Large-Scale Wide-Area Data Transfers. CCGRID 2014: 196-205
- [c209] Tobias Grosser, Albert Cohen, Justin Holewinski, P. Sadayappan, Sven Verdoolaege:
Hybrid Hexagonal/Classical Tiling for GPUs. CGO 2014: 66
- [c208] Qingpeng Niu, Pai-Wei Lai, S. M. Faisal, Srinivasan Parthasarathy, P. Sadayappan:
A fast implementation of MLR-MCL algorithm on multi-core processors. HiPC 2014: 1-10
- [c207] Samyam Rajbhandari, Akshay Nikam, Pai-Wei Lai, Kevin Stock, Sriram Krishnamoorthy, P. Sadayappan:
CAST: Contraction Algorithm for Symmetric Tensors. ICPP 2014: 261-272
- [c206] Humayun Arafat, Sriram Krishnamoorthy, P. Sadayappan:
Checksumming Strategies for Data in Volatile Memories. ICPP Workshops 2014: 245-254
- [c205] Wenlei Bao, Sanket Tavarageri, Füsun Özgüner, P. Sadayappan:
PWCET: Power-Aware Worst Case Execution Time Analysis. ICPP Workshops 2014: 439-447
- [c204] Arash Ashari, Naser Sedaghati, John Eisenlohr, P. Sadayappan:
An efficient two-dimensional blocking strategy for sparse matrix-vector multiplication on GPUs. ICS 2014: 273-282
- [c203] Shoaib Kamil, Saman P. Amarasinghe, P. Sadayappan:
WOSC 2014: second workshop on optimizing stencil computations. SPLASH (Companion Volume) 2014: 89-90
- [c202] Kevin Stock, Martin Kong, Tobias Grosser, Louis-Noël Pouchet, Fabrice Rastello, J. Ramanujam, P. Sadayappan:
A framework for enhancing data reuse via associative reordering. PLDI 2014: 65-76
- [c201] Sanket Tavarageri, Sriram Krishnamoorthy, P. Sadayappan:
Compiler-assisted detection of transient memory errors. PLDI 2014: 204-215
- [c200]