Piotr Luszczek
Person information
- affiliation: University of Tennessee, Knoxville, TN, USA
2020 – today
- 2024
- [j50]Piotr Luszczek, Ahmad Abdelfattah, Hartwig Anzt, Atsushi Suzuki, Stanimire Tomov:
Batched sparse and mixed-precision linear algebra interface for efficient use of GPU hardware accelerators in scientific applications. Future Gener. Comput. Syst. 160: 359-374 (2024) - 2023
- [j49]Piotr Luszczek, Wissam M. Sid-Lakhdar, Jack J. Dongarra:
Combining multitask and transfer learning with deep Gaussian processes for autotuning-based performance engineering. Int. J. High Perform. Comput. Appl. 37(3-4): 229-244 (2023) - [c82]Piotr Luszczek, Tokey Tahmid:
Towards the FAIR Asset Tracking Across Models, Datasets, and Performance Evaluation Scenarios. HPEC 2023: 1-6 - [c81]Neil Lindquist, Piotr Luszczek, Jack J. Dongarra:
Using Additive Modifications in LU Factorization Instead of Pivoting. ICS 2023: 14-24 - [c80]Wissam M. Sid-Lakhdar, Sébastien Cayrols, Daniel Bielich, Ahmad Abdelfattah, Piotr Luszczek, Mark Gates, Stanimire Tomov, Hans Johansen, David B. Williams-Young, Timothy A. Davis, Jack J. Dongarra, Hartwig Anzt:
PAQR: Pivoting Avoiding QR factorization. IPDPS 2023: 322-332 - [c79]Ahmad Abdelfattah, Stanimire Tomov, Piotr Luszczek, Hartwig Anzt, Jack J. Dongarra:
GPU-based LU Factorization and Solve on Batches of Matrices with Band Structure. SC Workshops 2023: 1670-1679 - [i10]Riley Murray, James Demmel, Michael W. Mahoney, N. Benjamin Erichson, Maksim Melnichenko, Osman Asif Malik, Laura Grigori, Piotr Luszczek, Michal Derezinski, Miles E. Lopes, Tianyu Liang, Hengrui Luo, Jack J. Dongarra:
Randomized Numerical Linear Algebra : A Perspective on the Field With an Eye to Software. CoRR abs/2302.11474 (2023) - [i9]Maksim Melnichenko, Oleg Balabanov, Riley Murray, James Demmel, Michael W. Mahoney, Piotr Luszczek:
CholeskyQR with Randomization and Pivoting for Tall Matrices (CQRRPT). CoRR abs/2311.08316 (2023) - [i8]Neil Lindquist, Piotr Luszczek, Jack J. Dongarra:
Generalizing Random Butterfly Transforms to Arbitrary Matrix Sizes. CoRR abs/2312.09376 (2023) - 2022
- [j48]Cody J. Balos, Piotr Luszczek, Sarah Osborn, James M. Willenbring, Ulrike Meier Yang:
Challenges of and Opportunities for a Large Diverse Software Team. Comput. Sci. Eng. 24(3): 16-24 (2022) - [j47]Dmitry A. Zaitsev, Tatiana R. Shmeleva, Piotr Luszczek:
Aggregation of clans to speed-up solving linear systems on parallel architectures. Int. J. Parallel Emergent Distributed Syst. 37(2): 198-219 (2022) - [j46]Seonmyeong Bak, Colleen Bertoni, Swen Boehm, Reuben D. Budiardja, Barbara M. Chapman, Johannes Doerfert, Markus Eisenbach, Hal Finkel, Oscar R. Hernandez, Joseph Huber, Shintaro Iwasaki, Vivek Kale, Paul R. C. Kent, JaeHyuk Kwack, Meifeng Lin, Piotr Luszczek, Ye Luo, Buu Pham, Swaroop Pophale, Kiran Ravikumar, Vivek Sarkar, Thomas Scogland, Shilei Tian, P. K. Yeung:
OpenMP application experiences: Porting to accelerated nodes. Parallel Comput. 109: 102856 (2022) - [j45]Neil Lindquist, Piotr Luszczek, Jack J. Dongarra:
Accelerating Restarted GMRES With Mixed Precision Arithmetic. IEEE Trans. Parallel Distributed Syst. 33(4): 1027-1037 (2022) - [c78]James Demmel, Jack J. Dongarra, Mark Gates, Greg Henry, Julien Langou, Xiaoye S. Li, Piotr Luszczek, Weslley S. Pereira, E. Jason Riedy, Cindy Rubio-González:
Proposed Consistent Exception Handling for the BLAS and LAPACK. Correctness@SC 2022: 1-9 - [c77]Piotr Luszczek, Cade Brown:
Surrogate ML/AI Model Benchmarking for FAIR Principles' Conformance. HPEC 2022: 1-5 - [c76]Wissam M. Sid-Lakhdar, Mohsen Aznaveh, Piotr Luszczek, Jack J. Dongarra:
Deep Gaussian process with multitask and transfer learning for performance optimization. HPEC 2022: 1-7 - [c75]Ichitaro Yamazaki, Christian Glusa, Jennifer A. Loe, Piotr Luszczek, Sivasankaran Rajamanickam, Jack J. Dongarra:
High-Performance GMRES Multi-Precision Benchmark: Design, Performance, and Challenges. PMBS@SC 2022: 112-122 - [c74]Neil Lindquist, Mark Gates, Piotr Luszczek, Jack J. Dongarra:
Threshold Pivoting for Dense LU Factorization. ScalAH@SC 2022: 34-42 - [c73]Yaohung M. Tsai, Piotr Luszczek, Jack J. Dongarra:
Mixed-Precision Algorithm for Finding Selected Eigenvalues and Eigenvectors of Symmetric and Hermitian Matrices. ScalAH@SC 2022: 43-50 - [c72]Jeyan Thiyagalingam, Gregor von Laszewski, Junqi Yin, Murali Emani, Juri Papay, Gregg Barrett, Piotr Luszczek, Aristeidis Tsaris, Christine R. Kirkpatrick, Feiyi Wang, Tom Gibbs, Venkatram Vishwanath, Mallikarjun Shankar, Geoffrey C. Fox, Tony Hey:
AI Benchmarking for Science: Efforts from the MLCommons Science Working Group. ISC Workshops 2022: 47-64 - [e4]Ana Lucia Varbanescu, Abhinav Bhatele, Piotr Luszczek, Marc Baboulin:
High Performance Computing - 37th International Conference, ISC High Performance 2022, Hamburg, Germany, May 29 - June 2, 2022, Proceedings. Lecture Notes in Computer Science 13289, Springer 2022, ISBN 978-3-031-07311-3 - [e3]Hartwig Anzt, Amanda Bienz, Piotr Luszczek, Marc Baboulin:
High Performance Computing. ISC High Performance 2022 International Workshops - Hamburg, Germany, May 29 - June 2, 2022, Revised Selected Papers. Lecture Notes in Computer Science 13387, Springer 2022, ISBN 978-3-031-23219-0 - [d3]Neil Lindquist, Mark Gates, Piotr Luszczek, Jack J. Dongarra:
Software for "Threshold Pivoting for dense LU Factorization". Version 2. Zenodo, 2022 - [d2]Neil Lindquist, Piotr Luszczek, Jack J. Dongarra:
Software for "Threshold Pivoting in LU Factorizations". Version 1. Zenodo, 2022 - [i7]James Demmel, Jack J. Dongarra, Mark Gates, Greg Henry, Julien Langou, Xiaoye S. Li, Piotr Luszczek, Weslley da Silva Pereira, E. Jason Riedy, Cindy Rubio-González:
Proposed Consistent Exception Handling for the BLAS and LAPACK. CoRR abs/2207.09281 (2022) - 2021
- [j44]Adam Spannaus, Kody J. H. Law, Piotr Luszczek, Farzana Nasrin, Cassie Putman Micucci, Peter K. Liaw, Louis Joseph Santodonato, David J. Keffer, Vasileios Maroulas:
Materials Fingerprinting Classification. Comput. Phys. Commun. 266: 108019 (2021) - [j43]Ahmad Abdelfattah, Hartwig Anzt, Erik G. Boman, Erin C. Carson, Terry Cojean, Jack J. Dongarra, Alyson Fox, Mark Gates, Nicholas J. Higham, Xiaoye S. Li, Jennifer A. Loe, Piotr Luszczek, Srikara Pranesh, Siva Rajamanickam, Tobias Ribizel, Barry F. Smith, Kasia Swirydowicz, Stephen J. Thomas, Stanimire Tomov, Yaohung M. Tsai, Ulrike Meier Yang:
A survey of numerical linear algebra methods utilizing mixed-precision arithmetic. Int. J. High Perform. Comput. Appl. 35(4) (2021) - [j42]Jack J. Dongarra, Mark Gates, Piotr Luszczek, Stanimire Tomov:
Translational process: Mathematical software perspective. J. Comput. Sci. 52: 101216 (2021) - [j41]Ahmad Abdelfattah, Timothy B. Costa, Jack J. Dongarra, Mark Gates, Azzam Haidar, Sven Hammarling, Nicholas J. Higham, Jakub Kurzak, Piotr Luszczek, Stanimire Tomov, Mawussi Zounon:
A Set of Batched Basic Linear Algebra Subprograms and LAPACK Routines. ACM Trans. Math. Softw. 47(3): 21:1-21:23 (2021) - [c71]Seonmyeong Bak, Oscar R. Hernandez, Mark Gates, Piotr Luszczek, Vivek Sarkar:
Task-graph scheduling extensions for efficient synchronization and communication. ICS 2021: 88-101 - [e2]Bradford L. Chamberlain, Ana Lucia Varbanescu, Hatem Ltaief, Piotr Luszczek:
High Performance Computing - 36th International Conference, ISC High Performance 2021, Virtual Event, June 24 - July 2, 2021, Proceedings. Lecture Notes in Computer Science 12728, Springer 2021, ISBN 978-3-030-78712-7 - [e1]Heike Jagode, Hartwig Anzt, Hatem Ltaief, Piotr Luszczek:
High Performance Computing - ISC High Performance Digital 2021 International Workshops, Frankfurt am Main, Germany, June 24 - July 2, 2021, Revised Selected Papers. Lecture Notes in Computer Science 12761, Springer 2021, ISBN 978-3-030-90538-5 - [i6]Adam Spannaus, Kody J. H. Law, Piotr Luszczek, Farzana Nasrin, Cassie Putman Micucci, Peter K. Liaw, Louis Joseph Santodonato, David J. Keffer, Vasileios Maroulas:
Materials Fingerprinting Classification. CoRR abs/2101.05808 (2021) - 2020
- [c70]Dmitry Zaitsev, Piotr Luszczek:
Docker container based PaaS cloud computing comprehensive benchmarks using LAPACK. CMIS 2020: 323-337 - [c69]Piotr Luszczek, Yaohung M. Tsai, Neil Lindquist, Hartwig Anzt, Jack J. Dongarra:
Scalable Data Generation for Evaluating Mixed-Precision Solvers. HPEC 2020: 1-6 - [c68]Yu Pei, Qinglei Cao, George Bosilca, Piotr Luszczek, Victor Eijkhout, Jack J. Dongarra:
Communication Avoiding 2D Stencil Implementations over PaRSEC Task-Based Runtime. IPDPS Workshops 2020: 721-729 - [c67]Neil Lindquist, Piotr Luszczek, Jack J. Dongarra:
Replacing Pivoting in Distributed Gaussian Elimination with Randomized Techniques. ScalA@SC 2020: 35-43 - [c66]Neil Lindquist, Piotr Luszczek, Jack J. Dongarra:
Improving the Performance of the GMRES Method Using Mixed-Precision Techniques. SMC 2020: 51-66 - [d1]Neil Lindquist, Piotr Luszczek, Jack J. Dongarra:
Software for Linear Algebra Targeting Exascale (SLATE) with a Recursive Butterfly Transform based solver. Zenodo, 2020 - [i5]Ahmad Abdelfattah, Hartwig Anzt, Erik G. Boman, Erin C. Carson, Terry Cojean, Jack J. Dongarra, Mark Gates, Thomas Grützmacher, Nicholas J. Higham, Xiaoye Sherry Li, Neil Lindquist, Yang Liu, Jennifer A. Loe, Piotr Luszczek, Pratik Nayak, Srikara Pranesh, Sivasankaran Rajamanickam, Tobias Ribizel, Barry Smith, Kasia Swirydowicz, Stephen J. Thomas, Stanimire Tomov, Yaohung M. Tsai, Ichitaro Yamazaki, Ulrike Meier Yang:
A Survey of Numerical Methods Utilizing Mixed Precision Arithmetic. CoRR abs/2007.06674 (2020) - [i4]Neil Lindquist, Piotr Luszczek, Jack J. Dongarra:
Improving the Performance of the GMRES Method using Mixed-Precision Techniques. CoRR abs/2011.01850 (2020) - [i3]Seonmyeong Bak, Oscar R. Hernandez, Mark Gates, Piotr Luszczek, Vivek Sarkar:
Task-Graph Scheduling Extensions for Efficient Synchronization and Communication. CoRR abs/2011.03196 (2020)
2010 – 2019
- 2019
- [j40]Jack J. Dongarra, Mark Gates, Azzam Haidar, Jakub Kurzak, Piotr Luszczek, Panruo Wu, Ichitaro Yamazaki, Asim YarKhan, Maksims Abalenkovs, Negin Bagherpour, Sven Hammarling, Jakub Sístek, David Stevens, Mawussi Zounon, Samuel D. Relton:
PLASMA: Parallel Linear Algebra Software for Multicore Using OpenMP. ACM Trans. Math. Softw. 45(2): 16:1-16:35 (2019) - [c65]Piotr Luszczek, Ichitaro Yamazaki, Jack J. Dongarra:
Increasing Accuracy of Iterative Refinement in Limited Floating-Point Arithmetic on Half-Precision Accelerators. HPEC 2019: 1-6 - [c64]Anthony Danalis, Heike Jagode, Thomas Hérault, Piotr Luszczek, Jack J. Dongarra:
Software-Defined Events through PAPI. IPDPS Workshops 2019: 363-372 - 2018
- [j39]Joseph Dorris, Asim YarKhan, Jakub Kurzak, Piotr Luszczek, Jack J. Dongarra:
Task based Cholesky decomposition on Xeon Phi architectures using OpenMP. Int. J. Comput. Sci. Eng. 17(3): 310-323 (2018) - [j38]Jack J. Dongarra, Mark Gates, Jakub Kurzak, Piotr Luszczek, Yaohung M. Tsai:
Autotuning Numerical Dense Linear Algebra for Batched Computation With GPU Hardware Accelerators. Proc. IEEE 106(11): 2040-2055 (2018) - [j37]Jack J. Dongarra, Mark Gates, Azzam Haidar, Jakub Kurzak, Piotr Luszczek, Stanimire Tomov, Ichitaro Yamazaki:
The Singular Value Decomposition: Anatomy of Optimizing an Algorithm for Extreme Scale. SIAM Rev. 60(4): 808-865 (2018) - [j36]Piotr Luszczek, Jakub Kurzak, Ichitaro Yamazaki, David J. Keffer, Vasileios Maroulas, Jack J. Dongarra:
Autotuning Techniques for Performance-Portable Point Set Registration in 3D. Supercomput. Front. Innov. 5(4): 42-61 (2018) - 2017
- [j35]Jack J. Dongarra, Stanimire Tomov, Piotr Luszczek, Jakub Kurzak, Mark Gates, Ichitaro Yamazaki, Hartwig Anzt, Azzam Haidar, Ahmad Abdelfattah:
With Extreme Computing, the Rules Have Changed. Comput. Sci. Eng. 19(3): 52-62 (2017) - [j34]Asim YarKhan, Jakub Kurzak, Piotr Luszczek, Jack J. Dongarra:
Porting the PLASMA Numerical Library to the OpenMP Standard. Int. J. Parallel Program. 45(3): 612-633 (2017) - [j33]Jakub Kurzak, Piotr Luszczek, Ichitaro Yamazaki, Yves Robert, Jack J. Dongarra:
Design and Implementation of the PULSAR Programming System for Large Scale Computing. Supercomput. Front. Innov. 4(1): 4-26 (2017) - [c63]Piotr Luszczek, Jakub Kurzak, Ichitaro Yamazaki, David J. Keffer, Jack J. Dongarra:
Scaling point set registration in 3D across thread counts on multicore and hardware accelerator platforms through autotuning for large scale analysis of scientific point clouds. IEEE BigData 2017: 2893-2902 - [c62]Piotr Luszczek, Jakub Kurzak, Ichitaro Yamazaki, Jack J. Dongarra:
Towards numerical benchmark for half-precision floating point arithmetic. HPEC 2017: 1-5 - [c61]Ichitaro Yamazaki, Mark Hoemmen, Piotr Luszczek, Jack J. Dongarra:
Improving Performance of GMRES by Reducing Communication and Pipelining Global Collectives. IPDPS Workshops 2017: 1118-1127 - [c60]Mark Gates, Jakub Kurzak, Piotr Luszczek, Yu Pei, Jack J. Dongarra:
Autotuning Batch Cholesky Factorization in CUDA with Interleaved Layout of Matrices. IPDPS Workshops 2017: 1408-1417 - [p3]Hartwig Anzt, Jack J. Dongarra, Mark Gates, Jakub Kurzak, Piotr Luszczek, Stanimire Tomov, Ichitaro Yamazaki:
Bringing High Performance Computing to Big Data Algorithms. Handbook of Big Data Technologies 2017: 777-806 - [i2]Micah Beck, Terry Moore, Piotr Luszczek:
Interoperable Convergence of Storage, Networking and Computation. CoRR abs/1706.07519 (2017) - 2016
- [j32]Ahmad Abdelfattah, Hartwig Anzt, Jack J. Dongarra, Mark Gates, Azzam Haidar, Jakub Kurzak, Piotr Luszczek, Stanimire Tomov, Ichitaro Yamazaki, Asim YarKhan:
Linear algebra software for large-scale accelerated multicore computing. Acta Numer. 25: 1-160 (2016) - [j31]Jack J. Dongarra, Michael A. Heroux, Piotr Luszczek:
High-performance conjugate-gradient benchmark: A new metric for ranking high-performance computing systems. Int. J. High Perform. Comput. Appl. 30(1): 3-10 (2016) - [c59]Chris J. Newburn, Gaurav Bansal, Michael Wood, Luis Crivelli, Judit Planas, Alejandro Duran, Paulo Souza, Leonardo Borges, Piotr Luszczek, Stanimire Tomov, Jack J. Dongarra, Hartwig Anzt, Mark Gates, Azzam Haidar, Yulu Jia, Khairul Kabir, Ichitaro Yamazaki, Jesús Labarta:
Heterogeneous Streaming. IPDPS Workshops 2016: 611-620 - [c58]Yulu Jia, Piotr Luszczek, Jack J. Dongarra:
Hessenberg Reduction with Transient Error Resilience on GPU-Based Hybrid Architectures. IPDPS Workshops 2016: 653-662 - [c57]Piotr Luszczek, Mark Gates, Jakub Kurzak, Anthony Danalis, Jack J. Dongarra:
Search Space Generation and Pruning System for Autotuners. IPDPS Workshops 2016: 1545-1554 - [c56]Yaohung M. Tsai, Piotr Luszczek, Jakub Kurzak, Jack J. Dongarra:
Performance-Portable Autotuning of OpenCL Kernels for Convolutional Layers of Deep Neural Networks. MLHPC@SC 2016: 9-18 - [c55]Joseph Dorris, Jakub Kurzak, Piotr Luszczek, Asim YarKhan, Jack J. Dongarra:
Task-Based Cholesky Decomposition on Knights Corner Using OpenMP. ISC Workshops 2016: 544-562 - 2015
- [j30]Simplice Donfack, Jack J. Dongarra, Mathieu Faverge, Mark Gates, Jakub Kurzak, Piotr Luszczek, Ichitaro Yamazaki:
A survey of recent developments in parallel implementations of Gaussian elimination. Concurr. Comput. Pract. Exp. 27(5): 1292-1309 (2015) - [j29]Hartwig Anzt, Blake Haugen, Jakub Kurzak, Piotr Luszczek, Jack J. Dongarra:
Experiences in autotuning matrix multiplication for energy minimization on GPUs. Concurr. Comput. Pract. Exp. 27(17): 5096-5113 (2015) - [j28]Hartwig Anzt, Stanimire Tomov, Piotr Luszczek, William B. Sawyer, Jack J. Dongarra:
Acceleration of GPU-based Krylov solvers via data transfer reduction. Int. J. High Perform. Comput. Appl. 29(3): 366-383 (2015) - [j27]Jack J. Dongarra, Mark Gates, Azzam Haidar, Yulu Jia, Khairul Kabir, Piotr Luszczek, Stanimire Tomov:
HPC Programming on Intel Many-Integrated-Core Hardware with MAGMA Port to Xeon Phi. Sci. Program. 2015: 502593:1-502593:11 (2015) - [j26]Jack J. Dongarra, Maksims Abalenkovs, Ahmad Abdelfattah, Mark Gates, Azzam Haidar, Jakub Kurzak, Piotr Luszczek, Stanimire Tomov, Ichitaro Yamazaki, Asim YarKhan:
Parallel Programming Models for Dense Linear Algebra on Heterogeneous Systems. Supercomput. Front. Innov. 2(4): 67-86 (2015) - [c54]Azzam Haidar, Asim YarKhan, Chongxiao Cao, Piotr Luszczek, Stanimire Tomov, Jack J. Dongarra:
Flexible Linear Algebra Development and Scheduling with Cholesky Factorization. HPCC/CSS/ICESS 2015: 861-864 - [c53]Azzam Haidar, Stanimire Tomov, Piotr Luszczek, Jack J. Dongarra:
MAGMA embedded: Towards a dense linear algebra library for energy efficient extreme computing. HPEC 2015: 1-6 - [c52]Azzam Haidar, Tingxing Dong, Piotr Luszczek, Stanimire Tomov, Jack J. Dongarra:
Optimization for performance and energy for batched matrix computations on GPUs. GPGPU@PPoPP 2015: 59-69 - [c51]Azzam Haidar, Tingxing Dong, Piotr Luszczek, Stanimire Tomov, Jack J. Dongarra:
Towards batched linear solvers on accelerated hardware platforms. PPoPP 2015: 261-262 - [c50]Azzam Haidar, Yulu Jia, Piotr Luszczek, Stanimire Tomov, Asim YarKhan, Jack J. Dongarra:
Weighted dynamic scheduling with many parallelism grains for offloading of numerical workloads to multiple varied accelerators. ScalA@SC 2015: 5:1-5:8 - [c49]Ichitaro Yamazaki, Jakub Kurzak, Piotr Luszczek, Jack J. Dongarra:
Randomized algorithms to update partial singular value decomposition on a hybrid CPU/GPU cluster. SC 2015: 59:1-59:12 - [c48]Théo Mary, Ichitaro Yamazaki, Jakub Kurzak, Piotr Luszczek, Stanimire Tomov, Jack J. Dongarra:
Performance of random sampling for computing low-rank approximations of a dense matrix on GPUs. SC 2015: 60:1-60:11 - [c47]Azzam Haidar, Tingxing Tim Dong, Stanimire Tomov, Piotr Luszczek, Jack J. Dongarra:
A Framework for Batched and GPU-Resident Factorization Algorithms Applied to Block Householder Transformations. ISC 2015: 31-47 - 2014
- [j25]Anthony Danalis, Piotr Luszczek, Gabriel Marin, Jeffrey S. Vetter, Jack J. Dongarra:
BlackjackBench: Portable Hardware Characterization with Automated Results' Analysis. Comput. J. 57(7): 1002-1016 (2014) - [j24]Jack J. Dongarra, Mathieu Faverge, Hatem Ltaief, Piotr Luszczek:
Achieving numerical accuracy and high performance using recursive tile LU factorization with partial pivoting. Concurr. Comput. Pract. Exp. 26(7): 1408-1431 (2014) - [j23]Piotr Luszczek, Jakub Kurzak, Jack J. Dongarra:
Looking back at dense linear algebra software. J. Parallel Distributed Comput. 74(7): 2548-2560 (2014) - [j22]Ichitaro Yamazaki, Jakub Kurzak, Piotr Luszczek, Jack J. Dongarra:
Design and Implementation of a Large Scale Tree-Based QR Decomposition Using a 3D Virtual Systolic Array and a Lightweight Runtime. Parallel Process. Lett. 24(4) (2014) - [j21]Jack J. Dongarra, Azzam Haidar, Jakub Kurzak, Piotr Luszczek, Stanimire Tomov, Asim YarKhan:
Model-Driven One-Sided Factorizations on Multicore Accelerated Systems. Supercomput. Front. Innov. 1(1): 85-115 (2014) - [c46]Tingxing Dong, Azzam Haidar, Piotr Luszczek, James Austin Harris, Stanimire Tomov, Jack J. Dongarra:
LU Factorization of Small Matrices: Accelerating Batched DGETRF on the GPU. HPCC/CSS/ICESS 2014: 157-160 - [c45]Blake Haugen, Jakub Kurzak, Asim YarKhan, Piotr Luszczek, Jack J. Dongarra:
Parallel Simulation of Superscalar Scheduling. ICPP 2014: 121-130 - [c44]Azzam Haidar, Chongxiao Cao, Asim YarKhan, Piotr Luszczek, Stanimire Tomov, Khairul Kabir, Jack J. Dongarra:
Unified Development for Mixed Multi-GPU and Multi-coprocessor Environments Using a Lightweight Runtime Environment. IPDPS 2014: 491-500 - [c43]Hartwig Anzt, William B. Sawyer, Stanimire Tomov, Piotr Luszczek, Ichitaro Yamazaki, Jack J. Dongarra:
Optimizing Krylov Subspace Solvers on Graphics Processing Units. IPDPS Workshops 2014: 941-949 - [c42]Azzam Haidar, Piotr Luszczek, Jack J. Dongarra:
New Algorithm for Computing Eigenvectors of the Symmetric Eigenvalue Problem. IPDPS Workshops 2014: 1150-1159 - [c41]Ichitaro Yamazaki, Jakub Kurzak, Piotr Luszczek, Jack J. Dongarra:
Design and Implementation of a Large Scale Tree-Based QR Decomposition Using a 3D Virtual Systolic Array and a Lightweight Runtime. IPDPS Workshops 2014: 1495-1504 - [c40]Chongxiao Cao, Jack J. Dongarra, Peng Du, Mark Gates, Piotr Luszczek, Stanimire Tomov:
clMAGMA: high performance dense linear algebra with OpenCL. IWOCL 2014: 1:1-1:9 - [c39]Chongxiao Cao, Mark Gates, Azzam Haidar, Piotr Luszczek, Stanimire Tomov, Ichitaro Yamazaki, Jack J. Dongarra:
Performance and portability with OpenCL for throughput-oriented HPC workloads across accelerators, coprocessors, and multicore processors. ScalA@SC 2014: 61-68 - [c38]Azzam Haidar, Piotr Luszczek, Stanimire Tomov, Jack J. Dongarra:
Heterogenous Acceleration for Linear Algebra in Multi-coprocessor Environments. VECPAR 2014: 31-42 - [p2]Jack J. Dongarra, Mark Gates, Azzam Haidar, Jakub Kurzak, Piotr Luszczek, Stanimire Tomov, Ichitaro Yamazaki:
Accelerating Numerical Dense Linear Algebra Calculations with GPUs. Numerical Computations with GPUs 2014: 3-28 - 2013
- [j20]Peng Du, Piotr Luszczek, Stanimire Tomov, Jack J. Dongarra:
Soft error resilient QR factorization for hybrid system with GPGPU. J. Comput. Sci. 4(6): 457-464 (2013) - [j19]Hatem Ltaief, Piotr Luszczek, Jack J. Dongarra:
High-performance bidiagonal reduction using tile algorithms on homogeneous multicore architectures. ACM Trans. Math. Softw. 39(3): 16:1-16:22 (2013) - [j18]Jakub Kurzak, Piotr Luszczek, Mathieu Faverge, Jack J. Dongarra:
LU Factorization with Partial Pivoting for a Multicore System with Accelerators. IEEE Trans. Parallel Distributed Syst. 24(8): 1613-1621 (2013) - [c37]Guillaume Aupy, Mathieu Faverge, Yves Robert, Jakub Kurzak, Piotr Luszczek, Jack J. Dongarra:
Implementing a Systolic Algorithm for QR Factorization on Multicore Clusters with PaRSEC. Euro-Par Workshops 2013: 657-667 - [c36]Jakub Kurzak, Piotr Luszczek, Mark Gates, Ichitaro Yamazaki, Jack J. Dongarra:
Virtual Systolic Array for QR Decomposition. IPDPS 2013: 251-260 - [c35]Jack J. Dongarra, Mark Gates, Azzam Haidar, Yulu Jia, Khairul Kabir, Piotr Luszczek, Stanimire Tomov:
Portable HPC Programming on Intel Many-Integrated-Core Hardware with MAGMA Port to Xeon Phi. PPAM (1) 2013: 571-581 - [c34]Yulu Jia, Piotr Luszczek, George Bosilca, Jack J. Dongarra:
CPU-GPU hybrid bidiagonal reduction with soft error resilience. ScalA@SC 2013: 2:1-2:5 - [c33]Yulu Jia, George Bosilca, Piotr Luszczek, Jack J. Dongarra:
Parallel reduction to hessenberg form with algorithm-based fault tolerance. SC 2013: 88:1-88:11 - [c32]Azzam Haidar, Jakub Kurzak, Piotr Luszczek:
An improved parallel singular value algorithm and its implementation for multicore hardware. SC 2013: 90:1-90:12 - 2012
- [j17]Hatem Ltaief, Piotr Luszczek, Jack J. Dongarra:
Profiling high performance dense linear algebra algorithms on multicore architectures for power and energy efficiency. Comput. Sci. Res. Dev. 27(4): 277-287 (2012) - [j16]Peng Du, Rick Weber, Piotr Luszczek, Stanimire Tomov, Gregory D. Peterson, Jack J. Dongarra:
From CUDA to OpenCL: Towards a performance-portable solution for multi-platform GPU programming. Parallel Comput. 38(8): 391-407 (2012) - [j15]Anthony Danalis, Piotr Luszczek, Gabriel Marin, Jeffrey S. Vetter, Jack J. Dongarra:
BlackjackBench: portable hardware characterization. SIGMETRICS Perform. Evaluation Rev. 40(2): 74-79 (2012) - [c31]Jack J. Dongarra, Hatem Ltaief, Piotr Luszczek, Vincent M. Weaver:
Energy Footprint of Advanced Dense Numerical Linear Algebra Using Tile Algorithms on Multicore Architectures. CGC 2012: 274-281 - [c30]Hartwig Anzt, Piotr Luszczek, Jack J. Dongarra, Vincent Heuveline:
GPU-Accelerated Asynchronous Error Correction for Mixed Precision Iterative Refinement. Euro-Par 2012: 908-919 - [c29]George Bosilca, Aurélien Bouteiller, Anthony Danalis, Thomas Hérault, Jakub Kurzak, Piotr Luszczek, Stanimire Tomov, Jack J. Dongarra:
Scalable Dense Linear Algebra on Heterogeneous Hardware. High Performance Computing Workshop (2) 2012: 65-103 - [c28]Jack J. Dongarra, Piotr Luszczek:
Anatomy of a globally recursive embedded LINPACK benchmark. HPEC 2012: 1-6 - [c27]Vincent M. Weaver, Matt Johnson, Kiran Kasichayanula, James Ralph, Piotr Luszczek, Daniel Terpstra, Shirley Moore:
Measuring Energy and Power with PAPI. ICPP Workshops 2012: 262-268 - [c26]