


Torsten Hoefler
Torsten Höfler
Person information

- affiliation: ETH Zürich
2020 – today
- 2022
- [j52]Torsten Hoefler, Ariel Hendel, Duncan Roweth:
The Convergence of Hyperscale Data Center and High-Performance Computing Networks. Computer 55(7): 29-37 (2022) - [j51]Torsten Hoefler:
Benchmarking Data Science: 12 Ways to Lie With Statistics and Performance on Parallel Computers. Computer 55(8): 49-56 (2022) - [j50]Marcin Copik, Tobias Grosser, Torsten Hoefler, Paolo Bientinesi, Benjamin Berkels:
Work-Stealing Prefix Scan: Addressing Load Imbalance in Large-Scale Image Registration. IEEE Trans. Parallel Distributed Syst. 33(3): 523-535 (2022) - [c219]Andrea Cossettini, Konstantin Taranov, Christian Vogt, Michele Magno, Torsten Hoefler, Luca Benini:
A RDMA Interface for Ultra-Fast Ultrasound Data-Streaming over an Optical Link. DATE 2022: 80-83 - [c218]Niels Gleinig, Torsten Hoefler:
Circuits for Measurement Based Quantum State Preparation. DATE 2022: 328-333 - [c217]Johannes de Fine Licht, Christopher A. Pattison, Alexandros Nikolaos Ziogas, David Simmons-Duffin, Torsten Hoefler:
Fast Arbitrary Precision Floating Point on FPGA. FCCM 2022: 1-9 - [c216]Larissa Schmid, Marcin Copik, Alexandru Calotoiu, Dominik Werle, Andreas Reiter, Michael Selzer, Anne Koziolek, Torsten Hoefler:
Performance-detective: automatic deduction of cheap and accurate performance models. ICS 2022: 3:1-3:13 - [c215]Alexandru Calotoiu, Tal Ben-Nun, Grzegorz Kwasniewski, Johannes de Fine Licht, Timo Schneider, Philipp Schaad, Torsten Hoefler:
Lifting C semantics for dataflow optimization. ICS 2022: 17:1-17:13 - [c214]Oliver Rausch, Tal Ben-Nun, Nikoli Dryden, Andrei Ivanov, Shigang Li, Torsten Hoefler:
A data-centric optimization framework for machine learning. ICS 2022: 36:1-36:13 - [c213]Andrei Lascu, Alastair F. Donaldson, Tobias Grosser, Torsten Hoefler:
Metamorphic Fuzzing of C++ Libraries. ICST 2022: 35-46 - [c212]Niels Gleinig, Maciej Besta, Torsten Hoefler:
I/O-Optimal Cache-Oblivious Sparse Matrix-Sparse Matrix Multiplication. IPDPS 2022: 36-46 - [c211]András Strausz, Flavio Vella, Salvatore Di Girolamo, Maciej Besta, Torsten Hoefler:
Asynchronous Distributed-Memory Triangle Counting and LCC with RMA Caching. IPDPS 2022: 291-301 - [c210]Shigang Li, Torsten Hoefler:
Near-optimal sparse allreduce for distributed deep learning. PPoPP 2022: 135-149 - [c209]Konstantin Taranov, Steve Byan, Virendra J. Marathe, Torsten Hoefler:
KafkaDirect: Zero-copy Data Access for Apache Kafka over RDMA Networks. SIGMOD Conference 2022: 2191-2204 - [c208]Niels Gleinig, Torsten Hoefler:
The Red-Blue Pebble Game on Trees and DAGs with Large Input. SIROCCO 2022: 135-153 - [i106]Shigang Li, Torsten Hoefler:
Near-Optimal Sparse Allreduce for Distributed Deep Learning. CoRR abs/2201.07598 (2022) - [i105]Konstantin Taranov, Benjamin Rothenberger, Daniele De Sensi, Adrian Perrig, Torsten Hoefler:
NeVerMore: Exploiting RDMA Mistakes in NVMe-oF Storage Applications. CoRR abs/2202.08080 (2022) - [i104]András Strausz, Flavio Vella, Salvatore Di Girolamo, Maciej Besta, Torsten Hoefler:
Asynchronous Distributed-Memory Triangle Counting and LCC with RMA Caching. CoRR abs/2202.13976 (2022) - [i103]Marcin Copik, Alexandru Calotoiu, Konstantin Taranov, Torsten Hoefler:
FaasKeeper: a Blueprint for Serverless Services. CoRR abs/2203.14859 (2022) - [i102]Johannes de Fine Licht, Christopher A. Pattison, Alexandros Nikolaos Ziogas, David Simmons-Duffin, Torsten Hoefler:
Fast Arbitrary Precision Floating Point on FPGA. CoRR abs/2204.06256 (2022) - [i101]Tal Ben-Nun, Linus Groner, Florian Deconinck, Tobias Wicky, Eddie Davis, Johann Dahm, Oliver Elbert, Rhea George, Jeremy McGibbon, Lukas Trümper, Elynn Wu, Oliver Fuhrer, Thomas C. Schulthess, Torsten Hoefler:
Productive Performance Engineering for Weather and Climate Modeling with Python. CoRR abs/2205.04148 (2022) - [i100]Lukas Gianinazzi, Tal Ben-Nun, Saleh Ashkboos, Yves Baumann, Piotr Luczynski, Torsten Hoefler:
The spatial computer: A model for energy-efficient parallel computation. CoRR abs/2205.04934 (2022) - [i99]Maciej Besta, Torsten Hoefler:
Parallel and Distributed Graph Neural Networks: An In-Depth Concurrency Analysis. CoRR abs/2205.09702 (2022) - [i98]Alexandros Nikolaos Ziogas, Grzegorz Kwasniewski, Tal Ben-Nun, Timo Schneider, Torsten Hoefler:
Deinsum: Practically I/O Optimal Multilinear Algebra. CoRR abs/2206.08301 (2022) - [i97]Salvatore Di Girolamo, Daniele De Sensi, Konstantin Taranov, Milos Malesevic, Maciej Besta, Timo Schneider, Severin Kistler, Torsten Hoefler:
Building Blocks for Network-Accelerated Distributed File Systems. CoRR abs/2206.10007 (2022) - [i96]Saleh Ashkboos, Langwen Huang, Nikoli Dryden, Tal Ben-Nun, Peter Dueben, Lukas Gianinazzi, Luca Kummer, Torsten Hoefler:
ENS-10: A Dataset For Post-Processing Ensemble Weather Forecast. CoRR abs/2206.14786 (2022) - [i95]Philipp Schaad, Tal Ben-Nun, Torsten Hoefler:
Boosting Performance Optimization with Interactive Data Movement Visualization. CoRR abs/2207.07433 (2022) - [i94]Kartik Lakhotia, Maciej Besta, Laura Monroe, Kelly Isham, Patrick Iff, Torsten Hoefler, Fabrizio Petrini:
PolarFly: A Cost-Effective and Flexible Low-Diameter Topology. CoRR abs/2208.01695 (2022)
- 2021
- [j49]Torsten Hoefler, Dan Alistarh, Tal Ben-Nun, Nikoli Dryden, Alexandra Peste:
Sparsity in Deep Learning: Pruning and growth for efficient inference and training in neural networks. J. Mach. Learn. Res. 22: 241:1-241:124 (2021) - [j48]Arjun Pitchanathan, Christian Ulmann, Michel Weber, Torsten Hoefler, Tobias Grosser:
FPL: fast Presburger arithmetic through transprecision. Proc. ACM Program. Lang. 5(OOPSLA): 1-26 (2021) - [j47]Maciej Besta, Zur Vonarburg-Shmaria, Yannick Schaffner, Leonardo Schwarz, Grzegorz Kwasniewski, Lukas Gianinazzi, Jakub Beránek, Kacper Janda, Tobias Holenstein, Sebastian Leisinger, Peter Tatkowski, Esref Özdemir, Adrian Balla, Marcin Copik, Philipp Lindenberger, Marek Konieczny, Onur Mutlu, Torsten Hoefler:
GraphMineSuite: Enabling High-Performance and Programmable Graph Mining Algorithms with Set Algebra. Proc. VLDB Endow. 14(11): 1922-1936 (2021) - [j46]Edgar Solomonik, James Demmel, Torsten Hoefler:
Communication Lower Bounds of Bilinear Algorithms for Symmetric Tensor Contractions. SIAM J. Sci. Comput. 43(5): A3328-A3356 (2021) - [j45]Tobias Gysi, Christoph Müller, Oleksandr Zinenko, Stephan Herhut, Eddie Davis, Tobias Wicky, Oliver Fuhrer, Torsten Hoefler, Tobias Grosser:
Domain-Specific Multi-Level IR Rewriting for GPU: The Open Earth Compiler for GPU-accelerated Climate Simulation. ACM Trans. Archit. Code Optim. 18(4): 51:1-51:23 (2021) - [j44]Fabian Schuiki, Florian Zaruba, Torsten Hoefler, Luca Benini:
Stream Semantic Registers: A Lightweight RISC-V ISA Extension Achieving Full Compute Utilization in Single-Issue Cores. IEEE Trans. Computers 70(2): 212-227 (2021) - [j43]Florian Zaruba, Fabian Schuiki, Torsten Hoefler, Luca Benini:
Snitch: A Tiny Pseudo Dual-Issue Processor for Area and Energy Efficient Execution of Floating-Point Intensive Workloads. IEEE Trans. Computers 70(11): 1845-1860 (2021) - [j42]Maciej Besta, Jens Domke, Marcel Schneider, Marek Konieczny, Salvatore Di Girolamo, Timo Schneider, Ankit Singla, Torsten Hoefler:
High-Performance Routing With Multipathing and Path Diversity in Ethernet and HPC Networks. IEEE Trans. Parallel Distributed Syst. 32(4): 943-959 (2021) - [j41]Johannes de Fine Licht, Maciej Besta, Simon Meierhans, Torsten Hoefler:
Transformations of High-Level Synthesis Codes for High-Performance Computing. IEEE Trans. Parallel Distributed Syst. 32(5): 1014-1029 (2021) - [j40]Shigang Li, Tal Ben-Nun, Giorgi Nadiradze, Salvatore Di Girolamo, Nikoli Dryden, Dan Alistarh, Torsten Hoefler:
Breaking (Global) Barriers in Parallel Stochastic Optimization With Wait-Avoiding Group Averaging. IEEE Trans. Parallel Distributed Syst. 32(7): 1725-1739 (2021) - [c207]Dan Graur, Rodrigo Bruno, Joschka Bischoff, Marcel Rieser, Wolfgang Scherr, Torsten Hoefler, Gustavo Alonso:
Hermes: Enabling efficient large-scale simulation in MATSim. ANT/EDI40 2021: 635-641 - [c206]Johannes de Fine Licht, Andreas Kuster, Tiziano De Matteis, Tal Ben-Nun, Dominic Hofer, Torsten Hoefler:
StencilFlow: Mapping Large Stencil Programs to Distributed Spatial Computing Systems. CGO 2021: 315-326 - [c205]Niels Gleinig, Torsten Hoefler:
An Efficient Algorithm for Sparse Quantum State Preparation. DAC 2021: 433-438 - [c204]Paul Scheffler, Florian Zaruba, Fabian Schuiki, Torsten Hoefler, Luca Benini:
Indirection Stream Semantic Register Architecture for Efficient Sparse-Dense Linear Algebra. DATE 2021: 1787-1792 - [c203]Chris Cummins, Zacharias V. Fisches, Tal Ben-Nun, Torsten Hoefler, Michael F. P. O'Boyle, Hugh Leather:
ProGraML: A Graph-based Program Representation for Data Flow Analysis and Compiler Optimizations. ICML 2021: 2244-2253 - [c202]Alexandros Nikolaos Ziogas, Tal Ben-Nun, Timo Schneider, Torsten Hoefler:
NPBench: a benchmarking suite for high-performance NumPy. ICS 2021: 63-74 - [c201]Marcus Ritter, Alexander Geiß, Johannes Wehrstein, Alexandru Calotoiu, Thorsten Reimann, Torsten Hoefler, Felix Wolf:
Noise-Resilient Empirical Performance Modeling with Deep Neural Networks. IPDPS 2021: 23-34 - [c200]Salvatore Di Girolamo, Andreas Kurth, Alexandru Calotoiu, Thomas Benz, Timo Schneider, Jakub Beránek, Luca Benini, Torsten Hoefler:
A RISC-V in-network accelerator for flexible high-performance low-power packet processing. ISCA 2021: 958-971 - [c199]Maciej Besta, Raghavendra Kanakagiri, Grzegorz Kwasniewski, Rachata Ausavarungnirun, Jakub Beránek, Konstantinos Kanellopoulos, Kacper Janda, Zur Vonarburg-Shmaria, Lukas Gianinazzi, Ioana Stefan, Juan Gómez-Luna, Jakub Golinowski, Marcin Copik, Lukas Kapp-Schwoerer, Salvatore Di Girolamo, Nils Blach, Marek Konieczny, Onur Mutlu, Torsten Hoefler:
SISA: Set-Centric Instruction Set Architecture for Graph Mining on Processing-in-Memory Systems. MICRO 2021: 282-297 - [c198]Marcin Copik, Grzegorz Kwasniewski, Maciej Besta, Michal Podstawski, Torsten Hoefler:
SeBS: a serverless benchmark suite for function-as-a-service computing. Middleware 2021: 64-78 - [c197]Andrei Ivanov, Nikoli Dryden, Tal Ben-Nun, Shigang Li, Torsten Hoefler:
Data Movement Is All You Need: A Case Study on Optimizing Transformers. MLSys 2021 - [c196]Marcin Copik, Alexandru Calotoiu, Tobias Grosser, Nicolas Wicki, Felix Wolf, Torsten Hoefler:
Extracting clean performance models from tainted programs. PPoPP 2021: 403-417 - [c195]Grzegorz Kwasniewski, Tal Ben-Nun, Alexandros Nikolaos Ziogas, Timo Schneider, Maciej Besta, Torsten Hoefler:
On the parallel I/O optimality of linear algebra kernels: near-optimal LU factorization. PPoPP 2021: 463-464 - [c194]Thomas Häner, Damian S. Steiger, Torsten Hoefler, Matthias Troyer:
Distributed quantum computing with QMPI. SC 2021: 16:1-16:13 - [c193]Shigang Li, Torsten Hoefler:
Chimera: efficiently training large-scale neural networks with bidirectional pipelines. SC 2021: 27:1-27:14 - [c192]Daniele De Sensi, Salvatore Di Girolamo, Saleh Ashkboos, Shigang Li, Torsten Hoefler:
Flare: flexible in-network allreduce. SC 2021: 35:1-35:16 - [c191]Grzegorz Kwasniewski, Marko Kabic, Tal Ben-Nun, Alexandros Nikolaos Ziogas, Jens Eirik Saethre, André Gaillard, Timo Schneider, Maciej Besta, Anton Kozhevnikov, Joost VandeVondele, Torsten Hoefler:
On the parallel I/O optimality of linear algebra kernels: near-optimal matrix factorizations. SC 2021: 70:1-70:15 - [c190]Nikoli Dryden, Roman Böhringer, Tal Ben-Nun, Torsten Hoefler:
Clairvoyant prefetching for distributed machine learning I/O. SC 2021: 92:1-92:15 - [c189]Alexandros Nikolaos Ziogas, Timo Schneider, Tal Ben-Nun, Alexandru Calotoiu, Tiziano De Matteis, Johannes de Fine Licht, Luca Lavarini, Torsten Hoefler:
Productivity, portability, performance: data-centric Python. SC 2021: 95:1-95:13 - [c188]Konstantin Taranov, Salvatore Di Girolamo, Torsten Hoefler:
CoRM: Compactable Remote Memory over RDMA. SIGMOD Conference 2021: 1811-1824 - [c187]Lukas Gianinazzi, Maciej Besta, Yannick Schaffner, Torsten Hoefler:
Parallel Algorithms for Finding Large Cliques in Sparse Graphs. SPAA 2021: 243-253 - [c186]Grzegorz Kwasniewski, Tal Ben-Nun, Lukas Gianinazzi, Alexandru Calotoiu, Timo Schneider, Alexandros Nikolaos Ziogas, Maciej Besta, Torsten Hoefler:
Pebbles, Graphs, and a Pinch of Combinatorics: Towards Tight I/O Lower Bounds for Statically Analyzable Programs. SPAA 2021: 328-339 - [c185]Konstantin Taranov, Rodrigo Bruno, Gustavo Alonso, Torsten Hoefler:
Naos: Serialization-free RDMA networking in Java. USENIX Annual Technical Conference 2021: 1-14 - [c184]Maksym Planeta, Jan Bierbaum, Leo Sahaya Daphne Antony, Torsten Hoefler, Hermann Härtig:
MigrOS: Transparent Live-Migration Support for Containerised RDMA Applications. USENIX Annual Technical Conference 2021: 47-63 - [c183]Benjamin Rothenberger, Konstantin Taranov, Adrian Perrig, Torsten Hoefler:
ReDMArk: Bypassing RDMA Security Mechanisms. USENIX Security Symposium 2021: 4277-4292 - [i93]Roman Böhringer, Nikoli Dryden, Tal Ben-Nun, Torsten Hoefler:
Clairvoyant Prefetching for Distributed Machine Learning I/O. CoRR abs/2101.08734 (2021) - [i92]David Ittah, Thomas Häner, Vadym Kliuchnikov, Torsten Hoefler:
Enabling Dataflow Optimization for Quantum Programs. CoRR abs/2101.11030 (2021) - [i91]Torsten Hoefler, Dan Alistarh, Tal Ben-Nun, Nikoli Dryden, Alexandra Peste:
Sparsity in Deep Learning: Pruning and growth for efficient inference and training in neural networks. CoRR abs/2102.00554 (2021) - [i90]Maciej Besta, Zur Vonarburg-Shmaria, Yannick Schaffner, Leonardo Schwarz, Grzegorz Kwasniewski, Lukas Gianinazzi, Jakub Beránek, Kacper Janda, Tobias Holenstein, Sebastian Leisinger, Peter Tatkowski, Esref Özdemir, Adrian Balla, Marcin Copik, Philipp Lindenberger, Pavel Kalvoda, Marek Konieczny, Onur Mutlu, Torsten Hoefler:
GraphMineSuite: Enabling High-Performance and Programmable Graph Mining Algorithms with Set Algebra. CoRR abs/2103.03653 (2021) - [i89]Maciej Besta, Raghavendra Kanakagiri, Grzegorz Kwasniewski, Rachata Ausavarungnirun, Jakub Beránek, Konstantinos Kanellopoulos, Kacper Janda, Zur Vonarburg-Shmaria, Lukas Gianinazzi, Ioana Stefan, Juan Gómez-Luna, Marcin Copik, Lukas Kapp-Schwoerer, Salvatore Di Girolamo, Marek Konieczny, Onur Mutlu, Torsten Hoefler:
SISA: Set-Centric Instruction Set Architecture for Graph Mining on Processing-in-Memory Systems. CoRR abs/2104.07582 (2021) - [i88]Thomas Häner, Damian S. Steiger, Torsten Hoefler, Matthias Troyer:
Distributed Quantum Computing with QMPI. CoRR abs/2105.01109 (2021) - [i87]Grzegorz Kwasniewski, Tal Ben-Nun, Lukas Gianinazzi, Alexandru Calotoiu, Timo Schneider, Alexandros Nikolaos Ziogas, Maciej Besta, Torsten Hoefler:
Pebbles, Graphs, and a Pinch of Combinatorics: Towards Tight I/O Lower Bounds for Statically Analyzable Programs. CoRR abs/2105.07203 (2021) - [i86]Maciej Besta, Marcel Schneider, Salvatore Di Girolamo, Ankit Singla, Torsten Hoefler:
Towards Million-Server Network Simulations on Just a Laptop. CoRR abs/2105.12663 (2021) - [i85]Maciej Besta, Raphael Grob, Cesare Miglioli, Nicola Bernold, Grzegorz Kwasniewski, Gabriel Gjini, Raghavendra Kanakagiri, Saleh Ashkboos, Lukas Gianinazzi, Nikoli Dryden, Torsten Hoefler:
Motif Prediction with Graph Neural Networks. CoRR abs/2106.00761 (2021) - [i84]Lukas Gianinazzi, Maximilian Fries, Nikoli Dryden, Tal Ben-Nun, Maciej Besta, Torsten Hoefler:
Learning Combinatorial Node Labeling Algorithms. CoRR abs/2106.03594 (2021) - [i83]Marcin Copik, Konstantin Taranov, Alexandru Calotoiu, Torsten Hoefler:
RFaaS: RDMA-Enabled FaaS Platform for Serverless High-Performance Computing. CoRR abs/2106.13859 (2021) - [i82]Daniele De Sensi, Salvatore Di Girolamo, Saleh Ashkboos, Shigang Li, Torsten Hoefler:
Flare: Flexible In-Network Allreduce. CoRR abs/2106.15565 (2021) - [i81]Alexandros Nikolaos Ziogas, Timo Schneider, Tal Ben-Nun, Alexandru Calotoiu, Tiziano De Matteis, Johannes de Fine Licht, Luca Lavarini, Torsten Hoefler:
Productivity, Portability, Performance: Data-Centric Python. CoRR abs/2107.00555 (2021) - [i80]Shigang Li, Torsten Höfler:
Chimera: Efficiently Training Large-Scale Neural Networks with Bidirectional Pipelines. CoRR abs/2107.06925 (2021) - [i79]Grzegorz Kwasniewski, Marko Kabic, Tal Ben-Nun, Alexandros Nikolaos Ziogas, Jens Eirik Saethre, André Gaillard, Timo Schneider, Maciej Besta, Anton Kozhevnikov, Joost VandeVondele, Torsten Hoefler:
On the Parallel I/O Optimality of Linear Algebra Kernels: Near-Optimal Matrix Factorizations. CoRR abs/2108.09337 (2021) - [i78]Lukas Gianinazzi, Maciej Besta, Yannick Schaffner, Torsten Hoefler:
Parallel Algorithms for Finding Large Cliques in Sparse Graphs. CoRR abs/2109.09663 (2021) - [i77]Oliver Rausch, Tal Ben-Nun, Nikoli Dryden, Andrei Ivanov, Shigang Li, Torsten Hoefler:
A Data-Centric Optimization Framework for Machine Learning. CoRR abs/2110.10802 (2021) - [i76]Alexandru Calotoiu, Tal Ben-Nun, Grzegorz Kwasniewski, Johannes de Fine Licht, Timo Schneider, Philipp Schaad, Torsten Hoefler:
Lifting C Semantics for Dataflow Optimization. CoRR abs/2112.11879 (2021)
- 2020
- [j39]Thomas Häner, Torsten Hoefler, Matthias Troyer:
Assertion-based optimization of Quantum programs. Proc. ACM Program. Lang. 4(OOPSLA): 133:1-133:20 (2020) - [j38]Tobias Grosser, Theodoros Theodoridis, Maximilian Falkenstein, Arjun Pitchanathan, Michael Kruse, Manuel Rigger, Zhendong Su, Torsten Hoefler:
Fast linear programming through transprecision computing on small and sparse data. Proc. ACM Program. Lang. 4(OOPSLA): 195:1-195:28 (2020) - [j37]Jesper Larsson Träff, Torsten Hoefler:
Special issue: Selected papers from EuroMPI 2019. Parallel Comput. 99: 102695 (2020) - [j36]Carlos Osuna, Tobias Wicky, Fabian Thuering, Torsten Hoefler, Oliver Fuhrer:
Dawn: a High-level Domain-Specific Language Compiler Toolchain for Weather and Climate Applications. Supercomput. Front. Innov. 7(2): 79-97 (2020) - [j35]Asif Ali Khan, Hauke Mewes, Tobias Grosser, Torsten Hoefler, Jerónimo Castrillón:
Polyhedral Compilation for Racetrack Memories. IEEE Trans. Comput. Aided Des. Integr. Circuits Syst. 39(11): 3968-3980 (2020) - [j34]Maciej Besta, Marc Fischer, Tal Ben-Nun, Dimitri Stanojevic, Johannes de Fine Licht, Torsten Hoefler:
Substream-Centric Maximum Matchings on FPGA. ACM Trans. Reconfigurable Technol. Syst. 13(2): 8:1-8:33 (2020) - [c182]Elad Hoffer, Tal Ben-Nun, Itay Hubara, Niv Giladi, Torsten Hoefler, Daniel Soudry:
Augment Your Batch: Improving Generalization Through Instance Repetition. CVPR 2020: 8126-8135 - [c181]Andreas Kurth, Samuel Riedel, Florian Zaruba, Torsten Hoefler, Luca Benini:
ATUNs: Modular and Scalable Support for Atomic Operations in a Shared Memory Multiprocessor. DAC 2020: 1-6 - [c180]Johannes de Fine Licht, Grzegorz Kwasniewski, Torsten Hoefler:
Flexible Communication Avoiding Matrix Multiplication on FPGA with High-Level Synthesis. FPGA 2020: 244-254 - [c179]Marcus Ritter, Alexandru Calotoiu, Sebastian Rinke, Thorsten Reimann, Torsten Hoefler, Felix Wolf:
Learning Cost-Effective Sampling Strategies for Empirical Performance Modeling. IPDPS 2020: 884-895 - [c178]Maciej Besta, Raghavendra Kanakagiri, Harun Mustafa, Mikhail Karasikov, Gunnar Rätsch, Torsten Hoefler, Edgar Solomonik:
Communication-Efficient Jaccard similarity for High-Performance Distributed Genome Comparisons. IPDPS 2020: 1122-1132 - [c177]Shigang Li, Tal Ben-Nun, Salvatore Di Girolamo, Dan Alistarh, Torsten Hoefler:
Taming unbalanced training workloads in deep learning with partial collective operations. PPoPP 2020: 45-61 - [c176]Yuyang Jin, Haojie Wang, Xiongchao Tang, Torsten Hoefler, Xu Liu, Jidong Zhai:
Identifying scalability bottlenecks for large-scale parallel programs with graph analysis. PPoPP 2020: 409-410 - [c175]Alexandr Nigay, Lukas Mosimann, Timo Schneider, Torsten Hoefler:
Communication and Timing Issues with MPI Virtualization. EuroMPI 2020: 11-20 - [c174]Maciej Besta, Marcel Schneider, Marek Konieczny, Karolina Cynk, Erik Henriksson, Salvatore Di Girolamo, Ankit Singla, Torsten Hoefler:
FatPaths: routing in supercomputers and data centers when shortest paths fall short. SC 2020: 27 - [c173]Yuyang Jin, Haojie Wang, Teng Yu, Xiongchao Tang, Torsten Hoefler, Xu Liu, Jidong Zhai:
ScalAna: automating scaling loss detection with graph analysis. SC 2020: 28 - [c172]Daniele De Sensi, Salvatore Di Girolamo, Kim H. McMahon, Duncan Roweth, Torsten Hoefler:
An in-depth analysis of the slingshot interconnect. SC 2020: 35 - [c171]Tiziano De Matteis, Johannes de Fine Licht, Torsten Hoefler:
fBLAS: streaming linear algebra on FPGA. SC 2020: 59 - [c170]Alexandru Calotoiu, Markus Geisenhofer, Florian Kummer, Marcus Ritter, Jens Weber, Torsten Hoefler, Martin Oberlack, Felix Wolf:
Empirical Modeling of Spatially Diverging Performance. HUST/ProTools@SC 2020: 71-80 - [c169]Maciej Besta, Armon Carigiet, Kacper Janda, Zur Vonarburg-Shmaria, Lukas Gianinazzi, Torsten Hoefler:
High-performance parallel graph coloring with strong guarantees on work, depth, and quality. SC 2020: 99 - [c168]Lukas Gianinazzi, Torsten Hoefler:
Parallel Planar Subgraph Isomorphism and Vertex Connectivity. SPAA 2020: 269-280 - [c167]Konstantin Taranov, Benjamin Rothenberger, Adrian Perrig, Torsten Hoefler:
sRDMA - Efficient NIC-based Authentication and Encryption for Remote Direct Memory Access. USENIX Annual Technical Conference 2020: 691-704 - [p2]Alexandru Calotoiu, Marcin Copik, Torsten Hoefler, Marcus Ritter, Sergei Shudler, Felix Wolf:
ExtraPeak: Advanced Automatic Performance Modeling for HPC Applications. Software for Exascale Computing 2020: 453-482 - [i75]Tobias Gysi, Tobias Grosser, Laurin Brandner, Torsten Hoefler:
A Fast Analytical Model of Fully Associative Caches. CoRR abs/2001.01653 (2020) - [i74]Robert Gerstenberger, Maciej Besta, Torsten Hoefler:
Enabling Highly-Scalable Remote Memory Access Programming with MPI-3 One Sided. CoRR abs/2001.07747 (2020) - [i73]Florian Zaruba, Fabian Schuiki, Torsten Hoefler, Luca Benini:
Snitch: A 10 kGE Pseudo Dual-Issue Processor for Area and Energy Efficient Execution of Floating-Point Intensive Workloads. CoRR abs/2002.10143 (2020) - [i72]Chris Cummins, Zacharias V. Fisches, Tal Ben-Nun, Torsten Hoefler, Hugh Leather:
ProGraML: Graph-based Deep Learning for Program Optimization and Analysis. CoRR abs/2003.10536 (2020) - [i71]Shigang Li, Tal Ben-Nun, Dan Alistarh, Salvatore Di Girolamo, Nikoli Dryden, Torsten Hoefler:
Breaking (Global) Barriers in Parallel Stochastic Optimization with Wait-Avoiding Group Averaging. CoRR abs/2005.00124 (2020) - [i70]Peter Grönquist, Chengyuan Yao, Tal Ben-Nun, Nikoli Dryden, Peter Dueben, Shigang Li, Torsten Hoefler:
Deep Learning for Post-Processing Ensemble Weather Forecasts. CoRR abs/2005.08748 (2020) - [i69]Tobias Gysi, Christoph Müller, Oleksandr Zinenko, Stephan Herhut, Eddie Davis, Tobias Wicky, Oliver Fuhrer, Torsten Hoefler, Tobias Grosser:
Domain-Specific Multi-Level IR Rewriting for GPU. CoRR abs/2005.13014 (2020) - [i68]Bryan A. Plummer, Nikoli Dryden, Julius Frost, Torsten Hoefler, Kate Saenko:
Shapeshifter Networks: Cross-layer Parameter Sharing for Scalable and Effective Deep Learning. CoRR abs/2006.10598 (2020) - [i67]Andrei Ivanov, Nikoli Dryden, Tal Ben-Nun, Shigang Li, Torsten Hoefler:
Data Movement Is All You Need: A Case Study on Optimizing Transformers. CoRR abs/2007.00072 (2020) - [i66]Lukas Gianinazzi, Torsten Hoefler:
Parallel Planar Subgraph Isomorphism and Vertex Connectivity. CoRR abs/2007.01199 (2020) - [i65]Maciej Besta, Jens Domke, Marcel Schneider, Marek Konieczny, Salvatore Di Girolamo, Timo Schneider, Ankit Singla, Torsten Hoefler:
High-Performance Routing with Multipathing and Path Diversity in Supercomputers and Data Centers. CoRR abs/2007.03776 (2020) - [i64]Daniele De Sensi, Salvatore Di Girolamo, Kim H. McMahon, Duncan Roweth, Torsten Hoefler:
An In-Depth Analysis of the Slingshot Interconnect. CoRR abs/2008.08886 (2020) - [i63]Maciej Besta, Armon Carigiet, Zur Vonarburg-Shmaria, Kacper Janda, Lukas Gianinazzi, Torsten Hoefler:
High-Performance Parallel Graph Coloring with Strong Guarantees on Work, Depth, and Quality. CoRR abs/2008.11321 (2020) - [i62]Yuyang Jin, Haojie Wang, Teng Yu, Xiongchao Tang, Torsten Hoefler, Xu Liu, Jidong Zhai:
ScalAna: Automating Scaling Loss Detection with Graph Analysis. CoRR abs/2009.01692 (2020) - [i61]Maksym Planeta, Jan Bierbaum, Leo Sahaya Daphne Antony, Torsten Hoefler, Hermann Härtig:
TardiS: Migrating Containers with RDMA Networks. CoRR abs/2009.06988 (2020) - [i60]Salvatore Di Girolamo, Andreas Kurth, Alexandru Calotoiu, Thomas Benz, Timo Schneider, Jakub Beránek, Luca Benini, Torsten Hoefler:
PsPIN: A high-performance low-power architecture for flexible in-network compute. CoRR abs/2010.03536 (2020) - [i59]