SC 2023: Denver, CO, USA
- Dorian Arnold, Rosa M. Badia, Kathryn M. Mohror:
Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2023, Denver, CO, USA, November 12-17, 2023. ACM 2023
ACM Gordon Bell Finalists
- Sambit Das, Bikash Kanungo, Vishal Subramanian, Gourab Panigrahi, Phani Motamarri, David M. Rogers, Paul Zimmerman, Vikram Gavini:
Large-Scale Materials Modeling at Quantum Accuracy: Ab Initio Simulations of Quasicrystals and Interacting Extended Defects in Metallic Alloys. 1:1-1:12
- Boris Kozinsky, Albert Musaelian, Anders Johansson, Simon L. Batzner:
Scaling the Leading Accuracy of Deep Equivariant Models to Biomolecular Simulations of Realistic Size. 2:1-2:12
- Elia Merzari, Steven P. Hamilton, Thomas M. Evans, Misun Min, Paul F. Fischer, Stefan Kerkemeier, Jun Fang, Paul K. Romano, Yu-Hsiang Lan, Malachi Phillips, Elliott Biondo, Katherine Royston, Tim Warburton, Noel Chalmers, Thilina Rathnayake:
Exascale Multiphysics Nuclear Reactor Simulations for Advanced Designs. 3:1-3:11
- Yuhang Fu, Weiqi Shen, Jiahuan Cui, Yao Zheng, Guangwen Yang, Zhao Liu, Jifa Zhang, Tingwei Ji, Fangfang Xie, Xiaojing Lv, Hanyue Liu, Xu Liu, Xiyang Liu, Xiaoyu Song, Guocheng Tao, Yan Yan, Paul Tucker, Steven A. E. Miller, Shirui Luo, Seid Koric, Weimin Zheng:
Toward Exascale Computation for Turbomachinery Flows. 4:1-4:12
- Niclas Jansson, Martin Karp, Adalberto Perez, Timofey Mukha, Yi Ju, Jiahui Liu, Szilárd Páll, Erwin Laure, Tino Weinkauf, Jörg Schumacher, Philipp Schlatter, Stefano Markidis:
Exploring the Ultimate Regime of Turbulent Rayleigh-Bénard Convection Through Unprecedented Spectral-Element Simulations. 5:1-5:9
- Hatem Ltaief, Yuxi Hong, Leighton Wilson, Mathias Jacquelin, Matteo Ravasi, David Elliot Keyes:
Scaling the "Memory Wall" for Multi-Dimensional Seismic Processing with Algebraic Compression on Cerebras CS-2 Systems. 6:1-6:12
ACM Gordon Bell Climate Finalists
- Mark Taylor, Peter M. Caldwell, Luca Bertagna, Conrad Clevenger, Aaron Donahue, James G. Foucar, Oksana Guba, Benjamin R. Hillman, Noel Keen, Jayesh Krishna, Matthew R. Norman, Sarat Sreepathi, Christopher Terai, James B. White III, Andrew G. Salinger, Renata B. McCoy, Lai-yung Ruby Leung, David C. Bader, Danqing Wu:
The Simple Cloud-Resolving E3SM Atmosphere Model Running on the Frontier Exascale System. 7:1-7:11
- Takemasa Miyoshi, Arata Amemiya, Shigenori Otsuka, Yasumitsu Maejima, James Taylor, Takumi Honda, Hirofumi Tomita, Seiya Nishizawa, Kenta Sueki, Tsuyoshi Yamaura, Yutaka Ishikawa, Shinsuke Satoh, Tomoo Ushio, Kana Koike, Atsuya Uno:
Big Data Assimilation: Real-time 30-second-refresh Heavy Rain Forecast Using Fugaku During Tokyo Olympics and Paralympics. 8:1-8:10
- Shenghong Huang, Junshi Chen, Ziyu Zhang, Xiaoyu Hao, Jun Gu, Hong An, Chun Zhao, Yan Hu, Zhanming Wang, Longkui Chen, Yifan Luo, Jineng Yao, Yi Zhang, Yang Zhao, Zhihao Wang, Dongning Jia, Zhao Jin, Changming Song, Xisheng Luo, Xiaobin He, Dexun Chen:
Establishing a Modeling System in 3-km Horizontal Resolution for Global Atmospheric Circulation triggered by Submarine Volcanic Eruptions with 400 Billion Smoothed Particle Hydrodynamics. 9:1-9:12
Extreme-Scale Applications
- Wubing Wan, Lin Gan, Wenqiang Wang, Zekun Yin, Haodong Tian, Zhenguo Zhang, Yinuo Wang, Mengyuan Hua, Xiaohui Liu, Shengye Xiang, Zhongqiu He, Zijia Wang, Ping Gao, Xiaohui Duan, Weiguo Liu, Wei Xue, Haohuan Fu, Guangwen Yang, Xiaofei Chen, Zeyu Song, Yaojian Chen, Xin Liu, Wei Zhang:
69.7-PFlops Extreme Scale Earthquake Simulation with Crossing Multi-faults and Topography on Sunway. 10:1-10:15
- Yumeng Shi, Ningming Nie, Jue Wang, Kehao Lin, Chunbao Zhou, Shigang Li, Kehan Yao, Shunde Li, Yangde Feng, Yan Zeng, Fang Liu, Yangang Wang, Yue Gao:
Large-Scale Simulation of Structural Dynamics Computing on GPU Clusters. 11:1-11:14
- Shunde Li, Zongguo Wang, Lingkun Bu, Jue Wang, Zhikuang Xin, Shigang Li, Yangang Wang, Yangde Feng, Peng Shi, Yun Hu, Xuebin Chi:
ANT-MOC: Scalable Neutral Particle Transport Using 3D Method of Characteristics on Multi-GPU Systems. 12:1-12:13
Global Task Parallelism
- Rohan Yadav, Wonchan Lee, Melih Elibol, Manolis Papadakis, Taylor Lee-Patti, Michael Garland, Alex Aiken, Fredrik Kjolstad, Michael Bauer:
Legate Sparse: Distributed Sparse Computing in Python. 13:1-13:13
- Shumpei Shiina, Kenjiro Taura:
Itoyori: Reconciling Global Address Space and Global Fork-Join Task Parallelism. 14:1-14:15
- Thiago S. F. X. Teixeira, Alexandra Henzinger, Rohan Yadav, Alex Aiken:
Automated Mapping of Task-Based Programs onto Distributed and Heterogeneous Machines. 15:1-15:13
Graph Algorithms in HPC
- Zhe Pan, Shuibing He, Xu Li, Xuechen Zhang, Rui Wang, Gang Chen:
Efficient Maximal Biclique Enumeration on GPUs. 16:1-16:13
- Ghadeer Alabandi, William Sands, George Biros, Martin Burtscher:
A GPU Algorithm for Detecting Strongly Connected Components. 17:1-17:13
- Wang Feng, Shiyang Chen, Hang Liu, Yuede Ji:
PeeK: A Prune-Centric Approach for K Shortest Path Computation. 18:1-18:14
Sustainable Computing
- Baolin Li, Rohan Basu Roy, Daniel Wang, Siddharth Samsi, Vijay Gadepally, Devesh Tiwari:
Toward Sustainable HPC: Carbon Footprint Estimation and Environmental Implications of HPC Systems. 19:1-19:15
- Baolin Li, Siddharth Samsi, Vijay Gadepally, Devesh Tiwari:
Clover: Toward Sustainable AI with Carbon-Aware Machine Learning Inference Service. 20:1-20:15
- Md. S. Q. Zulkar Nine, Tevfik Kosar, Muhammed Fatih Bulut, Jinho Hwang:
GreenNFV: Energy-Efficient Network Function Virtualization with Service Level Agreement Constraints. 21:1-21:12
Graph Frameworks and Databases
- Maciej Besta, Robert Gerstenberger, Marc Fischer, Michal Podstawski, Nils Blach, Berke Egeli, George Mitenkov, Wojciech Chlapek, Marek T. Michalewicz, Hubert Niewiadomski, Jürgen Müller, Torsten Hoefler:
The Graph Database Interface: Scaling Online Transactional and Analytical Graph Workloads to Hundreds of Thousands of Cores. 22:1-22:18
- Wang Zhang, Zhan Shi, Ziyi Liao, Yiling Li, Yu Du, Yutong Wu, Fang Wang, Dan Feng:
Graph3PO: A Temporal Graph Data Processing Method for Latency QoS Guarantee in Object Cloud Storage System. 23:1-23:16
- Chun-Yi Liu, Wonil Choi, Soheil Khadirsharbiyani, Mahmut T. Kandemir:
MBFGraph: An SSD-based External Graph System for Evolving Graphs. 24:1-24:13
Resource Management
- Qiyang Ding, Pengfei Zheng, Shreyas Kudari, Shivaram Venkataraman, Zhao Zhang:
Mirage: Towards Low-interruption Services on Batch GPU Clusters with Reinforcement Learning. 25:1-25:13
- Burak Aksar, Efe Sencan, Benjamin Schwaller, Omar Aaziz, Vitus J. Leung, Jim M. Brandt, Brian Kulis, Manuel Egele, Ayse K. Coskun:
Prodigy: Towards Unsupervised Anomaly Detection in Production HPC Systems. 26:1-26:14
- Jianru Ding, Henry Hoffmann:
DPS: Adaptive Power Management for Overprovisioned Systems. 27:1-27:14
GPU Middleware and System Software
- Konstantinos Parasyris, Giorgis Georgakoudis, Esteban Rangel, Ignacio Laguna, Johannes Doerfert:
Scalable Tuning of (OpenMP) GPU Applications via Kernel Record and Replay. 28:1-28:14
- Zane Fink, Konstantinos Parasyris, Giorgis Georgakoudis, Harshitha Menon:
HPAC-Offload: Accelerating HPC Applications with Portable Approximate Computing on the GPU. 29:1-29:14
- Wenyan Chen, Zizhao Mo, Huanle Xu, Kejiang Ye, Chengzhong Xu:
Interference-aware Multiplexing for Deep Learning in GPU Clusters: A Middleware Approach. 30:1-30:15
High Performance for Graph Operations
- James D. Trotter, Sinan Ekmekçibasi, Johannes Langguth, Tugba Torun, Emre Düzakin, Aleksandar Ilic, Didem Unat:
Bringing Order to Sparsity: A Sparse Matrix Reordering Study on Multicore CPUs. 31:1-31:13
- Tianhui Shi, Jidong Zhai, Haojie Wang, Qiqian Chen, Mingshu Zhai, Zixu Hao, Haoyu Yang, Wenguang Chen:
GraphSet: High Performance Graph Mining through Equivalent Set Transformations. 32:1-32:14
- Luk Burchard, Max Xiaohang Zhao, Johannes Langguth, Aydin Buluç, Giulia Guidi:
Space Efficient Sequence Alignment for SRAM-Based Computing: X-Drop on the Graphcore IPU. 33:1-33:16
Message Passing Innovations
- Jintao Peng, Jianbin Fang, Jie Liu, Min Xie, Yi Dai, Bo Yang, Shengguo Li, Zheng Wang:
Optimizing MPI Collectives on Shared Memory Multi-Cores. 34:1-34:15
- Trevor Steil, Tahsin Reza, Benjamin Priest, Roger Pearce:
Embracing Irregular Parallelism in HPC with YGM. 35:1-35:13
- Marcin Chrapek, Mikhail Khalilov, Torsten Hoefler:
HEAR: Homomorphically Encrypted Allreduce. 36:1-36:17
Training Graph Neural Networks
- Kaihua Fu, Quan Chen, Yuzhuo Yang, Jiuchen Shi, Chao Li, Minyi Guo:
BLAD: Adaptive Load Balanced Scheduling and Operator Overlap Pipeline For Accelerating The Dynamic GNN Training. 37:1-37:13
- Shiyang Chen, Da Zheng, Caiwen Ding, Chengying Huan, Yuede Ji, Hang Liu:
TANGO: re-thinking quantization for graph neural network training on GPUs. 38:1-38:14
- Hongkuan Zhou, Da Zheng, Xiang Song, George Karypis, Viktor K. Prasanna:
DistTGL: Distributed Memory-Based Temporal Graph Neural Network Training. 39:1-39:12
Applications in Materials Science and Biology
- Zhikun Wu, Yangjun Wu, Ying Liu, Honghui Shang, Yingxiang Gao, Zhongcheng Zhang, Yuyang Zhang, Yingchi Long, Xiaobing Feng, Huimin Cui:
Portable and Scalable All-Electron Quantum Perturbation Simulations on Exascale Supercomputers. 40:1-40:13
- Sayan Roychowdhury, Samreen T. Mahmud, Aristotle X. Martin, Peter Balogh, Daniel F. Puleri, John Gounley, Erik W. Draeger, Amanda Randles:
Enhancing Adaptive Physics Refinement Simulations Through the Addition of Realistic Red Blood Cell Counts. 41:1-41:13
- Yangjun Wu, Chu Guo, Yi Fan, Pengyu Zhou, Honghui Shang:
NNQS-Transformer: an Efficient and Scalable Neural Network Quantum States Approach for Ab initio Quantum Chemistry. 42:1-42:13
Data Compression
- Yafan Huang, Sheng Di, Xiaodong Yu, Guanpeng Li, Franck Cappello:
cuSZp: An Ultra-fast GPU Error-bounded Lossy Compression Framework with Optimized End-to-End Performance. 43:1-43:13
- Daoce Wang, Jesus Pulido, Pascal Grosset, Jiannan Tian, Sian Jin, Houjun Tang, Jean M. Sexton, Sheng Di, Kai Zhao, Bo Fang, Zarija Lukic, Franck Cappello, James P. Ahrens, Dingwen Tao:
AMRIC: A Novel In Situ Lossy Compression Framework for Efficient I/O in Adaptive Mesh Refinement Applications. 44:1-44:15
- Tao Lu, Yu Zhong, Zibin Sun, Xiang Chen, You Zhou, Fei Wu, Ying Yang, Yunxin Huang, Yafei Yang:
ADT-FSE: A New Encoder for SZ. 45:1-45:13
Handling Hardware Faults
- Juan-David Guerrero-Balaguera, Josie Esteban Rodriguez Condia, Fernando Fernandes dos Santos, Matteo Sonza Reorda, Paolo Rech:
Understanding the Effects of Permanent Faults in GPU's Parallelism Management and Control Units. 46:1-46:14
- Meng Wang, Jiajun Mao, Rajdeep Rana, John Bent, Serkay Olmez, Anjus George, Garrett Wilson Ransom, Jun Li, Haryadi S. Gunawi:
Design Considerations and Analysis of Multi-Level Erasure Coding in Large-Scale Data Centers. 47:1-47:13
- Dongwhee Kim, Jaeyoon Lee, Wonyeong Jung, Michael B. Sullivan, Jungrae Kim:
Unity ECC: Unified Memory Protection Against Bit and Chip Errors. 48:1-48:16
Linear Algebra I
- Noel Chalmers, Jakub Kurzak, Damon McDougall, Paul T. Bauman:
Optimizing High-Performance Linpack for Exascale Accelerated Architectures. 49:1-49:12
- Yang Liu, Nan Ding, Piyush Sao, Samuel Williams, Xiaoye Sherry Li:
Unified Communication Optimization Strategies for Sparse Triangular Solver on CPU and GPU Clusters. 50:1-50:15
- Xu Fu, Bingbin Zhang, Tengcheng Wang, Wenhao Li, Yuechen Lu, Enxin Yi, Jianqi Zhao, Xiaohan Geng, Fangying Li, Jingwen Zhang, Zhou Jin, Weifeng Liu:
PanguLU: A Scalable Regular Two-Dimensional Block-Cyclic Sparse Direct Solver on Distributed Heterogeneous Systems. 51:1-51:14
Exascale Computing
- Scott Atchley, Christopher Zimmer, John Lange, David E. Bernholdt, Verónica G. Melesse Vergara, Thomas Beck, Michael J. Brim, Reuben D. Budiardja, Sunita Chandrasekaran, Markus Eisenbach, Thomas M. Evans, Matthew Ezell, Nicholas Frontiere, Antigoni Georgiadou, Joe Glenski, Philipp Grete, Steven P. Hamilton, John K. Holmen, Axel Huebl, Daniel A. Jacobson, Wayne Joubert, Kim H. McMahon, Elia Merzari, Stan G. Moore, Andrew Myers, Stephen Nichols, Sarp Oral, Thomas Papatheodore, Danny Perez, David M. Rogers, Evan Schneider, Jean-Luc Vay, P. K. Yeung:
Frontier: Exploring Exascale. 52:1-52:16
- Nicholas Malaya, Bronson Messer, Joseph Glenski, Antigoni Georgiadou, Justin Lietz, Kalyana C. Gottiparthi, Marc Day, Jackie Chen, Jon S. Rood, Lucas Esclapez, James B. White III, Gustav R. Jansen, Nicholas Curtis, Stephen Nichols, Jakub Kurzak, Noel Chalmers, Chip Freitag, Paul T. Bauman, Alessandro Fanfarillo, Reuben D. Budiardja, Thomas Papatheodore, Nicholas Frontiere, Damon McDougall, Matthew R. Norman, Sarat Sreepathi, Philip C. Roth, Dmytro Bykov, Noah Wolfe, Paul Mullowney, Markus Eisenbach, Marc T. Henry de Frahan, Wayne Joubert:
Experiences readying applications for Exascale. 53:1-53:13
- Rongfen Lin, Xinhui Yuan, Wei Xue, Wanwang Yin, Jienan Yao, Junda Shi, Qiang Sun, Chaobo Song, Fei Wang:
5 ExaFlop/s HPL-MxP Benchmark with Linear Scalability on the 40-Million-Core Sunway Supercomputer. 54:1-54:13
Training in HPC Machine Learning
- Mingzhen Li, Wencong Xiao, Hailong Yang, Biao Sun, Hanyu Zhao, Shiru Ren, Zhongzhi Luan, Xianyan Jia, Yi Liu, Yong Li, Wei Lin, Depei Qian:
EasyScale: Elastic Training with Consistent Accuracy and Improved Utilization on GPUs. 55:1-55:14
- Ziming Liu, Shenggan Cheng, Haotian Zhou, Yang You:
Hanayo: Harnessing Wave-like Pipeline Parallelism for Enhanced Large Model Training Efficiency. 56:1-56:13
- Lucas Thibaut Meyer, Marc Schouler, Robert Alexander Caulk, Alejandro Ribés, Bruno Raffin:
High Throughput Training of Deep Surrogates from Large Ensemble Runs. 57:1-57:16
Data Coordination
- Hyungro Lee, Luanzheng Guo, Meng Tang, Jesun Firoz, Nathan R. Tallent, Anthony Kougkas, Xian-He Sun:
Data Flow Lifecycles for Optimizing Workflow Coordination. 58:1-58:15
- J. Gregory Pauloski, Valérie Hayot-Sasson, Logan T. Ward, Nathaniel Hudson, Charlie Sabino, Matt Baughman, Kyle Chard, Ian T. Foster:
Accelerating Communications in Federated Applications with Transparent Object Proxies. 59:1-59:15
- Jacob Wahlgren, Gabin Schieffer, Maya B. Gokhale, Ivy Peng:
A Quantitative Approach for Adopting Disaggregated Memory in HPC Systems. 60:1-60:14
Quantum Computing
- Tirthak Patel, Daniel Silver, Devesh Tiwari:
GRAPHINE: Enhanced Neutral Atom Quantum Computing using Application-Specific Rydberg Atom Arrangement. 61:1-61:15
- Alan Robertson, Shuaiwen Song:
Mitigating Coupling Map Constrained Correlated Measurement Errors on Quantum Devices. 62:1-62:13
- Aditya Ranjan, Tirthak Patel, Harshitta Gandhi, Daniel Silver, William Cutler, Devesh Tiwari:
Experimental Evaluation of Xanadu X8 Photonic Quantum Computer: Error Measurement, Characterization and Implications. 63:1-63:13
Tensor Computation
- Martin Kong, Raneem Abu Yosef, Atanas Rountev, P. Sadayappan:
Automatic Generation of Distributed-Memory Mappings for Tensor Computations. 64:1-64:13
- Edward Hutter, Edgar Solomonik:
Application Performance Modeling via Tensor Completion. 65:1-65:14
- Maciej Besta, Pawel Renc, Robert Gerstenberger, Paolo Sylos Labini, Alexandros Nikolaos Ziogas, Tiancheng Chen, Lukas Gianinazzi, Florian Scheidl, Kalman Szenes, Armon Carigiet, Patrick Iff, Grzegorz Kwasniewski, Raghavendra Kanakagiri, Chio Ge, Sammy Jaeger, Jaroslaw Was, Flavio Vella, Torsten Hoefler:
High-Performance and Programmable Attentional Graph Neural Networks with Global Tensor Formulations. 66:1-66:16
Topics in Cloud Computing
- Yiming Li, Laiping Zhao, Yanan Yang, Wenyu Qu:
Rethinking Deployment for Serverless Functions: A Performance-First Perspective. 67:1-67:14
- Huizhong Li, Yujie Chen, Xiang Shi, Xingqiang Bai, Nan Mo, Wenlin Li, Rui Guo, Zhang Wang, Yi Sun:
FISCO-BCOS: An Enterprise-grade Permissioned Blockchain System with High-performance. 68:1-68:17
- Kaijie Fan, Marco D'Antonio, Lorenzo Carpentieri, Biagio Cosenza, Federico Ficarelli, Daniele Cesarini:
SYnergy: Fine-grained Energy-Efficient Heterogeneous Computing for Scalable Energy Saving. 69:1-69:13
Architecture-Specific Optimization
- Pengyu Wang, Weiling Yang, Jianbin Fang, Dezun Dong, Chun Huang, Peng Zhang, Tao Tang, Zheng Wang:
Optimizing Direct Convolutions on ARM Multi-Cores. 70:1-70:13
- Mikhail Isaev, Nic McDonald, Larry Dennison, Richard W. Vuduc:
Calculon: a methodology and tool for high-level co-design of systems and large language models. 71:1-71:14
- Roberto L. Castro, Andrei Ivanov, Diego Andrade, Tal Ben-Nun, Basilio B. Fraguela, Torsten Hoefler:
VENOM: A Vectorized N:M Format for Unleashing the Power of Sparse Tensor Cores. 72:1-72:14
Linear Algebra II
- Yuechen Lu, Weifeng Liu:
DASP: Specific Dense Matrix Multiply-Accumulate Units Accelerated General Sparse Matrix-Vector Multiplication. 73:1-73:14
- David E. Keyes, Hatem Ltaief, Yuji Nakatsukasa, Dalal Sukkari:
High-Performance SVD Partial Spectrum Computation. 74:1-74:12
- Linghao Song, Fan Chen, Hai Li, Yiran Chen:
ReFloat: Low-Cost Floating-Point Processing in ReRAM for Accelerating Iterative Linear Solvers. 75:1-75:15
Algorithms on GPUs
- Jingrong Zhang, Akira Naruse, Xipeng Li, Yong Wang:
Parallel Top-K Algorithms on GPU: A Comprehensive Study and New Methods. 76:1-76:13
- Alex Fallin, Andres Gonzalez, Jarim Seo, Martin Burtscher:
A High-Performance MST Implementation for GPUs. 77:1-77:13
- Junmin Xiao, Chaoyang Shui, Di Cai, Kangyu Wang, Yunfei Pang, Mingyi Li, Hui Ma, Guangming Tan:
Adaptive Workload-Balanced Scheduling Strategy for Global Ocean Data Assimilation on Massive GPUs. 78:1-78:15
Applications of Machine Learning
- Yiyuan Li, Xiting Ju, Yi Xiao, Qilong Jia, Yongxiao Zhou, Simeng Qian, Rongfen Lin, Bin Yang, Shupeng Shi, Xin Liu, Jie Gao, Zhen Wang, Sha Liu, Jian Tan, Xuan Wang, Zhengding Hu, Limin Yan, Wei Xue:
Rapid simulations of atmospheric data assimilation of hourly-scale phenomena with modern neural networks. 79:1-79:13
- Arthur Feeney, Zitong Li, Ramin Bostanabad, Aparna Chandramowlishwaran:
Breaking Boundaries: Distributed Domain Decomposition with Scalable Physics-Informed Neural PDE Solvers. 80:1-80:15
- Junqi Yin, Sajal Dash, Feiyi Wang, Mallikarjun Shankar:
FORGE: Pre-Training Open Foundation Models for Science. 81:1-81:13
Data Centers and Large Distributed Systems
- Cyrus Tanade, Emily Rakestraw, William Ladd, Erik W. Draeger, Amanda Randles:
Cloud Computing to Enable Wearable-Driven Longitudinal Hemodynamic Maps. 82:1-82:14
- Marcin Bienkowski, David Fuchssteiner, Stefan Schmid:
Optimizing Reconfigurable Optical Datacenters: The Power of Randomization. 83:1-83:11
- Theresa Pollinger, Alexander Van Craen, Christoph Niethammer, Marcel Breyer, Dirk Pflüger:
Leveraging the Compute Power of Two HPC Systems for Higher-Dimensional Grid-Based Simulations with the Widely-Distributed Sparse Grid Combination Technique. 84:1-84:14
Fault Tolerance and FPGA Codesign
- Ali Asgari Khoshouyeh, Florian Geissler, Syed Sha Qutub, Michael Paulitsch, Prashant J. Nair, Karthik Pattabiraman:
Structural Coding: A Low-Cost Scheme to Protect CNNs from Large-Granularity Memory Faults. 85:1-85:17
- Zhengyang He, Yafan Huang, Hui Xu, Dingwen Tao, Guanpeng Li:
Demystifying and Mitigating Cross-Layer Deficiencies of Soft Error Protection in Instruction Duplication. 86:1-86:13
- Wenqi Jiang, Shigang Li, Yu Zhu, Johannes de Fine Licht, Zhenhao He, Runbin Shi, Cédric Renggli, Shuai Zhang, Theodoros Rekatsinas, Torsten Hoefler, Gustavo Alonso:
Co-design Hardware and Algorithm for Vector Search. 87:1-87:15
Code Optimization
- Philipp Schaad, Timo Schneider, Tal Ben-Nun, Alexandru Calotoiu, Alexandros Nikolaos Ziogas, Torsten Hoefler:
FuzzyFlow: Leveraging Dataflow To Find and Squash Program Optimization Bugs. 88:1-88:15
- Kazem Cheshmi, Michelle Strout, Maryam Mehri Dehnavi:
Runtime Composition of Iterations for Fusing Loop-carried Sparse Dependence. 89:1-89:15
- Xin You, Hailong Yang, Kelun Lei, Zhongzhi Luan, Depei Qian:
TrivialSpy: Identifying Software Triviality via Fine-grained and Dataflow-based Value Profiling. 90:1-90:13
Graph Analytics
- Pengmiao Zhang, Rajgopal Kannan, Viktor K. Prasanna:
Phases, Modalities, Spatial and Temporal Locality: Domain Specific ML Prefetcher for Accelerating Graph Analytics. 91:1-91:15
- Yiqian Liu, Noushin Azami, Avery Vanausdal, Martin Burtscher:
Choosing the Best Parallelization and Implementation Styles for Graph Analytics Codes: Lessons Learned from 1106 Programs. 92:1-92:14
- Abdullah Al Raqibul Islam, Dong Dai:
DGAP: Efficient Dynamic Graph Analysis on Persistent Memory. 93:1-93:13
High Performance I/O
- Zanhua Huang, Kaiyuan Hou, Ankit Agrawal, Alok N. Choudhary, Robert B. Ross, Wei-Keng Liao:
I/O in WRF: A Case Study in Modern Parallel I/O Techniques. 94:1-94:13
- Ed Karrels, Lei Huang, Yuhong Kan, Ishank Arora, Yinzhi Wang, Daniel S. Katz, William Gropp, Zhao Zhang:
Fine-grained Policy-driven I/O Sharing for Burst Buffers. 95:1-95:12
- Yingjin Qian, Wen Cheng, Lingfang Zeng, Xi Li, Marc-André Vef, Andreas Dilger, Siyao Lai, Shuichi Ihara, Yong Fan, André Brinkmann:
Xfast: Extreme File Attribute Stat Acceleration for Lustre. 96:1-96:12
Molecular Dynamics Applications and Accelerators
- Jianxiong Li, Tong Zhao, Zhuoqiang Guo, Shunchen Shi, Lijun Liu, Guangming Tan, Weile Jia, Guojun Yuan, Zhan Wang:
Enhance the Strong Scaling of LAMMPS on Fugaku. 97:1-97:13
- Chunshu Wu, Tong Geng, Anqi Guo, Sahan Bandara, Pouya Haghi, Chuan Liu, Ang Li, Martin C. Herbordt:
FASDA: An FPGA-Aided, Scalable and Distributed Accelerator for Range-Limited Molecular Dynamics. 98:1-98:14
- Xiaohui Duan, Jin Wang, Ping Gao, Ming Ma, Lin Gan, Xin Liu, Haohuan Fu, Wei Xue, Dexun Chen, Guangwen Yang, Weiguo Liu:
Enabling Real World Scale Structural Superlubricity All-Atom Simulation on the Next-Generation Sunway Supercomputer. 99:1-99:14