


ASPLOS 2025: Rotterdam, The Netherlands
- Lieven Eeckhout, Georgios Smaragdakis, Kaitai Liang, Adrian Sampson, Martha A. Kim, Christopher J. Rossbach:
Proceedings of the 30th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 2, ASPLOS 2025, Rotterdam, Netherlands, 30 March 2025 - 3 April 2025. ACM 2025, ISBN 979-8-4007-1079-7
V2 Research Papers
- Jinwoo Jeong, Jeongseob Ahn:
Accelerating LLM Serving for Multi-turn Dialogues with Efficient Resource Management. 1-15
- Kevin Nam, Heon Hui Jung, Hyunyoung Oh, Yunheung Paek:
Affinity-based Optimizations for TFHE on Processing-in-DRAM. 16-31
- Bo Fu, Leo Tenenbaum, David Adler, Assaf Klein, Arpit Gogia, Alaa R. Alameldeen, Marco Guarnieri, Mark Silberstein, Oleksii Oleksenko, Gururaj Saileshwar:
AMuLeT: Automated Design-Time Testing of Secure Speculation Countermeasures. 32-47
- Abhishek Vijaya Kumar, Gianni Antichi, Rachee Singh:
Aqua: Network-Accelerated Memory Offloading for LLMs in Scale-Up GPU Domains. 48-62
- Shixin Zhao, Yuming Li, Bing Li, Yintao He, Mengdi Wang, Yinhe Han, Ying Wang:
Be CIM or Be Memory: A Dual-mode-aware DNN Compiler for CIM Accelerators. 63-78
- Shui Jiang, Yi-Hua Chung, Chih-Chun Chang, Tsung-Yi Ho, Tsung-Wei Huang:
BQSim: GPU-accelerated Batch Quantum Circuit Simulation using Decision Diagram. 79-94
- Yue Dai, Xulong Tang, Youtao Zhang:
Cascade: A Dependency-aware Efficient Training Framework for Temporal Graph Neural Network. 95-110
- Mayank Kabra, Rakesh Nadig, Harshita Gupta, Rahul Bera, Manos Frouzakis, Vamanan Arulchelvan, Yu Liang, Haiyu Mao, Mohammad Sadrosadati, Onur Mutlu:
CIPHERMATCH: Accelerating Homomorphic Encryption-Based String Matching via Memory-Efficient Data Packing and In-Flash Processing. 111-130
- Lian Liu, Long Cheng, Haimeng Ren, Zhaohui Xu, Yudong Pan, Mengdi Wang, Xiaowei Li, Yinhe Han, Ying Wang:
COMET: Towards Practical W4A4KV4 LLMs Serving. 131-146
- Qichang Liu, Yue Cheng, Haiying Shen, Ao Wang, Bharathan Balaji:
Concurrency-Informed Orchestration for Serverless Functions. 147-161
- Yongye Zhu, Boru Chen, Zirui Neil Zhao, Christopher W. Fletcher:
Controlled Preemption: Amplifying Side-Channel Attacks from Userspace. 162-177
- Jiashun Suo, Xiaojian Liao, Limin Xiao, Li Ruan, Jinquan Wang, Xiao Su, Zhisheng Huo:
CoServe: Efficient Collaboration-of-Experts (CoE) Model Inference with Limited Memory. 178-191
- Zhao Wang, Yiqi Chen, Cong Li, Yijin Guan, Dimin Niu, Tianchan Guan, Zhaoyang Du, Xingda Wei, Guangyu Sun:
CTXNL: A Software-Hardware Co-designed Solution for Efficient CXL-Based Transaction Processing. 192-209
- Chloe Alverti, Stratos Psomadakis, Burak Ocalan, Shashwat Jaiswal, Tianyin Xu, Josep Torrellas:
CXLfork: Fast Remote Fork over CXL Fabrics. 210-226
- Sourav Mohapatra, Vito Kortbeek, Marco Antonio van Eerden, Jochem Broekhoff, Saad Ahmed, Przemyslaw Pawelczak:
Data Cache for Intermittent Computing Systems with Non-Volatile Main Memory. 227-243
- Georgian-Vlad Saioc, I-Ting Angelina Lee, Anders Møller, Milind Chabbi:
Dynamic Partial Deadlock Detection and Recovery via Garbage Collection. 244-259
- Xiao Xiong, Zhaorui Chen, Yue Liang, Minghao Tian, Jiaxing Shang, Jiang Zhong, Dajiang Liu:
DynaX: Sparse Attention Acceleration with Dynamic X:M Fine-Grained Structured Pruning. 260-274
- Alexander Breuer, Mark Blacher, Max Engel, Joachim Giesen, Alexander Heinecke, Julien Klaus, Stefan Remke:
Einsum Trees: An Abstraction for Optimizing the Execution of Tensor Expressions. 275-292
- Ayatallah Elakhras, Jiahui Xu, Martin Erhart, Paolo Ienne, Lana Josipovic:
ElasticMiter: Formally Verified Dataflow Circuit Rewrites. 293-308
- Shutian Luo, Jianxiong Liao, Chenyu Lin, Huanle Xu, Zhi Zhou, Chengzhong Xu:
Embracing Imbalance: Dynamic Load Shifting among Microservice Containers in Shared Clusters. 309-324
- Jiawei Wang, Nian Liu, Arnau Casadevall-Saiz, Yutao Liu, Diogo Behrens, Ming Fu, Ning Jia, Hermann Härtig, Haibo Chen:
Enabling Efficient Mobile Tracing with BTrace. 325-338
- Harsh Desai, Xinye Wang, Brandon Lucia:
Energy-aware Scheduling and Input Buffer Overflow Prevention for Energy-harvesting Systems. 339-354
- Xinkai Wang, Xiaofeng Hou, Chao Li, Yuancheng Li, Du Liu, Guoyao Xu, Guodong Yang, Liping Zhang, Yuemin Wu, Xiaopeng Yuan, Quan Chen, Minyi Guo:
EXIST: Enabling Extremely Efficient Intra-Service Tracing Observability in Datacenters. 355-372
- Berk Aydogmus, Linsong Guo, Danial Zuberi, Tal Garfinkel, Dean M. Tullsen, Amy Ousterhout, Kazem Taram:
Extended User Interrupts (xUI): Fast and Flexible Notification without Polling. 373-389
- Shifan Xu, Alvin Lu, Yongshan Ding:
Fat-Tree QRAM: A High-Bandwidth Shared Quantum Random Access Memory for Parallel Queries. 390-406
- Jarrett Minton, Rajeev Balasubramonian:
FLEXPROF: Flexible, Side-Channel-Free Memory Access. 407-420
- Yujie Wang, Shiju Wang, Shenhan Zhu, Fangcheng Fu, Xinyi Liu, Xuefeng Xiao, Huixia Li, Jiashi Li, Faming Wu, Bin Cui:
FlexSP: Accelerating Large Language Model Training via Flexible Sequence Parallelism. 421-436
- Chengsong Tan, Alastair F. Donaldson, John Wickerson:
Formalising CXL Cache Coherence. 437-450
- Jiesong Liu, Bin Ren, Xipeng Shen:
Generalizing Reuse Patterns for Efficient DNN on Microcontrollers. 451-466
- Annus Zulfiqar, Ali Imran, Venkat Kunaparaju, Ben Pfaff, Gianni Antichi, Muhammad Shahbaz:
Gigaflow: Pipeline-Aware Sub-Traversal Caching for Modern SmartNICs. 467-481
- Rhea Dutta, Harish Dattatraya Dixit, Rik van Riel, Gautham Vunnam, Sriram Sankar:
Hardware Sentinel: Protecting Software Applications from Hardware Silent Data Corruptions. 482-497
- Luyang Li, Heng Pan, Xinchen Wan, Kai Lv, Zilong Wang, Qian Zhao, Feng Ning, Qingsong Ning, Shideng Zhang, Zhenyu Li, Layong Luo, Gaogang Xie:
Harmonia: A Unified Framework for Heterogeneous FPGA Acceleration in the Cloud. 498-514
- Samuel Alexander Stein, Shifan Xu, Andrew W. Cross, Theodore J. Yoder, Ali Javadi-Abhari, Chenxu Liu, Kun Liu, Zeyuan Zhou, Charlie Guinn, Yufei Ding, Yongshan Ding, Ang Li:
HetEC: Architectures for Heterogeneous Quantum Error Correction Codes. 515-528
- Tingji Zhang, Boris Grot, Wenjian He, Yashuai Lv, Peng Qu, Fang Su, Wenxin Wang, Guowei Zhang, Xuefeng Zhang, Youhui Zhang:
Hierarchical Prefetching: A Software-Hardware Instruction Prefetcher for Server Applications. 529-544
- Wei Chen, Zhi Zhang, Xin Zhang, Qingni Shen, Yuval Yarom, Daniel Genkin, Chen Yan, Zhe Wang:
HyperHammer: Breaking Free from KVM-Enforced Isolation. 545-559
- Chenyuan Yang, Zijie Zhao, Lingming Zhang:
KernelGPT: Enhanced Kernel Fuzzing via Large Language Models. 560-573
- Zhiyuan Fang, Yuegui Huang, Zicong Hong, Yufeng Lyu, Wuhui Chen, Yue Yu, Fan Yu, Zibin Zheng:
Klotski: Efficient Mixture-of-Expert Inference via Expert-Aware Multi-Batch Pipeline. 574-588
- Rishabh Jain, Teyuh Chou, Onur Kayiran, John Kalamatianos, Gabriel H. Loh, Mahmut T. Kandemir, Chita R. Das:
Load and MLP-Aware Thread Orchestration for Recommendation Systems Inference on CPUs. 589-603
- Yan Sun, Jongyul Kim, Zeduo Yu, Jiyuan Zhang, Siyuan Chai, Michael Jaemin Kim, Hwayong Nam, Jaehyun Park, Eojin Na, Yifan Yuan, Ren Wang, Jung Ho Ahn, Tianyin Xu, Nam Sung Kim:
M5: Mastering Page Migration and Memory Management for CXL-based Tiered Memory Systems. 604-621
- Chang Liu, Shuaihu Feng, Yuan Li, Dongsheng Wang, Wenjian He, Yongqiang Lyu, Trevor E. Carlson:
MDPeek: Breaking Balanced Branches in SGX with Memory Disambiguation Unit Side Channels. 622-638
- Yue Wu, Namitha Liyanage, Lin Zhong:
Micro Blossom: Accelerated Minimum-Weight Perfect Matching Decoding for Quantum Error Correction. 639-654
- Weilin Cai, Le Qin, Jiayi Huang:
MoC-System: Efficient Fault Tolerance for Sparse Mixture-of-Experts Model Training. 655-671
- Jianxing Xu, Yuanbo Wen, Zikang Liu, Ruibai Xu, Tingfeng Ruan, Jun Bi, Rui Zhang, Di Huang, Xinkai Song, Yifan Hao, Xing Hu, Zidong Du, Chongqing Zhao, Jiang Jie, Qi Guo:
Mosaic: Exploiting Instruction-Level Parallelism on Deep Learning Accelerators with iTex Tessellation. 672-688
- Sotiris Apostolakis, Chris Kennelly, Xinliang David Li, Parthasarathy Ranganathan:
Necro-reaper: Pruning away Dead Memory Traffic in Warehouse-Scale Computers. 689-703
- Peiqing Chen, Minghao Li, Zishen Wan, Yu-Shun Hsiao, Minlan Yu, Vijay Janapa Reddi, Zaoxing Liu:
OctoCache: Caching Voxels for Accelerating 3D Occupancy Mapping in Autonomous Systems. 704-718
- Zhanyuan Di, Leping Wang, En Shao, Zhaojia Ma, Ziyi Ren, Feng Hua, Lixian Ma, Jie Zhao, Guangming Tan, Ninghui Sun:
Optimizing Deep Learning Inference Efficiency through Block Dependency Analysis. 719-733
- Austin Ebel, Karthik Garimella, Brandon Reagen:
Orion: A Fully Homomorphic Encryption Framework for Deep Learning. 734-749
- Zhen Jin, Yiquan Chen, Mingxu Liang, Yijing Wang, Guoju Fang, Ao Zhou, Keyao Zhang, Jiexiong Xu, Wenhai Lin, Yiquan Lin, Shushu Zhao, Wenkai Shi, Zhenhua He, Shishun Cai, Wenzhi Chen:
OS2G: A High-Performance DPU Offloading Architecture for GPU-based Deep Learning with Object Storage. 750-765
- Yintao He, Haiyu Mao, Christina Giannoula, Mohammad Sadrosadati, Juan Gómez-Luna, Huawei Li, Xiaowei Li, Ying Wang, Onur Mutlu:
PAPI: Exploiting Dynamic Parallelism in Large Language Model Decoding with a Processing-In-Memory-Enabled Computing System. 766-782
- Mahyar Emami, Thomas Bourgeat, James R. Larus:
Parendi: Thousand-Way Parallel RTL Simulation. 783-797
- Ruihao Gong, Shihao Bai, Siyu Wu, Yunqian Fan, Zaijun Wang, Xiuhong Li, Hailong Yang, Xianglong Liu:
Past-Future Scheduler for LLM Serving under SLA Guarantees. 798-813
- Minkyung Park, Jaeseung Choi, Hyeonmin Lee, Ted Taekyoung Kwon:
Pave: Information Flow Control for Privacy-preserving Online Data Processing Services. 814-830
- Jubayer Mahmod, Matthew Hicks:
PhasePrint: Exposing Cloud FPGA Fingerprints by Inducing Timing Faults at Runtime. 831-844
- Jiajun Qin, Tianhua Xia, Cheng Tan, Jeff Zhang, Sai Qian Zhang:
PICACHU: Plug-In CGRA Handling Upcoming Nonlinear Operations in LLMs. 845-861
- Yufeng Gu, Alireza Khadem, Sumanth Umesh, Ning Liang, Xavier Servot, Onur Mutlu, Ravi R. Iyer, Reetuparna Das:
PIM Is All You Need: A CXL-Enabled GPU-Free System for Large Language Model Inference. 862-881
- Yingtian Zhang, Yan Kang, Ziyu Ying, Wanhang Lu, Sijie Lan, Huijuan Xu, Kiwan Maeng, Anand Sivasubramaniam, Mahmut T. Kandemir, Chita R. Das:
Pirate: No Compromise Low-Bandwidth VR Streaming for Edge Devices. 882-896
- Aditya K. Kamath, Ramya Prabhu, Jayashree Mohan, Simon Peter, Ramachandran Ramjee, Ashish Panwar:
POD-Attention: Unlocking Full Prefill-Decode Overlap for Faster LLM Inference. 897-912
- Jinyu Liu, Wenjie Xiong, G. Edward Suh, Kiwan Maeng:
Practical Federated Recommendation Model Learning Using ORAM with Controlled Privacy. 913-932
- Santiago Arranz Olmos, Gilles Barthe, Chitchanok Chuengsatiansup, Benjamin Grégoire, Vincent Laporte, Tiago Oliveira, Peter Schwabe, Yuval Yarom, Zhiyuan Zhang:
Protecting Cryptographic Code Against Spectre-RSB: (and, in Fact, All Known Spectre Variants). 933-948
- Liang Qiao, Jun Shi, Xiaoyu Hao, Xi Fang, Sen Zhang, Minfan Zhao, Ziqi Zhu, Junshi Chen, Hong An, Xulong Tang, Bing Li, Honghui Yuan, Xinyang Wang:
Pruner: A Draft-then-Verify Exploration Mechanism to Accelerate Tensor Program Tuning. 949-965
- Pingshi Yu, Nicolas Wu, Alastair F. Donaldson:
Ratte: Fuzzing for Miscompilations in Multi-Level Compilers Using Composable Semantics. 966-981
- Zishen Wan, Yuhang Du, Mohamed Ibrahim, Jiayi Qian, Jason Jabbour, Yang (Katie) Zhao, Tushar Krishna, Arijit Raychowdhury, Vijay Janapa Reddi:
ReCA: Integrated Acceleration for Real-Time and Efficient Cooperative Embodied Autonomous Agents. 982-997
- Ruihang Lai, Junru Shao, Siyuan Feng, Steven Lyubomirsky, Bohan Hou, Wuwei Lin, Zihao Ye, Hongyi Jin, Yuchen Jin, Jiawei Liu, Lesheng Jin, Yaxing Cai, Ziheng Jiang, Yong Wu, Sunghyun Park, Prakalp Srivastava, Jared Roesch, Todd C. Mowry, Tianqi Chen:
Relax: Composable Abstractions for End-to-End Dynamic Machine Learning. 998-1013
- Li-Chung Chiang, Shih-Wei Li:
Reload+Reload: Exploiting Cache and Memory Contention Side Channel on AMD SEV. 1014-1027
- Sayam Sethi, Jonathan Mark Baker:
RESCQ: Realtime Scheduling for Continuous Angle Quantum Error Correction Architectures. 1028-1043
- Tommy McMichen, David Dlott, Panitan Wongse-ammat, Nathan Greiner, Hussain Khajanchi, Russ Joseph, Simone Campanoni:
Saving Energy with Per-Variable Bitwidth Speculation. 1044-1059
- Lorenz Hetterich, Fabian Thomas, Lukas Gerlach, Ruiyi Zhang, Nils Bernsdorf, Eduard Ebert, Michael Schwarz:
ShadowLoad: Injecting State into Hardware Prefetchers. 1060-1075
- Hao Huang, Yanqi Pan, Wen Xia, Xiangyu Zou, Darong Yang, Liang Shi, Hongwei Du:
Simplifying and Accelerating NOR Flash I/O Stack for RAM-Restricted Microcontrollers. 1076-1090
- Chrysanthos Pepi, Bhargav Reddy Godala, Krishnam Tibrewala, Gino A. Chacon, Paul V. Gratz, Daniel A. Jiménez, Gilles A. Pokam, David I. August:
Skia: Exposing Shadow Branches. 1091-1106
- Seonghun Son, Daniel Moghimi, Berk Gülmezoglu:
SMaCk: Efficient Instruction Cache Attacks via Self-Modifying Code Conflicts. 1107-1123
- Sishuai Gong, Wang Rui, Deniz Altinbüken, Pedro Fonseca, Petros Maniatis:
Snowplow: Effective Kernel Fuzzing with a Learned White-box Test Mutator. 1124-1138
- Yujie Wang, Shenhan Zhu, Fangcheng Fu, Xupeng Miao, Jie Zhang, Juan Zhu, Fan Hong, Yong Li, Bin Cui:
Spindle: Efficient Distributed Training of Multi-Task Large Models via Wavefront Scheduling. 1139-1155
- Yuhang Zhou, Zhibin Wang, Guyue Liu, Shipeng Li, Xi Lin, Zibo Wang, Yongzhong Wang, Fuchun Wei, Jingyi Zhang, Zhiheng Hu, Yanlin Liu, Chunsheng Li, Ziyang Zhang, Yaoyuan Wang, Bin Zhou, Wanchun Dou, Guihai Chen, Chen Tian:
Squeezing Operator Performance Potential for the Ascend Architecture. 1156-1171
- Tong Xing, Cong Xiong, Tianrui Wei, April Sanchez, Binoy Ravindran, Jonathan Balkind, Antonio Barbalace:
Stramash: A Fused-Kernel Operating System For Cache-Coherent, Heterogeneous-ISA Platforms. 1172-1188
- Yu Feng, Zheng Liu, Weikai Lin, Zihan Liu, Jingwen Leng, Minyi Guo, Zhezhi He, Jieru Zhao, Yuhao Zhu:
StreamGrid: Streaming Point Cloud Analytics via Compulsory Splitting and Deterministic Termination. 1189-1202
- Jinshu Liu, Hamid Hadian, Yuyue Wang, Daniel S. Berger, Marie Nguyen, Xun Jian, Sam H. Noh, Huaicheng Li:
Systematic CXL Memory Characterization and Performance Analysis at Scale. 1203-1217
- Chris Porter, Sharjeel Khan, Kangqi Ni, Santosh Pande:
Tackling ML-based Dynamic Mispredictions using Statically Computed Invariants for Attack Surface Reduction. 1218-1234
- Lei Cui, Youquan Xian, Peng Liu, Longjin Lu:
TaintEMU: Decoupling Tracking from Functional Domains for Architecture-Agnostic and Efficient Whole-System Taint Tracking. 1235-1250
- Dezhi Ran, Zihe Song, Wenyu Wang, Wei Yang, Tao Xie:
TaOPT: Tool-Agnostic Optimization of Parallelized Automated Mobile UI Testing. 1251-1265
- Jovan Stojkovic, Chaojie Zhang, Íñigo Goiri, Esha Choukse, Haoran Qiu, Rodrigo Fonseca, Josep Torrellas, Ricardo Bianchini:
TAPAS: Thermal- and Power-Aware Scheduling for LLM Inference in Cloud Platforms. 1266-1281
- Dimitra Giantsidi, Julian Pritzi, Felix Gust, Antonios Katsarakis, Atsushi Koshiba, Pramod Bhatotia:
TNIC: A Trusted NIC Architecture: A hardware-network substrate for building high-performance trustworthy distributed systems. 1282-1301
- Xin Tan, Yimin Jiang, Yitao Yang, Hong Xu:
Towards End-to-End Optimization of LLM-based Applications with Ayo. 1302-1316
- Hyungseok Kim, Soomin Kim, Sang Kil Cha:
Towards Sound Reassembly of Modern x86-64 Binaries. 1317-1333
- Yuan-Hsi Chou, Tor M. Aamodt:
Treelet Accelerated Ray Tracing on GPUs. 1334-1347
- Apoorve Mohan, Robert Walkup, Bengi Karacali, Ming-Hung Chen, Abdullah Kayi, Liran Schour, Shweta Salaria, Sophia Wen, I-Hsin Chung, Abdul Alim, Constantinos Evangelinos, Lixiang Luo, Marc Dombrowa, Laurent Schares, Ali Sydney, Pavlos Maniotis, Sandhya Koteshwara, Brent Tang, Joel Belog, Rei Odaira, Vasily Tarasov, Eran Gampel, Drew Thorstensen, Talia Gershon, Seetharami Seelam:
Vela: A Virtualized LLM Training System with GPU Direct RoCE. 1348-1364
- Reto Achermann, Em Chu, Ryan Mehri, Ilias Karimalis, Margo I. Seltzer:
Velosiraptor: Code Synthesis for Memory Translation. 1365-1381
- Hansung Kim, Ruohan Richard Yan, Joshua You, Tieliang Vamber Yang, Yakun Sophia Shao:
Virgo: Cluster-level Matrix Unit Integration in GPUs for Scalability and Energy Efficiency. 1382-1399
- Konstantinos Kanellopoulos, Konstantinos Sgouras, F. Nisa Bostanci, Andreas Kosmas Kakolyris, Berkin Kerim Konar, Rahul Bera, Mohammad Sadrosadati, Rakesh Kumar, Nandita Vijaykumar, Onur Mutlu:
Virtuoso: Enabling Fast and Accurate Virtual Memory Research via an Imitation-based Operating System Simulation Methodology. 1400-1421
