SoCC 2025: Virtual Event, USA
- Proceedings of the 2025 ACM Symposium on Cloud Computing, SoCC 2025, Online, USA, November 19-21, 2025. ACM 2025, ISBN 979-8-4007-2276-9

- Yanying Lin, Shuaipeng Wu, Shutian Luo, Hong Xu, Haiying Shen, Chong Ma, Min Shen, Le Chen, Chengzhong Xu, Lin Qu, Kejiang Ye: Understanding Diffusion Model Serving in Production: A Top-Down Analysis of Workload, Scheduling, and Resource Efficiency. 1-15
- Haoyu Li, Jingkai Fu, Qing Li, Windsor Hsu, Asaf Cidon: DFUSE: Strongly Consistent Write-Back Kernel Caching for Distributed Userspace File Systems. 16-28
- Shamiek Mangipudi, Pavel Chuprikov, Gerald Prendi, Patrick Eugster: Confidential Analytics with Scylla. 29-44
- Bicheng Yang, Jingkai He, Dong Du, Yubin Xia, Haibo Chen: Offloading Cloud-Native Infrastructure with XpuPod. 45-58
- Steven W. D. Chien, Kento Sato, Artur Podobas, Niclas Jansson, Stefano Markidis, Michio Honda: ParaLog: Consistent Host-side Logging for Parallel Checkpoints. 59-73
- Tong Xing, Jiaxun Yang, Javier Picorel, Antonio Barbalace: Rethinking Tiered Memory Management in Cloud Data Centers. 74-87
- Ruihao Li, Shagnik Pal, Vineeth Narayan Pullu, Prasoon Sinha, Jeeho Ryoo, Lizy K. John, Neeraja J. Yadwadkar: Oneiros: KV Cache Optimization through Parameter Remapping for Multi-tenant LLM Serving. 88-101
- Amit Samanta, Yankai Jiang, Ryan Stutsman, Rohan Basu Roy: Water Footprint of Datacenter Applications: Methodological Implications of Manufacturing, Operational, and Decommissioning Phases. 102-110
- Shiwei Zhang, Lansong Diao, Zisheng Meng, Siyu Wang, Wei Lin, Chuan Wu: DyOrc: Efficient Serving of Dynamic Machine Learning Workflows. 111-124
- Chirag C. Shetty, Sarthak Chakraborty, Hubertus Franke, Larisa Shwartz, Chandra Narayanaswami, Indranil Gupta, Saurabh Jha: CPU-Limits kill Performance: Time to rethink Resource Control. 125-133
- Liuzixuan Lin, Andrew A. Chien: Middlebox: Unlocking Datacenter Growth and Grid Decarbonization. 134-148
- Wenda Tang, Yanan Yang, Jie Wu: Metis: A Non-Clairvoyant, Workflow-Aware OS Scheduler for Serverless Applications. 149-162
- David Li, Angela Li: Cloud-Native Digital Twin Orchestration for Real-Time Decision Optimization Using Fuzzy Constraints and Reinforcement Learning. 163-169
- Xiaojun Guo, Guangjie Xing, Hua Wang, Ke Zhou, Ming Xie, Fenqiang Yang, Min Fu, Bin Xu, Jianying Hu, Guangchao Yang: DuoAdmit: Dual-Layer Cache Admission for Load-Balancing Hybrid-Redundancy Block Storage. 170-182
- Rohan Gandhi, Ankur Mallick: PnM: Efficient Intra-Datacenter Calls Packing for Large Conferencing Services. 183-195
- Maziyar Nazari, Daniel Noland, Giulio Sidoretti, Erika Hunhoff, Tamara Silbergleit Lehman, Eric Keller: THORN-ML: Transparent Hardware Offloaded Resilient Networks for RDMA based Distributed ML Workloads. 196-208
- Atsushi Koshiba, Charalampos Mainas, Pramod Bhatotia: Funky: Cloud-Native FPGA Virtualization and Orchestration. 209-224
- Rui Wei, Hanfei Yu, Xikang Song, Jian Li, Devesh Tiwari, Ying Mao, Hao Wang: Multi-Agent Reinforcement Learning with Serverless Computing. 225-239
- Haoxuan Yu, Sheng Yao, Wei Wang: ZipBatch: Multi-Tenant GPU Batching with Dual-Resource Regulation. 240-254
- Yaoxuan Li, Pu Pang, Yecheng Yang, Quan Chen, Zhengxuan Yan, Guoyao Xu, Guodong Yang, Liping Zhang, Minyi Guo: WDP: Mitigating Interference in CPU Sharing Through Wake-up Delay Driven Preemption for QoS-aware Co-location. 255-268
- Ruizhe Huang, Xinyu Wang, Zhida An, Hanwen Lei, Peng Jiang, Ziqi Zhang, Ding Li, Yao Guo, Xiangqun Chen, Yuntao Liu, Kang Zhou, Yuxin Ren, Ning Jia, Xinwei Hu: Cost-Efficient Cloud Infrastructure with Hugepage-aware Memory Deduplication. 269-282
- Ioannis Zarkadas, Amanda Tomlinson, Asaf Cidon, Baris Kasikci, Ofir Weisse: Snap & Replay: A new way to analyze uarch-scale performance bottlenecks for ML accelerators. 283-298
- Lei Liu, Yinling Zhang: DRAM Failure Prediction with Correctable Error Spatial Patterns: A Hybrid Learning Approach. 299-306
- Mincheol Sung, Ruslan Nikolaev, Binoy Ravindran: Scalable and Fault-Tolerant Storage and File System Services with Non-Blocking Synchronization for Private Clouds. 307-319
- Sabiha Afroz, Redwan Ibne Seraj Khan, Hadeel Albahar, Jingoo Han, Ali Raza Butt: 10Cache: Heterogeneous Resource-Aware Tensor Caching and Migration for LLM Training. 320-333
- Wenhao Lv, Hao Guo, Qing Wang, Youyou Lu, Jiwu Shu: Accelerating Distributed Filesystem Metadata Service via Decoupling Directory Semantics from Metadata Indexing. 334-347
- Yi Hua, Xiulong Liu, Hao Xu, Chenyu Zhang, Gaowei Shi, Keqiu Li, Muhammad Shahzad, Guyue (Grace) Liu: Orcas: A DAG-based Consensus Approach with Linear Communication Overhead. 348-360
- Kaiyu Huang, Hao Wu, Zhubo Shi, Han Zou, Minchen Yu, Qingjiang Shi: AdaSpec: Adaptive Speculative Decoding for Fast, SLO-Aware Large Language Model Serving. 361-374
- Bing Li, Yuquan Ren, Xinyi Song, Zhilei Liu, Cong Xu, Jingyuan Zhang, Caixue Lin, Wu Xiang, Rui Shi: From Bottleneck to Breakthrough: Optimizing Scheduling for Hyperscale Containerized Clusters. 375-387
- Amit Samanta, Ryan Stutsman, Rohan Basu Roy: GridGreen: Integrating Serverless Computing in HPC Systems for Performance and Sustainability. 388-401
- Qingfu Wu, Pengfei Chen, Yilun Wang: Defragmentation Scheduling with Deep Reinforcement Learning in Shared GPU Clusters. 402-415
- Li Wu, Walid A. Hanafy, Tarek F. Abdelzaher, David E. Irwin, Jesse Milzman, Prashant J. Shenoy: FailLite: Failure-Resilient Model Serving for Resource-Constrained Edge Environments. 416-429
- Ranjitha K., Ankit Sharma, Malsawmsanga Sailo, Arun Siddardha, Amrit Kumar, Praveen Tammana, Pravein Govindan Kannan, Priyanka Naik: PerfMon: Performance Monitoring of Host Network Stack. 430-442
- Iraklis Psaroudakis, Pooya Salehi, Jason Bryan, Francisco Fernández Castaño, Brendan Cully, Ankita Kumar, Henning Andersen, Thomas Repantis: Serverless Elasticsearch: the Architecture Transformation from Stateful to Stateless. 443-455
- Lucas Lebow, Mason Dunkle, Christopher Siems, Jonathan Zarnstorff, Lewis Tseng: Revisiting State Machine Replication in Practice: Lessons from Building an etcd-inspired System. 456-463
- Gaulthier Gain, Benoit Knott, Cyril Soldani, Laurent Mathy: Memory Matters: Load-Time Deduplication for Unikernels. 464-478
- Hongyu Lei, Shiyu Di, Chunhua Li, Ke Zhou, Ming Xie, Fenqiang Yang, Jianping Zhu, Xiang Li, Kezhou Yan: CoRe: Collaborative Replica Scheduling for Large-Scale Cloud Database Services. 479-492
- Shuang Zeng, Haitao Zhang, Zezhong Yan: CoMPI: Coordinated Model Merging and Parallel Inference at Edge. 493-506
- Rajini Wijayawardana, Andrew A. Chien: Scheduling Cloud VMs on Variable Capacity Datacenters. 507-520
- Yuanhang Chen, Xiaosong Chen, Wenyan Chen, Huanle Xu: FedDance: Efficient Participant Selection for Federated Learning in Highly Dynamic Environments. 521-534
- Yazhuo Zhang, Jinqing Cai, Avani Wildani, Ana Klimovic: Rethinking Web Cache Design for the AI Era. 535-542
- Devashish R. Purandare, Peter Alvaro, Avani Wildani, Darrell D. E. Long, Ethan L. Miller: Valet: Efficient Data Placement on Modern SSDs. 543-556
- Shruti Dongare, Redwan Ibne Seraj Khan, Hadeel Albahar, Nannan Zhao, Diego Meléndez-Maita, Ali Reza Butt: Hybrid Learning and Optimization-Based Dynamic Scheduling for DL Workloads on Heterogeneous GPU Clusters. 557-570
- Debojeet Das, Kevin Prafull Baua, Aditya Kansara, Arghyadip Chakraborty, Dheeraj Kurukunda, Mythili Vutukuru, Purushottam Kulkarni: FLASH: Fast Linked AF_XDP Sockets for High Performance Network Function Chains. 571-584
- Nenad Milosevic, Robert Soulé, Fernando Pedone: The case for synchronous distributed protocols in public clouds. 585-599
- Sudipta Saha Shubha, Haiying Shen, Ganesh Ananthanarayanan: CIS: Checkpointed Inference for Data Drift-Resilient Model Serving at Edge Servers. 600-613
- Huaifeng Zhang, Mohannad Alhanahnah, Philipp Leitner, Ahmed Ali-Eldin: BLAFS: A Bloat-Aware Container File System. 614-628
- Yineng Yan, William Ruys, Hochan Lee, Ian Henriksen, Arthur Peters, Sean Stephens, Bozhi You, Henrique Fingler, Martin Burtscher, Milos Gligoric, Keshav Pingali, Mattan Erez, George Biros, Christopher J. Rossbach: VLCs: Managing Parallelism with Virtualized Libraries. 629-643
- Serhii Ivanenko, Vasyl Lanko, Rudi Horn, Vojin Jovanovic, Rodrigo Bruno: Hydra: Virtualized Multi-Language Runtime for High-Density Serverless Platforms. 644-658
- Davide Rovelli, Christian Faerber, Graham McKenzie, Ali Pahlevan, Sina Darabi, Patrick Jahnke, Patrick Eugster: Nano-consensus: Ultra-fast, Quorum-less Coordination on the Wire. 659-672
- Ning Li, Hong Jiang, Hao Che, Zhijun Wang: REEF: Energy-Efficient, Application-QoS-Aware Thread Processing in Oversubscribed Server Environments. 673-686
- Paul Elvinger, Foteini Strati, Natalie Enright Jerger, Ana Klimovic: Understanding GPU Resource Interference One Level Deeper. 687-694
- Hyeon-Jun Jang, Sang-Jae Kim, Weikuan Yu, Hyun-Wook Jin: Spatio-Temporal Resource Control for Cloud-Native GPU Provisioning. 695-707
- Shuwen Sun, Isaac Khor, Ji-Yong Shin, Peter Desnoyers: A Fast, Efficient, and Strongly-Consistent Object Store. 708-721
- Yuzi Li, Zhigang Wang, Qinghua Zhang, Junfeng Zhao: FedLTA: A Federated Long-Tail Alignment Framework via Global Class Anchors. 722-734
- Tao Luo, Kelvin K. W. Ng, Zhen Ping Khor, Sidharth Sankhe, Boon Thau Loo, Vincent Liu: Multiplexed Heterogeneous LLM Serving via Stage-Aligned Parallelism. 735-747
- Joel Wolfrath, Daniel Frink, Abhishek Chandra: SneakPeek: Data-Aware Model Selection and Scheduling for Inference Serving on the Edge. 748-761
- Talha Mehboob, Luanzheng Guo, Nathan R. Tallent, Michael Zink, David Irwin: PowerTrip: Exploiting Federated Heterogeneous Datacenter Power for Distributed ML Training. 762-775
- Saransh Gupta, Umesh Deshpande, Travis Janssen, Swaminathan Sundararaman: Symbiosis: Multi-Adapter Inference and Fine-Tuning. 776-789
- Romolo Marotta, Gabriele Russo Russo, Francesco Quaglia, Pierangelo di Sanzo: A Bootstrapping Technique for Reducing the Costs of Machine Learning Models for Predicting Execution Times in IaaS Clouds. 790-802
- Yuzhuo Yang, Kaihua Fu, Quan Chen, Deze Zeng, Shuo Quan, Jie Wu, Minyi Guo: FaaSGNN: Enabling Memory Efficient and Low Latency GNN Inference Services with Serverless Computing. 803-816
- Haoran Qiu, Anish Biswas, Zihan Zhao, Jayashree Mohan, Alind Khare, Esha Choukse, Íñigo Goiri, Zeyu Zhang, Haiying Shen, Chetan Bansal, Ramachandran Ramjee, Rodrigo Fonseca: ModServe: Modality- and Stage-Aware Resource Disaggregation for Scalable Multimodal Model Serving. 817-830
- Prasoon Sinha, Kostis Kaffes, Neeraja J. Yadwadkar: ALAP: Intent-Based Serverless Computing via Delayed Decision-Making. 831-846
- Yuzheng Zhang, Renyu Yang, Junhong Liu, Weihan Jiang, Tianyu Ye, Yiqiao Liao, Penghao Zhang, Tiezi Zhang, Kun Shang, Tianyu Wo, Chunming Hu, Chengru Song, Jin Ouyang: Cuckoo: Deadline-Aware Job Packing on Heterogeneous GPUs for DL Model Training. 847-859
- Lazar Cvetkovic, Ana Klimovic: Towards a Lightweight Sidecar-based Service Mesh for Serverless. 860-866
- Davis Kazemaks, Laurens Versluis, Burcu Kulahcioglu Ozkan, Jérémie Decouchant: Balancing Fairness and Performance in Multi-User Spark Workloads with Dynamic Scheduling. 867-880
- Yihui Zhang, Han Shen, Renyu Yang, Di Tian, Yuxi Luo, Menghao Zhang, Li Li, Chunming Hu, Tianyu Wo, Chengru Song, Jin Ouyang: Cauchy: A Cost-Efficient LLM Serving System through Adaptive Heterogeneous Deployment. 881-893
