Accepted Papers
The following papers have been accepted for presentation at the 31st ACM SIGOPS Symposium on Operating Systems Principles (SOSP).
- Rearchitecting the Thread Model of In-Memory Key-Value Stores with μTPS
Youmin Chen (Shanghai Jiao Tong University), Jiwu Shu (Tsinghua University), Yanyan Shen, Linpeng Huang, Hong Mei (Shanghai Jiao Tong University)
- Device-Assisted Live Migration of RDMA Devices
Artem Y. Polyakov, Gal Shalom, Asaf Schwartz, Aviad Yehezkel, Omri Ben David, Omri Kahalon, Ariel Shahar, Liran Liss (NVIDIA Corporation)
- Prove It to the Kernel: Precise Extension Analysis via Proof-Guided Abstraction Refinement
Hao Sun, Zhendong Su (ETH Zurich)
- eBPF Misbehavior Detection: Fuzzing with a Specification-Based Oracle
Tao Lyu, Kumar Kartikeya Dwivedi, Thomas Bourgeat, Mathias Payer (EPFL), Meng Xu (University of Waterloo), Sanidhya Kashyap (EPFL)
- Mercury: Unlocking Multi-GPU Operator Optimization for LLMs via Remote Memory Scheduling
Yue Guan, Xinwei Qiang, Zaifeng Pan (UCSD), Daniels Johnson, Yuanwei Fang (Meta), Keren Zhou (George Mason University, OpenAI), Yuke Wang (Rice University), Wanlu Li (UCSD), Yufei Ding (UCSD, Meta), Adnan Aziz (Meta)
- Pesto: Cooking up High Performance BFT Queries
Florian Suri-Payer (Cornell University), Neil Giridharan (UC Berkeley), Liam Arzola (UC San Diego), Shir Cohen, Lorenzo Alvisi (Cornell University), Natacha Crooks (UC Berkeley)
- How to Copy Memory? Coordinated Asynchronous Copy as a First-Class OS Service
Jingkai He, Yunpeng Dong, Dong Du (Shanghai Jiao Tong University), Mo Zou, Zhitai Yu, Yuxin Ren, Ning Jia (Huawei Technologies), Yubin Xia, Haibo Chen (Shanghai Jiao Tong University)
- Demeter: A Scalable and Elastic Tiered Memory Solution for Virtualized Cloud via Guest Delegation
Junliang Hu, Zhisheng Hu (The Chinese University of Hong Kong), Chun-Feng Wu (National Yang Ming Chiao Tung University), Ming-Chang Yang (The Chinese University of Hong Kong)
- Moirai: Optimizing Placement of Data and Compute in Hybrid Clouds
Ziyue Qiu, Hojin Park (Carnegie Mellon University), Jing Zhao, Yu-Kai Wang, Arnav Balyan, Gurmeet Singh, Yangjun Zhang, Suqiang (Jack) Song (Uber), Gregory R. Ganger, George Amvrosiadis (Carnegie Mellon University)
- Unlocking True Elasticity for the Cloud-Native Era with Dandelion
Tom Kuchler, Pinghe Li, Yazhuo Zhang, Lazar Cvetković, Boris Goranov, Tobias Stocker, Leon Thomm, Simone Kalbermatter, Tim Notter (ETH Zurich), Andrea Lattuada (MPI-SWS), Ana Klimovic (ETH Zurich)
- Sleeping with One Eye Open: Fast, Sustainable Storage with Sandman
Yanbo Zhou (UC San Diego), Erci Xu (Shanghai Jiao Tong University), Anisa Su, Jim Harris, Adam Manzanares (Samsung Semiconductor), Steven Swanson (UC San Diego)
- Spirit: Fair Allocation of Interdependent Resources in Remote Memory Systems
Seung-seob Lee, Jachym Putta (Yale University), Ziming Mao (UC Berkeley), Anurag Khandelwal (Yale University)
- HedraRAG: Co-Optimizing Generation and Retrieval for Heterogeneous RAG Workflows
Zhengding Hu, Vibha Murthy, Zaifeng Pan, Wanlu Li (University of California San Diego), Xiaoyi Fang (RegAilator Inc), Yufei Ding (University of California San Diego), Yuke Wang (Rice University)
- Scalable Address Spaces using Concurrent Interval Skiplist
Tae Woo Kim, Youngjin Kwon (KAIST), Jeehoon Kang (KAIST / FuriosaAI)
- Characterizing Mobile SoC for Accelerating Heterogeneous LLM Inference
Le Chen (Shanghai Jiao Tong University), Dahu Feng (Tsinghua University), Erhu Feng (Shanghai Jiao Tong University), Yingrui Wang (SenseTime), Rong Zhao (Tsinghua University), Yubin Xia (Shanghai Jiao Tong University), Pinjie Xu (SenseTime Research), Haibo Chen (Shanghai Jiao Tong University)
- μFork: Supporting POSIX fork Within a Single-Address-Space OS
John Alistair Kressel (The University of Manchester), Hugo Lefeuvre (The University of British Columbia), Pierre Olivier (The University of Manchester)
- DiffKV: Differentiated Memory Management for Large Language Models with Parallel KV Compaction
Yanqi Zhang, Yuwei Hu, Runyuan Zhao (Huawei), John C.S. Lui (The Chinese University of Hong Kong), Haibo Chen (Shanghai Jiao Tong University)
- Proto: A Guided Journey through Modern OS Construction
Wonkyo Choe, Rongxiang Wang, Afsara Benazir, Felix Xiaozhu Lin (University of Virginia)
- Oasis: Pooling PCIe Devices Over CXL to Boost Utilization
Yuhong Zhong (Columbia University), Daniel S. Berger (Microsoft Azure and University of Washington), Pantea Zardoshti, Enrique Saurez (Microsoft Azure), Jacob Nelson, Dan R. K. Ports (Microsoft Research), Antonis Psistakis (University of Illinois Urbana-Champaign), Joshua Fried (MIT CSAIL), Asaf Cidon (Columbia University)
- PhoenixOS: Concurrent OS-level GPU Checkpoint and Restore with Validated Speculation
Xingda Wei (Institute of Parallel and Distributed Systems, Shanghai Jiao Tong University), Zhuobin Huang (National University of Singapore), Tianle Sun, Yingyi Hao, Rong Chen, Mingcong Han, Jinyu Gu, Haibo Chen (Institute of Parallel and Distributed Systems, Shanghai Jiao Tong University)
- Pie: A Programmable Serving System for Emerging LLM Applications
In Gim, Zhiyao Ma, Seung-seob Lee, Lin Zhong (Yale University)
- Aegaeon: Effective GPU Pooling for Concurrent LLM Serving on the Market
Yuxing Xiang (Peking University), Xue Li, Kun Qian, Yufan Yang, Diwen Zhu, Wenyuan Yu, Ennan Zhai (Alibaba Group), Xuanzhe Liu, Xin Jin (Peking University), Jingren Zhou (Alibaba Group)
- Aeolia: A Fast and Secure Userspace Interrupt-Based Storage Stack
Chuandong Li (Peking University and Zhongguancun Laboratory), Ran Yi (Peking University), Zonghao Zhang (Peking University and Zhongguancun Laboratory), Jing Liu (Microsoft Research), Changwoo Min (Igalia), Jie Zhang, Yingwei Luo, Xiaolin Wang (Peking University and Zhongguancun Laboratory), Zhenlin Wang (Michigan Tech), Diyu Zhou (Peking University)
- Ghost in the Android Shell: Pragmatic Test-oracle Specification of a Production Hypervisor
Kayvan Memarian, Ben Simner, David Kaloper-Meršinjak, Thibaut Pérami, Peter Sewell (University of Cambridge)
- LithOS: An Operating System for Efficient Machine Learning on GPUs
Patrick H. Coppock, Brian Zhang, Eliot H. Solomon, Vasileios Kypriotis (Carnegie Mellon University), Leon Yang, Bikash Sharma, Dan Schatzberg (Meta), Todd C. Mowry, Dimitrios Skarlatos (Carnegie Mellon University)
- WASIT: Deep and Continuous Differential Testing of WebAssembly System Interface Implementations
Yage Hu, Wen Zhang, Botang Xiao, Qingchen Kong, Boyang Yi, Suxin Ji, Songlan Wang, Wenwen Wang (University of Georgia)
- cache_ext: Customizing the Page Cache with eBPF
Tal Zussman, Ioannis Zarkadas, Jeremy Carin, Andrew Cheng (Columbia University), Hubertus Franke, Jonas Pfefferle (IBM Research), Asaf Cidon (Columbia University)
- Atmosphere: Practical Verified Kernels with Rust and Verus
Xiangdong Chen, Zhaofeng Li, Jerry Zhang (University of Utah), Vikram Narayanan (Palo Alto Networks), Anton Burtsev (University of Utah)
- AutoMan: Facilitating Verified Distributed Systems Development Through Automatic Code Generation and Manual Optimizations
Zihao Zhang, Ti Zhou, Christa Jenkins, Omar Chowdhury, Shuai Mu (Stony Brook University)
- Jenga: Effective Memory Management for Serving LLM with Heterogeneity
Chen Zhang (Tsinghua University & UC Berkeley), Kuntai Du (University of Chicago), Shu Liu, Woosuk Kwon, Xiangxi Mo (UC Berkeley), Yufeng Wang (Independent Researcher), Xiaoxuan Liu (UC Berkeley), Kaichao You (Tsinghua University), Zhuohan Li (UC Berkeley), Mingsheng Long, Jidong Zhai (Tsinghua University), Joseph Gonzalez, Ion Stoica (UC Berkeley)
- Mantle: Efficient Hierarchical Metadata Management for Cloud Object Storage Services
Jiahao Li (University of Science and Technology of China, Baidu (China) Co., Ltd), Biao Cao, Jielong Jian (Baidu (China) Co., Ltd), Cheng Li (The University of Science and Technology of China, Institute of Artificial Intelligence, Hefei Comprehensive National Science Center), Sen Han, Yiduo Wang, Yufei Wu (University of Science and Technology of China), Kang Chen (Tsinghua University), Zhihui Yin, Qiushi Chen, Jiwei Xiong, Jie Zhao, Fengyuan Liu, Yan Xing, Liguo Duan, Miao Yu, Ran Zheng (Baidu (China) Co., Ltd), Feng Wu (University of Science and Technology of China, Institute of Artificial Intelligence, Hefei Comprehensive National Science Center), Xianjun Meng (Baidu (China) Co., Ltd)
- Fast End-to-End Performance Simulation of Accelerated Hardware-Software Stacks
Jiacheng Ma (EPFL), Jonas Kaufmann (Max Planck Institute for Software Systems (MPI-SWS)), Emilien Guandalino (EPFL), Rishabh Iyer (UC Berkeley), Thomas Bourgeat, George Candea (EPFL)
- Analyzing and Enhancing ArckFS: An Anecdotal Example of Benefits of Artifact Evaluation
Jonguk Jeon, Subeen Park (KAIST), Sanidhya Kashyap (EPFL), Sudarsun Kannan (Rutgers University), Diyu Zhou (Peking University), Jeehoon Kang (KAIST / FuriosaAI)
- The Design and Implementation of a Virtual Firmware Monitor
Charly Castes (EPFL), François Costa (ETH Zurich), Neelu S. Kalani (EPFL), Timothy Roscoe (ETH Zurich), Nate Foster (Cornell and Jane Street), Thomas Bourgeat, Edouard Bugnion (EPFL)
- KNighter: Transforming Static Analysis with LLM-Synthesized Checkers
Chenyuan Yang, Zijie Zhao (University of Illinois at Urbana-Champaign), Zichen Xie (Zhejiang University), Haoyu Li (Shanghai Jiao Tong University), Lingming Zhang (University of Illinois at Urbana-Champaign)
- Tock: From Research To Securing 10 Million Computers
Leon Schuermann (Princeton University), Brad Campbell (University of Virginia), Branden Ghena (Northwestern University), Philip Levis (Stanford University), Amit Levy (Princeton University), Pat Pannuto (University of California, San Diego)
- IC-Cache: Efficient Large Language Model Serving via In-context Caching
Yifan Yu (University of Illinois Urbana-Champaign), Yu Gan, Nikhil Sarda, Lillian Tsai, Jiaming Shen, Yanqi Zhou (Google), Arvind Krishnamurthy (Google/Univ. of Washington), Fan Lai (University of Illinois Urbana-Champaign), Hank Levy (Google/Univ. of Washington), David Culler (Google)
- Quilt: Resource-aware Merging of Serverless Workflows
Yuxuan Zhang, Sebastian Angel (University of Pennsylvania)
- Running Consistent Applications Closer to Users with Radical for Lower Latency
Nicolaas Kaashoek (Princeton University), Oleg A. Golev (Sentient Foundation), Austin T. Li (Cornell University), Amit Levy, Wyatt Lloyd (Princeton University)
- Orthrus: Efficient and Timely Detection of Silent User Data Corruption in the Cloud with Resource-Adaptive Computation Validation
Chenxiao Liu (University of Chinese Academy of Sciences), Zhenting Zhu (UCLA), Quanxi Li, Yanwen Xia (University of Chinese Academy of Sciences), Yifan Qiao (UC Berkeley), Xiangyun Deng (Peking University), Youyou Lu (Tsinghua University), Tao Xie (Peking University), Huimin Cui, Zidong Du (University of Chinese Academy of Sciences), Harry Xu (UCLA), Chenxi Wang (University of Chinese Academy of Sciences)
- ORQ: Complex Analytics on Private Data with Strong Security Guarantees
Eli Baum, Sam Buxbaum (Boston University), Nitin Mathai (The University of Texas at Austin), Muhammad Faisal, Vasiliki Kalavri, Mayank Varia, John Liagouris (Boston University)
- PrefillOnly: An Inference Engine for Prefill-only Workloads in Large Language Model Applications
Kuntai Du (University of Chicago / TensorMesh, Inc.), Bowen Wang, Chen Zhang (Tsinghua University / UC Berkeley), Yiming Cheng (University of Chicago), Qing Lan, Hejian Sang (LinkedIn), Yihua Cheng, Jiayi Yao (University of Chicago / TensorMesh, Inc.), Xiaoxuan Liu, Yifan Qiao, Ion Stoica (UC Berkeley), Junchen Jiang (University of Chicago / TensorMesh, Inc.)
- Mitigating Application Resource Overload with Targeted Task Cancellation
Yigong Hu (Boston University), Zeyin Zhang (Johns Hopkins University), Yicheng Liu (University of Michigan & University of California, Los Angeles), Yile Gu (University of Washington), Shuangyu Lei (University of Michigan), Baris Kasikci (University of Washington), Peng Huang (University of Michigan)
- CortenMM: Efficient Memory Management with Strong Correctness Guarantees
Junyang Zhang (Peking University and Zhongguancun Laboratory), Xiangcan Xu, Yonghao Zou (Peking University), Zhe Tang (Peking University and Zhongguancun Laboratory), Xinyi Wan (Ant Group), Kang Hu, Siyuan Wang, Wenbo Xu, Di Wang (Peking University and Zhongguancun Laboratory), Hao Chen (CertiK), Lin Huang, Shoumeng Yan (Ant Group), Yuval Tamir (UCLA), Yingwei Luo, Xiaolin Wang, Huashan Yu (Peking University and Zhongguancun Laboratory), Zhenlin Wang (Michigan Tech), Hongliang Tian (Ant Group), Diyu Zhou (Peking University)
- TRIP: Coercion-resistant Registration for E-Voting with Verifiability and Usability in Votegral
Louis-Henri Merino (EPFL), Simone Colombo (King's College London), Rene Reyes (Boston University), Alaleh Azhir (Harvard University), Shailesh Mishra, Pasindu Tennage (EPFL), Mohammad Amin Raeisi (Yale University), Haoqian Zhang, Jeff R. Allen (EPFL), Bernhard Tellenbach (Armasuisse), Vero Estrada-Galiñanes, Bryan Ford (EPFL)
- Robust LLM Training Infrastructure at ByteDance
Borui Wan (The University of Hong Kong), Gaohong Liu, Zuquan Song, Jun Wang, Yun Zhang (ByteDance Seed), Guangming Sheng (The University of Hong Kong), Shuguang Wang, Houmin Wei, Chenyuan Wang, Weiqiang Lou, Xi Yang, Mofan Zhang, Kaihua Jiang, Cheng Ren, Xiaoyun Zhi, Menghan Yu, Zhuolin Zheng, Zhe Nan, Baoquan Zhong, Qinlong Wang, Huan Yu, Jinxin Chi, Wang Zhang, Yuhan Li, Zixian Du, Sida Zhao, Yongqiang Zhang, Jingzhe Tang, Zherui Liu (ByteDance Seed), Chuan Wu (The University of Hong Kong), Yanghua Peng, Haibin Lin, Wencong Xiao, Xin Liu, Liang Xiang (ByteDance Seed)
- Sailor: Automating Distributed Training over Dynamic, Heterogeneous, and Geo-distributed Clusters
Foteini Strati, Zhendong Zhang, George Manos (ETH Zurich), Ixeia Sánchez Périz (unaffiliated), Qinghao Hu (MIT), Tiancheng Chen (ETH Zurich), Berk Buzcu (HES-SO), Song Han (MIT), Pamela Delgado (HES-SO), Ana Klimovic (ETH Zurich)
- Tempo: Compiled Dynamic Deep Learning with Symbolic Dependence Graphs
Pedro F. Silvestre, Peter Pietzuch (Imperial College London)
- Fawkes: Finding Data Durability Bugs in DBMSs via Recovered Data State Verification
Zhiyong Wu (Tsinghua University), Jie Liang (Beihang University), Jingzhou Fu, Wenqian Deng, Yu Jiang (Tsinghua University)
- Scalable Far Memory: Balancing Faults and Evictions
Yueyang Pan (EPFL), Yash Lala (Yale University), Musa Unal, Yujie Ren (EPFL), Seung-seob Lee, Abhishek Bhattacharjee, Anurag Khandelwal (Yale University), Sanidhya Kashyap (EPFL)
- KTransformers: Unleashing the Full Potential of CPU/GPU Hybrid Inference for MoE Models
Hongtao Chen, Weiyu Xie, Boxin Zhang (Tsinghua University), Jingqi Tang (Approaching.AI), Jiahao Wang (Approaching.AI, Hangzhou Dianzi University), Jianwei Dong, Shaoyuan Chen (Tsinghua University), Ziwei Yuan (Approaching.AI, University of Electronic Science and Technology of China), Chen Lin, Chengyu Qiu, Yuening Zhu (Tsinghua University), Qingliang Ou (Approaching.AI, Beijing University of Posts and Telecommunications), Jiaqi Liao (Approaching.AI, Beijing Institute of Technology), Xianglin Chen, Zhiyuan Ai (Approaching.AI), Yongwei Wu, Mingxing Zhang (Tsinghua University)
- CHERIoT RTOS: An OS for Fine-Grained Memory-Safe Compartments on Low-Cost Embedded Devices
Saar Amar (Apple), Tony Chen (Microsoft), David Chisnall, Nathaniel Wesley Filardo (SCI Semiconductor), Ben Laurie (Google), Hugo Lefeuvre (The University of British Columbia), Kunyan Liu (Microsoft), Simon W. Moore (University of Cambridge), Robert Norton-Wright (SCI Semiconductor), Margo Seltzer (The University of British Columbia), Yucong Tao (Microsoft), Robert N. M. Watson (University of Cambridge), Hongyan Xia (ARM Ltd.)
- Coyote v2: Raising the Level of Abstraction for Data Center FPGAs
Benjamin Ramhorst (ETH Zurich), Dario Korolija (AMD Research), Maximilian Jakob Heer, Jonas Dann, Luhao Liu, Gustavo Alonso (ETH Zurich)
- COpter: Efficient Large-Scale Resource-Allocation via Continual Optimization
Suhas Jayaram Subramanya (Microsoft), Don Kurian Dennis (Meta), Virginia Smith, Gregory R. Ganger (Carnegie Mellon University)
- SAND: A New Programming Abstraction for Video-based Deep Learning
Juncheol Ye, Seungkook Lee, Hwijoon Lim (KAIST), JiHyuk Lee (Chung-Ang University), Uitaek Hong, Youngjin Kwon, Dongsu Han (KAIST)
- Mycroft: Tracing Dependencies in Collective Communication Towards Reliable LLM Training
Yangtao Deng (The Chinese University of Hong Kong), Lei Zhang (ByteDance), Qinlong Wang, Xiaoyun Zhi (ByteDance Seed), Xinlei Zhang, Zhuo Jiang, Haohan Xu, Lei Wang (ByteDance), Zuquan Song, Gaohong Liu (ByteDance Seed), Yang Bai (ByteDance), Shuguang Wang, Wencong Xiao (ByteDance Seed), Jianxi Ye (ByteDance), Minlan Yu (Harvard University), Hong Xu (The Chinese University of Hong Kong)
- DCP: Addressing Input Dynamism In Long-Context Training via Dynamic Context Parallelism
Chenyu Jiang (The University of Hong Kong), Zhenkun Cai (Amazon Web Services, Inc.), Ye Tian (The University of Hong Kong), Zhen Jia, Yida Wang (Amazon Web Services, Inc.), Chuan Wu (The University of Hong Kong)
- TrainVerify: Equivalence-Based Verification for Distributed LLM Training
Yunchi Lu (University of Michigan), Youshan Miao (Microsoft Research), Cheng Tan (Northeastern University), Peng Huang (University of Michigan), Yi Zhu, Xian Zhang, Fan Yang (Microsoft Research)
- Tai Chi: A General High-Efficiency Scheduling Framework for SmartNICs in Hyperscale Clouds
Bang Di, Yun Xu, Kaijie Guo, Yibin Shen, Yu Li, Sanchuan Cheng, Hao Zheng, Fudong Qiu (Alibaba Cloud), Xiaokang Hu (Alibaba Group), Naixuan Guan, Dongdong Huang, Jinhu Li, Yi Wang, Yifang Yang, Jintao Li, Hang Yang, Chen Liang (Alibaba Cloud), Yilong Lv, Zikang Chen, Zhenwei Lu, Xiaohan Ma, Jiesheng Wu (Alibaba Group)
- FlexGuard: Fast Mutual Exclusion Independent of Subscription
Victor Laforet (Inria), Sanidhya Kashyap (EPFL), Călin Iorgulescu (Oracle Labs), Julia Lawall, Jean-Pierre Lozi (Inria)
- Loom: Efficient Capture and Querying of High-Frequency Telemetry
Franco Solleza (Brown University), Shihang Li (University of Washington), William Sun, Richard Tang, Malte Schwarzkopf (Brown University), Andrew Crotty (Northwestern University), David Cohen (Intel), Nesime Tatbul (Intel Labs and MIT), Stan Zdonik (Brown University)
- Tiga: Accelerating Geo-Distributed Transactions with Synchronized Clocks
Jinkun Geng (Stanford University), Shuai Mu (Stony Brook University), Anirudh Sivaraman (New York University), Balaji Prabhakar (Stanford University)
- METIS: Fast Quality-Aware RAG Systems with Configuration Adaptation
Siddhant Ray (University of Chicago), Rui Pan (Princeton University), Zhuohan Gu (University of Chicago), Kuntai Du (University of Chicago / TensorMesh, Inc.), Shaoting Feng (University of Chicago), Ganesh Ananthanarayanan (Microsoft), Ravi Netravali (Princeton University), Junchen Jiang (University of Chicago / TensorMesh, Inc.)
- TickTock: Verified Isolation in a Production Embedded OS
Vivien Rindisbacher, Evan Johnson, Nico Lehmann, Tyler Potyondy, Pat Pannuto, Stefan Savage, Deian Stefan, Ranjit Jhala (University of California, San Diego)
- Managing Scalable Direct Storage Accesses for GPUs with GoFS
Shaobo Li, Yirui Eric Zhou, Yuqi Xue, Yuan Xu, Jian Huang (University of Illinois Urbana-Champaign)
- Optimistic Recovery for High-Availability Software via Partial Process State Preservation
Yuzhuo Jing, Yuqi Mai, Angting Cai, Yi Chen, Wanning He, Xiaoyang Qian, Peter M. Chen, Peng Huang (University of Michigan)