Schedule

This schedule is tentative and subject to change.

Note: all times here are displayed in PDT

Wednesday, November 20, 2024

7:30 AM
8:00 AM

Breakfast

8:00 AM
8:10 AM

Opening Remarks

8:10 AM
9:10 AM

Keynote 1

  • Demikernel and the Future of Datacenter Operating Systems

    Irene Zhang (Microsoft Research)


9:10 AM
10:30 AM

Session 1: Systems Supporting Machine Learning I: Scheduling

  • Hops: Fine-grained heterogeneous sensing, efficient and fair Deep Learning cluster scheduling system

    Qinghe Wang (School of Artificial Intelligence, Anhui University), Futian Wang (Anhui University), Xinwei Zheng (Institute of Artificial Intelligence, Hefei Comprehensive National Science Center)


  • Queue Management for SLO-Oriented Large Language Model Serving

    Archit Patke, Dhemath Reddy (University of Illinois at Urbana-Champaign), Saurabh Jha (IBM Research), Haoran Qiu (University of Illinois at Urbana-Champaign), Christian Pinto (IBM Research Europe), Chandra Narayanaswami (IBM Research), Zbigniew Kalbarczyk, Ravishankar K. Iyer (University of Illinois at Urbana-Champaign)


  • KALE: Elastic GPU Scheduling for Online DL Model Training

    Ziyang Liu, Renyu Yang (Beihang University), Jin Ouyang (Kuaishou Inc.), Weihan Jiang, Tianyu Ye, Menghao Zhang (Beihang University), Sui Huang, Jiaming Huang, Chengru Song, Di Zhang (Kuaishou Inc.), Tianyu Wo, Chunming Hu (Beihang University)


  • FedCaSe: Enhancing Federated Learning with Heterogeneity-aware Caching and Scheduling

    Redwan Ibne Seraj Khan (Virginia Tech), Arnab K. Paul (BITS Pilani, KK Birla Goa Campus, India), Xun (Steve) Jian (Virginia Tech), Yue Cheng (University of Virginia), Ali R. Butt (Virginia Tech)


10:30 AM
11:00 AM

Break

11:00 AM
12:30 PM

Session 2: Machine Learning Supporting Systems

  • SQLStateGuard: Statement-Level SQL Injection Defense Based on Learning-Driven Middleware

    Xin Liu, Yuanyuan Huang, Tianyi Wang (Lanzhou University), Song Li (Zhejiang University), Weina Niu (University of Electronic Science and Technology of China), Jun Shen (University of Wollongong), Qingguo Zhou (Lanzhou University), Xiaokang Zhou (Kansai University)


  • Vista: Machine Learning based Database Performance Troubleshooting Framework in Amazon RDS

    Vikramank Singh (Amazon), Zhao Song (affiliation not listed), Balakrishnan (Murali) Narayanaswamy, Kapil Eknath Vaidya, Tim Kraska (Amazon)


  • Building AI Agents for Autonomous Clouds: Challenges and Design Principles

    Manish Shetty (University of California, Berkeley and Microsoft), Yinfang Chen (University of Illinois Urbana-Champaign and Microsoft), Gagan Somashekar, Minghua Ma (Microsoft), Yogesh Simmhan (Indian Institute of Science and Microsoft), Xuchao Zhang (Microsoft), Jonathan Mace (Microsoft Research), Dax Vandevoorde (Agnes Scott College and Microsoft Research), Pedro Las-Casas (Microsoft Research), Shachee Mishra Gupta (Microsoft), Suman Nath (Microsoft Research), Chetan Bansal, Saravan Rajmohan


  • Zero-SAD: Zero-Shot Learning Using Synthetic Abnormal Data for Abnormal Behavior Detection on Private Cloud

    Jae-Seok Kim, Seon-Jin Hwang, Joonho Seo, Jinmyeong Shin, Yoon-Ho Choi (Pusan National University)


  • Forecasting Algorithms for Intelligent Resource Scaling: An Experimental Analysis

    Yanlei Diao, Dominik Horn (Amazon), Andreas Kipf (Technische Universität Nürnberg), Oleksandr Shchur, Ines Benito, Wenjian Dong, Davide Pagano, Pascal Pfiel, Vikram Nathan, Balakrishnan Narayanaswamy, Tim Kraska (Amazon)


12:30 PM
1:30 PM

Lunch

1:30 PM
3:30 PM

Session 3: Speed and Scale in Serverless

  • Snapipeline: Accelerating Snapshot Startup for FaaS Containers

    Yuqiao Lan (Institute of Computing Technology, Chinese Academy of Sciences; University of Chinese Academy of Sciences), Xiaohui Peng (Institute of Computing Technology, Chinese Academy of Sciences; University of Chinese Academy of Sciences), Yifan Wang (Institute of Computing Technology, Chinese Academy of Sciences; University of Chinese Academy of Sciences)


  • En4S: Enabling SLOs in Serverless Storage Systems

    Minghao Xie, Chen Qian, Heiner Litz (University of California Santa Cruz)


  • Pre-Warming is Not Enough: Accelerating Serverless Inference With Opportunistic Pre-Loading

    Yifan Sui (Shanghai Jiao Tong University), Hanfei Yu (Stevens Institute of Technology), Yitao Hu (Tianjin University), Jianxun Li (Shanghai Jiao Tong University), Hao Wang (Stevens Institute of Technology)


  • Faascale: Scaling MicroVM Vertically for Serverless Computing with Memory Elasticity

    Xinmin Zhang, Qiang He, Hao Fan, Song Wu (Huazhong University of Science and Technology)


  • Rethinking the Networking Stack for Serverless Environments: A Sidecar Approach

    Vishwanath Seshagiri (Emory University), Abhinav Gupta, Vahab Jabrayilov (Columbia University), Avani Wildani (Emory University and Cloudflare), Kostis Kaffes (Columbia University)


  • Process-as-a-Service: Unifying Elastic and Stateful Clouds with Serverless Processes

    Marcin Copik, Alexandru Calotoiu (ETH Zurich, Switzerland), Gyorgy Rethy (Oracle Labs, Switzerland), Roman Böhringer (OpenCore GmbH, Switzerland), Rodrigo Bruno (INESC-ID, Instituto Superior Técnico, Universidade de Lisboa, Portugal), Torsten Hoefler (ETH Zurich, Switzerland)


3:30 PM
4:00 PM

Break

4:00 PM
6:00 PM

Session 4: The Elastic Cloud

  • AutoBurst: Autoscaling Burstable Instances for Cost-effective Latency SLOs

    Rubaba Hasan, Timothy Zhu, Bhuvan Urgaonkar (The Pennsylvania State University)


  • Is It Time To Put Cold Starts In The Deep Freeze?

    Carlos Segarra, Ivan Durev, Peter Pietzuch (Imperial College London)


  • Towards Swap-Free, Continuous Ballooning for Fast, Cloud-Based Virtual Machine Migrations

    Kevin Alarcón Negy, Tycho Nightingale, Hakim Weatherspoon, Zhiming Shen (Exostellar, Inc.)


  • PCLive: Pipelined Restoration of Application Containers for Reduced Service Downtime

    Shiv Bhushan Tripathi, Debadatta Mishra (Indian Institute of Technology Kanpur, India)


  • Scheduling for Reduced Tail Task Latencies in Highly Utilized Datacenters

    Smita Vijayakumar, Anil Madhavapeddy, Evangelia Kalyvianaki (University of Cambridge)


  • Krios: Scheduling Abstractions and Mechanisms for Enabling a LEO Compute Cloud

    Vaibhav Bhosale, Ada Gavrilovska, Ketan Bhardwaj (Georgia Institute of Technology)


Thursday, November 21, 2024

7:30 AM
8:00 AM

Breakfast

8:00 AM
9:00 AM

Women in Systems Meetup

Please join us during breakfast for a Women in Systems Meetup.

9:00 AM
10:15 AM

Session 1: When Things Go Wrong in the Cloud

  • Demystifying the Fight Against Complexity: A Comprehensive Study of Live Debugging Activities in Production Cloud Systems

    P. C. Sruthi, Zinan Guo, Deming Chu (Purdue University), Zhengyan Chen (University of Georgia), Yongle Zhang (Purdue University)


  • Deoxys: A Causal Inference Engine for Unhealthy Node Mitigation in Large-scale Cloud Infrastructure

    Chaoyun Zhang (Microsoft), Randolph Yao (Microsoft Azure), Si Qin (Microsoft Research), Ze Li, Shekhar Agrawal (Microsoft), Binit Mishra, Tri Tran (Microsoft Azure), Minghua Ma (Microsoft Research), Qingwei Lin (Microsoft Research, Beijing, China), Murali Chintalapati (Microsoft), Dongmei Zhang (Microsoft Research)


  • INS: Identifying and Mitigating Performance Interference in Clouds via Interference-Sensitive Paths

    Ziwei Huang, Mengyao Xie, Shibo Tang (State Key Lab of Processors, Institute of Computing Technology, CAS; University of Chinese Academy of Sciences), Zihao Chang (State Key Lab of Processors, Institute of Computing Technology, CAS; University of Chinese Academy of Sciences; Zhongguancun Laboratory), Zhicheng Yao, Yungang Bao (State Key Lab of Processors, Institute of Computing Technology, CAS; University of Chinese Academy of Sciences), Sa Wang (State Key Lab of Processors, Institute of Computing Technology, CAS; University of Chinese Academy of Sciences; Zhongguancun Laboratory)


  • TailClipper: Reducing Tail Response Time of Distributed Services Through System-Wide Scheduling

    Nathan Ng (University of Massachusetts Amherst), Abel Souza (University of California Santa Cruz), Ahmed Ali-Eldin (Chalmers University of Technology), David Irwin, Don Towsley, Prashant Shenoy (University of Massachusetts Amherst)


10:15 AM
10:45 AM

Break

10:45 AM
12:15 PM

Session 2: Systems Supporting Machine Learning II

  • On-demand and Parallel Checkpoint/Restore for GPU Applications

    Yanning Yang, Dong Du (Shanghai Jiao Tong University), Haitao Song (Shanghai Artificial Intelligence Research Institute), Yubin Xia (Shanghai Jiao Tong University)


  • Shared Mixture of Experts

    Umesh Deshpande, Travis Janssen, Mudhakar Srivatsa, Swaminathan Sundararaman (IBM Research)


  • FaPES: Enabling Efficient Elastic Scaling for Serverless Machine Learning Platforms

    Xiaoyang Zhao (The University of Hong Kong), Siran Yang, Jiamang Wang, Lansong Diao, Lin Qu (Alibaba Group), Chuan Wu (The University of Hong Kong)


  • KACE: Kernel-Aware Colocation for Efficient GPU Sharing

    Bing-Shiun Han, Tathagata Paul, Zhenhua Liu, Anshul Gandhi (Stony Brook University)


  • Pack: Towards Communication-Efficient Homomorphic Encryption in Federated Learning

    Zeyuan Zuo (University of Hong Kong), Ningxin Su, Baochun Li (University of Toronto), Teng Zhang (University of Hong Kong)


12:15 PM
1:15 PM

Lunch

1:15 PM
3:15 PM

Session 3: The Green Cloud

  • InferCool: Enhancing AI Inference Cooling through Transparent, Non-Intrusive Task Reassignment

    Qiangyu Pei (Huazhong University of Science and Technology), Lin Wang (Paderborn University), Dong Zhang, Bingheng Yan (Inspur Data Co., Ltd.), Chen Yu (Huazhong University of Science and Technology), Fangming Liu (Huazhong University of Science and Technology & Peng Cheng Laboratory)


  • CDN-Shifter: Leveraging Spatial Workload Shifting to Decarbonize Content Delivery Networks

    Jorge Murillo, Walid A. Hanafy, David Irwin (University of Massachusetts Amherst), Ramesh Sitaraman (University of Massachusetts Amherst and Akamai Tech), Prashant Shenoy (University of Massachusetts Amherst)


  • Accountable Carbon Footprints and Energy Profiling For Serverless Functions

    Prateek Sharma, Alexander Fuerst (Indiana University Bloomington)


  • The Sunk Carbon Fallacy: Rethinking Carbon Footprint Metrics for Effective Carbon-Aware Scheduling

    Noman Bashir, Varun Gohil (MIT), Mohammad Shahrad (University of British Columbia), David Irwin (University of Massachusetts Amherst), Anagha Belavadi Subramanya, Elsa Olivetti, Christina Delimitrou (MIT)


  • Exploring the Efficiency of Renewable Energy-based Modular Data Centers at Scale

    Jinghan Sun, Zibo Gong (UIUC), Anup Agarwal (CMU), Shadi Noghabi, Ranveer Chandra (Microsoft Research), Marc Snir, Jian Huang (UIUC)


  • The Hidden Carbon Footprint of Serverless Computing

    Rohan Basu Roy (Northeastern University), Raghavendra Kanakagiri (IIT Tirupati), Yankai Jiang, Devesh Tiwari (Northeastern University)


3:15 PM
3:45 PM

Break

3:45 PM
5:45 PM

Session 4: The Basics

  • uIO: Lightweight and Extensible Unikernels

    Masanori Misono, Peter Okelmann, Charalampos Mainas, Pramod Bhatotia (TU Munich)


  • Racos: Improving Erasure Coding State Machine Replication using Leaderless Consensus

    Jonathan Zarnstorff, Lucas Lebow (Unaffiliated), Christopher Siems, Dillon Remuck, Colin Ruiz (Clark University), Lewis Tseng (UMass Lowell)


  • Occam’s Razor for Distributed Protocols

    Ziliang Lai, Fan Cui (The Chinese University of Hong Kong), Hua Fan (Alibaba Cloud), Eric Lo (The Chinese University of Hong Kong), Wenchao Zhou, Feifei Li (Alibaba Cloud)


  • VWeiST: A Scalable and Efficient Proof-of-Stake Blockchain Consensus

    Hang Xiong, Cheng Qu, Jing Li (University of Science and Technology of China)


  • Securing a Multiprocessor KVM Hypervisor with Rust

    Yu-Hsun Chiang, Wei-Lin Chang, Shih-Wei Li, Jan-Ting Du (National Taiwan University)


  • SURE: Secure Unikernels Make Serverless Computing Rapid and Efficient

    Federico Parola (Politecnico di Torino), Shixiong Qi, Anvaya Bheemanakone Narappa, K. K. Ramakrishnan (University of California, Riverside), Fulvio Risso (Politecnico di Torino)


Friday, November 22, 2024

7:30 AM
8:00 AM

Breakfast

8:00 AM
9:00 AM

Keynote 2

  • The New Memory Wall and how it changes database system design

    Anastasia Ailamaki (EPFL)


9:00 AM
10:15 AM

Session 1: Bits on Disk

  • TianMen: a DPU-based storage network offloading structure for disaggregated datacenters

    Weiyue Zhao (State Key Laboratory of Processors, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China; University of Chinese Academy of Sciences, Beijing, China), Jingya Wu, Wenyan Lu, Xiaowei Li, Guihai Yan (State Key Laboratory of Processors, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China)


  • H2C-Dedup: Reducing I/O and GC Amplification for QLC SSDs from the Deduplication Metadata Perspective

    Yunsheng Dong, Boju Chen, Yanqi Pan, Xiangyu Zou, Wen Xia (Harbin Institute of Technology, Shenzhen)


  • RomeFS: A CXL-SSD Aware File System Exploiting Synergy of Memory-Block Dual Paths

    Yekang Zhan, Haichuan Hu, Xiangrui Yang, Shaohua Wang, Qiang Cao (Huazhong University of Science and Technology), Hong Jiang (UT Arlington), Jie Yao (Huazhong University of Science and Technology)


  • SmartGraph: A Framework for Graph Processing in Computational Storage

    Soheil Khadirsharbiyani (Pennsylvania State University), Nima Elyasi (Meta), Armin Haj Aboutalebi (Nvidia), Changho Choi (Samsung Semiconductor Inc.), Chun-Yi Liu (The Pennsylvania State University), Mahmut Kandemir (Pennsylvania State University)


10:15 AM
10:30 AM

Break

10:30 AM
12:30 PM

Session 2: In the Cloud

  • ConMonitor: Lightweight Container Protection with Virtualization and VM Functions

    Shaowen Xu (Institute of Information Engineering, Chinese Academy of Sciences; School of Cyber Security, University of Chinese Academy of Sciences), Qihang Zhou (Institute of Information Engineering, Chinese Academy of Sciences), Zhicong Zhang, Xiaoqi Jia (Institute of Information Engineering, Chinese Academy of Sciences; School of Cyber Security, University of Chinese Academy of Sciences), Donglin Liu (Sinochem Energy-Tech Co., Ltd.), Heqing Huang, Haichao Du, Zhenyu Song (Institute of Information Engineering, Chinese Academy of Sciences)


  • ByteMQ: A Cloud-native Streaming Data Layer in ByteDance

    Yancan Mao, Ruohang Yin (National University of Singapore), Liyuan Lei, Peng Ye, Shengfu Zou, Shizheng Tang, Yunzhe Guo, Ye Yuan, Xiaochen Yu, Bo Wan, Yunfei Gong, Changli Gao, Guanghui Zhang, Jian Shen, Rui Shi (ByteDance Inc.), Richard T. B. Ma (National University of Singapore)


  • Dynamic Idle Resource Leasing To Safely Oversubscribe Capacity At Meta

    Nishant Gupta, Iyswarya Narayanan, Shivam Handa, Sayak Chakraborti, Pankit Thapar, Baohua Shan, Ariel Rao, Yuanlai Liu, Pengyuan Wang, Yuqing Wu, Qingyi Gao, Chris Chao-Chun Cheng, Sihan You, Louis Huang, Jingyuan Fan, Kenny Yu, Kevin Lin, Tengfei Mu, Parth Malani, Haiying Wang, Trey Lu, Peter Zhang (Meta)


  • Byways: High-Performance, Isolated Network Functions for Multi-Tenant Cloud Servers

    Xinyu Han, Yuan Gao, Gabriel Parmer, Timothy Wood (George Washington University)


  • Cloud-native Workflow Scheduling using a Hybrid Priority Rule, Dynamic Resource Allocation, and Dynamic Task Partition

    Jungeun Shin (University of Illinois at Urbana-Champaign), Diana Arroyo, Asser Tantawi, Chen Wang, Alaa Youssef (IBM Research), Rakesh Nagi (Singapore University of Technology and Design)


  • Streamlining Cloud-Native Application Development and Deployment with Robust Encapsulation

    Pawissanutt Lertpongrujikorn (University of North Texas), Hai Duc Nguyen (Argonne National Laboratory and University of Chicago), Mohsen Amini Salehi (University of North Texas)


12:30 PM
1:30 PM

Lunch

1:30 PM
3:30 PM

Session 3: Algorithms and Applications

  • Komet: A Serverless Platform for Low-Earth Orbit Edge Services

    Tobias Pfandzelter, David Bermbach (Technische Universität Berlin & Einstein Center Digital Future)


  • A Data Optimizer for Region-Aware Self-describing Files in Scientific Computing

    Yanjie Song, Tianyuan Wu, Yuanhao Li, Guancheng Li, Yuchen Liu, Shu Yin (ShanghaiTech University), Wei Xue (Tsinghua University), Junchao Wang (China Meteorological Administration)


  • Rethinking State Management in Actor Systems for Cloud-Native Applications

    Yijian Liu, Rodrigo Nunes Laigner, Yongluan Zhou (University of Copenhagen)


  • IncBoost: Scaling Incremental Graph Processing for Edge Deletions and Weight Updates

    Xizhe Yin, Zhijia Zhao, Rajiv Gupta (University of California, Riverside)


  • Memory Management in Complex Join Queries: A Re-evaluation Study

    Shiva Jahangiri (Santa Clara University), Michael Carey (University of California, Irvine), Johann-Christoph Freytag (Humboldt-Universität zu Berlin)


  • FAAStloop: Optimizing Loop-Based Applications for Serverless Computing

    Shruti Mohanty, Vivek M. Bhasi, Myungjun Son, Mahmut Kandemir, Chita Das (Pennsylvania State University)


3:30 PM
4:00 PM

Break

4:00 PM
5:45 PM

Session 4: Systems Supporting Machine Learning III: Training

  • Distributed Training of Large Language Models on AWS Trainium

    Xinwei Fu, Yida Wang, Zhen Zhang, Ron Diamant, Randy Huang, Rahul Solanki, Fei Wu, Mohammad El-Shabani, Haozheng Fan, Guangtai Huang (Amazon Web Services)


  • Near-Lossless Gradient Compression for Data-Parallel Distributed DNN Training

    Xue Li (Alibaba Group), Cheng Guo (Tsinghua University), Kun Qian (Alibaba Group), Menghao Zhang, Mengyu Yang (Unaffiliated), Mingwei Xu (Tsinghua University)


  • Accelerating Transfer Learning with Near-Data Computation on Cloud Object Stores

    Diana Petrescu, Arsany Guirguis (EPFL), Do Le Quoc, Javier Picorel (Huawei Munich Research Center), Rachid Guerraoui (EPFL), Florin Dinu (Huawei Munich Research Center)


  • Inshrinkerator: Compressing Deep Learning Training Checkpoints via Dynamic Quantization

    Amey Agrawal (Georgia Institute of Technology), Sameer Reddy (Cisco Inc.), Satwik Bhattamishra (University of Oxford), Venkata Prabhakara Sarath Nookala (Meta Inc.), Vidushi Vashishth (Google Inc.), Kexin Rong, Alexey Tumanov (Georgia Institute of Technology)


  • ParaGAN: A Scalable Distributed Training Framework for Generative Adversarial Networks

    Ziji Shi, Jialin Li, Yang You (National University of Singapore)


5:45 PM

Closing