Schedule

This schedule is tentative and subject to change.

Note: All times are shown in PDT.

Day 1

7:30 AM
8:00 AM

Breakfast

8:00 AM
8:10 AM

Opening Remarks

8:10 AM
9:10 AM

Keynote 1

  • TBD

    Irene Zhang (Microsoft Research)


9:10 AM
10:30 AM

Session 1: Systems Supporting Machine Learning I: Scheduling

  • Hops: Fine-grained heterogeneous sensing, efficient and fair Deep Learning cluster scheduling system


  • Queue Management for SLO-Oriented Large Language Model Serving


  • KALE: Elastic GPU Scheduling for Online DL Model Training


  • FedCaSe: Enhancing Federated Learning with Heterogeneity-aware Caching and Scheduling


10:30 AM
11:00 AM

Break

11:00 AM
12:30 PM

Session 2: Machine Learning Supporting Systems

  • SQLStateGuard: Statement-Level SQL Injection Defense Based on Learning-Driven Middleware


  • Vista: Machine Learning based Database Performance Troubleshooting Framework in Amazon RDS


  • Building AI Agents for Autonomous Clouds: Challenges and Design Principles


  • Zero-SAD: Zero-Shot Learning Using Synthetic Abnormal Data for Abnormal Behavior Detection on Private Cloud


  • Forecasting Algorithms for Intelligent Resource Scaling: An Experimental Analysis


12:30 PM
1:30 PM

Lunch

1:30 PM
3:30 PM

Session 3: Speed and Scale in Serverless

  • Snapipeline: Accelerating Snapshot Startup for FaaS Containers


  • En4S: Enabling SLOs in Serverless Storage Systems


  • Pre-Warming is Not Enough: Accelerating Serverless Inference With Opportunistic Pre-Loading


  • Faascale: Scaling MicroVM Vertically for Serverless Computing with Memory Elasticity


  • Rethinking Networking Stack for Serverless Environments: A Sidecar Approach


  • Process-as-a-Service: Unifying Elastic and Stateful Clouds with Serverless Processes


3:30 PM
4:00 PM

Break

4:00 PM
6:00 PM

Session 4: The Elastic Cloud

  • AutoBurst: Autoscaling Burstable Instances for Cost-effective Latency SLOs


  • Is It Time To Put Cold Starts In The Deep Freeze?


  • Towards Swap-Free, Continuous Ballooning for Fast, Cloud-Based Virtual Machine Migrations


  • PCLive: Pipelined Restoration of Application Containers for Reduced Service Downtime


  • Scheduling for Reduced Tail Task Latencies in Highly Utilized Datacenters


  • Krios: Scheduling Abstractions and Mechanisms for Enabling a LEO Compute Cloud


6:00 PM
8:00 PM

Dinner

Day 2

7:30 AM
8:00 AM

Breakfast

9:00 AM
10:15 AM

Session 1: When Things Go Wrong in the Cloud

  • Demystifying the Fight Against Complexity: A Comprehensive Study of Live Debugging Activities in Production Cloud Systems


  • Deoxys: A Causal Inference Engine for Unhealthy Node Mitigation in Large-scale Cloud Infrastructure


  • INS: Identifying and Mitigating Performance Interference in Clouds via Interference-Sensitive Paths


  • TailClipper: Reducing Tail Response Time of Distributed Services Through System-Wide Scheduling


10:15 AM
10:45 AM

Break

10:45 AM
12:15 PM

Session 2: Systems Supporting Machine Learning II

  • On-demand and Parallel Checkpoint/Restore for GPU Applications


  • Shared Mixture of Experts


  • FaPES: Enabling Efficient Elastic Scaling for Serverless Machine Learning Platforms


  • KACE: Kernel-Aware Colocation for Efficient GPU Sharing


  • Pack: Towards Communication-Efficient Homomorphic Encryption in Federated Learning


12:15 PM
1:15 PM

Lunch

1:15 PM
3:15 PM

Session 3: The Green Cloud

  • InferCool: Enhancing AI Inference Cooling through Transparent, Non-Intrusive Task Reassignment


  • CDN-Shifter: Leveraging Spatial Workload Shifting to Decarbonize Content Delivery Networks


  • Accountable Carbon Footprints and Energy Profiling For Serverless Functions


  • The Sunk Carbon Fallacy: Rethinking Carbon Footprint Metrics for Effective Carbon-Aware Scheduling


  • Exploring the Efficiency of Renewable Energy-based Modular Data Centers at Scale


  • The Hidden Carbon Footprint of Serverless Computing


3:15 PM
3:45 PM

Break

3:45 PM
5:45 PM

Session 4: The Basics

  • uIO: Lightweight and Extensible Unikernels


  • Racos: Improving Erasure Coding State Machine Replication using Leaderless Consensus


  • Occam’s Razor for Distributed Protocols


  • VWeiST: A Scalable and Efficient Proof-of-Stake Blockchain Consensus


  • Securing a Multiprocessor KVM Hypervisor with Rust


  • SURE: Secure Unikernels Make Serverless Computing Rapid and Efficient


6:00 PM
Onwards

Dinner

Day 3

7:30 AM
8:00 AM

Breakfast

8:00 AM
9:00 AM

Keynote 2

  • TBD

    Anastasia Ailamaki (EPFL)


9:00 AM
10:15 AM

Session 1: Bits on Disk

  • TianMen: a DPU-based storage network offloading structure for disaggregated datacenters


  • H2C-Dedup: Reducing I/O and GC Amplification for QLC SSDs from the Deduplication Metadata Perspective


  • RomeFS: A CXL-SSD Aware File System Exploiting Synergy of Memory-Block Dual Paths


  • SmartGraph: A Framework for Graph Processing in Computational Storage


10:15 AM
10:30 AM

Break

10:30 AM
12:30 PM

Session 2: In the Cloud

  • ConMonitor: Lightweight Container Protection with Virtualization and VM Functions


  • ByteMQ: A Cloud-native Streaming Data Layer in ByteDance


  • Dynamic Idle Resource Leasing To Safely Oversubscribe Capacity At Meta


  • Byways: High-Performance, Isolated Network Functions for Multi-Tenant Cloud Servers


  • Cloud-native Workflow Scheduling using a Hybrid Priority Rule, Dynamic Resource Allocation, and Dynamic Task Partition


  • Streamlining Cloud-Native Application Development and Deployment with Robust Encapsulation


12:30 PM
1:30 PM

Lunch

1:30 PM
3:30 PM

Session 3: Algorithms and Applications

  • Komet: A Serverless Platform for Low-Earth Orbit Edge Services


  • A Data Optimizer for Region-Aware Self-describing Files in Scientific Computing


  • Rethinking State Management in Actor Systems for Cloud-Native Applications


  • IncBoost: Scaling Incremental Graph Processing for Edge Deletions and Weight Updates


  • Memory Management in Complex Join Queries: A Re-evaluation Study


  • FAAStloop: Optimizing Loop-Based Applications for Serverless Computing


3:30 PM
4:00 PM

Break

4:00 PM
5:45 PM

Session 4: Systems Supporting Machine Learning III: Training

  • Distributed Training of Large Language Models on AWS Trainium


  • Near-Lossless Gradient Compression for Data-Parallel Distributed DNN Training


  • Accelerating Transfer Learning with Near-Data Computation on Cloud Object Stores


  • Inshrinkerator: Compressing Deep Learning Training Checkpoints via Dynamic Quantization


  • ParaGAN: A Scalable Distributed Training Framework for Generative Adversarial Networks


5:45 PM

Closing