Fresh Thinking Talk Speakers

Kostis Kaffes

Enabling Customizable and Performant Systems for Cloud Computing

Abstract

As the cloud grows and the post-Moore's-Law era sets in, it is imperative to keep scaling the performance and efficiency of modern systems. Achieving this requires operating system (OS) specialization, as the one-size-fits-all approach to fundamental OS operations, such as scheduling, is incompatible with today's diverse application landscape. In this talk, I will first present Syrup, a framework that lets everyday application developers easily specify custom scheduling policies and safely deploy them across different layers of the stack on top of existing operating systems like Linux, bringing the benefits of specialized scheduling to everyone. Then, I will discuss DBOS, a proposal for a radical redesign of the OS stack. DBOS argues for storing all system state in a transactional database. That way, fundamental operations such as scheduling and inter-process communication can be implemented in a few lines of SQL code, with access to dramatically better analytics and provenance information than existing systems provide.
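
To give a flavor of the DBOS approach, here is a minimal sketch of a scheduler written as one SQL statement. This is illustrative only, not code from the DBOS prototype; the tasks and workers tables and their columns are assumed for the example. It implements FIFO scheduling by assigning the oldest pending task to the least-loaded worker:

    -- Assumed schema: tasks(task_id, status, created_at, worker_id)
    --                 workers(worker_id, load)
    -- Assign the oldest pending task to the least-loaded worker.
    -- The outer status check keeps racing schedulers from
    -- re-assigning a task that was just claimed.
    UPDATE tasks
    SET status = 'RUNNING',
        worker_id = (SELECT worker_id FROM workers
                     ORDER BY load ASC LIMIT 1)
    WHERE status = 'PENDING'
      AND task_id = (SELECT task_id FROM tasks
                     WHERE status = 'PENDING'
                     ORDER BY created_at ASC LIMIT 1);

Swapping the ORDER BY clauses yields a different policy (say, priority- or locality-aware scheduling), which hints at the flexibility a database-centric design could offer.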

Bio

Kostis Kaffes is a software engineer in Google’s Systems Research Group. He is broadly interested in computer systems, cloud computing, and scheduling. He completed his Ph.D. in Electrical Engineering at Stanford University, advised by Christos Kozyrakis, focusing on end-host, rack-scale, and cluster-scale scheduling for microsecond-scale tail latency. Kostis will join Columbia University as an Assistant Professor in July 2023.




Neeraja Yadwadkar

Systems for ML: It’s all about the choices

Abstract

This talk focuses on the following fundamental question: what does Machine Learning (ML) need from Systems? I will use inference serving as an example to make the case that "Systems for ML" research is primarily about making the right choices. Today, many applications rely on inference from machine learning models, especially neural networks. For instance, applications at Facebook issue tens of trillions of inference queries per day with different privacy, performance, accuracy, and cost constraints. Unfortunately, existing distributed inference serving systems ignore ease of use and hence incur significant cost inefficiency, especially at large scales. They force developers to manually search through thousands of model-variants (versions of already-trained models that differ in hardware, resource footprint, latency, cost, privacy guarantees, and accuracy) to meet diverse application requirements. As requirements, query load, and the applications themselves evolve over time, developers must make these decisions dynamically, per inference query, to avoid the excessive costs of naive autoscaling. To sidestep this large and complex trade-off space, developers often fix one variant across queries and simply replicate it when load increases. However, given the diversity of variants and hardware platforms across the cloud-edge spectrum, such a limited view of the trade-off space incurs significant costs. For applications to use machine learning, we must automate the decisions that affect ease of use, privacy, performance, and cost efficiency for users and providers alike. We argue for managed distributed inference serving for a variety of models across the cloud-edge spectrum.
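
To make the per-query choice concrete, here is a toy sketch of the decision described above. The model_variants catalog, its columns, and the model name are hypothetical, not part of any existing serving system; the point is that each query's constraints select a different point in the trade-off space:

    -- Assumed catalog: model_variants(variant_id, model, hardware,
    --                  p99_latency_ms, accuracy, cost_per_1k_queries)
    -- Pick the cheapest variant that satisfies one query's
    -- latency and accuracy requirements.
    SELECT variant_id, hardware
    FROM model_variants
    WHERE model = 'image_classifier'   -- hypothetical model name
      AND p99_latency_ms <= 100        -- this query's latency SLO
      AND accuracy >= 0.95             -- this query's accuracy floor
    ORDER BY cost_per_1k_queries ASC
    LIMIT 1;

A managed inference-serving system would make this kind of decision automatically for every query as load and requirements evolve, instead of pinning a single variant and replicating it under load.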

Bio

Neeraja is an assistant professor in the Department of Electrical and Computer Engineering at UT Austin. She is a cloud computing systems researcher with a strong background in Machine Learning (ML). Most of her research straddles the boundary between Systems and ML: using and developing ML techniques for systems, and building systems for ML. Before joining UT Austin, she was a postdoctoral fellow in the Computer Science Department at Stanford University; before that, she received her Ph.D. in Computer Science from UC Berkeley. She earned her bachelor's degree in Computer Engineering from the Government College of Engineering, Pune, India.