Keynote Speakers

Ana Klimovic (ETH Zurich)

Scalable Input Data Processing for Resource-Efficient Machine Learning

Abstract

Data is the lifeblood of machine learning. Yet, our system infrastructure for managing and preprocessing training data in ML jobs lags behind the vast advancements in hardware accelerators, software frameworks, and algorithms that optimize model training computations. The input data pipeline in an ML job is responsible for extracting data from storage, transforming data on-the-fly, and loading data to a training node (typically a GPU or TPU). As hardware accelerators continue to provide more FLOPS, feeding data at a sufficient rate to saturate accelerators is increasingly challenging. The high cost of accelerators compared to their CPU hosts makes it particularly important to ensure that they operate at high utilization. Hence, the input pipeline is critical to the end-to-end throughput and cost of ML jobs. In this talk, we will discuss the characteristics of real ML input pipelines from production workloads that have led to the trend of disaggregating input data processing from model training. I will present recent open-source systems such as tf.data service and Cachew, which leverage a disaggregated system architecture to scale out and optimize data processing within and across jobs. These systems alleviate input bottlenecks and dramatically improve the training time and cost of ML jobs.
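As a rough illustration of the extract/transform/load stages described above (a sketch, not material from the talk), the snippet below builds a minimal tf.data input pipeline; the file pattern, feature schema, image size, batch size, and dispatcher address are illustrative assumptions.

```python
# Minimal tf.data input pipeline sketch: extract -> transform -> load.
# File pattern, feature schema, and sizes are illustrative assumptions.
import tensorflow as tf

def parse_and_augment(record):
    # On-the-fly transformation: parse the record, decode, resize, normalize.
    features = tf.io.parse_single_example(
        record,
        {"image": tf.io.FixedLenFeature([], tf.string),
         "label": tf.io.FixedLenFeature([], tf.int64)})
    image = tf.io.decode_jpeg(features["image"], channels=3)
    image = tf.image.resize(image, [224, 224]) / 255.0
    return image, features["label"]

dataset = (
    tf.data.TFRecordDataset(tf.data.Dataset.list_files("train-*.tfrecord"))  # extract from storage
    .map(parse_and_augment, num_parallel_calls=tf.data.AUTOTUNE)             # transform on the fly
    .shuffle(10_000)
    .batch(256)
    .prefetch(tf.data.AUTOTUNE)                                              # overlap loading with training
)

# With the tf.data service, the same pipeline can be executed by a separate pool of
# data-processing workers instead of the training host's CPUs (disaggregation), e.g.:
# dataset = dataset.apply(tf.data.experimental.service.distribute(
#     processing_mode="parallel_epochs", service="grpc://dispatcher:5000"))
```

Parallel maps and prefetching keep the accelerator fed from the host's CPUs; disaggregation moves the same work to remote workers that can scale independently of the training node.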

Bio

Ana Klimovic is an Assistant Professor in the Systems Group of the Computer Science Department at ETH Zurich. Her research interests span operating systems, computer architecture, and their intersection with machine learning. Ana's work focuses on computer system design for large-scale applications such as cloud computing services, data analytics, and machine learning. Before joining ETH in August 2020, Ana was a Research Scientist at Google Brain and completed her Ph.D. in Electrical Engineering at Stanford University.




Joseph M. Hellerstein (UC Berkeley)

Declaring the Era of Programmable Clouds

Abstract

For a decade or more, the main business of public cloud vendors has been to replace traditional enterprise computing systems with similar hosted services. This brought once-in-a-lifetime disruption to the business landscape. In comparison, the ambitions for innovation in cloud software engineering have been relatively modest. Given its revolutionary scale and potential, the cloud’s real impact on software innovation is likely yet to come. What are we waiting for? The main impediment seems to be the absence of a programming stack that unlocks the full power of the cloud for general-purpose programming and innovation at scale. In this talk I will go over key concerns for cloud programming, opportunities to learn from the success of declarative languages, and concrete examples in systems work. We’ll begin with replication, reviewing the CALM Theorem and the Anna KVS; the former connecting a program property (monotonicity) to a runtime guarantee (consistency), the latter showing the power and limitations of CALM design in practice. In that frame, we’ll look at connections to data consistency and isolation levels. Moving beyond read/write semantics, we will use CRDTs as an entry point and anti-pattern for richer distributed programming models. Finally, we’ll shift attention to partitioning, using the lens of Functional Dependencies to scale concrete examples of classical protocols like Paxos. Building on these results, I will overview Hydro, our nascent effort to develop a multi-level compiler stack a la LLVM, but one that is focused on distributed systems concerns. Challenges include architecting a low-latency asynchronous dataflow kernel for general-purpose services; IR designs that can leverage type-checkers like that of Rust for distributed properties; techniques to transpile (“lift”) sequential code to scalable alternatives; and the requirements for a compiler whose output is a live autoscaling service, not merely an executable. Joint work with colleagues at UC Berkeley and Sutter Hill Ventures.
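As a small, self-contained illustration of the CRDT entry point mentioned above (a sketch, not code from the talk), the snippet below shows a state-based grow-only counter whose merge is a pointwise maximum, so replicas converge without coordination; the replica identifiers and merge protocol are illustrative assumptions.

```python
# Sketch of a state-based CRDT: a grow-only counter (G-Counter).
# Replica identifiers and the merge protocol are illustrative assumptions.
from dataclasses import dataclass, field

@dataclass
class GCounter:
    replica: str
    counts: dict = field(default_factory=dict)   # replica id -> local increments

    def increment(self, n: int = 1) -> None:
        self.counts[self.replica] = self.counts.get(self.replica, 0) + n

    def value(self) -> int:
        return sum(self.counts.values())

    def merge(self, other: "GCounter") -> None:
        # Pointwise max is commutative, associative, and idempotent, so replicas
        # converge regardless of message ordering or duplication.
        for r, c in other.counts.items():
            self.counts[r] = max(self.counts.get(r, 0), c)

a, b = GCounter("a"), GCounter("b")
a.increment(3); b.increment(2)
a.merge(b); b.merge(a)
assert a.value() == b.value() == 5
```

The merged state only grows, which is the flavor of monotone program property that the CALM theorem connects to coordination-free consistency.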

Bio

Joseph M. Hellerstein's work focuses on data-centric systems and the way they drive computing. He is the Jim Gray Professor of Computer Science at UC Berkeley, an ACM Fellow, an Alfred P. Sloan Research Fellow, and the recipient of four "Test of Time" awards for his research. MIT's Technology Review magazine included his work on cloud programming in their TR10 list of the 10 technologies "most likely to change our world". Hellerstein is also involved in the computing industry, currently as a Fellow at Sutter Hill Ventures and as co-founder of Aqueduct, which is developing new open-source cloud technology for prediction infrastructure. Previously, he co-founded Trifacta, the pioneering company in data preparation.




Kunle Olukotun (Stanford University)

Systems for ML and ML for Systems: A Virtuous Cycle

Abstract

This talk is about the virtuous interplay between machine learning (ML) and systems. I will show examples of how systems optimized for ML computation can be used to train more accurate and capable ML models, and how these ML models can in turn be used to improve upon the ad-hoc heuristics used in system design and management. These improved systems can then be used to train better ML models. The latest trend in ML is the development of foundation models: large pretrained models that have obtained state-of-the-art quality in natural language processing, vision, speech, and other areas. These models are challenging to train and serve because they are characterized by billions of parameters, irregular data access (sparsity), and irregular control flow. I will explain how Reconfigurable Dataflow Accelerators (RDAs) can be designed to accelerate foundation models with these characteristics. SambaNova Systems is using RDA technology to achieve record-setting performance on foundation models. I will also describe how RDAs can be used to build Taurus, an intelligent network data plane that enables ML models to be used to manage computer networks at full line-rate bandwidths. In particular, a Taurus prototype detects two orders of magnitude more events in a security application than a state-of-the-art system based on conventional network technology.

Bio

Kunle Olukotun is the Cadence Design Systems Professor of Electrical Engineering and Computer Science at Stanford University. Olukotun is a pioneer in multicore processor design and the leader of the Stanford Hydra chip multiprocessor (CMP) research project. He founded Afara Websystems to develop high-throughput, low-power multicore processors for server systems; Afara was acquired by Sun Microsystems, and its multicore processor, called Niagara, now powers Oracle's SPARC-based servers. In 2017, Olukotun co-founded SambaNova Systems, a machine learning and artificial intelligence company, and continues to serve as its Chief Technologist. Olukotun is the Director of the Pervasive Parallelism Lab and a member of the Data Analytics for What's Next (DAWN) Lab, developing infrastructure for usable machine learning. He is a member of the National Academy of Engineering, an ACM Fellow, and an IEEE Fellow for contributions to multiprocessor-on-a-chip design and the commercialization of this technology. He also received the Harry H. Goode Memorial Award. Olukotun received his Ph.D. in Computer Engineering from the University of Michigan.




Thierry Cruanes (Snowflake)

Snowflake Data Cloud Architecture: How the public cloud changes everything

Abstract

The rise of the public cloud changed system architecture forever. In this talk, we will cover how the three pillars of the architecture allow the Snowflake system to scale analytic workloads and have become foundational attributes of the Data Cloud. Scaling analytic workloads on the Data Cloud requires an architecture that separates compute from storage to leverage abundant cloud compute resources; builds an ACID-compliant database system on immutable storage to guarantee consistency; and delivers a scalable multi-tenant system as an easy-to-use and reliable service. To move beyond analytics and offer a global, programmable data application platform, the architecture evolved to offer secure global collaboration around shared data and embraced different programming models.
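As a toy illustration of the "ACID on immutable storage" idea above (a sketch under assumed names, not Snowflake's implementation), the snippet below keeps table data in write-once files and publishes each commit as a new version of the file set, so a reader that pins a version sees a consistent snapshot while new writes proceed.

```python
# Toy sketch (not Snowflake's implementation): writes never modify files in
# place; each commit atomically publishes a new table version that references a
# new set of immutable files, and each reader works against the version it
# pinned at start (a snapshot). All names and structures are assumptions.
import itertools

class ImmutableTable:
    def __init__(self):
        self._files = {}                 # file id -> frozen rows (write-once)
        self._versions = [frozenset()]   # version n -> set of file ids
        self._ids = itertools.count()

    def commit(self, rows):
        fid = next(self._ids)
        self._files[fid] = tuple(rows)                 # immutable file
        new_version = self._versions[-1] | {fid}       # copy-on-write metadata
        self._versions.append(frozenset(new_version))  # atomic publish
        return len(self._versions) - 1

    def snapshot(self, version=None):
        v = self._versions[version if version is not None else -1]
        return [row for fid in sorted(v) for row in self._files[fid]]

table = ImmutableTable()
v1 = table.commit([("alice", 10)])
reader_view = table.snapshot(v1)        # reader pins version 1
table.commit([("bob", 20)])             # concurrent write adds a new version
assert reader_view == [("alice", 10)]   # the pinned snapshot is unchanged
```

Because the files are immutable and versions are only ever added, any number of compute clusters can read the same shared storage consistently while scaling independently of it.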

Bio

Thierry co-founded Snowflake and currently serves as Chief Technical Officer. Thierry is a leading expert in query optimization and parallel execution. He spent 13 years at Oracle focused on the optimization and parallelization layers of the Oracle database. Before Oracle, he spent seven years at the IBM European Center of Applied Mathematics working on text and data mining technologies. Thierry has a Ph.D. in Computer Science with a focus on database systems and holds over 100 patents.