Raghu Ramakrishnan (Microsoft)
Abstract
Azure Data Lake (ADL) is Microsoft’s serverless Big Data platform, designed to support multiple compute engines, multiple storage tiers, exabyte scale, and comprehensive security and data sharing. ADL builds on our internal experience with an exabyte-scale Big Data service (called Cosmos) and integrates deeply with the Hadoop ecosystem. It has two complementary parts, ADL Analytics and ADL Store. ADL Store (ADLS) is a fully managed, elastic, scalable, and secure file system that supports Hadoop Distributed File System (HDFS) and Cosmos semantics. ADL Analytics (ADLA) is a framework for delivering managed serverless analytics, including analytics based on our own Microsoft engines (e.g., Scope, U-SQL variants) and OSS engines (e.g., Hive, Spark), following the standard Hadoop pattern of plugging into HDFS and YARN. In this talk, I will present an overview of ADL. I will cover the ADLS architecture, design points including the Cosmos experience, and performance. I will also describe the work we are doing with the Apache OSS community on YARN and how that work is leveraged in the ADLA framework.
Bio
Raghu Ramakrishnan is a Technical Fellow and CTO for Data at Microsoft. He also heads engineering for Big Data platforms and services. From 1987 to 2006, he was a professor at the University of Wisconsin-Madison, where he wrote the widely used text “Database Management Systems” and led a wide range of research projects in database systems (e.g., the CORAL deductive database, the DEVise data visualization tool, SQL extensions to handle sequence data) and data mining (scalable clustering, mining over data streams). In 1999, he founded QUIQ, a company that introduced a cloud-based question-answering service. He joined Yahoo! in 2006 as a Yahoo! Fellow, and over the next six years served as Chief Scientist for the Audience (portal), Cloud, and Search divisions, driving content recommendation algorithms (CORE), cloud data stores (PNUTS), and semantic search (“Web of Things”). Ramakrishnan has received several awards, including the ACM SIGKDD Innovations Award, the SIGMOD 10-year Test-of-Time Award, the IIT Madras Distinguished Alumnus Award, the NSF Presidential Young Investigator Award, and the Packard Fellowship in Science and Engineering. He is a Fellow of the ACM and IEEE. He has served as Chair of ACM SIGMOD and the Board of the VLDB Foundation, and is on the Board of ACM SIGKDD.