Oak Ridge, TN
July 30, 2018
11:00 a.m.–12:00 p.m.
Building 5700, Room L204
Abstract: Over the last decade, owing to increasing data volumes, data storage and processing have commonly been done in large-scale distributed systems. The collective capacity of multiple computing nodes to process data in parallel and serve users is higher than that of single-node storage and computing services; as a consequence, distributed algorithms are significantly more efficient than their centralized counterparts. However, the performance of distributed algorithms is fundamentally limited by two factors: (i) the overhead of communicating the data to the processing nodes, and (ii) the redundancy required to deal with faulty or delayed nodes in the system. In this talk, we will explore how ideas from information theory and the theory of error-correcting codes can relieve these distributed computing bottlenecks. This talk will be delivered in two parts.
In the first part, we will survey novel error-correction techniques for making distributed computing systems faster and more efficient by overcoming the straggler effect, the phenomenon in which a few excessively slow computing nodes limit the overall computation speed. I will describe some recent work on computing methods based on error-correcting codes for important distributed machine learning tasks, specifically distributed matrix-vector and matrix-matrix multiplications and linear inverse problems.
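As a rough illustration of the straggler idea (a minimal sketch, not taken from the talk), the Python below splits a matrix into two row blocks, adds one parity block, and recovers the full matrix-vector product from any two of three workers. The function names `matvec`, `encode`, and `decode` are hypothetical, and the sketch assumes the matrix has an even number of rows and that one worker never responds.

```python
def matvec(rows, x):
    """Multiply a matrix (given as a list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, x)) for row in rows]


def encode(A):
    """Split A into two row blocks and add one parity block (A1 + A2).

    Assumes A has an even number of rows (illustrative simplification).
    """
    half = len(A) // 2
    A1, A2 = A[:half], A[half:]
    parity = [[a + b for a, b in zip(r1, r2)] for r1, r2 in zip(A1, A2)]
    return {0: A1, 1: A2, 2: parity}


def decode(results):
    """Recover A times x from the results of any two of the three workers."""
    if 0 in results and 1 in results:
        y1, y2 = results[0], results[1]
    elif 0 in results:
        # Worker 1 is missing: (A1 + A2) x - A1 x = A2 x
        y1 = results[0]
        y2 = [p - a for p, a in zip(results[2], y1)]
    else:
        # Worker 0 is missing: (A1 + A2) x - A2 x = A1 x
        y2 = results[1]
        y1 = [p - b for p, b in zip(results[2], y2)]
    return y1 + y2


A = [[1, 2], [3, 4], [5, 6], [7, 8]]
x = [1, 1]
tasks = encode(A)

# Pretend worker 1 is a straggler and never responds.
results = {w: matvec(tasks[w], x) for w in (0, 2)}
y = decode(results)
```

Each worker multiplies only half as many rows as the full matrix has, and the product is available as soon as the two fastest workers finish, at the price of one extra worker's worth of redundant computation.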
In the second part of the talk, I will focus on the problem of emulating a consistent (linearizable) shared memory over a distributed asynchronous system. Consistent shared memory emulation is a well-known task in distributed computing that is relevant to modern cloud-based key-value store services, which are used by various applications including transactions, reservation systems, multi-player gaming, and multi-processor programming. Motivated by technological trends in which key-value stores are increasingly implemented in high-speed memory, we will explore the use of ideas from information theory to minimize the memory footprint (storage overhead) as well as the communication costs of consistent shared memory emulation.
Biography: Viveck Cadambe is an Assistant Professor in the Department of Electrical Engineering at Pennsylvania State University. Dr. Cadambe received his Ph.D. from the University of California, Irvine, in 2011. Between 2011 and 2014, he was a postdoctoral researcher, jointly with the Electrical and Computer Engineering (ECE) department at Boston University and the Research Laboratory of Electronics (RLE) at the Massachusetts Institute of Technology (MIT). Dr. Cadambe is a recipient of the 2009 IEEE Information Theory Society Best Paper Award, the 2014 IEEE International Symposium on Network Computing and Applications (NCA) Best Paper Award, the 2015 NSF CRII Award, and the 2016 NSF CAREER Award, and was a finalist for the 2016 Bell Labs Prize. Since December 2014, he has served as an Associate Editor for the IEEE Transactions on Wireless Communications. His research uses tools from information theory, error-correcting codes, and the theory of distributed systems to understand fundamental engineering trade-offs in data communication, storage, and computing systems.