Computer Systems Laboratory Retreat 2023
Friday, May 12th • Moakley House, Ithaca, NY
Enabling CPU-free Data Movement in the Cloud
Data movement has always been one of the biggest contributors to the datacenter tax, and today this tax is especially high. Modern servers are equipped with multiple 100-core CPUs capable of hosting tens of VMs running diverse sets of workloads, from interactive microservices to large language models. All these applications shuffle around a tremendous amount of data and often require low-latency communication. This puts a high load on CPUs to handle data movement, wasting many cycles on operations not directly related to application processing. In my research, I am exploring the possibility of making data movement in datacenters CPU-free. This is particularly important in clouds based on the pay-as-you-go model, where every CPU cycle costs users money. Taking inspiration from the RDMA model, we design a new class of datacenter networking hardware capable of offloading entire end-to-end communication pipelines for general applications. We primarily target the offload of end-to-end RPC stacks, remote memory operations, and data transformations. In combination with emerging hardware technologies, such as Intel DSA, CXL interconnects, user interrupts, and novel data movement ISA extensions (e.g., MOVDIR64B/ENQCMD in x86-64), we believe our proposed hardware will be able to support arbitrarily complex data movement patterns with as little CPU participation as possible. This will free up a considerable amount of cloud CPU resources for the tasks they were originally designed for, i.e., useful computation, and will also make data movement faster.
Bio: Nikita Lazarev is a fourth-year PhD student in the School of Electrical and Computer Engineering at Cornell University and the School of Electrical Engineering and Computer Science at MIT. He works under the supervision of Profs. Christina Delimitrou and Zhiru Zhang. His research interests lie at the intersection of computer hardware and systems, with applications in distributed systems and networking. His recent research focuses on efficient datacenter networking enabled by reconfigurable in-network hardware. Nikita obtained his undergraduate degree in Electrical Engineering from Bauman Moscow State Technical University and his Master’s degree in Computer Science from EPFL. In the past, he interned at Microsoft Research India, Microsoft Research Cambridge, and Microsoft Research Redmond, where he worked on FPGA-enabled low-precision ML, CPU-free datastore systems, and cloud-native 5G networks, respectively.