Welcome to my webpage! I am an assistant professor in the Electrical and Computer Engineering Department at Cornell, where I lead the SAIL group; I am also a member of
the Computer Systems Laboratory (CSL) and the John and Norma Balen Sesquicentennial Faculty Fellow. My main interests are in computer
architecture and computer systems. Specifically, I work on improving the resource efficiency of large-scale datacenters through QoS-aware scheduling and resource management
techniques. I am also interested in designing efficient server architectures, distributed performance debugging, and cloud security. Before joining Cornell, I earned a
Ph.D. in Electrical Engineering at Stanford University, where I worked with Christos Kozyrakis.
I had previously earned an M.S. in Electrical Engineering from Stanford (2011) and a Diploma in Electrical and Computer Engineering
from the National Technical University of Athens (2009).
I have been fortunate to receive a Sloan Faculty Research Award, two Google Faculty Research Awards (2019 and 2020), a Microsoft Research Faculty Fellowship, the 2020 IEEE TCCA Young Computer Architect Award,
an Intel Rising Star Award, a Google Research Award in Recognition of Technical Leadership and Achievements in Systems Research, a Facebook Faculty Research Award, the Cornell Excellence in Research Award, and the Cornell Excellence in Teaching Award. My work has also received five IEEE Micro Top Picks awards, one Top Picks Honorable Mention, and several best paper awards.
You can find more information in my CV.
Contact Information
332 Rhodes Hall, Electrical and Computer Engineering
Cornell University, Ithaca, NY
E-mail: delimitrou@cornell.edu
I work in computer architecture, cloud systems, and applied machine learning.
We are working to redesign cloud hardware, through both hardware acceleration and alternative platform designs,
to address the challenges that emerging cloud programming models, such as microservices, introduce. First, microservices
drastically change the cloud resource bottlenecks. While previously the majority of the end-to-end latency went towards
useful computation, now a large fraction (if not the majority) of latency goes towards processing network requests.
A small amount of unpredictability in the network stack, e.g., due to queueing, can significantly degrade the end-to-end Quality-of-Service (QoS).
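To make this concrete, here is a minimal simulation sketch (not taken from any of our papers) of a single queue with exponential arrivals and service times; it shows how tail latency grows far faster than median latency as load, and therefore queueing, increases:

    import random

    def tail_latency(load, n_requests=200_000, service_mean=1.0):
        """Simulate an M/M/1 queue (Lindley's recursion); return p50/p99 latency."""
        arrival_mean = service_mean / load   # inter-arrival time for the target load
        wait, latencies = 0.0, []
        for _ in range(n_requests):
            service = random.expovariate(1.0 / service_mean)
            latencies.append(wait + service)   # queueing delay + service time
            inter_arrival = random.expovariate(1.0 / arrival_mean)
            # Leftover work the next request will find when it arrives.
            wait = max(0.0, wait + service - inter_arrival)
        latencies.sort()
        return latencies[len(latencies) // 2], latencies[int(0.99 * len(latencies))]

    for load in (0.5, 0.8, 0.95):
        p50, p99 = tail_latency(load)
        print(f"load={load:.2f}  p50={p50:5.1f}  p99={p99:6.1f}")

Even though the median barely moves, the 99th percentile explodes as the queue approaches saturation, which is why tail latency, rather than average latency, drives QoS in microservices.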
Beyond hardware, microservices also complicate cluster management and performance debugging.
In recent work we demonstrated that a data-driven approach is especially effective for performance debugging, because it leverages the massive amounts of tracing data already collected in cloud systems to identify patterns that signal upcoming performance issues. We have proposed two such approaches: one based on supervised learning, and one based on unsupervised learning.
Supervised methods, however, require large amounts of labeled traces, which are expensive to collect in production. To address this, we designed Sage, in collaboration with Google, which relies entirely on unsupervised learning for root cause analysis.
Sage automatically builds the graph of dependencies between microservices using a Causal Bayesian Network (CBN) and explores scenarios that can solve a performance issue through a generative Graphical Variational AutoEncoder (GVAE). Sage achieves the same accuracy as supervised learning while being much more practical and scalable. It also easily accommodates changes to the application's design and deployment, which are frequent in microservices, by employing partial and incremental retraining.
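As a flavor of the problem Sage solves (though not of its actual method), here is a toy root-cause heuristic over a hypothetical microservice dependency graph: a QoS violation is attributed to the deepest services whose own latencies are anomalous but whose callees' are not. Sage replaces this kind of brittle rule with the learned CBN and GVAE described above; all service names below are made up.

    # Hypothetical topology; edges point from caller to callees.
    DEPS = {
        "frontend": ["search", "cart"],
        "search":   ["index", "cache"],
        "cart":     ["db"],
        "index": [], "cache": [], "db": [],
    }

    def root_causes(zscores, threshold=3.0):
        """Anomalous services none of whose callees are also anomalous."""
        anomalous = {s for s, z in zscores.items() if z >= threshold}
        return [s for s in anomalous if not any(c in anomalous for c in DEPS[s])]

    # A slow cache propagates latency up through search to the frontend.
    zscores = {"frontend": 4.1, "search": 4.5, "cache": 5.2,
               "cart": 0.3, "index": 0.2, "db": 0.1}
    print(root_causes(zscores))   # ['cache']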
In addition to microservices, serverless frameworks have also gained in popularity. While conceptually similar to microservices,
serverless takes advantage of fine-grained parallelism and intermittent activity in applications to spawn a large number of
concurrent instances, improving performance, reducing overprovisioning, and simplifying cloud maintenance. At the same time, serverless
is prone to poor performance because of data transfers, instantiation overheads, and contention caused by multi-tenancy.
Since its publication, the paper describing DeathStarBench and the system implications of microservices has been recognized with
an IEEE Micro Top Picks 2020 award, given each year to the 12 papers across all computer architecture conferences
judged to have the strongest contributions and potential for long-term impact. DeathStarBench is open-source software,
has over 20,000 unique clones on GitHub, and is used by tens of research groups and several cloud providers. Not only has this work enabled research projects
that would otherwise not have been possible, but it also aids the effort toward reproducible research in computer engineering by providing a common reference point against which studies can compare.

During my Ph.D., I worked extensively on improving resource efficiency in cloud systems.
Given the low datacenter utilization at the time, caused in part by overprovisioned resource reservations,
I built a number of systems that relied on machine learning (ML) to automate the scheduling and resource management process in the cloud.
Paragon---QoS-Aware Cloud Scheduling: Paragon is a QoS-aware datacenter scheduler that accounts for interference between co-scheduled workloads
and platform heterogeneity. The scheduler leverages fast classification techniques to
determine the interference and heterogeneity preferences of incoming applications, while introducing only minimal scheduling overheads (a toy sketch of the classification step follows the paper links below).
Across large-scale cluster evaluations, Paragon significantly improved both performance and resource utilization compared to prior systems,
without introducing significant scheduling overheads at runtime.
[ASPLOS'13 paper] [TopPicks'14 paper] [TOCS'13 paper]
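One way to implement such fast classification is collaborative filtering, sketched here with a toy SVD-based example. The matrix, the interference sources, and the numbers are illustrative only, not Paragon's implementation: known applications' sensitivities are factorized into a low-rank model, and a new application's missing entries are recovered from just a couple of profiling measurements.

    import numpy as np

    # Rows: previously scheduled applications; columns: interference sources
    # (e.g., last-level cache, memory bandwidth, network). Entries: sensitivity.
    known = np.array([
        [0.9, 0.8, 0.1],
        [0.2, 0.3, 0.9],
        [0.8, 0.7, 0.2],
        [0.1, 0.2, 0.8],
    ])

    # Low-rank factorization of the known profiles (truncated SVD, rank 2).
    U, s, Vt = np.linalg.svd(known, full_matrices=False)
    V = Vt[:2].T                      # latent directions of the interference sources

    # A new application was profiled against only the first two sources.
    measured_cols = [0, 1]
    measured_vals = np.array([0.85, 0.75])

    # Solve for the new app's latent coordinates from the measured columns,
    # then reconstruct its (unmeasured) sensitivity to the remaining source.
    coords, *_ = np.linalg.lstsq(V[measured_cols], measured_vals, rcond=None)
    print("estimated sensitivities:", np.round(V @ coords, 2))

A couple of short profiling runs thus stand in for exhaustive characterization, which is what keeps the scheduling overheads low.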
Quasar---ML-Driven Cluster Management: Traditionally,
users overprovision their resource reservations in the cloud to side-step performance unpredictability.
Quasar is a cluster manager that introduces a different interface between the system and its users.
Instead of specifying raw resources, the user specifies only the performance target a job must meet (the two interfaces are contrasted in the sketch below).
Quasar then leverages efficient data mining techniques to determine the resource preferences of
a new job, much like a movie recommendation
system finds similarities between previous and new users to recommend movies that they are likely to enjoy.
Quasar achieves both high cluster utilization and high per-application performance.
[ASPLOS'14 paper] [demo] [press]
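The shift in interface is easy to illustrate. The field names below are hypothetical, not Quasar's actual API; the point is what the user does and does not have to specify:

    from dataclasses import dataclass

    @dataclass
    class RawReservation:        # traditional interface: the user guesses resources
        cores: int
        memory_gb: int
        instances: int

    @dataclass
    class PerformanceTarget:     # Quasar-style interface: the user states the goal
        metric: str              # e.g., "p99_latency_ms" or "throughput_qps"
        target: float

    # With a performance target, translating the goal into a resource assignment
    # becomes the cluster manager's job, done by classifying the new workload
    # against previously scheduled ones.
    job = PerformanceTarget(metric="p99_latency_ms", target=20.0)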
Tarcil---Low-Latency Distributed Scheduling: Tarcil is a scheduler that addresses the disparity between sophisticated but slow centralized schedulers
and fast but lower-quality distributed
schedulers. Tarcil uses sampling to lower scheduling overheads, while accounting for the resource preferences of new jobs to keep scheduling quality high (illustrated below).
It improves performance both for short and long jobs compared to centralized and distributed schedulers.
[SOCC'15 paper]
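The core idea can be sketched in a few lines: rather than scanning the whole cluster or picking a server blindly, sample a handful of candidates and keep the best match. The scoring rule and sample size below are placeholders; the paper analyzes how large the sample must be to keep placement quality high.

    import random

    def place(job, servers, sample_size=8):
        """Sample a few servers; keep the best fit for the job's bottleneck resource."""
        candidates = random.sample(servers, min(sample_size, len(servers)))
        return max(candidates, key=lambda s: s["free"][job["bottleneck"]])

    servers = [{"id": i, "free": {"cpu": random.random(), "mem": random.random()}}
               for i in range(1000)]
    print(place({"bottleneck": "cpu"}, servers)["id"])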
HCloud---Hybrid Cloud Provisioning: Paragon and Quasar assume that the cluster manager has full control over the entire system. Unfortunately, real deployments can be more
complicated, especially when the resources used are hosted on a public cloud provider. In this work, I designed a system that determines the most cost-efficient
instance type (reserved vs. on-demand) and size a job needs to satisfy its QoS constraints (a simplified version of this decision is sketched below). I evaluated this system on a cluster with a few hundred servers on
Google Compute Engine.
[ASPLOS'16 paper]
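A deliberately simplified version of the decision HCloud automates might look as follows. The prices, the "predictability" labels, and the rule tying QoS sensitivity to instance type are all made up for illustration; the real system also weighs instantiation delays and observed load.

    # Illustrative menu: (type, size) -> (hourly cost, performance predictability).
    OPTIONS = {
        ("reserved", "small"):  (0.06, "high"),
        ("reserved", "large"):  (0.20, "high"),
        ("on_demand", "small"): (0.04, "low"),
        ("on_demand", "large"): (0.13, "low"),
    }

    def provision(qos_sensitive, needs_large):
        """Pick the cheapest option that still satisfies the job's QoS constraints."""
        size = "large" if needs_large else "small"
        viable = [(cost, kind) for (kind, sz), (cost, pred) in OPTIONS.items()
                  if sz == size and (pred == "high" or not qos_sensitive)]
        cost, kind = min(viable)
        return kind, size, cost

    print(provision(qos_sensitive=True, needs_large=False))    # ('reserved', 'small', 0.06)
    print(provision(qos_sensitive=False, needs_large=False))   # ('on_demand', 'small', 0.04)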
iBench: Paragon and Quasar need to know the sensitivity of an incoming application to various types of interference. iBench is a benchmark suite that
consists of a set of microbenchmarks, each of which puts pressure on a specific shared resource. iBench enables fast and practical characterization of the
interference an application can tolerate in each shared resource, as well as the interference it generates itself (the idea is sketched below).
[IISWC'13 paper]
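A faithful microbenchmark needs careful low-level control, so the numpy loop below is only a crude stand-in; it shows the essential shape of one iBench-style benchmark: a single source of pressure on one shared resource (here, memory bandwidth) with an adjustable intensity knob.

    import time
    import numpy as np

    def memory_pressure(intensity, duration_s=5.0, size_mb=256):
        """Stream over a large array for `intensity` fraction of each 100 ms period."""
        buf = np.zeros(size_mb * 1024 * 1024 // 8)   # ~size_mb MB of doubles
        period = 0.1
        end = time.time() + duration_s
        while time.time() < end:
            busy_until = time.time() + intensity * period
            while time.time() < busy_until:
                buf += 1.0                            # sweeps the array through memory
            time.sleep(max(0.0, (1.0 - intensity) * period))

    memory_pressure(intensity=0.5)   # moderate memory-bandwidth pressure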
Datacenter Application Modeling: Earlier, I also worked on characterizing and modeling the behavior of large-scale datacenter applications.
I designed and implemented ECHO, a concise analytical model that captures and recreates the network traffic of
distributed datacenter applications (a toy version of the idea is sketched after the paper links below). I also developed a modeling framework for storage workloads, which generates
synthetic load patterns similar to those of the original applications. Both modeling frameworks were validated against real datacenter applications from Microsoft,
and were used in a series of efficiency and cost optimization studies.
[IISWC'12 paper] [IISWC'11 paper] [CAL'12 paper] [TPCTC'11 paper]
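As a toy version of the capture-then-recreate idea (ECHO's actual model differs and captures much richer structure in the traffic), one can fit a first-order Markov model over which service talks to which, then generate synthetic traces with the same pairwise statistics:

    import random
    from collections import Counter, defaultdict

    def fit(trace):
        """Count src -> dst transitions in an observed sequence of service hops."""
        counts = defaultdict(Counter)
        for src, dst in zip(trace, trace[1:]):
            counts[src][dst] += 1
        return counts

    def generate(counts, start, n):
        """Random-walk the transition counts to emit a synthetic hop sequence."""
        out, cur = [start], start
        for _ in range(n):
            dests, weights = zip(*counts[cur].items())
            cur = random.choices(dests, weights=weights)[0]
            out.append(cur)
        return out

    observed = ["web", "cache", "web", "db", "web", "cache", "web"]
    print(generate(fit(observed), "web", 6))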