I am interested in computer architecture, systems, and applied data mining.
My Ph.D. work has focused on improving the resource efficiency of large-scale datacenters.
Since traditional scaling techniques, such as scaling out with commodity hardware or relying on Dennard scaling, are reaching the point of diminishing returns,
we must focus on using existing systems more efficiently.
Specifically, during my Ph.D. I have designed and built practical and scalable scheduling systems that improve system utilization
without sacrificing application performance.
Our approach relied on three main insights. First, systems that manage resources must account for the interactions between hardware architectures and system software. Second, a user should only have to provide a high-level, declarative description of application requirements, not specify how they should be met using low-level resources. Third, the system must quickly learn the resource preferences of a new, unknown application; obtaining this information through exhaustive application profiling is prohibitively expensive. To make the system practical, we leveraged efficient data mining techniques that exploit existing system knowledge to quickly make high-quality scheduling decisions.
Below is a list of projects I have worked on in the past.
Paragon: Paragon is a QoS-aware datacenter scheduler that accounts for interference between co-scheduled workloads and for platform heterogeneity. The scheduler leverages fast classification techniques to determine the interference and heterogeneity preferences of incoming applications, introducing only minimal scheduling overheads. On a 1,000-server EC2 cluster, Paragon improves system utilization by 47% compared to a traditional least-loaded scheduler and achieves 96% of optimal performance, while remaining scalable and lightweight.
[ASPLOS'13 paper] [TopPicks'14 paper] [TOCS'13 paper]
Quasar: Traditionally, datacenters have been plagued by low utilization, primarily due
to users overprovisioning resource reservations to side-step performance unpredictability.
Quasar is a cluster manager that introduces a different interface between system and users.
Instead of specifying raw resources, the user only specifies a performance target a job must meet.
Quasar then leverages efficient data mining techniques to determine the resource preferences of
a new job, much like a movie recommendation
system finds similarities between new and previous users to recommend movies the new users are likely to enjoy.
Quasar achieves both high cluster utilization and high per-application performance.
[ASPLOS'14 paper] [demo] [press]
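As an illustration of the recommendation analogy, the core idea can be sketched as SVD-based matrix completion over a sparse matrix of job performance on different server configurations (toy data and names of my own choosing; the actual system uses more sophisticated, low-overhead techniques):

```python
import numpy as np

# Rows: previously scheduled jobs; columns: server configurations.
# Entries: normalized performance; np.nan marks configurations a job
# was never profiled on. (Toy data, purely illustrative.)
perf = np.array([
    [1.0, 0.8, 0.3, 0.2],
    [0.9, 0.9, 0.4, 0.3],
    [0.2, 0.3, 1.0, 0.8],
    [0.3, 0.2, 0.9, np.nan],  # new job: one configuration left to predict
])

def predict_missing(matrix, rank=2):
    """Fill missing entries with a rank-k SVD reconstruction."""
    # Seed missing entries with per-column means, then project onto
    # the top-k singular vectors to exploit similarity across jobs.
    filled = np.where(np.isnan(matrix),
                      np.nanmean(matrix, axis=0), matrix)
    u, s, vt = np.linalg.svd(filled, full_matrices=False)
    approx = (u[:, :rank] * s[:rank]) @ vt[:rank]
    return np.where(np.isnan(matrix), approx, matrix)

completed = predict_missing(perf)
predicted = completed[3, 3]  # estimated performance of the new job
```

Because the new job's known entries resemble those of the third job, the low-rank reconstruction predicts its missing entry from that similarity, without ever profiling the job on that configuration.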
Tarcil: Tarcil is a scheduler that bridges the gap between sophisticated but slow centralized schedulers and fast but low-quality distributed
schedulers. Tarcil uses sampling to lower scheduling overheads, while accounting for the resource preferences of new jobs to keep scheduling quality high.
Compared to both centralized and distributed schedulers, it improves performance for both short and long jobs.
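The sampling idea can be sketched in a few lines (a toy version with names and parameters of my own; Tarcil's actual design is considerably richer, e.g., it adjusts sample sizes dynamically):

```python
import random

def sample_based_placement(loads, d=8, seed=None):
    """Place a task on the least-loaded server among d random samples.

    Sampling keeps the per-decision cost O(d) instead of O(n), while
    the minimum over d samples is, with high probability, close in
    quality to the global minimum -- the intuition behind
    sampling-based schedulers.
    """
    rng = random.Random(seed)
    candidates = rng.sample(range(len(loads)), d)
    return min(candidates, key=lambda s: loads[s])

# Example: 1,000 servers with random load values.
loads = [random.randint(0, 100) for _ in range(1000)]
server = sample_based_placement(loads, d=8, seed=1)
```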
Cloud Provisioning: Paragon and Quasar assume that the cluster manager has full control over the entire system. Unfortunately, real life can be more
complicated, especially when the resources used are hosted on a public cloud provider. In this work, I designed a system that determines the most cost-efficient
instance type (reserved vs. on-demand) and size a job needs to satisfy its QoS constraints. I evaluated this system on a cluster with a few hundred servers on
Google Compute Engine.
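A minimal sketch of the underlying cost comparison, with entirely hypothetical prices (real cloud pricing also depends on region, term length, instance size, and discounts, and the system must additionally ensure QoS is met):

```python
def cheapest_option(hours, on_demand_rate, upfront, reserved_rate):
    """Compare total cost of on-demand vs. reserved provisioning.

    hours          -- expected usage over the reservation term
    on_demand_rate -- hourly on-demand price (hypothetical)
    upfront        -- one-time reservation fee (hypothetical)
    reserved_rate  -- discounted hourly price under the reservation
    """
    on_demand = hours * on_demand_rate
    reserved = upfront + hours * reserved_rate
    if reserved < on_demand:
        return "reserved", reserved
    return "on-demand", on_demand

# Break-even point: upfront / (on_demand_rate - reserved_rate) hours.
short_job = cheapest_option(1000, 0.10, 200.0, 0.03)
long_job = cheapest_option(5000, 0.10, 200.0, 0.03)
```

For light usage the upfront fee dominates and on-demand wins; past the break-even point (about 2,857 hours with these toy numbers) the reservation becomes cheaper.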
iBench: Paragon and Quasar need to know the sensitivity of an incoming application to various types of interference. iBench is a benchmark suite that
consists of a set of microbenchmarks each of which puts pressure on a specific shared resource. iBench enables fast and practical characterization of the
interference an application tolerates in various resources and the interference it itself generates.
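To give a flavor of the microbenchmark idea, here is a toy memory-pressure loop (a Python sketch only; a real microbenchmark such as those in iBench would be written in a low-level language with carefully controlled access patterns and tunable intensity):

```python
import time

def memory_pressure(working_set_mb, duration_s):
    """Stream over a working set of a given size for duration_s seconds.

    Sweeping working_set_mb past each cache level's capacity shifts
    the pressure from L1/L2/LLC toward DRAM bandwidth -- the general
    idea behind per-resource pressure microbenchmarks.
    """
    buf = bytearray(working_set_mb * 1024 * 1024)
    stride = 64  # touch roughly one byte per cache line
    deadline = time.monotonic() + duration_s
    i = 0
    while time.monotonic() < deadline:
        buf[i] = (buf[i] + 1) & 0xFF
        i = (i + stride) % len(buf)

memory_pressure(working_set_mb=1, duration_s=0.05)
```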
ARQ: Admission control is needed during periods of high load to prevent cluster overloading. ARQ is a multi-class admission control protocol that
ensures fast application dispatching and low head-of-line blocking.
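The general idea behind multi-class admission control can be sketched as follows (a toy illustration with hypothetical class names and limits, not ARQ's actual protocol):

```python
from collections import deque

class MultiClassAdmission:
    """Toy multi-class admission queue.

    Each QoS class gets its own queue and capacity share, so a burst
    in one class is rejected or deferred at admission time instead of
    causing head-of-line blocking for the other classes.
    """

    def __init__(self, limits):
        self.limits = limits                     # class -> max queued jobs
        self.queues = {c: deque() for c in limits}

    def admit(self, job, cls):
        q = self.queues[cls]
        if len(q) >= self.limits[cls]:
            return False                         # class quota exhausted
        q.append(job)
        return True

    def dispatch(self, cls):
        q = self.queues[cls]
        return q.popleft() if q else None
```

Usage: with limits `{"interactive": 2, "batch": 1}`, a third interactive job is rejected while batch jobs are still admitted and dispatched normally.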
Datacenter Application Modeling: Previously, I worked on characterizing and modeling the behavior of large-scale datacenter applications.
I designed and implemented ECHO, a concise analytical model that captures and recreates the network traffic of
distributed datacenter applications. I also developed a modeling framework for storage workloads, which generates
synthetic load patterns similar to the original applications. Both modeling frameworks were validated against real datacenter applications from Microsoft,
and were used in a series of efficiency and cost optimization studies.
[IISWC'12 paper] [IISWC'11 paper] [CAL'12 paper] [TPCTC'11 paper]
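As a flavor of what synthetic load generation involves, here is a minimal, bootstrap-style sketch that resamples observed request inter-arrival times (purely illustrative; ECHO and the storage framework are analytical models that capture far richer structure, such as spatial and temporal locality):

```python
import random

def synthesize_trace(observed_intervals, n, seed=0):
    """Generate n synthetic inter-arrival times by resampling from
    the empirically observed distribution, so the synthetic load's
    marginal distribution matches the original trace's."""
    rng = random.Random(seed)
    return [rng.choice(observed_intervals) for _ in range(n)]

# Example: inter-arrival times (ms) measured from a hypothetical trace.
observed = [0.5, 0.5, 1.0, 2.0, 5.0]
synthetic = synthesize_trace(observed, n=1000, seed=42)
```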