
Yao Wang
王瑶
Ph.D. Candidate, supervised by Prof. G. Edward Suh
Computer Systems Laboratory (CSL)
Electrical and Computer Engineering, Cornell University
Email: | y .a .o at csl...cornell...edu! (erase space, extra dot, !) |
Phone: | +1 (607) 379-4431 |
Office address: | 365C Upson Hall, Cornell University |
About Me
I got my PhD in October 2016 and now work in a startup called Waltz Networks. I was working with Prof. G. Edward Suh during my PhD study. Before joining Cornell, I received my bachelor's degree in Electronic Engineering from Tsinghua University, Beijing, China, in 2010.
Thesis
Efficient and Verifiable Timing Channel Protection for Multi-Core Processors
Research Interests
Hardware designs for timing channel protection
- On-Chip Networks(NOCS12')
- Cache(DAC16')
- Main Memory(HPCA14', HPCA16', HPCA17')
Information flow control
- SecVerilog(ASPLOS15')
Publications
- Secure Dynamic Memory Scheduling against Timing Channel Attacks
[PDF][Slides]
Yao Wang, Benjaming Wu, G. Edward Suh.
HPCA17', Austin, Texas, Feb 2017
- SecDCP: Secure Dynamic Cache Partitioning for Efficient Timing Channel Protection
[PDF][Slides][Poster]
Yao Wang, Andrew Ferraiuolo, Danfeng Zhang, Andrew C. Myers, G. Edward Suh.
DAC16', Austin, Texas, June 2016
- Lattice Priority Scheduling: Low-Overhead Timing Channel Protection for a Shared Memory Controller
[PDF]
Andrew Ferraiuolo, Yao Wang, Danfeng Zhang, Andrew C. Myers, G. Edward Suh.
HPCA16', Barcelona, Spain, March 2016
- A Hardware Design Language for Timing-Sensitive Information-Flow Security
[PDF]
Danfeng Zhang, Yao Wang, G. Edward Suh, Andrew C. Myers.
ASPLOS15', Istanbul, Turkey, March 2015
- Timing Channel Protection for a Shared Memory Controller
[PDF][Slides]
Yao Wang, Andrew Ferraiuolo, G. Edward Suh.
HPCA14', Orlando, Florida, February 2014
- Quadrisection-Based Task Mapping on Many-Core Processors for Energy-Efficient On-Chip Communication
[PDF][Slides]
Nithin Michael, Yao Wang, Edward G. Suh, Ao Tang.
NOCS13', Tempe, Arizona, April 2013
- Efficient Timing Channel Protection for On-chip Networks
[PDF][Slides]
Yao Wang, Edward G. Suh.
NOCS12', Lyngby, Denmark, May 2012
- Performance Evaluation of On-Chip Sensor Network (SENoC)
[PDF][Slides]
Yao Wang, Yu Wang, Jiang Xu, Huazhong Yang.
ICGCS10', Shanghai, China, June 2010
Projects
Full System Timing Channel Protection

With the emergence of new computing platform such as cloud computing, hardware resources
are extensively shared among processes, programs and users. The dynamic sharing of hardware
resources introduces interference between different security domains, which becomes a vector
for timing channels. Any hardware resources that are shared among different security domains
have the risk of timing channel attacks. These hardware resources includes but not limited to
core pipelines, branch predictors, caches, on-chip network, and memory.
In this project, we are working on efficient scheme to provide full system
protection against timing channels in shared hardware resources throughout the system...
A Hardware Design Language for Timing-Sensitive Information-Flow Security (ASPLOS'15)

Information flow control is a promising method for building future applications and computing
systems with strong security. However, information flows, especially timing channel leakage, are
extremely difficult to control using pure software approaches, because the timing behavior of
a program relies on the underlying hardware.
In this project, we are trying to combine software and hardware approaches to provide strong information
flow control. In the software level, we add annotations to the program to track information flow. These
annotations are then communicated to the hardware which enforces the information flow control through
its implementation. To verify the security of the implemented hardware, we extended Verilog with
annotations that support comprehensive, precise reasoning about information flows at the hardware level...
Timing Channel Protection for a Shared Memory Controller (HPCA14')

Modern computing systems are increasingly vulnerable to timing channel attacks due
to interference through extensive hardware resource sharing between programs, threads
and virtual machines. Previous research proposed some hardware techniques to deal with
timing channels through hardware resources such as cache, on-chip network. However, no
hardware techniques have looked into the timing channel through a shared memory controller.
In this project, we demonstrated the existence of timing channels through a shared memory
controller. Our solution, Temporal Partitioning solves this problem using several
methods:
- Per Security Domain (SD) based queuing structure
- Time Division Multiplexing (TDM) based scheduling algorithm
- Dead time to drain in-flight transactions
Energy-Efficient Task Mapping Algorithm for Many-Core Processors (NOCS13')

Many-core platforms with an on-chip interconnect network, sometimes referred to as Network-on-Chip (NoC), have emerged as a likely candidate for future microprocessor architecture. In order to efficiently utilize the large number of cores, however, application tasks need to be carefully mapped to processing cores (network nodes). In this project, we propose to apply a heuristic approach for VLSI layout called Quadrisection to the task mapping problem for a two-dimensional on-chip network such as meshes.
Efficient Timing Channel Protection for On-Chipe Networks (NOCS12')

On-chip network is often dynamically shared among applications that are concurrently
running on a chip- multiprocessor (CMP). In general, such shared resources imply that
applications can affect each other’s timing characteristics through interference in shared
resources. For example, in on-chip networks, multiple flows can compete for links and buffers.
We show that this interference is an attack vector through which a malicious application may
be able to infer data-dependent information about other applications (side channel attacks),
or two applications can exchange information covertly when direct communications are prohibited
(covert channel attacks).
To prevent these timing channel attacks, approaches such as Spatial
Network Partitioning (SNP) or Temporal Network Partitioning (TNP) can be used, but they are
very inefficient in bandwidth utilization. We propose an efficient scheme, called
Reversed Priority with Static Limits (RPSL)
which exploits one-way timing channel protection from high security level to low security level by:
- Giving high priority to packets from low security level in router arbitration
- Setting a static limit on bandwidth utilization for low security level
Routing Algorithm to Reduce Static Power of On-Chip Networks

Technology scaling indicates that future microprocessors will be power limited. Since microprocessors
will be power limited and only a portion of the silicon can be turned on, recent studies have proposed
specialized and adaptive cores for future multi-core systems. However, the on-chip networks for these
systems have been designed assuming that all nodes are simultaneously active.
In this project, we investigate
activating only a subset of routers in the network. Specifically, we propose an oblivious routing
algorithm, called "BackTrack", to maximize the number of unused routers which can then be turned off to save static power.
The basic idea is to have flows in both directions between each pair of network nodes use the same set of routers.
We proved this routing algorithm to be deadlock-free, and we show that this routing algorithm can reduce
power usage with little to no performance degradation on the network.
Performance Evaluation of On-Chip Sensor Network (SENoC) in MPSoC (ICGCS10')

As technology scaling, more processing units (PUs) are integrated in multiprocessor system-on-chip (MPSoC)
to achieve higher performance. Due to the higher variations resulted from reducing feature sizes and the
needs of lower power consumption, on-chip monitoring of environmental information, such as thermal, voltage,
and frequency, is becoming increasingly important. To address this need, sensors are integrated into
network-on-chip (NoC) to perform system monitoring. However, sensors which transfer their data through NOC
will compete with PUs for the limited bandwidth resources, thus communication between PUs will be delayed.
To evaluate the sensors' overhead on the regular data traffic, we implement a VC based NoC. The sensor data
are transferred through NoC together with the regular data. We study the average delay of regular data and
sensor data, respectively. We compare the experimental results with that of a NOC without sensors. The results
show that the overhead of sensors is negligible.
Academic Awards
- Cornell University Fellowship (Sep 2010)
- Outstanding Graduate Award, Tsinghua University (May 2010)
- Guanghua Scholarship, Tsinghua University (May 2009)
- HuangqianXiang Scholarship, Tsinghua University (May 2008)
- National Scholarship, Tsinghua University (May 2007)
Professional Activities
- External reviewer for HPCA2014, CCS2014
- Teaching Assistant for ENGRD 2300 (Fall 2012)
- Registration assistant for ISCA2012
Courses
- Fall 2015
- Spring 2014
- Fall 2013
- Spring 2013
- Spring 2013
- Fall 2011
- Fall 2011
- Spring 2011
- Spring 2011
- Fall 2010
- Fall 2010
|
CS5110 | Programming Languages and Logics |
|
CS5412 | Cloud Computing |
|
CS5120 | Introduction to Compilers |
|
ECE5430 | System Security |
|
ECE5750 | Advanced Microprocessor Architecture |
|
ECE5720 | Parallel Computer Architecture |
|
CS4410 | Operating Systems |
|
ECE5730 | Memory Systems |
|
ECE4740 | Digital VLSI |
|
ECE4750 | Computer Architecture |
|
CS5722 | Heuristic Methods for Optimization |