Zhiru Zhang -- Publications

2026

[C120] Revisiting Pre-Propagation GNNs: Robust Diffusion Operators and Hidden-State Re-Propagation
Zichao Yue, Zhiru Zhang
International Conference on Machine Learning (ICML), Jul. 2026.
[J25] Beyond the Accelerator: A Full-Stack HW/SW Co-Design Analysis for Recommendation System Inference
Zhanqiu Hu, Mark Zhao, Zhiru Zhang, Udit Gupta
IEEE Micro, 2026.
[C119] FlashDLM: Accelerating Diffusion Language Model Inference via Efficient KV Caching and Guided Diffusion
Zhanqiu Hu, Jian Meng, Yash Akhauri, Mohamed S. Abdelfattah, Jae-sun Seo, Zhiru Zhang, Udit Gupta
International Conference on Learning Representations (ICLR), Apr. 2026.
[C118] HeuriGym: An Agentic Benchmark for LLM-Crafted Heuristics in Combinatorial Optimization
Hongzheng Chen, Yingheng Wang, Yaohui Cai, Hins Hu, Jiajie Li, Shirley Huang, Chenhui Deng, Rongjian Liang, Shufeng Kong, Haoxing Ren, Samitha Samaranayake, Carla P. Gomes, Zhiru Zhang
International Conference on Learning Representations (ICLR), Apr. 2026.
[C117] Tawa: Automatic Warp Specialization for Modern GPUs with Asynchronous References
Hongzheng Chen, Bin Fan, Alexander Collins, Bastian Hagedorn, Evghenii Gaburov, Masahiro Masuda, Matthew Brookhart, Chris Sullivan, Jason Knight, Zhiru Zhang, Vinod Grover
International Symposium on Code Generation and Optimization (CGO), Jan-Feb. 2026.

2025

[J24] PIMsynth: A Unified Compiler Framework for Bit-Serial Processing-In-Memory Architectures
Deyuan Guo, Mohammadhosein Gholamrezaei, Matthew Hofmann, Ashish Venkat, Zhiru Zhang, Kevin Skadron
IEEE Computer Architecture Letters (CAL), 2025.
[C116] e-boost: Boosted E-Graph Extraction with Adaptive Heuristics and Exact Solving (Best Paper Nominee)
Jiaqi Yin, Zhan Song, Chen Chen, Yaohui Cai, Zhiru Zhang, Cunxi Yu
International Conference on Computer-Aided Design (ICCAD), Oct. 2025.
[C115] EqMap: FPGA LUT Remapping using E-Graphs
Matthew Hofmann, Berk Gokmen, Zhiru Zhang
International Conference on Computer-Aided Design (ICCAD), Oct. 2025.
[C114] Characterizing and Optimizing Realistic Workloads on a Commercial Compute-in-SRAM Device
Niansong Zhang, Wenbo Zhu, Courtney Golden, Dan Ilan, Hongzheng Chen, Christopher Batten, Zhiru Zhang
International Symposium on Microarchitecture (MICRO), Oct. 2025.
[C113] ASPEN: LLM-Guided E-Graph Rewriting for RTL Datapath Optimization
Niansong Zhang, Chenhui Deng, Johannes M. Kuehn, Chia-Tung Ho, Cunxi Yu, Zhiru Zhang, Haoxing Ren
International Symposium on Machine Learning for CAD (MLCAD), Sep. 2025.
[C112] CirSTAG: Circuit Stability Analysis on Graph-based Manifolds (Best Paper Nominee)
Wuxinlin Cheng, Yihang Yuan, Chenhui Deng, Ali Aghdaei, Zhiru Zhang, Zhuo Feng
Design Automation Conference (DAC), Jun. 2025.
[C111] Graph Learning at Scale: Characterizing and Optimizing Pre-Propagation GNNs
Zichao Yue, Chenhui Deng, Zhiru Zhang
The Annual Conference on Machine Learning and Systems (MLSys), May 2025.
[J23] Vesper: A Versatile Sparse Linear Algebra Accelerator with Configurable Compute Patterns
Hanchen Jin, Zichao Yue, Zhongyuan Zhao, Yixiao Du, Chenhui Deng, Nitish Srivastava, Zhiru Zhang
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD), May 2025.
[C110] SmoothE: Differentiable E-Graph Extraction (Best Paper Award)
Yaohui Cai, Kaixiang Yang, Chenhui Deng, Cunxi Yu, Zhiru Zhang
International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), Apr. 2025.
[C109] Cypress: VLSI-Inspired PCB Placement with GPU Acceleration (Best Paper Award)
Niansong Zhang, Anthony Agnesina, Noor Shbat, Yuval Leader, Zhiru Zhang, Haoxing Ren
International Symposium on Physical Design (ISPD), Mar. 2025.
[C108] ARIES: An Agile MLIR-Based Compilation Flow for Reconfigurable Devices with AI Engines (Best Paper Nominee)
Jinming Zhuang, Shaojie Xiang, Hongzheng Chen, Niansong Zhang, Zhuoping Yang, Tony Mao, Zhiru Zhang, Peipei Zhou
International Symposium on Field-Programmable Gate Arrays (FPGA), Feb/Mar. 2025.

2024

[C107] Rapid GPU-Based Pangenome Graph Layout
Jiajie Li, Jan-Niklas Schmelzle, Yixiao Du, Simon Heumos, Andrea Guarracino, Giulia Guidi, Pjotr Prins, Erik Garrison, Zhiru Zhang
International Conference for High Performance Computing, Networking, Storage, and Analysis (SC), Nov. 2024.
[C106] UniSparse: An Intermediate Language for General Sparse Format Customization
Jie Liu, Zhongyuan Zhao, Zijian Ding, Benjamin Brock, Hongbo Rong, Zhiru Zhang
Conference on Object-Oriented Programming, Systems, Languages & Applications (OOPSLA), Oct. 2024.
[C105] FLIQS: One-Shot Mixed-Precision Floating-Point and Integer Quantization Search (Best Paper Award)
Jordan Dotzel, Gang Wu, Andrew Li, Muhammad Umar, Yun Ni, Mohamed S. Abdelfattah, Zhiru Zhang, Liqun Cheng, Martin G. Dixon, Norman P. Jouppi, Quoc V. Le, Sheng Li
International Conference on Automated Machine Learning (AutoML), Sep. 2024.
[C104] Differentiable Combinatorial Scheduling at Scale
Mingju Liu, Yingjie Li, Jiaqi Yin, Zhiru Zhang, Cunxi Yu
International Conference on Machine Learning (ICML), Jul. 2024.
[C103] Learning from Students: Applying t-Distributions to Explore Accurate and Efficient Formats for LLMs
Jordan Dotzel, Yuzong Chen, Bahaa Kotb, Sushma Prasad, Gang Wu, Sheng Li, Mohamed S. Abdelfattah, Zhiru Zhang
International Conference on Machine Learning (ICML), Jul. 2024.
[C102] Sabre: Hardware-Accelerated Snapshot Compression for Serverless MicroVMs
Nikita Lazarev, Varun Gohil, James Tsai, Andy Anderson, Bhushan Chitlur, Zhiru Zhang, Christina Delimitrou
USENIX Symposium on Operating Systems Design and Implementation (OSDI), Jul. 2024.
[C101] Scalable, Programmable and Dense: The HammerBlade Open-Source RISC-V Manycore
Dai Cheol Jung, Max Ruttenberg, Paul Gao, Scott Davidson, Daniel Petrisko, Kangli Li, Aditya Kamath, Lin Cheng, Shaolin Xie, Peitian Pan, Zhongyuan Zhao, Zichao Yue, Bandhav Veluri, Sripathi Muralitharan, Adrian Sampson, Andrew Lumsdaine, Zhiru Zhang, Christopher Batten, Mark Oskin, Dustin Richmond, Michael B. Taylor
International Symposium on Computer Architecture (ISCA), Jul. 2024.
[J22] Pangenome Graph Layout by Path-Guided Stochastic Gradient Descent
Simon Heumos, Andrea Guarracino, Jan-Niklas M. Schmelzle, Jiajie Li, Zhiru Zhang, Jörg Hagmann, Sven Nahnsen, Pjotr Prins, Erik Garrison
Bioinformatics, Jul. 2024.
[C100] Allo: A Programming Model for Composable Accelerator Design
Hongzheng Chen, Niansong Zhang, Shaojie Xiang, Zhichen Zeng, Mengjia Dai, Zhiru Zhang
ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), Jun. 2024.
[C99] Less is More: Hop-wise Graph Attention for Scalable and Generalizable Learning on Circuits
Chenhui Deng, Zichao Yue, Cunxi Yu, Gokce Sarar, Ryan Carey, Rajeev Jain, Zhiru Zhang
Design Automation Conference (DAC), Jun. 2024.
[J21] Understanding the Potential of FPGA-Based Spatial Acceleration for Large Language Model Inference
Hongzheng Chen, Jiahao Zhang, Yixiao Du, Shaojie Xiang, Zichao Yue, Niansong Zhang, Yaohui Cai, Zhiru Zhang
ACM Transactions on Reconfigurable Technology and Systems (TRETS), May 2024. (FCCM'24 Journal Track)
[J20] Supporting a Virtual Vector Instruction Set on a Commercial Compute-in-SRAM Accelerator
Courtney Golden, Dan Ilan, Caroline Huang, Niansong Zhang, Zhiru Zhang, Christopher Batten
IEEE Computer Architecture Letters (CAL), Jan-Jun. 2024.
[C98] Polynormer: Polynomial-Expressive Graph Transformer in Linear Time
Chenhui Deng, Zichao Yue, Zhiru Zhang
International Conference on Learning Representations (ICLR), May 2024.
[C97] Slapo: A Schedule Language for Progressive Optimization of Large Deep Learning Model Training
Hongzheng Chen, Cody Hao Yu, Shuai Zheng, Zhen Zhang, Zhiru Zhang, Yida Wang
International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), Apr./May 2024.
[C96] LibPreemptible: Enabling Fast, Adaptive, and Hardware-Assisted User-Space Scheduling
Yueying Li, Nikita Lazarev, David Koufaty, Tenny Yin, Andy Anderson, Zhiru Zhang, G. Edward Suh, Kostis Kaffes, Christina Delimitrou
International Symposium on High-Performance Computer Architecture (HPCA), Mar. 2024.
[C95] Formal Verification of Source-to-Source Transformations for HLS (Best Paper Award)
Louis-Noël Pouchet, Emily Tucker, Niansong Zhang, Hongzheng Chen, Debjit Pal, Gabriel Rodríguez, Zhiru Zhang
International Symposium on Field-Programmable Gate Arrays (FPGA), Mar. 2024.

2023

[C94] Binarized Neural Machine Translation
Y. Zhang, A. Garg, Y. Cao, L. Lew, B. Ghorbani, Z. Zhang, O. Firat
Conference on Neural Information Processing Systems (NeurIPS), Dec. 2023.
[J19] TAPA: A Scalable Task-Parallel Dataflow Programming Framework for Modern FPGAs with Co-Optimization of HLS and Physical Design
L. Guo, Y. Chi, J. Lau, L. Song, X. Tian, M. Khatti, W. Qiao, J. Wang, E. Ustun, Z. Fang, Z. Zhang, J. Cong
ACM Transactions on Reconfigurable Technology and Systems (TRETS), Dec. 2023.
[C93] Machine Learning for Embedded System Design
E. S. Alcorta, A. Gerstlauer, C. Deng, Q. Sun, Z. Zhang, C. Xu, L. W. Wills, D. S. Lopera, W. Ecker, S. Garg, J. Hu
International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS), Sep. 2023. (Invited Paper)
[C92] Resilient Baseband Processing in Virtualized RANs with Slingshot
N. Lazarev, T. Ji, A. Kalia, D. Kim, I. Marinos, F. Y. Yan, C. Delimitrou, Z. Zhang, A. Akella
Annual Conference of the ACM Special Interest Group on Data Communication (SIGCOMM), Sep. 2023.
[J18] RapidStream 2.0: Automated Parallel Implementation of Latency Insensitive FPGA Designs Through Partial Reconfiguration
L. Guo, P. Maidee, Y. Zhou, C. Lavin, E. Hung, W. Li, J. Lau, W. Qiao, Y. Chi, L. Song, Y. Xiao, A. Kaviani, Z. Zhang, J. Cong
ACM Transactions on Reconfigurable Technology and Systems (TRETS), Sep. 2023.
[J17] A 28nm 8-bit Floating-Point Tensor Core based CNN Training Processor with Dynamic Activation/Weight Sparsification
S. K. Venkataramanaiah, J. Meng, H. Suh, I. Yeo, J. Saikia, S. K. Cherupally, Y. Zhang, Z. Zhang, J. Seo
IEEE Journal of Solid-State Circuits (JSSC), Jul. 2023.
[C91] Equality Saturation for Datapath Synthesis: A Pathway to Pareto Optimality (Invited Perspective Paper)
E. Ustun, C. Yu, Z. Zhang
Design Automation Conference (DAC), Jul. 2023.
[J16] An Intermediate Language for General Sparse Format Customization
J. Liu, Z. Zhao, Z. Ding, B. Brock, H. Rong, Z. Zhang
IEEE Computer Architecture Letters (CAL), Jul-Dec. 2023.
[C90] A Case for Open EDA Verticals (Invited Perspective Paper)
Z. Zhang, M. Hofmann, A. Butt
International Symposium on Physical Design (ISPD), Mar. 2023.

2022

[C89] GARNET: Reduced-Rank Topology Learning for Robust and Scalable Graph Neural Networks (Spotlight)
C. Deng, X. Li, Z. Feng, Z. Zhang
Learning on Graphs Conference (LoG), Dec. 2022.
[C88] Understanding Hyperdimensional Computing for Parallel Single-Pass Learning
T. Yu, Y. Zhang, Z. Zhang, C. De Sa
Conference on Neural Information Processing Systems (NeurIPS), Nov/Dec. 2022.
[J15] FPGA HLS Today: Successes, Challenges, and Opportunities (Keynote Paper)
J. Cong, J. Lau, G. Liu, S. Neuendorffer, P. Pan, K. Vissers, Z. Zhang
ACM Transactions on Reconfigurable Technology and Systems (TRETS), Dec. 2022.
[B3] Machine Learning for Agile FPGA Design
D. Pal, C. Deng, E. Ustun, C. Yu, Z. Zhang
Machine Learning Applications in Electronic Design Automation, ed. H. Ren, J. Hu, Springer, Aug. 2022.
[J14] Reverse Engineering CNN Models using Side-Channel Attacks (IEEE HSTTC Top Picks in Hardware and Embedded Security)
W. Hua, Z. Zhang, G. E. Suh
IEEE Design & Test, Aug. 2022.
[C87] GuardNN: Secure Accelerator Architecture for Privacy-Preserving Deep Learning
W. Hua, M. Umar, Z. Zhang, G. E. Suh
Design Automation Conference (DAC), Jul. 2022.
[C86] Accelerator Design with Decoupled Hardware Customizations: Benefits and Challenges
D. Pal, Y.-H. Lai, S. Xiang, N. Zhang, H. Chen, J. Casas, P. Cocchini, Z. Yang, Jin Yang, L.-N. Pouchet, Z. Zhang
Design Automation Conference (DAC), Jul. 2022. (Invited Paper)
[C85] PokeBNN: A Binary Pursuit of Lightweight Accuracy
Y. Zhang, Z. Zhang, L. Lew
The Conference on Computer Vision and Pattern Recognition (CVPR), Jun. 2022.
[C84] SoftVN: Efficient Memory Protection via Software-Provided Version Numbers
M. Umar, W. Hua, Z. Zhang, G. E. Suh
International Symposium on Computer Architecture (ISCA), Jun. 2022.
[C83] MGX: Near-Zero Overhead Memory Protection for Data-Intensive Accelerators
W. Hua, M. Umar, Z. Zhang, G. E. Suh
International Symposium on Computer Architecture (ISCA), Jun. 2022.
[J13] A Tensor Processing Framework for CPU-Manycore Heterogeneous Systems
L. Cheng, P. Pan, Z. Zhao, K. Ranjan, J. Weber, B. Veluri, S. Ehsani, M. Ruttenberg, D. Jung, P. Ivanov, D. Richmond, M. Taylor, Z. Zhang, C. Batten
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD), Jun. 2022.
[C82] IMpress: Large Integer Multiplication Expression Rewriting for FPGA HLS
E. Ustun, I. San, J. Yin, C. Yu, Z. Zhang
International Symposium on Field-Programmable Custom Computing Machines (FCCM), May 2022.
[C81] HeteroFlow: An Accelerator Programming Model with Decoupled Data Placement for Software-Defined FPGAs
S. Xiang, Y.-H. Lai, Y. Zhou, H. Chen, N. Zhang, D. Pal, Z. Zhang
International Symposium on Field-Programmable Gate Arrays (FPGA), Feb/Mar. 2022.
[C80] High-Performance Sparse Linear Algebra on HBM-Equipped FPGAs Using HLS: A Case Study on SpMV
Y. Du, Y. Hu, Z. Zhang
International Symposium on Field-Programmable Gate Arrays (FPGA), Feb/Mar. 2022.
[C79] RapidStream: Parallel Physical Implementation of FPGA HLS Designs (Best Paper Award)
L. Guo, P. Maidee, Y. Zhou, C. Lavin, J. Wang, Y. Chi, W. Qiao, A. Kaviani, Z. Zhang, J. Cong
International Symposium on Field-Programmable Gate Arrays (FPGA), Feb/Mar. 2022.
[B2] FPGA-Specific Compilers
N. Srivastava, G. Liu, Y.-H. Lai, Z. Zhang
Handbook of Computer Architecture, ed. A. Chattopadhyay, Springer, Jan. 2022.

2021

[C78] BulletTrain: Accelerating Robust Neural Network Training via Boundary Example Mining
W. Hua, Y. Zhang, C. Guo, Z. Zhang, G. E. Suh
Conference on Neural Information Processing Systems (NeurIPS), Dec. 2021.
[J14] Programming and Synthesis for Software-Defined FPGA Acceleration: Status and Future Prospects (Invited Paper)
Y.-H. Lai, E. Ustun, S. Xiang, Z. Fang, H. Rong, Z. Zhang
ACM Transactions on Reconfigurable Technology and Systems (TRETS), Dec. 2021.
[C77] Distilling Arbitration Logic from Traces using Machine Learning: A Case Study on NoC (Best Paper Nominee)
Y. Zhou, H. Wang, J. Yin, Z. Zhang
Design Automation Conference (DAC), Dec. 2021.
[C76] GraphLily: Accelerating Graph Linear Algebra on HBM-Equipped FPGAs
Y. Hu, Y. Du, E. Ustun, Z. Zhang
International Conference on Computer-Aided Design (ICCAD), Nov. 2021.
[C75] Scaling Up Hardware Accelerator Verification using A-QED with Functional Decomposition
S. Chattopadhyay, F. Lonsing, L. Piccolboni, D. Soni, P. Wei, X. Zhang, Y. Zhou, L. Carloni, D. Chen, J. Cong, R. Karri, Z. Zhang, C. Trippel, C. Barrett, S. Mitra
Twenty-first Conference on Formal Methods in Computer-Aided Design (FMCAD), Oct. 2021.
[J12] Enabling Design Methodologies and Future Trends for Edge AI: Specialization and Co-design
C. Hao, J. Dotzel, J. Xiong, L. Benini, Z. Zhang, D. Chen
IEEE Design & Test, Aug. 2021.
[C74] SPADE: A Spectral Method for Black-Box Adversarial Robustness Evaluation
W. Cheng, C. Deng, Z. Zhao, Y. Cai, Z. Zhang, Z. Feng
International Conference on Machine Learning (ICML), Jul. 2021.
[C73] Dagger: Efficient and Fast RPCs in Cloud Microservices with Near-Memory Reconfigurable NICs (IEEE Micro Top Picks Honorable Mention)
N. Lazarev, S. Xiang, N. Adit, Z. Zhang, C. Delimitrou
International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), Apr. 2021.
[C72] FracBNN: Accurate and FPGA-Efficient Binary Neural Networks with Fractional Activations (Best Paper Nominee)
Y. Zhang, J. Pan, X. Liu, H. Chen, D. Chen, Z. Zhang
International Symposium on Field-Programmable Gate Arrays (FPGA), Feb/Mar. 2021.
[C71] AutoBridge: Coupling Coarse-Grained Floorplanning and Pipelining for High-Frequency HLS Design on Multi-Die FPGAs (Best Paper Award)
L. Guo, Y. Chi, J. Wang, J. Lau, W. Qiao, E. Ustun, Z. Zhang, J. Cong
International Symposium on Field-Programmable Gate Arrays (FPGA), Feb/Mar. 2021.
[C70] GLAIVE: Graph Learning Assisted Instruction Vulnerability Estimation
J. Jiao, D. Pal, C. Deng, Z. Zhang
Design, Automation and Test in Europe Conference (DATE), Feb. 2021.

2020

[C69] FeatGraph: A Flexible and Efficient Backend for Graph Neural Network Systems
Y. Hu, Z. Ye, M. Wang, J. Yu, D. Zheng, M. Li, Z. Zhang, Zhiru Zhang, Y. Wang
International Conference for High Performance Computing, Networking, Storage, and Analysis (SC), Nov. 2020.
[C68] SuSy: A Programming Model for Productive Construction of High-Performance Systolic Arrays on FPGAs
Y.-H. Lai, H. Rong, S. Zheng, W. Zhang, X. Cui, Y. Jia, J. Wang, B. Sullivan, Z. Zhang, Y. Liang, Y. Zhang, J. Cong, N. George, J. Alvarez, C. Hughes, P. Dubey
International Conference on Computer-Aided Design (ICCAD), Nov. 2020.
[C67] Accurate Operation Delay Prediction for FPGA HLS using Graph Neural Networks
E. Ustun, C. Deng, D. Pal, Z. Zhang
International Conference on Computer-Aided Design (ICCAD), Nov. 2020.
[C66] MatRaptor: A Sparse-Sparse Matrix Multiplication Accelerator Based on Row-Wise Product
N. Srivastava, H. Jin, J. Liu, D. Albonesi, Z. Zhang
International Symposium on Microarchitecture (MICRO), Oct. 2020.
[J11] Dagger: Towards Efficient RPCs in Cloud Microservices with Near-Memory Reconfigurable NICs
N. Lazarev, N. Adit, S. Xiang, Z. Zhang, C. Delimitrou
IEEE Computer Architecture Letters (CAL), Jul-Dec. 2020.
[C65] Analysis and Optimization of the Implicit Broadcasts in FPGA HLS to Improve Maximum Frequency
L. Guo, J. Lau, Y. Chi, J. Wang, C. Yu, Z. Chen, Z. Zhang, J. Cong
Design Automation Conference (DAC), Jul. 2020.
[C64] A-QED Verification of Hardware Accelerators
E. Singh, F. Lonsing, S. Chattopadhyay, M. Strange, P. Wei, X. Zhang, Y. Zhou, J. Cong, D. Chen, Z. Zhang, P. Raina, C. Barrett, S. Mitra
Design Automation Conference (DAC), Jul. 2020.
[C63] Predictable Design of FPGA Accelerators using Time-Sensitive Affine Types
R. Nigam, S. Atapattu, S. Thomas, Z. Li, T. Bauer, Y. Ye, A. Koti, A. Sampson, Z. Zhang
ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), Jun. 2020. (Selected into MIT PL Review 2023)
[C62] GraphZoom: A Multi-Level Spectral Approach for Accurate and Scalable Graph Embedding
C. Deng, Z. Zhao, Y. Wang, Z. Zhang, Z. Feng
International Conference on Learning Representations (ICLR), Apr. 2020. (Oral Category)
[C61] Precision Gating: Improving Neural Network Efficiency with Dynamic Dual-Precision Activations
Y. Zhang, R. Zhao, W. Hua, N. Xu, G. E. Suh, Z. Zhang
International Conference on Learning Representations (ICLR), Apr. 2020.
[C60] Tensaurus: A Versatile Accelerator for Mixed Sparse-Dense Tensor Computations
N. Srivastava, H. Jin, S. Smith, H. Rong, D. Albonesi, Z. Zhang
International Symposium on High-Performance Computer Architecture (HPCA), Feb. 2020.

2019

[J10] Evaluating Celerity: A 16nm 695 Giga-RISC-V Instructions/s Manycore Processor with Synthesizable PLL
A. Rovinski, C. Zhao, K. Al-Hawaj, P. Gao, S. Xie, C. Torng, S. Davidson, A. Amarnath, L. Vega, B. Veluri, A. Rao, T. Ajayi, J. Puscar, S. Dai, R. Zhao, D. Richmond, Z. Zhang, I. Galton, C. Batten, M. Taylor, R. Dreslinski
IEEE Solid-State Circuits Letters (SSC-L), Dec. 2019.
[C59] Channel Gating Neural Networks
W. Hua, Y. Zhou, C. De Sa, Z. Zhang, G. E. Suh
Conference on Neural Information Processing Systems (NeurIPS), Dec. 2019.
[C58] Boosting the Performance of CNN Accelerators with Dynamic Fine-Grained Channel Gating
W. Hua, Y. Zhou, C. De Sa, Z. Zhang, G. E. Suh
International Symposium on Microarchitecture (MICRO), Oct. 2019.
[C57] Building Efficient Deep Neural Networks with Unitary Group Convolutions
R. Zhao, Y. Hu, J. Dotzel, C. De Sa, Z. Zhang
The Conference on Computer Vision and Pattern Recognition (CVPR), Jun. 2019.
[C56] Improving Neural Network Quantization without Retraining using Outlier Channel Splitting
R. Zhao, Y. Hu, J. Dotzel, C. De Sa, Z. Zhang
International Conference on Machine Learning (ICML), Jun. 2019.
[C55] A 1.4 GHz 695 Giga RISC-V Inst/s 496-core Manycore Processor with Mesh On-Chip Network and an All-Digital Synthesized PLL in 16nm CMOS
A. Rovinski, C. Zhao, K. Al-Hawaj, P. Gao, S. Xie, C. Torng, S. Davidson, A. Amarnath, L. Vega, B. Veluri, A. Rao, T. Ajayi, J. Puscar, S. Dai, R. Zhao, D. Richmond, Z. Zhang, I. Galton, C. Batten, M. Taylor, R. Dreslinski
Symposium on VLSI Circuits (VLSI), Jun. 2019.
[C54] PRIMAL: Power Inference using Machine Learning
Y. Zhou, H. Ren, Y. Zhang, B. Keller, B. Khailany, Z. Zhang
Design Automation Conference (DAC), Jun. 2019.
[C53] Painting on Placement: Forecasting Routing Congestion using Conditional Generative Adversarial Nets
C. Yu, Z. Zhang
Design Automation Conference (DAC), Jun. 2019.
[C52] Designing Secure Cryptographic Accelerators with Information Flow Enforcement: A Case Study on AES
Z. Jiang, H. Jin, G. E. Suh, Z. Zhang
Design Automation Conference (DAC), Jun. 2019.
[C51] Improving Scalability of Exact Modulo Scheduling with Specialized Conflict-Driven Learning
S. Dai, Z. Zhang
Design Automation Conference (DAC), Jun. 2019.
[C50] Rapid Generation of High-Quality RISC-V Processors from Functional Instruction Set Specifications
G. Liu, J. Primmer, Z. Zhang
Design Automation Conference (DAC), Jun. 2019.
[C49] T2S-Tensor: Productively Generating High-Performance Spatial Hardware for Dense Tensor Computations
N. Srivastava, H. Rong, P. Barua, G. Feng, H. Cao, Z. Zhang, D. Albonesi, V. Sarkar, W. Chen, P. Petersen, G. Lowney, A. Herr, C. Hughes, T. Mattson, P. Dubey
International Symposium on Field-Programmable Custom Computing Machines (FCCM), Apr./May 2019.
[C48] LAMDA: Learning-Assisted Multi-Stage Autotuning for FPGA Design Closure
E. Ustun, S. Xiang, J. Gui, C. Yu, Z. Zhang
International Symposium on Field-Programmable Custom Computing Machines (FCCM), Apr./May 2019.
[C47] HeteroCL: A Multi-Paradigm Programming Infrastructure for Software-Defined Reconfigurable Computing (Best Paper Award)
Y.-H. Lai, Y. Chi, Y. Hu, J. Wang, C. H. Yu, Y. Zhou, J. Cong, Z. Zhang
International Symposium on Field-Programmable Gate Arrays (FPGA), Feb. 2019.
[J9] PIMap: A Flexible Framework for Improving LUT-Based Technology Mapping via Parallelized Iterative Optimization
G. Liu, Z. Zhang
ACM Transactions on Reconfigurable Technology and Systems (TRETS), Jan. 2019.

2018

[C46] High-Level Synthesis with Timing-Sensitive Information Flow Enforcement
Z. Jiang, S. Dai, G. E. Suh, Z. Zhang
International Conference on Computer-Aided Design (ICCAD), Nov. 2018.
[C45] Reverse Engineering Convolutional Neural Networks Through Side-channel Information Leaks
W. Hua, Z. Zhang, G. E. Suh
Design Automation Conference (DAC), Jun. 2018.
[C44] Fast and Accurate Estimation of Quality of Results in High-Level Synthesis with Machine Learning (Best Paper Award — Short Paper Category)
S. Dai, Y. Zhou, H. Zhang, E. Ustun, E. F. Y. Young, Z. Zhang
International Symposium on Field-Programmable Custom Computing Machines (FCCM), Apr./May 2018.
[J8] The Celerity Open-Source 511-Core RISC-V Tiered Accelerator Fabric: Fast Architectures and Design Methodologies for Fast Chips
S. Davidson, S. Xie, C. Torng, K. Al-Hawaj, A. Rovinski, T. Ajayi, L. Vega, C. Zhao, R. Zhao, S. Dai, A. Amarnath, B. Veluri, P. Gao, A. Rao, G. Liu, R. Gupta, Z. Zhang, R. Dreslinski, C. Batten, M. Taylor
IEEE Micro, Mar/Apr. 2018.
[C43] A Scalable Approach to Exact Resource-Constrained Scheduling Based on a Joint SDC and SAT Formulation (Best Paper Nominee)
S. Dai, G. Liu, Z. Zhang
International Symposium on Field-Programmable Gate Arrays (FPGA), Feb. 2018.
[C42] Rosetta: A Realistic High-Level Synthesis Benchmark Suite for Software Programmable FPGAs
Y. Zhou, U. Gupta, S. Dai, R. Zhao, N. Srivastava, H. Jin, J. Featherston, Y.-H. Lai, G. Liu, G. Velasquez, W. Wang, Z. Zhang
International Symposium on Field-Programmable Gate Arrays (FPGA), Feb. 2018.

2017

[C41] Statistically Certified Approximate Logic Synthesis
G. Liu, Z. Zhang
International Conference on Computer-Aided Design (ICCAD), Nov. 2017.
[C40] Enabling Adaptive Loop Pipelining in High-Level Synthesis
S. Dai, G. Liu, R. Zhao, Z. Zhang
51st Annual Asilomar Conference on Signals, Systems, and Computers, Oct. 2017. (Invited Paper)
[C39] Celerity: An Open-Source RISC-V Tiered Accelerator Fabric
T. Ajayi, K. Al-Hawaj, A. Amarnath, S. Dai, S. Davidson, P. Gao, G. Liu, A. Lotfi, J. Puscar, A. Rao, A. Rovinski, L. Salem, N. Sun, C. Torng, L. Vega, B. Veluri, X. Wang, S. Xie, C. Zhao, R. Zhao, C. Batten, R. Dreslinski, I. Galton, R. Gupta, P. Mercier, M. Srivastava, M. Taylor, Z. Zhang
ACM/IEEE Symposium on High-Performance Chips (HOTCHIPS), Aug. 2017.
[C38] FPGA-based Real-time Charged Particle Trajectory Reconstruction at the Large Hadron Collider
E. Bartz, J. Chaves, Y. Gershtein, E. Halkiadakis, M. Hildreth, S. Kyriacou, K. Lannon, A. Lefeld, A. Ryd, L. Skinnari, R. Stone, C. Strohman, Z. Tao, B. Winer, P. Wittich, Z. Zhang, M. Zientek
International Symposium on Field-Programmable Custom Computing Machines (FCCM), May 2017.
[C37] A Parallelized Iterative Improvement Approach to Area Optimization for LUT-Based Technology Mapping (Best Paper Nominee)
G. Liu, Z. Zhang
International Symposium on Field-Programmable Gate Arrays (FPGA), Feb. 2017.
[C36] A New Approach to Automatic Memory Banking using Trace-Based Address Mining
Y. Zhou, K. Al-Hawaj, Z. Zhang
International Symposium on Field-Programmable Gate Arrays (FPGA), Feb. 2017.
[C35] Accelerating Binarized Convolutional Neural Networks with Software-Programmable FPGAs
R. Zhao, W. Song, W. Zhang, T. Xing, J.-H. Lin, M. Srivastava, R. Gupta, Z. Zhang
International Symposium on Field-Programmable Gate Arrays (FPGA), Feb. 2017.
[C34] A Parallel Bandit-Based Approach for Autotuning FPGA Compilation
C. Xu, G. Liu, R. Zhao, S. Yang, G. Luo, Z. Zhang
International Symposium on Field-Programmable Gate Arrays (FPGA), Feb. 2017.
[C33] Dynamic Hazard Resolution for Pipelining Irregular Loops in High-Level Synthesis
S. Dai, R. Zhao, G. Liu, S. Srinath, U. Gupta, C. Batten, Z. Zhang
International Symposium on Field-Programmable Gate Arrays (FPGA), Feb. 2017.
[C32] Accelerating Face Detection on Programmable SoC Using C-Based Synthesis
N. Srivastava, S. Dai, R. Manohar, Z. Zhang
International Symposium on Field-Programmable Gate Arrays (FPGA), Feb. 2017.
[J7] Architecture and Synthesis for Area-Efficient Pipelining of Irregular Loop Nests
G. Liu, M. Tan, S. Dai, R. Zhao, Z. Zhang
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD), Feb. 2017.

2016

[J6] Platform Choices and Design Demands for IoT Platforms: Cost, Power and Performance Tradeoffs
D. Chen, J. Cong, S. Gurumani, W.-M. Hwu, K. Rupnow, Z. Zhang
IET Cyber-Physical Systems: Theory & Applications (IET-CPS), Nov. 2016.
[C31] Improving High-Level Synthesis with Decoupled Data Structure Optimization
R. Zhao, G. Liu, S. Srinath, C. Batten, Z. Zhang
Design Automation Conference (DAC), Jun. 2016.
[C30] Characterizing the Benefits and Limitations of Smart Building Meeting Room Scheduling
A. Majumdar, Z. Zhang, D. Albonesi
International Conference on Cyber-Physical Systems (ICCPS), Apr. 2016.

2015

[C29] DA Systemization of Knowledge: A Catalog of Prior Forward-Looking Initiatives (Invited Paper)
F. Koushanfar, A. Mirhoseini, G. Qu, Z. Zhang
International Conference on Computer-Aided Design (ICCAD), Nov. 2015.
[C28] ElasticFlow: A Complexity-Effective Approach for Pipelining Irregular Loop Nests
M. Tan, G. Liu, R. Zhao, S. Dai, Z. Zhang
International Conference on Computer-Aided Design (ICCAD), Nov. 2015.
[C27] A Reconfigurable Analog Substrate for Highly Efficient Maximum Flow Computation
G. Liu, Z. Zhang
Design Automation Conference (DAC), Jun. 2015.
[C26] Area-Efficient Pipelining for FPGA-Targeted High-Level Synthesis
R. Zhao, M. Tan, S. Dai, Z. Zhang
Design Automation Conference (DAC), Jun. 2015.
[C25] Mapping-Aware Constrained Scheduling for LUT-Based FPGAs
M. Tan, S. Dai, U. Gupta, Z. Zhang
International Symposium on Field-Programmable Gate Arrays (FPGA), Feb. 2015.
[J5] High-Level Synthesis for Low-Power Design
Z. Zhang, D. Chen, S. Dai, K. Campbell
IPSJ Transactions on System LSI Design Methodology (T-SLDM), Feb. 2015. (Invited Paper)

2014

[C24] Architectural Specialization for Inter-Iteration Loop Dependence Patterns
S. Srinath, B. Ilbeyi, M. Tan, G. Liu, Z. Zhang, C. Batten
International Symposium on Microarchitecture (MICRO), Dec. 2014.
[C23] Multithreaded Pipeline Synthesis for Data-Parallel Kernels
M. Tan, B. Liu, S. Dai, Z. Zhang
International Conference on Computer-Aided Design (ICCAD), Nov. 2014.
[C22] CASA: Correlation-Aware Speculative Adders
G. Liu, Y. Tao, M. Tan, Z. Zhang
International Symposium on Low Power Electronics and Design (ISLPED), Aug. 2014.
[C21] Flushing-Enabled Loop Pipelining for High-Level Synthesis
S. Dai, M. Tan, K. Hao, Z. Zhang
Design Automation Conference (DAC), Jun. 2014.

2013

[C20] SDC-Based Modulo Scheduling for Pipeline Synthesis
Z. Zhang, B. Liu
International Conference on Computer-Aided Design (ICCAD), Nov. 2013.

2012

[C19] Challenges and Opportunities of ESL Design Automation (Invited Paper)
Z. Zhang, D. Chen
International Conference on Solid-State and Integrated Circuit Technology (ICSICT), Oct. 2012.

2011

[J4] High-Level Synthesis for FPGAs: From Prototyping to Deployment (Keynote Paper)
J. Cong, B. Liu, S. Neuendorffer, J. Noguera, K. Vissers, Z. Zhang
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD), 30(4):473–491, Apr. 2011.

2010

[J3] Behavior-Level Observability Analysis and Operation Gating in Low-Power Behavioral Synthesis (Best Paper Award)
J. Cong, B. Liu, R. Majumdar, Z. Zhang
ACM Transactions on Design Automation of Electronic Systems (TODAES), 16(1):1–29, Nov. 2010.
[C18] Bit-Level Optimization for High-Level Synthesis and FPGA-Based Acceleration
J. Zhang, Z. Zhang, S. Zhou, M. Tan, X. Liu, X. Cheng, J. Cong
International Symposium on Field-Programmable Gate Arrays (FPGA), Feb. 2010.

2009

[C17] Scheduling with Soft Constraints (Best Paper Nominee)
J. Cong, B. Liu, Z. Zhang
International Conference on Computer-Aided Design (ICCAD), Nov. 2009.
[C16] Behavior-Level Observability Don't-Cares and Application to Low-Power Behavioral Synthesis
J. Cong, B. Liu, Z. Zhang
International Symposium on Low Power Electronics and Design (ISLPED), Aug. 2009.
[C15] Evaluation of Static Analysis Techniques for Fixed-Point Precision Optimization
J. Cong, K. Gururaj, B. Liu, C. Liu, Z. Zhang, S. Zhou, Y. Zou
IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM), Apr. 2009.

2008

[B1] AutoPilot: A Platform-Based ESL Synthesis System
Z. Zhang, Y. Fan, W. Jiang, G. Han, C. Yang, J. Cong
High-Level Synthesis: From Algorithm to Digital Circuit, ed. P. Coussy, A. Morawiec, Springer, 2008.
[C14] Scheduling with Integer Time Budgeting for Low-Power Optimization
W. Jiang, Z. Zhang, M. Potkonjak, J. Cong
Asia and South Pacific Design Automation Conference (ASP-DAC), Jan. 2008.
[C13] Behavioral Synthesis with Activating Unused Flip-Flops for Reducing Glitch Power in FPGA
C.T. Hsieh, J. Cong, S.C. Chang, Z. Zhang
Asia and South Pacific Design Automation Conference (ASP-DAC), Jan. 2008.

2007

[C12] High-Level Power Estimation and Low-Power Design Space Exploration for FPGAs
D. Chen, J. Cong, Y. Fan, Z. Zhang
Asia and South Pacific Design Automation Conference (ASP-DAC), Jan. 2007.

2006

[J2] Architecture and Compiler Optimization for Data Bandwidth Improvement in Configurable Processors
J. Cong, G. Han, Z. Zhang
IEEE Transactions on Very Large Scale Integration Systems (TVLSI), 14(9):986–997, Sep. 2006.
[C11] Platform-Based Behavior-Level and System-Level Synthesis (Invited Paper)
J. Cong, Y. Fan, G. Han, W. Jiang, Z. Zhang
IEEE International SOC Conference (SOCC), Sep. 2006.
[C10] An Efficient and Versatile Scheduling Algorithm Based on SDC Formulation (TCFPGA Hall of Fame — Class of 2022)
J. Cong, Z. Zhang
Design Automation Conference (DAC), Jul. 2006.
[C9] Behavior and Communication Co-Optimization for Systems with Sequential Communication Media
J. Cong, Y. Fan, G. Han, W. Jiang, Z. Zhang
Design Automation Conference (DAC), Jul. 2006.

2005

[C8] Architecture and Compilation for Data Bandwidth Improvement in Configurable Embedded Processors
J. Cong, G. Han, Z. Zhang
International Conference on Computer-Aided Design (ICCAD), Nov. 2005.
[C7] Instruction Set Extension with Shadow Registers for Configurable Processors
J. Cong, Y. Fan, G. Han, A. Jagannathan, G. Reinman, Z. Zhang
International Symposium on Field-Programmable Gate Arrays (FPGA), Feb. 2005.
[C6] Bitwidth-Aware Scheduling and Binding in High-Level Synthesis
J. Cong, Y. Fan, G. Han, Y. Lin, J. Xu, Z. Zhang, X. Cheng
Asia and South Pacific Design Automation Conference (ASP-DAC), Jan. 2005.

2004

[C5] Architecture-Level Synthesis for Automatic Interconnect Pipelining
J. Cong, Y. Fan, Z. Zhang
Design Automation Conference (DAC), Jun. 2004.
[J1] Architecture and Synthesis for On-Chip Multicycle Communication
J. Cong, Y. Fan, G. Han, X. Yang, Z. Zhang
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD), 23(4):550–564, Apr. 2004.
[C4] Application-Specific Instruction Generation for Configurable Processor Architectures (TCFPGA Hall of Fame — Class of 2023)
J. Cong, Y. Fan, G. Han, Z. Zhang
International Symposium on Field-Programmable Gate Arrays (FPGA), Feb. 2004.

2003

[C3] Architectural Synthesis Integrated with Global Placement for Multi-Cycle Communication
J. Cong, Y. Fan, G. Han, X. Yang, Z. Zhang
International Conference on Computer-Aided Design (ICCAD), Nov. 2003.
[C2] Gradual Relaxation Technique with Application to Behavioral Synthesis
Z. Zhang, Y. Fan, M. Potkonjak, J. Cong
International Conference on Computer-Aided Design (ICCAD), Nov. 2003.
[C1] Architecture and Synthesis for Multi-Cycle Communication (Invited Paper)
J. Cong, Y. Fan, X. Yang, Z. Zhang
International Symposium on Physical Design (ISPD), Apr. 2003.

IEEE Copyright Notice: This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. In most cases, these works may not be reposted without the explicit permission of the copyright holder.

ACM Copyright Notice: This is the author's version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution.

Other Copyright: All non-IEEE/ACM papers are copyright of the respective journal or conference organizing body. These online copies are provided for your personal research use only.