Co-Design of Binarized Deep Learning

Yichi Zhang

This talk presents our ongoing research on binarized neural networks (BNNs) through a co-design approach that spans both algorithms and hardware specialization. Existing BNNs are highly efficient in hardware, but their low model accuracy remains a significant challenge. I will introduce novel quantization techniques that enable a new family of BNN models achieving high accuracy on ImageNet and real-time inference on embedded FPGAs. I will also share promising results on binarizing language models. I will conclude by discussing the deployment challenges of BNNs and exploring the potential of an end-to-end compilation and mapping flow targeting novel hardware.

Bio: Yichi is a fifth-year Ph.D. student at CSL, advised by Professor Zhiru Zhang. His research focuses on efficient ML model-hardware co-design. His work spans neural network quantization and binarization algorithms, ML accelerators, and building learning systems at hyperscale.