Computer Systems Laboratory Retreat 2023
Friday, May 12th • Moakley House, Ithaca, NY
Exploiting Sparsity for Efficient Contrastive Self-Supervised Learning
Contrastive learning (CL) and its variants have become popular schemes in the self-supervised learning domain, achieving performance comparable to supervised learning without massive labeling effort. Despite this labeling efficiency, CL requires wide and large models to reach high accuracy, which incurs a heavy amount of computation and undermines the practical merit of self-supervised learning. Pursuing sparsity can effectively reduce model size and improve efficiency, but we observe that state-of-the-art pruning methods for supervised learning are not always feasible in the contrastive learning domain.
This talk will present two novel sparsification methods designed for energy-efficient contrastive learning. First, for dynamic activation sparsity, we present contrastive dual gating (CDG), a novel pruning algorithm that skips uninformative features during contrastive training without hurting the trainability of the network. Second, for weight sparsity, we introduce synchronized contrastive pruning (SyncCP), which synchronizes weight sparsification across the two contrastive branches while preserving the advantages of asymmetric CL. Both methods are evaluated comprehensively with structured and unstructured sparsity, as well as GPU-friendly N:M structured fine-grained sparsity.
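For readers unfamiliar with the last term: N:M structured fine-grained sparsity keeps at most N nonzero weights in every group of M consecutive weights (e.g., 2:4 on recent NVIDIA GPUs). The sketch below is purely illustrative of that constraint, assuming simple magnitude-based selection; it is not the CDG or SyncCP algorithm presented in the talk.

```python
import numpy as np

def nm_prune(weights, n=2, m=4):
    """Illustrative N:M pruning along the last axis: in every group of
    m consecutive weights, keep the n largest-magnitude entries and
    zero out the rest."""
    w = np.asarray(weights, dtype=float)
    assert w.shape[-1] % m == 0, "last dim must be divisible by m"
    groups = w.reshape(-1, m)  # one row per group of m consecutive weights
    # Indices of the (m - n) smallest-magnitude entries in each group
    drop = np.argsort(np.abs(groups), axis=1)[:, : m - n]
    mask = np.ones_like(groups)
    np.put_along_axis(mask, drop, 0.0, axis=1)
    return (groups * mask).reshape(w.shape)

w = np.array([[0.9, -0.1, 0.05, -0.7, 0.2, 0.6, -0.3, 0.01]])
print(nm_prune(w))  # each group of 4 retains only its 2 largest-magnitude weights
```

Because every group of M weights has the same nonzero budget, the resulting sparsity pattern maps directly onto hardware that processes weights in fixed-size blocks, which is what makes this format GPU-friendly compared to fully unstructured pruning.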
Bio: Jian Meng received the B.S. degree from Portland State University, Portland, OR, USA, in 2019. He has been pursuing the Ph.D. degree with the School of Electrical, Computer and Energy Engineering, Arizona State University, Tempe, AZ, USA, and will continue his Ph.D. study at Cornell Tech starting in Fall 2023. His current research focuses on deep neural network compression and optimization, self-supervised learning, hardware-software co-design with neuromorphic hardware acceleration, neuromorphic algorithm design for event-based vision and spiking neural networks, and energy-efficient object rendering.