I am Shaojie Xiang, a Ph.D. candidate in Electrical and Computer Engineering at Cornell University, where I work with Prof. Zhiru Zhang in the Computer Systems Lab (CSL). My research interests lie at the intersection of compilers, hardware accelerator design, and FPGA/GPU systems. [CV] I will be joining Amazon Web Services in June 2024.
E-mail: sx233 [at] cornell [dot] edu | OpenPGP
Office: 471 Frank H. T. Rhodes Hall, Ithaca, NY
Ph.D. in Electrical and Computer Engineering (2018 - present)
Cornell University, ECE CSL
Advisor: Prof. Zhiru Zhang
B.Eng. in Electrical Engineering (2018)
Huazhong University of Science and Technology
GPA: 4.98/5.0, ranked 1st of 423
- Hongzheng Chen, Jiahao Zhang, Yixiao Du, Shaojie Xiang, Zichao Yue, Niansong Zhang, Yaohui Cai, Zhiru Zhang. Understanding the Potential of FPGA-Based Spatial Acceleration for Large Language Model Inference. ACM Transactions on Reconfigurable Technology and Systems (TRETS) 2024
- Hongzheng Chen, Niansong Zhang, Shaojie Xiang, Zhichen Zeng, Mengjia Dai, Zhiru Zhang. Allo: A Programming Model for Composable Accelerator Design. ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI) 2024
- Shaojie Xiang, Yi-Hsiang Lai, Yuan Zhou, Hongzheng Chen, Niansong Zhang, Debjit Pal, Zhiru Zhang. HeteroFlow: An Accelerator Programming Model with Decoupled Data Placement for Software-Defined FPGAs. ACM/SIGDA International Symposium on Field-Programmable Gate Arrays (FPGA) 2022
- Debjit Pal, Yi-Hsiang Lai, Shaojie Xiang, Niansong Zhang, Hongzheng Chen, Jeremy Casas, Pasquale Cocchini, Zhenkun Yang, Jin Yang, Louis-Noël Pouchet, Zhiru Zhang. Accelerator Design with Decoupled Hardware Customizations: Benefits and Challenges. Design Automation Conference (DAC) 2022
- Nikita Lazarev, Shaojie Xiang, Neil Adit, Zhiru Zhang, Christina Delimitrou. Dagger: Efficient and Fast RPCs for Cloud Microservices with Near-Memory Reconfigurable NICs. 26th ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS) 2021
- Yi-Hsiang Lai, Ecenur Ustun, Shaojie Xiang, Zhenman Fang, Hongbo Rong, and Zhiru Zhang. Programming and Synthesis for Software-Defined FPGA Acceleration: Status and Future Prospects. ACM Transactions on Reconfigurable Technology and Systems (TRETS), Dec 2021. (Invited Paper)
- Ecenur Ustun, Shaojie Xiang, Jinny Gui, Cunxi Yu, Zhiru Zhang. LAMDA: Learning-Assisted Multi-stage Autotuning for FPGA Design Closure. 27th International Symposium on Field-Programmable Custom Computing Machines (FCCM) 2019
- Jianchi Zhou, Kaustav Ghosh, Shaojie Xiang, Xin Yan, Ahmad Hosseinbeig, Jongsung Lee, David Pommerenke. Characterization of ESD Risk for Wearable Devices. IEEE Transactions on Electromagnetic Compatibility (TEMC) 2018
Amazon AI, AWS (SU 2023)
Applied Scientist Intern. LLM Optimization
Mentors: Yuan Zhou, Fredrik Kjolstad, Yida Wang
Nvidia ML Compiler Group (FA 2020)
Research Intern. ML Compiler for GPUs
Mentors: Bin Fan, Vinod Grover
Intel Parallel Computing Lab (SU 2020)
Research Intern. Systolic Compiler for FPGAs
Mentor: Hongbo Rong
Invited Reviewer
- USENIX Annual Technical Conference (ATC)
- IEEE International Symposium on Field-Programmable Custom Computing Machines (FCCM)
- IEEE Transactions on Computers (TC)
- IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD)
- IEEE Transactions on Circuits and Systems for Video Technology (TCSVT)
- ACM Great Lakes Symposium on Very Large Scale Integrated Circuits (GLSVLSI)
- ACM Transactions on Reconfigurable Technology and Systems (TRETS)
- Cornell ECE 5775 High-Level Digital Design Automation (2022)
- Cornell ECE 5997 Hardware Accelerator Design and Automation (2021)
- HeteroCL: a domain-specific language and optimizing compiler for software-defined heterogeneous computing (CPU, FPGA, GPU, and Processing-in-Memory accelerators).
- T2S (temporal to spatial): a systolic array compiler that generates high-performance linear algebra and machine learning kernels for FPGAs.
- Dagger: an RPC framework powered by a near-memory FPGA-based NIC. Dagger is designed for microservices and achieves ultra-low latency by offloading networking functions to the FPGA.
- UpTune: a distributed auto-tuning framework. UpTune makes it much easier for users to search for optimal parameters in their programs in a distributed environment with multiple worker machines.
- vTB: virtual-TPM enabled secure boot. vTB is a UEFI firmware driver that verifies the integrity of the boot image to prevent malware injection. vTB is designed to emulate a TPM hardware module in software.