I am Shaojie Xiang. I earned my Ph.D. in ECE from Cornell University in 2024. My research interests lie at the intersection of compiler and hardware accelerator design. [CV]. I am currently an Applied Scientist at Amazon Web Services. Please visit [shaojiex.com] for the latest information.
E-mail: sx233 [at] cornell [dot] edu | OpenPGP
Office: 471 Frank H. T. Rhodes Hall, Ithaca, NY
Education

Ph.D. in Electrical and Computer Engineering (2018 - 2024)
Cornell University
Committee: Zhiru Zhang, Adrian Sampson, Christina Delimitrou

B.Eng. in Electrical Engineering (2018)
Huazhong University of Science and Technology
GPA: 4.98/5.0 ranking 1st/423
Publications
- Jinming Zhang*, Shaojie Xiang*, Hongzheng Chen, Niansong Zhang, Zhuoping Yang, Tony Mao, Zhiru Zhang, Peipei Zhou. ARIES: An Agile MLIR-Based Compilation Flow for Reconfigurable Devices with AI Engines. ACM/SIGDA International Symposium on Field-Programmable Gate Arrays (FPGA) 2025. (Best Paper Nominee)
- Hongzheng Chen, Jiahao Zhang, Yixiao Du, Shaojie Xiang, Zichao Yue, Niansong Zhang, Yaohui Cai, Zhiru Zhang. Understanding the Potential of FPGA-Based Spatial Acceleration for Large Language Model Inference. ACM Transactions on Reconfigurable Technology and Systems (TRETS) 2024
- Hongzheng Chen, Niansong Zhang, Shaojie Xiang, Zhichen Zeng, Mengjia Dai, Zhiru Zhang. Allo: A Programming Model for Composable Accelerator Design. ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI) 2024
- Shaojie Xiang, Yi-Hsiang Lai, Yuan Zhou, Hongzheng Chen, Niansong Zhang, Debjit Pal, Zhiru Zhang. HeteroFlow: An Accelerator Programming Model with Decoupled Data Placement for Software-Defined FPGAs. ACM/SIGDA International Symposium on Field-Programmable Gate Arrays (FPGA) 2022
- Debjit Pal, Yi-Hsiang Lai, Shaojie Xiang, Niansong Zhang, Hongzheng Chen, Jeremy Casas, Pasquale Cocchini, Zhenkun Yang, Jin Yang, Louis-Noël Pouchet, Zhiru Zhang. Accelerator Design with Decoupled Hardware Customizations: Benefits and Challenges. Design Automation Conference (DAC), 2022.
- Nikita Lazarev, Shaojie Xiang, Neil Adit, Zhiru Zhang, Christina Delimitrou. Dagger: Efficient and Fast RPCs for Cloud Microservices with Near-Memory Reconfigurable NICs. 26th ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS) 2021
- Yi-Hsiang Lai, Ecenur Ustun, Shaojie Xiang, Zhenman Fang, Hongbo Rong, and Zhiru Zhang. Programming and Synthesis for Software-Defined FPGA Acceleration: Status and Future Prospects. ACM Transactions on Reconfigurable Technology and Systems (TRETS), Dec 2021. (Invited Paper)
- Ecenur Ustun, Shaojie Xiang, Jinny Gui, Cunxi Yu, Zhiru Zhang. LAMDA: Learning-Assisted Multi-stage Autotuning for FPGA Design Closure. 27th International Symposium on Field-Programmable Custom Computing Machines (FCCM) 2019
- Jianchi Zhou, Kaustav Ghosh, Shaojie Xiang, Xin Yan, Ahmad Hosseinbeig, Jongsung Lee, David Pommerenke. Characterization of ESD Risk for Wearable Devices. IEEE Transactions on Electromagnetic Compatibility (TEMC) 2018
Internship

Amazon AI, AWS (2023)
Applied Scientist Intern. LLM Optimization
Mentors: Yuan Zhou, Fredrik Kjolstad, Yida Wang

Nvidia ML Compiler Group (2020)
Research Intern. ML Compiler for GPUs
Mentors: Bin Fan, Vinod Grover

Intel Parallel Computing Lab (2020)
Research Intern. Systolic Compiler for FPGAs
Mentor: Hongbo Rong
Services
Invited Reviewer
- USENIX Annual Technical Conference (ATC)
- IEEE International Symposium on Field-Programmable Custom Computing Machines (FCCM)
- IEEE Transactions on Computers (TC)
- IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD)
- IEEE Transactions on Circuits and Systems for Video Technology (TCSVT)
- ACM Great Lakes Symposium on Very Large Scale Integrated Circuits (GLSVLSI)
- ACM Transactions on Reconfigurable Technology and Systems (TRETS)
- Cornell ECE 5775 High-Level Digital Design Automation (2022)
- Cornell ECE 5997 Hardware Accelerator Design and Automation (2021)
Software
- HeteroCL: a domain-specific language and optimizing compiler for software-defined heterogeneous computing (CPU, FPGA, GPU, and Processing-in-Memory accelerators).
- T2S (temporal to spatial): a systolic array compiler that generates high-performance linear algebra and machine learning kernels for FPGAs.
- Dagger: an RPC framework powered by a near-memory FPGA-based NIC. Dagger is designed for microservices and achieves ultra-low latency by offloading networking functions to the FPGA.
- UpTune: a distributed auto-tuning framework. UpTune makes it easier for users to search for optimal parameters in their programs in a distributed environment with multiple worker machines.
- vTB: virtual-TPM enabled secure boot. vTB is a UEFI firmware driver that verifies the integrity of the boot image to prevent malware injection. vTB is designed to emulate a TPM hardware module in software.