Publication
2024
[USENIX ATC-2024] Zheng Wang, Yuke Wang, Boyuan Feng, Guyue Huang, Dheevatsa Mudigere, Bharath Muthiah, Ang Li, Yufei Ding.
OPER: Optimality-Guided Embedding Table Parallelization for Large-scale Recommendation Model.
USENIX Annual Technical Conference.
[Paper] [Bibtex]
[ASPLOS-2024] Zheng Wang, Yuke Wang, Jiaqi Deng, Da Zheng, Ang Li, Yufei Ding.
RAP: Resource-aware Automated GPU Sharing for Multi-GPU Recommendation Model Training and Input Preprocessing.
ACM International Conference on Architectural Support for Programming Languages and Operating Systems.
[Paper] [Bibtex] [Code]
[ASPLOS-2024] Boyuan Feng, Zheng Wang, Yuke Wang, Shu Yang, Yufei Ding.
ZENO: A Type-based Optimization Framework for Zero-Knowledge Neural Network Inference.
ACM International Conference on Architectural Support for Programming Languages and Operating Systems.
[Paper] [Bibtex] [Code]
2023
[OSDI-2023] Yuke Wang, Boyuan Feng, Zheng Wang, Tong Geng, Ang Li, Kevin Barker, Yufei Ding.
MGG: Accelerating Graph Neural Networks with Fine-grained intra-kernel Communication-Computation Pipelining on Multi-GPU Platforms.
USENIX Symposium on Operating Systems Design and Implementation.
[Paper] [Bibtex] [Code] [Slides] [Video]
[USENIX ATC-2023] Yuke Wang, Boyuan Feng, Zheng Wang, Guyue Huang, Yufei Ding.
TC-GNN: Bridging Sparse GNN Computation and Dense Tensor Cores on GPUs.
USENIX Annual Technical Conference.
[Paper] [Bibtex] [Code] [Slides] [Video]
[ISCA-2023] Hezi Zhang, Anbang Wu, Yuke Wang, Gushu Li, Hassan Shapourian, Alireza Shabani, Yufei Ding.
A Compilation Framework for Photonic One-Way Quantum Computation.
International Symposium on Computer Architecture.
[Paper] [Bibtex]
[MLSys-2023] Guyue Huang, Yang Bai, Liu Liu, Yuke Wang, Bei Yu, Yufei Ding, Yuan Xie.
ALCOP: Automatic Load-Compute Pipelining in Deep Learning Compiler for AI-GPUs.
Conference on Machine Learning and Systems.
[Paper] [Bibtex] [Code] [Slides]
2022
[SC-2022] Zheng Wang, Yuke Wang, Boyuan Feng, Dheevatsa Mudigere, Bharath Muthiah, Yufei Ding.
EL-Rec: Efficient Large-scale Recommendation Model Training via Tensor-Train Embedding Table.
The International Conference for High Performance Computing, Networking, Storage, and Analysis.
[Paper] [Bibtex] [Code]
[USENIX ATC-2022] Boyuan Feng, Tianqi Tang, Yuke Wang, Zhaodong Chen, Zheng Wang, Shu Yang, Yuan Xie, Yufei Ding.
Faith: An Efficient Framework for Transformer Verification on GPUs.
USENIX Annual Technical Conference.
[Paper] [Bibtex] [Code] [Slides]
[PPoPP-2022] *Yuke Wang, *Boyuan Feng, Yufei Ding. *: equal contribution
QGTC: Accelerating Quantized Graph Neural Networks via GPU Tensor Core.
ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming.
[Paper] [Bibtex] [Code] [Slides] [Project] [Video]
2021
[CIKM-2021 (Spotlight)] Yuke Wang, Boyuan Feng, Xueqiao Peng, Yufei Ding.
An Efficient Quantitative Approach for Optimizing Convolutional Neural Networks.
ACM International Conference on Information and Knowledge Management.
[Paper] [Bibtex] [Slides] [Video]
[SC-2021] *Boyuan Feng, *Yuke Wang, Tong Geng, Ang Li, Yufei Ding. *: equal contribution
APNN-TC: Accelerating Arbitrary Precision Neural Networks on Ampere GPU Tensor Cores.
The International Conference for High Performance Computing, Networking, Storage, and Analysis.
[Paper] [Bibtex] [Code] [Slides]
[OSDI-2021] Yuke Wang, Boyuan Feng, Gushu Li, Shuangchen Li, Lei Deng, Yuan Xie, Yufei Ding.
GNNAdvisor: An Adaptive and Efficient Runtime System for GNN Acceleration on GPUs.
USENIX Symposium on Operating Systems Design and Implementation.
[Paper] [Bibtex] [Code] [Slides] [Video] [Project]
[USENIX ATC-2021] Boyuan Feng, Yuke Wang, Gushu Li, Yuan Xie, Yufei Ding.
Palleon: A Runtime System for Efficient Video Processing toward Dynamic Class Skew.
USENIX Annual Technical Conference.
[Paper] [Bibtex]
[CCGrid-2021] Yuke Wang, Boyuan Feng, Gushu Li, Georgios Tzimpragos, Lei Deng, Yuan Xie, Yufei Ding.
TiAcc: Triangle-inequality based Hardware Accelerator for K-means on FPGAs.
IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing.
[Paper] [Bibtex] [Slides]
[ICASSP-2021] Boyuan Feng, Yuke Wang, Yufei Ding.
Sparse Adversarial Attack on EEG-based Brain Computer Interface.
IEEE International Conference on Acoustics, Speech and Signal Processing.
[Paper] [Bibtex]
[AAAI-2021] Boyuan Feng, Yuke Wang, Yufei Ding.
UAG: Uncertainty-aware Attention Graph Neural Network for Defending Adversarial Attacks.
AAAI Conference on Artificial Intelligence.
[Paper] [Bibtex]
[PPoPP-2021] Boyuan Feng, Yuke Wang, Guoyang Chen, Weifeng Zhang, Yuan Xie, Yufei Ding.
EGEMM-TC: Accelerating Scientific Computing on Tensor Cores with Extended Precision.
ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming.
[Paper] [Bibtex]
[IPDPS-2021] Yuke Wang, Boyuan Feng, Yufei Ding.
DSXplore: Optimizing Convolutional Neural Networks via Sliding-Channel Convolutions.
IEEE International Parallel & Distributed Processing Symposium.
[Paper] [Bibtex] [Code] [Slides] [Video]
[TCAD-2021] Yuke Wang, Boyuan Feng, Gushu Li, Lei Deng, Yuan Xie, Yufei Ding.
STPAcc: A Compiler-based Framework for Accelerating Distance Algorithms on CPU-FPGA Platforms.
IEEE Transactions on Computer Aided Design of Integrated Circuits & Systems.
[Paper] [Bibtex]
[TCAD-2021] Xiaobing Chen, Yuke Wang, Xinfeng Xie, Xing Hu, Abanti Basak, Ling Liang, Mingyu Yan, Lei Deng, Yufei Ding, Zidong Du, Yunji Chen, Yuan Xie.
Rubik: A Hierarchical Architecture for Efficient Graph Learning.
IEEE Transactions on Computer Aided Design of Integrated Circuits & Systems.
[Paper] [Bibtex]
2020
[ICTAI-2020] *Boyuan Feng, *Yuke Wang, Xu Li, Shu Yang, Xueqiao Peng, Yufei Ding. *: equal contribution
SGQuant: Squeezing the Last Bit on Graph Neural Networks with Specialized Quantization.
International Conference on Tools with Artificial Intelligence.
[Paper] [Bibtex]
[ICML-2020] Liu Liu, Lei Deng, Zhaodong Chen, Yuke Wang, Shuangchen Li, Jingwei Zhang, Yihua Yang, Zhenyu Gu, Yufei Ding, Yuan Xie.
Boosting Deep Neural Network Efficiency with Dual-Module Inference.
International Conference on Machine Learning.
[Paper] [Bibtex]