Publications
Journal Paper
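J1. Caffeine: Toward uniformed representation and acceleration for deep convolutional neural networks
- Chen Zhang, Guangyu Sun, Zhenman Fang, Peipei Zhou, Peichen Pan, Jason Cong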
- IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (T-CAD 2018) 【PDF】
- Citation: 600
- Award: 2017~2019 Donald O. Pederson Best Paper Award
- Keywords: Convolutional Neural Network, FPGA, Design Automation, Caffe, SDAccel
Conference Paper
C20. OliVe: Accelerating Large Language Models via Hardware-friendly Outlier-Victim Pair Quantization
- Cong Guo, Jiaming Tang, Weiming Hu, Jingwen Leng, Chen Zhang, Fan Yang, Yunxin Liu, Minyi Guo, Yuhao Zhu
- Proceedings of the 50th Annual International Symposium on Computer Architecture (ISCA 2023)【PDF】
C19. Nesting Forward Automatic Differentiation for Memory-Efficient Deep Neural Network Training
- Cong Guo, Yuxian Qiu, Jingwen Leng, Chen Zhang, Ying Cao, Quanlu Zhang, Yunxin Liu, Fan Yang, Minyi Guo
- 2022 IEEE 40th International Conference on Computer Design (ICCD 2022)【PDF】
C18. ANT: Exploiting adaptive numerical data type for low-bit deep neural network quantization
- Cong Guo, Chen Zhang, Jingwen Leng, Zihan Liu, Fan Yang, Yunxin Liu, Minyi Guo, Yuhao Zhu
- Proceedings of the 55th IEEE/ACM International Symposium on Microarchitecture (MICRO 2022)【PDF】
- Award: MICRO 2023 Top Picks Honorable Mention
- Keywords: AI acceleration, Tensor Core, Quantization
C17. SQuant: On-the-fly data-free quantization via diagonal Hessian approximation
- Cong Guo, Yuxian Qiu, Jingwen Leng, Xiaotian Gao, Chen Zhang, Yunxin Liu, Fan Yang, Yuhao Zhu, Minyi Guo
- International Conference on Learning Representations (ICLR 2022) 【PDF】
C16. Dual-side sparse tensor core
- Yang Wang, Chen Zhang, Zhiqiang Xie, Cong Guo, Yunxin Liu, Jingwen Leng
- 2021 ACM/IEEE 48th Annual International Symposium on Computer Architecture (ISCA 2021)【PDF】
- Keywords: GPGPU, Sparse Tensor Core, AI acceleration
C15. Boosting mobile CNN inference through semantic memory
- Yun Li, Chen Zhang, Shihao Han, Li Lyna Zhang, Baoqun Yin, Yunxin Liu, Mengwei Xu
- Proceedings of the 29th ACM International Conference on Multimedia (Multimedia 2021)【PDF】【Web】
C14. Scylla: QoE-aware continuous mobile vision with FPGA-based dynamic deep neural network reconfiguration
- Shuang Jiang, Zhiyao Ma, Xiao Zeng, Chenren Xu, Mi Zhang, Chen Zhang, Yunxin Liu
- Proceedings of the 2020 IEEE Conference on Computer Communications (INFOCOM 2020)【PDF】
C13. LadaBERT: Lightweight adaptation of BERT through hybrid model compression
- Yihuan Mao, Yujing Wang, Chufan Wu, Chen Zhang, Yang Wang, Yaming Yang, Quanlu Zhang, Yunhai Tong, Jing Bai
- Proceedings of the 28th International Conference on Computational Linguistics (COLING 2020)【PDF】
C12. Live video analytics with FPGA-based smart cameras
- Shang Wang, Chen Zhang, Yuanchao Shu, Yunxin Liu
- Proceedings of the 2019 Workshop on Hot Topics in Video Analytics and Intelligent Edges (HotEdges 2019)【PDF】
C11. Efficient and effective sparse LSTM on FPGA with bank-balanced sparsity
- Shijie Cao, Chen Zhang, Zhuliang Yao, Wencong Xiao, Lanshun Nie, Dechen Zhan, Yunxin Liu, Ming Wu, Lintao Zhang
- Proceedings of the 2019 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays (FPGA 2019) 【PDF】
- Industrial Impact: Used by NVIDIA Sparse Tensor Core (Ampere and Hopper Architecture)
- Keywords: Sparse Neural Network, Acceleration, FPGA
C10. Balanced sparsity for efficient DNN inference on GPU
- Zhuliang Yao, Shijie Cao, Wencong Xiao, Chen Zhang, Lanshun Nie
- Proceedings of the AAAI Conference on Artificial Intelligence (AAAI 2019) 【PDF】
- Industrial Impact: Used by NVIDIA Sparse Tensor Core (Ampere and Hopper Architecture)
- Keywords: Sparse Neural Network, Acceleration, GPGPU
C9. SeerNet: Predicting convolutional neural network feature-map sparsity through low-bit quantization
- Shijie Cao, Lingxiao Ma, Wencong Xiao, Chen Zhang, Yunxin Liu, Lintao Zhang, Lanshun Nie, Zhi Yang
- Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2019) 【PDF】
C8. Best-effort FPGA programming: A few steps can go a long way
- Jason Cong, Zhenman Fang, Yuchen Hao, Peng Wei, Cody Hao Yu, Chen Zhang, Peipei Zhou
- arXiv preprint arXiv:1807.01340 (2018)
C7. Using data compression for optimizing FPGA-based convolutional neural network accelerators
- Yijin Guan, Ningyi Xu, Chen Zhang, Zhihang Yuan, Jason Cong
- International Workshop on Advanced Parallel Processing Technologies (2017)
C6. Energy-efficient CNN implementation on a deeply pipelined FPGA cluster
- Chen Zhang, Di Wu, Jiayu Sun, Guangyu Sun, Guojie Luo, Jason Cong
- Proceedings of the 2016 International Symposium on Low Power Electronics and Design (ISLPED 2016) 【PDF】
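C5. Caffeine: Towards uniformed representation and acceleration for deep convolutional neural networks
- Chen Zhang, Zhenman Fang, Peipei Zhou, Peichen Pan, Jason Cong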
- Proceedings of the 35th International Conference on Computer-Aided Design (ICCAD 2016) 【PDF】
C4. Optimizing FPGA-based accelerator design for deep convolutional neural networks
- Chen Zhang, Peng Li, Guangyu Sun, Yijin Guan, Bingjun Xiao, Jason Cong
- Proceedings of the 2015 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays (FPGA 2015)【PDF】
- Citation: 2218 (as of 2023, the most-cited paper in the 29-year history of the FPGA conference)
- Award: FPGA-2015 Best Paper Nomination
- Keywords: Convolutional Neural Network, FPGA, Acceleration, Roofline Model
C3. An efficient design and implementation of LSM-tree based key-value store on open-channel SSD
- Peng Wang, Guangyu Sun, Song Jiang, Jian Ouyang, Shiding Lin, Chen Zhang, Jason Cong
- Proceedings of the Ninth European Conference on Computer Systems (EuroSys 2014)【PDF】
C2. Memory partitioning for multidimensional arrays in high-level synthesis
- Yuxin Wang, Peng Li, Peng Zhang, Chen Zhang, Jason Cong
- Proceedings of the 50th Annual Design Automation Conference (DAC 2013)【PDF】
C1. Automatic multidimensional memory partitioning for FPGA-based accelerators
- Yuxin Wang, Peng Li, Peng Zhang, Chen Zhang, Jason Cong
- Proceedings of the ACM/SIGDA International Symposium on Field Programmable Gate Arrays (FPGA 2013)