About me

Assistant professor, PhD advisor
The School of Computer Science,
Peking University,
Beijing, China

I research and design storage systems and specialized processors. My work addresses the need for high-performance storage systems in the era of big data and artificial intelligence from the perspective of computer architecture. I am dedicated to breaking through the data-migration bottleneck and the memory wall of the von Neumann architecture.


I am actively seeking talented and self-motivated students. There are two openings per year for future PhD candidates and multiple positions for interns. You are always welcome to contact me by email.


  • May 2024: Invited to serve on the PC of HPCA.
  • May 2024: Two papers are accepted to USENIX ATC’24. Congratulations to Shushu Yi, Li Peng, Xiurui and Yuda!
  • April 2024: One paper is accepted to TACO.
  • April 2024: Flagger is accepted to ISCA’24. Congratulations to Xiurui and Yuda!
  • March 2024: Invited to serve on the ERC of MICRO.
  • November 2023: One paper is accepted to ASPLOS’24.
  • October 2023: Invited to serve on the ERC of ISCA.
  • October 2023: Four papers are accepted to HPCA’24. Congratulations to Yuda and Yuyue!
  • August 2023: Awarded Intel Young Faculty Researcher Program.
  • July 2023: Invited to serve on the TPC of HPCA.
  • April 2023: Awarded 1st prize in national storage technology competition. Congrats to Shushu Yi!
  • January 2023: One paper is accepted to NVMW.
  • January 2023: One paper is accepted to CAL.
  • December 2022: One paper is accepted to SAC.
  • October 2022: Invited to serve on the TPCs of USENIX ATC and SAC.
  • September 2022: Awarded ACM SIGCSE Rising Star!
  • July 2022: One paper is accepted to THPC.
  • May 2022: Our paper “ScalaRAID” is accepted to HotStorage’22. Congrats to Shushu Yi!
  • April 2022: Two papers are accepted to NVMW’22.
  • December 2021: Awarded NSFC Excellent Young Scientists Fund Overseas Program (国家自然科学基金优秀青年科学基金海外项目)!
  • September 2021: Our work “HAMS” is selected as KAIST breakthroughs 50 years.
  • August 2021: Our work “OhmGPU” is reported by Naver headline + 26 and Press.
  • July 2021: One paper is accepted to MICRO’21.
  • April 2021: Three papers are accepted to NVMW’21.
  • March 2021: Our work “HAMS” is reported by Naver headline + 39, KBS and Press.
  • February 2021: One paper is accepted to ISCA’21.
  • June 2020: One paper is accepted to ISCA’20.
  • February 2020: One paper is accepted to HPCA’20.
  • February 2020: One paper is accepted to FAST’20.
  • February 2020: Joined KAIST as a postdoctoral researcher.
  • December 2019: Successfully defended PhD thesis.

Selected Publications:

  • (USENIX ATC) ScalaCache: Scalable User-Space Page Cache Management with Software-Hardware Coordination, 2024
  • (USENIX ATC) ScalaAFA: Constructing User-Space All-Flash Array Engine with Holistic Designs, 2024
  • (ISCA) Flagger: Cooperative Acceleration for Large-Scale Cross-Silo Federated Learning Aggregation, 2024
  • (ASPLOS) Achieving Near-Zero Read Retry for 3D NAND Flash Memory, 2024
  • (HPCA) BeaconGNN: Large-Scale GNN Acceleration with Asynchronous In-Storage Computing, 2024
  • (HPCA) StreamPIM: Streaming Matrix Computation in Racetrack Memory, 2024
  • (HPCA) LearnedFTL: A Learning-based Page-level FTL for Reducing Double Reads in Flash-based SSDs, 2024
  • (HPCA) Midas Touch: Invalid-Data Assisted Reliability and Performance Boost for 3D High-Density Flash, 2024
  • (MICRO) Ohm-GPU: Integrating New Optical Network and Heterogeneous Memory into GPU Multi-Processors, 2021
  • (ISCA) Revamping Storage Class Memory With Hardware Automated Memory-Over-Storage Solution, 2021
  • (ISCA) ZnG: Architecting GPU Multi-Processors with New Flash for Scalable Data Analysis, 2020
  • (USENIX FAST) Scalable Parallel Flash Firmware for Many-core Architectures, 2020
  • (HPCA) DRAM-less: Hardware Acceleration of Data Processing with New Memory, 2020
  • (DAC) FlashGPU: Placing New Flash Next to GPU Cores, 2019
  • (HPCA) FUSE: Fusing STT-MRAM into GPUs to Alleviate Off-Chip Memory Access Overheads, 2019
  • (OSDI) FlashShare: Punching Through Server Storage Stack from Kernel to Firmware for Ultra-Low Latency SSDs, 2018
  • (MICRO) Amber: Enabling Precise Full-System Simulation with Detailed Modeling of All SSD Resources, 2018
  • (EuroSys) FlashAbacus: A Self-governing Flash-based Accelerator for Low-power Systems, 2018
  • (HPCA) DUANG: Fast and Lightweight Page Migration in Asymmetric Memory Systems, 2016
  • (PACT) NVMMU: A Non-Volatile Memory Management Unit for Heterogeneous GPU-SSD Architectures, 2015
  • (HotStorage) Power, Energy and Thermal Considerations in SSD-Based I/O Acceleration, 2014