I'm a PhD student in Computer Science at the Institute of Computing Technology, Chinese Academy of Sciences, working on deep learning systems and compiler optimization. My research primarily focuses on developing efficient tensor computation methods and acceleration techniques for deep learning frameworks. I work closely with Professor Boyu Diao and Professor Yongjun Xu to explore novel approaches in deep learning compilation and optimization.
My recent work includes Gensor, a graph-based tensor compilation method that improves the efficiency of deep learning computations. I'm also interested in workload scheduling optimization for GPU computing and operator fusion techniques for inference acceleration, and I've published papers on resource-aware scheduling for unbalanced matrix multiplications on GPUs and on evolutionary search methods for operator fusion. Across this work, my research aims to bridge the gap between theoretical deep learning models and their practical implementation on hardware.