I am a first-year Ph.D. student at the VRVC Lab, ShanghaiTech University, under the supervision of Professor Jingyi Yu. My research focuses on representation learning for biomolecules, including small molecules and proteins, as well as autoregressive generative models.
Cryo-electron microscopy (cryo-EM) has revolutionized structural biology by resolving 3D structures of biomolecules at near-atomic resolution. However, revealing continuous conformational heterogeneity from hundreds of thousands of noisy particle images remains challenging. Recent heterogeneous reconstruction methods, which often operate in the Fourier domain, lack interpretability and struggle to reach high resolution in locally flexible regions. To address this issue, we propose CryoFormer, a novel approach for high-resolution and continuous heterogeneous cryo-EM reconstruction. CryoFormer leverages a feature volume in the real domain to capture fine-grained local changes. We then design a novel query-based transformer architecture that incorporates deformation-aware features and region-wise spatial features using a cross-attention mechanism. Our transformer-based pipeline further supports pose refinement and can automatically highlight flexible regions by visualizing 3D attention maps. Extensive experiments show that our method achieves the best performance on five datasets (two synthetic and three experimental). We also contribute a new synthetic dataset of the PEDV spike protein for more comprehensive evaluation. Both the code and the PEDV dataset will be released for better reproducibility.
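As a rough illustration of the cross-attention mechanism mentioned above, the following NumPy sketch lets a small set of query vectors attend over flattened region-wise feature tokens and then reshapes the attention weights back into 3D maps. This is a generic single-head cross-attention under stated assumptions, not the CryoFormer architecture: the number of queries, the feature dimension, the 4x4x4 region grid, the random projection matrices, and the choice of which features act as queries are all hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the max for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(queries, features, w_q, w_k, w_v):
    """Single-head cross-attention: queries attend over spatial feature tokens."""
    q = queries @ w_q                         # (n_queries, d)
    k = features @ w_k                        # (n_tokens, d)
    v = features @ w_v                        # (n_tokens, d)
    scores = q @ k.T / np.sqrt(q.shape[-1])   # scaled dot-product similarity
    attn = softmax(scores, axis=-1)           # each row sums to 1 over tokens
    return attn @ v, attn

rng = np.random.default_rng(0)
d = 16                                        # hypothetical feature dimension
queries = rng.standard_normal((4, d))         # hypothetical: 4 query vectors
features = rng.standard_normal((4 * 4 * 4, d))  # hypothetical: 4x4x4 region grid, flattened
w_q, w_k, w_v = (rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(3))

out, attn = cross_attention(queries, features, w_q, w_k, w_v)

# Reshape per-query attention weights back onto the region grid so they can
# be inspected as 3D attention maps.
attention_volume = attn.reshape(4, 4, 4, 4)   # (n_queries, x, y, z)
```

In a sketch like this, regions that receive consistently high attention weights would be the natural candidates to visualize as 3D attention maps.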
In the past decade, deep conditional generative models have revolutionized the generation of realistic images, extending their application from entertainment to scientific domains. Single-particle cryo-electron microscopy (cryo-EM) is crucial in resolving near-atomic resolution 3D structures of proteins, such as the SARS-CoV-2 spike protein. To achieve high-resolution reconstruction, a comprehensive data processing pipeline has been adopted. However, its performance is still limited as it lacks high-quality annotated datasets for training. To address this, we introduce physics-informed generative cryo-electron microscopy (CryoGEM), which for the first time integrates physics-based cryo-EM simulation with generative unpaired noise translation to generate physically correct synthetic cryo-EM datasets with realistic noise. CryoGEM first simulates the cryo-EM imaging process based on a virtual specimen. To generate realistic noise, we leverage unpaired noise translation via contrastive learning with a novel mask-guided sampling scheme. Extensive experiments show that CryoGEM is capable of generating authentic cryo-EM images. The generated dataset can be used as training data for particle picking and pose estimation models, eventually improving the reconstruction resolution.
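To give a feel for what a physics-based simulation step might look like, here is a minimal NumPy sketch of a textbook cryo-EM image formation model: project a 3D density along one axis, modulate the projection by a contrast transfer function (CTF) in Fourier space, and add noise. It is not CryoGEM's simulator. The function names, the toy Gaussian-blob volume, the default microscope parameters, and the simplified CTF (one common sign convention, no astigmatism or envelope) are assumptions made for illustration, and the additive Gaussian noise is only a placeholder for the learned noise translation described above.

```python
import numpy as np

def ctf_2d(size, pixel_size, defocus, voltage_kv=300.0, cs_mm=2.7, amp_contrast=0.1):
    """Simplified radially symmetric CTF on an FFT-ordered grid.

    pixel_size and defocus are in angstroms; all defaults are illustrative.
    """
    volts = voltage_kv * 1e3
    # Relativistic electron wavelength in angstroms.
    wavelength = 12.2643 / np.sqrt(volts + 0.97845e-6 * volts**2)
    freqs = np.fft.fftfreq(size, d=pixel_size)      # cycles per angstrom
    kx, ky = np.meshgrid(freqs, freqs, indexing="ij")
    k2 = kx**2 + ky**2
    cs = cs_mm * 1e7                                # mm -> angstrom
    chi = np.pi * wavelength * defocus * k2 - 0.5 * np.pi * cs * wavelength**3 * k2**2
    return -np.sqrt(1 - amp_contrast**2) * np.sin(chi) - amp_contrast * np.cos(chi)

def simulate_particle_image(volume, pixel_size=1.0, defocus=15000.0, snr=0.1, seed=0):
    """Project a cubic volume along z, apply the CTF, and add Gaussian noise."""
    rng = np.random.default_rng(seed)
    projection = volume.sum(axis=2)                 # line integral along z
    ctf = ctf_2d(projection.shape[0], pixel_size, defocus)
    clean = np.fft.ifft2(np.fft.fft2(projection) * ctf).real
    # Placeholder noise model: white Gaussian noise scaled to the target SNR.
    noise_std = np.sqrt(clean.var() / snr)
    noisy = clean + rng.normal(0.0, noise_std, size=clean.shape)
    return clean, noisy

# Toy "virtual specimen": a Gaussian blob centered in a 32^3 box.
n = 32
grid = np.arange(n) - n / 2
x, y, z = np.meshgrid(grid, grid, grid, indexing="ij")
volume = np.exp(-(x**2 + y**2 + z**2) / (2 * 4.0**2))

clean, noisy = simulate_particle_image(volume)
```

Because the clean projection is known exactly in such a simulation, particle locations and poses come for free as annotations; the hard part, which this sketch does not attempt, is making the noise look like real micrographs.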