About me

Minsi Lu

Research Interests

I have conducted research at Tsinghua AIR, the University of Washington, and the University of Waterloo. Building on that work, my core interests include:

  • Multimodal Learning & Representation: Contrastive learning frameworks, self-supervised representation learning, and cross-modal alignment.
  • Advanced Information Retrieval: Vector database optimization and approximate nearest neighbor (ANN) search algorithms.
  • Generative AI & LLMs: Diffusion models, parameter-efficient fine-tuning (PEFT, e.g., LoRA), retrieval-augmented generation (RAG) architectures, and inference optimization.
  • AI for Science (AI4Science): Deep learning for computational biology, including structure-based drug design (SBDD).

Research Experience

Contrastive Protein-Molecule Representation Learning for Virtual Screening

Intern, Institute for AI Industry Research, Tsinghua University
Advisor: Prof. YanYan Lan, Feb. 2023 - Jun. 2023

Index Optimization of DeepSearch for Mass Spectrometry Peptide Database Retrieval

Intern, Cheriton School of Computer Science, University of Waterloo
Advisor: Prof. Ming Li, Feb. 2025 - Sep. 2025

Resolution Enhancement and Time Dimension Modeling for Hi-C Data

Visiting Student, School of Computer Science & Engineering, University of Washington
Advisors: Prof. William Stafford Noble and Prof. Sheng Wang, Jun. 2023 - Oct. 2023

Skills

  • Programming Languages: Python, C/C++, Java, SQL, R, Bash.
  • Machine Learning & AI: PyTorch, TensorFlow, HuggingFace Transformers, Scikit-Learn.
  • Systems & Tools: Linux/Unix, Git, Docker, Kubernetes, AWS, CI/CD pipelines.
  • Databases & IR: Vector Databases (Milvus, FAISS), Graph Databases, PostgreSQL.
  • Core Knowledge: Data Structures & Algorithms, Distributed Systems, Linear Algebra, Probability & Statistics, Computer Architecture, Computer Networks, Database Systems.
  • Domain Expertise: Computational Biology, Bioinformatics, Cheminformatics.