Projects

Vector Search Systems

Biology-Informed Filtering Adaptive ANN Search for Tandem Mass Spectrometry

Advisors: Prof. Ming Li, Prof. Xiao Hu

Designed a hybrid range-filtered approximate nearest neighbor search (RF-ANNS) system for tandem MS open search, integrating mass-aware columnar storage with HNSW-based graph search to address large-tolerance (±500 Da) peptide identification. Achieved a 13× speedup over MSFragger at 500 Da tolerance while maintaining competitive identification rates at 1% FDR. [Read more]
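As a rough illustration of range-filtered search (not the project's actual code), the sketch below keeps entries sorted by precursor mass, narrows the candidates to the mass window with a binary search, and ranks only those candidates by cosine similarity. The exact scan inside the window stands in for the HNSW graph search, and the class name `MassFilteredIndex`, the labels, and the toy vectors are all hypothetical.

```python
import bisect
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


class MassFilteredIndex:
    """Toy index: entries sorted by mass, exact search inside the mass window."""

    def __init__(self, entries):
        # entries: iterable of (mass_da, vector, label)
        self.entries = sorted(entries, key=lambda e: e[0])
        self.masses = [e[0] for e in self.entries]

    def search(self, query_vec, query_mass, tol_da=500.0, k=3):
        # Binary search for the slice whose mass lies in [mass - tol, mass + tol].
        lo = bisect.bisect_left(self.masses, query_mass - tol_da)
        hi = bisect.bisect_right(self.masses, query_mass + tol_da)
        scored = [(cosine(query_vec, vec), label)
                  for _, vec, label in self.entries[lo:hi]]
        scored.sort(key=lambda s: s[0], reverse=True)
        return scored[:k]


index = MassFilteredIndex([
    (1000.0, [1.0, 0.0, 0.0], "pepA"),
    (1400.0, [0.9, 0.1, 0.0], "pepB"),
    (2100.0, [1.0, 0.0, 0.0], "pepC"),  # same vector as the query, outside the window
])
hits = index.search([1.0, 0.0, 0.0], query_mass=1200.0, tol_da=500.0, k=2)
hit_labels = [label for _, label in hits]
```

Filtering before ranking means `pepC` is never scored even though its vector matches the query exactly; a production system would replace the inner scan with an approximate graph traversal.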

Representation Learning

DrugCLIP: Contrastive Protein-Molecule Representation Learning for Virtual Screening

Advisors: Prof. Wei-Ying Ma, Prof. YanYan Lan, Institute for AI Industry Research, Tsinghua University

Worked with Bowen Gao, Bo Qiang, Haichuan Tan, Yinjun Jia, Minsi Ren

Reformulated virtual screening as a massive-scale dense retrieval task. Designed a multimodal contrastive learning framework to align representations of complex biochemical structures without relying on explicit scoring labels, enabling zero-shot inference on billion-scale databases. [Read more]
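The contrastive alignment idea can be sketched with a symmetric InfoNCE objective, the loss family popularized by CLIP. This is a minimal pure-Python illustration under assumed inputs, not the DrugCLIP training code: the function names, the temperature of 0.07, and the two-dimensional toy embeddings are all hypothetical.

```python
import math


def normalize(v):
    """Scale a vector to unit length."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]


def info_nce(anchors, positives, temperature=0.07):
    """Mean cross-entropy where anchors[i] should retrieve positives[i]."""
    a = [normalize(v) for v in anchors]
    p = [normalize(v) for v in positives]
    loss = 0.0
    for i, ai in enumerate(a):
        logits = [sum(x * y for x, y in zip(ai, pj)) / temperature for pj in p]
        m = max(logits)  # subtract the max for a stable log-sum-exp
        log_z = m + math.log(sum(math.exp(l - m) for l in logits))
        loss += log_z - logits[i]
    return loss / len(a)


def symmetric_contrastive_loss(pockets, molecules, temperature=0.07):
    """Average of the pocket-to-molecule and molecule-to-pocket losses."""
    return 0.5 * (info_nce(pockets, molecules, temperature)
                  + info_nce(molecules, pockets, temperature))


pockets = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[0.9, 0.1], [0.1, 0.9]]   # molecule i is close to pocket i
swapped = [aligned[1], aligned[0]]   # pairs deliberately mismatched
loss_aligned = symmetric_contrastive_loss(pockets, aligned)
loss_swapped = symmetric_contrastive_loss(pockets, swapped)
```

Correctly paired embeddings give a lower loss than mismatched ones, which is the signal that pulls matching pairs together without any explicit scoring label. At inference time, the same dot products can rank database molecules against a query pocket.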

Generative Models

Capricorn: A Multi-View Diffusion Model for Contact Matrix Resolution Enhancement

Advisors: Prof. William Stafford Noble and Prof. Sheng Wang, School of Computer Science & Engineering, University of Washington

Worked with Tangqi Fang, Yifeng Liu, Addie Woicik

Developed a multi-view conditional diffusion framework that treats high-dimensional sequence tracking as a generative resolution enhancement problem. It feeds diverse structural priors into a diffusion probabilistic backbone as independent condition views, reconstructing fine-grained topologies from low-coverage constraints. [Read more]
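To make the conditioning idea concrete, here is a small sketch of the standard forward noising step of a denoising diffusion model, followed by stacking the noisy target with condition views as separate input channels. It assumes a linear beta schedule and channel-wise conditioning; neither is confirmed as Capricorn's design, and the matrices `low_res` and `prior_view` are hypothetical toy data.

```python
import math
import random


def alpha_bar_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Cumulative product of (1 - beta_t) for a linear beta schedule."""
    out, prod = [], 1.0
    for t in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * t / max(num_steps - 1, 1)
        prod *= 1.0 - beta
        out.append(prod)
    return out


def noise_matrix(x0, alpha_bar_t, rng):
    """Sample x_t from q(x_t | x_0) = N(sqrt(abar) * x0, (1 - abar) * I)."""
    signal = math.sqrt(alpha_bar_t)
    sigma = math.sqrt(1.0 - alpha_bar_t)
    return [[signal * v + sigma * rng.gauss(0.0, 1.0) for v in row] for row in x0]


def denoiser_input(x_t, condition_views):
    """Stack the noisy target with each condition view as its own channel."""
    return [x_t] + list(condition_views)


rng = random.Random(0)
high_res = [[1.0, 0.5], [0.5, 1.0]]    # target contact matrix (toy)
low_res = [[0.8, 0.3], [0.3, 0.8]]     # low-coverage view (assumed)
prior_view = [[1.0, 0.0], [0.0, 1.0]]  # extra structural prior (assumed)

abar = alpha_bar_schedule(1000)
x_t = noise_matrix(high_res, abar[500], rng)
channels = denoiser_input(x_t, [low_res, prior_view])
```

A denoising network would take `channels` and the timestep as input and be trained to predict the added noise; that network is omitted here.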

LLM Agents & Data Systems

WildChat-MCP: Hierarchical Analysis Framework via Model Context Protocol

Advisor: Prof. Jimmy Lin, University of Waterloo

Implemented a Model Context Protocol (MCP) server enabling LLMs (Claude Desktop) to autonomously explore and analyze a 1.4M+ conversation dataset via structured tools and resources. [Read more]

Graph Neural Networks

Multiomics Integration Via Graph Representation Learning

Advisor: Prof. Jianyang Zeng, Tsinghua University

Worked with Wenda Chu, Botian Wang, Sihang Zeng, Shiyu Zhao

Designed a heterogeneous graph representation learning approach to map disparate data modalities onto a shared latent space. Optimized message passing in GCN layers by integrating spatial distance decay weights and deploying independent aggregators, significantly improving cross-modal alignment stability. [Read more]
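One way to picture distance-decayed message passing with independent aggregators is the sketch below. It weights each neighbor by an exponential decay of its distance and keeps a separate weighted mean per edge type. The exponential form, the `length_scale` parameter, and the edge-type names are assumptions for illustration, and the learned weight matrices of a real GCN layer are left out.

```python
import math


def propagate(features, edges, length_scale=1.0):
    """One round of message passing with per-edge-type aggregation.

    features: {node: vector}
    edges: list of (src, dst, edge_type, distance)
    Returns {dst: {edge_type: decay-weighted mean of neighbor vectors}}.
    """
    acc = {}  # (dst, edge_type) -> [weighted_sum_vector, total_weight]
    for src, dst, etype, dist in edges:
        w = math.exp(-dist / length_scale)  # closer neighbors count more
        slot = acc.setdefault((dst, etype), [[0.0] * len(features[src]), 0.0])
        for i, x in enumerate(features[src]):
            slot[0][i] += w * x
        slot[1] += w
    out = {}
    for (dst, etype), (vec, total) in acc.items():
        out.setdefault(dst, {})[etype] = [v / total for v in vec]
    return out


features = {
    "a": [1.0, 0.0],
    "b": [0.0, 1.0],
    "c": [2.0, 2.0],
    "t": [0.0, 0.0],
}
edges = [
    ("a", "t", "rna", 0.0),    # very close neighbor
    ("b", "t", "rna", 10.0),   # distant neighbor, almost no influence
    ("c", "t", "atac", 1.0),   # different modality, aggregated separately
]
messages = propagate(features, edges)
```

Keeping one aggregate per edge type lets a downstream layer combine modalities with its own weights instead of mixing them in a single average.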

Unsupervised Learning

Graph Regularization for Target Discovery

Advisor: Prof. Jianyang Zeng, Tsinghua University

Worked with Wenda Chu, Botian Wang, Sihang Zeng, Shiyu Zhao

Enhanced the Embedded Topic Model (ETM) by integrating structural interaction networks as a graph regularization prior. Developed an auxiliary NLP-based multi-layer perceptron to stabilize unsupervised clustering objectives on sparse, high-dimensional datasets. [Read more]
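A common form of graph regularization is a Laplacian smoothness penalty that pushes the embeddings of connected nodes toward each other. The sketch below assumes that form; the project's exact regularizer is not specified here, and the function names, the weight `lam`, and the toy embeddings are hypothetical.

```python
def graph_regularizer(embeddings, edges):
    """Sum of w_ij * ||e_i - e_j||^2 over the interaction edges."""
    total = 0.0
    for i, j, w in edges:
        total += w * sum((a - b) ** 2 for a, b in zip(embeddings[i], embeddings[j]))
    return total


def regularized_loss(topic_model_loss, embeddings, edges, lam=0.1):
    """Base objective plus the weighted graph penalty."""
    return topic_model_loss + lam * graph_regularizer(embeddings, edges)


embeddings = {
    "g1": [0.0, 0.0],
    "g2": [0.0, 1.0],
    "g3": [3.0, 4.0],
}
near_penalty = graph_regularizer(embeddings, [("g1", "g2", 1.0)])
far_penalty = graph_regularizer(embeddings, [("g1", "g3", 1.0)])
total_loss = regularized_loss(2.0, embeddings, [("g1", "g2", 1.0)], lam=0.5)
```

Interacting nodes whose embeddings sit far apart contribute a larger penalty, so minimizing the combined loss encourages the embedding space to respect the network structure.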

Web Applications & Full Stack

Venue Hub: Interactive Live Event Seating Preview Platform

A full-stack web application designed to eliminate guesswork in live event ticket purchasing. Unlike major platforms that only provide artist or basic ticketing info, Venue Hub offers structured, seat-level experience data.

  • Provides specialized, detailed venue blueprints and structured information.
  • Features an interactive platform for users to exchange and view realistic seat-level reviews.
  • Enables actual sightline and sound quality previews for specific seats prior to purchase, significantly enhancing transparency for buyers.

[Website]

AI

Gomoku Genius: AI Search Strategies for Classic Board Game Mastery

Advisor: Prof. Mingsheng Long, Tsinghua University

A practical application of AI search strategies to board games. We optimized Minimax for Tic-Tac-Toe, applying alpha-beta pruning to reduce the search space. For Gomoku, we implemented depth-truncated search with a pattern-based evaluation function to balance time constraints against strategic depth. We also compared naive MCTS with alpha-beta search in gameplay and enhanced MCTS with an evaluation function, which significantly improved board-state assessment and decision-making. [Read more]
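The Tic-Tac-Toe part of this approach can be sketched as a plain minimax search with alpha-beta pruning. This is a minimal illustration rather than the project's code; the board encoding (a list of nine cells holding "X", "O", or " ") and the helper names are assumptions, and the Gomoku truncation and pattern evaluation are not shown.

```python
WIN_LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
             (0, 3, 6), (1, 4, 7), (2, 5, 8),
             (0, 4, 8), (2, 4, 6)]


def winner(board):
    """Return "X" or "O" if a line is complete, else None."""
    for a, b, c in WIN_LINES:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return None


def alphabeta(board, player, alpha=-2, beta=2):
    """Game value from X's point of view: +1 X wins, 0 draw, -1 O wins."""
    w = winner(board)
    if w is not None:
        return 1 if w == "X" else -1
    moves = [i for i, cell in enumerate(board) if cell == " "]
    if not moves:
        return 0
    maximizing = player == "X"
    value = -2 if maximizing else 2
    for m in moves:
        board[m] = player
        child = alphabeta(board, "O" if maximizing else "X", alpha, beta)
        board[m] = " "
        if maximizing:
            value = max(value, child)
            alpha = max(alpha, value)
        else:
            value = min(value, child)
            beta = min(beta, value)
        if alpha >= beta:  # the opponent would never allow this branch
            break
    return value


def best_move(board, player):
    """Pick the first move with the best alpha-beta value for `player`."""
    sign = 1 if player == "X" else -1
    other = "O" if player == "X" else "X"
    best, best_score = None, -2
    for m in [i for i, cell in enumerate(board) if cell == " "]:
        board[m] = player
        score = sign * alphabeta(board, other)
        board[m] = " "
        if score > best_score:
            best, best_score = m, score
    return best


empty_value = alphabeta([" "] * 9, "X")  # perfect play from the start is a draw
winning_move = best_move(["X", "X", " ", "O", "O", " ", " ", " ", " "], "X")
```

Pruning never changes the value returned at the root; it only skips branches that cannot affect the result, which is what makes deeper searches affordable in larger games.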

Text Sentiment Classification Based on the Twitter Dataset

Advisor: Prof. Mingsheng Long, Tsinghua University

In this project, we compared a variety of models on a Twitter sentiment analysis dataset. We used TF-IDF for feature extraction and implemented a range of machine learning models for sentiment classification, including Decision Trees, Random Forests, Multilayer Perceptrons (MLPs), ResNet, and BERT. [Read more]
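For readers unfamiliar with TF-IDF, the sketch below computes it from scratch on three made-up sentences. It uses whitespace tokenization and one smoothed IDF variant, both chosen for illustration; the project's actual preprocessing and library are not specified here.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {term: weight} dict per document."""
    tokenized = [d.lower().split() for d in docs]
    n = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for toks in tokenized for term in set(toks))
    # Smoothed inverse document frequency (one of several common variants).
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        vectors.append({t: (count / len(toks)) * idf[t]
                        for t, count in tf.items()} if toks else {})
    return vectors


docs = [
    "great phone love it",
    "terrible phone hate it",
    "love love this",
]
vectors = tfidf(docs)
```

In the first document, `great` outweighs `phone` because `phone` also appears in the second document and is therefore less distinctive. The resulting sparse vectors are the kind of features that classifiers such as decision trees or MLPs can consume.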