I graduated summa cum laude with a B.S. in Computer Science from Washington University. As an undergraduate, I conducted research at the AI for Health Institute (AIHealth), led by Prof. Chenyang Lu, in collaboration with Prof. Joanna Abraham at WashU Medicine. I also worked remotely in Prof. Yu Li’s group at CUHK on AI for drug discovery and in Prof. Sheng Wang’s group at UW. See details in [CV].
Note: I am currently taking a post-bacc gap year to explore new directions (e.g., NeuroAI) as an ML engineer, pursue personal development (learning a new language, traveling), and recover from a longstanding scoliosis condition.
🎖 Research Interests
I work on ML for healthcare, with research experience across different areas:
- LLMs & foundation models for biomedicine: Clinical text/images, EHR, and genomic/protein sequence modeling for drug discovery.
- Probabilistic & graphical models for biomedical data: Structured modeling of high-dimensional biomedical data.
I do not limit myself to a single direction; I am always open to exploring new problems and aim for future research with genuine real-world impact.
🔥 News
- 2024.12 “A Novel Generative Multi-Task Representation Learning Approach for Predicting Postoperative Complications in Cardiac Surgery Patients” accepted by Journal of the American Medical Informatics Association (JAMIA)
- 2024.5 Received B.S. in Computer Science, Summa Cum Laude (GPA 3.98/4.00)
- 2024.4 Received Outstanding Senior Award at WashU McKelvey School of Engineering
- 2023.12 “Unbiased organism-agnostic and highly sensitive signal peptide predictor with deep protein language model” accepted by Nature Computational Science
📝 Selected Publications & Manuscripts
- A Novel Generative Multi-Task Representation Learning Approach for Predicting Postoperative Complications in Cardiac Surgery Patients. Junbo Shen, Bing Xue, Thomas Kannampallil, Chenyang Lu, Joanna Abraham. Journal of the American Medical Informatics Association (JAMIA), 2024/12
- Unbiased organism-agnostic and highly sensitive signal peptide predictor with deep protein language model. Junbo Shen*, Qinze Yu*, Shenyang Chen*, Qingxiong Tan, Jingchen Li, Yu Li. Nature Computational Science, 2023/12
- Deep Learning Predicts Synergy Effect of Antibacterial Drug Combinations. Co-first Author, In Preparation
- Scalable multi-modal long-context modeling of whole-genome assemblies and antibiotics for cross-species antimicrobial resistance prediction. First Author, In Preparation
* equal contribution
🧪 Research Experience (details in CV)
Washington University AIHealth, School of Medicine / CSPL
PIs: Chenyang Lu, Joanna Abraham (Feb 2023 – Dec 2024)
🩺 surgVAE – Generative multi-task model for cardiac surgery outcome prediction (Lead)
- Probabilistic VAE-based framework for preoperative prediction of six critical postoperative complications in high-risk cardiac surgery patients, leveraging cross-surgery EHR (89k+ cases), disentangled latent spaces, and multi-task learning to improve both performance and interpretability on modifiable risk factors.
🌈 Deep learning for broadband spectrum reconstruction (Contributor)
- Optical conv2seq model for reconstructing broadband spectra from crystal responses, with GPT-style baselines and noise-robustness experiments.
University of Washington, Seattle
PI: Sheng Wang (Jun 2024 – Nov 2024)
🧬 Slide-level pretraining for pathology WSI FMs (Co-Lead)
- Developed slide-level pretraining for pathology FMs by representing each WSI as a variable-length sequence of ViT tile embeddings and training a slide encoder with DINO/DINOv2-style self-distillation.
🩻 CT segmentation foundation models (Contributor)
Chinese University of Hong Kong, CSE
PI: Yu Li (Jun 2022 – Dec 2024)
🧫 USPNet – Unbiased signal peptide prediction with protein language models (Lead)
- Organism-agnostic SP prediction framework leveraging MSA Transformer and ESM2 with a label distribution–aware margin loss to handle severe class imbalance and domain shift; outperforms prior methods on SP classification and cleavage prediction and discovers hundreds of novel SP candidates from metagenomes.
🧬 Genome–drug foundation model for AMR prediction (Lead)
- Genome–drug paired FM that predicts AMR phenotypes from full bacterial genomes and antibiotic structures using genomic LLM pretraining, multi-modal binding, and multi-instance learning.
💊 DNA-FM + molecular-FM for antibacterial drug combination synergy (Co-Lead)
- Framework that predicts synergy of antibacterial drug combinations by combining pre-trained molecular graph models for drugs with genomic foundation models for full bacterial genomes.
🧵 LLM-based multi-agent system for AMP discovery (Co-Lead)
- LLM-based multi-agent pipeline that automates discovery of novel antimicrobial peptides from human metagenomic data by orchestrating sequence models, AMP/SP predictors, and filtering tools into a unified workflow.
👥 Leadership & Service
Reviewer, American Medical Informatics Association (AMIA), Dec 2025: Reviewed paper submissions for the AMIA 2026 Annual Symposium
Teaching Assistant, Washington University, Jan 2023 – May 2024: Data Mining (Spring 2024), Intro. to AI (Fall 2023), Intro. to Cryptography (Spring 2023)
🎓 Education
Washington University, St. Louis, MO, McKelvey School of Engineering
B.S. in Computer Science, 2022.08 - 2024.05
- Selected Honors: Summa Cum Laude, Outstanding Senior Award (awarded to two seniors in WashU CSE), Dean’s List, Tau Beta Pi Engineering Honor Society
- Before transferring, I was in CUHK’s CS ELITE Stream.
Random Misc.
I grew up Korean Chinese (an ethnic minority in China), so I know a bit of Korean.
My favorite J-pop band is Mrs. GREEN APPLE; their songs support me at my darkest.
My primary drive for research is fun.
I prefer not to work on research that feels purely incremental.

CS '24