Shaswat Patel

New York, NY

Email: shaswat178@gmail.com

I am a Machine Learning Engineer and Researcher currently pursuing a Master of Science in Computer Science at New York University. My research interests lie at the intersection of Large Language Models, Mechanistic Interpretability, and applications of LLMs.

I’ve contributed to multiple projects focused on Retrieval-Augmented Generation (RAG) systems, fine-tuning transformer models, and leveraging machine learning to address a range of problems. Currently, I’m working as a Machine Learning Engineer Intern at Studio Management, where I’m developing an Event Recommendation GPT system with advanced RAG capabilities. Check out Outie.

Previously, I worked as a Software Engineer at Walmart, where I developed Confluence-integrated chatbots and automated monitoring systems. I also have experience as a Machine Learning Associate at Tavlab and MIDAS LAB, where I worked on biomedical NLP tasks, visual speech recognition, and COVID gene sequencing projects. Currently, I am working with Dr. Eunsol Choi and Dr. He He.

My work has been published in venues including MICCAI MLMI Workshop, ACL Workshop, and Medrxiv. I’m also passionate about teaching and have served as a Teaching Assistant for courses in Algorithmic Problem Solving, Data Structures, and Building LLM Reasoners at NYU.

selected publications

ACL 2026

Bridging Latent Reasoning and Target-Language Generation via Retrieval-Transition Heads

Shaswat Patel et al.

In ACL 2026 (In submission), 2026

Bib

@inproceedings{patel2025rth,
  title = {Bridging Latent Reasoning and Target-Language Generation via Retrieval-Transition Heads},
  author = {Patel, Shaswat and others},
  booktitle = {ACL 2026 (In submission)},
  year = {2026},
}

MICCAI MLMI

A novel momentum-based deep learning techniques for medical image classification and segmentation

In MICCAI MLMI Workshop, 2024

Bib

@inproceedings{patel2024momentum,
  title = {A novel momentum-based deep learning techniques for medical image classification and segmentation},
  author = {},
  booktitle = {MICCAI MLMI Workshop},
  year = {2024},
}

SNAM

Rumour detection on benchmark twitter datasets using graph neural networks with data augmentation

Shaswat Patel et al.

Springer Nature Social Network Analysis and Mining Journal, 2024

Bib

@article{patel2024rumour,
  title = {Rumour detection on benchmark twitter datasets using graph neural networks with data augmentation},
  author = {Patel, Shaswat and others},
  journal = {Springer Nature Social Network Analysis and Mining Journal},
  year = {2024},
}

Medrxiv

Bias Amplification in Intersectional Subpopulations for Clinical Phenotyping by Large Language Models

Medrxiv, 2023

ACL Workshop

A Dataset for Detecting Humor in Telugu Social Media Text

In ACL Workshop, 2022

Medrxiv

ShockModes: A Multimodal Model for Prognosticating Intensive Care Outcomes from Physician Notes and Vitals

Medrxiv, 2022

Bib

@article{patel2022shockmodes,
  title = {ShockModes: A Multimodal Model for Prognosticating Intensive Care Outcomes from Physician Notes and Vitals},
  author = {},
  journal = {Medrxiv},
  year = {2022},
}