Shaswat Patel

New York University. Master's Student in Computer Science.

New York, NY

Email: shaswat178@gmail.com

LinkedIn | Google Scholar

I am a Machine Learning Engineer and Researcher currently pursuing a Master of Science in Computer Science at New York University. My research interests lie at the intersection of Large Language Models, Mechanistic Interpretability, and applications of LLMs.

I’ve contributed to multiple projects focused on RAG (Retrieval-Augmented Generation) systems, fine-tuning transformer models, and leveraging machine learning to address a range of problems. Currently, I’m working as a Machine Learning Engineer Intern at Studio Management, where I’m developing an Event Recommendation GPT system with advanced RAG capabilities. Check out Outie.

Previously, I worked as a Software Engineer at Walmart, where I developed Confluence-integrated chatbots and automated monitoring systems. I also have experience as a Machine Learning Associate at Tavlab and MIDAS LAB, where I worked on biomedical NLP tasks, visual speech recognition, and COVID gene sequencing projects. Currently, I am working with Dr. Eunsol Choi and Dr. He He.

My work has been published in venues including MICCAI MLMI Workshop, ACL Workshop, and Medrxiv. I’m also passionate about teaching and have served as a Teaching Assistant for courses in Algorithmic Problem Solving, Data Structures, and Building LLM Reasoners at NYU.

selected publications

  1. ACL 2026
    Bridging Latent Reasoning and Target-Language Generation via Retrieval-Transition Heads
    Shaswat Patel et al.
    In ACL 2026 (In submission), 2026
  2. MICCAI MLMI
    A novel momentum-based deep learning techniques for medical image classification and segmentation
    In MICCAI MLMI Workshop, 2024
  3. SNAM
    Rumour detection on benchmark twitter datasets using graph neural networks with data augmentation
    Shaswat Patel et al.
    Springer Nature Social Network Analysis and Mining Journal, 2024
  4. Medrxiv
    Bias Amplification in Intersectional Subpopulations for Clinical Phenotyping by Large Language Models
    Medrxiv, 2023
  5. ACL Workshop
    A Dataset for Detecting Humor in Telugu Social Media Text
    In ACL Workshop, 2022
  6. Medrxiv
    ShockModes: A Multimodal Model for Prognosticating Intensive Care Outcomes from Physician Notes and Vitals
    Medrxiv, 2022