Abinitha Gourabathina

PhD Student @ MIT EECS

My research focuses on model robustness and AI safety, particularly in sensitive domains such as healthcare.

About Me

Hello! I am a PhD student at MIT LIDS, working on trustworthy machine learning with interests in high-stakes domains like healthcare. I'm lucky to be co-advised by Professors Marzyeh Ghassemi and Collin Stultz. In 2023, I graduated from Princeton University with a B.S.E. in Operations Research and Financial Engineering, along with minors in Cognitive Science, Computer Science, Linguistics, and Statistics & Machine Learning. My senior thesis, supervised by the incredible Professor Christiane Fellbaum, explored NLP techniques for detecting and editing stigmatizing language in medical records.

My research focuses on reliable, responsible, and trustworthy machine learning, and my work spans group robustness, GenAI agents, chain-of-thought reasoning, and backdoor attacks. I use tools and frameworks from optimization, probability, and statistics. My work has been published at venues such as ACM FAccT and IEEE Transactions on NanoBioscience.

This past summer, I interned at IBM Research, where I investigated inverting model outputs to identify gaps in reasoning chains and improve hallucination detection for large language models.

You can reach me at abinitha@mit.edu. I'd love to hear from you!

News

  • Feb 2026: Our paper LEIA is released on arXiv!
  • Oct 2025: Presented an ML Tea Talk on trace inversion.
  • Sept 2025: Excited to serve as Program Chair of the 2026 LIDS Student Conference!
  • Aug 2025: Completed my summer internship at IBM Research!
  • June 2025: Our work is covered by MIT News and other press.
  • June 2025: Presented at FAccT '25!
  • June 2025: Paper released on arXiv!
  • May 2025: Excited to join the AIES PC!
  • May 2025: MedPerturb project website online!
  • April 2025: Presented @ the MIT EECS Town Hall for SERC.
  • April 2025: Paper accepted to FAccT '25!
  • April 2025: Presented @ the MIT IMES Seminar Series.
  • Aug 2024: Started my PhD at MIT!

Research Highlights

Robustness Beyond Known Groups with Low-rank Adaptation

Abinitha Gourabathina, Hyewon Jeong, Teya Bergamaschi, Marzyeh Ghassemi, and Collin Stultz, 2026. arXiv preprint arXiv:2602.06924.

Deep learning models trained to optimize average accuracy often exhibit systematic failures on particular subpopulations. In real-world settings, the subpopulations most affected by such disparities are frequently unlabeled or unknown, motivating methods that perform well on sensitive subgroups without those subgroups being pre-specified. However, existing group-robust methods typically assume prior knowledge of relevant subgroups, using group annotations for training or model selection. We propose Low-rank Error Informed Adaptation (LEIA), a simple two-stage method that improves group robustness by identifying a low-dimensional subspace in the representation space where model errors concentrate. LEIA restricts adaptation to this error-informed subspace via a low-rank adjustment to the classifier logits, directly targeting latent failure modes without modifying the backbone or requiring group labels. Using five real-world datasets, we analyze group robustness under three settings: (1) truly no knowledge of subgroup relevance, (2) partial knowledge of subgroup relevance, and (3) full knowledge of subgroup relevance. Across all settings, LEIA consistently improves worst-group performance while remaining fast, parameter-efficient, and robust to hyperparameter choice.
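
As a rough, hedged sketch of the two-stage recipe described above (the paper's exact subspace estimator, objective, and update rule may differ; the PCA-on-errors heuristic, function names, and rank k below are illustrative assumptions):

```python
import torch

def error_subspace(feats, labels, logits, k):
    """Stage 1 (assumed): estimate a rank-k subspace where errors concentrate.

    feats:  (n, d) frozen backbone representations of held-out examples
    labels: (n,)   true labels
    logits: (n, c) frozen classifier outputs
    """
    wrong = logits.argmax(dim=1) != labels               # misclassified examples
    err_feats = feats[wrong] - feats[wrong].mean(dim=0)  # center before SVD
    _, _, vh = torch.linalg.svd(err_feats, full_matrices=False)
    return vh[:k].T                                      # (d, k) basis for the subspace

def fit_low_rank_adjustment(feats, labels, logits, U, epochs=100, lr=1e-2):
    """Stage 2 (assumed): learn a logit correction acting only through U."""
    A = torch.zeros(U.shape[1], logits.shape[1], requires_grad=True)  # (k, c)
    opt = torch.optim.Adam([A], lr=lr)
    proj = feats @ U                                     # coordinates in the subspace
    for _ in range(epochs):
        opt.zero_grad()
        # Backbone and base classifier stay frozen; only A is trained.
        loss = torch.nn.functional.cross_entropy(logits + proj @ A, labels)
        loss.backward()
        opt.step()
    return A.detach()
```

Whatever the exact objective, the appeal of this structure is that only a k × c matrix is trained, which is why such a method can stay fast and parameter-efficient.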

The MedPerturb Dataset: What Non-Content Perturbations Reveal About Human and Clinical LLM Decision Making

Abinitha Gourabathina, Yuexing Hao, Walter Gerych, and Marzyeh Ghassemi, 2025. arXiv preprint arXiv:2506.17163.

Clinical robustness is critical to the safe deployment of medical Large Language Models (LLMs), but key questions remain about how LLMs and humans may differ in response to the real-world variability typified by clinical settings. To address this, we introduce MedPerturb, a dataset designed to systematically evaluate medical LLMs under controlled perturbations of clinical input. MedPerturb consists of clinical vignettes spanning a range of pathologies, each transformed along three axes: (1) gender modifications; (2) style variation; and (3) format changes. With MedPerturb, we release a dataset of 800 clinical contexts grounded in realistic input variability, outputs from four LLMs, and three human expert reads per clinical context. We use MedPerturb in two case studies to reveal how shifts in gender identity cues, language style, or format reflect diverging treatment selections between humans and LLMs.
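
To make the evaluation concrete, here is a hedged sketch of how agreement between original and perturbed vignettes might be scored per axis; the record fields and the query_model() helper are hypothetical, not the released dataset's actual schema or API:

```python
from collections import defaultdict

def treatment_flip_rate(records, query_model):
    """Fraction of contexts whose recommended treatment changes, per axis.

    records: iterable of dicts with (hypothetical) keys
             'axis' in {'gender', 'style', 'format'},
             'original' and 'perturbed' vignette texts.
    query_model: callable mapping a vignette text to a treatment label.
    """
    flips, totals = defaultdict(int), defaultdict(int)
    for r in records:
        base = query_model(r["original"])
        pert = query_model(r["perturbed"])
        totals[r["axis"]] += 1
        flips[r["axis"]] += int(base != pert)   # did the treatment change?
    return {axis: flips[axis] / totals[axis] for axis in totals}
```

The same loop, run over the released human expert reads instead of model outputs, would give the human side of the comparison.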

The Medium is the Message: How Non-Clinical Information Shapes Clinical Decisions in LLMs

Abinitha Gourabathina, Walter Gerych, Eileen Pan, and Marzyeh Ghassemi. 2025. In Proceedings of the 2025 ACM Conference on Fairness, Accountability, and Transparency (FAccT '25). Association for Computing Machinery, New York, NY, USA, 1805-1828. https://doi.org/10.1145/3715275.3732121

The integration of large language models (LLMs) into clinical diagnostics necessitates a careful understanding of how clinically irrelevant aspects of user inputs directly influence generated treatment recommendations and, consequently, clinical outcomes for end-users. Building on prior research that examines the impact of demographic attributes on clinical LLM reasoning, this study explores how non-clinically relevant attributes shape clinical decision-making by LLMs. Through the perturbation of patient messages, we evaluate whether LLM behavior remains consistent, accurate, and unbiased when non-clinical information is altered. These perturbations assess the brittleness of clinical LLM reasoning by replicating structural errors that may occur during the electronic processing of patient questions and by simulating patient-AI interactions across diverse, vulnerable patient groups. Our findings reveal notable inconsistencies in LLM treatment recommendations and significant degradation of clinical accuracy in ways that reduce care allocation to patients. Additionally, there are significant disparities in treatment recommendations between gender subgroups as well as between model-inferred gender subgroups. We also apply our perturbation framework to a conversational clinical dataset and find that, even in conversation, LLM clinical accuracy decreases post-perturbation, and disparities exist in how perturbations impact gender subgroups. By analyzing LLM outputs in response to realistic yet modified clinical contexts, our work deepens understanding of the sensitivity, inaccuracy, and biases inherent in medical LLMs, offering critical insights for the deployment of patient-AI systems.
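
Illustratively, the two headline effects (accuracy degradation after perturbation and gaps between gender subgroups) could be summarized as below; the column names are assumptions for the sketch, not the paper's actual code:

```python
import pandas as pd

def perturbation_report(df: pd.DataFrame) -> dict:
    """df: one row per message, with hypothetical columns 'correct_pre' and
    'correct_post' (0/1 clinical accuracy before/after perturbation) and 'gender'."""
    report = {"accuracy_drop": df["correct_pre"].mean() - df["correct_post"].mean()}
    by_gender = df.groupby("gender")["correct_post"].mean()
    report["subgroup_gap"] = float(by_gender.max() - by_gender.min())
    return report
```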

PanDa Game: Optimized Privacy-Preserving Publishing of Individual-Level Pandemic Data Based on a Game Theoretic Model

Abinitha Gourabathina, Zhiyu Wan, J. Thomas Brown, Chao Yan, and Bradley A. Malin, 2023. In IEEE Transactions on NanoBioscience, vol. 22, no. 4, pp. 808-817, Oct. 2023, doi: 10.1109/TNB.2023.3284092.

Sharing individual-level pandemic data is essential for accelerating the understanding of a disease. For example, COVID-19 data have been widely collected to support public health surveillance and research. In the United States, these data are typically de-identified before publication to protect the privacy of the corresponding individuals. However, current data publishing approaches for this type of data, such as those adopted by the U.S. Centers for Disease Control and Prevention (CDC), have not flexed over time to account for the dynamic nature of infection rates. Thus, the policies generated by these strategies have the potential either to raise privacy risks or to overprotect the data and impair its utility (or usability). To optimize the tradeoff between privacy risk and data utility, we introduce a game theoretic model that adaptively generates policies for the publication of individual-level COVID-19 data according to infection dynamics. We model the data publishing process as a two-player Stackelberg game between a data publisher and a data recipient and then search for the best strategy for the publisher. In this game, we consider 1) average performance of predicting future case counts; and 2) mutual information between the original data and the released data. We use COVID-19 case data from Vanderbilt University Medical Center from March 2020 to December 2021 to demonstrate the effectiveness of the new model. The results indicate that the game theoretic model outperforms all state-of-the-art baseline approaches, including those adopted by the CDC, while maintaining low privacy risk. We further perform an extensive sensitivity analysis to show that our findings are robust to order-of-magnitude parameter fluctuations.
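
The leader-follower structure is simple to sketch; a toy version (with stand-in payoff functions, not the paper's actual utility and risk models) looks like this:

```python
def best_policy(policies, recipient_actions, utility, risk):
    """Publisher (leader) commits to a policy; recipient (follower) best-responds.

    policies:          candidate publishing policies (e.g., generalization levels)
    recipient_actions: the follower's possible attacks
    utility(p):        data utility of publishing under policy p
    risk(p, a):        publisher's privacy loss if the follower plays a under p
    """
    best, best_payoff = None, float("-inf")
    for p in policies:
        # Follower best response: the attack that maximizes the publisher's loss.
        a_star = max(recipient_actions, key=lambda a: risk(p, a))
        payoff = utility(p) - risk(p, a_star)        # leader's payoff
        if payoff > best_payoff:
            best, best_payoff = p, payoff
    return best, best_payoff
```

In the paper, utility is built from case-count forecast performance and mutual information, and the policy search adapts as infection dynamics change; the sketch only shows the game's skeleton.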

What seems to be the problem? Stigmatizing language in patient medical notes

Abinitha Gourabathina, 2023. Senior thesis, Princeton University. Princeton DataSpace. http://arks.princeton.edu/ark:/88435/dsp01cv43p110t

Stigmatizing language in medical notes can prevent a patient from acquiring proper treatment. Reading medical notes containing biased language can influence subsequent clinicians' perception of a patient, further compounding a patient's inability to receive adequate care. Thus, there is a clear need to correct patient notes to eliminate stigmatizing language. Prior work on stigmatizing language in medical notes has largely remained qualitative, with clinicians and researchers manually analyzing notes for stigmatizing keywords. Our work used a computational approach to obtain a more robust set of stigmatizing keywords. We created contextual word embeddings from BERT-based and BioBERT-based models trained on free-text, patient-oriented clinical data. These state-of-the-art models allowed us to develop word vector representations, from which we identified 30 new stigmatizing keywords. We then completed a thorough analysis to build a grammar structure that categorizes stigmatizing keywords according to the ways they induce stigma and to better understand the syntactic environments in which these keywords occur. Following this analysis, we developed a model called MedStiLE (Medical note Stigmatizing Language Editor) that uses the grammar structure and constituency parsing to rewrite notes containing stigmatizing keywords so that they are non-stigmatizing. An evaluation with human raters found that MedStiLE significantly reduced stigma in notes. This research provides novel methodological insights and results that can help shape future work at the intersection of language and healthcare.
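
The keyword-expansion step lends itself to a short sketch: embed seed terms with a BERT-style model and rank candidate vocabulary by cosine similarity. The model choice, pooling, and ranking below are assumptions (the thesis used models trained on patient-oriented clinical text), not the actual MedStiLE pipeline:

```python
import torch
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-uncased")   # stand-in model
model = AutoModel.from_pretrained("bert-base-uncased")

def embed(word: str) -> torch.Tensor:
    # Mean-pool the last hidden state over the word's subword tokens.
    inputs = tok(word, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state[0]
    return hidden[1:-1].mean(dim=0)                  # drop [CLS] and [SEP]

def expand_keywords(seeds, candidates, top_k=30):
    # Score each candidate by its best similarity to any seed keyword.
    seed_vecs = torch.stack([embed(w) for w in seeds])
    scored = []
    for cand in candidates:
        sim = torch.cosine_similarity(seed_vecs, embed(cand).unsqueeze(0)).max()
        scored.append((sim.item(), cand))
    return [w for _, w in sorted(scored, reverse=True)[:top_k]]
```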

Papers

  • Abinitha Gourabathina, Hyewon Jeong, Teya Bergamaschi, Marzyeh Ghassemi, and Collin Stultz, 2026. Robustness Beyond Known Groups with Low-rank Adaptation. arXiv preprint arXiv:2602.06924. (Preprint).
    [Paper] [Code]
  • Abinitha Gourabathina, Yuexing Hao, Walter Gerych, and Marzyeh Ghassemi, 2025. The MedPerturb Dataset: What Non-Content Perturbations Reveal About Human and Clinical LLM Decision Making. arXiv preprint arXiv:2506.17163. (Preprint).
    [Paper] [Website] [Code]
  • Abinitha Gourabathina, Walter Gerych, Eileen Pan, and Marzyeh Ghassemi. 2025. The Medium is the Message: How Non-Clinical Information Shapes Clinical Decisions in LLMs. In Proceedings of the 2025 ACM Conference on Fairness, Accountability, and Transparency (FAccT '25).
    [Paper] [Code]
  • Abinitha Gourabathina, Zhiyu Wan, J. Thomas Brown, Chao Yan, and Bradley A. Malin, 2023. PanDa Game: Optimized Privacy-Preserving Publishing of Individual-Level Pandemic Data Based on a Game Theoretic Model. In IEEE Transactions on NanoBioscience, vol. 22, no. 4, pp. 808-817, Oct. 2023.
    [Paper] [Code]
  • Abinitha Gourabathina, Zhiyu Wan, J. Thomas Brown, Chao Yan, and Bradley A. Malin, 2022. PanDa Game: Optimized Privacy-Preserving Publishing of Individual-Level Pandemic Data Based on a Game Theoretic Model. In Proceedings of the 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), Las Vegas, NV, USA, 2022, pp. 961-968.
    [Paper]
  • Abinitha Gourabathina, 2023. What seems to be the problem? Stigmatizing language in patient medical notes. Senior thesis, Princeton University. Princeton DataSpace. http://arks.princeton.edu/ark:/88435/dsp01cv43p110t
    [Paper]