Anirudh Khatry

Computer Science PhD Student

University of Texas at Austin

Biography

Hi, I am a second-year PhD student in the Department of Computer Science at the University of Texas at Austin, advised by Prof. Işıl Dillig and Prof. Greg Durrett. I work on secure code transpilation using neurosymbolic techniques.

Previously, I was a Pre-doctoral Research Fellow at Microsoft with the PROSE (Program Synthesis) team, where I worked on the Copilot experience for data wrangling in Fabric. I received my bachelor’s degree in Information Technology from Veermata Jijabai Technological Institute (V.J.T.I.), Mumbai, India.

I have had the good fortune to work with Dr. Ashish Tiwari, Dr. Sumit Gulwani, Dr. Vu Le, Dr. Gust Verbruggen, Dr. Sandeep Udmale, and Dr. Vijay Sambhe.

Outside of work, you can find me running, playing the guitar, and listening to songs.

Download my CV.

Interests

AI4Code
Natural Language Processing
Programming Languages
Program Synthesis
Information Retrieval

Education

Ph.D. in Computer Science, started 2024.
University of Texas at Austin.
B.Tech. in Information Technology, 2017-2021
Veermata Jijabai Technological Institute, India.

Recent News

[07/07/2025] CRUST-bench has been accepted to COLM ’25! Link to the paper.

[11/06/2025] CRUST-bench leaderboard is now available! Check it out here.

[29/01/2025] “An empirical study of validating synthetic data for formula generation” was accepted to NAACL-Findings 2025.

[10/10/2024] I am on the PC for ICLR 2025.

[01/10/2024] I am on the PC for the Table Representation Learning Workshop at NeurIPS ’24.

Recent Publications

Anirudh Khatry, Robert Zhang, Jia Pan, Ziteng Wang, Jocelyn Chen, Greg Durrett, Isil Dillig (2025). CRUST-bench: A Comprehensive Benchmark for C-to-safe-Rust Transpilation. In COLM'25.

Usneek Singh, Jose Cambronero, Sumit Gulwani, Aditya Kanade, Anirudh Khatry, Vu Le, Mukul Singh, Gust Verbruggen (2024). An Empirical Study of Validating Synthetic Data for Formula Generation. In NAACL-Findings'25.

Ananya Singha, Bhavya Chopra, Anirudh Khatry, Sumit Gulwani, Austin Henley, Vu Le, Chris Parnin, Mukul Singh, Gust Verbruggen (2024). Semantically Aligned Question and Code Generation for Automated Insight Generation (Best Paper). In LLM4Code, ICSE ’24.

Anirudh Khatry, Joyce Cahoon, Jordan Henkel, Shaleen Deep, Venkatesh Emani, Avrilia Floratou, Sumit Gulwani, Vu Le, Mohammad Raza, Sherry Shi, Mukul Singh, Ashish Tiwari (2023). Alternate Task Technique for Natural Language to Code in Low-Resource Languages.

Anirudh Khatry, Sumit Gulwani, Vu Le, Mukul Singh, Gust Verbruggen (2023). COOPER: Learning what to teach models for code generation.

Anirudh Khatry, Sumit Gulwani, Priyanshu Gupta, Vu Le, Ananya Singha, Mukul Singh, Gust Verbruggen (2023). TSTR: Target Similarity Tuning Meets the Real World. In EMNLP-Findings'23.

Anirudh Khatry, Yasharth Bajpai, Priyanshu Gupta, Sumit Gulwani, Ashish Tiwari (2023). Augmented Embeddings for Custom Retrievals.

Anirudh Khatry, Joyce Cahoon, Jordan Henkel, Shaleen Deep, Venkatesh Emani, Avrilia Floratou, Sumit Gulwani, Vu Le, Mohammad Raza, Sherry Shi, Mukul Singh, Ashish Tiwari (2023). From Words to Code: Harnessing Data for Program Synthesis from Natural Language. In MLAIDS'23.

Suresh Parthasarathy, Lincy Pattanaik, Anirudh Khatry, Arun Iyer, Arjun Radhakrishna, Sriram Rajamani, Mohammad Raza (2022). Landmarks and Regions: A Robust Approach to Data Extraction. In PLDI'22.

Experience

Pre-doctoral Research Fellow

Microsoft

Aug 2022 – Jun 2024 Redmond, WA

Conceptualized and built the natural language to code feature for the Power Query M language, used for wrangling tables in Excel, Fabric, and Power BI.
Helped build the Copilot experience within Power Query in Fabric and Excel.
Devised two state-of-the-art strategies, TSTR (EMNLP-Findings 2023) and COOPER (under submission to ASE 2024), for optimal dynamic prompt construction to aid in-context learning in natural language to code tasks.
Developed the Alternate Task Technique (ATT) (under submission), a generalized framework that post-processes LLM outputs using alternate tasks and improved performance on low-resource languages, like Power Query M, by 13%.
Developed the Adapted Dense Retrieval (ADDER) framework (under submission) for information retrieval tasks, using dense embeddings for efficient code retrieval in low-resource settings.

Research Intern

Microsoft Research

Jul 2021 – Aug 2022 India

Collaborated with the Microsoft Edge team on web-based data extraction tasks to improve the product purchasing experience.
Successfully automated invoice data extraction tasks for the Microsoft IDC Finance team to improve productivity.
Tackled low-resource named entity recognition tasks using ML and program synthesis techniques.
Devised Landmark-based Robust Synthesis (LRSyn), a state-of-the-art interpretable data extraction framework, robust to version changes in data.
Spearheaded the clustering and landmark detection tasks during the development of LRSyn, and developed a novel fingerprinting technique for images.
Published our research paper, “Landmarks and Regions: A Robust Approach to Data Extraction”, at the Conference on Programming Language Design and Implementation (PLDI) 2022, San Diego.

Machine Learning Intern

Human Rights First

May 2021 – Jul 2021 Remote

Collaborated with 30 changemakers to develop a war-crime detection tool using social media channels.
Fine-tuned a distil-RoBERTa model for binary classification of war crimes.
Spearheaded the development of a novel two-stage prediction pipeline for multi-label classification of war crimes.

Research Intern

Samsung Research and Development Institute, India

May 2020 – Jul 2020 Remote

Worked with the On-Device AI team to improve system performance using Reinforcement Learning.
Built a state-of-the-art multi-agent Deep Q-Network leveraging prioritized experience replay (PER) and time-bound dynamic reward functions.
Designed a landmark agent simulation environment to show proof of concept.

Programming Analyst Intern

Pexabyte Technology Solutions Pvt. Ltd.

May 2019 – Jul 2019 Remote

Coordinated with the product development team to build an ERP application for manufacturing and service-based industries.
Employed JavaFX for the development of the application and MySQL for database management.
Followed an agile-based product development life cycle with constant interaction with key product owners.