Machine Learning Engineer

About Me

Hi, I’m Jaydeep! I’m a Machine Learning Engineer passionate about building robust, interpretable, and generalizable AI systems that can solve real-world problems and make a meaningful impact. My interests span deep learning interpretability, computer vision, natural language processing, and generative AI, with a long-term goal of developing models that can reason about the world, learn from limited data, and generalize beyond i.i.d. settings.

I recently graduated with a Master’s in Computer Science from Indiana University Bloomington (May 2023), and since then I’ve worked on a range of impactful projects:

At IU’s School of Public Health, I designed large-scale pipelines for processing millions of PubMed articles, fine-tuned LLMs, and built RAG systems to recommend evidence-based nudges. I also co-developed a reminiscence therapy AI agent for dementia care, now backed by an NIH grant and heading to clinical trials.

At Causify, I built scalable data pipelines, integrated exchange data, and collaborated on custom LLM research for streamlining workflows.

Earlier, at ParallelDots, I worked on object detection models and computer vision pipelines for Fortune 500 FMCG clients, and contributed to research published at the CVPR 2021 RetailVision Workshop.

I also enjoy competing on Kaggle, where I hold Expert rank and have earned silver medals in computer vision and healthcare challenges.

Beyond engineering, I see myself as a polymath who loves connecting AI with diverse domains — from healthcare to climate-tech, from generative media to human-computer interaction.

If you’re also excited about any of these areas or want to collaborate, feel free to reach out via email or LinkedIn.

Thanks for stopping by!

Projects

Volumetric rendering using Neural Radiance Fields

github.com/Jd8111997/NeRF

This is one of the coolest applications of computer vision!

This project is an implementation of the NeRF paper (https://www.matthewtancik.com/nerf) that renders novel 3D views of a scene. The objective was to learn about 3D view reconstruction, neural rendering, and depth estimation using a state-of-the-art computer vision model.

Supervised Contrastive Learning for pretrained language models

github.com/Jd8111997/Supervised-contrastive-learning

Supervised contrastive learning proves to be a great technique for few-shot learning!

This project is an implementation of the paper on supervised contrastive learning for pretrained language models (https://arxiv.org/abs/2011.01403). The main objective was to learn about few-shot learning for data-hungry large language models. The results were fascinating: after applying the supervised contrastive loss, the model learned a very good representation of discriminative features from very few samples.

Speech Enhancer

github.com/Jd8111997/Speech-Enhancer

I built it to enhance my crappy guitar riffs!

The purpose of this project was to explore the domain of speech enhancement using Speech Enhancement Generative Adversarial Networks (SEGAN).

Autoregressive Generative Models

github.com/Jd8111997/Auto-regressive-models

Feeling bored? Let's build an AI model to generate some dog images :)

This project implements autoregressive generative models such as PixelCNN, PixelCNN++, and GatedPixelCNN to generate novel images of dogs.

Matrix optimization using Angular and Python

github.com/Jd8111997/ChatClub

After watching the TV series Silicon Valley, I was inspired to build a decentralized chat application.

The main objective of this project was to learn more about decentralization concepts and algorithms by developing a decentralized chat application based on the popular open-source Matrix protocol and Synapse server, with Angular on the front end and Python Twisted on the back end.

Duplicate Image Detection Tool

github.com/Jd8111997/Duplicate-Image-Detection-Tool

Isn't it tiring to manually delete duplicate images from your computer?

This project is a PyQt-based GUI application that detects duplicate images using a sensitivity hashing algorithm. It can also detect noisy images and automatically delete them based on a threshold.

Music Recommendation Engine

github.com/Jd8111997/Music-Recommedation-Engine

Collaborative filtering won't work if you don't have enough data.

Beats is a music hosting site and music recommendation engine built in C# using collaborative filtering algorithms.

Experience

Indiana University School of Public Health

Machine Learning Engineer

Jan 2025 – Present

https://publichealth.indiana.edu/

Designed and implemented a scalable pipeline to process 8M+ PubMed documents using TF-IDF and LLM-based semantic matching.
Developed classifiers to identify evidence-based nudge articles via in-context and few-shot learning.
Fine-tuned and quantized an instruction-tuned LLaMA 3.1-8B for structured information extraction.
Built a RAG system using BGE + LLaMA embeddings with FAISS for recommending targeted nudge behaviors.
Co-developed a speech-to-speech AI reminiscence therapy agent (ASR, LLM, TTS, Unity), securing an NIH R21 grant and preparing for clinical trials.

Causify

Machine Learning Software Engineer

Feb 2024 – Dec 2024

https://www.causify.ai/

Designed and maintained Apache Airflow DAGs for large-scale cryptocurrency price data pipelines.
Built robust pipelines for Crypto.com exchange data.
Enhanced internal Datapull API and optimized codebase for performance and maintainability.
Collaborated on custom LLM and intelligent agent research projects to streamline workflows.

Indiana University Computer Vision Lab

Research Assistant

Aug 2023 – Jun 2024

https://vision.soic.indiana.edu/

Conducted research on out-of-distribution generalization under Dr. David Crandall, focusing on robustness of deep learning models for image classification.
Completed Master’s thesis: “Out-of-Distribution Generalization, Spurious Correlations, and Supervised Contrastive Groupwise Learning.”
Explored 3D computer vision (NeRFs) and egocentric action recognition, advancing perception research.

Vimaan Robotics

Computer Vision Research Intern

Jun 2022 – Sept 2022

https://vimaan.ai/

Developed a synthetic-data–trained No-Reference Image Quality Assessment (NR-IQA) method, robust to real-world data.
Implemented meta-RL–based data valuation for sample quality estimation.
Built pipelines for image deblurring and enhancement using state-of-the-art DL models.

ParallelDots Inc.

Associate Data Scientist – DL Research

Jun 2019 – Jul 2021

https://www.paralleldots.com/

Benchmarked domain-invariant object detectors, improving ShelfWatch product pipelines.
Managed and optimized CV pipelines for Fortune 500 FMCG clients (BAT, RB, Mondelez).
Trained and deployed optimized object detection models for mobile (ONNX, TensorRT, TF.js).
Contributed to semi-supervised learning research for dense detection, published at CVPR 2021 RetailVision Workshop.

Education

Indiana University

Master’s in Computer Science

Aug 2021 - May 2023

IU is considered one of the most beautiful campuses in the US.

During my Master’s, I studied many interesting subjects: Applied Algorithms, Elements of AI, Machine Learning, Computer Vision, Advanced OS, Applied ML for Computational Linguistics, and Music Data Mining.
I wrote my Master’s thesis on generalization under distributional shifts, disentanglement, and causality in deep learning under the guidance of Prof. David Crandall.

Skills

Languages: Python, R, C/C++, JavaScript, Java SE, SQL, Bash, PHP
Libraries: PyTorch, scikit-learn, NLTK, Matplotlib, TensorFlow, Keras, JAX, Pandas, SciPy, Plotly
Tools: Git, MySQL, SQLite, Docker, Wandb, Kubernetes, LaTeX

Honors and Awards

Won a silver medal in the Cassava Leaf Disease Classification challenge on Kaggle (top 4% worldwide), Jan 2021
Won a silver medal in the SIIM-ISIC Melanoma Classification challenge on Kaggle (top 2% worldwide), Aug 2020
Kaggle Competitions Expert (current rank 1174 out of 166,012)

Publications

Semi-supervised Learning for Dense Object Detection in Retail Scenes: CVPR 2021 RetailVision Workshop
Comparative study of GAN and VAE: International Journal of Computer Applications (0975 - 8887) Volume 182 - No.22, October 2018

A Little More About Me

Alongside tinkering with ML models, some of my other interests and hobbies are:

Playing riffs on guitar (currently learning to play “Nothing Else Matters” by Metallica)
Exercising
Cooking
Watching stand-up comedy
Reading fiction (currently finishing The Brothers Karamazov)
Musing and trying to connect dots between science, arts, philosophy and history

My Bucket List

Build my own RV and travel across all the states in the US
Visit the F1 circuit in Monte Carlo
Learn to surf and skydive
Produce my own music

Jaydeep Chauhan