Abhinav Gupta — Data Scientist & AI/ML Engineer

Professional Work

@ Bajaj Finserv Health Limited

Systems I've designed, built, and shipped in production during my internship.

ML/DL Powered Cataract Eye Detection Agent

Live in Production

End-to-end automated system that assesses each eye individually, using MTCNN for eye extraction and the 12-layer Vision Transformer in BioMedCLIP for diagnostic classification.

98.3% ROC-AUC · 94.8% Accuracy · BioMedCLIP ViT · <0.2s Latency

Pipeline Architecture

1. Multi-Stage Detection & Preprocessing

A human-detection stage (99% accuracy) runs first, followed by MTCNN to extract a frame around each eye. Each frame is then preprocessed in Lab colour space: whiteness is enhanced by 1.2× to emphasise cataracts, pupils are darkened to 0.8×, and haze removal and de-noising are applied.
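
A minimal sketch of the lightness adjustment described above, operating on a row of Lab L* values (0 to 100) rather than a full image. The 1.2 and 0.8 gains come from the description; the two thresholds and the function name `adjust_lightness` are assumptions chosen only for illustration, not the production values.

```python
def adjust_lightness(l_values, bright_gain=1.2, dark_gain=0.8,
                     bright_thresh=70.0, dark_thresh=30.0):
    """Scale Lab L* values: boost bright pixels, darken dark ones.

    Thresholds are hypothetical; only the two gains come from the write-up.
    """
    adjusted = []
    for l_star in l_values:
        if l_star >= bright_thresh:      # bright region, e.g. lens opacity
            l_star *= bright_gain
        elif l_star <= dark_thresh:      # dark region, e.g. pupil
            l_star *= dark_gain
        # Keep the result inside the valid L* range.
        adjusted.append(min(100.0, max(0.0, l_star)))
    return adjusted


row = [10.0, 25.0, 50.0, 80.0, 95.0]
adjusted = adjust_lightness(row)
print(adjusted)
```

In a real pipeline the same scaling would be applied to the L channel of an image after a BGR-to-Lab conversion, with mid-range pixels left untouched as they are here.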

2. Feature Extraction (BioMedCLIP)

Used the 12-layer Vision Transformer (ViT) from Microsoft's BioMedCLIP, freezing the first 9 layers and using the last 3. Eye frames are standardised to 224×224 and encoded as 512-dimensional embeddings for complex pattern recognition.

3. Custom Pupil Geometry & Classification

Radius-based pupil detection (validated at >87% confidence) produces a 128-dimensional feature vector. This is combined with the eye embedding and the physical scores from the MTCNN stage (whiteness, haziness, edges) and fed to a custom 4-layer MLP (512→256→64→4) whose hyperparameters were tuned with Optuna.
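
To give a sense of the classifier's size, here is a rough parameter count for the MLP. It assumes the input is a plain concatenation of the 512-dim embedding, the 128-dim pupil features, and the three physical scores (643 values), and that the four dense layers have output widths 512, 256, 64, and 4. Both are my reading of the description above, not a confirmed layout.

```python
def dense_param_count(layer_sizes):
    """Weights plus biases for a stack of fully connected layers."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))


embedding_dim = 512      # BioMedCLIP eye embedding
pupil_dim = 128          # pupil geometry features
physical_scores = 3      # whiteness, haziness, edges

# Assumption: features are simply concatenated before the first layer.
input_dim = embedding_dim + pupil_dim + physical_scores
layer_sizes = [input_dim, 512, 256, 64, 4]

total_params = dense_param_count(layer_sizes)
print(input_dim, total_params)  # 643 477764
```

Under these assumptions the head has under half a million parameters, which is small next to the ViT encoder.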

Technical Excellence

Condition Assessment: Assesses each eye independently (e.g., Right Eye: Cataract, Left Eye: Normal).
Optimization: Achieved 93% ROC-AUC and 85% F1-score on validation/test sets using hyperparameter tuning with Optuna.
Active Learning Feedback: Implemented a feedback loop in which the loss from MLP misclassifications is backpropagated into BioMedCLIP, refining the embeddings to maximise class separability.
Production Ready: Response time under 0.2 seconds, fast enough for real-time diagnosis.

Tech Stack

BioMedCLIP (ViT) · MTCNN · Optuna · Python · PyTorch · OpenCV

View Technical Documentation ↗

IPD Claims Processing Agent

Live in Production

Automated end-to-end pipeline for IPD insurance claims — document verification, classification, and rule-based adjudication for dialysis and cataract claim types.

4,200 claims / day · ₹14L / month savings · 89% coverage · 61% effort reduction

What it does

Processes 1.2 lakh IPD claims per month (4,200/day) in approximately 5 hours — a workload that previously required 68 manual staff working 10 hours/day.
Brought processing cost down to ~₹2.2L/month for 1.2L claims, saving ₹14 lakh/month compared to the manual process.
Fully configurable — supports any dialysis or cataract claim type without code changes; rules are toggled per document class at runtime.
The pipeline processed and helped adjudicate ₹144 crores worth of claims end-to-end (total claims value handled, not savings).
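
The figures above can be cross-checked with some quick arithmetic. All inputs are taken from the bullets; the "implied manual cost" is only what the stated pipeline cost and savings add up to, not a separately reported number.

```python
claims_per_month = 120_000          # 1.2 lakh
claims_per_day = 4_200
pipeline_hours_per_day = 5
manual_staff = 68
manual_hours_per_day = 10
pipeline_cost_per_month = 220_000   # ~₹2.2L
savings_per_month = 1_400_000       # ₹14L

# Manual effort the pipeline replaces, in person-hours per day.
manual_person_hours = manual_staff * manual_hours_per_day

# Pipeline throughput and unit cost.
claims_per_hour = claims_per_day / pipeline_hours_per_day
cost_per_claim = pipeline_cost_per_month / claims_per_month

# Derived, not reported: pipeline cost plus stated savings.
implied_manual_cost = pipeline_cost_per_month + savings_per_month

print(manual_person_hours, claims_per_hour,
      round(cost_per_claim, 2), implied_manual_cost)
```

So roughly 680 person-hours a day are replaced by a 5-hour run at about 840 claims per hour, at under ₹2 per claim.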

Technical Breakdown

Document Classifier: Sentence-transformer model (all-MiniLM-L6-v2) supporting 14+ medical document classes with multi-label capability (94% TP, 83% TN). Can classify two documents on the same page independently.
Computer Vision Models: Custom MobileNetV2-based human detection (97%+ TP) and cataract eye detection (97%+ TP) models, both improved from a 79% baseline during the project.
Adjudication Engine: Dynamic rule-based engine with 62+ configurable rules. Rules are assigned per document type, giving precise per-class adjudication control. Achieves 93% TP accuracy end-to-end.
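
A minimal sketch of how per-document-class rules with runtime toggles might look. The class names, rule names, and field names here are hypothetical examples, not the production rule set, and the real engine has 62+ rules.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Rule:
    name: str
    check: Callable[[dict], bool]   # returns True when the rule passes
    enabled: bool = True


# Hypothetical rules, keyed by document class.
RULES_BY_CLASS: Dict[str, List[Rule]] = {
    "discharge_summary": [
        Rule("has_patient_name", lambda d: bool(d.get("patient_name"))),
        Rule("has_admission_date", lambda d: bool(d.get("admission_date"))),
    ],
    "hospital_bill": [
        Rule("amount_positive", lambda d: d.get("amount", 0) > 0),
    ],
}


def set_rule_enabled(doc_class: str, rule_name: str, enabled: bool) -> None:
    """Toggle a rule at runtime without touching the rule code."""
    for rule in RULES_BY_CLASS[doc_class]:
        if rule.name == rule_name:
            rule.enabled = enabled
            return
    raise KeyError(f"{rule_name!r} not found for {doc_class!r}")


def adjudicate(doc_class: str, fields: dict) -> dict:
    """Run every enabled rule for the class and report failures."""
    failed = [r.name for r in RULES_BY_CLASS.get(doc_class, [])
              if r.enabled and not r.check(fields)]
    return {"approved": not failed, "failed_rules": failed}


doc = {"patient_name": "A. Kumar"}                      # no admission date
before = adjudicate("discharge_summary", doc)
set_rule_enabled("discharge_summary", "has_admission_date", False)
after = adjudicate("discharge_summary", doc)
print(before, after)
```

Keeping rules as data keyed by document class is what lets a new claim type be supported by configuration alone.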

Stack

Python · PyTorch · MobileNetV2 · all-MiniLM-L6-v2 · Computer Vision · Rule Engine

LLM Fine-tuning for OPD Claims — InternVL 3.5 (38B)

Live in Production

Fine-tuned a 38B vision-language model to extract structured data from handwritten OPD prescriptions at production scale.

12.3L claims / month · ₹3L / month savings · 43× faster inference · 18 hrs/day runtime

Impact & Scale

Processes 42,000 handwritten prescriptions per day (12.3 lakh/month) across 5 claim task types, running 18 hours continuously in production.
Replaces a costlier LLM solution — saves ₹3 lakh/month with an accepted trade-off of ~6% accuracy drop, a deliberate engineering decision based on cost-benefit analysis.
Inference time reduced from 52 seconds → 1.2 seconds per claim — a 43× speedup — through quantization and inference engine optimisation.
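
A quick check of the speedup and of what the new latency means for daily capacity. It assumes claims are processed strictly one after another at 1.2 seconds each, which is a simplification: the production setup batches requests, so real headroom will differ.

```python
old_latency_s = 52.0
new_latency_s = 1.2
daily_volume = 42_000
runtime_s = 18 * 3600                 # 18-hour production window

speedup = old_latency_s / new_latency_s

# Assumption: one claim at a time, no batching or parallelism.
sequential_capacity = runtime_s / new_latency_s
utilisation = daily_volume / sequential_capacity

print(round(speedup, 1), round(sequential_capacity), round(utilisation, 2))
```

At the old 52-second latency the same window would fit only about 1,246 claims, which is why the optimisation was a prerequisite for this volume.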

Model Selection Rationale

Benchmarked MedGemma 27B, Qwen 2.5 32B, and InternVL 3.5 38B. Selected InternVL for its superior vision encoder — Shanghai AI Lab's custom architecture on top of Qwen 2.5 32B — which consistently outperformed the others on degraded, handwritten medical text.
Ran a 1-month shadow-mode trial, processing 12 lakh real claims in parallel with the existing pipeline, before switching production over.
Applied destructive testing to probe for hallucinations under edge-case inputs; validated 18-hour sustained runtime stability before full deployment.

Optimisation Techniques

PEFT with LoRA / QLoRA for parameter-efficient fine-tuning; 4-bit quantization to fit a 38B model on available GPU memory.
LMDeploy for optimised inference; batch processing, token-length management, deadlock prevention, and OOM safeguards for 18-hour continuous operation.
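
Back-of-the-envelope numbers for why 4-bit quantization and LoRA matter at this scale. The memory figures cover weights only and ignore quantization constants, activations, and the KV cache. The 5120×5120 projection and rank 16 in the LoRA part are hypothetical values for illustration, not the model's actual dimensions or my training config.

```python
def weight_memory_gb(n_params: float, bits_per_param: int) -> float:
    """Memory for the weights alone, in decimal gigabytes."""
    return n_params * bits_per_param / 8 / 1e9


n_params = 38e9
fp16_gb = weight_memory_gb(n_params, 16)
int4_gb = weight_memory_gb(n_params, 4)


def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """LoRA adds two low-rank matrices: (d_in x rank) and (rank x d_out)."""
    return rank * (d_in + d_out)


# Hypothetical layer shape and rank, for illustration only.
full = 5120 * 5120
adapter = lora_params(5120, 5120, rank=16)

print(fp16_gb, int4_gb, adapter, full, adapter / full)
```

Quantization cuts the weight footprint to a quarter of fp16, and under these example numbers the adapter trains well under 1% of the parameters of the layer it wraps.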

Stack

InternVL 3.5 (38B) · Qwen 2.5 · PEFT / LoRA · 4-bit Quantization · LMDeploy · VLM

Document Categorization Platform

Live in Production

A self-service ML platform that lets non-technical teams train, benchmark, and deploy custom document classifiers — the backbone of all claims verification workflows at BFHL.

14+ document classes · 3 ML approaches · Zero-code training · Auto model selection

Overview

Teams upload a folder of labelled documents — the folder name becomes the class label. The platform automatically trains all three approaches, benchmarks them against the dataset, and recommends the optimal model. Weights are exported in standard formats (ONNX, TorchScript, etc.) ready for deployment. No training or testing scripts needed.
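
The folder-name-as-label convention can be sketched in a few lines. The function name `collect_labelled_files` and the example class folders are assumptions for illustration; the platform's real loader is not shown here.

```python
import tempfile
from pathlib import Path


def collect_labelled_files(root):
    """Return (file_path, label) pairs, where label is the parent folder name."""
    samples = []
    for class_dir in sorted(p for p in Path(root).iterdir() if p.is_dir()):
        for file_path in sorted(class_dir.iterdir()):
            if file_path.is_file():
                samples.append((file_path, class_dir.name))
    return samples


# Build a tiny throwaway dataset to show the convention.
with tempfile.TemporaryDirectory() as tmp:
    for label, fname in [("discharge_summary", "a.pdf"),
                         ("lab_report", "b.pdf"),
                         ("lab_report", "c.pdf")]:
        class_dir = Path(tmp) / label
        class_dir.mkdir(exist_ok=True)
        (class_dir / fname).touch()

    samples = collect_labelled_files(tmp)
    labels = [label for _, label in samples]

print(labels)  # ['discharge_summary', 'lab_report', 'lab_report']
```

Because labels come from the directory layout, adding a new document class is just a matter of adding a folder.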

Three Approaches (auto-benchmarked)

1. Knowledge Distillation — Teacher–Student

Best for badly degraded or handwritten documents
BioBERT (teacher) → DistilBERT (student)
Highest accuracy; slower inference

2. OCR + FastText + ML — Speed-optimised

Best for high-volume, latency-sensitive jobs
Azure OCR → FastText → Logistic / Random Forest
Excellent on printed docs, reasonable on handwritten

3. Sentence Transformers — Default ⭐

Best accuracy–speed trade-off; most adopted by teams
Azure OCR → all-MiniLM-L6-v2/v6
Strong contextual understanding; handles medical terminology well
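
One way the auto-benchmark step could pick a recommendation is shown below. The selection rule (highest accuracy within an optional latency budget) is an assumption, and the accuracy and latency values are placeholders invented for the example, not measured results from the platform.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class BenchmarkResult:
    approach: str
    accuracy: float
    latency_ms: float


def recommend(results: List[BenchmarkResult],
              max_latency_ms: Optional[float] = None) -> BenchmarkResult:
    """Pick the most accurate approach that fits the latency budget."""
    candidates = [r for r in results
                  if max_latency_ms is None or r.latency_ms <= max_latency_ms]
    if not candidates:
        raise ValueError("no approach meets the latency budget")
    # Ties on accuracy go to the faster approach.
    return max(candidates, key=lambda r: (r.accuracy, -r.latency_ms))


# Placeholder numbers, for illustration only.
results = [
    BenchmarkResult("knowledge_distillation", 0.95, 120.0),
    BenchmarkResult("ocr_fasttext_ml", 0.88, 15.0),
    BenchmarkResult("sentence_transformers", 0.93, 40.0),
]

best_overall = recommend(results).approach
best_under_50ms = recommend(results, max_latency_ms=50.0).approach
print(best_overall, best_under_50ms)
```

A latency budget changes the answer, which mirrors the trade-offs listed for the three approaches.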

Stack

BioBERT · DistilBERT · all-MiniLM-L6-v2 · FastText · Azure OCR · Knowledge Distillation · Scikit-learn

Multi-Model LLM Evaluation Framework

Live in Production

An internal tool to objectively benchmark and compare multiple LLMs on medical document extraction — helping make informed model selection decisions backed by data.

10+ models supported · Automated reports · Medicine validation

How it works

Upload a JSONL or Excel file containing prescription images and expected outputs → select models to run → receive a structured side-by-side comparison report.
Supports GPT-4o (direct API or Azure), Gemini 2.5+ (Vertex AI or direct), and Qwen 2.5 32B/72B (via OpenRouter) — with flexible credential inputs for each provider.
Extracts patient information, prescribed medicines, and diagnosis; validates extracted medicines against the Tata 1mg database for accuracy.
Outputs per-prescription and per-medicine accuracy scores, field coverage rates, and token/API cost breakdown per model.
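
A simplified sketch of the two scores mentioned above: per-medicine accuracy and field coverage. Matching here is exact after lower-casing and whitespace normalisation, which is an assumption; the field names and sample values are hypothetical, and the real tool also validates medicines against the Tata 1mg database.

```python
def normalise(name: str) -> str:
    """Lower-case and collapse whitespace so trivial differences don't count."""
    return " ".join(name.lower().split())


def medicine_accuracy(expected, extracted) -> float:
    """Fraction of expected medicines that the model extracted."""
    expected_set = {normalise(m) for m in expected}
    extracted_set = {normalise(m) for m in extracted}
    if not expected_set:
        return 1.0 if not extracted_set else 0.0
    return len(expected_set & extracted_set) / len(expected_set)


def field_coverage(record: dict, required_fields) -> float:
    """Fraction of required fields that are present and non-empty."""
    filled = sum(1 for field in required_fields if record.get(field))
    return filled / len(required_fields)


expected = ["Paracetamol 500mg", "Amoxicillin 250mg"]
model_output = {
    "patient_name": "A. Kumar",
    "diagnosis": "",
    "medicines": ["paracetamol  500mg", "Ibuprofen 400mg"],
}

accuracy = medicine_accuracy(expected, model_output["medicines"])
coverage = field_coverage(model_output,
                          ["patient_name", "diagnosis", "medicines"])
print(accuracy, round(coverage, 2))  # 0.5 0.67
```

Running the same scoring over every model's output for the same file is what makes the side-by-side report comparable.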

Stack

GPT-4o · Gemini 2.5 · Qwen 2.5 · OpenRouter · Azure OpenAI · Vertex AI · Tata 1mg API

Production Observability — LangFuse Integration

Live in Production

Instrumented the IPD and OPD claim pipelines with full request tracing, cost monitoring, and performance dashboards using LangFuse.

End-to-end tracing · Per-stage cost tracking · Live dashboards

What was added

Call-level tracing across both pipelines — every request, model call, and processing step is captured with full context for debugging and auditing.
Per-stage cost attribution, giving the team visibility into which steps consume the most API budget and where to optimise.
Latency, throughput, and error rate dashboards monitored in production on an ongoing basis.
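
To illustrate per-stage cost attribution, here is a plain-Python aggregation over trace events. It deliberately does not use the LangFuse SDK: the event dictionaries, stage names, and cost values are a hypothetical schema made up for the example.

```python
from collections import defaultdict


def cost_by_stage(trace_events):
    """Sum cost per pipeline stage, most expensive stage first."""
    totals = defaultdict(float)
    for event in trace_events:
        totals[event["stage"]] += event["cost"]
    return dict(sorted(totals.items(), key=lambda kv: kv[1], reverse=True))


# Hypothetical events; in production these would come from the tracing backend.
events = [
    {"stage": "ocr", "cost": 0.25},
    {"stage": "extraction", "cost": 1.5},
    {"stage": "ocr", "cost": 0.25},
    {"stage": "adjudication", "cost": 0.5},
]

totals = cost_by_stage(events)
print(totals)
```

Sorting by total cost puts the stage most worth optimising at the top of the report.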

Stack

LangFuse · Observability · Tracing

Personal Projects

Things I've Built Outside Work

A mix of healthcare, agriculture, industry, and general ML projects.

🏭

NALCO Optimization System

Production software for NALCO that optimises raw-to-finished aluminium processing with ML-based process control. Awarded a Government of India patent; Smart India Hackathon (SIH) 2024 finalist project.

View on GitHub ↗

🏥

MediMind — Multi-Agent Diagnosis

Multi-agent AI system for medical diagnosis using collaborative LLMs with agent orchestration for comprehensive health assessment.

View on GitHub ↗

🌾

Krishi Moolya

An end-to-end ML platform for farmers — combines crop price prediction using market trend analysis, a crop recommendation system based on soil and climate inputs, and actionable insights to help farmers decide what to grow and when to sell for maximum benefit.

View on GitHub ↗

👤

AI Attendance System

Computer vision-based attendance management using facial recognition — real-time processing with anti-spoofing mechanisms.

View on GitHub ↗

🏃

Human Activity Recognition

ANN-based activity classifier trained on Samsung smartwatch sensor data with high temporal accuracy.

View on GitHub ↗

📚

Book Recommender System

Collaborative filtering recommendation engine using matrix factorization for personalised book suggestions.

View on GitHub ↗

View all on GitHub ↗

Skills

Technical Skills

A broad foundation across the AI/ML stack — built through coursework, side projects, and real production work.

🧠 Machine Learning & Deep Learning

Supervised Learning · Unsupervised Learning · PyTorch · Scikit-learn · Computer Vision · NLP · Knowledge Distillation · Hyperparameter Tuning · Model Evaluation

🤖 LLMs & Fine-tuning

PEFT (LoRA / QLoRA) · Quantization · LMDeploy · Prompt Engineering · Qwen 2.5 · InternVL · MedGemma · Microsoft Phi · Unsloth · Hugging Face

🔍 OCR & Documents

Azure OCR · Azure Form Recognizer · Tesseract · PaddleOCR · SuryaOCR · Document Classification

🔗 Embeddings & Vector DBs

Sentence Transformers · all-MiniLM-L6-v2 · FastText · FAISS · Qdrant · Pinecone · RAG Pipelines

🛠️ Frameworks & Tools

LangChain · LangGraph (basic) · LangFuse · FastAPI · Streamlit · MLflow · OpenRouter · Postman

☁️ Cloud & APIs

Azure AI Foundry · Google AI Studio · Google Vertex AI

📊 Data & Analytics

Python · Pandas · NumPy · Matplotlib · EDA · SQL · Excel

🎯 Graph & Network

Memgraph · NetworkX · Graph Analytics

Recognition

Achievements

Hackathons, awards, and milestones along the way.

🏆

Top 5 Best Intern — BFHL 2026

Recognised among the top 5 interns in the Bajaj Finserv Health Limited 2026 batch for project delivery and technical contributions.

🥈

1st Runner-up — HackRx (BFHL Internal)

Runner-up at BFHL's internal hackathon, competing with teams across the organisation.

📜

Government of India Patent

Patent granted for the NALCO aluminium processing optimization software, deployed in production at industrial scale.

🎯

Smart India Hackathon 2024 — Finalist

National-level finalist building real-world industrial AI solutions.

🥈

GDSC Hackathon 2025 — Runner-up

1st runner-up at Google Developer Student Clubs hackathon.

🏁

SVIM Hackathon — Finalist

Advanced to finals demonstrating strong technical problem-solving.

Community

Keeping Up with the Field

Staying current is one of the best ways I stay sharp, and I run a small Instagram page for that.

Contact

Get in Touch

Feel free to reach out — whether it's about a role, a project idea, or just to talk AI/ML.

📧 abhinavg963@gmail.com 💼 LinkedIn 💻 GitHub 📱 @synthixlabs
