Hello, I'm
Balachandra Devarangadi Sunil
Software Engineer · Applied Scientist · MS CS @ UMass Amherst
Software engineer who likes building thoughtful, playful things.
⛰ About Me
Hi! I'm Balachandra, a graduate student at UMass Amherst who loves turning ideas into polished, working software. I care about clean architecture, thoughtful details, and building things that actually feel good to use.
When I'm not coding, I'm usually outdoors, sketching something, or tinkering with side projects like this one.
By training, I'm a machine learning researcher and software engineer with 2+ years of experience spanning product engineering and applied AI. My M.S. work at UMass focused on LLMs, retrieval, and efficient ML systems, including research on controllable memory for LLM agents, RAG pipelines, and edge inference. I'm drawn to questions about how models behave and how to make them genuinely useful in the real world.
Before grad school, I spent time at Epsilon building backend systems and data pipelines that needed to just work at scale, and more recently I've been a Graduate Research Extern at Microsoft exploring memory frameworks for LLM agents. What excites me most is the space where solid engineering meets real research; that's usually where the interesting problems live.
⛰ Experience
Accomplishments
- Architected a dual-latent encoder splitting memories into a stable 'gist' latent and a noisable 'detail' latent, using cosine-scheduled forward diffusion to simulate forgetting and reverse diffusion to simulate cue-conditioned recall.
- Ran ablations across one-step vs. multi-step latent estimation, denoiser training length, and semantic projection modules, improving reconstruction quality (BERTScore F1 up to 0.8251) through targeted architectural changes.
- Evaluated selective recall of PII on a 1.4M-record synthetic privacy dataset (Privasis) and found that the model abstracts or fabricates identity details while reliably preserving event-level structure, a result with direct implications for privacy-preserving memory systems.
Tools
Accomplishments
- Achieved up to 90.9% latency improvement (11x speedup) on individual ResNet50V2 partitions and drove the system-wide result of up to 63.8% mean latency reduction for single-tenant and 77.4% for multi-tenant workloads versus the default Edge TPU compiler.
- Diagnosed and resolved a hard cross-stack correctness problem (inconsistent TPU/CPU execution results traced to TFLite versioning), restoring consistent output and unlocking a validated 24% latency improvement for InceptionV4 pipelines.
- Built a dual-threadpool inference pipeline (separate TPU and CPU thread pools with closed-loop request handoff) to enable accurate throughput benchmarking across split points and CPU thread counts, and generated model-size and partition tables that directly informed the paper's analytic queueing model design.
Tools
Accomplishments
- Built an ensemble architecture where a meta-LLM selects the best answer from structured Chain-of-Thought candidates generated by each retrieval method.
- Implemented dual retrieval backends (BM25 & Elasticsearch + FAISS) to support flexible sparse and dense retrieval across different invocation strategies.
- Solved the challenge of dynamically deciding which retrieval strategy fits a given query, unifying multiple RAG invocation methods into one adaptive, ensemble-driven system.
Tools
Accomplishments
- Designed 32+ conversation stories and NLU training data covering intents like NPA tracking, loan disbursement, regional sales performance, and live loan counts, each with multi-turn slot-filling dialogue flows.
- Integrated Duckling's time-entity extraction to resolve natural language expressions (e.g. 'last quarter', 'same month last year') into structured timestamps for backend queries.
- Built custom Rasa SDK actions that mapped extracted entities and slots (type, region, time, order) into structured backend API calls, with automatic slot resets and fallback handling for robust conversation recovery.
Tools
⛰ Projects
ML / AI
Fullstack / Web Development
Flutter / Mobile
Hardware / Embedded
Desktop Applications
⛰ Wins
📄
Best Paper Award
IEEE DCOSS-IoT · 2026
"Collaborative Processing for Multi-Tenant Inference on Memory-Constrained Edge TPUs"
🥇
Hackathon Winner
Hack(H)er413 · 2026
AI-powered vision assistant that helps visually impaired users shop independently by finding items, reading labels, and understanding products through voice.
🥇
Hackathon Winner
HackUMass · 2025
Halo makes the web safer and calmer, protecting people with photosensitive epilepsy and helping those with ADHD read with ease.
🏆
Olympiad Winner
National Engineering Olympiad · 2021
Secured All India Rank 7 among third-year Computer Science Engineering students in the National Engineering Olympiad 4.0, a competitive, pan-India technical assessment.
⛰ Education
⛰ Publications
Task-Aware LoRA Adapter Composition via Similarity Retrieval in Vector Databases
R. Adsul, B. Devarangadi Sunil, I. Nalawade, S. Govindan
arXiv, 2026
Collaborative Processing for Multi-Tenant Inference on Memory-Constrained Edge TPUs
N. Ng, W. A. Hanafy, P. Kadambi, B. Devarangadi Sunil, A. Gupta, D. Irwin, Y. Simmhan, P. Shenoy
IEEE DCOSS-IoT, 2026
Memory Poisoning Attack and Defense on Memory Based LLM-Agents
B. Devarangadi Sunil, I. Sinha, P. Maheshwari, S. Todmal, S. Mallik, S. Mishra
arXiv, 2026
Density Estimation and Crowd Counting
B. Devarangadi Sunil, R. Venkatesh, S. Todmal
arXiv, 2025
Smart Safety Watch for Elderly People and Pregnant Women
B. Devarangadi Sunil, M. Mysore Sampath, S. Pavan BM, S. Shashank, P. Devaki, A. M.
arXiv, 2023
⛰ Skills
Languages
ML & LLM
Distributed Systems
Databases, Cloud & Infra
Frontend
Backend & Data
⛰ Contact Me
I'd love to hear from you, whether about work, ideas, or a good trail recommendation.
bdevarangadi@umass.edu
© 2026 Balachandra Devarangadi Sunil · Built with Next.js

