Professional portfolio · Balachandra Devarangadi Sunil

⛰ About Me

Hi! I'm Balachandra, a graduate student at UMass Amherst who loves turning ideas into polished, working software. I care about clean architecture, thoughtful details, and building things that actually feel good to use.

When I'm not coding, I'm usually outdoors, sketching something, or tinkering with side projects like this one.

By training, I'm a machine learning researcher and software engineer with 2+ years of experience spanning product engineering and applied AI. My M.S. work at UMass focused on LLMs, retrieval, and efficient ML systems, including research on controllable memory for LLM agents, RAG pipelines, and edge inference. I'm drawn to questions about how models behave and how to make them genuinely useful in the real world.

Before grad school, I spent time at Epsilon building backend systems and data pipelines that needed to just work at scale, and more recently I've been a Graduate Research Extern at Microsoft exploring memory frameworks for LLM agents. What excites me most is the space where solid engineering meets real research; that's usually where the interesting problems live.

⛰ Experience

Accomplishments

Architected a dual-latent encoder splitting memories into a stable 'gist' latent and a noisable 'detail' latent, using cosine-scheduled forward diffusion to simulate forgetting and reverse diffusion to simulate cue-conditioned recall.
Ran ablations across one-step vs. multi-step latent estimation, denoiser training length, and semantic projection modules, improving reconstruction quality (BERTScore F1 up to 0.8251) through targeted architectural changes.
Evaluated selective recall of PII on a 1.4M-record synthetic privacy dataset (Privasis) and found that the model abstracts or fabricates identity details while reliably preserving event-level structure, a result with direct implications for privacy-preserving memory systems.
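
To make the "forgetting" step above concrete, here is a minimal sketch rather than the project's actual code. It assumes the commonly used cosine noise schedule (with offset `s = 0.008`) and applies forward diffusion only to a toy detail latent, leaving the gist latent untouched. The names `cosine_alpha_bar` and `forget`, and the example vectors, are hypothetical.

```python
import math
import random

def cosine_alpha_bar(t, T, s=0.008):
    """Fraction of the original signal kept at step t under a cosine schedule."""
    f = lambda u: math.cos(((u / T + s) / (1 + s)) * math.pi / 2) ** 2
    return f(t) / f(0)

def forget(detail, t, T, rng):
    """Forward-diffuse the detail latent to step t (larger t = more forgetting)."""
    a = cosine_alpha_bar(t, T)
    noise_scale = math.sqrt(max(0.0, 1.0 - a))
    return [math.sqrt(a) * x + noise_scale * rng.gauss(0, 1) for x in detail]

rng = random.Random(0)
gist = [0.5, -1.2, 0.3]      # hypothetical stable latent, never noised
detail = [1.0, 0.0, -1.0]    # hypothetical noisable latent
noisy = forget(detail, t=50, T=100, rng=rng)
```

At `t = 0` the detail latent comes back unchanged, and by `t = T` almost none of the original signal remains. Recall would then run a learned denoiser in the reverse direction, which this sketch does not attempt.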

Tools

PyTorch · Diffusion Models · HuggingFace Transformers · Flan-T5 · Weights & Biases · SLURM · Python

Accomplishments

Cut latency by up to 90.9% (an 11x speedup) on individual ResNet50V2 partitions, driving system-wide mean latency reductions of up to 63.8% for single-tenant and 77.4% for multi-tenant workloads versus the default Edge TPU compiler.
Diagnosed and solved a hard cross-stack correctness problem (inconsistent TPU/CPU execution results, traced to TFLite versioning), restoring consistent output and unlocking a validated 24% latency improvement for InceptionV4 pipelines.
Built a dual-threadpool inference pipeline (separate TPU and CPU thread pools with closed-loop request handoff) to enable accurate throughput benchmarking across split points and CPU thread counts, and generated model-size and partition tables that directly informed the paper's analytic queueing model design.
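
The dual-threadpool idea above can be sketched with the standard library alone. This is an illustration under assumptions, not the benchmark code itself: `tpu_stage` and `cpu_stage` are hypothetical stand-ins for the model prefix and suffix, the TPU pool is assumed to have a single worker, and "closed loop" is taken to mean that each client sends its next request only after the previous one has finished.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def tpu_stage(x):
    # Stand-in for running the model prefix on the Edge TPU.
    return x + 1

def cpu_stage(x):
    # Stand-in for running the model suffix on the CPU.
    return x * 2

def run_closed_loop(n_clients, requests_per_client, cpu_threads):
    """Return per-client outputs and measured throughput (requests/second)."""
    with ThreadPoolExecutor(max_workers=1) as tpu_pool, \
         ThreadPoolExecutor(max_workers=cpu_threads) as cpu_pool, \
         ThreadPoolExecutor(max_workers=n_clients) as clients:

        def client(_client_id):
            outputs = []
            for i in range(requests_per_client):
                # Run the prefix on the TPU pool, then hand off to the CPU pool.
                mid = tpu_pool.submit(tpu_stage, i).result()
                outputs.append(cpu_pool.submit(cpu_stage, mid).result())
            return outputs

        start = time.perf_counter()
        results = list(clients.map(client, range(n_clients)))
        elapsed = time.perf_counter() - start

    return results, (n_clients * requests_per_client) / elapsed

results, throughput = run_closed_loop(n_clients=2, requests_per_client=3, cpu_threads=2)
```

Sweeping `cpu_threads` (and, in a real setup, the split point between the two stages) gives the kind of throughput table the bullet describes.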

Tools

TensorFlow Lite · Edge TPU · Raspberry Pi 5 · Jetson Orin Nano · Python · ONNX · Keras

Accomplishments

Built an ensemble architecture where a meta-LLM selects the best answer from structured Chain-of-Thought candidates generated by each retrieval method.
Implemented dual retrieval backends (BM25 & Elasticsearch + FAISS) to support flexible sparse and dense retrieval across different invocation strategies.
Solved the challenge of dynamically deciding which retrieval strategy fits a given query, unifying multiple RAG invocation methods into one adaptive, ensemble-driven system.
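
The ensemble flow above can be shown in miniature. This is a toy sketch under loud assumptions: the corpus is invented, `overlap_retrieve` is a bag-of-words stand-in for FAISS-backed dense retrieval, and `meta_select` is a simple heuristic standing in for the meta-LLM. Only the BM25 scoring follows a standard formula (the Lucene-style IDF variant).

```python
import math
from collections import Counter

DOCS = [  # hypothetical toy corpus
    "faiss builds dense vector indexes for similarity search",
    "bm25 ranks documents by term frequency and inverse document frequency",
    "elasticsearch serves sparse keyword search at scale",
]

def tokenize(text):
    return text.lower().split()

def bm25_retrieve(query, docs, k1=1.5, b=0.75):
    """Sparse retrieval: index of the document with the highest BM25 score."""
    toks = [tokenize(d) for d in docs]
    avgdl = sum(len(t) for t in toks) / len(toks)
    df = Counter(w for t in toks for w in set(t))

    def score(t):
        tf, s = Counter(t), 0.0
        for w in tokenize(query):
            if w in tf:
                idf = math.log((len(docs) - df[w] + 0.5) / (df[w] + 0.5) + 1)
                s += idf * tf[w] * (k1 + 1) / (tf[w] + k1 * (1 - b + b * len(t) / avgdl))
        return s

    return max(range(len(docs)), key=lambda i: score(toks[i]))

def overlap_retrieve(query, docs):
    """Stand-in for dense retrieval: cosine similarity over word counts."""
    q = Counter(tokenize(query))

    def cosine(doc):
        c = Counter(tokenize(doc))
        dot = sum(q[w] * c[w] for w in q)
        norm = math.sqrt(sum(v * v for v in q.values())) * math.sqrt(sum(v * v for v in c.values()))
        return dot / norm if norm else 0.0

    return max(range(len(docs)), key=lambda i: cosine(docs[i]))

def make_candidate(method, query, doc):
    # A real system would have an LLM write the reasoning and answer fields.
    return {"method": method, "evidence": doc,
            "reasoning": f"'{doc}' was retrieved for '{query}'", "answer": doc}

def meta_select(query, candidates):
    # Stand-in for the meta-LLM: prefer evidence sharing the most query tokens.
    q = set(tokenize(query))
    return max(candidates, key=lambda c: len(q & set(tokenize(c["evidence"]))))

def answer(query):
    retrievers = {"bm25": bm25_retrieve, "dense": overlap_retrieve}
    candidates = [make_candidate(name, query, DOCS[fn(query, DOCS)])
                  for name, fn in retrievers.items()]
    return meta_select(query, candidates)

best = answer("how does bm25 rank documents")
```

The point of the structure is that every retrieval strategy produces a candidate in the same shape, so the selector can compare them without knowing how each was retrieved.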

Tools

BM25 · Elasticsearch · FAISS · LLMs · Python

Accomplishments

Designed 32+ conversation stories and NLU training data covering intents like NPA tracking, loan disbursement, regional sales performance, and live loan counts, each with multi-turn slot-filling dialogue flows.
Integrated Duckling's time-entity extraction to resolve natural language expressions (e.g. 'last quarter', 'same month last year') into structured timestamps for backend queries.
Built custom Rasa SDK actions that mapped extracted entities and slots (type, region, time, order) into structured backend API calls, with automatic slot-resets and fallback handling for robust conversation recovery.
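
The slot-to-API mapping above can be sketched in plain Python, deliberately without the Rasa SDK so the example stays self-contained. Everything here is an assumption for illustration: which slots are required, the `from`/`to` shape of the resolved time interval, the default sort order, and the function names.

```python
REQUIRED_SLOTS = ("type", "region", "time")  # assumption: "order" is optional

def build_request(slots):
    """Map filled slots to backend query parameters.

    Returns (params, missing). params is None when a required slot is empty,
    which the caller can treat as the cue for a fallback prompt.
    """
    missing = [name for name in REQUIRED_SLOTS if not slots.get(name)]
    if missing:
        return None, missing
    params = {
        "type": slots["type"],
        "region": slots["region"],
        "from": slots["time"]["from"],   # assumed interval shape after time resolution
        "to": slots["time"]["to"],
        "order": slots.get("order") or "desc",  # hypothetical default
    }
    return params, []

def reset_slots(slots):
    """Clear every slot so the next question starts from a clean state."""
    return {name: None for name in slots}

slots = {
    "type": "npa",
    "region": "south",
    "time": {"from": "2023-01-01T00:00:00", "to": "2023-04-01T00:00:00"},
    "order": None,
}
params, missing = build_request(slots)
cleared = reset_slots(slots)
```

Returning the list of missing slots, rather than raising, keeps the recovery path simple: the bot can ask for exactly what is missing and try again.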

Tools

Rasa · spaCy · Facebook Duckling · Python · NLU · CRF

⛰ Projects

…and here are more

ML / AI

Fullstack / Web Development

Flutter / Mobile

Hardware / Embedded

Desktop Applications

⛰ Wins

📄

Best Paper Award

IEEE DCOSS-IoT · 2026

"Collaborative Processing for Multi-Tenant Inference on Memory-Constrained Edge TPUs"

🥇

Hackathon Winner

Hack(H)er413 · 2026

AI-powered vision assistant that helps visually impaired users shop independently by finding items, reading labels, and understanding products through voice.

🥇

Hackathon Winner

HackUMass · 2025

Halo makes the web safer and calmer, protecting people with photosensitive epilepsy and helping those with ADHD read with ease.

🏆

Olympiad Winner

National Engineering Olympiad · 2021

Secured All India Rank 7 among third-year Computer Science Engineering students nationwide in the National Engineering Olympiad 4.0, recognized for outstanding performance in a competitive, pan-India technical assessment.

⛰ Education

⛰ Skills

Languages

ML & LLM

Distributed Systems

Databases, Cloud & Infra

Frontend

Backend & Data

⛰ Contact Me

I'd love to hear from you about work, ideas, or a good trail recommendation.

bdevarangadi@umass.edu

GitHub · LinkedIn · Google Scholar · Devpost · Medium

⛰ Experience

Graduate Research Extern

Research Assistant

Research Assistant

Research Engineer

⛰ Projects

Task-Aware LoRA Adapter Composition

Density Estimation and Crowd Counting

REMIND: Diffusion-Based Controllable Memory

…and here are more

Domain LLM Alignment

Edge Inference Pipeline for Multi-Tenant TPUs

Neural Style Transfer

Fugitive Detection Traffic System

Smart Parking

Halo

NotBored

Smart Safety Watch

Hostel Management System

⛰ Education

🎓 University of Massachusetts Amherst

🎓 The National Institute of Engineering, Mysore

⛰ Publications

Task-Aware LoRA Adapter Composition via Similarity Retrieval in Vector Databases

Collaborative Processing for Multi-Tenant Inference on Memory-Constrained Edge TPUs

Memory Poisoning Attack and Defense on Memory Based LLM-Agents

Density Estimation and Crowd Counting

Smart Safety Watch for Elderly People and Pregnant Women
