Engineering Notes

Technical deep dives, architecture decisions, and lessons learned while building production AI systems.

June 10, 2026 · 8 min read

Why I Built OmniSLM

The AI tooling ecosystem is fragmented. Every local LLM project requires stitching together RAG, memory, and agents from scratch. OmniSLM is my attempt at a unified framework.

OmniSLM, Architecture, Python

May 22, 2026 · 12 min read

Building Local AI Systems

Enterprise data privacy requires local execution. Exploring the tradeoffs between Ollama, vLLM, and llama.cpp for production workloads on edge hardware.

Local LLM, Privacy, Infrastructure

April 15, 2026 · 10 min read

Designing Multi-Tenant AI Applications

How to isolate vector databases, manage tenant-specific model configurations, and handle billing boundaries in Spring Boot applications.

Spring AI, System Design, Java

March 8, 2026 · 7 min read

Lessons From Spring AI

Migrating enterprise Java teams to modern AI patterns. What Spring AI gets right, where it falls short compared to LangChain, and how to bridge the gaps.

Spring Boot, Enterprise Architecture

February 18, 2026 · 15 min read

RAG Systems Beyond Tutorials

Standard dense retrieval isn't enough for production. Deep dive into hybrid search (FAISS + BM25), cross-encoder re-ranking, and dynamic chunking strategies.

RAG, FAISS, Machine Learning