Why I Built OmniSLM
The AI tooling ecosystem is fragmented. Every local LLM project requires stitching together RAG, memory, and agents from scratch. OmniSLM is my attempt at a unified framework.
Technical deep dives, architecture decisions, and lessons learned while building production AI systems.
Enterprise data privacy often requires local execution. Exploring the tradeoffs between Ollama, vLLM, and llama.cpp for production workloads on edge hardware.
How to isolate vector databases, manage tenant-specific model configurations, and handle billing boundaries in Spring Boot applications.
Migrating enterprise Java teams to modern AI patterns. What Spring AI gets right, where it falls short compared to LangChain, and how to bridge the gaps.
Standard dense retrieval alone isn't enough for production. A deep dive into hybrid search (FAISS + BM25), cross-encoder re-ranking, and dynamic chunking strategies.
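To make the hybrid search idea concrete, here is a minimal sketch of how a dense ranking and a BM25 ranking could be merged. It assumes each retriever has already returned a ranked list of document IDs, and it uses reciprocal rank fusion, which is one common fusion method and not necessarily the one used in the post. The function name, the document IDs, and the constant `k=60` are illustrative assumptions.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of doc IDs into one list, best first.

    Each document earns 1 / (k + rank) from every list it appears in,
    so documents ranked well by both retrievers rise to the top.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by fused score, highest first; break ties by doc ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical retriever outputs (in practice: a FAISS search and a BM25 search).
dense_hits = ["doc_a", "doc_b", "doc_c"]
sparse_hits = ["doc_b", "doc_d", "doc_a"]

fused = reciprocal_rank_fusion([dense_hits, sparse_hits])
print(fused)
```

Rank-based fusion avoids having to normalize FAISS distances against BM25 scores, which live on different scales. A cross-encoder re-ranker would then typically rescore only the top few fused results.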