Sudhii is the personal brand of Sudesh P, an AI Systems Engineer and M.Tech Computer Science student at SRMIST Chennai, India. He is the creator of OmniSLM, an open-source Python framework for building production AI applications with Small Language Models.

OmniSLM is an open-source Python framework created by Sudesh P (Sudhii) for building production-ready AI apps with Small Language Models. It unifies RAG pipelines, vector memory, agent orchestration, and local inference in one extensible architecture.

PaathAI is an AI-powered academic intelligence platform built by Sudesh P that automatically transcribes lectures, maps content to syllabus topics, and generates searchable summaries with progress analytics.

Is Sudesh P available for AI Engineering roles?

Yes. Sudesh P (Sudhii) is open to AI Engineering roles. Contact via sudhii.in/contact or email mrsudesh17@gmail.com.

Where can I find OmniSLM source code?

OmniSLM is open source on GitHub at github.com/sudeshsudhii/OmniSLM under the MIT license.

Sudesh P — AI Systems Engineer

Creator of OmniSLM. Building production-ready AI applications with Small Language Models.

Focused on RAG pipelines, local-first LLM platforms, agent architectures, and privacy-first AI infrastructure.

Chennai, India

M.Tech Computer Science, SRMIST, Chennai

Creator of OmniSLM

AI Systems Engineer

M.Tech CS @ SRMIST

Python + Go AI Engineering

RAG & Agent Architectures

Open Source Contributor

Featured Release

OmniSLM v0.5 is now available

Introducing native agent orchestration, seamless vLLM continuous batching integration, and enhanced memory providers for complex multi-turn workflows.

Latest Insights

Thoughts on building production-grade AI infrastructure and the shift towards Small Language Models.

AI Systems Engineering

Jun 10, 2026 · 2 min read

Why Small Language Models Matter in Production

A deep dive into why enterprise AI is shifting towards specialized, privacy-first Small Language Models over massive generic APIs.

AI Infrastructure

May 22, 2026 · 2 min read

Building Multi-Tenant AI Systems

Architectural patterns for designing AI infrastructure that securely isolates tenant data while maximizing resource utilization.

RAG

Apr 15, 2026 · 2 min read

RAG vs Fine-Tuning: Choosing the Right Strategy

A comprehensive guide on when to use Retrieval-Augmented Generation versus Fine-Tuning for your AI projects.

Engineering Note: Vector Isolation

"Never trust the LLM prompt to filter tenant data. In multi-tenant RAG architectures, isolation must happen at the physical or metadata layer before the retrieved context ever reaches the inference engine."

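The note above can be illustrated with a minimal sketch in which the tenant filter runs on chunk metadata before any similarity scoring, so another tenant's text can never reach the prompt. Everything here is an assumption for illustration: the `Chunk` type, the `retrieve` function, the tenant names, and the two-dimensional toy embeddings are hypothetical and do not come from OmniSLM or any specific vector database.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    tenant_id: str                 # metadata used for isolation (hypothetical field)
    text: str
    embedding: tuple[float, ...]   # toy vector; a real system would use a model


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(store, tenant_id, query_embedding, k=3):
    # Isolation step: drop other tenants' chunks BEFORE any scoring happens.
    candidates = [c for c in store if c.tenant_id == tenant_id]
    ranked = sorted(
        candidates,
        key=lambda c: cosine(c.embedding, query_embedding),
        reverse=True,
    )
    return ranked[:k]


STORE = [
    Chunk("acme", "Acme refund policy: 30 days.", (1.0, 0.0)),
    Chunk("globex", "Globex refund policy: 14 days.", (1.0, 0.1)),
    Chunk("acme", "Acme shipping takes 5 days.", (0.0, 1.0)),
]

# Only "acme" chunks are eligible, even though a "globex" chunk is a close match.
context = retrieve(STORE, "acme", (1.0, 0.0), k=2)
prompt_context = "\n".join(c.text for c in context)
```

Because the filter is applied to the candidate set rather than requested in the prompt, a prompt injection cannot widen the retrieval scope. In a real vector store the same idea would typically be expressed as a metadata filter or a separate index per tenant.
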
Engineering Note: Async Inference

"Synchronous HTTP requests to an LLM endpoint will eventually bring down your system. Always decouple the web tier from the inference tier using a robust message queue like RabbitMQ."

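A minimal sketch of the decoupling described above, assuming an in-process `asyncio.Queue` as a stand-in for a broker such as RabbitMQ. The `submit`, `worker`, and `fake_inference` names are hypothetical, and the inference call is a placeholder that only echoes the prompt.

```python
import asyncio
import uuid


async def fake_inference(prompt: str) -> str:
    # Placeholder for a slow model call; a real worker would call the engine here.
    await asyncio.sleep(0.01)
    return f"echo: {prompt}"


async def worker(queue: asyncio.Queue, results: dict) -> None:
    # Inference tier: pulls jobs at its own pace, independent of web traffic.
    while True:
        job_id, prompt = await queue.get()
        try:
            results[job_id] = await fake_inference(prompt)
        finally:
            queue.task_done()


async def submit(queue: asyncio.Queue, prompt: str) -> str:
    # Web tier: enqueue and return a job id instead of waiting on the model.
    job_id = str(uuid.uuid4())
    await queue.put((job_id, prompt))
    return job_id


async def main() -> dict:
    queue = asyncio.Queue(maxsize=100)  # bounded, so overload becomes backpressure
    results = {}
    workers = [asyncio.create_task(worker(queue, results)) for _ in range(2)]
    job_ids = [await submit(queue, p) for p in ("hello", "world")]
    await queue.join()  # a real client would poll or receive a callback instead
    for w in workers:
        w.cancel()
    await asyncio.gather(*workers, return_exceptions=True)
    return {job_id: results[job_id] for job_id in job_ids}


outputs = asyncio.run(main())
```

The bounded queue is the design choice that matters here: when the inference tier falls behind, producers wait or are rejected at the queue rather than holding open HTTP connections. An external broker adds durability and lets the two tiers scale separately.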

Other Engineering Work

Case studies of production architectures, from academic intelligence platforms to Web3 supply chains.

SeedTracking

Blockchain-based seed supply chain platform with ML fraud detection.

Spring Boot, React, Solidity, Ethereum

Problem

Fraud and opacity in agricultural seed supply chains cost farmers billions annually. Counterfeit seeds reduce crop yields, and there is no reliable way to verify authenticity.

Outcome

Enables end-to-end traceability of seed batches. An ML model flags anomalous distribution patterns that may indicate fraud.

Architecture Highlights

Smart contracts on Ethereum handle state changes, while IPFS is used for decentralized document storage. An ML service scores fraud risk.
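
As a rough illustration of the split described above, the sketch below models a seed batch whose state changes are validated against an allowed-transition table, while each supporting document is referenced only by a content digest. This is plain Python, not Solidity. The lifecycle states, the `SeedBatch` class, and the use of a SHA-256 hex digest in place of a real IPFS content identifier are all assumptions made for the example.

```python
import hashlib
from dataclasses import dataclass, field

# Hypothetical lifecycle; the real contract's states are not given in the source.
ALLOWED = {
    "PRODUCED": {"CERTIFIED"},
    "CERTIFIED": {"DISTRIBUTED"},
    "DISTRIBUTED": {"SOLD"},
    "SOLD": set(),
}


def content_id(document: bytes) -> str:
    # Stand-in for an IPFS content identifier: a plain SHA-256 hex digest.
    return hashlib.sha256(document).hexdigest()


@dataclass
class SeedBatch:
    batch_id: str
    state: str = "PRODUCED"
    history: list = field(default_factory=list)

    def transition(self, new_state: str, document: bytes) -> None:
        # Reject any move the lifecycle table does not allow.
        if new_state not in ALLOWED[self.state]:
            raise ValueError(f"illegal transition {self.state} -> {new_state}")
        # Only the digest is recorded; the document itself is stored elsewhere.
        self.history.append((self.state, new_state, content_id(document)))
        self.state = new_state


batch = SeedBatch("BATCH-001")
batch.transition("CERTIFIED", b"lab certificate")
batch.transition("DISTRIBUTED", b"dispatch note")
```

Keeping only digests in the state history mirrors the usual reason for pairing a chain with content-addressed storage: the record stays small, and anyone holding the document can recompute the digest to check that it was not altered.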

Local LLM Application

Privacy-first local LLM platform built with Java 21, Spring AI, and MongoDB.

Java 21, Spring Boot, Spring AI, Ollama

Problem

Java enterprise teams need LLM capabilities, but most LLM tooling targets Python, creating a skills gap.

Outcome

Bridges the Java-AI gap. Enterprise teams can integrate LLM features using familiar Spring patterns.

Architecture Highlights

A Spring Boot application using WebFlux for reactive endpoints, MongoDB for session storage, and Spring AI for model orchestration.

PaathAI

AI-driven lecture intelligence platform for transcription, summarization, and progress tracking.

Java, Spring Boot, AI/NLP, Transcription

Problem

Students miss key points in lectures, and there's no structured way to search, review, or track coverage of syllabus topics across sessions.

Outcome

Transforms passive lecture recordings into structured, searchable knowledge bases with syllabus alignment.

Architecture Highlights

An AI platform that processes lecture audio, maps content to syllabus topics, and provides searchable summaries with progress analytics.
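
The syllabus-mapping step can be sketched with simple keyword overlap. This is an assumption made for illustration, not PaathAI's actual method, which the source does not describe. The `SYLLABUS` topics, their keywords, and the sample transcript segments are all hypothetical.

```python
import re

# Hypothetical syllabus: topic name -> keywords that signal the topic.
SYLLABUS = {
    "Sorting": {"sort", "quicksort", "mergesort", "pivot"},
    "Graphs": {"graph", "vertex", "edge", "bfs", "dfs"},
    "Hashing": {"hash", "collision", "bucket"},
}


def tokens(text):
    """Lowercased alphabetic words in the text, as a set."""
    return set(re.findall(r"[a-z]+", text.lower()))


def map_segment(segment):
    # Pick the topic sharing the most keywords; None when nothing matches.
    words = tokens(segment)
    best, best_hits = None, 0
    for topic, keywords in SYLLABUS.items():
        hits = len(words & keywords)
        if hits > best_hits:
            best, best_hits = topic, hits
    return best


transcript = [
    "Today we pick a pivot and see how quicksort partitions the array.",
    "A graph has a vertex set and an edge set; bfs visits level by level.",
    "Reminder: the assignment is due on Friday.",
]

mapping = [map_segment(s) for s in transcript]
covered = {t for t in mapping if t}
coverage = len(covered) / len(SYLLABUS)  # fraction of syllabus topics touched
```

Here two of the three topics are covered, and the administrative reminder maps to no topic. A production system would more plausibly compare embeddings than exact keywords, but the progress metric (topics covered over topics in the syllabus) would take the same shape.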

RAG System for Local LLM

Privacy-preserving Retrieval-Augmented Generation pipeline using FAISS and Ollama.

Python, FAISS, Ollama, Sentence Transformers

Problem

Organizations with sensitive documents can't use cloud-based AI services due to data privacy and compliance requirements.

Outcome

Enables AI-powered document Q&A for privacy-sensitive organizations. Documents are processed locally, so their contents never leave the machine.

Architecture Highlights

A pipeline that ingests documents, chunks them, embeds them locally using SentenceTransformers, and stores them in FAISS. Ollama handles LLM inference.
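
The pipeline stages above (chunk, embed, index, retrieve, prompt) can be sketched with the standard library alone. To stay runnable without extra dependencies, this sketch makes three substitutions that should be read as assumptions: a bag-of-words counter stands in for a SentenceTransformers embedding, a brute-force `LocalIndex` class stands in for a FAISS index, and the final prompt is only assembled, not sent to Ollama. All function and class names are hypothetical.

```python
import math
import re
from collections import Counter


def chunk(text, size=8):
    # Fixed-size word windows; real pipelines often add overlap between chunks.
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def embed(text):
    # Toy bag-of-words vector standing in for a learned sentence embedding.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def similarity(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class LocalIndex:
    # Brute-force nearest-neighbour search standing in for a vector index.
    def __init__(self):
        self.items = []

    def add(self, text):
        self.items.append((text, embed(text)))

    def search(self, query, k=1):
        q = embed(query)
        ranked = sorted(self.items, key=lambda it: similarity(q, it[1]), reverse=True)
        return [text for text, _ in ranked[:k]]


document = (
    "Employees accrue twenty days of paid leave each year. "
    "Expense reports must be filed within thirty days of purchase."
)
index = LocalIndex()
for piece in chunk(document):
    index.add(piece)

question = "How many days of paid leave do employees get?"
context = index.search(question, k=1)
# The assembled prompt would be passed to the local model for generation.
prompt = f"Answer using only this context:\n{context[0]}\n\nQuestion: {question}"
```

Every stage runs in-process, which is the property the case study depends on: neither the documents nor the query are sent to an external service at any point.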