5 5 days ago

This model is a beginner-focused fine-tuning project that adapts Meta's Llama-3.1 8B Instruct base into a specialized technical expert for the Vellium search engine. Trained via Unsloth and 4-bit QLoRA on structural architectural data,

tools
ollama run kathitjoshi/vellium

Applications

Claude Code
Claude Code ollama launch claude --model kathitjoshi/vellium
Codex App
Codex App ollama launch codex-app --model kathitjoshi/vellium
OpenClaw
OpenClaw ollama launch openclaw --model kathitjoshi/vellium
Hermes Agent
Hermes Agent ollama launch hermes --model kathitjoshi/vellium
Codex
Codex ollama launch codex --model kathitjoshi/vellium
OpenCode
OpenCode ollama launch opencode --model kathitjoshi/vellium

Models

View all →

1 model

vellium:latest

4.9GB · 128K context window · Text · 5 days ago

Readme

What is Vellium? Vellium is a modern, AI-augmented search engine and web utility built from the ground up to challenge traditional search friction. You can check out the live implementation at here and explore the open-source codebase directly on GitHub at here and the repo for this model can be found here

Instead of relying on traditional web scraping methods (which easily get flagged and blocked by anti-bot frameworks on cloud hosting providers), Vellium uses an incredibly smart architecture: it concurrently queries 8 completely open developer APIs (including Wikipedia, StackOverflow, MDN Web Docs, Dev.to, GitHub, HackerNews, Crossref, and OpenLibrary) the exact millisecond a user types a query. It passes those live results along to an Express.js backend where a Contextual Ranking Engine reorganizes them based on what the user is trying to find (like boosting coding docs for technical questions, or research papers for academic queries). Simultaneously, a Multimodal Intent Engine catches specific needs right away, automatically injecting rich graphical widgets for direct mathematics, live weather data, images, or dictionary definitions without needlessly wasting AI resources. Finally, it uses an isolated iframe rendering pipeline to cleanly and securely display a beautiful, unified workspace complete with smooth fluid animations.

The Vellium AI Model (A Beginner’s Fine-Tuning Milestone) To take the system a step further, the search assistant experience has been customized using a beginner fine-tuned LLM project. As a first hands-on dive into open-source machine learning lifecycle pipelines, this project serves as a foundational step toward understanding local deployment and model optimization.

Starting with Meta’s open-source Llama-3.1 8B Instruct base, the model was fine-tuned using a hardware-conscious approach inside a free cloud environment. By utilizing Unsloth and 4-bit QLoRA acceleration, the model’s massive baseline footprint was heavily optimized, targeting and training only about 1% of the model’s total network weights. The training dataset consisted of structural question-and-answer pairs mapping out the exact technical specs, code files (like server.ts), and architectural constraints of the Vellium platform.

The end result was successfully compiled into a lightweight 4-bit quantized GGUF format (q4_k_m) and pushed directly to Hugging Face. This makes the specialized model fully accessible to the public, allowing anyone to download the package and chat with a dedicated, offline Vellium technical expert using local runtimes like Ollama or Open WebUI.