ollama run fredrezones55/Jan-v3
Source: Hugging Face

Jan-v3-4B-base-instruct is a 4B-parameter model obtained via post-training distillation from a larger teacher, transferring capabilities while preserving general-purpose performance on standard benchmarks. The result is a compact, ownable base that is straightforward to fine-tune, broadly applicable, and minimizes the usual capacity–capability trade-offs.
Building on this base, Jan-Code, a code-tuned variant, will be released soon.
This repo contains the BF16 version of Jan-v3-4B-base-instruct, which has the following features:
- Type: Causal Language Model
- Training Stage: Pretraining & Post-training
- Number of Parameters: 4B in total
- Number of Layers: 36
- Number of Attention Heads (GQA): 32 for Q and 8 for KV
- Context Length: 262,144 tokens natively
Intended Use

A Jan-v3 demo is hosted in the Jan Browser at chat.jan.ai. The model is also optimized for direct integration with Jan Desktop; select it in the app to start using it.
Using vLLM:
vllm serve janhq/Jan-v3-4B-base-instruct \
--host 0.0.0.0 \
--port 1234 \
--enable-auto-tool-choice \
--tool-call-parser hermes
Using llama.cpp:
llama-server --model Jan-v3-4B-base-instruct-Q8_0.gguf \
--host 0.0.0.0 \
--port 1234 \
--jinja \
--no-context-shift
For optimal performance in agentic and general tasks, we recommend the following inference parameters:
temperature: 0.7
top_p: 0.8
top_k: 20
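Both serve commands above expose an OpenAI-compatible API on port 1234. As a minimal sketch of how a client might query that endpoint with the recommended sampling parameters, here is a Python example using only the standard library (the endpoint path and model name follow the vLLM command above; adjust the host/port if you changed them):

```python
import json
import urllib.request

# Local vLLM or llama.cpp server started as shown above (port 1234).
BASE_URL = "http://localhost:1234/v1/chat/completions"


def build_request(prompt: str) -> dict:
    """Build an OpenAI-compatible chat request body using the
    recommended sampling parameters (temperature 0.7, top_p 0.8,
    top_k 20)."""
    return {
        "model": "janhq/Jan-v3-4B-base-instruct",
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.7,
        "top_p": 0.8,
        "top_k": 20,
    }


if __name__ == "__main__":
    req = urllib.request.Request(
        BASE_URL,
        data=json.dumps(build_request("Hello!")).encode(),
        headers={"Content-Type": "application/json"},
    )
    # Requires a running server; prints the model's reply.
    with urllib.request.urlopen(req) as resp:
        print(json.load(resp)["choices"][0]["message"]["content"])
```

Note that `top_k` is accepted by vLLM and llama.cpp as an extension to the OpenAI schema; strict OpenAI clients may need to pass it via an extra-body mechanism instead.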