Jan V1 Setup Guide
What is Jan V1?
Jan V1 is a 4B-parameter language model that achieves 91.1% accuracy on SimpleQA through reinforcement learning with verifiable rewards (RLVR). It is optimized for searching and synthesizing information from the internet.
This guide covers the basic setup and usage of Jan V1.
System Requirements
- Memory: 8 GB RAM minimum (Q4 quantization), 16 GB RAM recommended (Q8 quantization)
- Hardware: CPU or GPU
- Storage: 2.5 GB to 4.28 GB depending on quantization level
Basic Usage with Jan Desktop
- Download the latest version of Jan Desktop from jan.ai
- Install and launch the application
- Select and download Jan V1 from the model list. Q8_0 is recommended for best performance.
- Start chatting - no additional configuration needed!

Search Setup
To enable search capabilities in Jan Desktop:
- Ensure you are running a version of Jan Desktop that supports MCP (0.6.8 or higher)
- Go to Settings → General → Experimental Features → On
- Go to Settings → MCP Servers → enable a search-related MCP server (e.g., Serper)
- Enter your Serper API key in the MCP Servers settings. You can get an API key from serper.dev; a verification sketch follows this list.

- Leave Jan's instruction field empty and use the recommended sampling parameters (see below). Jan V1 ships with baked-in instructions, so remove any custom instructions from the Jan Assistant to get the best performance from the model.

- After that, you can choose Jan V1 as your model and start using search capabilities in Jan!
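To confirm your Serper key works outside Jan, you can query Serper's documented search endpoint directly. A minimal sketch (replace YOUR_API_KEY with your own key):

curl https://google.serper.dev/search \
-H "X-API-KEY: YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{"q": "Jan V1 language model"}'

A JSON response with search results means the key is valid; a 403 means the key is missing or wrong (see Troubleshooting).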
Available Quantizations
Choose based on your memory constraints:
- Q4_K_M: 2.5 GB - Good balance of size and quality
- Q5_K_M: 2.89 GB - Better quality, slightly larger
- Q6_K: 3.31 GB - Near-full quality
- Q8_0: 4.28 GB - Highest quality quantization (recommended for best performance)
Recommended Sampling Parameters
- temperature: 0.6
- top_p: 0.95
- top_k: 20
- min_p: 0.0
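When calling the model through an OpenAI-compatible server (see Server Setup below), these parameters go in the request body. A minimal sketch, assuming a local server on port 1234 as configured in this guide; note that top_k and min_p are not standard OpenAI fields, but both vLLM and llama.cpp accept them:

curl http://localhost:1234/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "janhq/Jan-v1-4B",
"messages": [{"role": "user", "content": "What is RLVR?"}],
"temperature": 0.6,
"top_p": 0.95,
"top_k": 20,
"min_p": 0.0
}'

Set the model field to whatever your server reports at /v1/models (llama.cpp may use the GGUF file name instead of the Hugging Face ID).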
Server Setup
vLLM
For vLLM, the model ships with the recommended sampling parameters and a system prompt baked in by default to match benchmark performance, so you don't need to set them manually.
vllm serve janhq/Jan-v1-4B \
--host 0.0.0.0 \
--port 1234 \
--enable-auto-tool-choice \
--tool-call-parser hermes \
--reasoning-parser qwen3
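Once the server is running, you can confirm the model is loaded by listing the served models (assumes the host and port from the command above):

curl http://localhost:1234/v1/models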
llama.cpp
- Install llama.cpp following the instructions in the llama.cpp repository
- Download Jan-v1-4B-Q8_0.gguf from Hugging Face
- Run the following command:
llama-server --model path/to/Jan-v1-4B-Q8_0.gguf \
--host 0.0.0.0 \
--port 1234 \
--jinja \
--no-context-shift
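If you prefer the command line, the quantized file can also be fetched with huggingface-cli. A sketch, assuming the GGUF files are published under a janhq/Jan-v1-4B-GGUF repository (verify the actual repo name on the model's Hugging Face page):

# Repo name below is an assumption - check the model page on Hugging Face
huggingface-cli download janhq/Jan-v1-4B-GGUF Jan-v1-4B-Q8_0.gguf --local-dir ./models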
Connect Jan with Jan-V1 Server
If using vLLM or llama.cpp, the model will be available at:
- Endpoint: http://localhost:1234/v1
- API Format: OpenAI-compatible
This endpoint can then be added to the Jan application as an OpenAI-compatible model provider.

Troubleshooting
- Out of Memory: Try a lower quantization level (Q4 instead of Q8)
- Slow Performance: Ensure adequate RAM and consider GPU acceleration
- Model Not Loading: Verify file paths and permissions
- API Connection: Check if the server is running on the correct port
- Infinite Loop: Switch to a higher-quality quantization (Q8_0); heavily quantized models are more prone to repetition loops
- Missing MCP Section: Ensure you are running Jan Desktop 0.6.8 or higher and that Experimental Features are enabled
- Error 403 when using search: Ensure you have entered your Serper API key in the MCP Servers settings. You can get an API key from serper.dev