DeepSeek, Qwen, Mistral, and Open-Source Models · Lesson 1
A Map of Open-Weight Models
DeepSeek, Qwen, Mistral, Llama — who's strong at what and how to think about it.
Why open weights
An open-weight model is a model whose weights you can download and run locally or in your own cloud. This gives you:
- Full privacy.
- Cost control.
- The ability to fine-tune for your own tasks.
- Independence from an API provider.
The main families
- DeepSeek — strong reasoning, affordable APIs. R1 / V3 / V3-base. Especially good at code and math.
- Qwen (Alibaba) — multilingualism, vision, coding. Many versions in open weights.
- Mistral / Mixtral — European models, a good compromise of performance and quality.
- Llama (Meta) — the old open-source "base", still evolving.
- Gemma (Google) — lightweight models, convenient for local runs.
- Phi (Microsoft) — small ones, for CPU/mobile scenarios.
The selection principle
If you run locally — look at your GPU and memory. If via the API — look at price and quality on your tasks. Don't trust benchmarks as the final truth — always test on your own scenarios.
Practical exercise
What to do after this lesson
Open OpenRouter / Ollama. Run the same prompt on DeepSeek, Qwen, and Mistral. Note who's better for your task.
Ready-to-use prompt
Template for this lesson
Copy and adapt to your context. Text in angle brackets should be replaced.
Help me choose an open-weight model. Task: <…> Where I run it (locally / API / cloud): <…> Available GPU memory: <…> Constraints: <…> Give a recommendation from DeepSeek / Qwen / Mistral / Llama / Gemma with reasoning.
Common mistakes
What people get wrong
- They choose by benchmarks without testing on their own tasks.
- They run a huge model on a small GPU — they get a crawl.
- They ignore the open-weights license (there are restrictions).
Pro tips
What works but no one documents
- Quantized versions (Q4/Q5) often lose almost no quality and save a lot of memory.
- OpenRouter — a fast way to test models before choosing.
- Keep a "reference set of tasks" for regular re-testing.
When to use
Privacy, cost control, fine-tune for a niche.
When not to use
When you need maximum quality and privacy doesn't matter — the flagships are still stronger.
Official sources
Квиз — 2 вопроса
1.Why do you need open-weight models?
2.DeepSeek is known primarily for:
Отвечено: 0 из 2