AI builder self-host stack
4 open-source picks · replaces 4 SaaS · self-host on your own hardware or a rented GPU box
Persona. Developer building LLM-powered apps who wants local inference + a chat UI + IDE assist + translation, all running on their own hardware or a rented GPU box.
Why these together
This is the 'I want to build with LLMs without paying OpenAI per token' stack. Ollama runs the models on your hardware (consumer GPU box or rented A100); Open WebUI is the ChatGPT-equivalent UI that talks to Ollama and lets non-technical teammates use the same models you're prototyping with; Continue is the IDE-side companion that wires the same Ollama endpoint into VS Code or JetBrains for completions and chat; LibreTranslate handles language pairs without round-tripping to DeepL.

Cross-tool integration is one URL: every component points at `http://ollama:11434` and shares the loaded model.

The honest cost call: this stack is hardware-bound, not VPS-bound. Budget ~$1,500 for a 24 GB GPU box, or about $0.50/hr for a rented A100 when you actually need scale.
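The one-URL wiring can be sketched as a minimal docker-compose file. This is a starting point, not a hardened deployment: image tags, the published port, and volume names are assumptions based on each project's defaults, so check the upstream docs before relying on them.

```yaml
services:
  ollama:
    image: ollama/ollama          # serves the API on 11434 inside the network
    volumes:
      - ollama:/root/.ollama      # persist downloaded model weights
  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    environment:
      # Point the UI at the Ollama service by its compose DNS name.
      - OLLAMA_BASE_URL=http://ollama:11434
    ports:
      - "3000:8080"               # browse to http://localhost:3000
    depends_on:
      - ollama

volumes:
  ollama:
```

Continue reuses the same endpoint: configure its Ollama provider with `http://localhost:11434` (or the compose hostname from inside the network) and both the IDE and the chat UI hit one shared model.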
The 4 picks
| Pick | Replaces | Cost / setup | Health |
|---|---|---|---|
| ollama/ollama · MIT | OpenAI API (LLM inference API) | $200/mo+ · easy, 5 min, single binary | alive |
| open-webui/open-webui · BSD-3-Clause | ChatGPT (AI chat assistant, consumer UI) | $10/mo+ · easy, 10 min, docker-compose (Open WebUI + Ollama) | alive |
| continuedev/continue · Apache-2.0 | GitHub Copilot (AI code completion / chat in the IDE) | $0/mo+ · easy, 10 min, VS Code or JetBrains plugin + a local model | alive |
| LibreTranslate/LibreTranslate · AGPL-3.0 | DeepL (machine translation API) | $10/mo+ · easy, 10 min, docker run (model downloads on first start) | alive |
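Because every pick speaks plain HTTP, the whole stack is scriptable with the standard library alone. A minimal sketch, assuming Ollama on its default port 11434 and LibreTranslate on its default port 5000; endpoint paths and field names follow each project's public API, and the model name `llama3` is purely illustrative:

```python
import json
from urllib import request

OLLAMA_URL = "http://localhost:11434"         # Ollama's default port
LIBRETRANSLATE_URL = "http://localhost:5000"  # LibreTranslate's default port


def ollama_generate_payload(model: str, prompt: str) -> bytes:
    """JSON body for Ollama's /api/generate endpoint.

    stream=False asks for a single JSON object instead of chunked lines.
    """
    return json.dumps({"model": model, "prompt": prompt,
                       "stream": False}).encode()


def libretranslate_payload(text: str, source: str, target: str) -> bytes:
    """JSON body for LibreTranslate's /translate endpoint."""
    return json.dumps({"q": text, "source": source, "target": target,
                       "format": "text"}).encode()


def post_json(url: str, body: bytes) -> dict:
    """POST a JSON body and decode the JSON response."""
    req = request.Request(url, data=body,
                          headers={"Content-Type": "application/json"})
    with request.urlopen(req) as resp:
        return json.load(resp)


def demo() -> None:
    # Requires both services running; call this only with the stack up.
    answer = post_json(OLLAMA_URL + "/api/generate",
                       ollama_generate_payload("llama3", "Say hi in one word."))
    print(answer["response"])
    translated = post_json(LIBRETRANSLATE_URL + "/translate",
                           libretranslate_payload(answer["response"], "en", "de"))
    print(translated["translatedText"])
```

The payload builders are pure functions, so you can unit-test your wiring without a GPU in the loop; only `demo()` needs the services up.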
Other stacks
- Indie Hacker self-host stack · 5 picks
- Remote team self-host stack · 5 picks
- Customer support team self-host stack · 4 picks
- Dev platform self-host stack · 5 picks
- Observability on $5 self-host stack · 5 picks
- Marketing team self-host stack · 5 picks
- Product team self-host stack · 5 picks
- Privacy-first self-host stack · 5 picks