Self-Hosting
Host your Quack suite on your own infrastructure
Backend API
Prerequisites
Whichever installation method you choose, you'll need at least the following installed:
- Docker (and Docker Compose, which ships with recent Docker releases; install it separately if you're on an older version)
- An NVIDIA GPU and the NVIDIA Container Toolkit

We recommend at least 5 GB of VRAM on your GPU for a good performance/latency balance. Please note that by default, this will run your LLM locally (available offline), but if you don't have a GPU, you can use online LLM providers (currently supported: Groq, OpenAI).
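Before going further, you can verify that Docker can actually reach your GPU through the NVIDIA Container Toolkit. This is a standard sanity check, not a Quack-specific step (the CUDA image tag is only an example; any recent tag works):

```shell
# Should print your GPU details via nvidia-smi from inside a container
docker run --rm --gpus all nvidia/cuda:12.3.2-base-ubuntu22.04 nvidia-smi
```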
60 seconds setup ⏱️
This method is the easiest way to try the suite out locally. Follow the instructions over here.
Production setup
This method is better suited to an isolated/remote environment where you don't necessarily want Git to interfere. With the default configuration, you'll need to own a domain name (e.g. "mydomain.com") and point a DNS record for your API subdomain at your server, so that the HTTPS certificate can be issued.
Create a `.env` file
Save this example file and name it `.env`.
Edit your environment variables
Edit your `.env` and replace the following values:
- `SUPERADMIN_GH_PAT`: your GitHub Personal Access Token, used to authenticate you as the admin. Head over to your Developer settings on GitHub, click "Generate new token", pick a name and an expiration, and confirm with "Generate token" (no extra permissions needed, i.e. read-only).
- Pick secure passwords for `POSTGRES_PASSWORD`, `SUPERADMIN_PWD` and `GF_ADMIN_PWD`.
- `ACME_EMAIL`: the email linked to your certificate for HTTPS.
- `POSTGRES_HOST` & `POSTGRES_PORT`: the host and port of your remote PostgreSQL database service.
- `BACKEND_HOST`: the subdomain where your users will access your API (e.g. "api.mydomain.com").
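For reference, a filled-in `.env` could look like the sketch below. The variable names come from this guide, but every value is a placeholder and the linked example file remains the source of truth. Secure passwords can be generated with e.g. `openssl rand -hex 32`:

```shell
# .env — placeholder values, replace with your own
SUPERADMIN_GH_PAT=ghp_xxxxxxxxxxxxxxxxxxxx
SUPERADMIN_PWD=change-me
POSTGRES_PASSWORD=change-me
GF_ADMIN_PWD=change-me
ACME_EMAIL=admin@mydomain.com
POSTGRES_HOST=db.mydomain.com
POSTGRES_PORT=5432
BACKEND_HOST=api.mydomain.com
```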
If you want to edit other aspects, check the environment variable descriptions.
Define a Docker orchestration
Save this docker compose configuration and name it `docker-compose.yml`.
You can comment out the `deploy` section of the ollama service (see the excerpt below) if you wish to run the LLM on your CPU instead.
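For context, the relevant part of the ollama service typically looks like this. It's an illustrative excerpt, not the full file — the linked `docker-compose.yml` is authoritative — but the `deploy` block is what reserves the GPU:

```yaml
services:
  ollama:
    image: ollama/ollama:latest
    volumes:
      - ollama:/root/.ollama   # persist downloaded model weights
    # Comment out the whole `deploy` section below to run on CPU instead
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]

volumes:
  ollama:
```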
Setting the certificate access permission
Before starting our Docker setup, we need to make sure your ACME certificate file has the permissions Traefik expects (it must be readable by its owner only):
```shell
touch acme.json
sudo chmod 600 acme.json
```
Run the service
You should now have two files (`.env`, `docker-compose.yml`) in your folder. Time to start the services:

```shell
docker compose up -d --wait ollama backend traefik
```
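You can then check that everything came up correctly with standard Docker Compose commands (nothing Quack-specific here):

```shell
docker compose ps                 # the three services should be running/healthy
docker compose logs -f backend    # follow the backend logs
```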
Using your API
Bravo, you now have a fully running service! Start your VSCode extension, open the command palette and look for "Quack Companion: Set API endpoint", then paste the URL of your API endpoint.
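To confirm the endpoint is reachable over HTTPS before wiring up the extension, a quick probe is enough (here `api.mydomain.com` stands in for whatever you set as `BACKEND_HOST`):

```shell
# -I fetches headers only: any HTTP response proves DNS, TLS and routing work
curl -I https://api.mydomain.com
```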
Additional options
There are additional options to customize your service; here are a few:
Database hosting
Instead of hosting your PostgreSQL database yourself, you can use a hosting service like Supabase. You only need to replace the values of `POSTGRES_HOST` & `POSTGRES_PORT` in your `.env` file.
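For instance, with a managed instance the relevant `.env` lines might become the following (the host is a placeholder; use the connection details your provider gives you):

```shell
# .env — point the backend at a managed PostgreSQL instance
POSTGRES_HOST=db.your-project-ref.supabase.co
POSTGRES_PORT=5432
```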
LLM selection
We use Ollama to serve LLMs. Edit `OLLAMA_MODEL` to use other models from the hub.
For a good performance/latency balance, we recommend one of the following models: `dolphin-mistral:7b-v2.6-dpo-laser-q4_K_M`, `deepseek-coder:6.7b-instruct-q4_K_M`.
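Switching models is a one-line change in your `.env`, followed by recreating the affected services (which service consumes `OLLAMA_MODEL` depends on the compose file, so restarting both is the safe option):

```shell
# .env
OLLAMA_MODEL=deepseek-coder:6.7b-instruct-q4_K_M
```

Then apply it with `docker compose up -d --wait ollama backend`.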
Please don't pick hardware or models that are oversized for your needs: you'll preserve both your hardware's life expectancy and your energy bill 💚
Application performance monitoring
When you start running your service under a high workload, you might want to monitor its performance. Your Docker Compose file includes additional services, using Prometheus & Grafana, that give you a proper APM dashboard.
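Assuming the compose file names these services `prometheus` and `grafana` (check yours, as the names may differ), you can bring them up alongside the rest:

```shell
docker compose up -d --wait prometheus grafana
```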