Backend API

Prerequisites

Whichever installation method you pick, you’ll need at least the following installed:

  1. Docker (and Docker Compose if you’re using an older Docker version that doesn’t bundle it)
  2. NVIDIA Container Toolkit and a GPU

We recommend at least 5GB of VRAM on your GPU for a good performance/latency balance. Please note that by default, this will run your LLM locally (available offline); if you don’t have a GPU, you can use online LLM providers instead (currently supported: Groq, OpenAI).
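
If you’re unsure whether the NVIDIA Container Toolkit is wired up correctly, a quick sanity check is to run nvidia-smi from inside a container (the CUDA image tag below is only an example):

docker run --rm --gpus all nvidia/cuda:12.3.1-base-ubuntu22.04 nvidia-smi

If your GPU details are printed, Docker can access the GPU.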

60-second setup ⏱️

This is the easiest way to try it out locally. Follow the instructions over here.

Production setup

This method is better suited if you have an isolated/remote environment where you don’t necessarily want Git to interfere. With the default configuration, you’ll need to own a domain name (e.g. “mydomain.com”).

1. Create a `.env` file

Save this example file and name it .env.

2. Edit your environment variables

Edit your .env and replace the following values:

  • SUPERADMIN_GH_PAT: your GitHub Personal Access Token to authenticate you as the admin. Head over to your Developer settings on GitHub, click “Generate new token”, pick a name and an expiration, and confirm with “Generate token” (no extra permissions needed, i.e. read-only)
  • POSTGRES_PASSWORD, SUPERADMIN_PWD, GF_ADMIN_PWD: pick a secure password for each of these
  • ACME_EMAIL: the email linked to your certificate for HTTPS
  • POSTGRES_HOST & POSTGRES_PORT: the host and port of your remote PostgreSQL database service.
  • BACKEND_HOST: the subdomain where your users will access your API (e.g. “api.mydomain.com”)

If you want to edit other aspects, check the environment variable descriptions.
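
For reference, here’s a sketch of what your edited .env could look like once those values are filled in (all values below are placeholders, not defaults):

SUPERADMIN_GH_PAT=ghp_xxxxxxxxxxxxxxxxxxxx
SUPERADMIN_PWD=pick-a-strong-password
POSTGRES_PASSWORD=pick-another-strong-password
GF_ADMIN_PWD=pick-yet-another-strong-password
ACME_EMAIL=you@mydomain.com
POSTGRES_HOST=your-postgres-host
POSTGRES_PORT=5432
BACKEND_HOST=api.mydomain.com

The other variables from the example file can keep their default values unless you have a reason to change them.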

3. Define a Docker orchestration

Save this Docker Compose configuration and name it docker-compose.yml. You can comment out the deploy section of the ollama service if you wish to run the LLM on your CPU instead.
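
For reference, the deploy section in question uses the standard Compose syntax for reserving a GPU through the NVIDIA runtime; it should look roughly like this (the exact content of your docker-compose.yml may differ slightly):

ollama:
  # ... other keys omitted
  deploy:
    resources:
      reservations:
        devices:
          - driver: nvidia
            count: 1
            capabilities: [gpu]

Commenting out these lines lets the ollama container fall back to CPU inference.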

4. Set the certificate access permissions

Before starting the Docker services, we need to make sure the ACME certificate file exists and has the permissions Traefik expects:

touch acme.json
sudo chmod 600 acme.json

5. Run the service

You should now have three files (.env, docker-compose.yml, acme.json) in your folder. Time to start the services:

docker compose up -d --wait ollama backend traefik
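
You can check that the containers are healthy and follow the backend logs if something looks off:

docker compose ps
docker compose logs -f backend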

Using your API

Bravo, you now have a fully running service! Start your VSCode extension, open the command palette and look for “Quack Companion: Set API endpoint”, where you’ll need to paste the URL of your API endpoint (the BACKEND_HOST you configured, e.g. “https://api.mydomain.com”).

Additional options

There are additional options to customize your service; here are a few:

Database hosting

Instead of hosting your PostgreSQL database yourself, you can use a managed hosting service like Supabase. You only need to replace the values of POSTGRES_HOST & POSTGRES_PORT in your .env file.
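
For instance, with a managed provider the two values simply point at the remote instance (the hostname below is a made-up placeholder):

POSTGRES_HOST=db.your-project-ref.supabase.co
POSTGRES_PORT=5432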

LLM selection

We use Ollama to serve LLMs. Edit OLLAMA_MODEL to use other models from the hub. For a good performance/latency balance, we recommend one of the following models: dolphin-mistral:7b-v2.6-dpo-laser-q4_K_M, deepseek-coder:6.7b-instruct-q4_K_M. Please don’t pick oversized hardware or models for your needs, to preserve both your hardware’s life expectancy and your energy bill 💚
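
For instance, switching models is a one-line change in your .env, followed by recreating the ollama service so the change is picked up:

OLLAMA_MODEL=deepseek-coder:6.7b-instruct-q4_K_M

docker compose up -d --wait ollama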

Application performance monitoring

When you start running your service under a high workload, you might want to monitor its performance. You’ll find additional services in your Docker Compose configuration using Prometheus & Grafana to give you a proper APM dashboard.
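
Assuming the monitoring services are named prometheus and grafana in the provided docker-compose.yml (check the file to confirm the exact names), you can start them alongside the rest of the stack:

docker compose up -d --wait prometheus grafana

The Grafana admin password is the GF_ADMIN_PWD value you set in your .env.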