Docker

The Docker image is the SQLFlow On-Premise edition. It contains every feature available in the Cloud version — lineage analysis, all 20+ supported database vendors, the web UI, the REST API, SQL file upload, widget embedding, and so on. The image runs entirely inside your own infrastructure, so your SQL never leaves your network.

System Requirements

| Resource | Minimum | Recommended |
| --- | --- | --- |
| CPU | 2 cores | 4+ cores |
| Memory (RAM) | 8 GB | 16 GB+ |
| Disk Space | 5 GB free | 10 GB+ free |
| Docker Engine | 20.10+ | Latest stable |
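You can check your host against these minimums from a shell. The commands below are standard Linux utilities, offered as a quick convenience sketch rather than part of SQLFlow itself:

```shell
# Report CPU cores, total RAM, and free disk on / against the
# minimums above (2 cores, 8 GB RAM, 5 GB free). Linux only.
cpus=$(nproc)
mem_gb=$(awk '/MemTotal/ {printf "%d", $2 / 1024 / 1024}' /proc/meminfo)
disk_gb=$(df -BG --output=avail / | tail -n 1 | tr -dc '0-9')

echo "CPU cores: ${cpus} (minimum 2)"
echo "RAM:       ${mem_gb} GB (minimum 8)"
echo "Free disk: ${disk_gb} GB (minimum 5)"
```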

The Docker Image

SQLFlow images are published to Docker Hub under gudusqlflow/sqlflow. The latest tag always points to the most recent release; specific releases are tagged with their version number (e.g. 8.2.3.0).

Pull the latest SQLFlow docker image:

docker pull gudusqlflow/sqlflow:latest

Or pull a specific version:

docker pull gudusqlflow/sqlflow:8.2.3.0

Using Docker Compose (recommended)

If you have never used Docker before, this is the easiest path. You do not need to run docker pull or docker run, or remember a long list of flags. The short answer to "do I just need a docker-compose.yml and then docker compose up -d?" is yes, that's it.

Here is exactly what you do, start to finish:

1. Install Docker Engine

Docker Compose ships as a plugin of modern Docker Engine (the docker compose subcommand). The docker-compose-plugin package is only available from Docker's official apt repository — it is not in Ubuntu's default repos. If you run sudo apt-get install -y docker.io docker-compose-plugin against a stock Ubuntu, you will get:

E: Unable to locate package docker-compose-plugin
  • On Ubuntu — follow the full step-by-step install in the Appendix at the end of this page. It adds Docker's official repo and then installs both the engine and the Compose plugin.
  • On Windows / macOS — install Docker Desktop; it bundles both the engine and Compose.

Once the install is done, verify:

docker --version
docker compose version

2. Create a working folder and a docker-compose.yml file

mkdir ~/sqlflow && cd ~/sqlflow

Inside that folder, create a file named docker-compose.yml with exactly this content:

version: '3.8'
services:
  sqlflow:
    image: gudusqlflow/sqlflow:latest
    container_name: sqlflow
    ports:
      - "8165:8165"
    volumes:
      - sqlflow-data:/sqlflow/data
      - sqlflow-logs:/sqlflow/log
    deploy:
      resources:
        limits:
          memory: 8G
    restart: unless-stopped

volumes:
  sqlflow-data:
  sqlflow-logs:

That file is your entire configuration — it tells Docker which image to download, which port to expose (8165), how much memory to give it (8 GB), and where to store persistent data.
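If you want to double-check the file before starting anything, docker compose config parses and validates it without creating containers. The snippet below recreates the file from step 2 with a heredoc and validates it only if Docker is installed; this is a convenience sketch, and copying the file by hand as described above is equally fine:

```shell
# Write the compose file from step 2, then ask Compose to validate it.
mkdir -p ~/sqlflow && cd ~/sqlflow
cat > docker-compose.yml <<'EOF'
version: '3.8'
services:
  sqlflow:
    image: gudusqlflow/sqlflow:latest
    container_name: sqlflow
    ports:
      - "8165:8165"
    volumes:
      - sqlflow-data:/sqlflow/data
      - sqlflow-logs:/sqlflow/log
    deploy:
      resources:
        limits:
          memory: 8G
    restart: unless-stopped

volumes:
  sqlflow-data:
  sqlflow-logs:
EOF

# "docker compose config" exits non-zero on any YAML or schema error.
if command -v docker >/dev/null 2>&1; then
  docker compose config --quiet && echo "docker-compose.yml is valid"
fi
```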

3. Start SQLFlow with one command

From the same folder that contains docker-compose.yml:

docker compose up -d

That single command does everything a first-time user needs:

  1. Downloads the gudusqlflow/sqlflow:latest image from Docker Hub (about 1–2 GB, one-time only).
  2. Creates the persistent volumes for your data and logs.
  3. Starts the container in the background (-d = detached).
  4. Sets it to auto-restart if your machine reboots.

Wait 90–120 seconds for all internal services to finish starting, then open your browser — see Access SQLFlow below for the URL and remote-access options.
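Rather than watching the clock, you can poll the UI endpoint until it answers. This small helper is a sketch; it assumes curl is installed and the default 8165:8165 port mapping:

```shell
# Poll SQLFlow until the web UI answers, up to ~3 minutes.
wait_for_sqlflow() {
  url="${1:-http://localhost:8165/}"
  tries="${2:-36}"                     # 36 attempts x 5 s = 3 minutes
  while [ "$tries" -gt 0 ]; do
    if curl -fsS -o /dev/null "$url"; then
      echo "SQLFlow is up at $url"
      return 0
    fi
    tries=$((tries - 1))
    sleep 5
  done
  echo "timed out waiting for $url" >&2
  return 1
}

# Usage: wait_for_sqlflow
#        wait_for_sqlflow http://localhost:9165/ 60
```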

4. Day-to-day commands

All of these run from the same folder as your docker-compose.yml:

| What you want to do | Command |
| --- | --- |
| Check if it is running | docker compose ps |
| Watch the startup log live | docker compose logs -f |
| Stop SQLFlow (keeps your data) | docker compose down |
| Start it again | docker compose up -d |
| Upgrade to the latest image | docker compose pull && docker compose up -d |
| Remove everything including data | docker compose down -v |

That is the complete workflow. You never need to touch docker run, docker pull, or manage ports and volumes by hand — docker compose reads docker-compose.yml every time and does it for you.

Alternative: Manual docker run

If you prefer explicit control over each flag — or you cannot use Docker Compose in your environment — you can start the container directly with docker run.

Option A: Simple run

docker run -d --name sqlflow -p 8165:8165 gudusqlflow/sqlflow:latest

Option B: Run with memory limit and persistent data (recommended)

docker run -d \
  --name sqlflow \
  -p 8165:8165 \
  --memory=8g \
  -v sqlflow-data:/sqlflow/data \
  -v sqlflow-logs:/sqlflow/log \
  --restart unless-stopped \
  gudusqlflow/sqlflow:latest

Explanation of the flags:

| Flag | Purpose |
| --- | --- |
| -d | Run in the background (detached mode) |
| --name sqlflow | Give the container a friendly name |
| -p 8165:8165 | Map host port 8165 to the container's 8165. Change the host side if 8165 is already in use (e.g. -p 9165:8165). |
| --memory=8g | Limit memory usage to 8 GB |
| -v sqlflow-data:/sqlflow/data | Persist application data across restarts |
| -v sqlflow-logs:/sqlflow/log | Persist log files across restarts |
| --restart unless-stopped | Automatically restart the container if it crashes or after a reboot |
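To confirm the flags actually took effect, you can read them back from the running container with docker inspect. A quick check, assuming the container from Option B is up; note that Docker reports the memory limit in bytes:

```shell
# Read back the memory limit and restart policy of the running container.
if command -v docker >/dev/null 2>&1; then
  docker inspect \
    --format 'memory={{.HostConfig.Memory}} restart={{.HostConfig.RestartPolicy.Name}}' \
    sqlflow
fi

# --memory=8g is reported in bytes:
echo "expected memory value: $((8 * 1024 * 1024 * 1024)) bytes"
```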

For more information about container creation, see the official Docker documentation.

Access SQLFlow

Regardless of which method above you used, SQLFlow takes approximately 90–120 seconds to finish starting all internal services. Monitor the startup progress:

docker logs -f sqlflow

Wait until you see:

All services started. SQLFlow available at http://localhost:8165/

Press Ctrl+C to stop following the logs, then open your browser and navigate to:

http://localhost:8165/
  • After installing the Docker version, contact support@gudusoft.com with your SQLFlow Id to obtain a one-month temporary license.
  • The Docker version uses the same user-management logic as SQLFlow On-Premise. Two accounts are preconfigured:
Admin Account
username: admin@local.gudusoft.com
password: admin

Basic Account
username: user@local.gudusoft.com
password: user

Restful API

By default, every SQLFlow API call requires a userId and a token. The token is obtained by calling the generateToken API with your userId and secretKey — the same handshake used by the Cloud version.
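As a sketch, a token request looks like the following. The userId and secretKey values here are placeholders, and the /gspLive_backend/user/generateToken path follows the Cloud API convention; treat the path as an assumption and confirm it against your REST reference:

```shell
# Placeholder credentials -- substitute your own userId and secretKey.
SQLFLOW_HOST="http://localhost:8165"
USER_ID="your-user-id"
SECRET_KEY="your-secret-key"

# Endpoint path assumed from the Cloud API convention; verify locally.
curl -s -X POST "$SQLFLOW_HOST/gspLive_backend/user/generateToken" \
  -d "userId=$USER_ID" \
  -d "secretKey=$SECRET_KEY" \
  || echo "request failed -- is SQLFlow running at $SQLFLOW_HOST?"
```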

Optional: skip the token on Docker / On-Premise

On the Docker / On-Premise version, SQLFlow can be configured to accept API calls with only a userId — no token and no secretKey handshake required. This is useful when you just want to log or attribute calls to a specific user inside your own network.

Edit conf/gudu_sqlflow.conf inside the container and change:

ignore_user_token = false

to:

ignore_user_token = true

To edit the file without leaving the host:

docker exec -it sqlflow /bin/bash
vi /sqlflow/conf/gudu_sqlflow.conf

Then restart the container so the new setting takes effect:

docker restart sqlflow

After the restart, every API request must include a valid userId; the token is no longer checked.
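With ignore_user_token = true, a call needs only the userId. As an illustrative sketch (the endpoint path and parameter names below follow the public SQLFlow API convention and are assumptions; check your REST reference):

```shell
# userId-only call once ignore_user_token = true. Path and parameter
# names are assumed from the public API convention; verify locally.
SQLFLOW_HOST="http://localhost:8165"
USER_ID="your-user-id"

curl -s -X POST "$SQLFLOW_HOST/gspLive_backend/sqlflow/generation/sqlflow" \
  -d "userId=$USER_ID" \
  -d "dbvendor=dbvoracle" \
  -d "sqltext=insert into t1 select a from t2" \
  || echo "request failed -- is SQLFlow running at $SQLFLOW_HOST?"
```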

Running Docker on a remote server

If SQLFlow is running on a remote server, you have two ways to reach the UI from your laptop.

Option A: Open port 8165 to the network

Use http://<your-server-ip>:8165/ and make sure the port is open in your firewall:

sudo ufw allow 8165/tcp

Option B: SSH tunnel (recommended — no firewall change needed)

An SSH tunnel forwards port 8165 on your laptop to port 8165 on the server over your existing SSH connection, so SQLFlow stays completely private to the server's localhost. From your laptop:

ssh -L 8165:localhost:8165 ubuntu@<server-ip>

Keep that terminal open. Then on your laptop open:

http://localhost:8165/

Traffic flows your browser → laptop port 8165 → encrypted SSH tunnel → server's localhost:8165 → SQLFlow container. The container port never touches the public internet, so you can leave the server's firewall closed on 8165.

If you authenticate with a key file, add -i:

ssh -i ~/.ssh/your-key.pem -L 8165:localhost:8165 ubuntu@<server-ip>
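If you prefer not to keep a terminal open, ssh can put the tunnel in the background with -f (fork after authenticating) and -N (forward ports only, run no remote command). A sketch; set SQLFLOW_SERVER to your own user@host first:

```shell
# Background SSH tunnel. -f: go to background after authenticating;
# -N: forward ports only, run no remote command.
SQLFLOW_SERVER="${SQLFLOW_SERVER:-}"   # e.g. ubuntu@203.0.113.7 (placeholder)

if [ -n "$SQLFLOW_SERVER" ]; then
  ssh -fN -L 8165:localhost:8165 "$SQLFLOW_SERVER"
  echo "tunnel up: http://localhost:8165/"
else
  echo "set SQLFLOW_SERVER to your user@host first"
fi

# To stop the tunnel later:
#   pkill -f 'ssh -fN -L 8165:localhost:8165'
```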

Internal Services

SQLFlow runs four Java-based microservices inside the container:

| Service | Port | Purpose |
| --- | --- | --- |
| Eureka | 8761 | Service registry (internal) |
| SQLService | 8083 | SQL parsing engine (internal) |
| LayoutService | 8084 | Graph layout engine (internal) |
| gspLive | 8165 | Main web application (exposed) |

Only port 8165 needs to be exposed on the host. The container has a built-in Docker health check (HEALTHCHECK) that probes http://localhost:8165/ every 30 seconds after a 120-second start period.

To verify all services are running:

docker exec sqlflow ps aux | grep java

You should see four Java processes.
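A one-liner can count them for you; the bracket trick [j]ava keeps grep from matching its own process. A convenience sketch:

```shell
# Count the Java processes inside the container; all four services up => 4.
if command -v docker >/dev/null 2>&1; then
  count=$(docker exec sqlflow sh -c 'ps aux | grep -c "[j]ava"')
  echo "java processes: ${count} (expected 4)"
fi
```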

Data Persistence

SQLFlow writes two things that must survive container restarts: its database files (user sessions, uploaded SQL, DuckDB data) and its log files. The volumes: block in your docker-compose.yml is what keeps them alive.

| Docker volume name (on your Ubuntu host) | Container path (inside the Docker container) | Contents |
| --- | --- | --- |
| sqlflow-data | /sqlflow/data | User sessions, uploaded SQL files, DuckDB databases |
| sqlflow-logs | /sqlflow/log | Application log files |

Two points that commonly confuse first-time Docker users:

1. /sqlflow/data and /sqlflow/log are paths inside the container, not on your Ubuntu machine. You do not need to create a /sqlflow folder on your host — those paths live inside the Docker image and are already set up for you. From SQLFlow's point of view (running inside the container), it simply writes to /sqlflow/data; from your host's point of view that folder does not exist.

2. sqlflow-data and sqlflow-logs are named volumes, not folders under your ~/sqlflow directory. ~/sqlflow on your host only holds the docker-compose.yml file you created. The actual data lives in a Docker-managed area:

/var/lib/docker/volumes/sqlflow-data/_data/
/var/lib/docker/volumes/sqlflow-logs/_data/

That directory is owned by root — you normally manage it through docker volume commands rather than browsing it directly:

docker volume ls                      # list all volumes
docker volume inspect sqlflow-data    # show its real path + metadata

If you'd rather have the data under ~/sqlflow/

If you want the data to appear as a normal folder next to your docker-compose.yml, change the volumes: lines from names (sqlflow-data) to paths (./data) — Docker will mount those paths directly:

    volumes:
      - ./data:/sqlflow/data
      - ./log:/sqlflow/log

Also delete the top-level volumes: block at the bottom of the file (it is only needed for named volumes). After docker compose up -d, your home folder will look like:

~/sqlflow/
├── docker-compose.yml
├── data/   <-- SQLFlow writes here; same as /sqlflow/data inside the container
└── log/    <-- log files; same as /sqlflow/log inside the container

Now you can ls, tar, or cp those folders directly — no Docker command needed.

Back up your data

Named volumes (the default):

docker run --rm -v sqlflow-data:/data -v $(pwd):/backup alpine \
  tar czf /backup/sqlflow-data-backup.tar.gz -C /data .

Bind-mount folders under ~/sqlflow/:

tar czf ~/sqlflow-data-backup.tar.gz -C ~/sqlflow/data .
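To restore a named-volume backup, reverse the tar direction: mount the volume and unpack the archive into it. A sketch; stop the sqlflow container first so nothing writes while you restore, and note it assumes sqlflow-data-backup.tar.gz sits in the current directory:

```shell
# Restore sqlflow-data-backup.tar.gz from the current directory into
# the sqlflow-data named volume. Stop the container before running this.
if command -v docker >/dev/null 2>&1; then
  docker run --rm \
    -v sqlflow-data:/data \
    -v "$(pwd)":/backup \
    alpine sh -c 'cd /data && tar xzf /backup/sqlflow-data-backup.tar.gz'
fi
```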

Upgrading SQLFlow

# 1. Pull the latest image
docker pull gudusqlflow/sqlflow:latest

# 2. Stop and remove the old container (data volumes are preserved)
docker stop sqlflow && docker rm sqlflow

# 3. Start a new container with the updated image
docker run -d \
  --name sqlflow \
  -p 8165:8165 \
  --memory=8g \
  -v sqlflow-data:/sqlflow/data \
  -v sqlflow-logs:/sqlflow/log \
  --restart unless-stopped \
  gudusqlflow/sqlflow:latest

With Docker Compose:

docker compose pull
docker compose up -d

Invoke the SQLFlow API from Docker Container

The SQLFlow API is available as soon as the container is up and the license file has been uploaded. Point your client at http://<your-host>:8165/. By default, every call requires a userId and a token generated from userId + secretKey via the generateToken API — see Restful API above for how to switch Docker / On-Premise into userId-only mode.

Ready-to-run samples and the full REST reference are available in the SQLFlow REST API documentation.

Troubleshooting

Container starts but web UI is not accessible

Services need 90-120 seconds to initialize. Check progress with docker logs -f sqlflow and wait for All services started. You can also probe the health status directly:

docker inspect --format='{{.State.Health.Status}}' sqlflow

Port 8165 is already in use

Map SQLFlow to a different host port. For example, to use 9165:

docker run -d --name sqlflow -p 9165:8165 gudusqlflow/sqlflow:latest

Then access SQLFlow at http://localhost:9165/.

Out of memory errors

The container needs at least 8 GB of memory. Recreate it with a higher limit:

docker stop sqlflow && docker rm sqlflow

docker run -d --name sqlflow -p 8165:8165 --memory=10g \
  -v sqlflow-data:/sqlflow/data -v sqlflow-logs:/sqlflow/log \
  --restart unless-stopped gudusqlflow/sqlflow:latest

On Windows with Docker Desktop, also increase memory in Settings > Resources > Advanced.

Container keeps restarting

Check the logs for errors:

docker logs --tail 200 sqlflow

# Individual service logs
docker exec sqlflow cat /sqlflow/log/eureka.log
docker exec sqlflow cat /sqlflow/log/gspLive.log
docker exec sqlflow cat /sqlflow/log/sqlservice.log
docker exec sqlflow cat /sqlflow/log/layoutservice.log

401 "Invalid user or token" when accessing from a different hostname

If SQLFlow works fine at http://<server-ip>:8165/ but returns

code: 401
error: "Invalid user or token, access deny."
userId: "490fc09b60044068919aa01f9be2bd1e"

as soon as you open it on a different URL (for example http://localhost:8165/ via an SSH tunnel), the cause is stale credentials cached in your browser for that origin — not a backend or tunnel problem.

The browser treats http://<server-ip>:8165 and http://localhost:8165 as two separate origins, each with its own localStorage and cookie jar. The SQLFlow frontend, if it finds a userId + token stored for the current origin (usually left over from a previous SQLFlow Cloud session or an older local install), will attach them to every API request. The Docker backend does not know that user, so it returns 401. The origin you first used works because nothing stale was stored against it, so no credentials are sent and the Docker backend lets you in.
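You can confirm this diagnosis from the command line: curl sends no cookies and no localStorage, so a plain request through the same tunnel shows what the backend does with a credential-free client. A quick check:

```shell
# An anonymous request carries no stored credentials. 200 here means the
# backend and tunnel are fine and the browser's stored state is at fault;
# 000 means the connection itself (tunnel/port) is the problem.
code=$(curl -s -o /dev/null -w '%{http_code}' http://localhost:8165/ || true)
echo "HTTP status: ${code}"
```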

Fix — clear site data for the failing origin

In Chrome/Edge:

  1. Open the failing URL (e.g. http://localhost:8165/).
  2. DevTools → Application → Storage → Clear site data (check Cookies and Local storage).
  3. Hard reload (Ctrl+Shift+R / Cmd+Shift+R).

Or quicker: open the URL in an Incognito / Private window. If it works there, stored credentials in your normal profile are the cause — clear them and reload.

Get License fail (CentOS Stream 9 only)

The following issue only occurs on CentOS Stream 9; we have not seen it on CentOS 7, CentOS Stream 8, Ubuntu 20.04, or Debian 11.

If you get this error after launching the image, first check whether the container is running:

docker ps -a

If the container status is Up, go inside it:

docker exec -it sqlflow /bin/bash

Go to the SQLFlow lib folder and try launching eureka directly:

cd /sqlflow/lib
java -jar eureka.jar

If java -jar eureka.jar fails with an out-of-memory error, the usual cause on CentOS Stream 9 is the Docker daemon's default file-descriptor limit, which is effectively unlimited on that distribution and makes the JVM over-allocate memory at startup.

Cap the limit in a systemd override:

sudo mkdir -p /etc/systemd/system/docker.service.d
sudo vi /etc/systemd/system/docker.service.d/override.conf

and then enter:

[Service]
ExecStart=
ExecStart=/usr/bin/dockerd --default-ulimit nofile=65536:65536 -H fd://

Save and reload:

sudo systemctl daemon-reload
sudo systemctl restart docker
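You can verify the new cap by checking the open-file limit inside a fresh container. A quick check, assuming the override above was applied and the daemon restarted:

```shell
# A new container should now report the capped file-descriptor limit
# (65536) instead of the effectively unlimited CentOS Stream 9 default.
if command -v docker >/dev/null 2>&1; then
  docker run --rm alpine sh -c 'ulimit -n'
fi
```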

Appendix: Install Docker Engine on Ubuntu 24.04 (first-time users)

If you have never used Docker before and you are on a fresh Ubuntu 24.04 server, follow the steps below once. After this, you can come back to Using Docker Compose and start SQLFlow.

A one-line sudo apt-get install -y docker.io would give you the engine, but as noted earlier the docker-compose-plugin package is not in Ubuntu's default repos. The steps below pull both the engine and the Compose plugin from Docker's official repository, so you get the latest version and security updates straight from Docker.

Step 1. Check if Docker is already installed

docker --version
  • If you see output like Docker version 27.x.x, build xxxxxxx, Docker is already installed — skip ahead to Step 6.
  • If you see command not found, continue with Step 2.

Step 2. Update the package index and install prerequisites

sudo apt-get update
sudo apt-get install -y ca-certificates curl gnupg

These three packages let apt download and verify Docker's packages over HTTPS:

| Package | Purpose |
| --- | --- |
| ca-certificates | Lets your system trust Docker's HTTPS server when downloading packages. |
| curl | Used in Step 3 to fetch Docker's GPG signing key. |
| gnupg | Verifies that the Docker packages you install are genuine and untampered. |

Step 3. Add Docker's official GPG key

sudo install -m 0755 -d /etc/apt/keyrings
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg
sudo chmod a+r /etc/apt/keyrings/docker.gpg

Step 4. Add the Docker repository

Copy and paste this entire block as-is — it auto-detects your Ubuntu version (noble on 24.04):

echo \
  "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/ubuntu \
  $(. /etc/os-release && echo "$VERSION_CODENAME") stable" | \
  sudo tee /etc/apt/sources.list.d/docker.list > /dev/null

Step 5. Install Docker Engine and the Compose plugin

sudo apt-get update
sudo apt-get install -y docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin

This installs:

  • docker-ce — the Docker Engine itself (the background service that runs containers).
  • docker-ce-cli — the docker command you type in the terminal.
  • containerd.io — the low-level container runtime that Docker uses under the hood.
  • docker-buildx-plugin — modern image-build support (optional but installed by default).
  • docker-compose-plugin — adds the docker compose subcommand that you will use to start SQLFlow.

Step 6. Verify the installation

docker --version
docker compose version
sudo docker run hello-world

If the third command prints "Hello from Docker!", everything is working.

Step 7. Run Docker without sudo (optional)

By default only root can talk to the Docker daemon, which is why Step 6 needed sudo. To run Docker as your normal user, add yourself to the docker group:

sudo usermod -aG docker $USER

Then log out and log back in (SSH users: disconnect and reconnect) for the group change to take effect. After that:

docker run hello-world

should work without sudo.

Security note: Members of the docker group have effective root access on the host. Only add users you trust.

Step 8. Make sure Docker starts on boot

On Ubuntu this is already the default after installing via apt, but you can confirm with:

sudo systemctl enable docker
sudo systemctl status docker

You are now ready to continue with Using Docker Compose — create the docker-compose.yml file and run docker compose up -d.