Docker¶
The Docker image is the SQLFlow On-Premise edition. It contains every feature available in the Cloud version — lineage analysis, all 20+ supported database vendors, the web UI, the REST API, SQL file upload, widget embedding, and so on. The image runs entirely inside your own infrastructure, so your SQL never leaves your network.
System Requirements¶
| Resource | Minimum | Recommended |
|---|---|---|
| CPU | 2 cores | 4+ cores |
| Memory (RAM) | 8 GB | 16 GB+ |
| Disk Space | 5 GB free | 10 GB+ free |
| Docker Engine | 20.10+ | Latest stable |
The Docker Image¶
SQLFlow images are published to Docker Hub under gudusqlflow/sqlflow. The latest tag always points to the most recent release; specific releases are tagged with their version number (e.g. 8.2.3.0).
Pull the latest SQLFlow docker image:
```bash
docker pull gudusqlflow/sqlflow:latest
```
Or pull a specific version:
```bash
docker pull gudusqlflow/sqlflow:8.2.3.0
```
Using Docker Compose (recommended for first-time users)¶
If you have never used Docker before, this is the easiest path. You do not need to run docker pull, docker run, or remember a long list of flags. The short answer to "do I just need a docker-compose.yml and then docker compose up -d?" is — yes, that's it.
Here is exactly what you do, start to finish:
1. Install Docker Engine¶
Docker Compose ships as a plugin of modern Docker Engine (the docker compose subcommand). The docker-compose-plugin package is only available from Docker's official apt repository — it is not in Ubuntu's default repos. If you run sudo apt-get install -y docker.io docker-compose-plugin against a stock Ubuntu, you will get:
```
E: Unable to locate package docker-compose-plugin
```
- On Ubuntu — follow the full step-by-step install in the Appendix at the end of this page. It adds Docker's official repo and then installs both the engine and the Compose plugin.
- On Windows / macOS — install Docker Desktop; it bundles both the engine and Compose.
Once the install is done, verify:
```bash
docker --version
docker compose version
```
2. Create a working folder and a docker-compose.yml file¶
```bash
mkdir -p ~/sqlflow && cd ~/sqlflow
```
Inside that folder, create a file named docker-compose.yml with the following content:
```yaml
services:
  sqlflow:
    image: gudusqlflow/sqlflow:latest
    container_name: sqlflow
    ports:
      - "8165:8165"
    mem_limit: 8g
    restart: unless-stopped
    volumes:
      - sqlflow-data:/sqlflow/data
      - sqlflow-logs:/sqlflow/log

volumes:
  sqlflow-data:
  sqlflow-logs:
```
That file is your entire configuration — it tells Docker which image to download, which port to expose (8165), how much memory to give it (8 GB), and where to store persistent data.
3. Start SQLFlow with one command¶
From the same folder that contains docker-compose.yml:
```bash
docker compose up -d
```
That single command does everything a first-time user needs:
- Downloads the `gudusqlflow/sqlflow:latest` image from Docker Hub (about 1–2 GB, one-time only).
- Creates the persistent volumes for your data and logs.
- Starts the container in the background (`-d` = detached).
- Sets it to auto-restart if your machine reboots.
Wait 90–120 seconds for all internal services to finish starting, then open your browser — see Access SQLFlow below for the URL and remote-access options.
4. Day-to-day commands¶
All of these run from the same folder as your docker-compose.yml:
| What you want to do | Command |
|---|---|
| Check if it is running | docker compose ps |
| Watch the startup log live | docker compose logs -f |
| Stop SQLFlow (keeps your data) | docker compose down |
| Start it again | docker compose up -d |
| Upgrade to the latest image | docker compose pull && docker compose up -d |
| Remove everything including data | docker compose down -v |
That is the complete workflow. You never need to touch docker run, docker pull, or manage ports and volumes by hand — docker compose reads docker-compose.yml every time and does it for you.
Alternative: Manual docker run¶
If you prefer explicit control over each flag — or you cannot use Docker Compose in your environment — you can start the container directly with docker run.
Option A: Simple run
```bash
docker run -d --name sqlflow -p 8165:8165 gudusqlflow/sqlflow:latest
```
Option B: Run with memory limit and persistent data (recommended)
```bash
docker run -d \
  --name sqlflow \
  -p 8165:8165 \
  --memory=8g \
  -v sqlflow-data:/sqlflow/data \
  -v sqlflow-logs:/sqlflow/log \
  --restart unless-stopped \
  gudusqlflow/sqlflow:latest
```
Explanation of the flags:
| Flag | Purpose |
|---|---|
| `-d` | Run in the background (detached mode) |
| `--name sqlflow` | Give the container a friendly name |
| `-p 8165:8165` | Map host port 8165 to the container's 8165. Change the host side if 8165 is already in use (e.g. `-p 9165:8165`). |
| `--memory=8g` | Limit memory usage to 8 GB |
| `-v sqlflow-data:/sqlflow/data` | Persist application data across restarts |
| `-v sqlflow-logs:/sqlflow/log` | Persist log files across restarts |
| `--restart unless-stopped` | Automatically restart the container if it crashes or after a reboot |
For more information about container creation, see the official Docker documentation.
Access SQLFlow¶
Regardless of which method above you used, SQLFlow takes approximately 90–120 seconds to finish starting all internal services. Monitor the startup progress:
```bash
docker logs -f sqlflow
```
Wait until you see:
```
All services started
```
Press Ctrl+C to stop following the logs, then open your browser and navigate to:
```
http://localhost:8165/
```
- After installing the SQLFlow Docker version, contact support@gudusoft.com with your SQLFlow Id to obtain a 1-month temporary license.
- The Docker version uses the same user management logic as SQLFlow On-Premise: it has the admin account and the basic account.
Restful API¶
By default, every SQLFlow API call requires a userId and a token. The token is obtained by calling the generateToken API with your userId and secretKey — the same handshake used by the Cloud version.
Optional: skip the token on Docker / On-Premise¶
On the Docker / On-Premise version, SQLFlow can be configured to accept API calls with only a userId — no token and no secretKey handshake required. This is useful when you just want to log or attribute calls to a specific user inside your own network.
Edit conf/gudu_sqlflow.conf inside the container and change:
1 | |
to:
1 | |
To edit the file without leaving the host:
```bash
# The conf path inside the container is an assumption; adjust if your layout differs
docker exec -it sqlflow vi /sqlflow/conf/gudu_sqlflow.conf
```
Then restart the container so the new setting takes effect:
```bash
docker restart sqlflow
```
After the restart, every API request must include a valid userId; the token is no longer checked.
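The two request styles can be sketched with a few Python helpers. Everything beyond the `userId`, `token`, and `secretKey` parameter names is an assumption for illustration: the base URL and the example endpoint path are placeholders, so check the REST API reference for the exact paths in your release.

```python
from urllib.parse import urlencode, urljoin

# Illustrative base URL; replace with your host and mapped port.
BASE_URL = "http://localhost:8165/"

def token_mode_params(user_id: str, token: str) -> dict:
    """Default mode: every call carries userId + token."""
    return {"userId": user_id, "token": token}

def userid_only_params(user_id: str) -> dict:
    """Docker/On-Premise with the token check disabled: userId alone."""
    return {"userId": user_id}

def build_query(base: str, path: str, params: dict) -> str:
    """Assemble a full request URL from base, endpoint path and query parameters."""
    return urljoin(base, path) + "?" + urlencode(params)
```

For example, `build_query(BASE_URL, "api/lineage", userid_only_params("alice"))` (with a hypothetical `api/lineage` path) yields a URL that the token-free backend will accept after the restart above.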
Running Docker on a remote server¶
If SQLFlow is running on a remote server, you have two ways to reach the UI from your laptop.
Option A: Open port 8165 to the network
Use http://<your-server-ip>:8165/ and make sure the port is open in your firewall:
```bash
sudo ufw allow 8165/tcp
```
Option B: SSH tunnel (recommended — no firewall change needed)
An SSH tunnel forwards port 8165 on your laptop to port 8165 on the server over your existing SSH connection, so SQLFlow stays completely private to the server's localhost. From your laptop:
```bash
ssh -L 8165:localhost:8165 user@<your-server-ip>
```
Keep that terminal open. Then on your laptop open:
```
http://localhost:8165/
```
Traffic flows your browser → laptop port 8165 → encrypted SSH tunnel → server's localhost:8165 → SQLFlow container. The container port never touches the public internet, so you can leave the server's firewall closed on 8165.
If you authenticate with a key file, add -i:
```bash
ssh -i ~/.ssh/your-key.pem -L 8165:localhost:8165 user@<your-server-ip>
```
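If you want to script a quick check that the tunnel (or a local install) is actually reachable before opening the browser, a small TCP probe is enough. This is a generic sketch, not a SQLFlow API; host and port are whatever you forwarded above:

```python
import socket

def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(timeout)
        # connect_ex returns 0 on success instead of raising
        return s.connect_ex((host, port)) == 0

if __name__ == "__main__":
    # With the tunnel up, your laptop's localhost:8165 should accept connections.
    print("tunnel up:", port_open("127.0.0.1", 8165))
```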
Internal Services¶
SQLFlow runs four Java-based microservices inside the container:
| Service | Port | Purpose |
|---|---|---|
| Eureka | 8761 | Service registry (internal) |
| SQLService | 8083 | SQL parsing engine (internal) |
| LayoutService | 8084 | Graph layout engine (internal) |
| gspLive | 8165 | Main web application (exposed) |
Only port 8165 needs to be exposed on the host. The container has a built-in Docker health check (HEALTHCHECK) that probes http://localhost:8165/ every 30 seconds after a 120-second start period.
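The same readiness probe can be automated from a script, for example to block a deployment job until the container reports healthy. This is a plain HTTP poll against whatever URL you exposed, not a SQLFlow-specific API:

```python
import time
from urllib import request, error

def wait_for_ready(url: str, timeout: float = 180.0, interval: float = 5.0) -> bool:
    """Poll url until it answers HTTP 200, or give up after `timeout` seconds."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with request.urlopen(url, timeout=5) as resp:
                if resp.status == 200:
                    return True
        except (error.URLError, OSError):
            pass  # not up yet; retry after `interval`
        time.sleep(interval)
    return False
```

After `docker compose up -d`, something like `wait_for_ready("http://localhost:8165/")` returns True once all internal services have started (the 120-second start period means this typically takes two or three polls).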
To verify all services are running:
```bash
docker exec sqlflow ps -ef | grep java
```
You should see four Java processes.
Data Persistence¶
SQLFlow writes two things that must survive container restarts: its database files (user sessions, uploaded SQL, DuckDB data) and its log files. The volumes: block in your docker-compose.yml is what keeps them alive.
| Docker volume name (on your Ubuntu host) | Container path (inside the Docker container) | Contents |
|---|---|---|
| `sqlflow-data` | `/sqlflow/data` | User sessions, uploaded SQL files, DuckDB databases |
| `sqlflow-logs` | `/sqlflow/log` | Application log files |
Two points that commonly confuse first-time Docker users:
1. /sqlflow/data and /sqlflow/log are paths inside the container, not on your Ubuntu machine. You do not need to create a /sqlflow folder on your host — those paths live inside the Docker image and are already set up for you. From SQLFlow's point of view (running inside the container), it simply writes to /sqlflow/data; from your host's point of view that folder does not exist.
2. sqlflow-data and sqlflow-logs are named volumes, not folders under your ~/sqlflow directory. ~/sqlflow on your host only holds the docker-compose.yml file you created. The actual data lives in a Docker-managed area:
```
/var/lib/docker/volumes/sqlflow_sqlflow-data/_data
/var/lib/docker/volumes/sqlflow_sqlflow-logs/_data
```
That directory is owned by root — you normally manage it through docker volume commands rather than browsing it directly:
```bash
docker volume ls
docker volume inspect sqlflow_sqlflow-data
```
If you'd rather have the data under ~/sqlflow/¶
If you want the data to appear as a normal folder next to your docker-compose.yml, change the volumes: lines from names (sqlflow-data) to paths (./data) — Docker will mount those paths directly:
```yaml
    volumes:
      - ./data:/sqlflow/data
      - ./logs:/sqlflow/log
```
Also delete the top-level volumes: block at the bottom of the file (it is only needed for named volumes). After docker compose up -d, your home folder will look like:
```
~/sqlflow/
├── docker-compose.yml
├── data/
└── logs/
```
Now you can ls, tar, or cp those folders directly — no Docker command needed.
Back up your data¶
Named volumes (the default):
```bash
docker run --rm -v sqlflow_sqlflow-data:/data -v "$PWD":/backup ubuntu tar czf /backup/sqlflow-data.tar.gz -C /data .
docker run --rm -v sqlflow_sqlflow-logs:/logs -v "$PWD":/backup ubuntu tar czf /backup/sqlflow-logs.tar.gz -C /logs .
```
Bind-mount folders under ~/sqlflow/:
```bash
tar czf sqlflow-backup.tar.gz -C ~/sqlflow data logs
```
Upgrading SQLFlow¶
```bash
# 1. Pull the newest image
docker pull gudusqlflow/sqlflow:latest

# 2. Stop and remove the old container (named volumes keep your data)
docker stop sqlflow
docker rm sqlflow

# 3. Recreate the container with the same flags as before
docker run -d \
  --name sqlflow \
  -p 8165:8165 \
  --memory=8g \
  -v sqlflow-data:/sqlflow/data \
  -v sqlflow-logs:/sqlflow/log \
  --restart unless-stopped \
  gudusqlflow/sqlflow:latest
```
With Docker Compose:
```bash
docker compose pull
docker compose up -d
```
Invoke the SQLFlow API from Docker Container¶
The SQLFlow API is available as soon as the container is up and the license file has been uploaded. Point your client at http://<your-host>:8165/. By default, every call requires a userId and a token generated from userId + secretKey via the generateToken API — see Restful API above for how to switch Docker / On-Premise into userId-only mode.
Ready-to-run samples and the full REST reference:
- Python samples on GitHub: sqlflow_public / api / python / basic
- REST API reference: User Interface · Export Lineage as CSV
Troubleshooting¶
Container starts but web UI is not accessible¶
Services need 90–120 seconds to initialize. Check progress with `docker logs -f sqlflow` and wait for `All services started`. You can also probe the health status directly:
```bash
docker inspect --format '{{.State.Health.Status}}' sqlflow
```
Port 8165 is already in use¶
Map SQLFlow to a different host port. For example, to use 9165:
```bash
docker run -d --name sqlflow -p 9165:8165 gudusqlflow/sqlflow:latest
```
Then access SQLFlow at http://localhost:9165/.
Out of memory errors¶
The container needs at least 8 GB of memory. Recreate it with a higher limit:
```bash
docker stop sqlflow
docker rm sqlflow
# Recreate with a higher limit (example: 12 GB)
docker run -d --name sqlflow -p 8165:8165 --memory=12g \
  -v sqlflow-data:/sqlflow/data -v sqlflow-logs:/sqlflow/log --restart unless-stopped gudusqlflow/sqlflow:latest
```
On Windows with Docker Desktop, also increase memory in Settings > Resources > Advanced.
Container keeps restarting¶
Check the logs for errors:
```bash
# Show the most recent log output to see why the container keeps exiting
docker logs --tail 200 sqlflow

# Check how many times Docker has restarted it
docker inspect --format '{{.RestartCount}}' sqlflow

# Typical causes: not enough memory (see above) or the host port already in use
```
401 "Invalid user or token" when accessing from a different hostname¶
If SQLFlow works fine at http://<server-ip>:8165/ but returns
```
{
  "error": "Invalid user or token"
}
```
as soon as you open it on a different URL (for example http://localhost:8165/ via an SSH tunnel), the cause is stale credentials cached in your browser for that origin — not a backend or tunnel problem.
The browser treats http://<server-ip>:8165 and http://localhost:8165 as two separate origins, each with its own localStorage and cookie jar. The SQLFlow frontend, if it finds a userId + token stored for the current origin (usually left over from a previous SQLFlow Cloud session or an older local install), will attach them to every API request. The Docker backend does not know that user, so it returns 401. The origin you first used works because nothing stale was stored against it, so no credentials are sent and the Docker backend lets you in.
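The "two separate origins" rule can be made concrete: a browser origin is the (scheme, host, port) triple, so `localhost` and the server IP never share stored credentials even when both reach the same container. A minimal sketch of that comparison:

```python
from urllib.parse import urlsplit

def origin(url: str) -> tuple:
    """Browser-style origin: the (scheme, host, port) triple."""
    parts = urlsplit(url)
    default = {"http": 80, "https": 443}
    return (parts.scheme, parts.hostname, parts.port or default[parts.scheme])

# Same container, two different origins -> separate localStorage and cookie jars:
assert origin("http://localhost:8165/") != origin("http://192.0.2.10:8165/")
# Different paths on the same host and port share one origin (and its storage):
assert origin("http://localhost:8165/a") == origin("http://localhost:8165/b")
```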
Fix — clear site data for the failing origin
In Chrome/Edge:
- Open the failing URL (e.g. `http://localhost:8165/`).
- DevTools → Application → Storage → Clear site data (check Cookies and Local storage).
- Hard reload (`Ctrl+Shift+R` / `Cmd+Shift+R`).
Or quicker: open the URL in an Incognito / Private window. If it works there, stored credentials in your normal profile are the cause — clear them and reload.
Get License fail (CentOS Stream 9 only)¶
The following issue only occurs on CentOS Stream 9; we have not seen the error on CentOS 7, CentOS Stream 8, Ubuntu 20.04, or Debian 11.
 (1) (1) (1) (1) (1) (1) (1).png)
If you get this error after launching the Docker image, first check whether the container is running correctly:
```bash
docker ps
```
If the container status is up, go into the container:
```bash
docker exec -it sqlflow /bin/bash
```
Go to the SQLFlow lib folder and try launching eureka directly:
```bash
cd /sqlflow/lib
java -jar eureka*.jar   # the exact jar name varies by release
```
If eureka fails to start with an out-of-memory style error, the Docker daemon's resource limits are too low.
You can raise the Docker daemon's file-descriptor limit with:
```bash
sudo mkdir -p /etc/systemd/system/docker.service.d
sudo vi /etc/systemd/system/docker.service.d/override.conf
```
and then enter:
```
[Service]
LimitNOFILE=65535
LimitNPROC=65535
```
Save and reload:
```bash
sudo systemctl daemon-reload
sudo systemctl restart docker
```
Appendix: Install Docker Engine on Ubuntu 24.04 (first-time users)¶
If you have never used Docker before and you are on a fresh Ubuntu 24.04 server, follow the steps below once. After this, you can come back to Using Docker Compose and start SQLFlow.
Note that the quick one-liner `sudo apt-get install -y docker.io docker-compose-plugin` does not work on stock Ubuntu: as explained in step 1 above, `docker-compose-plugin` is only available from Docker's own repository. The steps below pull Docker directly from Docker's official repository, so you get the latest version and security updates straight from Docker.
Step 1. Check if Docker is already installed¶
```bash
docker --version
```
- If you see output like `Docker version 27.x.x, build xxxxxxx`, Docker is already installed — skip ahead to Step 6.
- If you see `command not found`, continue with Step 2.
Step 2. Update the package index and install prerequisites¶
```bash
sudo apt-get update
sudo apt-get install -y ca-certificates curl gnupg
```
These three packages let apt download and verify Docker's packages over HTTPS:
| Package | Purpose |
|---|---|
| `ca-certificates` | Lets your system trust Docker's HTTPS server when downloading packages. |
| `curl` | Used in Step 3 to fetch Docker's GPG signing key. |
| `gnupg` | Verifies that the Docker packages you install are genuine and untampered. |
Step 3. Add Docker's official GPG key¶
```bash
sudo install -m 0755 -d /etc/apt/keyrings
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg
sudo chmod a+r /etc/apt/keyrings/docker.gpg
```
Step 4. Add the Docker repository¶
Copy and paste this entire block as-is — it auto-detects your Ubuntu version (noble on 24.04):
```bash
echo \
  "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/ubuntu \
  $(. /etc/os-release && echo "$VERSION_CODENAME") stable" | \
  sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
```
Step 5. Install Docker Engine and the Compose plugin¶
```bash
sudo apt-get update
sudo apt-get install -y docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
```
This installs:
- `docker-ce` — the Docker Engine itself (the background service that runs containers).
- `docker-ce-cli` — the `docker` command you type in the terminal.
- `containerd.io` — the low-level container runtime that Docker uses under the hood.
- `docker-buildx-plugin` — modern image-build support (optional but installed by default).
- `docker-compose-plugin` — adds the `docker compose` subcommand that you will use to start SQLFlow.
Step 6. Verify the installation¶
```bash
sudo docker --version
sudo docker compose version
sudo docker run --rm hello-world
```
If the third command prints "Hello from Docker!", everything is working.
Step 7. Run Docker without sudo (recommended)¶
By default only root can talk to the Docker daemon, which is why Step 6 needed sudo. To run Docker as your normal user, add yourself to the docker group:
```bash
sudo usermod -aG docker $USER
```
Then log out and log back in (SSH users: disconnect and reconnect) for the group change to take effect. After that:
```bash
docker run hello-world
```
should work without sudo.
Security note: Members of the `docker` group have effective root access on the host. Only add users you trust.
Step 8. Make sure Docker starts on boot¶
On Ubuntu this is already the default after installing via apt, but you can confirm with:
```bash
systemctl is-enabled docker     # should print "enabled"
sudo systemctl enable docker    # run this if it does not
```
You are now ready to continue with Using Docker Compose — create the docker-compose.yml file and run docker compose up -d.