Gudu SQLFlow Product Docs

For older version SQLFlow under Linux


Last updated 1 year ago

This page is for SQLFlow 5.x.x.x

If you are using the latest SQLFlow (version > 6.0.0.0), please refer to the latest installation manual instead (see "Linux" under Installation).

For other releases of SQLFlow before version 6, see the "SQLFlow before Version 6" page.


Prerequisites

  • A Linux server with at least 8 GB of memory (Ubuntu 20.04 is recommended).

  • Java 8

  • Nginx web server.

  • Open ports: 80, 8761, 8081, and 8083. Only port 80 needs to be open if you set up the nginx reverse proxy as described below.
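The prerequisites above can be checked with a few shell helpers before installing. This is only a minimal sketch: the function names mem_total_gb and check_tools are hypothetical, and it assumes a Linux host that exposes /proc/meminfo.

```shell
# Sketch of a prerequisite check. Function names are hypothetical.

# Print total memory in GB (one decimal), read from a meminfo-style file.
# Defaults to /proc/meminfo, which reports MemTotal in kB.
mem_total_gb() {
  awk '/^MemTotal:/ { printf "%.1f\n", $2 / 1024 / 1024 }' "${1:-/proc/meminfo}"
}

# Report whether the java and nginx executables are on PATH.
check_tools() {
  for tool in java nginx; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "$tool: found"
    else
      echo "$tool: missing"
    fi
  done
}

# Usage (not run here):
#   mem_total_gb
#   check_tools
#   java -version    # confirm the installed Java version by eye
```

A machine sold as "8 GB" usually reports slightly less than 8.0 here, so treat the number as a rough guide rather than a hard threshold.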

Setup Environment (Ubuntu for example)

sudo apt-get update
sudo apt-get install nginx -y
sudo apt-get install default-jre -y	

On CentOS, see "How To Install Nginx on CentOS" and "How To Install Java on CentOS".

Upload Files

Create a directory:

# the path must start from the root directory (/)
sudo mkdir -p /wings/sqlflow

Upload your zip file, which contains the backend and frontend files, to the sqlflow folder and unzip it:

unzip sqlflow.zip

You should get files organized like this:

/wings/
└── sqlflow
    ├── backend
    │   ├── bin
    │   │   ├── backend.bat 
    │   │   ├── backend.sh
    │   │   ├── eureka.bat
    │   │   ├── eureka.sh
    │   │   ├── eureka.vbs
    │   │   ├── gspLive.bat
    │   │   ├── gspLive.sh  
    │   │   ├── gspLive.vbs  
    │   │   ├── monitor.bat
    │   │   ├── monitor.sh 
    │   │   ├── sqlservice.bat
    │   │   ├── sqlservice.sh 
    │   │   ├── sqlservice.vbs
    │   │   ├── stop.bat
    │   │   ├── stop.sh
    │   ├── lib
    │   │   ├── eureka.jar
    │   │   ├── gspLive.jar  
    │   │   ├── sqlservice.jar
    │   ├── conf
    │   │   ├── gudu_sqlflow_license.txt     
    │   │   ├── gudu_sqlflow.conf     
    │   ├── data
    │   │   ├── job  
    │   │   │   ├── task     
    │   │   │   ├── {userid}   
    │   │   ├── schema     
    │   │   ├── session     
    │   │   ├── version     
    │   ├── log
    │   ├── tmp
    │   │   └── cache  
    └── frontend
        ├── config.public.json
        ├── images
        │   ├── check.svg
        │   ├── Join.svg
        │   ├── pic_Not logged in.png
        │   └── visualize.svg
        ├── index.********************.css
        ├── index.********************.css
        ├── index.********************.css
        ├── index.********************.css
        ├── index.html
        ├── lang
        ├── page.*********************.js
        ├── page.*********************.js
        ├── page.*********************.js
        ├── page.*********************.js
        ├── public.*********************.js
        └── widget
            ├── index.js
            ├── sqlflow-library.version.css
            └── sqlflow-library.version.js

Set folder permissions:

sudo chmod -R 755 /wings/sqlflow
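After unzipping, it can help to confirm that the key files from the layout above are in place. The following is a sketch under assumptions: the function name missing_sqlflow_files is hypothetical, and the file list is a hand-picked subset of the tree shown above, not an official manifest.

```shell
# List expected SQLFlow files that are missing under a root directory.
# Prints one missing path per line; prints nothing when all are present.
# The list below is a subset of the documented tree, chosen for illustration.
missing_sqlflow_files() {
  root="${1:-/wings/sqlflow}"
  for rel in \
    backend/bin/backend.sh \
    backend/lib/eureka.jar \
    backend/lib/gspLive.jar \
    backend/lib/sqlservice.jar \
    backend/conf/gudu_sqlflow.conf \
    frontend/index.html
  do
    [ -e "$root/$rel" ] || echo "$root/$rel"
  done
}

# Usage (not run here):
#   missing_sqlflow_files /wings/sqlflow
```

An empty result means every listed file was found; any printed path points at a file that did not unpack where the guide expects it.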

Nginx Reverse Proxy

1. Configure Nginx

Open your nginx configuration file (/etc/nginx/sites-available/default on Ubuntu) and add a server block:

server {
	listen 80 default_server;
	listen [::]:80 default_server;

	root /wings/sqlflow/frontend/;
	index index.html;

	location ~* ^/index.html {
		add_header X-Frame-Options deny; # remove this line if you want embed sqlflow in iframe
		add_header Cache-Control no-store;
	}

	location / {
		try_files $uri $uri/ =404;
	}
	
	location /api/ {
		proxy_pass http://127.0.0.1:8081/;
		proxy_connect_timeout 600s ;
		proxy_read_timeout 600s;
		proxy_send_timeout 600s;
		
		proxy_set_header Host $host;
		proxy_set_header X-Real-IP $remote_addr;
		proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
		proxy_set_header User-Agent $http_user_agent;  
	}
}

Note that the port 8081 in proxy_pass http://127.0.0.1:8081/ must match the port that gspLive.jar listens on.

In the above configuration, /api is mapped to http://127.0.0.1:8081. This is useful if your company does not allow external access to port 8081.

2. Modify the frontend configuration file config.private.json

  • Open the configuration file "/wings/sqlflow/frontend/config.private.json"

  • Modify the ApiPrefix attribute

  "ApiPrefix": "/api"

Customize the port

Skip this section if you are keeping the default service ports. Otherwise, follow the steps below to customize them.

1. Default port

  1. Web port is 80

  2. SQLFlow backend service port:

File            Port
eureka.jar      8761
gspLive.jar     8081
sqlservice.jar  8083

2. Modify the web port

Change the default web port from 80 to 9000 (or any port you like).

3. Modify java service port

Change the default gspLive port from 8081 to 9001 (or any port you like).

Step 1: Change the port in the proxy_pass line of the nginx config file

Step 2: Change the port in gspLive.sh (gspLive.bat)
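The nginx side of both port changes can be sketched with sed. This is a rough illustration, not the official procedure: the function names set_web_port and set_backend_port are hypothetical, and the patterns assume the listen and proxy_pass lines look like the ones in the server block shown earlier. The edit to gspLive.sh is not covered because the script's contents are not shown in this guide.

```shell
# Rewrite the web port in "listen" directives (IPv4 and [::] forms).
# Reads an nginx config file and writes the modified text to stdout.
set_web_port() {
  conf="$1"; old="$2"; new="$3"
  sed -E "s/^([[:space:]]*listen[[:space:]]+(\[::\]:)?)${old}([[:space:]]|;)/\1${new}\3/" "$conf"
}

# Rewrite the backend port in a "proxy_pass http://127.0.0.1:PORT/" line.
set_backend_port() {
  conf="$1"; old="$2"; new="$3"
  sed -E "s#(proxy_pass[[:space:]]+http://127\.0\.0\.1:)${old}/#\1${new}/#" "$conf"
}

# Usage (not run here): review the output, then replace the file and reload.
#   set_web_port /etc/nginx/sites-available/default 80 9000 > /tmp/default.new
#   set_backend_port /tmp/default.new 8081 9001 > /tmp/default.final
```

Writing to stdout instead of editing in place keeps the original config untouched until the result has been reviewed.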

Start Backend Services

Start the services in the background:

sudo /wings/sqlflow/backend/bin/backend.sh

Please allow 3-5 minutes for the services to start.

Use ps -ef | grep java to check that all three processes are running:

ubuntu   11047     1  0 Nov02 ?        00:04:44 java -server -jar eureka.jar
ubuntu   11076     1  0 Nov02 ?        00:04:11 java -server -Xmn512m -Xms2g -Xmx2g -Djavax.accessibility.assistive_technologies=  -jar sqlservice.jar
ubuntu   11114     1  0 Nov02 ?        00:05:17 java -server -jar gspLive.jar
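Because startup can take several minutes, the process check can be scripted. The helper below is a sketch: count_sqlflow_services is a hypothetical name, and it simply looks for the three jar names from the listing above in ps -ef style output.

```shell
# Count how many of the three SQLFlow jars appear in `ps -ef` style
# output read from stdin. Prints a number from 0 to 3.
count_sqlflow_services() {
  input=$(cat)
  n=0
  for jar in eureka.jar sqlservice.jar gspLive.jar; do
    if printf '%s\n' "$input" | grep -q -- "-jar $jar"; then
      n=$((n + 1))
    fi
  done
  echo "$n"
}

# Usage (not run here): poll until all three services are up.
#   until [ "$(ps -ef | count_sqlflow_services)" -eq 3 ]; do sleep 10; done
```

A running process only shows that the JVM started; the services may still need more time before they answer requests.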

Start Frontend Services

Start nginx:

sudo service nginx start

Or reload it:

sudo nginx -s reload

Open http://yourdomain.com/ to see SQLFlow.

Open http://yourdomain.com/api/gspLive_backend/doc.html?lang=en to see the RESTful API documentation, or

open http://yourdomain.com:8081/gspLive_backend/doc.html?lang=en if port 8081 is reachable directly.
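The same URLs can be probed from the command line as a quick smoke test. This is a minimal sketch with hypothetical helper names (http_status, is_healthy_status); it assumes curl is installed and treats any 2xx or 3xx status as healthy, which is a simplification.

```shell
# Print the HTTP status code returned for a URL.
# curl prints 000 when no response is received.
http_status() {
  curl -s -o /dev/null -w '%{http_code}' --max-time 10 "$1"
}

# Succeed only for 2xx or 3xx status codes.
is_healthy_status() {
  case "$1" in
    2??|3??) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage (not run here; yourdomain.com is the placeholder used in this guide):
#   code=$(http_status "http://yourdomain.com/")
#   is_healthy_status "$code" && echo "frontend OK" || echo "frontend returned $code"
#   code=$(http_status "http://yourdomain.com/api/gspLive_backend/doc.html?lang=en")
#   is_healthy_status "$code" && echo "API docs OK" || echo "API docs returned $code"
```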

Gudu SQLFlow License file

If this is the first time you set up Gudu SQLFlow on a new machine, you will see this license UI:

  1. Send us the Gudu SQLFlow Id (6 characters in red).

  2. We generate a license file for you based on this Id.

  3. Upload the license file by clicking the "upload license file" link.

Backend Services Configuration

SQLFlow provides several options to control the analysis logic of the service. Open the sqlservice configuration file (conf/gudu_sqlflow.conf):

  • relation_limit: default value is 1000. When the number of relations for the selected objects is greater than relation_limit, SQLFlow falls back to simple mode, which ignores all record sets. If simple mode still produces more than relation_limit relations, SQLFlow shows only the summary information.

  • big_sql_size: default value is 4096. If the SQL length is greater than big_sql_size, SQLFlow submits the SQL to the work queue and executes it from there. If the work queue is full, SQLFlow throws an exception and returns the error message "Sorry, the service is busy. Please try again later."
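The two thresholds above can be read as simple decision rules. The sketch below only illustrates that logic as described in this section; it is not SQLFlow's implementation. The function names and the mode labels "full", "simple", and "summary" are hypothetical, and whether SQLFlow measures SQL length in characters or bytes is not stated here, so the length check is an assumption.

```shell
# Decide which output mode applies under the relation_limit rule.
# Arguments: relation count in full mode, relation count in simple mode,
# and the limit (defaults to 1000). Mode labels are hypothetical.
lineage_mode() {
  full_count="$1"; simple_count="$2"; limit="${3:-1000}"
  if [ "$full_count" -le "$limit" ]; then
    echo full
  elif [ "$simple_count" -le "$limit" ]; then
    echo simple
  else
    echo summary
  fi
}

# big_sql_size rule: succeed when the SQL text is longer than the limit
# (defaults to 4096). Length is counted with the shell's ${#var}.
is_big_sql() {
  sql="$1"; limit="${2:-4096}"
  [ "${#sql}" -gt "$limit" ]
}

# Usage (not run here):
#   lineage_mode 1500 800        # prints "simple"
#   is_big_sql "select name from user" && echo "goes to the work queue"
```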

Sqlflow client api call

  1. Get the userId from gudu_sqlflow.conf

  • Open the configuration file "/wings/sqlflow/backend/conf/gudu_sqlflow.conf"

  • The value of the anonymous_user_id field is the web API userId

  anonymous_user_id=xxx
  • Note: in on-premise mode, web API calls don't need the token parameter

  2. Test the web API with curl

    • test sql:

      select name from user
    • curl command:

      curl -X POST "http://yourdomain.com/api/gspLive_backend/sqlflow/generation/sqlflow" -H "accept:application/json;charset=utf-8" -F "userId=YOUR USER ID HERE" -F  "dbvendor=dbvoracle" -F "sqltext=select name from user"
    • response:

      {
        "code": 200,
        "data": {
          "dbvendor": "dbvoracle",
          "dbobjs": [
            ...
          ],
          "relations": [
            ...
          ]
        },
        "sessionId": ...
      }
    • If the returned code is 401, check that the userId is set and valid.
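The two steps above can be combined into a small script. This is a sketch under assumptions: the function names are hypothetical, the config file is assumed to use plain key=value lines as in the anonymous_user_id example, and the response code is pulled out with a text match instead of a JSON parser. The SQL text is sent with curl's --form-string so that characters such as ; or a leading @ are passed literally.

```shell
# Read anonymous_user_id from a key=value config file.
sqlflow_user_id() {
  sed -n 's/^[[:space:]]*anonymous_user_id[[:space:]]*=[[:space:]]*//p' "$1" | head -n 1
}

# Extract the first numeric "code" field from a JSON response on stdin.
response_code() {
  grep -o '"code"[[:space:]]*:[[:space:]]*[0-9][0-9]*' | head -n 1 | grep -o '[0-9]*$'
}

# POST a SQL statement to the generation endpoint and print the raw response.
# dbvendor is fixed to dbvoracle here, matching the example above.
sqlflow_generate() {
  base_url="$1"; user_id="$2"; sql="$3"
  curl -s -X POST "$base_url/api/gspLive_backend/sqlflow/generation/sqlflow" \
    -H "accept:application/json;charset=utf-8" \
    -F "userId=$user_id" \
    -F "dbvendor=dbvoracle" \
    --form-string "sqltext=$sql"
}

# Usage (not run here):
#   uid=$(sqlflow_user_id /wings/sqlflow/backend/conf/gudu_sqlflow.conf)
#   sqlflow_generate "http://yourdomain.com" "$uid" "select name from user" | response_code
```

A printed 200 corresponds to the successful response shown above, and 401 points back to the userId check.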

Enable Regular Job

If you need to enable the regular job feature on your on-premise SQLFlow, you also need to install Clickhouse on your server. See the Clickhouse Installation page for details.

See also:

  • SQLFlow on-premise version
  • How To Install Nginx on CentOS
  • How To Install Java on CentOS
  • sqlflow client api call
  • Clickhouse Installation
  • Linux
  • SQLFlow before Version 6
  • Installation Guide - Linux