
Docker


Last updated 10 months ago

If you prefer a containerized setup and have Docker installed on your machine, you can also pull SQLFlow's docker image. However, please note that:

  • The SQLFlow docker version is for testing purposes only.

  • After getting the SQLFlow docker version installed, contact support@gudusoft.com with your SQLFlow Id to obtain a 1-month temporary license.

  • The docker version uses the same user management logic as SQLFlow On-Premise: it has an admin account and a basic account.

Admin Account
username: admin@local.gudusoft.com
password: admin

Basic Account
username: user@local.gudusoft.com
password: user

The Docker Image

Pull the sqlflow docker image:

docker pull gudusqlflow/sqlflow-simple-trial:6.1.0.0

Create the SQLFlow Container

docker run -d -p 7090:8165 --name mysqlflow gudusqlflow/sqlflow-simple-trial:6.1.0.0

The 7090 in the command above is the host port used to visit the SQLFlow UI. You can change it if 7090 is already occupied on your machine.

Use http://<your ip>:<port> to reach the SQLFlow UI.
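Once the container is up, you can verify from a script that the mapped UI port is reachable. A minimal sketch, assuming the host and port from the docker run example above (the helper names are illustrative, not part of SQLFlow):

```python
import socket

def sqlflow_ui_url(host: str, port: int = 7090) -> str:
    """Build the SQLFlow UI URL for the mapped host port."""
    return f"http://{host}:{port}"

def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# 7090 matches the -p 7090:8165 mapping in the docker run example above.
url = sqlflow_ui_url("127.0.0.1", 7090)
print(url)  # http://127.0.0.1:7090
```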

Invoke the SQLFlow API from Docker Container

The SQLFlow API becomes available once you have uploaded the license file and the docker container is up and running.

Invoking the SQLFlow API from your docker container is no different from invoking it against SQLFlow Cloud/On-Premise. If you need samples, see the Python demo at https://github.com/sqlparser/sqlflow_public/blob/master/api/python/basic/GenerateDataLineageDemo.py

The following Python samples invoke the SQLFlow API from the docker container:

# the user id of the sqlflow web or client account, required
userId = ''

# the secret key of the sqlflow user for web API requests, required
secretKey = ''

# sqlflow server; for the cloud version, the value is https://api.gudusoft.com
server = 'http://127.0.0.1'

# sqlflow api port; for the cloud version, the value is 443
port = '8165'

# For the cloud version
# server = 'https://api.gudusoft.com'
# port = '443'

# The token is generated from the user id and secret key. It is used in every API invocation.
token = GenerateToken.getToken(userId, server, port, secretKey)

# delimiter of the values in the CSV output, string, defaults to ','
delimiter = ','

# export_include_table, string
export_include_table = ''

# showConstantTable, boolean
showConstantTable = 'true'

# whether to treat the arguments of the COUNT function as direct dataflow, boolean
treatArgumentsInCountFunctionAsDirectDataflow = ''

# database type,
# dbvazuresql
# dbvbigquery
# dbvcouchbase
# dbvdb2
# dbvgreenplum
# dbvhana
# dbvhive
# dbvimpala
# dbvinformix
# dbvmdx
# dbvmysql
# dbvnetezza
# dbvopenedge
# dbvoracle
# dbvpostgresql
# dbvredshift
# dbvsnowflake
# dbvmssql
# dbvsparksql
# dbvsybase
# dbvteradata
# dbvvertica
dbvendor = 'dbvoracle'

# sql text
# sqltext = 'select * from table'
# data = GenerateLineageParam.buildSqltextParam(userId, token, delimiter, export_include_table, showConstantTable, treatArgumentsInCountFunctionAsDirectDataflow, dbvendor, sqltext)
# resp = getResult(server, port, data, '')

# sql file
sqlfile = 'test.sql'

Generate token

import json
import sys

import requests


def getToken(userId, server, port, secretKey):
    """Request an API token for the given user from the SQLFlow server."""
    if userId == 'gudu|0123456789':
        return 'token'

    url = '/api/gspLive_backend/user/generateToken'
    if 'api.gudusoft.com' in server:
        # The cloud API uses a shorter path prefix.
        url = '/gspLive_backend/user/generateToken'
    if port != '':
        url = server + ':' + port + url
    else:
        url = server + url

    payload = {'secretKey': secretKey, 'userId': userId}
    headers = {"Content-Type": "application/x-www-form-urlencoded"}

    print('start get token.')
    try:
        r = requests.post(url, data=payload, headers=headers, verify=False)
    except Exception as e:
        print('get token failed.', e)
        sys.exit(1)

    result = json.loads(r.text)
    if result['code'] == '200':
        print('get token successful.')
        return result['token']
    print(result['error'])
    return None
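The function above picks a different path prefix for the cloud API than for a docker/on-premise server. That URL selection can be factored out and checked on its own; a sketch (token_url is a hypothetical helper, not part of the SQLFlow demo):

```python
def token_url(server: str, port: str = '') -> str:
    """Build the generateToken URL, using the shorter path prefix for the cloud API."""
    if 'api.gudusoft.com' in server:
        path = '/gspLive_backend/user/generateToken'
    else:
        path = '/api/gspLive_backend/user/generateToken'
    base = server + ':' + port if port != '' else server
    return base + path

print(token_url('http://127.0.0.1', '8165'))
# http://127.0.0.1:8165/api/gspLive_backend/user/generateToken
print(token_url('https://api.gudusoft.com', '443'))
# https://api.gudusoft.com:443/gspLive_backend/user/generateToken
```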

Build Lineage based on SQL text

def buildSqltextParam(userId, token, delimiter, export_include_table, showConstantTable,
                      treatArgumentsInCountFunctionAsDirectDataflow, dbvendor, sqltext):
    data = {'dbvendor': dbvendor, 'token': token, 'userId': userId}
    if delimiter != '':
        data['delimiter'] = delimiter
    if export_include_table != '':
        data['export_include_table'] = export_include_table
    if showConstantTable != '':
        data['showConstantTable'] = showConstantTable
    if treatArgumentsInCountFunctionAsDirectDataflow != '':
        data['treatArgumentsInCountFunctionAsDirectDataflow'] = treatArgumentsInCountFunctionAsDirectDataflow
    if sqltext != '':
        data['sqltext'] = sqltext
    return data
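The same pattern of sending only the non-empty optional parameters can be expressed compactly. A sketch (build_params is a hypothetical generalization, not SQLFlow API code):

```python
def build_params(required: dict, optional: dict) -> dict:
    """Merge required parameters with only the non-empty optional ones,
    mirroring the filtering done by buildSqltextParam above."""
    data = dict(required)
    data.update({k: v for k, v in optional.items() if v != ''})
    return data

params = build_params(
    {'dbvendor': 'dbvoracle', 'token': 'the-token', 'userId': 'the-user'},
    {'delimiter': ',', 'export_include_table': '', 'showConstantTable': 'true',
     'sqltext': 'select * from t'},
)
# export_include_table is empty, so it is dropped from the request payload.
print(sorted(params))
# ['dbvendor', 'delimiter', 'showConstantTable', 'sqltext', 'token', 'userId']
```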

Upload SQL and Retrieve Lineage in CSV

import os
import sys

import requests


def getResult(userId, token, server, port, delimiter, export_include_table, showConstantTable,
              treatArgumentsInCountFunctionAsDirectDataflow, dbvendor, sqltext, sqlfile):
    """Upload SQL text or a SQL file and return the full lineage as CSV text."""
    url = "/api/gspLive_backend/sqlflow/generation/sqlflow/exportFullLineageAsCsv"
    if port != '':
        url = server + ':' + port + url
    else:
        url = server + url

    files = None
    if sqlfile != '':
        if os.path.isdir(sqlfile):
            print('The SQL file cannot be a directory.')
            sys.exit(1)
        files = {'sqlfile': open(sqlfile, 'rb')}

    data = {'dbvendor': dbvendor, 'token': token, 'userId': userId}
    # Only send the optional parameters that are actually set.
    if delimiter != '':
        data['delimiter'] = delimiter
    if export_include_table != '':
        data['export_include_table'] = export_include_table
    if showConstantTable != '':
        data['showConstantTable'] = showConstantTable
    if treatArgumentsInCountFunctionAsDirectDataflow != '':
        data['treatArgumentsInCountFunctionAsDirectDataflow'] = treatArgumentsInCountFunctionAsDirectDataflow
    if sqltext != '':
        data['sqltext'] = sqltext

    print('start get csv result from sqlflow.')
    try:
        response = requests.post(url, data=data, files=files)
    except Exception as e:
        print('get csv result from sqlflow failed.', e)
        sys.exit(1)

    print('get csv result from sqlflow successful. result : ')
    return response.text
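The endpoint returns the lineage as CSV text. A sketch of parsing that text with the standard library; the column names in the sample are illustrative only, not the exact SQLFlow export schema:

```python
import csv
import io

def parse_lineage_csv(csv_text: str, delimiter: str = ',') -> list:
    """Parse CSV text returned by exportFullLineageAsCsv into a list of row dicts."""
    reader = csv.DictReader(io.StringIO(csv_text), delimiter=delimiter)
    return list(reader)

# Illustrative payload; a real export uses SQLFlow's own column set.
sample = ("source_table,source_column,target_table,target_column\n"
          "orders,amount,sales_report,total\n")
rows = parse_lineage_csv(sample)
print(rows[0]['target_table'])  # sales_report
```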

Troubleshooting

The following issue only occurs on CentOS Stream 9; we have not seen the error on CentOS 7, CentOS Stream 8, Ubuntu 20, or Debian 11.

1. Get License fail

If you see this error after launching the docker image, first check whether the container is running correctly:

docker ps -a

If the container status is Up, enter the container with:

docker exec -it mysqlflow /bin/bash

Go to the SQLFlow jar folder:

cd wings/sqlflow/backend/lib

Try launching the jar file directly:

java -jar eureka.jar

If the jar fails to start with an error indicating insufficient memory, the docker container does not have enough resources assigned.

You can raise Docker's default limits by overriding the daemon's service configuration:

mkdir /etc/systemd/system/docker.service.d
sudo vi /etc/systemd/system/docker.service.d/override.conf

and then enter

[Service]
ExecStart=
ExecStart=/usr/bin/dockerd --default-ulimit nofile=65536:65536 -H fd://

Save the file, then reload systemd and restart Docker:

sudo systemctl daemon-reload
sudo systemctl restart docker

mysqlflow is the name of the container. For more information on container creation, see the official Docker documentation.