Gudu SQLFlow Product Docs

Job mode

https://e.gitee.com/gudusoft/projects/151613/docs/655632/file/1546243?sub_id=5928451

Last updated 1 month ago

SQLFlow Job mode is designed for analyzing large numbers of SQL scripts, or for connecting directly to the target database server, to perform data lineage analysis. The configuration parameters for the analysis must be supplied when a new SQLFlow Job is created; they cannot be changed once the job is submitted.

Job type

There are two job types:

  • Simple Job (Standard Job)

  • Regular Job (Incremental Job)

Both Simple Job and Regular Job can read large numbers of SQL files or analyze a database directly, but they differ in the ways listed below.

Simple Job

  • Some configs can be set when the job is created. Once the job is submitted, they cannot be changed

  • The result is persisted to the file system as files

  • Left Most analysis is available (Left Most: given a->b->c, show a->c)

  • The data lineage result is saved in a dataflow.xml file

dataflowOfAggregateFunction: direct (Configurable Parameter)
ignoreFunction: true (Configurable Parameter)
ignoreRecordSet: true (Configurable Parameter)
showConstantTable: (Configurable Parameter)
showRelationType: fdd (Configurable Parameter)
showTransform: false (Configurable Parameter)

simpleOutput: false (Not configurable)
hideColumn: false (Not supported by the frontend UI)
normalizeIdentifier: true (Not configurable)
showLinkOnly: true (Not configurable)

Configurable Parameter: a parameter that can be set before the job is created. Such a parameter cannot be changed once the job exists; to use a different value, create a new job.
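The Left Most analysis mentioned above (given a->b->c, show a->c) can be illustrated with a small graph walk. This is only a sketch of the idea, not SQLFlow's implementation: the function name `left_most_sources` and the edge-list input format are hypothetical and chosen purely for illustration.

```python
from collections import defaultdict


def left_most_sources(edges):
    """For every node that has an upstream, return the set of left-most
    (root) sources that eventually flow into it.

    `edges` is a list of (source, target) pairs, e.g. [("a", "b"), ("b", "c")].
    """
    upstream = defaultdict(set)
    nodes = set()
    for src, dst in edges:
        upstream[dst].add(src)
        nodes.update((src, dst))

    def roots(node, seen):
        parents = upstream.get(node)
        if not parents:
            # No upstream at all: this node is itself a left-most source.
            return {node}
        found = set()
        for parent in parents:
            if parent not in seen:  # guard against cycles
                found |= roots(parent, seen | {parent})
        return found

    return {n: roots(n, {n}) for n in nodes if upstream.get(n)}


chain = [("a", "b"), ("b", "c")]
result = left_most_sources(chain)
print(result["c"])  # the intermediate node b is skipped
```

For the chain a->b->c, the entry for `c` contains only `a`, which corresponds to the collapsed relation a->c.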

Regular Job

  • Data lineage configs cannot be set

  • Supports table-level lineage

  • The result is stored in the database

  • Supports incremental analysis: SQL scripts or a database can be analyzed in batches

  • Left Most analysis is available (Left Most: given a->b->c, show a->c)

  • Upstream and Downstream analysis are available (given a->b->c, Upstream: a->b, Downstream: b->c)

  • The data lineage result is saved in the database

default distance: (This parameter can be modified even after the job is created)
showConstantTable: (Configurable Parameter)
dataflowOfAggregateFunction: direct (Not configurable)
ignoreFunction: true (Not configurable)
ignoreRecordSet: true (Not configurable)
showRelationType: fdd (Not configurable)
showTransform: false (Not configurable)

hideColumn: false (Not supported by the frontend UI)
normalizeIdentifier: true (Not configurable)
showLinkOnly: true (Not configurable)
simpleOutput: false (Not configurable)

Configurable Parameter: a parameter that can be set before the job is created. Such a parameter cannot be changed once the job exists; to use a different value, create a new job.
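The Upstream and Downstream analysis listed for Regular Job (given a->b->c, Upstream: a->b, Downstream: b->c) can be pictured as a traversal from a chosen node in one direction or the other. The sketch below assumes the node of interest is `b`; the function name `reachable_edges` and the edge-list format are hypothetical, and this is not how SQLFlow itself computes the result.

```python
def reachable_edges(edges, start, forward=True):
    """Collect the edges reachable from `start`.

    forward=True follows edges source -> target (downstream);
    forward=False follows them target -> source (upstream).
    """
    adjacency = {}
    for src, dst in edges:
        key = src if forward else dst
        adjacency.setdefault(key, []).append((src, dst))

    result, stack, visited = [], [start], {start}
    while stack:
        node = stack.pop()
        for src, dst in adjacency.get(node, []):
            result.append((src, dst))
            nxt = dst if forward else src
            if nxt not in visited:
                visited.add(nxt)
                stack.append(nxt)
    return result


chain = [("a", "b"), ("b", "c")]
print(reachable_edges(chain, "b", forward=False))  # upstream of b
print(reachable_edges(chain, "b", forward=True))   # downstream of b
```

Each node is expanded once, so the walk terminates even if the lineage graph contains a cycle.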

Job parameters and UI Settings parameters

showConstantTable has the same effect as show constant

ignoreFunction has the same effect as show function

ignoreRecordSet has the same effect as show intermediate recordset

showTransform has the same effect as show transform

dataflowOfAggregateFunction has a similar effect to dataflow of count function
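The correspondence above can be kept as a small lookup table, for example when translating a job configuration into the labels a user sees in the UI. The dictionary below restates only the pairs listed in this section; the constant name `JOB_PARAM_TO_UI_SETTING` and the helper `ui_setting_for` are hypothetical and not part of any SQLFlow API.

```python
# Job parameter -> UI setting label, as listed in this section.
JOB_PARAM_TO_UI_SETTING = {
    "showConstantTable": "show constant",
    "ignoreFunction": "show function",
    "ignoreRecordSet": "show intermediate recordset",
    "showTransform": "show transform",
    # Described as having a similar, not identical, effect.
    "dataflowOfAggregateFunction": "dataflow of count function",
}


def ui_setting_for(job_param):
    """Return the matching UI setting label, or None if the section lists none."""
    return JOB_PARAM_TO_UI_SETTING.get(job_param)


print(ui_setting_for("ignoreRecordSet"))
print(ui_setting_for("simpleOutput"))  # no UI counterpart is listed
```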

Summary Result

Users can request a specific database, schema, table or view, but the result is returned in Summary mode if the number of results exceeds 1,000.

Some of the above parameters cover the parameters in the UI Settings section.

When using the SQLFlow Front UI, all data lineage is returned if the number of relationships is less than 1,000. If the number is more than 1,000, only the database, schema, table and view data, together with the counts of those DB units, are returned. No relationship data is returned in that case; this is called a Summary result.

Frontend page rendering has a performance limit, and a graph of that size is also hard to read, so the restriction is imposed on the UI endpoint (/sqlflow/generation/sqlflow/graph). Requests to /sqlflow/generation/sqlflow in the SQLFlow Rest API have no such limitation.
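The Summary result rule described above can be sketched as a simple threshold check. Everything in this example is an assumption made for illustration: the function name `build_ui_response`, the response field names, and the choice to treat exactly 1,000 relationships as a full result (the text only specifies "less than" and "more than" 1,000). It does not reproduce the actual SQLFlow response format.

```python
SUMMARY_THRESHOLD = 1000  # limit stated in this section


def build_ui_response(relationships, db_objects):
    """Return a full result, or a summary without relationship data when
    the number of relationships exceeds the threshold.

    Field names are hypothetical, not the real SQLFlow response schema.
    """
    if len(relationships) > SUMMARY_THRESHOLD:
        return {
            "mode": "summary",
            "objects": db_objects,           # database, schema, table, view data
            "objectCount": len(db_objects),  # number of those DB units
        }
    return {
        "mode": "full",
        "objects": db_objects,
        "relationships": relationships,
    }


small = build_ui_response([("a", "b")], ["a", "b"])
large = build_ui_response([("t", "t")] * 1001, ["t"])
print(small["mode"], large["mode"])
```

In the large case the response carries no `relationships` key at all, mirroring the statement that no relationship data is returned in a Summary result.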