
Job Sources

A job's source can be one of the following:

Upload file

Upload file: The data lineage will be generated from your SQL metadata files (a DDL file, for example). You can upload multiple SQL files by compressing them into a single zip file and uploading that zip. You can also include the DDL file, or the JSON result generated by the Ingester, in the zip to resolve ambiguities (check here for more details about ambiguity in data lineage).

Check our Ingester tool if you would like to take this approach but don't have a compatible SQL metadata file.

Note: The uploaded file must be less than 200MB.

Read more about the settings section here.
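
The sketch below shows one way to submit such a zip programmatically. The /submitUserJob endpoint appears in the API reference, but the host, path prefix, token handling, and form-field names used here are illustrative assumptions; check the API docs for the exact signature.

```python
# Minimal sketch: submit a zip of SQL files as a new SQLFlow job.
# Host, token, and field names are assumptions for illustration only.
import requests

SQLFLOW_HOST = "https://your-sqlflow-server"   # hypothetical host
TOKEN = "<your-api-token>"

with open("ddl_scripts.zip", "rb") as f:
    resp = requests.post(
        f"{SQLFLOW_HOST}/submitUserJob",       # path prefix may differ per deployment
        data={"token": TOKEN, "jobName": "nightly-ddl", "dbvendor": "dbvoracle"},
        files={"sqlfiles": f},
        timeout=60,
    )
resp.raise_for_status()
print(resp.json())   # the returned job id can then be polled via /displayUserJobSummary
```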

From database

From database: SQLFlow is able to read information directly from the database server and analyze your data lineage. Connection information is required in this mode.

The schemas should be given as full paths. In Oracle, the schema name is the same as the database name, so giving only the schema name is sufficient there. Read more about the advanced section here.

Read more about the settings section here.
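
As a rough illustration, the connection information for this mode looks like the following; the key names are assumptions, not SQLFlow's exact schema, so consult the settings section for the fields it actually expects.

```python
# Illustrative connection settings for a "from database" job.
# Key names are assumptions, not SQLFlow's exact schema.
connection = {
    "dbvendor": "dbvpostgresql",
    "hostname": "db.example.com",
    "port": 5432,
    "username": "lineage_reader",
    "password": "<secret>",
    # Schemas given as full paths (for Oracle, the schema name alone is enough):
    "extractedSchemas": "sales_db.public,sales_db.reporting",
}
```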

Upload file + database metadata

Upload file + database metadata: Use the two methods above together to create the job. The database metadata will be used to resolve ambiguities in the uploaded file (check here for more details about ambiguity in data lineage). Queries in the database metadata are not analyzed, because in this scenario the sole purpose of the metadata is to eliminate ambiguity.

Note: The uploaded file must be less than 200MB.

Read more about the settings section here.

Read more about the advanced section here.
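
To see why metadata resolves ambiguity, consider a query whose column ownership cannot be determined from the SQL text alone; the table and column names below are made up for illustration.

```python
# Without table definitions, the analyzer cannot tell which table owns "amount".
ambiguous_sql = """
SELECT amount                 -- orders.amount or refunds.amount?
FROM   orders o
JOIN   refunds r ON r.order_id = o.id
"""
# Supplying the tables' DDL, or metadata exported by the Ingester,
# alongside the query lets the analyzer bind "amount" to its owning table.
```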

Dbt

dbt: Read data lineage from your dbt project. dbt is an ETL tool for data transformation, and data lineage can be retrieved from dbt's build artifacts.

The manifest.json and catalog.json files are required. Both can be found under the target folder of your dbt transformation project.
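
As a minimal sketch, the two artifacts can be collected like this; the project path is hypothetical, and both files are written into target/ by dbt (catalog.json by dbt docs generate).

```python
# Bundle the two dbt artifacts SQLFlow needs into one zip.
import zipfile
from pathlib import Path

target = Path("my_dbt_project/target")   # hypothetical dbt project path
with zipfile.ZipFile("dbt_artifacts.zip", "w") as zf:
    for name in ("manifest.json", "catalog.json"):
        zf.write(target / name, arcname=name)
```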

RedShift log

Redshift log: Read from your Redshift log. Amazon Redshift is a cloud data warehouse. Provide your Redshift log in .gz or .zip format; you may combine multiple .gz files into one .zip file, and nested folder structures inside the .zip are supported. You can also supply your database's metadata.json as a supplementary source.
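
A minimal sketch of bundling several .gz log files into one .zip while keeping their folder structure (the local log directory name is hypothetical):

```python
# Combine .gz Redshift logs into a single zip; folder nesting is preserved.
import zipfile
from pathlib import Path

log_root = Path("redshift_logs")   # hypothetical local log directory
with zipfile.ZipFile("redshift_logs.zip", "w") as zf:
    for gz in log_root.rglob("*.gz"):
        zf.write(gz, arcname=gz.relative_to(log_root))
```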

Snowflake query history

Snowflake query history: Read from your Snowflake query history. Snowflake is a SaaS database platform, and SQLFlow can derive data lineage from your query history. Because the query history keeps growing as the database is used, it is best to fetch your query history data regularly. Use the enableQueryHistory flag to choose whether to fetch from the query history. The flags below control what is fetched; a combined example follows their descriptions.

enableQueryHistory: Fetch SQL queries from the query history if set to true.

blockOfTimeInMinutes: When enableQueryHistory is set to true, the time window over which SQL queries are extracted from the query history. Defaults to 30 minutes.

queryHistorySqlType: Specifies which kinds of SQL statements are sent to SQLFlow for further processing after the queries are fetched from the Snowflake query history. If queryHistorySqlType is specified, only SQL statements of those types are picked up and sent to SQLFlow. This can be useful when you only need to discover lineage from a specific type of SQL statement. If queryHistorySqlType is empty, all queries fetched from the query history are sent to the SQLFlow server.

excludedHistoryDbsSchemas: A comma-separated list of databases and schemas to exclude from extraction, e.g. database1.schema1,database2.

duplicateQueryHistory: Whether to filter out duplicate queries in the query history.

snowflakeDefaultRole: The role used when connecting to the Snowflake database. You must define a role that has access to the SNOWFLAKE database and grant WAREHOUSE permission to that role.
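
Taken together, the flags might be set like this. Only the flag names come from the descriptions above; the surrounding payload structure and the example values are assumptions.

```python
# Illustrative Snowflake query-history settings for a SQLFlow job.
snowflake_settings = {
    "enableQueryHistory": True,
    "blockOfTimeInMinutes": 30,                     # default fetch window
    "queryHistorySqlType": "SELECT,INSERT,MERGE",   # illustrative value; empty = all types
    "excludedHistoryDbsSchemas": "db1.schema1,db2", # comma-separated
    "duplicateQueryHistory": True,                  # drop duplicate queries
    "snowflakeDefaultRole": "LINEAGE_ROLE",         # needs access to the SNOWFLAKE db
}
```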

