Gudu SQLFlow Product Docs
  • 1. Introduction
    • What is Gudu SQLFlow?
      • What SQLFlow can do
      • Architecture Overview
    • Getting Started
      • Sign up a new account
        • Team Management
        • Delete My Account
        • Activate by entering a coupon
      • How to use SQLFlow
      • Different modes in Gudu SQLFlow
        • Query mode
        • Job mode
      • Basic Usage
      • Convert SQL to E-R Diagram
      • Colors in SQLFlow Diagram
      • Show call relationship
    • Installation
      • Version and Users
        • Cloud and On-Premise version
        • SQLFlow before Version 6
          • For older version SQLFlow under Linux
          • For older version SQLFlow under MacOS
          • For older version SQLFlow under Windows
      • Linux
      • MacOS
      • Windows
      • Docker
      • Clickhouse Installation
        • Clickhouse For CentOs
        • Clickhouse For Ubuntu/Debian/RHEL
      • Troubleshooting
      • Upgrade
      • Third Party Components
      • Renew License File
    • UI
      • SQLText Editor
      • Schema Explorer
      • Diagram Panel
      • Settings
      • Job Management
        • Job Sources
    • Dlineage Tool
      • Overview
      • Usage
        • Analyze data linege from SQL files
        • Analyze data linege from a database
        • Resolve the ambiguous columns in SQL query
        • Map the DataFlowAnalyzer and the settings on SQLFlow UI
        • Settings
      • Dataflow.xml structure
      • FAQ
  • 2. CONCEPTS
    • Data Lineage Basics
      • Dataflow
        • Relations generated by SQLFlow
      • Direct Dataflow
      • Indirect Dataflow
      • Aggregate function and Dataflow
      • Dataflow chain
    • Data Lineage Format Reference
  • 3. API Docs
    • Prerequisites
    • Using the Rest API
    • SQLFlow Rest API reference
      • User Interface
      • Generation Interface
        • /sqlflow
        • /sqlflow/selectedgraph/table_level_lineage
        • /sqlflow/selectedgraph/image
        • /sqlflow/graph
        • /sqlflow/graph/table_level_lineage
        • /sqlflow/graph/image
        • /sqlflow/downstreamGraph
        • /sqlflow/upstreamGraph
        • /sqlflow/erdiagramSelectGraph
        • /sqlflow/leftMostSourceTableGraph
      • Job Interface
        • /submitUserJob
        • /displayUserJobSummary
        • /displayUserJobsSummary
        • /exportLineageAsJson
        • /exportFullLineageAsJson
        • /exportLineageAsGraphml
        • /submitPersistJob
        • /displayUserLatestJobTableLevelLineage
      • Export Image
      • Export CSV
        • /sqlflow/exportFullLineageAsCsv
        • /job/exportFullLineageAsCsv
    • Swagger UI
    • Export the data lineage result
    • Python
      • Basic Usage
      • Advanced Usage
    • SQL Parser API
      • checkSyntax
  • 4. SQLFlow Widget
    • Widget Get started
    • Usages
    • Widget API Reference
  • 5. Databases
    • Database Objects
      • Azure
      • DB2
  • 6. SQLFlow-ingester
    • Introduction
      • SQLFlow-Exporter
      • SQLFlow-Extractor
      • SQLFlow-Submitter
    • Get Started
      • SQL Server
    • SQLFlow-Ingester Java API Usage
    • Understand the format of exported data
      • Oracle
      • Microsoft SQL Server
      • MySQL
      • PostgreSQL
    • List of Supported dbVendors
    • Git Repo
    • Third Party Components
  • 7. Reference
    • Lineage Model
      • Json Format Lineage Model
      • XML Format Lineage Model
      • Data Lineage Elements
    • Database Model
  • 8. other
    • FAQ
      • Handling Internal Database
      • Delete Your Account
      • Table Form Data Without Intermediates
      • Not all schema exported from Oracle
      • Lineage Customization
    • Roadmap
    • SQL Samples
      • Exchange table partition
      • Generate relationship for renamed table
      • Snowflake table function lineage detection
    • Change Logs
    • SQLFlow with Oracle XML functions
    • Major Organizations Utilizing SQLFlow
Powered by GitBook
On this page
  • 1. Top level elements
  • 2. Summary payload
  • 3. Sqlflow payload
  • 4. Dbobjs payload
  • DB Server Type
  • Procedure, Trigger and Function
  • 5. Relationship payload
  • 6. Graph payload
  • Dataflow.xml Structure
  1. 7. Reference
  2. Lineage Model

Json Format Lineage Model

PreviousLineage ModelNextXML Format Lineage Model

Last updated 2 years ago

This page gives a detail reference of the data lineage response in Json format. Json lineage model can be returned by SQLFlow server or Dlineage tool with /json flag. The SQLFlow UI also get this result from the /sqlflow/generation/sqlflow/graph endpoint. Please refer to for the XML response.

Let's get into details and check the data lineage json resposne:

1. Top level elements

{
	"code": 200,
	"data": {
		"mode": "global",
		"summary": {
		},
		"sqlflow": {
		},
		"graph": {
		}
	},
	"sessionId": "3bf4a61f5d41a016d0f91f28c2c1791d7100e1c159c31dab4e1f3bce603afd1c_1663684880303",
  	"jobId": "JobId",  
	"error": "error msgs",
}
  • code: Http Status code, 200 for OK. 4XX **** for cases in which the client seems to have erred such as no authorization or bad request. 500 for internal server error. Error messages would be present in error If the code is not 200 (request is not a success).

  • data: data payload

    • mode: data mode. Could be global or summary. Will be set to _ summary _ mode when the relation number exceeds the relation_limit

      • global show all data

      • summary only share the statics information and there's no graph information. No field data in the table and only table info. Users need to invoke to get the field data in detail.

    • : payload for statics information in summary mode

    • : data model of the analysis result

    • : graph model of the analysis result

  • sessionId: session id, used to get the cache information in

  • jobId: job id, used to get the cache informaion in

  • error: contains error messages if the status code is not 200

2. Summary payload

"summary": {
	"schema": 1,
	"process": 2,
	"database": 0,
	"view": 1,
	"mostRelationTables": [{
		"table": "SMALL_ORDERS"
	}, {
		"table": "MEDIUM_ORDERS"
	}, {
		"table": "LARGE_ORDERS"
	}],
	"column": 43,
	"relationship": 41,
	"table": 7
}
  • database: database number

  • schema: schema number

  • table: table number

  • view: view number

  • column: column number

  • relationship: relationship number

  • process: process number

  • mostRelationTables: the top three tables which contain the most relationships

3. Sqlflow payload

"sqlflow": {
	"dbobjs": {
		"createdBy": "grabit v1.7.0",
		"servers": [{}]
	},
	"relationships": [{}]
}

sqlflow payload contains two nodes. dbojbs and relationship.

4. Dbobjs payload

{
    "servers": [
        {
            "name": "default_server",
            "dbVendor": "dbvmysql",
            "supportsCatalogs": true,
            "supportsSchemas": false,
            "databases": [{
                "name": "DEFAULT",
                "tables": [{
                    "columns": [{
                        "name": "field9",
                        "id": "110"
                    }, {
                        "name": "field1",
                        "id": "111"
                    }, {
                        "name": "field5",
                        "id": "116"
                    }],
                    "id": "96",
                    "name": "table1",
                    "type": "table"
                }]
            }]
        }, {
            "dbVendor": "dbvmssql",
            "name": "default_server",
            "supportsCatalogs": true,
            "supportsSchemas": true,
            "databases": [{
                "name": "DEFAULT",
                "schemas": [{
                    "name": "HumanResources",
                    "tables": [{
                        "columns": [{
                            "name": "BusinessEntityID",
                            "id": "13"
                        }, {
                            "name": "HireDate",
                            "id": "14"
                        }],
                        "id": "12",
                        "name": "Employee",
                        "type": "table"
                    }]
                }]
            }]
        }, {
            "dbVendor": "dbvsupportschemaonly",
            "name": "DB_SERVER_SCHEMA_ONLY",
            "supportsCatalogs": false,
            "supportsSchemas": true,
            "schemas": [{
                "name": "OE",
                "tables": [{
                    "columns": [{
                        "name": "ORDER_ID",
                        "id": "57"
                    }],
                    "id": "56",
                    "name": "ORDERS",
                    "type": "table"
                }],
                "packages": [{
                    "id": "26",
                    "name": "package1",
                    "procedures": [{
                        "name": "procedure1",
                        "type": "procedure"
                    }]
                }, {
                    "id": "42",
                    "name": "package2",
                    "procedures": [{
                        "name": "procedure1",
                        "type": "procedure"
                    }]
                }]
            }]
        }, 
    ]
}

The top element of the dbobjs payload is an array and the array representing different server instances. For each server instance, we will have:

  1. dbVendor: database type

  2. name: instance name

  3. databases: present if support database

  4. schemas: present if database is not supported and only schema is supported.

  5. dbLinks: dbLinks will be present if the resposne json is generated from metadata. Will not be present If the response json is generated from dataflow

  6. queries: present if the response json is generated from metadata

DB Server Type

  1. if supportsCatalogs=true,supportsSchemas=true:

    • server-->database-->schema-->tables/views/others/packages/procedures/functions/triggers

  2. if supportsCatalogs=true,supportsSchemas=false:

    • server-->database-->tables/views/others/packages/procedures/functions/triggers

  3. if supportsCatalogs = false, supportsSchemas = true:

    • server --> schema --> tables/views/others/packages/procedures/functions/triggers

Procedure, Trigger and Function

Database node and Schema node may contain other information indicating the data of procedure, trigger, function

"others": [
    {
        "id": "85",
        "name": "INSERT-SELECT-1",
        "type": "insert-select",
        "columns": [
            {
                "id": "84",
                "name": "RelationRows",
                "coordinates": [
                    {
                        "hashCode": "0",
                        "x": 30,
                        "y": 8
                    },
                    {
                        "hashCode": "0",
                        "x": 31,
                        "y": 56
                    }
                ],
                "source": "system",
                "uiVisible": false
            }, ...
    ]

5. Relationship payload

Relationship is the atom unit of the data lineage. Relationship builds a link between the source and target column (column-level lineage).

A relation includes the type, target, sources and other attributes.

"relationships": [{
	"id": "56",
	"type": "fdd",
	"effectType": "insert",
	"target": {
		"id": "102",
		"column": "CL",
		"parentId": "52",
		"parentName": "MEDIUM_ORDERS",
		"coordinates": [{
			"hashCode": "0",
			"x": 31,
			"y": 36
		}, {
			"hashCode": "0",
			"x": 31,
			"y": 38
		}]
	},
	"sources": [{
		"id": "90",
		"column": "CL",
		"parentId": "85",
		"parentName": "INSERT-SELECT-1",
		"coordinates": [{
			"hashCode": "0",
			"x": 31,
			"y": 21
		}, {
			"hashCode": "0",
			"x": 31,
			"y": 38
		}]
	}],
	"processId": "48"
}]
  • id: relation id

  • type: relation type, could be fdd, fdr, join, or call

  • function: present if the relationship is about function

  • effectType: effect type of the relation, based on STMT

  • processId: process id by which the relation is generated

  • timestampMin: the earliest time when the relationship is generated

  • timestampMax: the latest time when the relationship is generated

RelationshipElement

  • id

  • name

  • column

  • columnType

  • sourceId

  • sourceName

  • parentId

  • parentName

  • clauseType

  • function

  • type

  • coordinates

Check here for more details

Transform

6. Graph payload

"graph": {
    "relationshipIdMap": {
        "e0": ["5519", "fdd"],
    },
    "elements": {
        "tables": [{
            "columns": [{
                "height": 16.0,
                "id": "n0::n0",
                "label": {
                    "content": "oid",
                    "fontFamily": "Segoe UI Symbol",
                    "fontSize": "12",
                    "height": 13.96875,
                    "width": 35.0,
                    "x": 0.0,
                    "y": 0.0
                },
                "width": 160.0,
                "x": 575.0,
                "y": -565.9073
            }],
	    "id": "n4",
	    "label": {
		"content": "customers",
		"fontFamily": "Segoe UI Symbol",
		"fontSize": "12",
		"height": 17.96875,
		"width": 162.0,
		"x": 0.0,
		"y": 0.0
	    },
	    "width": 162.0,
	    "height": 57.96875,
        }],
        "edges": [{
            "id": "e0",
            "sourceId": "n9::n5",
            "targetId": "n5::n5"
        }
    },
    "tooltip": {},
    "listIdMap": {
        "n0": ["68"],
        "n0::n0": ["110"],
    }
}
  • relationIdMap:

    • Mapping list between the graph ui id and the relationship id

  • listIdMap:

    • Mapping list between graph ui id and the graph model id

  • elements:

    • tables

      • id: table id, will be generated into UI model by mappings in the listIdMap

      • table: table name

      • width: table width

      • height: table height

      • x:table x-axis (horizontal) coordinate

      • y:table y-axis (vertical) coordinate

      • columns

        • id: columns id, will be generated into UI model by mappings in the listIdMap

        • x:column x-axis (horizontal) coordinate

        • y:column y-axis (vertical) coordinate

    • edges

      • id: edge id, mapped with relationship id and type through relationIdMap

      • sourceId: source column id

      • targetId: target column id

Dataflow.xml Structure

: metadata, contains information of instance, db, schema, table, view, storage procedure, function, trigger, dblink, sequence, ddl etc..

: relationships after analyzing sql

supportsCatalogs: whether support database (check for more details on this flag)

supportsSchemas: whether support schema (check for more details on this flag)

tables, columns, package, prcedure, argument, process: check for more details

tips: the above structure is same as the servers part of the metadata result from as well as .

There are tree types for the server instance (same logic ):

Check to get a full database list and the type details.

target: relation target, of structure

sources: relation sources, belongs to structure

caller: caller if the type is call, belongs to structure

callees: callees if the type is call, is an array of objects

transforms: array of

here
here
Dlineage tool
Ingester
here
here
Dataflow.xml structure
here
REST Api
Query mode
Job mode
summary
sqlflow
graph
dbojbs
relationships
RelationshipElement
RelationshipElement
RelationshipElement
RelationshipElement
Transform
here
Target Fields Data
Transform Fields Data