JSON Format Lineage Model
This page is a detailed reference for the data lineage response in JSON format. The JSON lineage model can be returned by the SQLFlow server or by the Dlineage tool with the /json flag. The SQLFlow UI also gets this result from the /sqlflow/generation/sqlflow/graph endpoint.
Please refer to here for the XML response.
Let's get into the details and walk through the data lineage JSON response:
1. Top level elements
code: HTTP status code. 200 for OK; 4XX for cases in which the client seems to have erred, such as no authorization or a bad request; 500 for an internal server error. Error messages are present in error if the code is not 200 (the request did not succeed).
data: data payload
mode: data mode, either global or summary. It is set to summary mode when the number of relations exceeds the relation_limit. In global mode, all data is shown. In summary mode, only the statistics are shared and there is no graph information: tables carry table info only, with no field data, and users need to invoke the REST API to get the field data in detail.
summary: payload for the statistics information in summary mode
sqlflow: data model of the analysis result
graph: graph model of the analysis result
sessionId: session id, used to get the cache information in Query mode
jobId: job id, used to get the cache information in Job mode
error: contains error messages if the status code is not 200
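As a minimal sketch of how a client might consume these top-level elements, the Python below checks code, surfaces error, and branches on mode. The sample response is invented for illustration, and the assumption that mode, summary, sqlflow, and graph are nested under data is not stated on this page; adjust the lookups if your response is shaped differently.

```python
import json

# Hypothetical response, invented for illustration; real payloads are larger.
SAMPLE = json.loads("""
{
  "code": 200,
  "data": {
    "mode": "global",
    "sqlflow": {"dbobjs": [], "relationships": []},
    "graph": {"elements": {"tables": [], "edges": []}}
  },
  "sessionId": "example-session"
}
""")


def read_lineage(response):
    """Return (mode, payload), or raise if the request did not succeed."""
    if response.get("code") != 200:
        # error carries the messages whenever code is not 200
        raise RuntimeError(f"lineage request failed: {response.get('error')}")
    data = response.get("data", {})  # assumption: mode and payloads sit under data
    mode = data.get("mode")
    if mode == "summary":
        # summary mode: statistics only, no graph information
        return mode, data.get("summary")
    return mode, data.get("sqlflow")


mode, payload = read_lineage(SAMPLE)
print(mode, sorted(payload))
```

Checking mode before touching sqlflow or graph keeps the caller from assuming that column-level data is present when the server has fallen back to summary mode.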
2. Summary payload
database: number of databases
schema: number of schemas
table: number of tables
view: number of views
column: number of columns
relationship: number of relationships
process: number of processes
mostRelationTables: the top three tables which contain the most relationships
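To make the summary fields concrete, the sketch below turns a summary payload into a one-line report. The sample values are made up, and the shape of the mostRelationTables entries is not described on this page, so the code leaves that field untouched.

```python
# Count fields of the summary payload, in the order they are documented.
COUNT_FIELDS = ["database", "schema", "table", "view",
                "column", "relationship", "process"]

# Hypothetical summary payload; the numbers are invented.
summary = {
    "database": 1, "schema": 2, "table": 14, "view": 3,
    "column": 120, "relationship": 310, "process": 9,
    "mostRelationTables": [],
}


def describe_summary(payload):
    """Format the documented counts, treating missing fields as 0."""
    parts = [f"{name}={payload.get(name, 0)}" for name in COUNT_FIELDS]
    return ", ".join(parts)


print(describe_summary(summary))
```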
3. Sqlflow payload
The sqlflow payload contains two nodes: dbobjs and relationships.
dbobjs: metadata; contains information about instances, databases, schemas, tables, views, stored procedures, functions, triggers, dblinks, sequences, DDL, etc.
relationships: relationships produced by analyzing the SQL
4. Dbobjs payload
The top element of the dbobjs payload is an array whose entries represent different server instances. For each server instance, we have:
dbVendor: database type
name: instance name
supportsCatalogs: whether the server supports databases (check here for more details on this flag)
supportsSchemas: whether the server supports schemas (check here for more details on this flag)
databases: present if databases are supported
schemas: present if databases are not supported and only schemas are supported
dbLinks: present if the response JSON is generated from metadata; not present if the response JSON is generated from dataflow
queries: present if the response JSON is generated from metadata
tables, columns, package, procedure, argument, process: check here for more details
Tip: the above structure is the same as the servers part of the metadata result from the Dlineage tool as well as Ingester.
DB Server Type
There are three types of server instance (same logic here):
if supportsCatalogs=true, supportsSchemas=true:
server --> database --> schema --> tables/views/others/packages/procedures/functions/triggers
if supportsCatalogs=true, supportsSchemas=false:
server --> database --> tables/views/others/packages/procedures/functions/triggers
if supportsCatalogs=false, supportsSchemas=true:
server --> schema --> tables/views/others/packages/procedures/functions/triggers
Check here to get a full database list and the type details.
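The three layouts can be handled with one small traversal. The sketch below is an illustration only: it assumes each database, schema, and table object carries a name field and that tables are held in a tables array, which this page implies but does not spell out. The sample server instances and their values are invented.

```python
def iter_tables(server):
    """Yield (database, schema, table) name tuples for one server instance."""
    catalogs = server.get("supportsCatalogs", False)
    schemas = server.get("supportsSchemas", False)
    if catalogs:
        for db in server.get("databases", []):
            if schemas:
                # server --> database --> schema --> tables
                for sc in db.get("schemas", []):
                    for tb in sc.get("tables", []):
                        yield db.get("name"), sc.get("name"), tb.get("name")
            else:
                # server --> database --> tables
                for tb in db.get("tables", []):
                    yield db.get("name"), None, tb.get("name")
    elif schemas:
        # server --> schema --> tables
        for sc in server.get("schemas", []):
            for tb in sc.get("tables", []):
                yield None, sc.get("name"), tb.get("name")


# Hypothetical schema-only server instance, invented for illustration.
server = {
    "dbVendor": "dbvoracle",
    "name": "DEFAULT_SERVER",
    "supportsCatalogs": False,
    "supportsSchemas": True,
    "schemas": [{"name": "HR",
                 "tables": [{"name": "EMPLOYEES"}, {"name": "JOBS"}]}],
}

tables = list(iter_tables(server))
print(tables)
```

Returning None for the missing level keeps the tuple shape identical across the three server types, so downstream code does not need to branch on the flags again.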
Procedure, Trigger and Function
Database nodes and schema nodes may contain other information indicating the data of procedures, triggers, and functions.
5. Relationship payload
A relationship is the atomic unit of data lineage. It builds a link between the source and target columns (column-level lineage).
A relation includes the type, target, sources, and other attributes.
id: relation id
type: relation type; could be fdd, fdr, join, or call
function: present if the relationship is about function
effectType: effect type of the relation, based on STMT
target: relation target, of RelationshipElement structure
sources: relation sources, of RelationshipElement structure
caller: caller if the type is call, of RelationshipElement structure
callees: callees if the type is call, an array of RelationshipElement objects
processId: id of the process by which the relation is generated
timestampMin: the earliest time when the relationship is generated
timestampMax: the latest time when the relationship is generated
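Because every relationship links source columns to a target column, column-level lineage can be rebuilt by indexing relationships by target. The sketch below does that for one relation type at a time. It assumes sources is an array of RelationshipElement objects and that parentName plus column are enough to label a column; the two sample relationships are invented.

```python
from collections import defaultdict


def column_key(element):
    """Build a 'table.column' label from a RelationshipElement-like dict."""
    return f"{element.get('parentName')}.{element.get('column')}"


def upstream_map(relationships, relation_type="fdd"):
    """Map each target column to the sorted list of its direct sources."""
    result = defaultdict(set)
    for rel in relationships:
        if rel.get("type") != relation_type:
            continue  # skip relations of any other type
        target = column_key(rel["target"])
        for source in rel.get("sources", []):
            result[target].add(column_key(source))
    return {target: sorted(srcs) for target, srcs in result.items()}


# Hypothetical relationships, invented for illustration.
relationships = [
    {"id": "1", "type": "fdd",
     "target": {"parentName": "report", "column": "total"},
     "sources": [{"parentName": "orders", "column": "price"},
                 {"parentName": "orders", "column": "qty"}]},
    {"id": "2", "type": "fdr",
     "target": {"parentName": "report", "column": "total"},
     "sources": [{"parentName": "orders", "column": "status"}]},
]

print(upstream_map(relationships))
```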
RelationshipElement
id
name
column
columnType
sourceId
sourceName
transforms: array of Transform
parentId
parentName
clauseType
function
type
coordinates
Check here for more details
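For typed access, the field list above can be mirrored in a small data class. This is a sketch rather than an official model: the field names come from the list, but the types and defaults are assumptions, and the sample element is invented.

```python
from dataclasses import dataclass, field, fields
from typing import Any, List, Optional


@dataclass
class RelationshipElement:
    # Field names follow the documented list; the types are assumptions.
    id: Optional[str] = None
    name: Optional[str] = None
    column: Optional[str] = None
    columnType: Optional[str] = None
    sourceId: Optional[str] = None
    sourceName: Optional[str] = None
    transforms: List[Any] = field(default_factory=list)
    parentId: Optional[str] = None
    parentName: Optional[str] = None
    clauseType: Optional[str] = None
    function: Optional[str] = None
    type: Optional[str] = None
    coordinates: Any = None

    @classmethod
    def from_dict(cls, raw):
        """Keep only documented keys so unknown extras do not raise."""
        known = {f.name for f in fields(cls)}
        return cls(**{k: v for k, v in raw.items() if k in known})


# Hypothetical element, invented for illustration.
element = RelationshipElement.from_dict(
    {"id": "7", "column": "price", "parentName": "orders", "extra": True}
)
print(element.parentName, element.column)
```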
Transform
6. Graph payload
relationIdMap: mapping list between the graph UI id and the relationship id
listIdMap: mapping list between the graph UI id and the graph model id
elements:
tables
id: table id, will be generated into UI model by mappings in the listIdMap
table: table name
width: table width
height: table height
x: table x-axis (horizontal) coordinate
y: table y-axis (vertical) coordinate
columns
id: column id, will be generated into UI model by mappings in the listIdMap
x: column x-axis (horizontal) coordinate
y: column y-axis (vertical) coordinate
edges
id: edge id, mapped with relationship id and type through relationIdMap
sourceId: source column id
targetId: target column id
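For a consumer that only needs connectivity, the edges list is enough to rebuild the graph. The sketch below builds a source-to-targets adjacency map. It assumes edges is an array under elements with the id, sourceId, and targetId fields listed above, and it uses invented ids. The exact shape of relationIdMap and listIdMap is not described on this page, so they are not used.

```python
from collections import defaultdict


def edge_adjacency(graph):
    """Map each source column id to the sorted ids of the columns it feeds."""
    adjacency = defaultdict(set)
    for edge in graph.get("elements", {}).get("edges", []):
        adjacency[edge["sourceId"]].add(edge["targetId"])
    return {src: sorted(tgts) for src, tgts in adjacency.items()}


# Hypothetical graph payload with made-up ids.
graph = {
    "elements": {
        "tables": [],
        "edges": [
            {"id": "e1", "sourceId": "c1", "targetId": "c3"},
            {"id": "e2", "sourceId": "c2", "targetId": "c3"},
            {"id": "e3", "sourceId": "c1", "targetId": "c4"},
        ],
    }
}

print(edge_adjacency(graph))
```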