SQLFlow enables the customization of the data lineage in case users need to build their own visual representaiton of the query making it more comprehensible for non DBA people.
Let's take a simple query as an example:
TABLE1.A -> TABLE2.B
Get Lineage Model
All SQLFlow lineages can be represented under the lineage model structure. It can either be in Json format or in XML format. Please refer to to read the details information for the lineage mode.
There are several ways to retrieve the SQLFlow lineage mode:
Invoking REST APIs, for example will give the lineage model.
Using GSP
DIrectly copy the lineage model value from SQLFlow Web
A Json/XML format SQLFlow lineage model is now available:
dbobjs lists all data objects, and it is the data source of relationships. The target and source in relationships are all in the dbobjs.
{
"createdBy": "sqlflow",//hard coded value to distinguish the json version
"servers": [
{
"name": "DEFAULT_SERVER",//server name
"dbVendor": "dbvoracle"//db type
"supportsCatalogs": true,
"supportsSchemas": true,
"databases": [
{
"name": "DEFAULT",//database type, use default value if no database
"schemas": [
{
"name": "DEFAULT",//schema name, use default value if no schema
"tables": [
{
"id": "10",//table ID
"name": "table1",//table name
"type": "table",//type, table or view
"columns": [
{
"id": "11",// column ID
"name": "a"//column name
}
]
},
{
"id": "4",
"name": "table2",
"type": "table",
"columns": [
{
"id": "5",
"name": "b"
}
]
}
]
}
]
}
]
}
]
}
The Relationships
relationships has all data lineage relationships. The neighborhoring lineage of each field is regarded as one item. The target and source of each relationship can be found in dbobjs. You can customize your own lineage by updating this object.
[
{
"id": "1",//Unique identifier
"type": "fdd",//lineage type
"target": {//target object in the lineage, only 1 target in one lineage
"id": "5",//field identifier, column.id of the dbobjs
"column": "B",//column.name
"parentId": "4",//The table/view ID where the data field is located, corresponding to table.id of dbobjs
"parentName": "TABLE2"//The name of the table/view where the data field is located, corresponding to table.name in dbobjs
},
"sources": [//source of the lineage, can be multiple
{
"id": "11",//columns.id
"column": "A",//Data field name, corresponding to the column.name in dbobjs
"parentId": "10",//The table/view ID where the data field is located, corresponding to table.id of dbobjs
"parentName": "TABLE1"//The name of the table/view where the data field is located, corresponding to table.name in dbobjs
}
]
}
]
Note that:
Id type of the object should be int
Ensure the uniqueness of the id. Duplicate id makes error in the syntax.
Visualization
Paste the updated lineage mode(Json/XML) in the query editor of the SQLFlow Web and it generates the updated graph:
You can change the display configuration and export the image files with SQLFlow Web.
Please check if more detailed information is required.