Job Interface
https://github.com/sqlparser/sqlflow_public/blob/master/api/sqlflow_api.md#sqlflow-user-job-interface
1. Submit a job
Call this API with your SQL files to get a result that includes the data lineage. A SQLFlow job accepts either multiple SQL files or a single zip archive.
Example in Curl
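A minimal sketch of the request. The server host is a placeholder, and the multipart field names (notably sqlfiles) are assumptions based on the parameter reference later in this page; verify them against your deployment.

```shell
# Submit two SQL files as one job (host is a placeholder; the
# "sqlfiles" field name is an assumption from the parameter reference).
curl -X POST "https://<sqlflow-server>/sqlflow/job/submitUserJob" \
     -H "Content-Type:multipart/form-data" \
     -F "userId=<your user id>" \
     -F "token=<your token>" \
     -F "dbvendor=dbvoracle" \
     -F "jobName=my_lineage_job" \
     -F "sqlfiles=@/path/to/first.sql" \
     -F "sqlfiles=@/path/to/second.sql"
```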
Note:
-H "Content-Type:multipart/form-data" is required
Add @ before the file path
Sample response:
Please record the jobId field.
2. Get job status
Example in Curl
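A sketch using the /sqlflow/job/displayUserJobSummary endpoint documented later in this page. The host is a placeholder; substitute the jobId recorded from the submit response.

```shell
# Query the status and summary of a single job by its jobId.
curl -X POST "https://<sqlflow-server>/sqlflow/job/displayUserJobSummary" \
     -H "Content-Type:multipart/form-data" \
     -F "userId=<your user id>" \
     -F "token=<your token>" \
     -F "jobId=<jobId from the submit response>"
```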
Sample response:
Example in Curl
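This second example presumably corresponds to the /sqlflow/job/displayUserJobsSummary endpoint documented below, which returns the status and summary of all jobs, history included (host is a placeholder):

```shell
# List the status and summary of all jobs for this user.
curl -X POST "https://<sqlflow-server>/sqlflow/job/displayUserJobsSummary" \
     -H "Content-Type:multipart/form-data" \
     -F "userId=<your user id>" \
     -F "token=<your token>"
```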
Sample Response:
3. Export data lineage
When the job status is success, you can export the data lineage in json, csv, or graphml format.
Example in Curl
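A sketch against the /sqlflow/job/exportLineageAsJson endpoint documented below. The host is a placeholder; substitute your own credentials and jobId.

```shell
# Export the lineage of a completed job as JSON.
curl -X POST "https://<sqlflow-server>/sqlflow/job/exportLineageAsJson" \
     -H "Content-Type:multipart/form-data" \
     -F "userId=<your user id>" \
     -F "token=<your token>" \
     -F "jobId=<jobId>" \
     -o lineage.json
```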
Note:
To get the table-to-table relation only, add the option -F "tableToTable=true"
Sample Response is a file in Json format:
Example in Curl
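A sketch against the /sqlflow/job/exportFullLineageAsCsv endpoint documented below (host is a placeholder):

```shell
# Export the full lineage of a completed job as CSV.
curl -X POST "https://<sqlflow-server>/sqlflow/job/exportFullLineageAsCsv" \
     -H "Content-Type:multipart/form-data" \
     -F "userId=<your user id>" \
     -F "token=<your token>" \
     -F "jobId=<jobId>" \
     -o lineage.csv
```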
Note:
To get the table-to-table relation only, add the option -F "tableToTable=true"
To change the CSV delimiter, add the option -F "delimiter=<delimiter char>"
Example in Curl
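A sketch against the /sqlflow/job/exportLineageAsGraphml endpoint documented below (host is a placeholder):

```shell
# Export the lineage of a completed job as GraphML.
curl -X POST "https://<sqlflow-server>/sqlflow/job/exportLineageAsGraphml" \
     -H "Content-Type:multipart/form-data" \
     -F "userId=<your user id>" \
     -F "token=<your token>" \
     -F "jobId=<jobId>" \
     -o lineage.graphml
```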
Note:
To get the table-to-table relation only, add the option -F "tableToTable=true"
1. Submit a regular job
Call this API with your SQL files to get a result that includes the data lineage. A SQLFlow job accepts either multiple SQL files or a single zip archive.
Set incremental=true if the job is incremental.
jobId should be null for the first submit; please note down the jobId field from the response message.
jobId cannot be null for subsequent submits; give the jobId returned in the first submit response.
Example in Curl
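A sketch of the first submit against the /sqlflow/job/submitPersistJob endpoint documented below. The host is a placeholder, and the sqlfiles field name is an assumption; no jobId is sent on the first submit.

```shell
# First submit of a persist job: omit jobId and record it from the response.
curl -X POST "https://<sqlflow-server>/sqlflow/job/submitPersistJob" \
     -H "Content-Type:multipart/form-data" \
     -F "userId=<your user id>" \
     -F "token=<your token>" \
     -F "dbvendor=dbvoracle" \
     -F "jobName=my_persist_job" \
     -F "sqlfiles=@/path/to/first.sql"
```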
Incremental submit in Curl
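A sketch of a subsequent incremental submit, assuming the same placeholder host and field names; this time incremental=true and the jobId from the first response are required.

```shell
# Incremental submit: pass incremental=true and the jobId
# returned by the first submit.
curl -X POST "https://<sqlflow-server>/sqlflow/job/submitPersistJob" \
     -H "Content-Type:multipart/form-data" \
     -F "userId=<your user id>" \
     -F "token=<your token>" \
     -F "dbvendor=dbvoracle" \
     -F "incremental=true" \
     -F "jobId=<jobId from the first submit response>" \
     -F "sqlfiles=@/path/to/changed.sql"
```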
Note:
-H "Content-Type:multipart/form-data" is required
Add @ before the file path
Return data:
Please record the jobId field for further use.
Submit a simple SQLFlow job: send the SQL files and get the data lineage result. A SQLFlow job accepts either multiple SQL files or a single zip archive.
/sqlflow/job/submitUserJob
account name of the target database
CSV format code, describing the layout of a CSV metadata file: for each of Catalog, Schema, ObjectType, ObjectName, ObjectCode, and Notes, give the number of the corresponding column in the CSV file, or 0 if that column does not exist. Check https://www.gudusoft.com/blog/2021/09/05/sqlflow-csv/ for more detail
database metadata in the jdbc string
database vendor
dbvazuresql, dbvbigquery, dbvcouchbase, dbvdb2, dbvgreenplum, dbvhana, dbvhive, dbvimpala, dbvinformix, dbvmdx, dbvmysql, dbvnetezza, dbvopenedge, dbvoracle, dbvpostgresql, dbvredshift, dbvsnowflake, dbvmssql, dbvsparksql, dbvsybase, dbvteradata, dbvvertica
default database when there is no metadata
default schema
This parameter works on the result set filtered by extractedDbsSchemas. A list of databases and schemas to exclude from extraction, separated by commas: database1/schema1,database2 or database1.schema1,database2. When the database parameter is filled in, this parameter is treated as a schema. Wildcard characters are supported, such as database1/*,*/schema,*/*.
List of databases and schemas to extract, separated by commas, provided in the format database/schema, or blank to extract all databases: database1/schema1,database2/schema2,database3 or database1.schema1,database2.schema2,database3. When the database parameter is filled in, this parameter is treated as a schema. Wildcard characters are supported, such as database1/*,*/schema,*/*.
A list of stored procedures under the specified database and schema to extract, separated by commas, provided in the format database.schema.procedureName or schema.procedureName, or blank to extract all databases; expressions are supported: database1.schema1.procedureName1,database2.schema2.procedureName2,database3.schema3,database4 or database1/schema1/procedureName1,database2/schema2
A list of views under the specified database and schema to extract, separated by commas, provided in the format database.schema.viewName or schema.viewName, or blank to extract all databases; expressions are supported: database1.schema1.viewName1,database2.schema2.viewName2,database3.schema3,database4 or database1/schema1/viewName1,database2/schema2
db hostname
db password
db port
whether ignore Function
whether ignore Record Set
jobName
character that encloses the SQL content; default is "
the string used to escape objectCodeEncloseChar inside the SQL; default is "
whether set the job on the top
whether to process the job in parallel. Parallel processing yields higher performance but decreases lineage accuracy.
whether show constant table
relation types to show; optional, default value is 'fdd'; multiple values separated by commas, e.g. fdd,frd,fdr,join. Available values: 'fdd' (the value of the target column comes from the source column), 'frd' (the record-set count of the target column is affected by the value of the source column), 'fdr' (the value of the target column is affected by the record-set count of the source column), 'join' (combines rows from two or more tables based on a related column between them)
whether show Transform
sql source
The token is generated from userid and usersecret; it is used in every API invocation.
whether to treat the arguments of the COUNT function as direct dataflow
the user id of sqlflow web or client
display the specific user job summary
/sqlflow/job/displayUserJobSummary
job id
job name
The token is generated from userid and usersecret; it is used in every API invocation.
the user id of sqlflow web or client
Get the status and summary of all jobs, including history jobs
/sqlflow/job/displayUserJobsSummary
The token is generated from userid and usersecret; it is used in every API invocation.
user id
export sqlflow lineage as json format
/sqlflow/job/exportLineageAsJson
column to export
source database
schema
table
whether ignore function
false
whether ignore record set
false
job to export; the user's latest job is returned if empty
whether show constant table
whether show link only
relation types to show; optional, default value is 'fdd'; multiple values separated by commas, e.g. fdd,frd,fdr,join. Available values: 'fdd' (the value of the target column comes from the source column), 'frd' (the record-set count of the target column is affected by the value of the source column), 'fdr' (the value of the target column is affected by the record-set count of the source column), 'join' (combines rows from two or more tables based on a related column between them)
whether show Transform
simple output that ignores intermediate results; default is false.
false
whether show table to table relation only
false
The token is generated from userid and usersecret; it is used in every API invocation.
whether to treat the arguments of the COUNT function as direct dataflow
the user id of sqlflow web or client
export full sqlflow lineage as csv format
/sqlflow/job/exportFullLineageAsCsv
delimiter of the values in CSV, default would be ','
job to export; the user's latest job is returned if empty
whether show table to table relation only
The token is generated from userid and usersecret; it is used in every API invocation.
the user id of sqlflow web or client
export sqlflow lineage as graphml format
/sqlflow/job/exportLineageAsGraphml
column
database
schema
table
whether ignore function
false
whether ignore record set
false
job to export; the user's latest job is returned if empty
whether show constant table
whether show link only
relation types to show; optional, default value is 'fdd'; multiple values separated by commas, e.g. fdd,frd,fdr,join. Available values: 'fdd' (the value of the target column comes from the source column), 'frd' (the record-set count of the target column is affected by the value of the source column), 'fdr' (the value of the target column is affected by the record-set count of the source column), 'join' (combines rows from two or more tables based on a related column between them)
whether show Transform
simple output that ignores intermediate results; default is false.
false
whether show table to table relation only
false
The token is generated from userid and usersecret; it is used in every API invocation.
whether to treat the arguments of the COUNT function as direct dataflow
the user id of sqlflow web or client
submit persist or incremental job
/sqlflow/job/submitPersistJob
account name of the target database
db password
db port
source database
whether schedulable
cron expression to schedule time for the job to be executed
0 1 * * *
database dbvendor
default database when there is no metadata
default schema
extra
whether this is the first submit
db hostname
whether incremental job
jobId should be null for the first submit; please note down the jobId field from the response message. jobId cannot be null for subsequent submits; give the jobId returned in the first submit response.
jobName
whether set the job on the top
sqlsource
The token is generated from userid and usersecret; it is used in every API invocation.
userId