当前位置:   article > 正文

基于alibaba druid的血缘解析工具_datablau 血缘解析工具

datablau 血缘解析工具

基于alibaba druid的血缘解析

1、前言

SQL血缘解析指的是通过sql语句来分析该语句中表与表之间的依赖关系和字段和字段之间的依赖关系。

本血缘解析工具仅仅对mysql数据库的select查询语句进行了血缘解析(后续会持续扩展),该血缘解析包含了原始表字段、临时表字段和目标表字段的关联关系。

2、涉及到技术

主要使用了druid对sql的语法进行解析,涉及到的语法解析接口如下:

(1)SQLStatement

该接口定义了数据库DML(insert、update、delete)和DQL(select)的相关规范。

  • SQLSelectStatement
(2)sqlselectquery

改接口可分为联合查询和单表查询两种类型,通过该阶段可以获得字段信息(包括字段名、字段别名、字段所属表:通过sqlexpr获取)和字段来源信息即SQLTableSource

  • SQLUnionQuery
    表示sql为 联合查询,例如:
select name from student union select name from employees; 
  • 1
  • SQLSelectQueryBlock
    表示sql为单表查询,例如:
select * from student;
  • 1
  • MySqlSelectQueryBlock
    表示sql为单表查询,例如:
select * from student;
  • 1
(3)SQLTableSource

通过sqlselectquery的from方法获取,有如下形式:

  • SQLExprTableSource
    表示from后面是一个真实的表名(非虚拟表),可以理解为该类表示单表查询。在sql解析中可以在该类中获取sql中真实的表名(sql语句若干单表查询的终结类)。例如:
select name from student
  • 1
  • SQLJoinTableSource
    表示from后面是一个join的关联查询,例如:
selec * from student s left join employees e on s.id = e.id; 
  • 1
  • SQLSubqueryTableSource
    表示from后面是一个子查询,例如:
select * from (select id, name from student)s
  • 1
  • SQLUnionQueryTableSource
    表示from后面是一个联合查询,例如:
select name from (select name from student union select name from employees)
  • 1
(4)sqlexpr
  • SQLAggregateExpr
  • SQLAllColumnExpr
  • SQLIdentifierExpr
  • SQLPropertyExpr
  • SQLQueryExpr

3、成果展示

主要和datablau血缘解析工具进行对比
(1)select语句中包含关联查询和字段中包含子查询

SELECT
  e.employee_id,
  e.name,
  e.gender,
  e.date_of_birth,
  e.date_of_hire,
  d.department_name,
  ts.total_salary,
  b.bonus,
  (
    SELECT
      B.id,
      B.CGFS,
      B.WLZ,
      B.CPMC,
      B.CGSL,
      B.ZXCJ,
      B.ZXJE,
      B.HTGLXTHTH,
      B.CGPZH,
      B.WZBM,
      B.HSZJ,
      B.BEDNR JHBH,
      B.WCXTCGDDID,
      B.ZHTDAT,
      D.WERKS_NAME XQGCMC,
      D.WERKS XQGCBM
    FROM
      (
        SELECT
          A.id,
          A.ZCGFS CGFS,
          A.MATKL WLZ,
          A.TXZ01 CPMC,
          A.MENGE CGSL,
          A.NAME1 ZXCJ,
          A.ZHSDJ ZXJE,
          A.ZHTNUM HTGLXTHTH,
          A.EBELN CGPZH,
          A.MATNR WZBM,
          A.ZHSZJ HSZJ,
          A.ZHTDAT,
          C.EMATN,
          C.BEDNR,
          C.ZCGDDID WCXTCGDDID
        FROM
          TB_ERP_ZP11MMRP008 A,
          TB_ERP_ME2M C
        WHERE
          A.EBELN = C.EBELN
          and A.MATNR = C.EMATN
      ) B,
      TB_ERP_ZP10MMDG030 D
    WHERE
      B.BEDNR = D.BEDNR
      AND B.EMATN = D.MATNR
      and B.ZHTDAT > to_date('2023/05/01 00:00:00', 'YYYY/MM/DD HH24:MI:SS')
  ) as sfs,
  (
    select
      s.d
    from
      student s
    where
      student_id = e.employee_id
  ) as saf12
FROM
  employee e
  INNER JOIN department d ON e.department_id = d.department_id
  INNER JOIN tmp_salary ts ON e.employee_id = ts.employee_id
  LEFT OUTER JOIN bonus b ON e.employee_id = b.employee_id;
  • 1
  • 2
  • 3
  • 4
  • 5
  • 6
  • 7
  • 8
  • 9
  • 10
  • 11
  • 12
  • 13
  • 14
  • 15
  • 16
  • 17
  • 18
  • 19
  • 20
  • 21
  • 22
  • 23
  • 24
  • 25
  • 26
  • 27
  • 28
  • 29
  • 30
  • 31
  • 32
  • 33
  • 34
  • 35
  • 36
  • 37
  • 38
  • 39
  • 40
  • 41
  • 42
  • 43
  • 44
  • 45
  • 46
  • 47
  • 48
  • 49
  • 50
  • 51
  • 52
  • 53
  • 54
  • 55
  • 56
  • 57
  • 58
  • 59
  • 60
  • 61
  • 62
  • 63
  • 64
  • 65
  • 66
  • 67
  • 68
  • 69
  • 70
  • 71

本血缘解析工具分析结果:

-- TB_ERP_ZP11MMRP008.NAME1 -> B.ZXCJ -> SELECT_STATEMENT_2.ZXCJ -> SELECT_STATEMENT_0.sfs
-- employee.date_of_birth -> SELECT_STATEMENT_0.date_of_birth
-- TB_ERP_ZP11MMRP008.ZHTDAT -> B.ZHTDAT -> SELECT_STATEMENT_2.ZHTDAT -> SELECT_STATEMENT_0.sfs
-- department.department_name -> SELECT_STATEMENT_0.department_name
-- bonus.bonus -> SELECT_STATEMENT_0.bonus
-- student.d -> SELECT_STATEMENT_5.d -> SELECT_STATEMENT_0.saf12
-- TB_ERP_ZP11MMRP008.MATNR -> B.WZBM -> SELECT_STATEMENT_2.WZBM -> SELECT_STATEMENT_0.sfs
-- TB_ERP_ZP11MMRP008.EBELN -> B.CGPZH -> SELECT_STATEMENT_2.CGPZH -> SELECT_STATEMENT_0.sfs
-- employee.name -> SELECT_STATEMENT_0.name
-- TB_ERP_ZP11MMRP008.ZHTNUM -> B.HTGLXTHTH -> SELECT_STATEMENT_2.HTGLXTHTH -> SELECT_STATEMENT_0.sfs
-- TB_ERP_ZP11MMRP008.MATKL -> B.WLZ -> SELECT_STATEMENT_2.WLZ -> SELECT_STATEMENT_0.sfs
-- TB_ERP_ZP10MMDG030.WERKS -> SELECT_STATEMENT_2.XQGCBM -> SELECT_STATEMENT_0.sfs
-- TB_ERP_ZP11MMRP008.ZCGFS -> B.CGFS -> SELECT_STATEMENT_2.CGFS -> SELECT_STATEMENT_0.sfs
-- TB_ERP_ZP11MMRP008.MENGE -> B.CGSL -> SELECT_STATEMENT_2.CGSL -> SELECT_STATEMENT_0.sfs
-- employee.gender -> SELECT_STATEMENT_0.gender
-- TB_ERP_ZP11MMRP008.TXZ01 -> B.CPMC -> SELECT_STATEMENT_2.CPMC -> SELECT_STATEMENT_0.sfs
-- TB_ERP_ZP11MMRP008.id -> B.id -> SELECT_STATEMENT_2.id -> SELECT_STATEMENT_0.sfs
-- tmp_salary.total_salary -> SELECT_STATEMENT_0.total_salary
-- TB_ERP_ZP11MMRP008.ZHSZJ -> B.HSZJ -> SELECT_STATEMENT_2.HSZJ -> SELECT_STATEMENT_0.sfs
-- TB_ERP_ME2M.ZCGDDID -> B.WCXTCGDDID -> SELECT_STATEMENT_2.WCXTCGDDID -> SELECT_STATEMENT_0.sfs
-- TB_ERP_ZP11MMRP008.ZHSDJ -> B.ZXJE -> SELECT_STATEMENT_2.ZXJE -> SELECT_STATEMENT_0.sfs
-- TB_ERP_ME2M.EMATN -> B.EMATN
-- TB_ERP_ME2M.BEDNR -> B.BEDNR -> SELECT_STATEMENT_2.JHBH -> SELECT_STATEMENT_0.sfs
-- employee.date_of_hire -> SELECT_STATEMENT_0.date_of_hire
-- employee.employee_id -> SELECT_STATEMENT_0.employee_id
-- TB_ERP_ZP10MMDG030.WERKS_NAME -> SELECT_STATEMENT_2.XQGCMC -> SELECT_STATEMENT_0.sfs
  • 1
  • 2
  • 3
  • 4
  • 5
  • 6
  • 7
  • 8
  • 9
  • 10
  • 11
  • 12
  • 13
  • 14
  • 15
  • 16
  • 17
  • 18
  • 19
  • 20
  • 21
  • 22
  • 23
  • 24
  • 25
  • 26

datablua血缘解析工具结果:
在这里插入图片描述
(2)select联合查询

select s.id from student s union select e.id from employees e
  • 1

本血缘解析工具分析结果:

-- student.id -> SELECT_STATEMENT_0.id
-- employees.id -> SELECT_STATEMENT_1.id
  • 1
  • 2

datablua血缘解析工具结果:
在这里插入图片描述

4 待优化点

  • 不支持没有别名的select查询
  • 联合查询没有结果汇总,有待优化
声明:本文内容由网友自发贡献,不代表【wpsshop博客】立场,版权归原作者所有,本站不承担相应法律责任。如您发现有侵权的内容,请联系我们。转载请注明出处:https://www.wpsshop.cn/w/我家自动化/article/detail/502996
推荐阅读
相关标签
  

闽ICP备14008679号