Type: New Feature
Affects Version/s: None
Fix Version/s: None
This tracks adding lineage support to Kudu for use with Apache Atlas.
A few notes based on some initial research:
- It probably makes sense to generate a generic lineage file that Apache Atlas can consume.
  - This avoids the need for Java interaction in the server
  - This is the approach Impala uses
- Creating lineage entries for table "DDL" makes sense as a first step
  - CREATE, ALTER, DELETE
  - This is what HBase seems to do: https://atlas.apache.org/Hook-HBase.html
    - "Only the namespace, table and column-family create/update/delete operations are captured by Atlas HBase hook"
- The need for lineage information for scans is unclear
  - It would be very fine-grained and difficult to interpret.
  - Instead, lineage from the other tools doing the scanning (Impala, Spark, etc.) would be more interpretable.
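
To make the "generic lineage file" idea above concrete, here is a minimal sketch of what writing one entry per table DDL operation could look like. Everything in it is an assumption for illustration only: the JSON-lines layout, the field names (operation, table, columns, user, timestamp) and the function names are hypothetical, and they are not an existing Kudu, Impala or Atlas format. The real server would do this in C++; Python is used here only to keep the sketch short.

```python
import json
import time

# Hypothetical set of table operations that produce a lineage entry,
# matching the CREATE / ALTER / DELETE list in the notes above.
LINEAGE_OPERATIONS = {"CREATE", "ALTER", "DELETE"}


def make_lineage_entry(operation, table_name, columns, user, timestamp=None):
    """Build one lineage record for a table DDL event.

    The record layout is an assumption for this sketch, not a defined format.
    """
    if operation not in LINEAGE_OPERATIONS:
        raise ValueError(f"unsupported operation: {operation}")
    return {
        "operation": operation,
        "table": table_name,
        "columns": list(columns),
        "user": user,
        # Seconds since the epoch; callers may pass a fixed value instead.
        "timestamp": int(time.time()) if timestamp is None else timestamp,
    }


def append_lineage_entry(path, entry):
    """Append the entry as a single JSON line to the lineage file."""
    with open(path, "a", encoding="utf-8") as lineage_file:
        lineage_file.write(json.dumps(entry, sort_keys=True) + "\n")
```

Writing one self-contained JSON object per line means an external process can read the file incrementally and forward entries to Atlas, which keeps any Java dependency out of the server. Scans are deliberately not in the allowed operation set, in line with the last note above.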