Details
- Type: New Feature
- Status: Open
- Priority: Major
- Resolution: Unresolved
Description
This tracks adding lineage support to Kudu and Apache Atlas.
A few notes based on some initial research:
- It probably makes sense to generate a generic lineage file that Apache Atlas can consume.
- This avoids the need for Java interaction in the server
- This is the approach Impala uses
- See ATLAS-3183 and https://impala.apache.org/docs/build3x/html/topics/impala_lineage.html#lineage
- It makes sense to start by creating lineage entries for table DDL operations
- CREATE, ALTER, DELETE
- This is what HBase seems to do: https://atlas.apache.org/Hook-HBase.html
- "Only the namespace, table and column-family create/update/delete operations are captured by Atlas HBase hook"
- The need for lineage information for scans is unclear
- It would be extremely fine-grained and difficult to interpret.
- Instead, lineage from the tools doing the scanning (Impala, Spark, etc.) would be more interpretable.
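To make the file-based idea in the notes above more concrete, here is a minimal Python sketch of what appending DDL lineage entries to a generic file could look like. Everything in it is an assumption for illustration only: the JSON-lines layout, the field names, and the helper names (LineageEntry, append_lineage_entry, read_lineage_entries) are hypothetical. It is not Impala's lineage format, not an Atlas API, and not a proposed Kudu implementation.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Dict, List

# Only table DDL operations are recorded, as suggested in the notes above.
DDL_OPERATIONS = {"CREATE", "ALTER", "DELETE"}


@dataclass
class LineageEntry:
    """One hypothetical lineage record for a table DDL operation."""
    operation: str      # one of DDL_OPERATIONS
    table: str          # name of the affected table
    user: str           # user who issued the operation
    timestamp_ms: int   # time of the operation, in milliseconds
    columns: List[str] = field(default_factory=list)


def append_lineage_entry(path: str, entry: LineageEntry) -> None:
    """Append one entry to the lineage file as a single JSON line."""
    if entry.operation not in DDL_OPERATIONS:
        raise ValueError(f"not a DDL operation: {entry.operation}")
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(asdict(entry), sort_keys=True) + "\n")


def read_lineage_entries(path: str) -> List[Dict]:
    """Read entries back, as an external consumer of the file might."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]
```

An append-only file with one record per line keeps the writer simple and lets a separate process pick up the entries, which is the property that avoids Java interaction in the server. Scans are deliberately rejected here, matching the note that scan-level lineage is better left to the tools doing the scanning.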
Issue Links
- is related to KUDU-3109 Log administrative operations (Open)