Details
Type: New Feature
Status: Open
Priority: Major
Resolution: Unresolved
Description
There are some very large environments where the warehouse has a thousand databases and a hundred thousand tables with many columns, most of which are dropped, created, and updated at a fast pace. In these environments, Atlas processing can fall behind the rate of change in the warehouse, so the backlog keeps growing, and prune.pattern and/or ignore.pattern are not suitable.
It would be nice to have a default-deny behaviour for all tables, and then to 'allow' the import of a subset of tables specified by a regex parameter (in order to process only some important tables). Basically, this would work in the opposite way to prune.pattern and ignore.pattern.
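To make the request concrete, here is a minimal sketch of the default-deny logic described above. It is only an illustration of the idea, not Atlas code: the `ALLOW_PATTERNS` list, the `should_import` helper, and the table names are all hypothetical, and no such allow-style property is assumed to exist in Atlas today.

```python
import re

# Hypothetical allow-list, standing in for a proposed allow-style regex
# parameter. These patterns and names are illustrative assumptions only.
ALLOW_PATTERNS = [
    r"sales\.orders_.*",
    r"finance\..*",
]


def compile_patterns(patterns):
    """Compile the regex strings once so they can be reused per event."""
    return [re.compile(p) for p in patterns]


def should_import(qualified_name, allow):
    """Default deny: accept a table only if it fully matches an allow pattern."""
    return any(p.fullmatch(qualified_name) for p in allow)


allow = compile_patterns(ALLOW_PATTERNS)

# Example stream of "db.table" names seen in the warehouse (made up).
events = ["sales.orders_2024", "tmp.scratch_123", "finance.ledger", "sales.staging"]

# Only the allowed tables would be processed; everything else is skipped.
imported = [name for name in events if should_import(name, allow)]
print(imported)  # ['sales.orders_2024', 'finance.ledger']
```

With an empty allow-list nothing is imported, which is the opposite default from a prune or ignore pattern, where everything is imported unless it matches.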
As far as I know, there is a similar feature for S3 and ADLS but not for Hive.
If this is the case, it would be nice to get this feature added to your backlog.