Description
Background
The new transform framework evaluates each entity as a whole and computes some attributes itself.
As a result, when an entity-level transform is present, a transform that updates hive_storagedesc.location is not applied.
Steps to Duplicate
- Attempt to import a zip file with the following transform.
{ "options": { "transformers": "[{\"conditions\":{\"__entity\":\"topLevel: \"},\"action\":{\"__entity\":\"ADD_CLASSIFICATION: mycluster0_replicated\"}},{\"action\":{\"__entity.replicatedTo\":\"CLEAR:\",\"__entity.replicatedFrom\":\"CLEAR:\"}},{\"conditions\":{\"hive_db.clusterName\":\"EQUALS: mycluster0\"},\"action\":{\"hive_db.clusterName\":\"SET: mycluster1\"}},{\"conditions\":{\"hive_storagedesc.location\":\"STARTS_WITH_IGNORE_CASE: hdfs://localhost.localdomain\"},\"action\":{\"hive_storagedesc.location\":\"REPLACE_PREFIX: = :hdfs://localhost.localdomain=hdfs://mycluster1\"}}]", "replicatedFrom": "SFO$clx" } }
- After the import, locate any entity of type hive_storagedesc and note its location. The updated location has not been applied: the REPLACE_PREFIX rewrite to hdfs://mycluster1 is missing.
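To make the expected result concrete, the following is a minimal sketch of what the hive_storagedesc.location rule in the transform above should produce. It is not the Atlas implementation: the function names are hypothetical, the sample path is made up for illustration, and it assumes REPLACE_PREFIX performs a plain case-sensitive prefix swap.

```python
# Hypothetical helpers; these are NOT Atlas APIs, only an illustration
# of the condition/action pair used in the transform above.

OLD_PREFIX = "hdfs://localhost.localdomain"
NEW_PREFIX = "hdfs://mycluster1"


def starts_with_ignore_case(value: str, prefix: str) -> bool:
    """Mimics the STARTS_WITH_IGNORE_CASE condition."""
    return value.lower().startswith(prefix.lower())


def replace_prefix(value: str, old_prefix: str, new_prefix: str) -> str:
    """Mimics the REPLACE_PREFIX action (assumed to be case-sensitive)."""
    if value.startswith(old_prefix):
        return new_prefix + value[len(old_prefix):]
    return value


def expected_location(location: str) -> str:
    """Location a hive_storagedesc entity should have after import."""
    if starts_with_ignore_case(location, OLD_PREFIX):
        return replace_prefix(location, OLD_PREFIX, NEW_PREFIX)
    return location


# Illustrative path only; real locations depend on the exported data.
sample = "hdfs://localhost.localdomain:8020/warehouse/t1"
print(expected_location(sample))
```

With the bug present, the imported entity keeps the original hdfs://localhost.localdomain prefix instead of the value this sketch computes.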
Root Cause
- The new transform framework re-computes the location from cached data, which discards the value set by the hive_storagedesc.location transform.
- Solution: Prevent caching of this attribute so that the externally set transformation is honored.
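The failure mode described above can be sketched in a few lines. This is an illustration under stated assumptions, not the framework's actual code: both classes and the commit() method are hypothetical names, and the entity is modelled as a plain dict.

```python
class CachingEntityView:
    """Hypothetical entity wrapper that snapshots 'location' when created."""

    def __init__(self, attrs: dict):
        self.attrs = attrs
        self._cached_location = attrs.get("location")  # snapshot taken here

    def commit(self) -> dict:
        # Re-computes the attribute from the snapshot, so any update made
        # to attrs["location"] after construction is overwritten.
        self.attrs["location"] = self._cached_location
        return self.attrs


class NonCachingEntityView:
    """Hypothetical fix: no snapshot, the live attribute value is kept."""

    def __init__(self, attrs: dict):
        self.attrs = attrs

    def commit(self) -> dict:
        return self.attrs


def run(view_cls) -> str:
    attrs = {"location": "hdfs://localhost.localdomain/warehouse/t1"}
    view = view_cls(attrs)
    # An attribute-level transform updates the location externally.
    attrs["location"] = "hdfs://mycluster1/warehouse/t1"
    return view.commit()["location"]


print(run(CachingEntityView))     # transform is lost
print(run(NonCachingEntityView))  # transform is honored
```

The design point is only that the attribute is read at commit time rather than at construction time; how the real framework stores or derives the value is not shown here.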