Originally reported: https://stackoverflow.com/questions/52480160/dynamic-udf-in-apache-drill-cluster
On a 4-node Drill 1.14 cluster, UDF jars registered on one node are not usable on the other nodes, even though the /registry path and ZooKeeper show the UDFs as registered.
This previously worked on Drill 1.13.0.
1. VersionedDelegatingStore started with version -1, while the local function registry started with version 0. This caused issues when a LocalPersistentStore already existed on the file system: when jars were added to the remote registry, its version was bumped to 0, and synchronization did not happen since the local registry had the same version.
Fix: start VersionedDelegatingStore with version 0 and the local function registry with the undefined version (-2), so the first sync always happens.
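The version mismatch described above can be sketched as follows. This is a hypothetical model, not Drill's actual classes: `needsSync`, `UNDEFINED`, and `NOT_AVAILABLE` here only illustrate the comparison that decides whether the local registry pulls from the remote one.

```java
// Hypothetical sketch of the registry sync decision (not Drill's real code).
public class SyncVersionSketch {
    static final int UNDEFINED = -2;     // local registry start version after the fix
    static final int NOT_AVAILABLE = -1; // remote registry unreachable

    // Sync is needed when the remote is reachable and versions differ.
    static boolean needsSync(int remoteVersion, int localVersion) {
        return remoteVersion != NOT_AVAILABLE && remoteVersion != localVersion;
    }

    public static void main(String[] args) {
        // Before the fix: remote bumped to 0, local already at 0 -> no sync (the bug).
        System.out.println(needsSync(0, 0));         // false
        // After the fix: local starts at UNDEFINED (-2) -> first sync always happens.
        System.out.println(needsSync(0, UNDEFINED)); // true
    }
}
```

Starting the local side at a sentinel that no real store version can ever equal is what guarantees the first comparison always triggers a sync.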
2. PersistentStoreProvider.getOrCreateVersionedStore wrapped stores into VersionedDelegatingStore for store providers that did not override this method; only the Zookeeper store provider overrode it. But VersionedDelegatingStore keeps versioning only locally and thus can be applied only to local stores; distributed stores such as HBase and Mongo cannot use it.
CachingPersistentStoreProvider did not override getOrCreateVersionedStore either, and almost all stores in Drill are created through CachingPersistentStoreProvider. As a result, all stores were wrapped into VersionedDelegatingStore, even the Zookeeper one, which caused the function registry version synchronization issues.
Fix: add UndefinedVersionDelegatingStore for stores that do not support versioning, and wrap into it by default in PersistentStoreProvider.getOrCreateVersionedStore when this method is not overridden. UndefinedVersionDelegatingStore returns the UNDEFINED version (-2). During sync between remote and local registries, if the remote registry reports the UNDEFINED version, sync is done immediately; in contrast, the NOT_AVAILABLE version (-1) indicates that the remote function registry is not accessible.
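The wrapping idea can be sketched as below. Again a hypothetical model, assuming a minimal `Store` interface: the real Drill classes have richer APIs, but the point is that the wrapper delegates data operations unchanged while always reporting the UNDEFINED sentinel instead of a real version.

```java
// Hypothetical sketch of a delegating wrapper for non-versioned stores
// (not Drill's actual UndefinedVersionDelegatingStore).
import java.util.HashMap;
import java.util.Map;

public class UndefinedVersionSketch {
    static final int UNDEFINED = -2;

    interface Store {
        String get(String key);
        void put(String key, String value);
    }

    // A plain backing store with no notion of versioning.
    static class MapStore implements Store {
        private final Map<String, String> data = new HashMap<>();
        public String get(String key) { return data.get(key); }
        public void put(String key, String value) { data.put(key, value); }
    }

    // Delegates reads/writes to the wrapped store; the version is always
    // UNDEFINED, signalling "no versioning support, sync unconditionally".
    static class UndefinedVersionDelegating implements Store {
        private final Store delegate;
        UndefinedVersionDelegating(Store delegate) { this.delegate = delegate; }
        public String get(String key) { return delegate.get(key); }
        public void put(String key, String value) { delegate.put(key, value); }
        public int getVersion() { return UNDEFINED; }
    }

    public static void main(String[] args) {
        UndefinedVersionDelegating store =
            new UndefinedVersionDelegating(new MapStore());
        store.put("udf.jar", "registered");
        System.out.println(store.get("udf.jar")); // registered
        System.out.println(store.getVersion());   // -2
    }
}
```

This keeps real versioning (and the sync-skipping optimization it enables) reserved for stores that can actually track versions, such as the Zookeeper store, while everything else falls back to always syncing.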