  1. Apache Drill
  2. DRILL-6762

Dynamic UDFs registered on one Drillbit are not visible on other Drillbits

Details

    Description

      Originally reported: https://stackoverflow.com/questions/52480160/dynamic-udf-in-apache-drill-cluster

      When using a 4-node Drill 1.14 cluster, UDF jars registered on one node are not usable on other nodes despite the /registry and ZK showing the UDFs as registered.

      This previously worked on 1.13.0.

      Root cause
      1. VersionedDelegatingStore started with version -1 and the local function registry with version 0. This caused issues when a LocalPersistentStore already existed on the file system: when jars were added to the remote registry, its version was bumped to 0, and synchronization did not happen because the local registry had the same version.
      Fix: start VersionedDelegatingStore with version 0 and the local function registry with an undefined version (-2), so that the first sync always happens.

      2. PersistentStoreProvider.getOrCreateVersionedStore wrapped stores into VersionedDelegatingStore for every store provider that did not override this method. Only the ZooKeeper store overrode it. But VersionedDelegatingStore keeps versioning only locally and thus can be applied only to local stores, i.e. HBase and Mongo stores cannot use it.
      CachingPersistentStoreProvider did not override getOrCreateVersionedStore either. Almost all stores in Drill are created using CachingPersistentStoreProvider. Thus all stores were wrapped into VersionedDelegatingStore, even the ZooKeeper one, which caused function registry version synchronization issues.
      Fix: add UndefinedVersionDelegatingStore for stores that do not support versioning, and wrap into it by default in PersistentStoreProvider.getOrCreateVersionedStore if this method is not overridden. UndefinedVersionDelegatingStore returns the UNDEFINED version (-2). During sync between the remote and local registries, if the remote registry has the UNDEFINED version, sync is done immediately, in contrast to the NOT_AVAILABLE version (-1), which indicates that the remote function registry is not accessible.
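
      The two fixes above can be illustrated with a minimal sketch. This is not Drill's code: all type and method names below (KeyValueStore, LocalVersionWrapper, UndefinedVersionWrapper, RegistrySync.shouldSync) are hypothetical stand-ins, and the rule that a NOT_AVAILABLE remote skips sync is an assumption inferred from the description. Only the version values 0, -1 and -2 come from the issue text.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified types for illustration; not Drill's real classes.
interface KeyValueStore {
    String get(String key);
    void put(String key, String value);
}

interface VersionedKeyValueStore extends KeyValueStore {
    int UNDEFINED = -2;     // store cannot report a version
    int NOT_AVAILABLE = -1; // remote registry is not accessible
    int getVersion();
}

/** In-memory store with no notion of versions. */
final class MapStore implements KeyValueStore {
    private final Map<String, String> data = new HashMap<>();
    @Override public String get(String key) { return data.get(key); }
    @Override public void put(String key, String value) { data.put(key, value); }
}

/** Tracks a version locally; starts at 0, as in the fix for cause 1. */
final class LocalVersionWrapper implements VersionedKeyValueStore {
    private final KeyValueStore delegate;
    private int version = 0;
    LocalVersionWrapper(KeyValueStore delegate) { this.delegate = delegate; }
    @Override public String get(String key) { return delegate.get(key); }
    @Override public void put(String key, String value) {
        delegate.put(key, value);
        version++; // every write bumps the locally tracked version
    }
    @Override public int getVersion() { return version; }
}

/** Default wrapper for stores without versioning, as in the fix for cause 2. */
final class UndefinedVersionWrapper implements VersionedKeyValueStore {
    private final KeyValueStore delegate;
    UndefinedVersionWrapper(KeyValueStore delegate) { this.delegate = delegate; }
    @Override public String get(String key) { return delegate.get(key); }
    @Override public void put(String key, String value) { delegate.put(key, value); }
    @Override public int getVersion() { return UNDEFINED; }
}

final class RegistrySync {
    /** Decides whether the local registry should sync with the remote one. */
    static boolean shouldSync(int localVersion, int remoteVersion) {
        if (remoteVersion == VersionedKeyValueStore.NOT_AVAILABLE) {
            return false; // assumption: nothing to sync with an unreachable remote
        }
        if (remoteVersion == VersionedKeyValueStore.UNDEFINED) {
            return true; // versions cannot be compared, so sync immediately
        }
        return localVersion != remoteVersion;
    }

    public static void main(String[] args) {
        // Old scheme: the store started at -1, so the first write made it 0,
        // which matched a local registry that also started at 0.
        System.out.println("old scheme syncs: " + shouldSync(0, -1 + 1));

        // New scheme: the local registry starts at UNDEFINED (-2).
        VersionedKeyValueStore remote = new LocalVersionWrapper(new MapStore());
        remote.put("jar", "my-udfs.jar");
        System.out.println("new scheme syncs: "
            + shouldSync(VersionedKeyValueStore.UNDEFINED, remote.getVersion()));
    }
}
```

      In this sketch a local registry that starts at -2 can never match a store whose version starts at 0 and only increases, which is why the first sync always happens.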

      Attachments

        1. Dynamic UDFs issue.pdf
          334 kB
          Anton Dziavitsyn

            People

              arina Arina Ielchiieva
              kkhatua Kunal Khatua
              Vova Vysotskyi Vova Vysotskyi
              Votes: 0
              Watchers: 4
