Details

    • Type: New Feature
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: Future
    • Component/s: None
    • Labels:
      None

      Description

      This is a tracking item to make Drill work with YARN.
      Below are a few requirements to consider.

      • Drill should run as a YARN-based application, side by side with other YARN-enabled applications (on the same nodes or on different nodes). Both the memory and CPU resources of Drill should be controlled through this mechanism.
      • As a YARN-enabled application, Drill's resource consumption should adapt to the load on the cluster. For example, when there is no load on Drill, it should consume no resources on the cluster. As the load on Drill increases, its usage should grow proportionally, resources permitting.
      • Low latency is a key requirement for Apache Drill, along with support for many concurrent users (hundreds to thousands). Both should also be supported when Drill runs as a YARN application.
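
      As a rough illustration of the adaptive behavior described in the second requirement, the sketch below maps the current query load to a target number of drill-bits. The function name, the queries-per-bit ratio, and the cluster cap are all hypothetical; nothing here is part of Drill or YARN.

```python
import math


def target_drillbits(active_queries: int, queries_per_bit: int, max_bits: int) -> int:
    """Return how many drill-bits to run for the given load (hypothetical policy)."""
    if queries_per_bit <= 0:
        raise ValueError("queries_per_bit must be positive")
    if active_queries <= 0:
        # No load on Drill: hold no cluster resources.
        return 0
    # Grow proportionally with load, rounding up so every query has a home.
    wanted = math.ceil(active_queries / queries_per_bit)
    # "Resources permitting": never ask for more than the cluster allows.
    return min(wanted, max_bits)
```

      With an assumed ratio of 50 queries per drill-bit and a cap of 10, a load of 120 queries yields a target of 3. Note that scaling all the way to zero may work against the low-latency requirement, since the first query after an idle period would have to wait for containers to start.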

        Activity

        Paul Rogers added a comment - edited

        Attached is a short, three-page, high-level overview of the design of the Drill-on-YARN feature planned for Drill 1.8. It outlines the design approach, user experience, and major components.

        If you are a YARN user, and want to run Drill under YARN, please give this outline a read to see if the feature will suit your needs. What is missing? What additional features are needed now or in a later release?

        Josh Elser added a comment -

        "substantial time" is definitely hard to sign up for, but I'd be happy to try to help out where/when at all possible.

        Matt Pollock added a comment -

        Thanks much.

        Paul Rogers added a comment -

        Good progress is being made. Our tentative goal is the Drill 1.8 release for an initial integration. The goal is:

        YARN support in Drill 1.8 enables admins to migrate their existing Drill cluster to run under YARN. The admin simply identifies the nodes on which Drill should run, identifies the required container sizes, and brings up the Drill cluster under YARN. YARN manages resource allocations for Drill alongside those of other YARN applications. Drill-on-YARN monitors Drill-bits and automatically restarts any that fail.

        We'll have "experimental" support for starting/stopping Drill-bits. Starting bits is easy. Stopping is a bit of a challenge because we lack DRILL-2656.
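
        The monitor-and-restart behavior described above can be sketched as a simple reconciliation step: compare the hosts that should be running a drill-bit with the hosts that currently report one, and restart the difference. This is only an illustration under stated assumptions; the class name, the restart limit, and the idea of passing in a set of live hosts are hypothetical and do not reflect the actual Drill-on-YARN code.

```python
from dataclasses import dataclass, field


@dataclass
class BitMonitor:
    """Tracks which hosts should run a drill-bit (hypothetical helper)."""
    expected: set
    max_restarts: int = 3
    restarts: dict = field(default_factory=dict)

    def reconcile(self, live: set) -> list:
        """Return the hosts whose drill-bit should be restarted now."""
        to_restart = []
        for host in sorted(self.expected - live):
            count = self.restarts.get(host, 0)
            # Give up on a host after too many attempts (assumed policy).
            if count < self.max_restarts:
                self.restarts[host] = count + 1
                to_restart.append(host)
        return to_restart
```

        In this sketch the caller would supply `live` from whatever health source is available and then ask YARN for a new container on each returned host.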

        Paul Rogers added a comment -

        Worth a discussion. Is Slider still the "go to" option, or has effort shifted to Twill?

        As it turns out, the actual YARN integration was not a big effort. Rather, most of the effort is around modifying Drill itself to play well with YARN, and implementing the management aspects unique to YARN.

        Jacques Nadeau added a comment -

        Hey Paul & Billie, if the Slider community co-implemented this with the Drill folk, it would probably allow Slider to support more use cases and bring us to a shared approach rather than two separate codebases. Do you think that anyone from the Slider community would be able to spend substantial time on this to address the Drill needs?

        Matt Pollock added a comment -

        Any progress update? My organization won't support use of Drill until this is done.

        Paul Rogers added a comment -

        We have considered Slider. Several factors nudged us in the direction of writing an AM directly on YARN:

        1. Slider has much documentation, but it is incomplete and out-of-date in important places.
        2. We could make up for the documentation by reading the source code. However, Slider is composed of a large amount of Python code. Our team are mostly Java developers. If we have to learn a bunch of code, we might as well learn YARN directly.
        3. Drill needs certain features that Slider does not (yet) provide, such as monitoring ZooKeeper to track Drill-bit health, perhaps offering a connection proxy, etc.
        4. Slider is a general-purpose tool with many cool features. As it turns out, many are not needed for Drill. This means that Slider introduces a bit of unnecessary complexity for Drill admins.
        5. Slider adds its own level of configuration files on top of those that we'd need for Drill. Not a big issue, but it is just additional complexity for Drill admins to learn and manage.

        Overall, we like where Slider is going. Those Drill users who want to roll their own YARN integration should certainly give Slider a try as a short-term solution. This is particularly true for shops that already use Slider for other apps.

        On balance, however, Drill has a number of specialized needs that would seem to justify the cost of a custom AM. We will, of course, continue to revisit the issue as analysis proceeds.

        Billie Rinaldi added a comment -

        Have you considered using Slider instead of writing a new AM for Drill? Slider would take care of most of these bullets already, without requiring you to write new Java code. If there end up being new features Drill would need, I imagine the Slider folks would be receptive to adding those.

        Paul Rogers added a comment -

        A brief "starter set" of requirements:

        • Configuration file to gather the cluster configuration (memory, cores, number of nodes, and so on)
        • Launcher to start/stop Drill within YARN
        • Drill-specific Application Master (AM)
        • AM requests YARN Node Manager (NM) to launch drill-bits.
        • Use YARN localization feature to deploy Drill files to each node.
        • Add nodes (drill-bits) to a running Drill cluster
        • Remove nodes from a running Drill cluster (see DRILL-2656)
        • Detect and restart failed drill-bits
        • Status/statistics about the cluster as a whole (number of active nodes, number of restarts, etc.)
        • Allow existing users to run "unmanaged" Drill clusters (YARN is optional)
        • Possibly allow multiple "Drill clusters" (independent clusters of drill bits) on the same YARN-managed physical cluster.
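
        To make the first few items concrete, here is a minimal sketch of how a cluster configuration might be turned into per-node container requests. The field names, the dictionary layout, and the `container_requests` helper are assumptions made for illustration; they are not the Drill-on-YARN configuration format or any YARN API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClusterConfig:
    """Hypothetical settings gathered from a Drill-on-YARN config file."""
    name: str        # lets several Drill clusters share one YARN cluster
    memory_mb: int   # memory per drill-bit container
    vcores: int      # CPU cores per drill-bit container
    node_count: int  # number of drill-bits to launch


def container_requests(cfg: ClusterConfig) -> list:
    """Build one container request per drill-bit (illustrative only)."""
    if cfg.node_count < 0:
        raise ValueError("node_count must not be negative")
    return [
        {"cluster": cfg.name, "memory_mb": cfg.memory_mb, "vcores": cfg.vcores}
        for _ in range(cfg.node_count)
    ]
```

        Adding or removing nodes from a running cluster would then amount to changing `node_count` and issuing or releasing the difference in requests.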
        Jianshi Huang added a comment -

        Any progress on YARN support?

        Jianshi


          People

          • Assignee:
            Paul Rogers
            Reporter:
            Neeraja
          • Votes:
            4
            Watchers:
            17
