Details

    • Type: New Feature
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: Future
    • Component/s: None
    • Labels:
      None

      Description

      This is a tracking item to make Drill work with YARN.
      Below are a few requirements to consider.

      • Drill should run as a YARN-based application, side by side with other YARN-enabled applications (on the same nodes or on different nodes). YARN should control both the memory and the CPU resources that Drill uses.
      • As a YARN-enabled application, Drill's resource consumption should adapt to the load on the cluster. For example, when there is no load on Drill, Drill should consume no resources on the cluster. As the load on Drill increases, usage should grow proportionally, resources permitting.
      • Low latency is a key requirement for Apache Drill, along with support for many concurrent users (hundreds to thousands). Both should also be supported when Drill runs as a YARN application.
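
The adaptive-consumption requirement above can be illustrated with a small sizing rule. This is only a sketch under stated assumptions: the function name and its three parameters (`active_queries`, `queries_per_bit`, `max_bits`) are hypothetical and do not come from Drill or YARN. It shows one way "no load means no resources, and usage grows proportionally, resources permitting" might be expressed.

```python
import math


def target_drillbits(active_queries: int, queries_per_bit: int, max_bits: int) -> int:
    """Return how many drill-bits to run for the current load.

    All three parameters are hypothetical knobs used for illustration only.
    """
    if active_queries < 0 or queries_per_bit <= 0 or max_bits < 0:
        raise ValueError("invalid sizing inputs")
    # No load on Drill: hold no cluster resources at all.
    if active_queries == 0:
        return 0
    # Grow proportionally with load, rounding up so every query has capacity,
    # and stop at the cap ("resources permitting").
    return min(max_bits, math.ceil(active_queries / queries_per_bit))


if __name__ == "__main__":
    for load in (0, 25, 1000):
        print(load, target_drillbits(load, queries_per_bit=10, max_bits=5))
```

A real policy would also have to weigh the low-latency requirement, since starting a drill-bit on demand takes time; that trade-off is not modeled here.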
      1. Drill-on-YARNDesignOverview.pdf
        91 kB
        Paul Rogers
      2. Drill-on-YARNUserGuide.pdf
        600 kB
        Paul Rogers

          Activity

          Jianshi Huang added a comment -

          Any progress on YARN support?

          Jianshi

          Paul Rogers added a comment -

          A brief "starter set" of requirements:

          • Configuration file to gather the cluster configuration (memory, cores, number of nodes, and so on)
          • Launcher to start/stop Drill within YARN
          • Drill-specific Application Master (AM)
          • AM requests the YARN Node Manager (NM) to launch drill-bits
          • Use the YARN localization feature to deploy Drill files to each node
          • Add nodes (drill-bits) to a running Drill cluster
          • Remove nodes from a running Drill cluster (see DRILL-2656)
          • Detect and restart failed drill-bits
          • Status/statistics about the cluster as a whole (number of active nodes, number of restarts, etc.)
          • Allow existing users to run "unmanaged" Drill clusters (YARN is optional)
          • Possibly allow multiple "Drill clusters" (independent clusters of drill bits) on the same YARN-managed physical cluster.
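
As a rough illustration of the first few bullets, the sketch below models a cluster configuration and the add/remove decision an Application Master might make. Everything here is an assumption for illustration: `ClusterSpec`, `plan_resize`, and the action names are hypothetical, and no real YARN API is called.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClusterSpec:
    """Hypothetical cluster configuration: container size and node count."""
    memory_mb: int
    vcores: int
    node_count: int


def plan_resize(spec: ClusterSpec, running: int) -> dict:
    """Compare the configured size with the running drill-bit count."""
    if running < 0:
        raise ValueError("running must be non-negative")
    delta = spec.node_count - running
    if delta > 0:
        # Too few drill-bits: ask for more containers of the configured size.
        return {
            "action": "request_containers",
            "count": delta,
            "memory_mb": spec.memory_mb,
            "vcores": spec.vcores,
        }
    if delta < 0:
        # Too many drill-bits: release the surplus.
        return {"action": "release_containers", "count": -delta}
    return {"action": "none", "count": 0}


if __name__ == "__main__":
    spec = ClusterSpec(memory_mb=8192, vcores=4, node_count=3)
    print(plan_resize(spec, running=1))
    print(plan_resize(spec, running=5))
```

Releasing a container is the easy half of removal; shutting down a drill-bit without disrupting in-flight queries is the concern tracked in DRILL-2656.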
          Billie Rinaldi added a comment -

          Have you considered using Slider instead of writing a new AM for Drill? Slider would take care of most of these bullets already, without requiring you to write new Java code. If there end up being new features Drill would need, I imagine the Slider folks would be receptive to adding those.

          Paul Rogers added a comment -

          We have considered Slider. Several factors nudged us in the direction of writing an AM directly on YARN:

          1. Slider has much documentation, but it is incomplete and out-of-date in important places.
          2. We could make up for the documentation by reading the source code. However, Slider is composed of a large amount of Python code. Our team are mostly Java developers. If we have to learn a bunch of code, we might as well learn YARN directly.
          3. Drill needs certain features that Slider does not (yet) provide, such as monitoring ZooKeeper to track Drill-bit health, perhaps offering a connection proxy, etc.
          4. Slider is a general-purpose tool with many cool features. As it turns out, many are not needed for Drill. This means that Slider introduces a bit of unnecessary complexity for Drill admins.
          5. Slider adds its own level of configuration files on top of those that we'd need for Drill. Not a big issue, but it is just additional complexity for Drill admins to learn and manage.

          Overall, we like where Slider is going. Those Drill users who want to roll their own YARN integration should certainly give Slider a try as a short-term solution. This is particularly true for shops that already use Slider for other apps.

          On balance, however, Drill has a number of specialized needs that would seem to justify the cost of a custom AM. We will, of course, continue to revisit the issue as analysis proceeds.
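
The ZooKeeper-based health tracking mentioned in point 3 can be pictured as a set comparison between the drill-bits the AM launched and the drill-bits currently registered in ZooKeeper. The sketch below is an assumption about how such a check might look, not the actual design: `classify_drillbits` is a hypothetical name, and the two input sets stand in for data that would really come from YARN and from a ZooKeeper client.

```python
from typing import Dict, Set


def classify_drillbits(launched: Set[str], registered: Set[str]) -> Dict[str, Set[str]]:
    """Compare launched hosts with hosts registered in ZooKeeper.

    launched:   hosts where the AM believes it started a drill-bit.
    registered: hosts whose drill-bits currently appear in ZooKeeper.
    """
    return {
        # Launched and visible in ZooKeeper: presumed healthy.
        "healthy": launched & registered,
        # Launched but absent from ZooKeeper: still starting, or failed.
        "missing": launched - registered,
        # Registered but not launched by this AM: an "unmanaged" drill-bit.
        "unmanaged": registered - launched,
    }


if __name__ == "__main__":
    result = classify_drillbits({"node1", "node2"}, {"node2", "node3"})
    for state, hosts in result.items():
        print(state, sorted(hosts))
```

The "unmanaged" bucket matters because the requirements keep YARN optional, so a drill-bit started outside YARN may share the same ZooKeeper registry.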

          Matt Pollock added a comment -

          Any progress update? My organization won't support use of Drill until this is done.

          Jacques Nadeau added a comment -

          Hey Paul & Billie, if the Slider community co-implemented this with the Drill folk, it would probably allow Slider to support more use cases and bring us to a shared approach rather than two separate codebases. Do you think that anyone from the Slider community would be able to spend substantial time against this to address the Drill needs?

          Paul Rogers added a comment -

          Worth a discussion. Is Slider still the "go to" option, or has effort shifted to Twill?

          As it turns out, the actual YARN integration was not a big effort. Rather, most of the effort is around modifying Drill itself to play well with YARN, and implementing the management aspects unique to YARN.

          Paul Rogers added a comment -

          Good progress is being made. Our tentative goal is the Drill 1.8 release for an initial integration. The goal is:

          YARN support in Drill 1.8 enables admins to migrate their existing Drill cluster to run under YARN. The admin simply identifies the nodes on which Drill should run, identifies the required container sizes, and brings up the Drill cluster under YARN. YARN manages resource allocations for Drill alongside those of other YARN applications. Drill-on-YARN monitors Drill-bits and automatically restarts any that fail.

          We'll have "experimental" support for starting/stopping Drill-bits. Starting bits is easy. Stopping is a bit of a challenge because we lack DRILL-2656.
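
The "monitor and automatically restart" behavior described above can be sketched as a small policy object. This is a minimal illustration under stated assumptions: `RestartTracker`, the per-host restart cap, and the action names are all hypothetical, and the comment does not say whether Drill-on-YARN limits restarts at all.

```python
from collections import Counter


class RestartTracker:
    """Decide what to do when a drill-bit container exits (illustrative only)."""

    def __init__(self, max_restarts: int) -> None:
        if max_restarts < 0:
            raise ValueError("max_restarts must be non-negative")
        self.max_restarts = max_restarts
        self.restarts = Counter()  # restarts performed so far, per host

    def on_exit(self, host: str, requested: bool) -> str:
        # An exit the admin asked for (a stop or shrink) is not a failure.
        if requested:
            return "none"
        # Stop retrying a host that keeps failing (assumed cap).
        if self.restarts[host] >= self.max_restarts:
            return "give_up"
        self.restarts[host] += 1
        return "restart"

    def total_restarts(self) -> int:
        """Cluster-wide restart count, usable as a status statistic."""
        return sum(self.restarts.values())


if __name__ == "__main__":
    tracker = RestartTracker(max_restarts=2)
    for _ in range(3):
        print(tracker.on_exit("node1", requested=False))
    print(tracker.total_restarts())
```

Separating requested exits from failures is what lets the same monitor support both the restart guarantee and the experimental start/stop feature.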

          Matt Pollock added a comment -

          Thanks much.

          Josh Elser added a comment -

          "substantial time" is definitely hard to sign up for, but I'd be happy to try to help out where/when at all possible.

          Paul Rogers added a comment - edited

          Attached is a short, three-page, high-level overview of the design of the Drill-on-YARN feature planned for Drill 1.8. It outlines the design approach, user experience, and major components.

          If you are a YARN user, and want to run Drill under YARN, please give this outline a read to see if the feature will suit your needs. What is missing? What additional features are needed now or in a later release?

          Paul Rogers added a comment -

          Attached is a near-final version of the Drill-on-YARN (DoY) user guide. The material here will be merged into Drill's formal documentation. Until then, this guide explains what DoY is and how to use it.

          ASF GitHub Bot added a comment -

          GitHub user paul-rogers opened a pull request:

          https://github.com/apache/drill/pull/542

          DRILL 4581 Final

          Extensive revisions to the Drill launch scripts to fix a number of
          bugs, and to prepare the scripts for use in Drill-on-YARN. Unit tests
          will be merged as part of the Drill-on-YARN (DRILL-1170) work.

          You can merge this pull request into a Git repository by running:

          $ git pull https://github.com/paul-rogers/drill DRILL-4581-Final

          Alternatively you can review and apply these changes as the patch at:

          https://github.com/apache/drill/pull/542.patch

          To close this pull request, make a commit to your master/trunk branch
          with (at least) the following in the commit message:

          This closes #542


          commit abbfe84e35517c37f59a507694a4f0224137d2b8
          Author: Paul Rogers <progers@maprtech.com>
          Date: 2016-04-08T21:04:40Z

          Merge remote-tracking branch 'apache/master'

          commit e68ab2d8c50ce4d64631f410cbcbe7f4e98aeb8c
          Author: Paul Rogers <progers@maprtech.com>
          Date: 2016-04-13T23:27:06Z

          Merge remote-tracking branch 'apache/master'

          commit ce7d43be1808690f80f6a3c0a826ffb2725ef817
          Author: Paul Rogers <progers@maprtech.com>
          Date: 2016-07-11T02:55:09Z

          Merge remote-tracking branch 'apache/master'

          commit be6564522fe46003d5f5c760ee41fcf71c432a56
          Author: Paul Rogers <progers@maprtech.com>
          Date: 2016-07-11T03:24:46Z

          DRILL-4581

          Extensive revisions to the Drill launch scripts to fix a number of
          bugs, and to prepare the scripts for use in Drill-on-YARN. Unit tests
          will be merged as part of the Drill-on-YARN (DRILL-1170) work.


          ASF GitHub Bot added a comment -

          Github user paul-rogers closed the pull request at:

          https://github.com/apache/drill/pull/542


            People

            • Assignee:
              Paul Rogers
              Reporter:
              Neeraja
            • Votes:
              5
              Watchers:
              19
