Details

    • Type: New Feature
    • Status: Reviewable
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: Future
    • Component/s: None
    • Labels:
      None

      Description

      This is a tracking item to make Drill work with YARN.
      Below are a few requirements/needs to consider.

      • Drill should run as a YARN-based application, side by side with other YARN-enabled applications (on the same nodes or on different nodes). Both the memory and CPU resources of Drill should be controlled through this mechanism.
      • As a YARN-enabled application, Drill's resource consumption should adapt to the load on the cluster. For example, when there is no load on Drill, Drill should consume no resources on the cluster. As the load on Drill increases, resources permitting, usage should grow proportionally.
      • Low latency is a key requirement for Apache Drill, along with support for multiple users (concurrency in the 100s-1000s). This should also be supported when Drill runs as a YARN application.
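
The adaptive-sizing requirement above can be illustrated with a minimal sketch. It assumes load is measured as a count of active queries and that each Drillbit can serve a fixed number of them. Neither assumption comes from this issue, and the function and parameter names are hypothetical.

```python
import math


def target_drillbits(active_queries: int, queries_per_drillbit: int,
                     max_drillbits: int) -> int:
    """Return how many Drillbits to run for the given load (illustrative only)."""
    if queries_per_drillbit <= 0:
        raise ValueError("queries_per_drillbit must be positive")
    if active_queries <= 0:
        # No load: consume no cluster resources.
        return 0
    # Grow proportionally with load, rounding up so no query is left unserved.
    wanted = math.ceil(active_queries / queries_per_drillbit)
    # "Resources permitting": never ask for more than the cluster allows.
    return min(wanted, max_drillbits)
```

With 25 active queries, 10 queries per Drillbit, and a cap of 5, the sketch asks for 3 Drillbits; with no active queries it asks for none.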
      1. Drill-on-YARNDesignOverview.pdf
        91 kB
        Paul Rogers
      2. Drill-on-YARNUserGuide.pdf
        600 kB
        Paul Rogers

        Issue Links

          Activity

          githubbot ASF GitHub Bot added a comment -

          GitHub user paul-rogers opened a pull request:

          https://github.com/apache/drill/pull/1011

          Drill 1170: Drill-on-YARN

          Provides Drill integration with YARN. Runs Drill as a long-running task under YARN. Monitors the Drill cluster, restarting failed Drillbits. Provides a command-line UI to start, stop and resize the cluster. Provides a web-based UI to monitor the cluster.

          The Drill-on-YARN (DoY) code has been in use by commercial users for over a year, since Drill 1.8, and has proven quite stable. Usage has been on MapR's version of YARN; we seek feedback from users of the Apache and other versions of YARN.

          See DRILL-1170 (https://issues.apache.org/jira/browse/DRILL-1170) for design information. See the included `README.md` for internals information and `USAGE.md` for a detailed user guide.

          This is a large PR; it will take time to review. The key goal at this moment is to allow interested users to download the PR, build DoY, and try it out in their environments. The DoY code is mostly independent of Drill itself. The DoY code can be used to launch any version of Drill since 1.8. See the usage guide for information.

          It has been suggested that the code move to the `contrib` directory. That change will be made. But since the code works successfully in its current location, we'll leave it there for now to ensure users are successful if they choose to try it.

          You can merge this pull request into a Git repository by running:

          $ git pull https://github.com/paul-rogers/drill DRILL-1170

          Alternatively you can review and apply these changes as the patch at:

          https://github.com/apache/drill/pull/1011.patch

          To close this pull request, make a commit to your master/trunk branch
          with (at least) the following in the commit message:

          This closes #1011


          commit e26012bc56cad3bf2819dff3bbdf70664de34955
          Author: Paul Rogers <progers@maprtech.com>
          Date: 2017-10-26T07:24:00Z

          DRILL-1170: YARN integration for Drill

          This commit includes documentation files.

          commit 3a6ffe78d9fe0e9a5beacd100e2e0ee6b40c7f34
          Author: Paul Rogers <progers@maprtech.com>
          Date: 2017-10-26T07:25:34Z

          Client app

          commit 509410c9710fc1ff23a4a13f51320a4c153f9328
          Author: Paul Rogers <progers@maprtech.com>
          Date: 2017-10-26T07:26:50Z

          Files common to several modules

          commit 36b8d323118077115ef1097ba8b23f6fc4a5390a
          Author: Paul Rogers <progers@maprtech.com>
          Date: 2017-10-26T07:42:17Z

          Application master

          commit 7fbc387634bb1acaf7807b02348b40364c81d282
          Author: Paul Rogers <progers@maprtech.com>
          Date: 2017-10-26T07:44:33Z

          App Master web UI

          commit 21fb93792290625b719899a4742573b8c3d4a7ce
          Author: Paul Rogers <progers@maprtech.com>
          Date: 2017-10-26T07:45:44Z

          Distribution and project files

          commit 567d36787b9ada60dd2141077e629158c53fc0c4
          Author: Paul Rogers <progers@maprtech.com>
          Date: 2017-10-26T07:47:54Z

          Test files


          paul-rogers Paul Rogers added a comment -

          DoY is quasi-closed source only in the sense that we have no PR; the goal remains to contribute it to Apache Drill. DoY is hosted in my own personal, open repo. I will go ahead and rebase the code onto the latest master, then post the repo here. I like your idea of doing a PR. Once I have the rebased branch, I'll post a PR for it.

          oae Johannes Zillmann added a comment -

          So DoY is currently closed source?
          Sure, I can code-review, but since I'm not part of the Drill project and don't know the guidelines, etc., I'm not sure whether that would be of help.
          Anyway, even having DoY as part of a pull request, or available in any other way, would help me try it out on YARN quickly!

          paul-rogers Paul Rogers added a comment -

          Our original intent was to contribute DoY to the Apache Drill project. However, it is ~12K lines of code, which was too much for the project committers to review. So, we postponed the contribution until we could find someone sufficiently motivated to do a code review. Of course, the task of the reviewer should be simpler now; DoY has been in use for about 9+ months so there should not be too many obvious bugs to find...

          That said, if you can act as a first-line code reviewer, I can restart the submission process so that DoY becomes part of Apache Drill.

          oae Johannes Zillmann added a comment -

          Curious: it seems that with a MapR distro one can run Drill on top of YARN: http://maprdocs.mapr.com/51/Drill/drill_on_yarn.html
          Is that available for the open source version on top of other Hadoop distros?

          githubbot ASF GitHub Bot added a comment -

          Github user paul-rogers closed the pull request at:

          https://github.com/apache/drill/pull/542

          githubbot ASF GitHub Bot added a comment -

          GitHub user paul-rogers opened a pull request:

          https://github.com/apache/drill/pull/542

          DRILL 4581 Final

          Extensive revisions to the Drill launch scripts to fix a number of
          bugs, and to prepare the scripts for use in Drill-on-YARN. Unit tests
          will be merged as part of the Drill-on-YARN (DRILL-1170) work.

          You can merge this pull request into a Git repository by running:

          $ git pull https://github.com/paul-rogers/drill DRILL-4581-Final

          Alternatively you can review and apply these changes as the patch at:

          https://github.com/apache/drill/pull/542.patch

          To close this pull request, make a commit to your master/trunk branch
          with (at least) the following in the commit message:

          This closes #542


          commit abbfe84e35517c37f59a507694a4f0224137d2b8
          Author: Paul Rogers <progers@maprtech.com>
          Date: 2016-04-08T21:04:40Z

          Merge remote-tracking branch 'apache/master'

          commit e68ab2d8c50ce4d64631f410cbcbe7f4e98aeb8c
          Author: Paul Rogers <progers@maprtech.com>
          Date: 2016-04-13T23:27:06Z

          Merge remote-tracking branch 'apache/master'

          commit ce7d43be1808690f80f6a3c0a826ffb2725ef817
          Author: Paul Rogers <progers@maprtech.com>
          Date: 2016-07-11T02:55:09Z

          Merge remote-tracking branch 'apache/master'

          commit be6564522fe46003d5f5c760ee41fcf71c432a56
          Author: Paul Rogers <progers@maprtech.com>
          Date: 2016-07-11T03:24:46Z

          DRILL-4581

          Extensive revisions to the Drill launch scripts to fix a number of
          bugs, and to prepare the scripts for use in Drill-on-YARN. Unit tests
          will be merged as part of the Drill-on-YARN (DRILL-1170) work.


          paul-rogers Paul Rogers added a comment -

          Near-final version of the Drill-on-YARN user guide. The material here will be merged into Drill's formal documentation. But, until then, this material explains what DoY is and how to use it.

          paul-rogers Paul Rogers added a comment - - edited

          Attached is a short three-page, high-level overview of the design of the Drill-on-YARN feature planned for Drill 1.8. Outlines the design approach, user experience and major components.

          If you are a YARN user, and want to run Drill under YARN, please give this outline a read to see if the feature will suit your needs. What is missing? What additional features are needed now or in a later release?

          elserj Josh Elser added a comment -

          "substantial time" is definitely hard to sign up for, but I'd be happy to try to help out where/when at all possible.

          mpollock Matt Pollock added a comment -

          Thanks much.

          paul-rogers Paul Rogers added a comment -

          Good progress is being made. Our tentative goal is the Drill 1.8 release for an initial integration. The goal is:

          YARN support in Drill 1.8 enables admins to migrate their existing Drill cluster to run under YARN. The admin simply identifies the nodes on which Drill should run, identifies the required container sizes, and brings up the Drill cluster under YARN. YARN manages resource allocations for Drill alongside those of other YARN applications. Drill-on-YARN monitors Drill-bits and automatically restarts any that fail.

          We'll have "experimental" support for starting/stopping Drill-bits. Starting bits is easy. Stopping is a bit of a challenge because we lack DRILL-2656.

          paul-rogers Paul Rogers added a comment -

          Worth a discussion. Is Slider still the "go to" option, or has effort shifted to Twill?

          As it turns out, the actual YARN integration was not a big effort. Rather, most of the effort is around modifying Drill itself to play well with YARN, and implementing the management aspects unique to YARN.

          jnadeau Jacques Nadeau added a comment -

          Hey Paul & Billie, if the Slider community co-implemented this with the Drill folk, it would probably allow Slider to support more use cases and bring us to a shared approach rather than two separate codebases. Do you think that anyone from the Slider community would be able to spend substantial time against this to address the Drill needs?

          mpollock Matt Pollock added a comment -

          Any progress update? My organization won't support use of Drill until this is done.

          paul-rogers Paul Rogers added a comment -

          We have considered Slider. Several factors nudged us in the direction of writing an AM directly on YARN:

          1. Slider has much documentation, but it is incomplete and out-of-date in important places.
          2. We could make up for the documentation by reading the source code. However, Slider is composed of a large amount of Python code. Our team are mostly Java developers. If we have to learn a bunch of code, we might as well learn YARN directly.
          3. Drill needs certain features that Slider does not (yet) provide, such as monitoring ZooKeeper to track Drill-bit health, perhaps offering a connection proxy, etc.
          4. Slider is a general-purpose tool with many cool features. As it turns out, many are not needed for Drill. This means that Slider introduces a bit of unnecessary complexity for Drill admins.
          5. Slider adds its own level of configuration files on top of those that we'd need for Drill. Not a big issue, but it is just additional complexity for Drill admins to learn and manage.
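
Item 3 above (tracking Drill-bit health through ZooKeeper) can be sketched as a comparison between two views of the cluster: the containers YARN reports as running, and the hosts that have registered in ZooKeeper. This is a minimal illustration, not the mechanism a Drill AM would necessarily use. Both inputs are plain Python collections standing in for real YARN and ZooKeeper queries, and all names are hypothetical.

```python
def find_unregistered(yarn_containers: dict[str, str],
                      zk_registered: set[str]) -> list[str]:
    """Return ids of containers whose host has no ZooKeeper registration.

    yarn_containers maps container id -> host name (stand-in for a YARN query).
    zk_registered holds host names currently registered (stand-in for a
    ZooKeeper query).
    """
    return sorted(
        container_id
        for container_id, host in yarn_containers.items()
        if host not in zk_registered
    )
```

A container that YARN believes is running but whose host is missing from ZooKeeper is a candidate for a health check or restart; what to do with it is a policy decision outside this sketch.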

          Overall, we like where Slider is going. Those Drill users who want to roll their own YARN integration should certainly give Slider a try as a short-term solution. This is particularly true for shops that already use Slider for other apps.

          On balance, however, Drill has a number of specialized needs that would seem to justify the cost of a custom AM. We will, of course, continue to revisit the issue as analysis proceeds.

          billie.rinaldi Billie Rinaldi added a comment -

          Have you considered using Slider instead of writing a new AM for Drill? Slider would take care of most of these bullets already, without requiring you to write new Java code. If there end up being new features Drill would need, I imagine the Slider folks would be receptive to adding those.

          paul-rogers Paul Rogers added a comment -

          A brief "starter set" of requirements:

          • Configuration file to gather the cluster configuration (memory, cores, number of nodes, and so on).
          • Launcher to start/stop Drill within YARN
          • Drill-specific Application Master (AM)
          • AM requests YARN Node Manager (NM) to launch drill-bits.
          • Use YARN localization feature to deploy Drill files to each node.
          • Add nodes (drill-bits) to a running Drill cluster
          • Remove nodes from a running Drill cluster (see DRILL-2656)
          • Detect and restart failed drill-bits
          • Status/statistics about the cluster as a whole (number of active nodes, number of restarts, etc.)
          • Allow existing users to run "unmanaged" Drill clusters (YARN is optional)
          • Possibly allow multiple "Drill clusters" (independent clusters of drill bits) on the same YARN-managed physical cluster.
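
The configuration and add/remove-node items above can be sketched together as a resize plan: given the configured cluster size and the containers currently running, decide how many containers to request and which to stop. This is a minimal sketch under stated assumptions: `ClusterSpec` and `plan_resize` are hypothetical names, the fields mirror only the memory, cores, and node-count settings mentioned in the list, and stopping the most recently listed containers first is an arbitrary choice.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClusterSpec:
    """Hypothetical cluster configuration: node count and container size."""
    drillbit_count: int
    memory_mb: int
    vcores: int


def plan_resize(spec: ClusterSpec,
                running: list[str]) -> tuple[int, list[str]]:
    """Return (containers to request, container ids to stop)."""
    delta = spec.drillbit_count - len(running)
    if delta >= 0:
        # Cluster is too small or exactly right: request the shortfall.
        return delta, []
    # Cluster is too large: stop the last -delta containers in the list.
    return 0, running[delta:]
```

Growing is the easy direction. Shrinking depends on being able to stop a drill-bit cleanly, which is the concern the list ties to DRILL-2656.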
          huangjs Jianshi Huang added a comment -

          Any progress on YARN support?

          Jianshi


            People

            • Assignee:
              Paul.Rogers Paul Rogers
              Reporter:
              Neeraja Neeraja
            • Votes:
              6
              Watchers:
              22

              Dates

              • Created:
                Updated:

                Development