Hadoop YARN / YARN-1360

Rework Distributed Shell to be a better model of how people should write YARN applications

    Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 2.2.0
    • Fix Version/s: None
    • Labels: None

      Description

      Distributed Shell works as an example, but its architecture is not what you would want in production; its design risks setting a bad example for others to follow.

      Instead, it should:

      • be decomposed into a set of services each with their own responsibilities
      • split the 'model' of its cluster into its own classes, a model that can then be unit tested outside of the AM
      • factor out all container launching into its own service, and use a thread pool to avoid scalability limits
      • provide a demonstration (classic) RPC service to show how to implement one
      • include tests
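
      A minimal sketch of the second and third points, using only the JDK: the cluster 'model' is a plain class with no dependency on the AM, and launching goes through a service backed by a fixed-size thread pool. Every name here (`AmSketch`, `ClusterModel`, `Launcher`, `ContainerLaunchService`) is hypothetical and is not part of Hadoop or the Distributed Shell; the `Launcher` interface stands in for whatever call actually starts a container.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch; none of these classes exist in Hadoop. */
public class AmSketch {

  /** Pure state: how many containers are wanted, requested and running. */
  static final class ClusterModel {
    private final int desired;
    private int requested;
    private int running;

    ClusterModel(int desired) {
      this.desired = desired;
    }

    /** Returns how many more containers to ask for, and records them as requested. */
    synchronized int containersToRequest() {
      int n = Math.max(0, desired - requested - running);
      requested += n;
      return n;
    }

    synchronized void onContainerStarted() {
      requested--;
      running++;
    }

    synchronized void onContainerFailed() {
      running--;
    }

    synchronized int running() {
      return running;
    }
  }

  /** Assumed stand-in for the real launch call, so that tests can stub it. */
  interface Launcher {
    void launch(String containerId) throws Exception;
  }

  /** Launches containers on a bounded pool instead of one thread per container. */
  static final class ContainerLaunchService implements AutoCloseable {
    private final ExecutorService pool;
    private final Launcher launcher;
    private final ClusterModel model;

    ContainerLaunchService(int threads, Launcher launcher, ClusterModel model) {
      this.pool = Executors.newFixedThreadPool(threads);
      this.launcher = launcher;
      this.model = model;
    }

    /** Queues one launch; the model is updated only if the launch succeeds. */
    Future<?> submit(String containerId) {
      return pool.submit(() -> {
        launcher.launch(containerId);
        model.onContainerStarted();
        return null;
      });
    }

    @Override
    public void close() {
      pool.shutdown();
    }
  }

  public static void main(String[] args) throws Exception {
    ClusterModel model = new ClusterModel(3);
    int toRequest = model.containersToRequest();
    // The no-op launcher replaces a real container start for this demo.
    try (ContainerLaunchService service = new ContainerLaunchService(2, id -> { }, model)) {
      List<Future<?>> pending = new ArrayList<>();
      for (int i = 0; i < toRequest; i++) {
        pending.add(service.submit("container_" + i));
      }
      for (Future<?> f : pending) {
        f.get(10, TimeUnit.SECONDS);
      }
    }
    System.out.println("running=" + model.running());
  }
}
```

      Because `ClusterModel` holds no references to the AM or to any RPC client, it can be exercised in an ordinary unit test, and the launch service can be tested with a stub `Launcher`.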

        Activity

        stevel@apache.org Steve Loughran added a comment -

        I'm not volunteering to do this, though I now know what it should look like, having just reworked my YARN application to the extent that it no longer resembles the Distributed Shell and instead has all the features I've just listed above as requirements.

        The problem with the Distributed Shell today is that it is "the simple" example, with MapReduce being way too complicated to go near. Yet a lot of the "real" requirements of a YARN app lurk in the MR code, not in the Distributed Shell, while the architecture of the shell is exactly what you don't want for testing and maintenance: everything in the AM class.


          People

          • Assignee: Unassigned
          • Reporter: stevel@apache.org Steve Loughran
          • Votes: 0
          • Watchers: 4