  1. Hadoop YARN
  2. YARN-4079

Revisit the decision to explicitly set yarn.dispatcher.exit-on-error to true in daemons

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 2.7.1
    • Fix Version/s: None
    • Component/s: yarn
    • Labels:
      None

      Description

      Currently, every daemon explicitly sets this config to true so that the daemon crashes instead of hanging around. This seems correct: a recoverable exception should be caught and handled, NOT leaked through to AsyncDispatcher, and a non-recoverable one should lead to a crash anyway.

      But this can make the system more fragile if we fail to catch every recoverable exception.
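
The trade-off described above can be illustrated with a minimal sketch. It assumes a simplified dispatcher loop; `MiniDispatcher`, `dispatchAll`, and the `onFatal` callback are hypothetical names used only for illustration and are not the real AsyncDispatcher API. With the flag on, the first leaked exception stops the loop and triggers the fatal callback (where a real daemon would exit); with the flag off, the exception is recorded and later events are still handled.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical, simplified stand-in for a dispatcher loop (not the real AsyncDispatcher). */
final class MiniDispatcher<E> {
    private final boolean exitOnError;   // plays the role of yarn.dispatcher.exit-on-error
    private final Consumer<E> handler;   // event handler that may leak an exception
    private final Runnable onFatal;      // assumption: a real daemon would terminate here
    private final List<Throwable> swallowed = new ArrayList<>();
    private boolean stopped = false;

    MiniDispatcher(boolean exitOnError, Consumer<E> handler, Runnable onFatal) {
        this.exitOnError = exitOnError;
        this.handler = handler;
        this.onFatal = onFatal;
    }

    /** Dispatches events in order and returns how many were handled successfully. */
    int dispatchAll(List<E> events) {
        int handled = 0;
        for (E event : events) {
            if (stopped) {
                break; // the "daemon" is already going down
            }
            try {
                handler.accept(event);
                handled++;
            } catch (Throwable t) {
                if (exitOnError) {
                    // Crash path: stop dispatching and signal the fatal error once.
                    stopped = true;
                    onFatal.run();
                } else {
                    // Lenient path: remember the error and keep processing events.
                    swallowed.add(t);
                }
            }
        }
        return handled;
    }

    /** Exceptions that were caught and ignored because exitOnError was false. */
    List<Throwable> swallowedErrors() {
        return swallowed;
    }
}
```

In this sketch a single missed recoverable exception is enough to stop all further event handling when the flag is true, which is the fragility the description points at.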

      Currently we do not even have the option of setting it to false in configuration, even if we wanted to.

      Probably we can read this value from configuration and default it to true in daemons when it is not configured.
      This way, in production clusters, if an exception is causing a daemon to crash frequently and we find that it is unavoidable but not a very big issue (i.e. the daemon can still work normally for the most part), we can at least set the configuration to false in the config file.
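
A minimal sketch of the proposed lookup follows, assuming `java.util.Properties` as a stand-in for the Hadoop configuration object so the example is self-contained. `ExitOnErrorSetting` and `resolve` are hypothetical names, not part of YARN. The idea is that the daemon forces true only when the operator has not configured the key, and any value other than an explicit false keeps the crash behaviour.

```java
import java.util.Properties;

/** Hypothetical helper sketching the "default to true unless configured" proposal. */
final class ExitOnErrorSetting {
    static final String EXIT_ON_ERROR_KEY = "yarn.dispatcher.exit-on-error";

    private ExitOnErrorSetting() {
    }

    /**
     * Returns the effective exit-on-error value for a daemon.
     * Assumption: Properties stands in for the daemon's configuration object.
     */
    static boolean resolve(Properties conf) {
        String raw = conf.getProperty(EXIT_ON_ERROR_KEY);
        if (raw == null || raw.trim().isEmpty()) {
            // Not configured: keep today's daemon behaviour and record it in the config.
            conf.setProperty(EXIT_ON_ERROR_KEY, "true");
            return true;
        }
        // Only an explicit "false" disables the crash; typos or other values stay true.
        return !"false".equalsIgnoreCase(raw.trim());
    }
}
```

Treating unrecognised values as true is a design choice of this sketch: it keeps a mistyped value from silently turning off the crash-on-error behaviour.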

            People

            • Assignee:
              Varun Saxena (varun_saxena)
            • Reporter:
              Varun Saxena (varun_saxena)
            • Votes:
              0
            • Watchers:
              5

              Dates

              • Created:
                Updated: