Hadoop Map/Reduce / MAPREDUCE-7486

Handling Cluster Storage Capacity Exceeded Exception with Enhanced Logging

Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 3.3.6
    • Fix Version/s: None
    • Component/s: mapreduce-client
    • Labels: None

    Description

      The reportError method in YarnChild.java handles exceptions raised during job execution. When the exception is caused by the cluster storage capacity being exceeded, the method logs too little, particularly when the job is not configured to fail fast. Users then find it hard to tell from the logs why the job did not fail immediately once the storage capacity was exceeded. This improvement adds detailed logging that tells users which configuration is preventing fast failure.
       
      Expected Behavior:
      When a ClusterStorageCapacityExceededException is encountered, the system should log whether the job is configured to fail fast. If fast fail is disabled, the log should advise users on how to enable it.
       
      How-to-Fix:
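      We propose to make the relationship between the fast-fail configuration and the job's behavior visible by logging it in reportError when a ClusterStorageCapacityExceededException is encountered.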
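
      The proposed behavior can be sketched as follows. This is a minimal, self-contained illustration and not the attached patch: the class FastFailLoggingSketch, the helper names, the Map-based configuration, and the nested exception class are hypothetical stand-ins for the Hadoop types, and the property name is assumed to be the key behind MRJobConfig.JOB_DFS_STORAGE_CAPACITY_KILL_LIMIT_EXCEED.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

public class FastFailLoggingSketch {

  // Assumed property name; the real code would read it through MRJobConfig.
  static final String KILL_LIMIT_EXCEED_KEY =
      "mapreduce.job.dfs.storage.capacity.kill-limit-exceed";

  /** Hypothetical stand-in for Hadoop's ClusterStorageCapacityExceededException. */
  static class ClusterStorageCapacityExceededException extends IOException {
    ClusterStorageCapacityExceededException(String message) {
      super(message);
    }
  }

  /** Returns true if the exception or any of its causes is the storage-capacity exception. */
  static boolean hasStorageCapacityException(Throwable error) {
    for (Throwable cur = error; cur != null; cur = cur.getCause()) {
      if (cur instanceof ClusterStorageCapacityExceededException) {
        return true;
      }
    }
    return false;
  }

  /**
   * Builds the log message for a storage-capacity failure.
   * Returns an empty Optional when the error is unrelated to storage capacity.
   */
  static Optional<String> describeFastFail(Throwable error, Map<String, String> conf) {
    if (!hasStorageCapacityException(error)) {
      return Optional.empty();
    }
    // Fast fail is treated as disabled when the property is absent (assumed default).
    boolean failFast =
        Boolean.parseBoolean(conf.getOrDefault(KILL_LIMIT_EXCEED_KEY, "false"));
    if (failFast) {
      return Optional.of("Cluster storage capacity exceeded; failing the job without retries because "
          + KILL_LIMIT_EXCEED_KEY + "=true");
    }
    return Optional.of("Cluster storage capacity exceeded, but the job is not configured to fail fast. "
        + "Set " + KILL_LIMIT_EXCEED_KEY + "=true to fail the job immediately.");
  }

  public static void main(String[] args) {
    Map<String, String> conf = new HashMap<>();
    Exception error = new RuntimeException("task failed",
        new ClusterStorageCapacityExceededException("storage capacity exceeded"));

    // Fast fail disabled: the message advises how to enable it.
    describeFastFail(error, conf).ifPresent(System.out::println);

    // Fast fail enabled: the message states that retries are skipped.
    conf.put(KILL_LIMIT_EXCEED_KEY, "true");
    describeFastFail(error, conf).ifPresent(System.out::println);
  }
}
```

      In the real method the message would go to the task logger rather than being returned; returning it here only keeps the sketch easy to check in isolation.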

      Attachments

        1. original-vs-log-enhanced.md
          4 kB
          LoggingResearch
        2. TestYarnChild.java
          0.7 kB
          LoggingResearch

          People

            Assignee: Unassigned
            Reporter: LoggingResearch