Hadoop Map/Reduce / MAPREDUCE-7148

Fast fail jobs when exceeds dfs quota limitation

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.7.0, 2.8.0, 2.9.0
    • Fix Version/s: 3.3.0
    • Component/s: task
    • Labels: None
    • Environment: hadoop 2.7.3
    • Hadoop Flags: Reviewed

    Description

      We run Hive jobs with a per-job DFS quota of 3 TB. When a job hits the DFS quota, the task that hit it fails, and the task is retried a few times before the job itself fails. These retries do not help, because the job will fail anyway. In the worst cases, a job has a single reduce task that writes more than 3 TB to HDFS over 20 hours; the reduce task exceeds the quota and retries 4 times before the job finally fails, consuming a lot of resources unnecessarily. This ticket aims to let a job fail fast when it writes too much data to the DFS and exceeds the DFS quota. The fast-fail feature was introduced in MAPREDUCE-7022 and MAPREDUCE-6489.
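
      The intended behaviour can be illustrated with a minimal, self-contained Java sketch. It is not the code from the attached patches: the class `FastFailSketch`, the exception `QuotaExceededException`, and the methods `shouldFailFast` and `remainingAttempts` are hypothetical names chosen for illustration, and the sketch deliberately avoids any real Hadoop API. It assumes that a quota violation surfaces as a distinguishable exception somewhere in the failure's cause chain, and that a switch controls whether fast fail is enabled.

```java
import java.io.IOException;

public class FastFailSketch {

    /** Hypothetical stand-in for the error a DFS client raises when a quota is exceeded. */
    static class QuotaExceededException extends IOException {
        QuotaExceededException(String message) {
            super(message);
        }
    }

    /** True when fast fail is enabled and a quota error appears anywhere in the cause chain. */
    static boolean shouldFailFast(Throwable error, boolean fastFailEnabled) {
        if (!fastFailEnabled) {
            return false;
        }
        for (Throwable t = error; t != null; t = t.getCause()) {
            if (t instanceof QuotaExceededException) {
                return true;
            }
        }
        return false;
    }

    /** Number of further attempts to schedule after a failed attempt. */
    static int remainingAttempts(Throwable error, boolean fastFailEnabled,
                                 int attemptsMade, int maxAttempts) {
        if (shouldFailFast(error, fastFailEnabled)) {
            return 0; // retrying cannot succeed, so fail the job immediately
        }
        return Math.max(0, maxAttempts - attemptsMade);
    }

    public static void main(String[] args) {
        Throwable quotaError = new RuntimeException("reduce task failed",
                new QuotaExceededException("DFS quota exceeded"));
        Throwable otherError = new IOException("connection reset");

        System.out.println(remainingAttempts(quotaError, true, 1, 4));  // quota error, fast fail on
        System.out.println(remainingAttempts(otherError, true, 1, 4));  // unrelated error
        System.out.println(remainingAttempts(quotaError, false, 1, 4)); // quota error, fast fail off
    }
}
```

      In this sketch only a quota error with fast fail enabled stops further attempts; any other failure, or a quota error with the switch off, keeps the usual retry budget. Making the behaviour opt-in is an assumption of the sketch, chosen so that existing jobs would keep their current retry semantics.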

      Attachments

        1. MAPREDUCE-7148.010.patch (13 kB, Wang Yan)
        2. MAPREDUCE-7148.009.patch (13 kB, Wang Yan)
        3. MAPREDUCE-7148.008.patch (13 kB, Wang Yan)
        4. MAPREDUCE-7148.007.patch (15 kB, Wang Yan)
        5. MAPREDUCE-7148.006.patch (15 kB, Wang Yan)
        6. MAPREDUCE-7148.005.patch (9 kB, Wang Yan)
        7. MAPREDUCE-7148.004.patch (8 kB, Wang Yan)
        8. MAPREDUCE-7148.003.patch (8 kB, Wang Yan)
        9. MAPREDUCE-7148.002.patch (8 kB, Wang Yan)
        10. MAPREDUCE-7148.001.patch (5 kB, Wang Yan)

            People

              Assignee: Wang Yan (tiana528)
              Reporter: Wang Yan (tiana528)
              Votes: 0
              Watchers: 7
