Hadoop Map/Reduce / MAPREDUCE-5186

mapreduce.job.max.split.locations causes some splits created by CombineFileInputFormat to fail

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 2.0.4-alpha, 2.2.0
    • Fix Version/s: 2.3.0
    • Component/s: job submission
    • Labels:
      None
    • Target Version/s:
    • Hadoop Flags:
      Reviewed

      Description

      During its last pass, which creates "global" splits, CombineFileInputFormat can easily create splits whose blocks come from many different locations. Such splits often fail the mapreduce.job.max.split.locations check performed by JobSplitWriter.

      The default value for mapreduce.job.max.split.locations is 10, and on any decent-sized cluster CombineFileInputFormat creates splits with well over 10 locations.
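
      To make the failure mode concrete, here is a minimal plain-Java sketch of a location-count check. It is not the JobSplitWriter source: the class SplitLocationCheck, its methods, the exception message, and the host names are all hypothetical, and the only value taken from this report is the default limit of 10. The truncating variant is just one possible lenient behavior, shown for contrast, and is not necessarily what the attached patches do.

```java
import java.io.IOException;
import java.util.Arrays;

/** Plain-Java sketch of a split-location limit check; not Hadoop code. */
class SplitLocationCheck {
    /** Default of mapreduce.job.max.split.locations, as stated in this report. */
    static final int DEFAULT_MAX_SPLIT_LOCATIONS = 10;

    /** Strict variant: reject any split that lists more hosts than the limit. */
    static String[] checkStrict(String[] locations, int maxLocations) throws IOException {
        if (locations.length > maxLocations) {
            throw new IOException("Too many split locations: "
                    + locations.length + " > " + maxLocations);
        }
        return locations;
    }

    /** Lenient variant (an assumption): keep only the first maxLocations hosts. */
    static String[] truncate(String[] locations, int maxLocations) {
        if (locations.length <= maxLocations) {
            return locations;
        }
        return Arrays.copyOf(locations, maxLocations);
    }

    /** Builds the host list of a hypothetical combined split spanning n hosts. */
    static String[] hosts(int n) {
        String[] result = new String[n];
        for (int i = 0; i < n; i++) {
            result[i] = "host" + i;
        }
        return result;
    }

    public static void main(String[] args) {
        // A "global" combined split whose blocks live on 25 different hosts.
        String[] locations = hosts(25);
        try {
            checkStrict(locations, DEFAULT_MAX_SPLIT_LOCATIONS);
        } catch (IOException e) {
            System.out.println("strict check failed: " + e.getMessage());
        }
        String[] kept = truncate(locations, DEFAULT_MAX_SPLIT_LOCATIONS);
        System.out.println("lenient check kept " + kept.length + " locations");
    }
}
```

      With the strict variant, a single wide split is enough to make the whole check fail, which matches the symptom described above. The lenient variant only drops locality hints beyond the limit.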

        Attachments

        1. MAPREDUCE-5186v1.patch
          5 kB
          Robert Parker
        2. MAPREDUCE-5186v2.patch
          10 kB
          Robert Parker
        3. MAPREDUCE-5186v3.patch
          14 kB
          Jason Darrell Lowe
        4. MAPREDUCE-5186v3.patch
          14 kB
          Jason Darrell Lowe

              People

              • Assignee: Robert Parker (robsparker)
              • Reporter: Sangjin Lee (sjlee0)
              • Votes: 0
              • Watchers: 15
