Hadoop Map/Reduce / MAPREDUCE-2779

JobSplitWriter.java can't handle large job.split file

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.20.205.0, 0.22.0, 0.23.0
    • Fix Version/s: 0.22.0, 0.23.0, 0.24.0
    • Component/s: job submission
    • Labels: None
    • Hadoop Flags: Reviewed

      Description

      We use Cascading's MultiInputFormat. MultiInputFormat sometimes generates a very large job.split file (the file Hadoop uses internally to describe input splits); in some cases it exceeds 2 GB.

      In JobSplitWriter.java, the functions that generate this file use 32-bit signed integers to compute offsets into job.split, so offsets past 2 GB cannot be represented:

      writeNewSplits
      ...
      int prevCount = out.size();
      ...
      int currCount = out.size();

      writeOldSplits
      ...
      long offset = out.size();
      ...
      int currLen = out.size();
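
To illustrate why an int-based size() breaks down past 2 GB, here is a minimal, self-contained sketch. It does not use Hadoop: `CountingSink` and `writeAndMeasure` are hypothetical names introduced only for this example, and the sink's long counter merely stands in for a 64-bit stream position. The sketch relies on the documented JDK behavior that `java.io.DataOutputStream.size()` returns an int and is pinned at Integer.MAX_VALUE once the counter overflows.

```java
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;

public class SplitOffsetDemo {

    /** Hypothetical sink: discards all bytes but tracks the position as a long. */
    static final class CountingSink extends OutputStream {
        private long pos = 0;

        @Override
        public void write(int b) {
            pos++;
        }

        @Override
        public void write(byte[] buf, int off, int len) {
            pos += len;
        }

        long getPos() {
            return pos;
        }
    }

    /**
     * Writes totalBytes zero bytes through a DataOutputStream and returns
     * { value reported by size(), position tracked as a long }.
     */
    static long[] writeAndMeasure(long totalBytes) {
        CountingSink sink = new CountingSink();
        DataOutputStream out = new DataOutputStream(sink);
        byte[] chunk = new byte[1 << 20]; // 1 MiB buffer, contents irrelevant
        long remaining = totalBytes;
        try {
            while (remaining > 0) {
                int n = (int) Math.min(chunk.length, remaining);
                out.write(chunk, 0, n);
                remaining -= n;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new long[] { out.size(), sink.getPos() };
    }

    public static void main(String[] args) {
        long[] r = writeAndMeasure(3L << 30); // 3 GiB, nothing is stored
        System.out.println("DataOutputStream.size(): " + r[0]);
        System.out.println("position as long:        " + r[1]);
    }
}
```

Once size() stops advancing, a difference such as currCount - prevCount no longer reflects the bytes actually written, which is consistent with the failure described above. Tracking the position in a long avoids the ceiling.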

      1. MAPREDUCE-2779-0.22.patch
        2 kB
        Ming Ma
      2. MAPREDUCE-2779-trunk.patch
        2 kB
        Konstantin Shvachko
      3. MAPREDUCE-2779-trunk.patch
        2 kB
        Ming Ma

          Activity

          Ming Ma created issue
          Ming Ma made changes
            Field: Original Value -> New Value
            Attachment: MAPREDUCE-2779-trunk.patch [ 12489431 ]
          Ming Ma made changes
            Affects Version/s: 0.22.0 [ 12314184 ]
            Affects Version/s: 0.23.0 [ 12315570 ]
          Ming Ma made changes
            Status: Open [ 1 ] -> Patch Available [ 10002 ]
            Affects Version/s: 0.20.205.0 [ 12316391 ]
          Arun C Murthy made changes
            Assignee: Arun C Murthy [ acmurthy ]
          Arun C Murthy made changes
            Assignee: Arun C Murthy [ acmurthy ] -> Ming Ma [ mingma ]
          Konstantin Shvachko made changes
            Fix Version/s: 0.22.0 [ 12314184 ]
          Ming Ma made changes
            Attachment: MAPREDUCE-2779-0.22.patch [ 12497098 ]
          Konstantin Shvachko made changes
            Attachment: MAPREDUCE-2779-trunk.patch [ 12497108 ]
          Konstantin Shvachko made changes
            Status: Patch Available [ 10002 ] -> Resolved [ 5 ]
            Hadoop Flags: Reviewed [ 10343 ]
            Fix Version/s: 0.23.0 [ 12315570 ]
            Fix Version/s: 0.24.0 [ 12317654 ]
            Resolution: Fixed [ 1 ]
          Arun C Murthy made changes
            Status: Resolved [ 5 ] -> Closed [ 6 ]
          Karthik Kambatla (Inactive) made changes
            Link: This issue is cloned as MAPREDUCE-4434 [ MAPREDUCE-4434 ]

            People

            • Assignee: Ming Ma
            • Reporter: Ming Ma
            • Votes: 0
            • Watchers: 7