Hadoop Common / HADOOP-4927

Part files on the output filesystem are created irrespective of whether the corresponding task has anything to write there

    Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.21.0
    • Component/s: None
    • Labels: None
    • Hadoop Flags: Reviewed
    • Release Note: All output part files are created regardless of whether the corresponding task has output.

      Description

      When OutputFormat.getRecordWriter is invoked, a part file is created on the output filesystem immediately. The returned RecordWriter, however, is not used until the task (the user's code) calls OutputCollector.collect. As a result, a task that never calls OutputCollector.collect still leaves an empty part file on the output filesystem.
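
The behaviour described above can be contrasted with a writer that defers file creation until the first record arrives. The following is a minimal sketch in plain Java, not Hadoop code and not necessarily the approach taken in the attached patches. The names LazyPartWriter and LazyPartWriterDemo, the part-0000N file names, and the tab-separated record layout are all assumptions made for illustration.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical writer that creates its part file only when the first record is written. */
final class LazyPartWriter implements AutoCloseable {
    private final Path partFile;
    private BufferedWriter out; // stays null until the first write

    LazyPartWriter(Path partFile) {
        this.partFile = partFile; // note: nothing is created on disk here
    }

    void write(String key, String value) {
        try {
            if (out == null) {
                // First record: this is the point where the file comes into existence.
                out = Files.newBufferedWriter(partFile);
            }
            out.write(key + "\t" + value);
            out.newLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    @Override
    public void close() {
        try {
            if (out != null) {
                out.close(); // only close what was actually opened
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}

class LazyPartWriterDemo {
    /** Runs one task that writes nothing and one that writes a record; returns {silentExists, activeExists}. */
    static boolean[] run() {
        try {
            Path dir = Files.createTempDirectory("lazy-output");
            Path silent = dir.resolve("part-00000");
            Path active = dir.resolve("part-00001");
            try (LazyPartWriter w = new LazyPartWriter(silent)) {
                // This "task" never emits a record.
            }
            try (LazyPartWriter w = new LazyPartWriter(active)) {
                w.write("key", "value");
            }
            return new boolean[] { Files.exists(silent), Files.exists(active) };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        boolean[] result = run();
        System.out.println("part-00000 exists: " + result[0]);
        System.out.println("part-00001 exists: " + result[1]);
    }
}
```

An eager writer would open the file in its constructor, which is the analogue of creating the part file inside getRecordWriter. Moving the open into write is what lets a task with no output leave nothing behind.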

        Attachments

        1. hadoop-4927.patch
          15 kB
          Jothi Padmanabhan
        2. hadoop-4927-v1.patch
          17 kB
          Jothi Padmanabhan
        3. hadoop-4927-v2.patch
          18 kB
          Jothi Padmanabhan
        4. hadoop-4927-v3.patch
          24 kB
          Jothi Padmanabhan
        5. hadoop-4927-v4.patch
          33 kB
          Jothi Padmanabhan
        6. hadoop-4927-v5.patch
          34 kB
          Jothi Padmanabhan
        7. hadoop-4927-v6.patch
          34 kB
          Jothi Padmanabhan
        8. hadoop-4927-y20.patch
          34 kB
          Jothi Padmanabhan

            People

            • Assignee: Jothi Padmanabhan (jothipn)
            • Reporter: Devaraj Das (devaraj)
            • Votes: 0
            • Watchers: 6
